IntegratedML Flexible Model Integration - Complete User Guide

Welcome to the getting-started guide for IntegratedML Flexible Model Integration. It takes you from installation through running your first machine learning models directly inside database workflows.

🖥️ System Requirements

Minimum Requirements

Python: 3.8 or higher
Memory: 4GB RAM (8GB recommended for all demos)
Storage: 2GB free space
Operating System: Windows 10+, macOS 10.15+, or Linux (Ubuntu 18.04+)

Optional but Recommended

InterSystems IRIS: For full IntegratedML integration
Jupyter: For interactive notebooks
Git: For cloning and contributing

Demo-Specific Requirements

Credit Risk: Scikit-learn, pandas, numpy
Fraud Detection: XGBoost, scikit-learn (GPU optional)
Sales Forecasting: Prophet, LightGBM (additional system dependencies)
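
Before starting a demo, it can help to check which of the libraries listed above are importable in the active environment. The following is a minimal sketch, not part of the repository: the function names and the `DEMO_REQUIREMENTS` mapping are illustrative assumptions, and the import names are inferred from the package list (scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names per demo, inferred from the requirements listed above.
# This mapping is an illustrative assumption, not a file in the repository.
DEMO_REQUIREMENTS = {
    "credit_risk": ["sklearn", "pandas", "numpy"],
    "fraud_detection": ["xgboost", "sklearn"],
    "sales_forecasting": ["prophet", "lightgbm"],
}


def missing_modules(module_names):
    """Return the subset of module_names that cannot be found for import."""
    return [name for name in module_names if find_spec(name) is None]


def report_demo_readiness(requirements=DEMO_REQUIREMENTS):
    """Map each demo name to the list of its missing modules."""
    return {demo: missing_modules(mods) for demo, mods in requirements.items()}


if __name__ == "__main__":
    for demo, missing in report_demo_readiness().items():
        status = "ready" if not missing else "missing: " + ", ".join(missing)
        print(f"{demo}: {status}")
```

An empty list for a demo means every library it needs was found; anything else names the packages to install first.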

🚀 Using the Notebooks

The primary way to interact with the demos is through Jupyter Notebooks.

Quickstart

Launch Jupyter:
```
jupyter lab
```
or
```
jupyter notebook
```
Open the Quickstart Notebook:
- notebooks/Iris_IntegratedML_Quickstart.ipynb
Explore Domain-Specific Notebooks: each demo's notebook is listed in the Demo Walkthroughs section below.

Shared Modules

The notebooks use shared modules for common tasks:

Database Connection: shared/database/connection.py
Data Loading: shared/database/data_loader.py
Model Management: shared/database/model_manager.py

⚡ Installation & Setup

Option 1: Quick Installation (Recommended)

# Clone the repository
git clone https://github.com/intersystems/integratedml-demos.git
cd integratedml-demos

# Create and activate virtual environment (recommended)
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

# Install all dependencies
pip install -r requirements.txt

# Install the package in development mode
pip install -e .

Option 2: Conda Installation

# Clone the repository
git clone https://github.com/intersystems/integratedml-demos.git
cd integratedml-demos

# Create conda environment
conda create -n integratedml-demos python=3.9
conda activate integratedml-demos

# Install dependencies
pip install -r requirements.txt
pip install -e .

Option 3: Docker Installation

# Clone and run with Docker
git clone https://github.com/intersystems/integratedml-demos.git
cd integratedml-demos

# Build and run the development environment
docker-compose up -d

# Access Jupyter Lab at http://localhost:8888

✅ Quick Verification

Let's verify your installation works correctly by running a simple test:

# Test basic installation
python -c "
import sys
print('✅ Python version:', sys.version)

try:
    import pandas as pd
    import numpy as np
    import sklearn
    print('✅ Core dependencies loaded successfully')
    
    # Test our demo imports
    from demos.credit_risk.models.credit_risk_classifier import CustomCreditRiskClassifier
    print('✅ Demo models imported successfully')
    
    print('\n🎉 Installation verified! Ready to run demos.')
except ImportError as e:
    print('❌ Import error:', e)
    print('💡 Try: pip install -r requirements.txt')
"

Quick Demo Test

Run the quick start example to ensure everything works:

# Run the quick start example
python examples/quick_start_example.py

Expected Output:

IntegratedML Flexible Model Integration Demo - Quick Start Examples
========================================================

DEMO 1: Credit Risk Assessment with Custom Feature Engineering
============================================================
Training data shape: (800, 15)
Test data shape: (200, 15)
...
Accuracy: 0.xxx

🎉 All demos completed successfully!

📊 Demo Portfolio Overview

The three demos increase in complexity, and each covers a different aspect of IntegratedML integration:

🟢 Demo 1: Credit Risk Assessment

Perfect for: First-time users, understanding custom feature engineering

Complexity: Beginner-friendly
Time Commitment: 15-30 minutes
Key Learning: Custom preprocessing within database context
Business Value: Secure financial data processing

🟡 Demo 2: Fraud Detection

Perfect for: Understanding ensemble techniques and real-time processing

Complexity: Intermediate
Time Commitment: 30-45 minutes
Key Learning: Ensemble orchestration, real-time constraints
Business Value: 67ms average latency with 95.4% accuracy (validated)

🔴 Demo 3: Sales Forecasting

Perfect for: Advanced users, third-party library integration

Complexity: Advanced
Time Commitment: 45-60 minutes
Key Learning: Prophet + LightGBM hybrid architecture
Business Value: 20%+ forecasting improvement

🚀 Demo Walkthroughs

Demo 1: Credit Risk Assessment

Step 1: Navigate to Demo Directory

cd demos/credit_risk

Step 2: Generate Sample Data

# Create realistic credit risk dataset
python data/generate_sample_data.py

Step 3: Train and Test the Model

# Launch interactive notebook
jupyter notebook notebooks/01_Credit_Risk_Complete_Demo.ipynb

# OR run the classifier module directly (relies on the editable install from setup)
python -m demos.credit_risk.models.credit_risk_classifier

Step 4: Explore Custom Features

The demo showcases several custom feature engineering techniques:

Debt-to-Income Ratios: Financial health indicators
Credit Utilization Scores: Spending pattern analysis
Risk Interaction Terms: Complex relationship modeling
Domain-Specific Transformations: Financial industry best practices
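
To make the first three techniques concrete, here is a minimal sketch of how such features could be derived for a single applicant record. It is not the demo's implementation: the field names (`monthly_income`, `monthly_debt`, `credit_balance`, `credit_limit`) and the choice of a simple product as the interaction term are assumptions made for illustration.

```python
def engineer_features(applicant):
    """Derive ratio and interaction features from one applicant record.

    All field names are hypothetical placeholders, not the demo's schema.
    """
    income = applicant["monthly_income"]
    limit = applicant["credit_limit"]

    # Debt-to-income ratio: share of income already committed to debt.
    debt_to_income = applicant["monthly_debt"] / income if income > 0 else 0.0

    # Credit utilization: share of available credit currently in use.
    credit_utilization = applicant["credit_balance"] / limit if limit > 0 else 0.0

    # Interaction term: large only when both ratios are large.
    risk_interaction = debt_to_income * credit_utilization

    return {
        **applicant,
        "debt_to_income": debt_to_income,
        "credit_utilization": credit_utilization,
        "risk_interaction": risk_interaction,
    }


example = engineer_features(
    {
        "monthly_income": 5000.0,
        "monthly_debt": 1500.0,
        "credit_balance": 2000.0,
        "credit_limit": 8000.0,
    }
)
print(example)
```

On a full dataset the same ratios would normally be computed column-wise with pandas rather than record by record. The guards against zero income or zero credit limit are one possible convention; a real pipeline might prefer to flag such rows as missing.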

Step 5: IntegratedML Integration

-- Example SQL commands for IntegratedML
CREATE MODEL CreditRiskModel PREDICTING (default_risk)
FROM CreditApplications 
USING CustomCreditRiskClassifier(
    enable_debt_ratio=true,
    enable_interaction_terms=true,
    decision_threshold=0.6
);
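
The `decision_threshold=0.6` parameter above suggests that the classifier turns a predicted default probability into a binary label. A minimal sketch of that step follows. The function name is hypothetical, and treating a probability exactly at the threshold as high risk is an assumption, not confirmed behavior of `CustomCreditRiskClassifier`.

```python
def classify_with_threshold(probabilities, decision_threshold=0.6):
    """Label each default probability as 1 (high risk) or 0 (low risk).

    Assumption: a probability at or above the threshold counts as high risk.
    """
    if not 0.0 <= decision_threshold <= 1.0:
        raise ValueError("decision_threshold must be between 0 and 1")
    return [1 if p >= decision_threshold else 0 for p in probabilities]


scores = [0.15, 0.55, 0.60, 0.92]
print(classify_with_threshold(scores))       # [0, 0, 1, 1]
print(classify_with_threshold(scores, 0.5))  # [0, 1, 1, 1]
```

Raising the threshold above 0.5 flags fewer applicants as high risk, trading missed defaults for fewer false alarms.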

📖 Complete Credit Risk Tutorial →

Demo 2: Fraud Detection

Step 1: Navigate and Setup

cd demos/fraud_detection

# Install additional dependencies if needed
pip install xgboost

Step 2: Generate Transaction Data

# Create synthetic fraud transaction dataset
python data/generate_transaction_data.py

Step 3: Run Performance Benchmarks

# Verify latency requirements
python scripts/verify_latency_requirements.py

# Expected output:
# ✅ Average Latency: 67ms (Target: ≤100ms)
# ✅ P95 Latency: 89ms (Target: ≤150ms)  
# ✅ Success Rate: 96.8% (Target: ≥90%)
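
For readers who want to see how figures like the average and P95 latency can be derived from raw timings, here is a minimal sketch. It is not the contents of `verify_latency_requirements.py`: the function name is hypothetical, the default targets are copied from the expected output above, and the nearest-rank percentile definition is one common choice among several.

```python
import statistics


def summarize_latencies(latencies_ms, avg_target_ms=100.0, p95_target_ms=150.0):
    """Compute average and P95 latency and compare them with targets."""
    if not latencies_ms:
        raise ValueError("latencies_ms must not be empty")
    ordered = sorted(latencies_ms)
    n = len(ordered)
    # Nearest-rank P95: smallest sample with at least 95% of samples at or
    # below it. Integer arithmetic avoids floating-point rounding in the rank.
    rank = (95 * n + 99) // 100
    p95 = ordered[rank - 1]
    average = statistics.fmean(ordered)
    return {
        "average_ms": average,
        "p95_ms": p95,
        "average_ok": average <= avg_target_ms,
        "p95_ok": p95 <= p95_target_ms,
    }


# Synthetic timings of 1..100 ms, used only to exercise the function.
print(summarize_latencies(list(range(1, 101))))
```

In practice the timings would come from wrapping each prediction call with `time.perf_counter()` and converting the elapsed seconds to milliseconds.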

Step 4: Explore Ensemble Architecture

The fraud detection system combines:

Rule-based Detector: Fast heuristics (~8ms)
Anomaly Detection: IRIS Vector Search integration (~15ms)
Neural Network: Pattern recognition (~12ms)
Behavioral Analysis: Customer profiling (~9ms)
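
One simple way to combine four detectors like these is a weighted average of their scores. The sketch below illustrates that idea only. The weights are hypothetical, and the demo's actual combination logic may differ. The latency figures are the approximate per-component values listed above, which add up to 44 ms if the components run one after another.

```python
# Approximate per-component latencies taken from the list above (ms).
COMPONENT_LATENCY_MS = {
    "rules": 8,
    "anomaly": 15,
    "neural_net": 12,
    "behavioral": 9,
}

# Hypothetical weights chosen for illustration only.
DEFAULT_WEIGHTS = {
    "rules": 0.2,
    "anomaly": 0.3,
    "neural_net": 0.3,
    "behavioral": 0.2,
}


def ensemble_score(component_scores, weights=DEFAULT_WEIGHTS):
    """Weighted average of per-component fraud scores in [0, 1].

    Weights are renormalized over the components actually supplied, so a
    component can be left out without rescaling the others by hand.
    """
    total_weight = sum(weights[name] for name in component_scores)
    if total_weight <= 0:
        raise ValueError("at least one component with positive weight is required")
    weighted = sum(weights[name] * score for name, score in component_scores.items())
    return weighted / total_weight


scores = {"rules": 0.9, "anomaly": 0.7, "neural_net": 0.8, "behavioral": 0.4}
print("fraud score:", ensemble_score(scores))
print("sequential latency (ms):", sum(COMPONENT_LATENCY_MS.values()))
```

Renormalizing over the supplied components makes it easy to try a reduced ensemble, for example rules plus anomaly detection only, when experimenting with latency.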

Step 5: Real-time Testing

# Launch interactive demo
jupyter notebook notebooks/01_Fraud_Detection_Complete_Demo.ipynb

# Test ensemble performance
python -m pytest tests/test_performance.py -v

📖 Complete Fraud Detection Tutorial →

Demo 3: Sales Forecasting

Step 1: Install Dependencies

cd demos/sales_forecasting

# Install Prophet and LightGBM
pip install prophet lightgbm

# Note: Prophet may require additional system dependencies
# See troubleshooting section if you encounter issues

Step 2: Generate Sales Data

# Create multi-store retail sales dataset
python data/generate_sales_data.py

Step 3: Train Hybrid Model

# Launch forecasting notebook
jupyter notebook notebooks/01_Sales_Forecasting_Complete_Demo.ipynb

Step 4: Explore Hybrid Architecture

The sales forecasting system combines:

Prophet Component: Trend and seasonality detection
LightGBM Component: Feature-rich ML predictions
Ensemble Strategy: Horizon-weighted combination
Confidence Intervals: Business-ready uncertainty quantification
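
A horizon-weighted combination can be sketched as a blend whose weights change with the forecast step. The example below is an illustration under stated assumptions, not the demo's code: it assumes the Prophet weight ramps linearly from 0.3 at the first step to 0.7 at the last, so the LightGBM forecast dominates near-term steps and the trend model dominates later ones. The function names and the weight values are hypothetical.

```python
def horizon_weights(horizon, prophet_start=0.3, prophet_end=0.7):
    """Prophet weight for each forecast step, ramping linearly with horizon."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    if horizon == 1:
        return [prophet_start]
    step = (prophet_end - prophet_start) / (horizon - 1)
    return [prophet_start + i * step for i in range(horizon)]


def blend_forecasts(prophet_forecast, lightgbm_forecast, **weight_kwargs):
    """Blend two equal-length forecasts using horizon-dependent weights."""
    if len(prophet_forecast) != len(lightgbm_forecast):
        raise ValueError("forecasts must have the same length")
    weights = horizon_weights(len(prophet_forecast), **weight_kwargs)
    return [
        w * p + (1.0 - w) * m
        for w, p, m in zip(weights, prophet_forecast, lightgbm_forecast)
    ]


# Synthetic three-step forecasts, used only to show the weighting.
print(blend_forecasts([100.0, 100.0, 100.0], [200.0, 200.0, 200.0]))
```

With these inputs the blended values move from closer to the LightGBM forecast at step one toward the Prophet forecast at step three.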

Step 5: Production Integration

-- Example IntegratedML deployment
CREATE MODEL SalesForecastModel PREDICTING (monthly_sales)
FROM HistoricalSales 
USING HybridForecastingModel(
    trend_model='prophet',
    ml_model='lightgbm',
    forecast_horizon=12,
    include_confidence_intervals=true
);

📖 Complete Sales Forecasting Tutorial →

🔧 Common Issues & Troubleshooting

Installation Issues

Problem: pip install fails with dependency conflicts

# Solution: Use fresh virtual environment
python -m venv fresh_env
source fresh_env/bin/activate  # or fresh_env\Scripts\activate on Windows
pip install --upgrade pip
pip install -r requirements.txt

Problem: Prophet installation fails

# On macOS:
brew install cmake
pip install prophet

# On Ubuntu/Debian:
sudo apt-get install python3-dev python3-pip python3-venv
pip install prophet

# On Windows:
# Install Visual C++ Build Tools first, then:
pip install prophet

Problem: XGBoost GPU support issues

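# The standard package runs on CPU by default (recommended for most users):
pip install xgboost

# For GPU support (advanced users): xgboost has no "gpu" extra.
# The standard Linux and Windows wheels already include CUDA support,
# which is enabled at training time (for example device="cuda" in XGBoost 2.0+).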

Runtime Issues

Problem: "Module not found" errors

# Ensure package is installed in development mode:
pip install -e .

# Or, from the repository root, add the current directory to PYTHONPATH:
export PYTHONPATH="${PYTHONPATH}:$(pwd)"

Problem: Jupyter notebook kernel issues

# Install Jupyter kernel for your environment:
python -m ipykernel install --user --name=integratedml-demos

Problem: Memory errors during training

# Reduce dataset size for testing:
export DEMO_SAMPLE_SIZE=1000

# Or increase system memory allocation
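
For reference, a script could honor a variable like `DEMO_SAMPLE_SIZE` roughly as follows. This is a hedged sketch: the function name and the default of 10,000 rows are assumptions, and the demos' own handling of the variable may differ.

```python
import os


def get_sample_size(default=10_000):
    """Read DEMO_SAMPLE_SIZE from the environment, falling back to a default.

    The default value is a hypothetical placeholder.
    """
    raw = os.environ.get("DEMO_SAMPLE_SIZE")
    if raw is None or raw.strip() == "":
        return default
    size = int(raw)  # raises ValueError for non-numeric input
    if size <= 0:
        raise ValueError("DEMO_SAMPLE_SIZE must be a positive integer")
    return size


print("rows to generate:", get_sample_size())
```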

Performance Issues

Problem: Fraud detection latency too high

Check system resources (CPU/Memory usage)
Verify that no other intensive processes are running
Consider reducing ensemble complexity for testing

Problem: Sales forecasting taking too long

Reduce forecast horizon for testing
Use smaller dataset for initial exploration
Check Prophet installation (C++ dependencies)

IntegratedML Integration Issues

Problem: SQL model creation fails

Verify IntegratedML is properly installed
Check model class imports and paths
Ensure database connectivity and permissions

🎯 Next Steps

For ML Practitioners

Explore Custom Features: Study the feature engineering in Demo 1
Build Your Own Model: Follow Tutorial 4: Custom Models
Performance Optimization: Review Architecture Documentation

For IntegratedML Users

Production Deployment: Review Deployment Guide
Integration Patterns: Study Architecture Overview
API Reference: Explore API Documentation

For Data Scientists

Performance Analysis: Deep dive into Performance Benchmarks
Model Comparison: Run all three demos and compare approaches
Custom Algorithms: Adapt the frameworks for your specific use cases

For Open Source Contributors

Development Setup: Follow the contributor setup in main README
Contributing Guidelines: Review CONTRIBUTING.md
Issue Reporting: Use GitHub Issues for bugs and feature requests

🆘 Getting Help

📖 Documentation: Complete reference in docs/ directory
🐛 Issues: GitHub Issues
💬 Community: InterSystems Developer Community
✉️ Email: support@intersystems.com

🎉 You're all set! Choose your starting demo based on your experience level and dive into the world of IntegratedML Flexible Model Integration. Happy coding! 🚀
