
Probabilistic Model Specification

NimbusSDK provides pre-built probabilistic models (Bayesian LDA and Bayesian GMM) powered by RxInfer.jl, a reactive message passing framework for efficient Bayesian inference.
Current Status: NimbusSDK provides ready-to-use models (Bayesian LDA / RxLDA and Bayesian GMM / RxGMM) rather than a custom model specification language. This page documents the conceptual foundations and future directions.

Available Models

Bayesian LDA (RxLDA)

API Name: RxLDAModel
Mathematical Model: Pooled Gaussian Classifier (PGC)
A probabilistic classifier using Bayesian LDA with reactive message passing:
using NimbusSDK

# Load pre-trained model
model = load_model(RxLDAModel, "motor_imagery_4class_v1")

# Or train your own
trained_model = train_model(RxLDAModel, training_data; iterations=50)
Characteristics:
  • Shared covariance across all classes (Pooled Gaussian Classifier)
  • Fast inference: 10-20ms per trial
  • Best for: Well-separated classes (motor imagery, P300)
  • Mathematical model: p(x|c) = \mathcal{N}(x | \mu_c, \Sigma) (see the sketch below)
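
To make the pooled-covariance idea concrete, here is a minimal Julia sketch of how such a classifier scores a single feature vector. All names and values (mu, Sigma, prior) are illustrative, not part of the NimbusSDK API:

using LinearAlgebra

# Illustrative pooled Gaussian classifier: one mean per class, one shared covariance
mu = [randn(16) for _ in 1:4]         # class means μ_c (made-up values)
Sigma = Matrix{Float64}(I, 16, 16)    # shared covariance Σ
prior = fill(0.25, 4)                 # uniform prior p(c)

x = randn(16)                         # a single feature vector

# Log-likelihood of x under class c, up to a constant shared by all classes
loglik(c) = -0.5 * dot(x - mu[c], Sigma \ (x - mu[c]))

# Posterior p(c|x) ∝ p(c) · p(x|c), normalized over classes
unnorm = [prior[c] * exp(loglik(c)) for c in 1:4]
posterior = unnorm ./ sum(unnorm)

Because Σ is shared, the quadratic term in x is identical for every class and cancels in the posterior, leaving linear decision boundaries; this is what makes RxLDA fast.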

Bayesian GMM (RxGMM)

API Name: RxGMMModel
Mathematical Model: Heteroscedastic Gaussian Classifier (HGC)
A more flexible classifier with class-specific covariances:
using NimbusSDK

# Load pre-trained model
model = load_model(RxGMMModel, "p300_binary_v1")

# Or train your own
trained_model = train_model(RxGMMModel, training_data; iterations=50)
Characteristics:
  • Class-specific covariances (Heteroscedastic Gaussian Classifier)
  • Moderate inference: 15-25ms per trial
  • Best for: Overlapping classes, complex distributions
  • Mathematical model: p(x|c) = \mathcal{N}(x | \mu_c, \Sigma_c) (see the sketch below)
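
The only structural change from the RxLDA sketch above is that each class carries its own covariance, so the class-dependent normalizer \log|\Sigma_c| no longer cancels and the decision boundaries become quadratic. Again an illustrative sketch, not SDK API:

using LinearAlgebra

mu = [randn(16) for _ in 1:4]                                            # class means μ_c
Sigmas = [Matrix{Float64}(I, 16, 16) * s for s in (0.8, 1.0, 1.2, 1.5)]  # Σ_c per class
x = randn(16)

# Log-density now keeps the class-dependent term log|Σ_c|
loglik(c) = -0.5 * (logdet(Sigmas[c]) + dot(x - mu[c], Sigmas[c] \ (x - mu[c])))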

Model Architecture

Both Bayesian LDA and Bayesian GMM are built on factor graphs with reactive message passing:
        Prior                  Likelihood                Posterior
    p(class)  ──►  p(data|class), Gaussian  ──►  p(class|data)

Factor Graphs

Factor graphs represent the joint probability distribution:

p(\text{class}, \text{data}) = p(\text{class}) \cdot p(\text{data}|\text{class})

Components (a numeric example follows the list):
  1. Prior: p(\text{class}) - uniform or learned class probabilities
  2. Likelihood: p(\text{data}|\text{class}) - Gaussian distributions
  3. Posterior: p(\text{class}|\text{data}) - computed via message passing
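
As a concrete instance of this factorization, here is the prior-times-likelihood arithmetic in Julia; the numbers are made up:

prior = [0.25, 0.25, 0.25, 0.25]        # p(class), uniform over 4 classes
likelihood = [0.02, 0.10, 0.50, 0.08]   # p(data|class) evaluated at one trial

joint = prior .* likelihood             # p(class, data)
posterior = joint ./ sum(joint)         # p(class|data), normalized over classes
# posterior ≈ [0.029, 0.143, 0.714, 0.114]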

Reactive Message Passing

RxInfer.jl uses reactive programming for efficient inference:
Data Stream ──► Factor Graph ──► Message Passing ──► Posterior Updates
                                 (incremental updates)
Benefits (the incremental update is sketched after this list):
  • Incremental processing: Process data chunks as they arrive
  • Low latency: 10-25ms per chunk
  • Memory efficient: Constant memory usage
  • Real-time capable: Streaming inference without buffering
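
A minimal sketch of the incremental idea under a simplifying assumption (independent chunks, none of RxInfer's actual message-passing machinery): each chunk's likelihood updates the running posterior, which then serves as the prior for the next chunk, so memory use stays constant regardless of stream length:

# Recursive Bayesian update over streaming chunks (illustrative only)
posterior = fill(0.25, 4)                     # start from the prior p(class)

for chunk_likelihood in eachcol(rand(4, 10))  # 10 chunks of made-up likelihoods
    posterior .= posterior .* chunk_likelihood
    posterior ./= sum(posterior)              # renormalize after each chunk
end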

Training Process

Supervised Training

Both models learn from labeled training data:
using NimbusSDK

# Prepare labeled training data
features = randn(16, 250, 100)  # 16 features × 250 samples × 100 trials
labels = rand(1:4, 100)  # one of 4 class labels per trial

metadata = BCIMetadata(
    sampling_rate = 250.0,
    paradigm = :motor_imagery,
    feature_type = :csp,
    n_features = 16,
    n_classes = 4
)

training_data = BCIData(features, metadata, labels)

# Train model
model = train_model(
    RxLDAModel, 
    training_data; 
    iterations = 50,
    showprogress = true
)
What happens during training (a sketch follows this list):
  1. Initialize parameters: \mu_c, \Sigma (RxLDA) or \mu_c, \Sigma_c (RxGMM)
  2. E-step: Compute posterior probabilities p(c|x_i)
  3. M-step: Update model parameters to maximize likelihood
  4. Iterate: Repeat until convergence
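
A compact sketch of those four steps for the pooled-covariance case, written as a plain EM loop. This is for illustration only and is not NimbusSDK's RxInfer-based trainer; with labeled data the E-step responsibilities would be anchored to the labels, and the name fit_pgc is hypothetical:

using LinearAlgebra

# Illustrative EM for a pooled Gaussian classifier
function fit_pgc(X, K; iters = 50)
    d, n = size(X)
    mu = [randn(d) for _ in 1:K]            # 1. initialize class means μ_c
    Sigma = Matrix{Float64}(I, d, d)        #    and shared covariance Σ
    for _ in 1:iters                        # 4. iterate to convergence
        # 2. E-step: responsibilities r[c, i] ∝ N(x_i | μ_c, Σ), uniform prior
        r = [exp(-0.5 * dot(X[:, i] - mu[c], Sigma \ (X[:, i] - mu[c]))) for c in 1:K, i in 1:n]
        r ./= sum(r, dims = 1)
        # 3. M-step: re-estimate means and the shared covariance
        for c in 1:K
            mu[c] = (X * r[c, :]) ./ sum(r[c, :])
        end
        Sigma = sum(r[c, i] * (X[:, i] - mu[c]) * (X[:, i] - mu[c])' for c in 1:K, i in 1:n) / n
    end
    return mu, Sigma
end

mu, Sigma = fit_pgc(randn(16, 100), 4)      # 16 features × 100 trials, 4 classes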

Model Calibration

Adapt pre-trained models to individual users:
# Load baseline model
baseline_model = load_model(RxLDAModel, "motor_imagery_baseline_v1")

# Collect small calibration dataset from new user
calib_data = collect_calibration_data(num_trials = 20)

# Personalize model
personalized_model = calibrate_model(baseline_model, calib_data; iterations = 20)
Calibration updates (illustrated below):
  • Class means \mu_c to match user-specific patterns
  • Covariance \Sigma to reflect user variability
  • Prior probabilities p(c) based on user performance
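
One simple way to picture the mean update is a convex blend of baseline and user-estimated means, weighted by how much the calibration data is trusted. This scheme is illustrative; it is not necessarily what calibrate_model does internally:

mu_baseline = [randn(16) for _ in 1:4]   # from the pre-trained model (made-up)
mu_user     = [randn(16) for _ in 1:4]   # estimated from ~20 calibration trials
λ = 0.3                                  # weight on the user's data (hypothetical)

mu_personalized = [(1 - λ) .* mu_baseline[c] .+ λ .* mu_user[c] for c in 1:4]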

Inference Modes

Batch Inference

Process entire trials at once:
# Batch inference on test set
results = predict_batch(model, test_data; iterations=10)

# Returns full posterior distributions
posteriors = results.posteriors  # [n_classes × n_trials]
predictions = results.predictions  # [n_trials]
confidences = results.confidences  # [n_trials]
Use cases:
  • Offline analysis
  • Model validation (see the example after this list)
  • Performance benchmarking
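
For example, model validation can compare batch predictions against held-out labels; accessing labels as a field of test_data is an assumption here, mirroring how BCIData is constructed above:

using Statistics

# Accuracy over the held-out set
accuracy = mean(results.predictions .== test_data.labels)
println("Test accuracy: $(round(accuracy * 100; digits = 1))%")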

Streaming Inference

Process data incrementally in real-time:
# Initialize streaming session
session = init_streaming(model, metadata)

# Process chunks as they arrive
for chunk in eeg_stream
    result = process_chunk(session, chunk)
    # result contains: prediction, confidence, posterior
end

# Finalize with aggregation
final_result = finalize_trial(session; method=:weighted_vote)
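
A plausible reading of :weighted_vote, shown as an illustrative sketch rather than the SDK's internal implementation: average the per-chunk posteriors, weighting each chunk by its confidence.

# chunk_posteriors: n_classes × n_chunks, columns summing to 1 (made-up data)
chunk_posteriors = rand(4, 8)
chunk_posteriors ./= sum(chunk_posteriors, dims = 1)
chunk_confidences = vec(maximum(chunk_posteriors, dims = 1))  # per-chunk max posterior

weighted = chunk_posteriors * chunk_confidences   # confidence-weighted sum over chunks
final_posterior = weighted ./ sum(weighted)
final_prediction = argmax(final_posterior)
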
Use cases:
  • Real-time BCI control
  • Online neurofeedback
  • Adaptive systems

Uncertainty Quantification

Both models provide explicit uncertainty measures:

Confidence Scores

results = predict_batch(model, data)

# Confidence = max posterior probability
for (i, conf) in enumerate(results.confidences)
    if conf > 0.9
        println("Trial $i: High confidence")
    elseif conf > 0.7
        println("Trial $i: Medium confidence")
    else
        println("Trial $i: Low confidence - reject?")
    end
end

Posterior Distributions

# Full probability distribution over classes
posterior = results.posteriors[:, trial_idx]

# Example: [0.05, 0.12, 0.78, 0.05] for 4 classes
# Class 3 has highest probability (78%)

# Entropy as uncertainty measure
entropy = -sum(posterior .* log.(posterior .+ 1e-10))
# High entropy → high uncertainty

Model Limitations

Current Limitations of RxLDA/RxGMM:
  1. Gaussian assumptions: Assume features follow Gaussian distributions
  2. Static models: No built-in temporal dynamics (use preprocessing for temporal features)
  3. Supervised only: Require labeled training data
  4. Fixed structure: Cannot modify factor graph structure
  5. Linear decision boundaries (RxLDA): May struggle with complex, nonlinear patterns

Future Directions

Coming Soon

Advanced Models (Planned):
  • Hidden Markov Models (HMM): Temporal sequence modeling
  • Kalman Filters: Continuous state tracking
  • Custom Factor Graphs: User-defined probabilistic models
  • Hierarchical Models: Multi-level Bayesian models
  • Transfer Learning: Cross-subject model adaptation

Custom Model Development

Using RxInfer.jl Directly

Advanced users can build custom models using RxInfer.jl:
using RxInfer

# Define a custom factor graph
@model function custom_bci_model(y, σ²)
    # Latent variable with a standard normal prior
    x ~ NormalMeanVariance(0.0, 1.0)
    # Observation model: y is a noisy measurement of x
    y ~ NormalMeanVariance(x, σ²)
end

# A single made-up observation
observations = 0.73

# Perform inference; σ² is a fixed hyperparameter supplied at model
# construction, while y arrives through the data keyword
result = infer(
    model = custom_bci_model(σ² = 1.0),
    data  = (y = observations,),
    iterations = 50
)
Note: Custom RxInfer.jl models are not directly integrated with NimbusSDK’s training/inference API. Future versions may support custom model integration.

Model Comparison

Feature            RxLDA                    RxGMM
Covariance         Shared across classes    Class-specific
Flexibility        Lower                    Higher
Speed              Faster (10-20ms)         Slower (15-25ms)
Training Time      Fast                     Moderate
Parameters         Fewer (efficient)        More (flexible)
Best For           Well-separated classes   Overlapping distributions
Overfitting Risk   Lower                    Higher (more parameters)

Development Philosophy: NimbusSDK prioritizes battle-tested, production-ready models (RxLDA, RxGMM) over experimental architectures. Custom model specification may be added in future releases based on user demand.