Bayesian GMM (RxGMM) - Bayesian Gaussian Mixture Model
API Name: RxGMMModel
Mathematical Model: Heteroscedastic Gaussian Classifier (HGC)

Bayesian GMM (known as RxGMM in the codebase) is a Bayesian classification model with class-specific covariance matrices, making it more flexible than Bayesian LDA for modeling complex class distributions. It is implemented using RxInfer.jl's reactive message passing.
Bayesian GMM (RxGMM) is currently implemented in NimbusSDK.jl and ready for production BCI applications. The GMM family is widely recognized in machine learning, and "Bayesian" signals that the model provides uncertainty quantification and posterior probability outputs.
Overview
Bayesian GMM extends beyond traditional Gaussian classifiers by allowing each class to have its own covariance structure:
- ✅ Class-specific covariances (unlike RxLDA's shared covariance)
- ✅ More flexible modeling of heterogeneous distributions
- ✅ Posterior probability distributions with uncertainty quantification
- ✅ Fast inference (~15-25ms per trial)
- ✅ Training and calibration support
- ✅ Batch and streaming inference modes
When to Use Bayesian GMM
Bayesian GMM is ideal for:
- Complex, overlapping class distributions
- Classes with significantly different variances
- P300 detection (target/non-target with different spreads)
- When Bayesian LDA accuracy is unsatisfactory
- When you need maximum flexibility
Prefer Bayesian LDA instead when:
- Classes are well-separated and have similar spreads
- Speed is critical (Bayesian LDA is faster)
- Training data is limited (Bayesian LDA needs less data)
- Memory is constrained (Bayesian LDA uses less memory)
Model Architecture
Mathematical Foundation (Heteroscedastic Gaussian Classifier)
Bayesian GMM implements a Heteroscedastic Gaussian Classifier (HGC), which models class-conditional distributions with class-specific precision matrices:
- μ_k = mean vector for class k
- W_k = class-specific precision matrix (different for each class)
- Allows a different covariance structure per class
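Written out, each class k gets its own Gaussian class-conditional density (the covariance is the inverse of the class precision), and classification follows from Bayes' rule; here π_k denotes the prior class probability:

```math
p(\mathbf{x} \mid c = k) = \mathcal{N}\left(\mathbf{x} \mid \boldsymbol{\mu}_k, \mathbf{W}_k^{-1}\right),
\qquad
p(c = k \mid \mathbf{x}) \propto \pi_k \, \mathcal{N}\left(\mathbf{x} \mid \boldsymbol{\mu}_k, \mathbf{W}_k^{-1}\right)
```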
Model Structure
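A sketch of the generative structure under standard conjugate priors (the hyperparameters m_0, V_0, ν_0, S_0 are illustrative; the exact values used by NimbusSDK.jl are not shown on this page):

```math
\begin{aligned}
\boldsymbol{\mu}_k &\sim \mathcal{N}(\mathbf{m}_0, \mathbf{V}_0^{-1}) && \text{prior over each class mean} \\
\mathbf{W}_k &\sim \mathcal{W}(\nu_0, \mathbf{S}_0) && \text{Wishart prior over each class precision} \\
\mathbf{x}_i \mid c_i = k &\sim \mathcal{N}(\boldsymbol{\mu}_k, \mathbf{W}_k^{-1}) && \text{class-conditional likelihood}
\end{aligned}
```

The Wishart prior keeps each W_k positive definite and makes the mean-precision updates conjugate, which is what lets variational message passing stay in closed form.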
RxInfer Implementation
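A minimal sketch of what the training-time model can look like in RxInfer.jl, assuming a two-class problem; the actual model under /src/models/rxgmm/ in NimbusSDK.jl may differ in priors, structure, and naming:

```julia
using RxInfer, LinearAlgebra

# Sketch of a two-class heteroscedastic Gaussian model (HGC).
# Priors and names are illustrative, not NimbusSDK's actual definitions.
@model function hgc_two_class(x1, x2, d)
    # Class-specific mean and precision priors (conjugate choices)
    μ1 ~ MvNormal(mean = zeros(d), precision = diageye(d))
    W1 ~ Wishart(d + 2, diageye(d))
    μ2 ~ MvNormal(mean = zeros(d), precision = diageye(d))
    W2 ~ Wishart(d + 2, diageye(d))

    # Each class keeps its own precision matrix -- the defining
    # difference from RxLDA's single shared precision
    for i in eachindex(x1)
        x1[i] ~ MvNormal(mean = μ1, precision = W1)
    end
    for i in eachindex(x2)
        x2[i] ~ MvNormal(mean = μ2, precision = W2)
    end
end

# Mean-field factorization between each mean and its precision,
# required for variational message passing on this graph
constraints = @constraints begin
    q(μ1, W1) = q(μ1)q(W1)
    q(μ2, W2) = q(μ2)q(W2)
end

# Initial marginals for the precisions (dimension fixed to 2 here)
init = @initialization begin
    q(W1) = Wishart(4, diageye(2))
    q(W2) = Wishart(4, diageye(2))
end

# Placeholder training data: 80 trials per class of 2-D features
class1_trials = [randn(2) for _ in 1:80]
class2_trials = [randn(2) .+ [2.0, -1.0] for _ in 1:80]

result = infer(
    model          = hgc_two_class(d = 2),
    data           = (x1 = class1_trials, x2 = class2_trials),
    constraints    = constraints,
    initialization = init,
    iterations     = 50,
)
```

After inference, `result.posteriors[:μ1]`, `result.posteriors[:W1]`, and so on hold the variational marginals for each class's parameters.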
During the learning phase, variational message passing infers posteriors over every class mean μ_k and precision W_k jointly from labeled training data.
Usage
1. Load Pre-trained Model
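A sketch under assumed names: `RxGMMModel` is the documented API type, but the loading function and model identifier below are hypothetical, so check the Julia SDK reference for the exact call:

```julia
using NimbusSDK

# Hypothetical function and model id -- the documented type is RxGMMModel,
# but load_model and "p300-rxgmm-v1" are placeholders for illustration.
model = load_model(RxGMMModel, "p300-rxgmm-v1")
```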
2. Train Custom Model
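A hedged sketch of training; the keyword names mirror the documented options listed just below, while the `train` entry point and the data layout are assumptions:

```julia
using NimbusSDK

# Placeholder data: 160 trials of 4-dimensional features, labels in {1, 2}.
features = [randn(4) for _ in 1:160]
labels   = repeat([1, 2], 80)

# `train` is a hypothetical entry point; the keywords mirror the
# documented options listed below.
model = train(RxGMMModel, features, labels;
    iterations   = 50,    # variational inference iterations (default: 50)
    showprogress = true,  # progress bar during training
    name         = "mi-gmm-s01",                 # hypothetical identifier
    description  = "Motor imagery, subject 01")
```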
Keyword parameters:
- iterations: number of variational inference iterations (default: 50). More iterations mean better convergence; the typical range is 50-100.
- showprogress: display a progress bar during training
- name: model identifier
- description: model description
3. Subject-Specific Calibration
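A sketch, assuming a `calibrate` helper that warm-starts from a pre-trained model's posteriors; the iteration count follows the calibration figures on this page (about 20 iterations, 10-20 trials per class):

```julia
# Hypothetical calibration call: adapt a pre-trained model to a new subject
# using a short calibration session (10-20 trials per class).
calibrated = calibrate(model, calib_features, calib_labels;
    iterations = 20)   # fewer iterations than full training
```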
4. Batch Inference
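A sketch, assuming a `predict` function that returns one vector of class probabilities per trial (both the function name and return shape are assumptions):

```julia
# Hypothetical batch prediction: one class-probability vector per trial.
posteriors = predict(calibrated, test_features; iterations = 10)

for p in posteriors
    # Each posterior carries full class probabilities, not just a hard label,
    # so downstream code can act on uncertainty (e.g. reject unsure trials).
    println("class = ", argmax(p), "   confidence = ", maximum(p))
end
```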
5. Streaming Inference
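A sketch of a streaming loop under the same assumed `predict` API; `feature_stream` and `dispatch_command` are hypothetical placeholders for your acquisition pipeline and downstream handler:

```julia
# Hypothetical streaming loop: classify each incoming feature chunk as it
# arrives (roughly 15-25 ms per chunk, per the performance table below).
for chunk in feature_stream              # any iterable of feature vectors
    p = only(predict(calibrated, [chunk]; iterations = 10))
    if maximum(p) > 0.8                  # act only on confident predictions
        dispatch_command(argmax(p))      # hypothetical downstream handler
    end
end
```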
Training Requirements
Data Requirements
- Minimum: 40 trials per class (80 total for 2-class)
- Recommended: 80+ trials per class
- For calibration: 10-20 trials per class
Bayesian GMM requires at least 2 observations per class to estimate class-specific statistics. Training will fail if any class has fewer than 2 observations.
Feature Requirements
Bayesian GMM expects preprocessed features, not raw EEG.
✅ Required preprocessing:
- Bandpass filtering (paradigm-specific)
- Artifact removal
- Feature extraction (CSP, ERP amplitude, bandpower, etc.)
- Proper temporal aggregation
❌ Not accepted:
- Raw EEG channels
- Unfiltered data
Performance Characteristics
Computational Performance
| Operation | Latency | Notes |
|---|---|---|
| Training | 15-40 seconds | 50 iterations, 100 trials per class |
| Calibration | 8-20 seconds | 20 iterations, 20 trials per class |
| Batch Inference | 15-25ms per trial | 10 iterations |
| Streaming Chunk | 15-25ms | 10 iterations per chunk |
Classification Accuracy
| Paradigm | Classes | Typical Accuracy | When to Use Bayesian GMM |
|---|---|---|---|
| P300 | 2 (Target/Non-target) | 85-95% | Target/non-target have different variances |
| Motor Imagery | 2-4 | 70-85% | When Bayesian LDA accuracy insufficient |
| SSVEP | 2-6 | 85-98% | Complex frequency responses |
Bayesian GMM typically provides 2-5% higher accuracy than Bayesian LDA when class covariances differ significantly, at the cost of ~5-10ms additional latency.
Model Inspection
View Model Parameters
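A sketch, assuming the fitted model exposes its posterior means and precisions as fields; the field names `μ` and `W` are hypothetical, so inspect your model object for the actual accessors:

```julia
# Field names μ and W are hypothetical placeholders.
for k in eachindex(model.μ)
    println("Class $k mean:       ", model.μ[k])        # posterior mean vector
    println("Class $k covariance: ", inv(model.W[k]))   # invert the precision
end
```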
Visualize Class Differences
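A sketch comparing the two classes' covariance structure side by side, assuming Plots.jl is installed and the same hypothetical field names as above:

```julia
using Plots

# Heatmaps of the class covariances make heteroscedasticity visible at a
# glance; clearly different patterns justify RxGMM over RxLDA.
Σ1 = inv(model.W[1])
Σ2 = inv(model.W[2])
plot(
    heatmap(Σ1, title = "Class 1 covariance"),
    heatmap(Σ2, title = "Class 2 covariance"),
    layout = (1, 2),
)
```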
Advantages & Limitations
Advantages
✅ Flexible Modeling: Each class has its own covariance
✅ Better for Complex Data: Handles heterogeneous distributions
✅ Higher Accuracy: 2-5% improvement when classes differ significantly
✅ Uncertainty Quantification: Full Bayesian posteriors
✅ Production-Ready: Battle-tested in P300 applications
Limitations
❌ More Parameters: Requires more training data than RxLDA
❌ Slower Inference: ~15-25ms vs ~10-15ms for RxLDA
❌ Higher Memory: Stores n_classes precision matrices
❌ More Complex: Longer training time
Comparison: Bayesian GMM vs Bayesian LDA
| Aspect | Bayesian GMM (RxGMM) | Bayesian LDA (RxLDA) |
|---|---|---|
| Precision Matrix | Class-specific | Shared (one for all) |
| Mathematical Model | Heteroscedastic Gaussian Classifier (HGC) | Pooled Gaussian Classifier (PGC) |
| Training Speed | Slower | Faster |
| Inference Speed | 15-25ms | 10-15ms |
| Flexibility | High | Moderate |
| Data Requirements | More | Less |
| Memory Usage | Higher | Lower |
| Best For | Heterogeneous classes | Homogeneous classes |
| Accuracy Gain | +2-5% (when applicable) | Baseline |
Decision Tree
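One way to compress the comparison above into a quick decision aid:

```
Do the classes plausibly have different covariance structures?
├── No / unsure, or data, speed, or memory is tight  → Bayesian LDA (RxLDA)
└── Yes, or Bayesian LDA accuracy is unsatisfactory
    ├── Enough data (≥40 trials per class)? → Bayesian GMM (RxGMM)
    └── Not enough data → collect more, or stay with Bayesian LDA
```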
Practical Examples
P300 Detection
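An end-to-end sketch using the same hypothetical helper names as the Usage sections above. P300 target and non-target responses often have different spreads, which is exactly the case class-specific covariances are built for:

```julia
# p300_features / p300_labels are placeholders for your preprocessed
# ERP features and target/non-target labels.
model = train(RxGMMModel, p300_features, p300_labels;
    iterations = 50, name = "p300-gmm")

probs = predict(model, epoch_features; iterations = 10)

# Pick the epoch most likely to contain the target response.
# (Class 1 = "target" is an assumed convention for this sketch.)
target_idx = argmax([p[1] for p in probs])
```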
When Bayesian LDA Fails
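A sketch of a side-by-side check: train both models on the same features and compare held-out accuracy. `RxLDAModel` is an assumed type name mirroring `RxGMMModel`, and `predict` is the hypothetical API from above:

```julia
using Statistics

# If validation accuracy with Bayesian LDA is unsatisfactory, retrain
# with Bayesian GMM on identical features and compare directly.
lda = train(RxLDAModel, features, labels; iterations = 50)   # assumed type
gmm = train(RxGMMModel, features, labels; iterations = 50)

acc(m) = mean(argmax.(predict(m, val_features; iterations = 10)) .== val_labels)
println("Bayesian LDA: ", acc(lda), "   Bayesian GMM: ", acc(gmm))
```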
Next Steps
- Bayesian LDA (RxLDA): faster model with shared covariance
- Training Guide: complete training tutorial
- Julia SDK: full SDK reference
- Code Examples: working examples
References
Implementation:
- RxInfer.jl: https://rxinfer.com/
- Source code: /src/models/rxgmm/ in NimbusSDK.jl
Theory:
- McLachlan, G. J., & Peel, D. (2000). "Finite Mixture Models"
- Bishop, C. M. (2006). "Pattern Recognition and Machine Learning" (Chapter 9)
- Heteroscedastic Gaussian Classifier (HGC) with class-specific covariances
BCI applications:
- Farwell, L. A., & Donchin, E. (1988). "Talking off the top of your head: Toward a mental prosthesis utilizing event-related brain potentials"
- Lotte et al. (2018). "A review of classification algorithms for EEG-based brain-computer interfaces: A 10 year update"