Active Learning
Use active learning to collect labels for the trials that are most likely to improve your BCI model. The Python SDK exposes stateless helpers for pool-based calibration, streaming label requests, and label-free stopping.

Active learning operates on preprocessed feature rows, not raw EEG. Use arrays shaped (n_trials, n_features) for pools and (n_features,) or (1, n_features) for single streaming trials.

Core Workflow
Start with a small seed calibration set, rank an unlabeled feature pool, collect labels for the most informative trials, update with partial_fit(), and stop when the posterior stops changing.
Pool-Based Trial Ranking
Use suggest_next_trial() when you have an unlabeled pool of candidate feature rows and want the top-n trials to label next.
suggest_next_trial() accepts either a fitted Nimbus classifier (NimbusLDA, NimbusQDA, NimbusSoftmax, NimbusSTS) or a raw NimbusModel snapshot. It returns a QueryResult dataclass with:
- indices: top-n indices into X_pool.
- scores: raw informativeness score for every row in X_pool.
- strategy: the strategy used.
- n_posterior_samples: posterior samples used for the score (1 for cheap strategies).
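As a rough illustration of what pool ranking computes with strategy="entropy", here is a self-contained numpy sketch. It is not the SDK implementation; rank_by_entropy and the toy probabilities are invented for illustration, and the real helper works from a fitted classifier rather than a precomputed probability matrix.

```python
import numpy as np

def rank_by_entropy(proba, n=2):
    """Score each pool row by predictive entropy (bits) and return the
    indices of the n most informative rows, mimicking what pool-based
    ranking conceptually does under the entropy strategy."""
    eps = 1e-12
    scores = -np.sum(proba * np.log2(proba + eps), axis=1)
    indices = np.argsort(scores)[::-1][:n]  # higher entropy = more informative
    return indices, scores

# Toy predict_proba output for a 4-trial pool with 2 classes.
proba = np.array([
    [0.50, 0.50],   # maximally uncertain -> 1 bit
    [0.90, 0.10],
    [0.55, 0.45],
    [0.99, 0.01],
])
indices, scores = rank_by_entropy(proba, n=2)
# indices -> [0, 2]: the two most ambiguous trials are queried first
```

Note that scores covers every row in the pool, matching the QueryResult contract, while indices keeps only the top-n.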
strategy="bald" is supported for NimbusLDA, NimbusQDA, and NimbusSoftmax. It is not supported for NimbusSTS in this release because STS posterior sampling needs temporal-coupling support.

Streaming Query Gate
Use should_query() when a single trial arrives during a live session and you need to decide whether asking for a label is worth the calibration cost.
should_query() returns a StreamingQueryDecision dataclass with:
- should_query: whether the score crossed the threshold.
- score: raw informativeness score.
- threshold: threshold used for the decision.
- strategy: strategy used.
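The gate logic for the entropy strategy can be sketched in a few lines of numpy. This is an illustrative stand-in, not SDK code: entropy_gate and the threshold value are invented for the example.

```python
import numpy as np

def entropy_gate(proba_row, threshold):
    """Decide whether to request a label for one streaming trial:
    query when predictive entropy (bits) reaches the threshold."""
    eps = 1e-12
    score = float(-np.sum(proba_row * np.log2(proba_row + eps)))
    return score >= threshold, score

# A confident trial is skipped; an ambiguous one triggers a label request.
skip, s1 = entropy_gate(np.array([0.95, 0.05]), threshold=0.5)  # s1 ~ 0.29 bits
ask, s2 = entropy_gate(np.array([0.55, 0.45]), threshold=0.5)   # s2 ~ 0.99 bits
```

Because entropy is bounded by log2(n_classes), a fixed threshold like 0.5 bits has a stable meaning across sessions for a given class count.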
Stopping Calibration
Use calibration_sufficient() to stop collecting labels once additional labels are unlikely to change predictions over the pool.
calibration_sufficient() returns a CalibrationStatus dataclass with:
- is_sufficient: True when the criterion signal is below the threshold.
- signal: mean total variation for posterior_stability, or mean BALD for expected_info_gain.
- threshold: threshold used for the comparison.
- criterion: criterion used.
- details: extra diagnostic values such as max/min TV or BALD.
Stopping Criteria
posterior_stability compares two consecutive model snapshots over the same X_pool. It measures the mean total-variation distance between predict_proba outputs and works for every Nimbus head, including NimbusSTS.
expected_info_gain measures mean BALD over the current pool. It does not require a previous snapshot, and it is available for NimbusLDA, NimbusQDA, and NimbusSoftmax.
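The two stopping signals reduce to simple expressions over probability arrays. The sketch below shows both, using hypothetical helper names (mean_total_variation, mean_bald) rather than SDK functions; the shapes and formulas follow the descriptions above.

```python
import numpy as np

def mean_total_variation(proba_prev, proba_curr):
    """posterior_stability signal: mean total-variation distance between
    the predict_proba outputs of two consecutive snapshots on one pool."""
    return float(0.5 * np.abs(proba_prev - proba_curr).sum(axis=1).mean())

def mean_bald(sample_proba):
    """expected_info_gain signal: mean BALD (bits) over the pool, given
    per-posterior-sample class probabilities shaped
    (n_samples, n_trials, n_classes)."""
    eps = 1e-12
    mean_p = sample_proba.mean(axis=0)
    entropy_of_mean = -np.sum(mean_p * np.log2(mean_p + eps), axis=1)
    mean_of_entropy = -np.sum(
        sample_proba * np.log2(sample_proba + eps), axis=2
    ).mean(axis=0)
    # BALD = H(mean prediction) - mean per-sample H: epistemic uncertainty.
    return float((entropy_of_mean - mean_of_entropy).mean())

prev = np.array([[0.60, 0.40], [0.80, 0.20]])
curr = np.array([[0.62, 0.38], [0.79, 0.21]])
tv = mean_total_variation(prev, curr)  # 0.015: under a 0.02 threshold
```

When posterior samples all agree, mean_bald goes to zero even if predictions are uncertain; that is why it tracks epistemic rather than total uncertainty.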
Strategy Guide
| Strategy | Query direction | Units | Works with STS? | Best use |
|---|---|---|---|---|
| entropy | Higher is more informative | Bits, [0, log2(n_classes)] | Yes | Fast default for uncertain predictions |
| margin | Lower is more informative | Probability gap, [0, 1] | Yes | Top-1 vs top-2 ambiguity |
| least_confidence | Higher is more informative | 1 - max(p) | Yes | Simple max-confidence thresholding |
| bald | Higher is more informative | Bits, [0, log2(n_classes)] | No | Epistemic uncertainty from posterior samples |
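The margin and least_confidence rows of the table can be computed directly from a predict_proba matrix. A minimal sketch, with toy probabilities and no SDK calls; note the opposite query directions of the two scores:

```python
import numpy as np

proba = np.array([
    [0.70, 0.20, 0.10],  # fairly confident trial
    [0.40, 0.35, 0.25],  # ambiguous top-1 vs top-2
])

# margin: gap between the top-1 and top-2 probabilities; LOWER = more informative.
top2 = np.sort(proba, axis=1)[:, -2:]
margin = top2[:, 1] - top2[:, 0]            # ~ [0.50, 0.05]

# least_confidence: 1 - max(p); HIGHER = more informative.
least_confidence = 1.0 - proba.max(axis=1)  # ~ [0.30, 0.60]
```

Both scores rank the second trial as more informative here, but margin focuses on top-1 vs top-2 ambiguity while least_confidence only looks at the winning class.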
Practical Defaults
- Start with strategy="bald" for pool-based calibration when using NimbusLDA, NimbusQDA, or NimbusSoftmax.
- Use num_posterior_samples=256 for BALD ranking stability. Lower values can be faster but noisier.
- Use strategy="entropy" for streaming should_query() gates.
- Use criterion="posterior_stability" for label-free stopping, with a threshold near 0.02 as an initial tuning point.
- Keep X_pool fixed across a calibration round so scores and stability checks are comparable.
Next Read
Python API Reference
Function signatures and dataclass fields for active learning.
Streaming Inference
Combine real-time prediction with query gates and feedback.
sklearn Integration
Use Nimbus classifiers inside sklearn workflows.
Model Selection
Choose the right Bayesian head before calibration.