NimbusSoftmax — Bayesian Softmax Classifier

Python: NimbusSoftmax | Julia equivalent: NimbusProbit
Mathematical model: Bayesian multinomial logistic regression with Pólya-Gamma variational inference
NimbusSoftmax is the Python SDK’s flexible non-Gaussian static classifier. Use it when NimbusLDA and NimbusQDA are too restrictive, but you still want posterior uncertainty, sklearn compatibility, and active-learning support.
Availability
  • Python SDK: ✅ NimbusSoftmax via the optional softmax extra
  • Julia SDK: ❌ Use NimbusProbit for Julia’s non-Gaussian static classifier

Install

NimbusSoftmax depends on the optional JAX-based softmax backend:
pip install "nimbus-bci[softmax]"

Quick Start

from nimbus_bci import NimbusSoftmax

clf = NimbusSoftmax(
    w_scale=1.0,
    num_steps=50,
    num_posterior_samples=50,
)

clf.fit(X_train, y_train)

predictions = clf.predict(X_test)
probabilities = clf.predict_proba(X_test)
NimbusSoftmax expects preprocessed feature rows shaped (n_trials, n_features), not raw EEG. Use CSP, ERP amplitude, bandpower, or your own feature extraction before fitting.
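
As an illustration, here is a minimal bandpower feature extractor built on SciPy. It is a sketch, not part of nimbus-bci: the raw_train array, its (n_trials, n_channels, n_samples) shape, and the band choices are all assumptions; substitute your own pipeline.
import numpy as np
from scipy.signal import welch

def bandpower_features(eeg, fs, bands=((8, 12), (12, 30))):
    # eeg: assumed raw array shaped (n_trials, n_channels, n_samples).
    freqs, psd = welch(eeg, fs=fs, nperseg=min(256, eeg.shape[-1]), axis=-1)
    feats = []
    for lo, hi in bands:
        mask = (freqs >= lo) & (freqs < hi)
        # Sum PSD bins in the band; the log makes features closer to Gaussian.
        feats.append(np.log(psd[..., mask].sum(axis=-1) + 1e-12))
    return np.concatenate(feats, axis=-1)  # (n_trials, n_channels * n_bands)

X_train = bandpower_features(raw_train, fs=250)  # raw_train is hypothetical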

When to Use NimbusSoftmax

  • You are using the Python SDK and need a flexible static classifier.
  • Class boundaries are non-Gaussian or not well represented by class-conditional means/covariances.
  • You need uncertainty-aware outputs for rejection policies, active learning, or calibration analysis.
  • NimbusLDA / NimbusQDA accuracy has plateaued on a complex multinomial task.

When Not to Use It

  • If latency is the top priority: start with NimbusLDA, then NimbusQDA.
  • If class centers and Mahalanobis distance are important for interpretability or outlier diagnostics: use NimbusLDA or NimbusQDA.
  • If the session is drifting over time: use NimbusSTS.
  • If you are using Julia: use NimbusProbit.

Model Architecture

NimbusSoftmax fits a Bayesian multinomial logistic regression model. It uses a reference-class parameterization: one class's logits are fixed at zero, and the remaining classes are modeled relative to it.
beta_k ~ Normal(beta_mean_k, beta_cov_k)
logits = X_aug @ beta_mean.T
p(y = k | x) = softmax(logits)_k
The fitted model stores posterior Gaussian approximations for non-reference class weights. Predictions can draw posterior samples to quantify uncertainty.
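
To make the reference-class parameterization concrete, here is an illustrative NumPy sketch of how the point-estimate probabilities above could be assembled. beta_mean and ref_class mirror the fitted params shown under Model Inspection; the function itself is not the SDK's implementation, and X_aug is assumed to be the bias-augmented feature matrix.
import numpy as np

def softmax_probs(X_aug, beta_mean, ref_class):
    logits = X_aug @ beta_mean.T  # one column per non-reference class
    # Insert a column of zeros for the reference class, then normalize.
    full = np.insert(logits, ref_class, 0.0, axis=1)
    full -= full.max(axis=1, keepdims=True)  # numerical stability
    expl = np.exp(full)
    return expl / expl.sum(axis=1, keepdims=True)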

Hyperparameters

Parameter             | Default | Description
w_loc                 | 0.0     | Prior mean for feature weights
w_scale               | 1.0     | Prior scale for feature weights
b_loc                 | 0.0     | Prior mean for bias terms
b_scale               | 1.0     | Prior scale for bias terms
learning_rate         | 0.2     | Damping factor for variational updates
num_steps             | 50      | Number of coordinate-ascent update sweeps
num_posterior_samples | 50      | Posterior samples used for prediction
rng_seed              | 0       | Random seed for reproducible posterior sampling
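
The prior scales act as regularizers: shrinking w_scale pulls the weight posterior toward w_loc, which can stabilize fits on small calibration sets. A sketch of two plausible settings (illustrative values, not tuned recommendations):
from nimbus_bci import NimbusSoftmax

# Tighter prior and more sweeps for a small calibration set.
clf_small_data = NimbusSoftmax(w_scale=0.5, num_steps=100)

# Fewer posterior samples when prediction latency matters more than
# fine-grained uncertainty estimates.
clf_low_latency = NimbusSoftmax(num_posterior_samples=10)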

Usage

Train and Predict

from nimbus_bci import NimbusSoftmax

clf = NimbusSoftmax(w_scale=1.0, num_steps=50)
clf.fit(X_train, y_train)

probs = clf.predict_proba(X_test)
preds = clf.predict(X_test)

Tune with sklearn

from sklearn.model_selection import GridSearchCV, train_test_split
from nimbus_bci import NimbusSoftmax

param_grid = {
    "w_scale": [0.5, 1.0, 2.0],
    "num_steps": [50, 100],
}

grid = GridSearchCV(
    NimbusSoftmax(),
    param_grid,
    cv=5,
    scoring="accuracy",
)
grid.fit(X_train, y_train)

print(grid.best_params_)
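
Because GridSearchCV refits the best configuration on all of X_train by default (refit=True), the tuned model is ready to use directly:
best_clf = grid.best_estimator_  # already refit on the full training set
test_probs = best_clf.predict_proba(X_test)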

Online Updates

clf = NimbusSoftmax()
clf.fit(X_seed, y_seed)

for X_new, y_new in calibration_batches:
    clf.partial_fit(X_new, y_new)
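
In a live session the same pattern interleaves prediction and updating. A sketch, where session_trials is a placeholder for your own acquisition loop:
for X_trial, y_trial in session_trials:  # hypothetical stream of labeled trials
    probs = clf.predict_proba(X_trial)   # act on the prediction first
    clf.partial_fit(X_trial, y_trial)    # then fold in the label once it is known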

Active Learning

NimbusSoftmax supports BALD (Bayesian Active Learning by Disagreement) through posterior predictive samples, so it can drive label-efficient calibration loops.
from nimbus_bci.active_learning import calibration_sufficient, suggest_next_trial

previous = clf.get_model()

ranked = suggest_next_trial(
    clf,
    X_pool,
    strategy="bald",
    n=4,
    num_posterior_samples=256,
)

X_new, y_new = collect_labels_for(ranked.indices)
clf.partial_fit(X_new, y_new)

status = calibration_sufficient(
    clf,
    X_pool,
    criterion="posterior_stability",
    previous=previous,
    threshold=0.02,
)
See Active Learning for the full calibration workflow.
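
For intuition, BALD scores each candidate by the mutual information between its predicted label and the model weights: the entropy of the mean prediction minus the mean entropy of the per-draw predictions. suggest_next_trial handles this internally; the sketch below assumes you already have posterior predictive draws shaped (n_samples, n_pool, n_classes).
import numpy as np

def bald_scores(probs_samples, eps=1e-12):
    # probs_samples: (n_samples, n_pool, n_classes) posterior predictive draws.
    mean_p = probs_samples.mean(axis=0)
    # Total predictive uncertainty: entropy of the averaged prediction.
    h_mean = -(mean_p * np.log(mean_p + eps)).sum(axis=-1)
    # Expected aleatoric uncertainty: average entropy of each draw.
    h_draws = -(probs_samples * np.log(probs_samples + eps)).sum(axis=-1).mean(axis=0)
    return h_mean - h_draws  # high score = high epistemic uncertainty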

Training Requirements

  • Minimum: 2 observations.
  • Recommended: 40+ trials per class for stable estimates.
  • Feature normalization: strongly recommended for cross-session stability.
  • Input shape: (n_trials, n_features).
from sklearn.preprocessing import StandardScaler
from nimbus_bci import NimbusSoftmax

scaler = StandardScaler()
X_train_norm = scaler.fit_transform(X_train)
X_test_norm = scaler.transform(X_test)

clf = NimbusSoftmax()
clf.fit(X_train_norm, y_train)
predictions = clf.predict(X_test_norm)
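
The scaler's statistics are part of the model's input contract, so persist the two together. A sketch using joblib (generic Python serialization, not a nimbus-bci API):
import joblib

joblib.dump({"scaler": scaler, "clf": clf}, "session_model.joblib")

bundle = joblib.load("session_model.joblib")
preds = bundle["clf"].predict(bundle["scaler"].transform(X_test))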

Performance Characteristics

Operation       | Typical cost        | Notes
Training        | Moderate            | More expensive than NimbusLDA / NimbusQDA
Batch inference | ~15–25 ms per trial | Depends on posterior sample count
Streaming chunk | ~15–25 ms           | Use lower sample counts if latency is binding
NimbusSoftmax is usually slower than NimbusLDA and NimbusQDA, but can improve accuracy when class boundaries are not well represented by Gaussian class-conditionals.
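
If streaming latency is binding, the main lever is num_posterior_samples. A quick way to measure per-trial cost on your own hardware (the timings above are indicative only):
import time
from nimbus_bci import NimbusSoftmax

clf_fast = NimbusSoftmax(num_posterior_samples=10)  # fewer draws, faster inference
clf_fast.fit(X_train, y_train)

start = time.perf_counter()
clf_fast.predict_proba(X_test)
per_trial_ms = 1000 * (time.perf_counter() - start) / len(X_test)
print(f"~{per_trial_ms:.1f} ms per trial")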

Model Inspection

params = clf.model_.params

print("Posterior weight means:")
print(params["beta_mean"])

print("Posterior covariance Cholesky factors:")
print(params["beta_cov_chol"].shape)

print("Reference class:")
print(params["ref_class"])
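
Since beta_cov_chol stores Cholesky factors, the full posterior covariance for a class is one matrix product away. This is plain linear algebra, not a dedicated API, and the leading-axis-per-class indexing is an assumption:
import numpy as np

L = params["beta_cov_chol"][0]  # factor for the first non-reference class (assumed layout)
cov = L @ L.T                   # posterior covariance: Sigma = L @ L.T
std = np.sqrt(np.diag(cov))     # per-weight posterior standard deviations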

Model Selection Context

Use NimbusSoftmax when you are in Python and need a non-Gaussian static classifier with posterior sampling support. If you need explicit class centers or Mahalanobis diagnostics, use NimbusLDA or NimbusQDA. If the session drifts over time, use NimbusSTS. For the canonical side-by-side comparison, see Model Specification.

Next Read

NimbusProbit (Julia)

Julia’s non-Gaussian static classifier.

Python API Reference

Full NimbusSoftmax constructor and method reference.

Active Learning

Use posterior samples to reduce the number of calibration labels needed.

Model Selection

Compare Nimbus model families.

References

Implementation:
  • Python source code: nimbus_bci/models/nimbus_softmax/ in nimbus-bci
Theory:
  • Polson, N. G., Scott, J. G., & Windle, J. (2013). “Bayesian inference for logistic models using Pólya-Gamma latent variables.” Journal of the American Statistical Association, 108(504), 1339–1349.
  • Windle, J., Polson, N. G., & Scott, J. G. (2014). “Sampling Pólya-Gamma random variates: alternative and approximate techniques.” arXiv:1405.0506.
  • NimbusSoftmax implements Bayesian multinomial logistic regression with Pólya-Gamma augmentation and variational inference.