# NimbusSoftmax (Python)

> NimbusSoftmax for Python: Bayesian multinomial logistic regression with Polya-Gamma variational inference for non-Gaussian BCI decision boundaries.

# NimbusSoftmax: Bayesian Softmax Classifier

**Python**: `NimbusSoftmax` | **Julia equivalent**: [`NimbusProbit`](/models/nimbusprobit)
**Mathematical model**: Bayesian multinomial logistic regression with Polya-Gamma variational inference

`NimbusSoftmax` is the Python SDK's flexible non-Gaussian static classifier. Use it when `NimbusLDA` and `NimbusQDA` are too restrictive, but you still want posterior uncertainty, sklearn compatibility, and active-learning support.

**Availability**

* **Python SDK**: ✅ `NimbusSoftmax` via the optional `softmax` extra
* **Julia SDK**: ❌ Use [`NimbusProbit`](/models/nimbusprobit) for Julia's non-Gaussian static classifier

## Install

`NimbusSoftmax` depends on the optional JAX-based softmax backend:

```bash
# Quoting the requirement keeps shells such as zsh from expanding the brackets
pip install "nimbus-bci[softmax]"
```

## Quick Start

```python
from nimbus_bci import NimbusSoftmax

clf = NimbusSoftmax(
    w_scale=1.0,
    num_steps=50,
    num_posterior_samples=50,
)
clf.fit(X_train, y_train)

predictions = clf.predict(X_test)
probabilities = clf.predict_proba(X_test)
```

`NimbusSoftmax` expects preprocessed feature rows shaped `(n_trials, n_features)`, not raw EEG. Use CSP, ERP amplitude, bandpower, or your own feature extraction before fitting.

## When to Use NimbusSoftmax

* You are using the **Python SDK** and need a flexible static classifier.
* Class boundaries are non-Gaussian or not well represented by class-conditional means/covariances.
* You need uncertainty-aware outputs for rejection policies, active learning, or calibration analysis.
* `NimbusLDA` / `NimbusQDA` accuracy has plateaued on a complex multinomial task.

## When Not to Use It

* If latency is the top priority: start with `NimbusLDA`, then `NimbusQDA`.
* If class centers and Mahalanobis distance are important for interpretability or outlier diagnostics: use `NimbusLDA` or `NimbusQDA`.
* If the session is drifting over time: use `NimbusSTS`.
* If you are using Julia: use [`NimbusProbit`](/models/nimbusprobit).

## Model Architecture

`NimbusSoftmax` fits a Bayesian multinomial logistic regression model.
It uses a reference-class parameterization: one class has its logit fixed at zero, and the remaining classes are modeled relative to it.

```text
beta_k ~ Normal(beta_mean_k, beta_cov_k)
logits = X_aug @ beta_mean.T
p(y = k | x) = softmax(logits)_k
```

The fitted model stores posterior Gaussian approximations for non-reference class weights. Predictions can draw posterior samples to quantify uncertainty.

## Hyperparameters

| Parameter               | Default | Description                                     |
| ----------------------- | ------- | ----------------------------------------------- |
| `w_loc`                 | `0.0`   | Prior mean for feature weights                  |
| `w_scale`               | `1.0`   | Prior scale for feature weights                 |
| `b_loc`                 | `0.0`   | Prior mean for bias terms                       |
| `b_scale`               | `1.0`   | Prior scale for bias terms                      |
| `learning_rate`         | `0.2`   | Damping factor for variational updates          |
| `num_steps`             | `50`    | Number of coordinate-ascent update sweeps       |
| `num_posterior_samples` | `50`    | Posterior samples used for prediction           |
| `rng_seed`              | `0`     | Random seed for reproducible posterior sampling |

## Usage

### Train and Predict

```python
from nimbus_bci import NimbusSoftmax

clf = NimbusSoftmax(w_scale=1.0, num_steps=50)
clf.fit(X_train, y_train)

probs = clf.predict_proba(X_test)
preds = clf.predict(X_test)
```

### Tune with sklearn

```python
from sklearn.model_selection import GridSearchCV
from nimbus_bci import NimbusSoftmax

param_grid = {
    "w_scale": [0.5, 1.0, 2.0],
    "num_steps": [50, 100],
}

grid = GridSearchCV(
    NimbusSoftmax(),
    param_grid,
    cv=5,
    scoring="accuracy",
)
grid.fit(X_train, y_train)
print(grid.best_params_)
```

### Online Updates

```python
from nimbus_bci import NimbusSoftmax

clf = NimbusSoftmax()
clf.fit(X_seed, y_seed)

for X_new, y_new in calibration_batches:
    clf.partial_fit(X_new, y_new)
```

### Active Learning

`NimbusSoftmax` supports BALD through posterior predictive samples, so it can drive label-efficient calibration loops.
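
As a rough illustration of what a BALD score measures, the sketch below computes the mutual information between the predicted label and the model parameters from a stack of posterior predictive probabilities. The function name `bald_scores` and the `(n_samples, n_trials, n_classes)` array layout are assumptions made for this example; they are not part of the `nimbus_bci` API, and the SDK's own computation may differ in detail.

```python
import numpy as np


def bald_scores(probs: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Return one BALD score per trial.

    probs has shape (n_samples, n_trials, n_classes): one predictive
    distribution per posterior sample and candidate trial (assumed layout).
    """
    mean_probs = probs.mean(axis=0)  # (n_trials, n_classes)
    # Entropy of the posterior-averaged prediction (total uncertainty).
    total = -np.sum(mean_probs * np.log(mean_probs + eps), axis=-1)
    # Mean entropy of the individual sample predictions (irreducible part).
    expected = -np.sum(probs * np.log(probs + eps), axis=-1).mean(axis=0)
    return total - expected


# Toy input: 2 posterior samples, 2 trials, 2 classes.
# Trial 0: the samples disagree. Trial 1: the samples agree on 50/50.
probs = np.array([
    [[0.9, 0.1], [0.5, 0.5]],
    [[0.1, 0.9], [0.5, 0.5]],
])
scores = bald_scores(probs)
ranked = np.argsort(-scores)  # most informative trial first
```

Trial 0 scores higher because the posterior samples disagree about it, which is the kind of uncertainty a new label can reduce. Trial 1 is equally uncertain under every sample, so labeling it tells the model little. In the SDK, trial selection is delegated to `CalibrationSession` with `pool_strategy="bald"`, as in the example that follows.
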
```python
from nimbus_bci.active_learning import CalibrationSession

session = CalibrationSession(
    clf,
    X_pool,
    pool_strategy="bald",
    batch_size=4,
    stopping_threshold=0.02,
    num_posterior_samples=256,
)

ranked = session.suggest_next_trial()
global_indices = session.remaining_indices[ranked.indices]
y_new = collect_labels_for(global_indices)
session.update(ranked.indices, y_new)

status = session.calibration_sufficient()
```

See [Active Learning](/python-sdk/active-learning) for the full calibration workflow.

## Training Requirements

* **Minimum**: at least 2 observations are required.
* **Recommended**: 40+ trials per class for stable estimates.
* **Feature normalization**: strongly recommended for cross-session stability.
* **Input shape**: `(n_trials, n_features)`.

```python
from sklearn.preprocessing import StandardScaler
from nimbus_bci import NimbusSoftmax

scaler = StandardScaler()
X_train_norm = scaler.fit_transform(X_train)
X_test_norm = scaler.transform(X_test)

clf = NimbusSoftmax()
clf.fit(X_train_norm, y_train)
predictions = clf.predict(X_test_norm)
```

## Performance Characteristics

| Operation       | Typical cost        | Notes                                         |
| --------------- | ------------------- | --------------------------------------------- |
| Training        | Moderate            | More expensive than `NimbusLDA` / `NimbusQDA` |
| Batch inference | \~15-25ms per trial | Depends on posterior sample count             |
| Streaming chunk | \~15-25ms           | Use lower sample counts if latency is binding |

`NimbusSoftmax` is usually slower than `NimbusLDA` and `NimbusQDA`, but can improve accuracy when class boundaries are not well represented by Gaussian class-conditionals.
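
Because inference cost depends on the posterior sample count and on your hardware, it can help to measure latency directly rather than rely on the typical figures above. The helper below is a minimal sketch: `median_latency_ms` and `dummy_predict_proba` are hypothetical names introduced for this example, and the dummy predictor only stands in for a fitted classifier so that the snippet runs without the SDK installed.

```python
import time

import numpy as np


def median_latency_ms(predict_fn, X, repeats=20):
    """Median wall-clock time of predict_fn(X) in milliseconds."""
    predict_fn(X)  # warm-up call, excluded from the timings
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict_fn(X)
        timings.append((time.perf_counter() - start) * 1000.0)
    return float(np.median(timings))


def dummy_predict_proba(X):
    """Stand-in predictor (not the SDK): a fixed 3-class softmax."""
    logits = X @ np.ones((X.shape[1], 3))
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


# One trial with 16 features, mimicking single-trial inference.
X_one_trial = np.random.default_rng(0).normal(size=(1, 16))
latency = median_latency_ms(dummy_predict_proba, X_one_trial)
print(f"median latency: {latency:.3f} ms")
```

To profile the real model, pass `clf.predict_proba` from a fitted `NimbusSoftmax` in place of the dummy and repeat the measurement for a few `num_posterior_samples` values. If the backend returns JAX arrays rather than NumPy arrays (an assumption worth checking), asynchronous dispatch can make calls look faster than they are; calling `block_until_ready()` on the result before stopping the timer avoids that.
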
## Model Inspection

```python
params = clf.model_.params

print("Posterior weight means:")
print(params["beta_mean"])

print("Posterior covariance Cholesky factors:")
print(params["beta_cov_chol"].shape)

print("Reference class:")
print(params["ref_class"])
```

## Model Selection Context

Use `NimbusSoftmax` when you are in Python and need a non-Gaussian static classifier with posterior sampling support. If you need explicit class centers or Mahalanobis diagnostics, use `NimbusLDA` or `NimbusQDA`. If the session drifts over time, use `NimbusSTS`.

For the canonical side-by-side comparison, see [Model Specification](/model-specification).

## Next Read

* [`NimbusProbit`](/models/nimbusprobit): Julia's non-Gaussian static classifier.
* Full `NimbusSoftmax` constructor and method reference.
* [Active Learning](/python-sdk/active-learning): use posterior samples to reduce calibration labels.
* Compare Nimbus model families.

## References

**Implementation:**

* Python source code: `nimbus_bci/models/nimbus_softmax/` in `nimbus-bci`

**Theory:**

* Polson, N. G., Scott, J. G., & Windle, J. (2013). "Bayesian inference for logistic models using Pólya-Gamma latent variables"
* Windle, J., Polson, N. G., & Scott, J. G. (2014). "Sampling Pólya-Gamma random variates: alternative and approximate techniques"
* Bayesian multinomial logistic regression with Polya-Gamma augmentation and variational inference