
# NimbusSoftmax (Python)

> NimbusSoftmax for Python: Bayesian multinomial logistic regression with Polya-Gamma variational inference for non-Gaussian BCI decision boundaries.

# NimbusSoftmax — Bayesian Softmax Classifier

**Python**: `NimbusSoftmax` | **Julia equivalent**: [`NimbusProbit`](/models/nimbusprobit)<br />
**Mathematical model**: Bayesian multinomial logistic regression with Polya-Gamma variational inference

`NimbusSoftmax` is the Python SDK's flexible non-Gaussian static classifier. Use it when `NimbusLDA` and `NimbusQDA` are too restrictive, but you still want posterior uncertainty, sklearn compatibility, and active-learning support.

<Note>
  **Availability**

  * **Python SDK**: ✅ `NimbusSoftmax` via the optional `softmax` extra
  * **Julia SDK**: ❌ Use [`NimbusProbit`](/models/nimbusprobit) for Julia's non-Gaussian static classifier
</Note>

## Install

`NimbusSoftmax` depends on the optional JAX-based softmax backend:

```bash theme={null}
pip install "nimbus-bci[softmax]"
```

## Quick Start

```python theme={null}
from nimbus_bci import NimbusSoftmax

clf = NimbusSoftmax(
    w_scale=1.0,
    num_steps=50,
    num_posterior_samples=50,
)

clf.fit(X_train, y_train)

predictions = clf.predict(X_test)
probabilities = clf.predict_proba(X_test)
```

<Tip>
  `NimbusSoftmax` expects preprocessed feature rows shaped `(n_trials, n_features)`, not raw EEG. Use CSP, ERP amplitude, bandpower, or your own feature extraction before fitting.
</Tip>
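
As a rough illustration of the shape requirement above, the sketch below collapses epoched data into one feature row per trial using log-variance, a common stand-in for band power. It assumes the epochs are already band-pass filtered and shaped `(n_trials, n_channels, n_samples)`; the helper name `log_variance_features` is hypothetical and not part of the SDK.

```python
import numpy as np


def log_variance_features(epochs: np.ndarray) -> np.ndarray:
    """Collapse (n_trials, n_channels, n_samples) epochs to (n_trials, n_channels)."""
    if epochs.ndim != 3:
        raise ValueError("expected epochs shaped (n_trials, n_channels, n_samples)")
    # Variance over time approximates band power for band-pass filtered signals.
    var = epochs.var(axis=-1)
    # The log compresses the heavy right tail of the variance distribution.
    return np.log(var + 1e-12)


# Synthetic stand-in for filtered EEG: 20 trials, 8 channels, 250 samples.
rng = np.random.default_rng(0)
epochs = rng.standard_normal((20, 8, 250))

X = log_variance_features(epochs)
print(X.shape)  # (20, 8)
```

The resulting `X` has one row per trial and can be passed to `fit` or `predict_proba` in place of raw EEG.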

## When to Use NimbusSoftmax

* You are using the **Python SDK** and need a flexible static classifier.
* Class boundaries are non-Gaussian or not well represented by class-conditional means/covariances.
* You need uncertainty-aware outputs for rejection policies, active learning, or calibration analysis.
* `NimbusLDA` / `NimbusQDA` accuracy has plateaued on a complex multiclass task.
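
One way to use uncertainty-aware outputs in a rejection policy is to withhold a decision whenever the top class probability falls below a threshold. The sketch below operates on any `(n_trials, n_classes)` probability array, such as the output of `predict_proba`. The helper name, the threshold value, and the `-1` reject label are illustrative assumptions, not SDK behavior.

```python
import numpy as np


def predict_with_rejection(probs, threshold=0.7, reject_label=-1):
    """Return the argmax column per row, or reject_label when confidence is low."""
    probs = np.asarray(probs, dtype=float)
    confidence = probs.max(axis=1)   # top class probability per trial
    columns = probs.argmax(axis=1)   # column index of the top class
    return np.where(confidence >= threshold, columns, reject_label)


probs = np.array([
    [0.90, 0.05, 0.05],  # confident
    [0.40, 0.35, 0.25],  # ambiguous, rejected
    [0.10, 0.15, 0.75],  # confident
])
decisions = predict_with_rejection(probs, threshold=0.7)
print(decisions)  # [ 0 -1  2]
```

The returned values are column indices rather than class labels. With an sklearn-style estimator you would typically map them back through `clf.classes_`, assuming the classifier exposes that attribute.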

## When Not to Use It

* If latency is the top priority: start with `NimbusLDA`, then `NimbusQDA`.
* If class centers and Mahalanobis distance are important for interpretability or outlier diagnostics: use `NimbusLDA` or `NimbusQDA`.
* If the session is drifting over time: use `NimbusSTS`.
* If you are using Julia: use [`NimbusProbit`](/models/nimbusprobit).

## Model Architecture

`NimbusSoftmax` fits a Bayesian multinomial logistic regression model. It uses a reference-class parameterization: the logit of one class is fixed at zero, and the remaining classes are modeled relative to it.

```text theme={null}
beta_k ~ Normal(beta_mean_k, beta_cov_k)
logits = X_aug @ beta_mean.T
p(y = k | x) = softmax(logits)_k
```

The fitted model stores posterior Gaussian approximations for non-reference class weights. Predictions can draw posterior samples to quantify uncertainty.
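
The NumPy sketch below illustrates how posterior samples over the non-reference weights could be turned into predictive probabilities. It is not the SDK's implementation. The array layouts are assumptions: `beta_mean` shaped `(K-1, d+1)`, `beta_cov_chol` holding lower-triangular factors shaped `(K-1, d+1, d+1)`, the bias appended as the last feature column, and the reference class placed at column 0.

```python
import numpy as np


def predictive_probs(X, beta_mean, beta_cov_chol, num_samples=50, seed=0):
    """Monte Carlo predictive probabilities, shaped (num_samples, n, K)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Append a constant column so the bias is part of the weight vector.
    X_aug = np.hstack([X, np.ones((n, 1))])
    # Draw beta = mean + L @ eps independently for each non-reference class.
    eps = rng.standard_normal((num_samples,) + beta_mean.shape)
    beta = beta_mean + np.einsum("kij,skj->ski", beta_cov_chol, eps)
    # Logits for the non-reference classes, then a zero logit for the reference.
    logits_nonref = np.einsum("nd,skd->snk", X_aug, beta)
    zeros = np.zeros((num_samples, n, 1))
    logits = np.concatenate([zeros, logits_nonref], axis=-1)
    # Numerically stable softmax over the class axis.
    logits -= logits.max(axis=-1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(1)
X = rng.standard_normal((5, 3))                      # 5 trials, 3 features
beta_mean = rng.standard_normal((2, 4))              # K = 3 classes, d + 1 = 4
beta_cov_chol = 0.1 * np.tile(np.eye(4), (2, 1, 1))  # small isotropic posterior

samples = predictive_probs(X, beta_mean, beta_cov_chol, num_samples=50)
mean_probs = samples.mean(axis=0)
print(mean_probs.shape)  # (5, 3)
```

Averaging over the sample axis gives a point estimate of the class probabilities, while the spread across samples reflects posterior uncertainty about the weights.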

## Hyperparameters

| Parameter               | Default | Description                                     |
| ----------------------- | ------- | ----------------------------------------------- |
| `w_loc`                 | `0.0`   | Prior mean for feature weights                  |
| `w_scale`               | `1.0`   | Prior scale for feature weights                 |
| `b_loc`                 | `0.0`   | Prior mean for bias terms                       |
| `b_scale`               | `1.0`   | Prior scale for bias terms                      |
| `learning_rate`         | `0.2`   | Damping factor for variational updates          |
| `num_steps`             | `50`    | Number of coordinate-ascent update sweeps       |
| `num_posterior_samples` | `50`    | Posterior samples used for prediction           |
| `rng_seed`              | `0`     | Random seed for reproducible posterior sampling |

## Usage

### Train and Predict

```python theme={null}
from nimbus_bci import NimbusSoftmax

clf = NimbusSoftmax(w_scale=1.0, num_steps=50)
clf.fit(X_train, y_train)

probs = clf.predict_proba(X_test)
preds = clf.predict(X_test)
```

### Tune with sklearn

```python theme={null}
from sklearn.model_selection import GridSearchCV
from nimbus_bci import NimbusSoftmax

param_grid = {
    "w_scale": [0.5, 1.0, 2.0],
    "num_steps": [50, 100],
}

grid = GridSearchCV(
    NimbusSoftmax(),
    param_grid,
    cv=5,
    scoring="accuracy",
)
grid.fit(X_train, y_train)

print(grid.best_params_)
```

### Online Updates

```python theme={null}
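from nimbus_bci import NimbusSoftmax

# Fit once on a small seed set, then update with labeled batches as they arrive.
clf = NimbusSoftmax()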
clf.fit(X_seed, y_seed)

for X_new, y_new in calibration_batches:
    clf.partial_fit(X_new, y_new)
```

### Active Learning

`NimbusSoftmax` supports BALD (Bayesian Active Learning by Disagreement) through posterior predictive samples, so it can drive label-efficient calibration loops.

```python theme={null}
from nimbus_bci.active_learning import CalibrationSession

session = CalibrationSession(
    clf,
    X_pool,
    pool_strategy="bald",
    batch_size=4,
    stopping_threshold=0.02,
    num_posterior_samples=256,
)

ranked = session.suggest_next_trial()
global_indices = session.remaining_indices[ranked.indices]
# collect_labels_for is a placeholder for your own labeling routine.
y_new = collect_labels_for(global_indices)
session.update(ranked.indices, y_new)

status = session.calibration_sufficient()
```

See [Active Learning](/python-sdk/active-learning) for the full calibration workflow.
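
For intuition about what a BALD score measures, the sketch below computes it from an array of posterior predictive samples shaped `(num_samples, n_trials, n_classes)`. The score is the entropy of the averaged prediction minus the average entropy of the individual samples. This is the standard definition; whether `CalibrationSession` computes it exactly this way is an assumption, and `bald_scores` is a hypothetical helper.

```python
import numpy as np


def bald_scores(sample_probs, eps=1e-12):
    """Mutual information (in nats) between the label and the weights, per trial."""
    sample_probs = np.asarray(sample_probs, dtype=float)
    mean_probs = sample_probs.mean(axis=0)
    # Entropy of the averaged prediction: total predictive uncertainty.
    predictive_entropy = -(mean_probs * np.log(mean_probs + eps)).sum(axis=-1)
    # Average entropy of each sample: uncertainty that more labels cannot remove.
    per_sample_entropy = -(sample_probs * np.log(sample_probs + eps)).sum(axis=-1)
    expected_entropy = per_sample_entropy.mean(axis=0)
    return predictive_entropy - expected_entropy


# Two posterior samples, two trials, two classes.
# Trial 0: both samples agree on 50/50, so there is no disagreement.
# Trial 1: the samples are confident but contradict each other.
sample_probs = np.array([
    [[0.5, 0.5], [1.0, 0.0]],
    [[0.5, 0.5], [0.0, 1.0]],
])
scores = bald_scores(sample_probs)
print(scores.round(3))
```

Both trials have the same averaged prediction, but only the second has a high score, because its uncertainty comes from disagreement between posterior samples rather than from inherent class overlap. Trials like that are the ones worth labeling next.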

## Training Requirements

* **Minimum**: at least 2 observations are required.
* **Recommended**: 40+ trials per class for stable estimates.
* **Feature normalization**: strongly recommended for cross-session stability.
* **Input shape**: `(n_trials, n_features)`.

```python theme={null}
from sklearn.preprocessing import StandardScaler
from nimbus_bci import NimbusSoftmax

scaler = StandardScaler()
X_train_norm = scaler.fit_transform(X_train)
X_test_norm = scaler.transform(X_test)

clf = NimbusSoftmax()
clf.fit(X_train_norm, y_train)
predictions = clf.predict(X_test_norm)
```
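
The 40-trials-per-class recommendation above can be checked before fitting. The sketch below is a hypothetical helper, not an SDK function, that reports which classes fall short of a chosen minimum.

```python
import numpy as np


def underrepresented_classes(y, min_per_class=40):
    """Map each class label with fewer than min_per_class trials to its count."""
    labels, counts = np.unique(np.asarray(y), return_counts=True)
    return {
        label.item(): int(count)
        for label, count in zip(labels, counts)
        if count < min_per_class
    }


# Synthetic labels: class 1 has only 30 trials.
y_train_demo = np.array([0] * 45 + [1] * 30 + [2] * 40)
short = underrepresented_classes(y_train_demo, min_per_class=40)
print(short)  # {1: 30}
```

An empty dictionary means every class meets the threshold. Otherwise, consider collecting more trials for the listed classes before relying on the fitted posterior.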

## Performance Characteristics

| Operation       | Typical cost         | Notes                                         |
| --------------- | -------------------- | --------------------------------------------- |
| Training        | Moderate             | More expensive than `NimbusLDA` / `NimbusQDA` |
| Batch inference | \~15-25 ms per trial | Depends on posterior sample count             |
| Streaming chunk | \~15-25 ms           | Use lower sample counts if latency is binding |

`NimbusSoftmax` is usually slower than `NimbusLDA` and `NimbusQDA`, but can improve accuracy when class boundaries are not well represented by Gaussian class-conditionals.
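
Because latency depends on the posterior sample count and on your hardware, it is worth measuring on your own setup. The sketch below is a generic timing harness that accepts any single-trial prediction callable. The helper name is hypothetical, and the dummy function stands in for something like `clf.predict_proba`.

```python
import time

import numpy as np


def per_trial_latency_ms(predict_fn, X, repeats=20):
    """Return (median, 95th percentile) single-trial latency in milliseconds."""
    timings = []
    for _ in range(repeats):
        for row in X:
            start = time.perf_counter()
            predict_fn(row[None, :])  # keep the (1, n_features) shape
            timings.append((time.perf_counter() - start) * 1e3)
    return float(np.median(timings)), float(np.percentile(timings, 95))


def dummy_predict(x):
    # Stand-in for a fitted classifier's predict_proba.
    return np.tanh(x @ np.ones((x.shape[1], 3)))


X_demo = np.random.default_rng(0).standard_normal((10, 16))
median_ms, p95_ms = per_trial_latency_ms(dummy_predict, X_demo, repeats=3)
print(f"median={median_ms:.3f} ms, p95={p95_ms:.3f} ms")
```

Running the same harness at several `num_posterior_samples` settings shows how much latency each reduction in sample count buys.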

## Model Inspection

```python theme={null}
params = clf.model_.params

print("Posterior weight means:")
print(params["beta_mean"])

print("Posterior covariance Cholesky factors:")
print(params["beta_cov_chol"].shape)

print("Reference class:")
print(params["ref_class"])
```
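
If `beta_cov_chol` holds lower-triangular Cholesky factors `L` with covariance `L @ L.T` (an assumption, since the array layout is not documented here), the marginal posterior standard deviation of each weight can be read off without forming the full covariance. The helper below is a hypothetical sketch.

```python
import numpy as np


def marginal_std_from_chol(chol):
    """Posterior std per weight from Cholesky factors shaped (..., D, D)."""
    chol = np.asarray(chol, dtype=float)
    # diag(L @ L.T)[i] equals the sum over j of L[i, j] ** 2.
    return np.sqrt((chol ** 2).sum(axis=-1))


# Worked example: L @ L.T = [[4, 2], [2, 10]].
L = np.array([[2.0, 0.0],
              [1.0, 3.0]])
std = marginal_std_from_chol(L)
print(std)  # approximately [2.0, 3.162]
```

Weights whose standard deviation is large relative to the magnitude of their posterior mean are poorly determined by the data, which can indicate uninformative features or too few trials.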

## Model Selection Context

Use `NimbusSoftmax` when you are in Python and need a non-Gaussian static classifier with posterior sampling support. If you need explicit class centers or Mahalanobis diagnostics, use `NimbusLDA` or `NimbusQDA`. If the session drifts over time, use `NimbusSTS`.

For the canonical side-by-side comparison, see [Model Specification](/model-specification).

## Next Read

<CardGroup cols={2}>
  <Card title="NimbusProbit (Julia)" icon="brain" href="/models/nimbusprobit">
    Julia's non-Gaussian static classifier.
  </Card>

  <Card title="Python API Reference" icon="book" href="/python-sdk/api-reference">
    Full `NimbusSoftmax` constructor and method reference.
  </Card>

  <Card title="Active Learning" icon="target" href="/python-sdk/active-learning">
    Use posterior samples to reduce calibration labels.
  </Card>

  <Card title="Model Selection" icon="box" href="/model-specification">
    Compare Nimbus model families.
  </Card>
</CardGroup>

## References

**Implementation:**

* Python source code: `nimbus_bci/models/nimbus_softmax/` in `nimbus-bci`

**Theory:**

* Polson, N. G., Scott, J. G., & Windle, J. (2013). "Bayesian inference for logistic models using Pólya-Gamma latent variables." *Journal of the American Statistical Association*, 108(504), 1339-1349.
* Windle, J., Polson, N. G., & Scott, J. G. (2014). "Sampling Pólya-Gamma random variates: alternative and approximate techniques." arXiv:1405.0506.
* Bayesian multinomial logistic regression with Polya-Gamma augmentation and variational inference

