Preprocessing Integration Guide
Complete workflows for preprocessing EEG data with external tools and using the resulting features with Nimbus BCI (Python and Julia SDKs).
Python SDK Users: The nimbus-bci package has native MNE-Python integration! See the MNE Integration guide for streamlined workflows. Julia SDK Users: This guide shows how to preprocess in external tools and load the results into Julia.
Integration Options
| Tool | Language | Difficulty | Best For | Python SDK | Julia SDK |
|---|---|---|---|---|---|
| MNE-Python | Python | Easy | Complete workflows, research | ✅ Native | ✅ Via export |
| EEGLAB | MATLAB/GUI | Medium | Visual inspection, ICA | ⚠️ Via export | ✅ Via export |
| OpenVibe | GUI | Easy | Real-time preprocessing | ⚠️ Via export | ✅ Via export |
Python SDK: Native MNE Integration
Recommended for Python users: Use the Python SDK’s native MNE integration for seamless workflows.
Quick Start with Python SDK
from nimbus_bci import NimbusLDA
from nimbus_bci.mne_integration import from_mne_epochs, extract_csp_features
import mne
import numpy as np
# 1. Preprocess with MNE-Python
raw = mne.io.read_raw_brainvision('motor_imagery.vhdr', preload=True)
raw.filter(8.0, 30.0)
# 2. Create epochs
events = mne.find_events(raw)
event_id = {'Left': 1, 'Right': 2, 'Feet': 3, 'Tongue': 4}
epochs = mne.Epochs(raw, events, event_id, tmin=0, tmax=4.0, baseline=None, preload=True)
# 3. Extract CSP features using nimbus-bci
X_csp, y = extract_csp_features(epochs, n_components=6)
# 4. Train directly (no data export needed!)
clf = NimbusLDA()
clf.fit(X_csp, y)
# 5. Predict (on the training data here, as a quick sanity check)
predictions = clf.predict(X_csp)
print(f"Training accuracy: {np.mean(predictions == y):.1%}")
Complete Python SDK + MNE Workflow
See the MNE Integration guide for:
- from_mne_epochs() for direct conversion
- extract_csp_features() for motor imagery
- extract_bandpower_features() for spectral analysis
- Complete preprocessing pipelines
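For a sense of what spectral features look like without the SDK helper, here is a minimal plain scipy/numpy sketch of per-band log power (the function name, bands, and output shape here are illustrative, not the nimbus-bci API):
import numpy as np
from scipy.signal import welch

def bandpower_features(epochs_data, sfreq, bands=((8, 12), (12, 30))):
    """Log band power per channel and band; epochs_data is (n_epochs, n_channels, n_times)."""
    freqs, psd = welch(epochs_data, fs=sfreq, nperseg=min(256, epochs_data.shape[-1]), axis=-1)
    feats = []
    for lo, hi in bands:
        mask = (freqs >= lo) & (freqs < hi)
        feats.append(np.log(psd[..., mask].mean(axis=-1)))  # (n_epochs, n_channels)
    return np.concatenate(feats, axis=1)                     # (n_epochs, n_channels * n_bands)

# Example usage: X_bp = bandpower_features(epochs.get_data(), epochs.info['sfreq'])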
Python users: Skip the export/import steps below and use the native Python SDK integration instead!
MNE-Python → Julia SDK Integration
For Julia SDK users: This section shows how to preprocess in Python with MNE and export to Julia. Python SDK users: Use the native MNE integration instead (see above).
Complete pipeline for Motor Imagery preprocessing with MNE-Python and export to Julia SDK.
Installation
pip install mne scipy scikit-learn numpy matplotlib
Complete Motor Imagery Pipeline
import mne
import numpy as np
from scipy.io import savemat
from mne.decoding import CSP
from mne.preprocessing import ICA
# Load raw EEG data
raw = mne.io.read_raw_brainvision('motor_imagery.vhdr', preload=True)
# Step 1: Set channel locations (if available)
# raw.set_montage('standard_1020')
# Step 2: Bandpass filter (8-30 Hz for motor imagery)
raw.filter(8.0, 30.0, method='iir', picks='eeg')
# Step 3: Remove artifacts with ICA
ica = ICA(n_components=15, random_state=42, max_iter='auto')
ica.fit(raw)
# Identify and exclude bad components (eye blinks, muscle)
ica.exclude = [] # You'll identify these by inspecting ica.plot_components()
# Example: ica.exclude = [0, 1] # First two components are eye blinks
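# If the recording includes EOG channels, ICA can suggest blink components
# automatically (assumption: an EOG channel exists in your montage):
#   eog_indices, eog_scores = ica.find_bads_eog(raw)
#   ica.exclude = eog_indices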
raw_clean = ica.apply(raw)
# Step 4: Find events
events = mne.find_events(raw_clean, stim_channel='STI')
# Create event mapping (adjust based on your setup)
event_id = {
'Left Hand': 1,
'Right Hand': 2,
'Feet': 3,
'Tongue': 4
}
# Step 5: Epoch data (0-4 seconds post-cue)
epochs = mne.Epochs(
raw_clean,
events,
event_id=event_id,
tmin=0,
tmax=4.0,
baseline=None,
preload=True
)
# Step 6: Extract CSP features
# Get data for CSP (epochs × channels × time)
X = epochs.get_data() # (n_epochs, n_channels, n_times)
y = epochs.events[:, -1]
# Train CSP
csp = CSP(n_components=8, transform_into='csp_space', reg='ledoit_wolf')
X_csp = csp.fit_transform(X, y) # (n_epochs, n_components, n_times)
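# Note: transform_into='csp_space' keeps the time dimension (components x times),
# which is what the (features x samples x trials) layout below needs; the default
# 'average_power' would instead return a single power value per component per epoch.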
# Step 7: Format for NimbusSDK.jl
# NimbusSDK expects: (n_features × n_samples × n_trials)
# From MNE we have: (n_trials, n_components, n_times)
# Need to: transpose to (n_components, n_times, n_trials)
X_csp_julia = np.transpose(X_csp, (1, 2, 0))
# Step 8: Save to .mat for Julia
savemat('motor_imagery_features.mat', {
'features': X_csp_julia, # (n_components, n_times, n_trials), e.g. 8 components x ~1000 samples x 40 trials
'labels': y, # [1, 2, 3, 4, 1, 2, ...]
'channels': epochs.ch_names,
'sampling_rate': epochs.info['sfreq']
})
print("✓ Preprocessing complete!")
print(f" Features: {X_csp_julia.shape[0]} features × {X_csp_julia.shape[1]} samples × {X_csp_julia.shape[2]} trials")
print(f" Labels: {len(np.unique(y))} classes")
Loading in Julia
using NimbusSDK, MAT, Statistics  # Statistics provides mean() used below
# Load features
data = matread("motor_imagery_features.mat")
features = data["features"] # Should be (16, 1000, 40)
labels = vec(Int.(data["labels"]))
println("Loaded: $(size(features))")
println("Classes: $(unique(labels))")
# Create BCIData
metadata = BCIMetadata(
sampling_rate = data["sampling_rate"],
paradigm = :motor_imagery,
feature_type = :csp,
n_features = 8,  # matches the 8 CSP components extracted above
n_classes = 4,
chunk_size = nothing
)
bci_data = BCIData(features, metadata, labels)
# Authenticate and run inference
NimbusSDK.install_core("your-api-key")
# Choose model type based on your needs: RxLDAModel, RxGMMModel, or RxPolyaModel
model = load_model(RxLDAModel, "motor_imagery_4class_v1")
results = predict_batch(model, bci_data)
println("Predictions: ", results.predictions)
println("Mean confidence: ", mean(results.confidences))
P300 Detection Pipeline
import mne
import numpy as np
from scipy.io import savemat
from mne.preprocessing import ICA
# Load and filter for P300 (0.5-10 Hz)
raw = mne.io.read_raw_brainvision('p300_data.vhdr', preload=True)
raw.filter(0.5, 10.0, method='iir')
# ICA artifact removal
ica = ICA(n_components=15, random_state=42)
ica.fit(raw)
ica.exclude = [0, 1] # e.g. eye-blink components identified via ica.plot_components()
raw_clean = ica.apply(raw)
# Events
events = mne.find_events(raw_clean)
event_id = {'Target': 1, 'Non-target': 2}
# Epoch around stimulus (-0.2 to 0.8 seconds)
epochs = mne.Epochs(
raw_clean,
events,
event_id=event_id,
tmin=-0.2,
tmax=0.8,
baseline=(-0.2, 0),
preload=True
)
# Extract ERP amplitudes
X = epochs.get_data() # (n_epochs, n_channels, n_times)
y = epochs.events[:, -1]
# Average amplitude across the whole epoch for each channel (simple ERP feature)
X_erp = np.mean(X, axis=2) # (n_epochs, n_channels)
# Save for Julia (transform to expected shape)
X_erp_julia = X_erp.T # (n_channels, n_epochs)
savemat('p300_features.mat', {
'features': X_erp_julia,
'labels': y
})
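If you want features focused on the typical P300 window rather than the whole epoch, one option is to average only a post-stimulus interval (the 0.25-0.5 s window below is illustrative, not a fixed requirement):
# Average only 0.25-0.5 s post-stimulus instead of the full epoch
X_win = epochs.copy().crop(tmin=0.25, tmax=0.5).get_data().mean(axis=2)  # (n_epochs, n_channels)
X_win_julia = X_win.T  # (n_channels, n_epochs), same layout as X_erp_julia above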
EEGLAB Integration
EEGLAB provides powerful ICA and visualization tools.
MATLAB Pipeline
% Load dataset
EEG = pop_loadset('motor_imagery_data.set');
% Step 1: Bandpass filter (8-30 Hz)
EEG = pop_eegfiltnew(EEG, 8, 30); % zero-phase FIR bandpass (default)
% Step 2: Run ICA
EEG = pop_runica(EEG, 'icatype', 'runica');
% Step 3: Inspect components and remove artifacts
% Use the EEGLAB GUI: Tools > Inspect components, and mark bad components for rejection
bad_components = [1 2];                    % example: components identified as eye blinks
EEG = pop_subcomp(EEG, bad_components, 0); % Remove bad components
% Step 4: Create epochs
EEG = pop_epoch(EEG, {'1', '2', '3', '4'}, [0 4], 'newname', 'epoched');
% Step 5: Save for CSP extraction (use external tool or Python)
pop_saveset(EEG, 'filename', 'epoched_data.set');
% Export to Python for CSP: MNE-Python can read the saved .set file directly
% (see the Python sketch below using mne.read_epochs_eeglab)
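One way to continue in Python from here: MNE can read the epoched .set file directly, so you can reuse the CSP steps from the motor imagery pipeline above (a sketch; it assumes the event codes stored in the .set file match your class labels):
import mne
from mne.decoding import CSP

epochs = mne.read_epochs_eeglab('epoched_data.set')
X = epochs.get_data()             # (n_epochs, n_channels, n_times)
y = epochs.events[:, -1]          # event codes
csp = CSP(n_components=8, transform_into='csp_space')
X_csp = csp.fit_transform(X, y)   # then apply the transpose/savemat steps shown earlier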
Alternative: Direct CSP in MATLAB
% Use BCILAB for CSP feature extraction
% BCILAB is an EEGLAB extension for BCI analysis
% Install BCILAB
cd ~/eeglab/plugins
websave('bcilab.zip', 'http://www.bcilab.org/download/bcilab.zip')
unzip('bcilab.zip')
% Use BCILAB CSP
[sources, patterns, eigenvalues] = proc_multiclass_csp(EEG.data, ...
EEG.epoch(:, :, labels), 'classes', [1, 2, 3, 4], 'ncomp', 8);
% Extract CSP features
csp_features = proc_spatialfilter(EEG.data, patterns);
features = reshape(csp_features, 8, size(csp_features, 2), size(csp_features, 3)); % 8 CSP components per trial
OpenVibe Integration
OpenVibe provides GUI-based real-time preprocessing.
OpenVibe Scenario for Motor Imagery
- Signal Acquisition → Connect to your EEG device
- Temporal Filter → 8-30 Hz bandpass for motor imagery
- Spatial Filter → Common Average Reference (CAR)
- Spatial Filter → CSP (requires training with labeled data first)
- Feature Extraction → Log-variance of CSP
- Generic Stream Writer → Save to CSV or stream directly
Exporting from OpenVibe
import pandas as pd
import numpy as np
# OpenVibe exports as CSV
df = pd.read_csv('openvibe_output.csv')
# Assuming format: Time, Channel1, Channel2, ..., Label
features_raw = df.iloc[:, 1:-1].values.T # Transpose to (channels × samples)
labels = df['Label'].values
# Reshape for NimbusSDK (if you have trial boundaries)
# segment_by_labels is a user-defined helper that returns per-trial features and
# one label per trial (one possible implementation is sketched below)
features_segmented, trial_labels = segment_by_labels(features_raw, labels)
# Result: (n_features, n_samples_per_trial, n_trials)
# Save for Julia
import scipy.io as sio
sio.savemat('openvibe_features.mat', {
    'features': features_segmented,
    'labels': trial_labels
})
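The segment_by_labels helper is not part of any package; here is one possible implementation, assuming each trial is a contiguous run of identical non-zero labels and that trials are truncated to the shortest length so they stack (both assumptions depend on how you configured the Generic Stream Writer):
import numpy as np

def segment_by_labels(features_raw, labels):
    """Split continuous (n_features, n_samples) data into trials.

    Returns (features, trial_labels) with features shaped
    (n_features, n_samples_per_trial, n_trials).
    """
    labels = np.asarray(labels)
    change_points = np.flatnonzero(np.diff(labels)) + 1
    starts = np.concatenate(([0], change_points))
    ends = np.concatenate((change_points, [len(labels)]))
    trials, trial_labels = [], []
    for s, e in zip(starts, ends):
        if labels[s] == 0:  # label 0 = no active trial in this sketch
            continue
        trials.append(features_raw[:, s:e])
        trial_labels.append(int(labels[s]))
    min_len = min(t.shape[1] for t in trials)  # truncate so all trials stack cleanly
    features = np.stack([t[:, :min_len] for t in trials], axis=2)
    return features, np.array(trial_labels)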
Data Shape Conversion
Critical: Different tools output different shapes. Always verify:
MNE-Python → Julia
# MNE: (n_epochs, n_components, n_times)
features_mne # (40, 8, 1000)
# Julia SDK expects: (n_features × n_samples × n_trials)
features_julia = np.transpose(features_mne, (1, 2, 0)) # (8, 1000, 40)
EEGLAB/MATLAB → Julia
% MATLAB: (channels, samples, epochs)
features_matlab % (16, 1000, 40)
% Already correct for Julia! No transformation needed
OpenVibe CSV → Julia
# CSV: Time series with labels
df = pd.read_csv('data.csv')
# Convert to trials with the segment_by_labels helper shown earlier
features_julia, trial_labels = segment_by_labels(df.iloc[:, 1:-1].values.T, df['Label'].values) # (n_features, n_samples, n_trials)
Quality Checks
Python Preprocessing Check
import numpy as np

def check_preprocessing_quality(features, labels):
    """Check data quality before saving for Julia."""
    # Check shape
    assert features.ndim == 3, "Features must be 3D: (features, samples, trials)"
    n_features, n_samples, n_trials = features.shape
    # Check for NaN/Inf
    assert not np.any(np.isnan(features)), "Features contain NaN values"
    assert not np.any(np.isinf(features)), "Features contain Inf values"
    # Check labels
    assert len(labels) == n_trials, "Labels must match number of trials"
    assert labels.min() >= 1, "Labels must be 1-indexed (start at 1)"
    assert labels.max() <= len(np.unique(labels)), "Labels must be consecutive (1..n_classes)"
    # Check value ranges (not too large)
    assert np.abs(features).max() < 1e6, "Features have suspiciously large values"
    print("✓ All quality checks passed!")
    print(f"  Shape: ({n_features}, {n_samples}, {n_trials})")
    print(f"  Classes: {np.unique(labels)}")
    print(f"  Range: [{features.min():.3f}, {features.max():.3f}]")
    return True
# Use it
check_preprocessing_quality(X_csp_julia, y)
Julia Loading Check
using NimbusSDK, MAT
# Load and validate
data = matread("features.mat")
features = data["features"]
labels = Int.(vec(data["labels"]))
println("Loaded shape: ", size(features))
println("Labels: ", unique(labels))
# Check with SDK
metadata = BCIMetadata(
sampling_rate = 250.0,
paradigm = :motor_imagery,
feature_type = :csp,
n_features = size(features, 1),
n_classes = length(unique(labels))
)
bci_data = BCIData(features, metadata, labels)
# Run diagnostics
report = diagnose_preprocessing(bci_data)
if !isempty(report.errors)
@error "Issues found: $(report.errors)"
end
println("Quality score: ", round(report.quality_score * 100, digits=1), "%")
Feature Normalization
Critical for cross-session BCI performance! EEG amplitude varies by 50-200% across sessions. Always normalize your features for models used across sessions.
Normalization Workflow
using NimbusSDK, JLD2  # JLD2 provides the @save/@load macros used below
# 1. Estimate normalization parameters from TRAINING data
train_features = csp_features_train # (16 × 250 × 80)
norm_params = estimate_normalization_params(train_features; method=:zscore)
# 2. Apply to BOTH training and test data
train_norm = apply_normalization(train_features, norm_params)
test_norm = apply_normalization(test_features, norm_params)
# 3. Save parameters with your model
@save "model_with_norm.jld2" model norm_params
# 4. Later: Load and apply same parameters
@load "model_with_norm.jld2" model norm_params
new_data_norm = apply_normalization(new_data, norm_params)
Available Methods
- :zscore (recommended): Mean=0, Std=1 - best for most BCI applications
- :robust: uses median/MAD - better for noisy data with artifacts
- :minmax: scale to [0, 1] - for bounded features
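For Python SDK users, or as a preprocessing step before export, the same fit-on-training / apply-everywhere pattern can be sketched with plain numpy (the parameter dict and the X_train/X_test names are illustrative, not an SDK structure):
import numpy as np

def estimate_zscore_params(train_features):
    """train_features: (n_features, n_samples, n_trials); stats computed per feature."""
    mean = train_features.mean(axis=(1, 2), keepdims=True)
    std = train_features.std(axis=(1, 2), keepdims=True) + 1e-12  # avoid division by zero
    return {'mean': mean, 'std': std}

def apply_zscore(features, params):
    return (features - params['mean']) / params['std']

# Fit on training data only, then apply the same parameters to both splits
params = estimate_zscore_params(X_train)
X_train_norm = apply_zscore(X_train, params)
X_test_norm = apply_zscore(X_test, params)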
| Scenario | Accuracy Improvement |
|---|---|
| Same session | +1% |
| Cross-session (next day) | +15-25% |
| Multi-subject transfer | +15-20% |
Common Issues and Solutions
Issue 1: Wrong Data Shape
Error: Dimension mismatch: expected (n_features, n_samples, n_trials), got (trials, features, samples)
Solution:
# Transpose from MNE format to Julia format
features_julia = np.transpose(features_mne, (1, 2, 0))
Issue 2: Labels Start at Zero
Error: Labels must be 1-indexed
Solution:
# If your event codes are 0-indexed, shift them: the Julia SDK expects labels starting at 1
labels_julia = labels + 1
Issue 3: Mixed Data Types
Error: Type mismatch
Solution:
# In Julia, ensure proper types
features = Float64.(data["features"])
labels = Int.(vec(data["labels"]))
Next Steps