Preprocessing Integration Guide

Complete workflows for preprocessing EEG data with external tools and using the resulting features in Nimbus BCI (Python and Julia SDKs).
Python SDK Users: The nimbus-bci package has native MNE-Python integration! See the MNE Integration guide for streamlined workflows.
Julia SDK Users: This guide shows how to preprocess in external tools and load the results into Julia.
New to preprocessing? Start with Preprocessing Requirements for an overview before diving into specific tools.

Integration Options

Tool       | Language   | Difficulty | Best For                     | Python SDK    | Julia SDK
MNE-Python | Python     | Easy       | Complete workflows, research | ✅ Native     | ✅ Via export
EEGLAB     | MATLAB/GUI | Medium     | Visual inspection, ICA       | ⚠️ Via export | ✅ Via export
OpenVibe   | GUI        | Easy       | Real-time preprocessing      | ⚠️ Via export | ✅ Via export

Python SDK: Native MNE Integration

Recommended for Python users: Use the Python SDK’s native MNE integration for seamless workflows.

Quick Start with Python SDK

from nimbus_bci import NimbusLDA
from nimbus_bci.mne_integration import from_mne_epochs, extract_csp_features
import mne
import numpy as np

# 1. Preprocess with MNE-Python
raw = mne.io.read_raw_brainvision('motor_imagery.vhdr', preload=True)
raw.filter(8.0, 30.0)

# 2. Create epochs
events = mne.find_events(raw)  # use mne.events_from_annotations(raw) if your markers are stored as annotations
event_id = {'Left': 1, 'Right': 2, 'Feet': 3, 'Tongue': 4}
epochs = mne.Epochs(raw, events, event_id, tmin=0, tmax=4.0, baseline=None, preload=True)

# 3. Extract CSP features using nimbus-bci
X_csp, y = extract_csp_features(epochs, n_components=6)

# 4. Train directly (no data export needed!)
clf = NimbusLDA()
clf.fit(X_csp, y)

# 5. Predict
predictions = clf.predict(X_csp)
print(f"Accuracy: {np.mean(predictions == y):.1%}")

Complete Python SDK + MNE Workflow

See the MNE Integration guide for:
  • from_mne_epochs() for direct conversion
  • extract_csp_features() for motor imagery
  • extract_bandpower_features() for spectral analysis
  • Complete preprocessing pipelines
Python users: Skip the export/import steps below and use the native Python SDK integration instead!

MNE-Python → Julia SDK Integration

For Julia SDK users: This section shows how to preprocess in Python with MNE and export to Julia.
Python SDK users: Use the native MNE integration instead (see above).
Below is a complete pipeline for motor imagery preprocessing with MNE-Python, followed by export to the Julia SDK.

Installation

pip install mne scipy scikit-learn numpy matplotlib

Complete Motor Imagery Pipeline

import mne
import numpy as np
from scipy.io import savemat
from mne.decoding import CSP
from mne.preprocessing import ICA

# Load raw EEG data
raw = mne.io.read_raw_brainvision('motor_imagery.vhdr', preload=True)

# Step 1: Set channel locations (if available)
# raw.set_montage('standard_1020')

# Step 2: Bandpass filter (8-30 Hz for motor imagery)
raw.filter(8.0, 30.0, method='iir', picks='eeg')

# Step 3: Remove artifacts with ICA
ica = ICA(n_components=15, random_state=42, max_iter='auto')
ica.fit(raw)

# Identify and exclude bad components (eye blinks, muscle)
ica.exclude = []  # You'll identify these by inspecting ica.plot_components()
# Example: ica.exclude = [0, 1]  # First two components are eye blinks

raw_clean = ica.apply(raw)

# Step 4: Find events
# BrainVision markers are often loaded as annotations; if find_events finds none,
# use: events, _ = mne.events_from_annotations(raw_clean)
events = mne.find_events(raw_clean, stim_channel='STI')

# Create event mapping (adjust based on your setup)
event_id = {
    'Left Hand': 1,
    'Right Hand': 2,
    'Feet': 3,
    'Tongue': 4
}

# Step 5: Epoch data (0-4 seconds post-cue)
epochs = mne.Epochs(
    raw_clean,
    events,
    event_id=event_id,
    tmin=0,
    tmax=4.0,
    baseline=None,
    preload=True
)

# Step 6: Extract CSP features
# Get data for CSP (epochs × channels × time)
X = epochs.get_data()  # (n_epochs, n_channels, n_times)
y = epochs.events[:, -1]

# Train CSP
csp = CSP(n_components=16, transform_into='csp_space', reg='shrinkage')
X_csp = csp.fit_transform(X, y)  # (n_epochs, n_components, n_times)

# Step 7: Format for NimbusSDK.jl
# NimbusSDK expects: (n_features × n_samples × n_trials)
# From MNE we have: (n_trials, n_components, n_times)
# Need to: transpose to (n_components, n_times, n_trials)
X_csp_julia = np.transpose(X_csp, (1, 2, 0))

# Step 8: Save to .mat for Julia
savemat('motor_imagery_features.mat', {
    'features': X_csp_julia,  # e.g. (16, 1000, 40): CSP features × samples per epoch × trials
    'labels': y,              # [1, 2, 3, 4, 1, 2, ...]
    'channels': epochs.ch_names,
    'sampling_rate': epochs.info['sfreq']
})

print("✓ Preprocessing complete!")
print(f"  Features: {X_csp_julia.shape[0]} features × {X_csp_julia.shape[1]} samples × {X_csp_julia.shape[2]} trials")
print(f"  Labels: {len(np.unique(y))} classes")

Loading in Julia

using NimbusSDK, MAT, Statistics  # Statistics provides mean()

# Load features
data = matread("motor_imagery_features.mat")
features = data["features"]  # Should be (16, 1000, 40)
labels = vec(Int.(data["labels"]))

println("Loaded: $(size(features))")
println("Classes: $(unique(labels))")

# Create BCIData
metadata = BCIMetadata(
    sampling_rate = data["sampling_rate"],
    paradigm = :motor_imagery,
    feature_type = :csp,
    n_features = 16,
    n_classes = 4,
    chunk_size = nothing
)

bci_data = BCIData(features, metadata, labels)

# Authenticate and run inference
NimbusSDK.install_core("your-api-key")
# Choose model type based on your needs: RxLDAModel, RxGMMModel, or RxPolyaModel
model = load_model(RxLDAModel, "motor_imagery_4class_v1")
results = predict_batch(model, bci_data)

println("Predictions: ", results.predictions)
println("Mean confidence: ", mean(results.confidences))

P300 Detection Pipeline

import mne
import numpy as np
from scipy.io import savemat
from mne.preprocessing import ICA

# Load and filter for P300 (0.5-10 Hz)
raw = mne.io.read_raw_brainvision('p300_data.vhdr', preload=True)
raw.filter(0.5, 10.0, method='iir')

# ICA artifact removal
ica = ICA(n_components=15, random_state=42)
ica.fit(raw)
ica.exclude = [0, 1]  # e.g. eye-blink components identified via ica.plot_components()
raw_clean = ica.apply(raw)

# Events
events = mne.find_events(raw_clean)
event_id = {'Target': 1, 'Non-target': 2}

# Epoch around stimulus (-0.2 to 0.8 seconds)
epochs = mne.Epochs(
    raw_clean,
    events,
    event_id=event_id,
    tmin=-0.2,
    tmax=0.8,
    baseline=(-0.2, 0),
    preload=True
)

# Extract ERP amplitudes
X = epochs.get_data()  # (n_epochs, n_channels, n_times)
y = epochs.events[:, -1]

# Average across time window for each channel
X_erp = np.mean(X, axis=2)  # (n_epochs, n_channels)

# Save for Julia (transform to expected shape)
X_erp_julia = X_erp.T  # (n_channels, n_epochs)

savemat('p300_features.mat', {
    'features': X_erp_julia,
    'labels': y
})
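
Continuing from the block above: X_erp_julia is 2-D (channels × trials). If your Julia loader expects the 3-D (n_features × n_samples × n_trials) layout used elsewhere in this guide, a minimal sketch is to add a singleton samples dimension before saving (the p300_features_3d.mat name is just an example):

# Treat each channel's averaged amplitude as one feature with a single sample per trial
X_erp_3d = X_erp_julia[:, np.newaxis, :]  # (n_channels, 1, n_trials)

savemat('p300_features_3d.mat', {   # example output file name
    'features': X_erp_3d,
    'labels': y
})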

EEGLAB Integration

EEGLAB provides powerful ICA and visualization tools.

MATLAB Pipeline

% Load dataset
EEG = pop_loadset('motor_imagery_data.set');

% Step 1: Bandpass filter (8-30 Hz)
EEG = pop_eegfiltnew(EEG, 8, 30);

% Step 2: Run ICA
EEG = pop_runica(EEG, 'icatype', 'runica');

% Step 3: Inspect components and remove artifacts
% Use EEGLAB GUI: EEGLAB > Tools > Inspect components
% Mark bad components for rejection

bad_components = [1 2];  % example indices marked during visual inspection
EEG = pop_subcomp(EEG, bad_components, 0);  % Remove bad components

% Step 4: Create epochs
EEG = pop_epoch(EEG, {'1', '2', '3', '4'}, [0 4], 'newname', 'epoched');

% Step 5: Save for CSP extraction (use external tool or Python)
pop_saveset(EEG, 'filename', 'epoched_data.set');

% Export to Python for CSP: the saved .set file can be read directly with
% MNE-Python (mne.read_epochs_eeglab), as shown below. Alternatively, export
% to BIDS with the bids-matlab-tools plugin (pop_exportbids).
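
To continue in Python, the epochs saved above can be loaded directly with MNE-Python and CSP applied there, mirroring the MNE pipeline earlier in this guide; a minimal sketch:

import mne
import numpy as np
from mne.decoding import CSP

# Load the epochs exported from EEGLAB
epochs = mne.read_epochs_eeglab('epoched_data.set')
X = epochs.get_data()      # (n_epochs, n_channels, n_times)
y = epochs.events[:, -1]

# CSP as in the MNE-Python pipeline, then reorder axes for Julia
csp = CSP(n_components=16, transform_into='csp_space', reg='shrinkage')
X_csp = csp.fit_transform(X, y)               # (n_epochs, n_components, n_times)
X_csp_julia = np.transpose(X_csp, (1, 2, 0))  # (n_components, n_times, n_epochs)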

Alternative: Direct CSP in MATLAB

% Use BCILAB for CSP feature extraction
% BCILAB is an EEGLAB extension for BCI analysis

% Install BCILAB: download http://www.bcilab.org/download/bcilab.zip and
% unzip it into your EEGLAB plugins folder (e.g. ~/eeglab/plugins)

% Use BCILAB CSP
[sources, patterns, eigenvalues] = proc_multiclass_csp(EEG.data, ...
    EEG.epoch(:, :, labels), 'classes', [1, 2, 3, 4], 'ncomp', 8);

% Extract CSP features
csp_features = proc_spatialfilter(EEG.data, patterns);
features = reshape(csp_features, 16, size(csp_features, 2), size(csp_features, 3));

OpenVibe Integration

OpenVibe provides GUI-based real-time preprocessing.

OpenVibe Scenario for Motor Imagery

  1. Signal Acquisition → Connect to your EEG device
  2. Temporal Filter → 8-30 Hz bandpass for motor imagery
  3. Spatial Filter → Common Average Reference (CAR)
  4. Spatial Filter → CSP (requires training with labeled data first)
  5. Feature Extraction → Log-variance of CSP (see the sketch after this list)
  6. Generic Stream Writer → Save to CSV or stream directly
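
The log-variance step (item 5) simply takes the log of each CSP-filtered signal's variance over the trial window; a minimal numpy sketch, assuming csp_signals is an (n_components × n_samples) array for one trial:

import numpy as np

# csp_signals: (n_components, n_samples) output of the CSP spatial filter for one trial (assumed)
# Variance of each component across the trial, then log to make features roughly Gaussian
log_var_features = np.log(np.var(csp_signals, axis=1))  # (n_components,)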

Exporting from OpenVibe

import pandas as pd
import numpy as np

# OpenVibe exports as CSV
df = pd.read_csv('openvibe_output.csv')

# Assuming format: Time, Channel1, Channel2, ..., Label
features_raw = df.iloc[:, 1:-1].values.T  # Transpose to (channels × samples)
labels = df['Label'].values

# Reshape for NimbusSDK (if you have trial boundaries)
# You'll need to segment based on labels (see the segment_by_labels sketch below)
features_segmented = segment_by_labels(features_raw, labels)
# Result: (n_features, n_samples_per_trial, n_trials)

# Save for Julia
import scipy.io as sio
sio.savemat('openvibe_features.mat', {
    'features': features_segmented,
    'labels': labels
})
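
segment_by_labels above is a placeholder; the right logic depends on how your scenario writes labels. A minimal sketch, assuming each trial is a contiguous run of the same non-zero label:

import numpy as np

def segment_by_labels(features_raw, labels):
    """Split a continuous (channels × samples) array into trials.

    Assumes each trial is a contiguous run of one non-zero label;
    trials are truncated to the shortest run so they stack cleanly.
    """
    # Label changes mark trial boundaries
    boundaries = np.flatnonzero(np.diff(labels)) + 1
    segments = np.split(features_raw, boundaries, axis=1)
    trial_labels = labels[np.concatenate(([0], boundaries))]

    # Keep only labeled (non-zero) runs and equalize their length
    trials = [seg for seg, lab in zip(segments, trial_labels) if lab != 0]
    n_keep = min(t.shape[1] for t in trials)
    trials = [t[:, :n_keep] for t in trials]

    return np.stack(trials, axis=2)  # (n_features, n_samples, n_trials)

If you use a helper like this, save one label per trial (e.g. the non-zero trial_labels) rather than the per-sample Label column.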

Data Shape Transformation

Critical: Different tools output different shapes. Always verify:

MNE-Python → Julia

# MNE: (n_epochs, n_components, n_times)
features_mne  # (40, 8, 1000)

# Julia SDK expects: (n_features × n_samples × n_trials)
features_julia = np.transpose(features_mne, (1, 2, 0))  # (8, 1000, 40)

EEGLAB/MATLAB → Julia

% MATLAB: (channels, samples, epochs)
features_matlab  % (16, 1000, 40)

% Already correct for Julia! No transformation needed

OpenVibe CSV → Julia

# CSV: Time series with labels
df = pd.read_csv('data.csv')

# Convert to trials (segment by label changes; see the segment_by_labels sketch earlier)
features_julia = segment_to_trials(df)  # placeholder: returns (n_features, n_samples, n_trials)

Quality Checks

Python Preprocessing Check

def check_preprocessing_quality(features, labels):
    """Check data quality before saving for Julia"""
    
    # Check shape
    assert features.ndim == 3, "Features must be 3D: (features, samples, trials)"
    n_features, n_samples, n_trials = features.shape
    
    # Check for NaN/Inf
    assert not np.any(np.isnan(features)), "Features contain NaN values"
    assert not np.any(np.isinf(features)), "Features contain Inf values"
    
    # Check labels
    assert len(labels) == n_trials, "Labels must match number of trials"
    assert labels.min() >= 1, "Labels must be 1-indexed (start at 1)"
    assert labels.max() <= len(np.unique(labels)), "Labels must be consecutive"
    
    # Check value ranges (not too large)
    assert np.abs(features).max() < 1e6, "Features have suspiciously large values"
    
    print("✓ All quality checks passed!")
    print(f"  Shape: ({n_features}, {n_samples}, {n_trials})")
    print(f"  Classes: {np.unique(labels)}")
    print(f"  Range: [{features.min():.3f}, {features.max():.3f}]")
    
    return True

# Use it
check_preprocessing_quality(X_csp_julia, y)

Julia Loading Check

using NimbusSDK, MAT  # MAT provides matread()

# Load and validate
data = matread("features.mat")
features = data["features"]
labels = Int.(vec(data["labels"]))

println("Loaded shape: ", size(features))
println("Labels: ", unique(labels))

# Check with SDK
metadata = BCIMetadata(
    sampling_rate = 250.0,
    paradigm = :motor_imagery,
    feature_type = :csp,
    n_features = size(features, 1),
    n_classes = length(unique(labels))
)

bci_data = BCIData(features, metadata, labels)

# Run diagnostics
report = diagnose_preprocessing(bci_data)
if !isempty(report.errors)
    @error "Issues found: $(report.errors)"
end

println("Quality score: ", round(report.quality_score * 100, digits=1), "%")

Feature Normalization

Critical for cross-session BCI performance! EEG amplitude varies 50-200% across sessions. Always normalize your features for models used across sessions.

Normalization Workflow

using NimbusSDK, JLD2  # JLD2 provides @save / @load

# 1. Estimate normalization parameters from TRAINING data
train_features = csp_features_train  # (16 × 250 × 80)
norm_params = estimate_normalization_params(train_features; method=:zscore)

# 2. Apply to BOTH training and test data
train_norm = apply_normalization(train_features, norm_params)
test_norm = apply_normalization(test_features, norm_params)

# 3. Save parameters with your model
@save "model_with_norm.jld2" model norm_params

# 4. Later: Load and apply same parameters
@load "model_with_norm.jld2" model norm_params
new_data_norm = apply_normalization(new_data, norm_params)

Available Methods

  • :zscore (recommended): Mean=0, Std=1 - Best for most BCI applications
  • :robust: Uses median/MAD - Better for noisy data with artifacts
  • :minmax: Scale to [0, 1] - For bounded features
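
If you prefer to normalize on the Python side before exporting, the z-score idea is just a per-feature mean and standard deviation estimated from the training array and reused everywhere; a minimal numpy sketch (the SDK's estimate_normalization_params / apply_normalization do the equivalent in Julia):

import numpy as np

def zscore_params(train_features):
    """Per-feature mean/std over samples and trials of a (features, samples, trials) array."""
    mu = train_features.mean(axis=(1, 2), keepdims=True)
    sigma = train_features.std(axis=(1, 2), keepdims=True)
    return mu, sigma

def apply_zscore(features, mu, sigma):
    # Broadcasts the per-feature parameters over samples and trials
    return (features - mu) / sigma

Remember to store mu and sigma alongside your model so the same parameters can be applied to later sessions.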

Performance Impact

Scenario                 | Accuracy Improvement
Same session             | +1%
Cross-session (next day) | +15-25%
Multi-subject transfer   | +15-20%
For complete details, see the Feature Normalization guide.

Common Issues and Solutions

Issue 1: Wrong Data Shape

Error: Dimension mismatch: expected (n_features, n_samples, n_trials), got (trials, features, samples)
Solution:
# Transpose from MNE format to Julia format
features_julia = np.transpose(features_mne, (1, 2, 0))

Issue 2: Labels Start at Zero

Error: Labels must be 1-indexed
Solution:
# MNE labels start at 0, Julia expects 1-indexed
labels_julia = labels + 1

Issue 3: Mixed Data Types

Error: Type mismatch
Solution:
# In Julia, ensure proper types
features = Float64.(data["features"])
labels = Int.(vec(data["labels"]))

Next Steps