# Streaming Inference Configuration

> Configure cross-SDK chunk size, temporal aggregation, and streaming decision patterns for low-latency BCI inference.

This page is the cross-SDK configuration overview for streaming inference. It explains the shared concepts: when to stream, how to choose chunk sizes, and how to aggregate chunk predictions.

For implementation-specific APIs, use the SDK pages:

* **Python**: [Python Streaming Inference](/python-sdk/streaming-inference) documents `StreamingSession` and `StreamingSessionSTS`.
* **Julia**: [Julia Streaming Inference](/julia-sdk/streaming-inference) documents `init_streaming`, `process_chunk`, and `finalize_trial`.

## Streaming vs Batch

Use **batch inference** when full trials are already available and you are doing offline evaluation, model validation, or analytics.

Use **streaming inference** when feature chunks arrive over time and the application needs low-latency updates before a complete trial has finished.

```text
EEG stream -> preprocessing -> feature chunks -> streaming session -> chunk posteriors -> final decision
```

## Core Configuration

Every streaming setup needs:

* `sampling_rate`: acquisition rate in Hz.
* `chunk_size`: number of samples per chunk.
* `paradigm`: task type such as motor imagery, P300, or SSVEP.
* `feature_type`: feature representation such as CSP, bandpower, or ERP amplitude.
* `n_features`: feature count per chunk.
* `n_classes`: number of output classes.
* `temporal_aggregation`: how to reduce feature time structure when required.

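As a minimal sketch of how these fields relate, the snippet below groups them in a plain Python dataclass and converts a target chunk duration into `chunk_size`. `StreamingConfig` and `chunk_size_for` are hypothetical names used only for illustration and are not part of either SDK; the string values for `paradigm`, `feature_type`, and `temporal_aggregation`, and the feature count, are placeholders, so check the SDK pages for the accepted values.

```python
from dataclasses import dataclass


def chunk_size_for(duration_s: float, sampling_rate: float) -> int:
    """Convert a target chunk duration (seconds) into a sample count."""
    if duration_s <= 0 or sampling_rate <= 0:
        raise ValueError("duration_s and sampling_rate must be positive")
    # Round to the nearest whole sample, but never return an empty chunk.
    return max(1, round(duration_s * sampling_rate))


@dataclass(frozen=True)
class StreamingConfig:
    """Hypothetical container for the shared fields; not an SDK class."""

    sampling_rate: float        # acquisition rate in Hz
    chunk_size: int             # samples per chunk
    paradigm: str               # placeholder value, e.g. "motor_imagery"
    feature_type: str           # placeholder value, e.g. "csp"
    n_features: int             # feature count per chunk
    n_classes: int              # number of output classes
    temporal_aggregation: str   # placeholder value, e.g. "mean"

    def __post_init__(self) -> None:
        # Basic sanity checks on the numeric fields.
        if self.sampling_rate <= 0:
            raise ValueError("sampling_rate must be positive")
        if self.chunk_size < 1:
            raise ValueError("chunk_size must be at least 1 sample")
        if self.n_features < 1 or self.n_classes < 2:
            raise ValueError("need n_features >= 1 and n_classes >= 2")

    @property
    def chunk_duration_s(self) -> float:
        """Wall-clock time covered by one chunk, in seconds."""
        return self.chunk_size / self.sampling_rate


# Example: 0.5 s chunks at an assumed 250 Hz acquisition rate.
config = StreamingConfig(
    sampling_rate=250.0,
    chunk_size=chunk_size_for(0.5, 250.0),
    paradigm="motor_imagery",
    feature_type="csp",
    n_features=8,
    n_classes=2,
    temporal_aggregation="mean",
)
print(config.chunk_size, config.chunk_duration_s)  # prints: 125 0.5
```

Keeping `chunk_size` in samples and deriving the duration from `sampling_rate` means the same configuration can be checked against a latency budget on any acquisition hardware.
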
## Chunk Size Guidelines

| Chunk duration | Typical use                                  | Trade-off                              |
| -------------- | -------------------------------------------- | -------------------------------------- |
| 0.25-0.5 s     | Fast feedback, games, exploratory interfaces | Lower latency, less evidence per chunk |
| 0.5-1.0 s      | Most real-time BCI systems                   | Balanced responsiveness and confidence |
| 1.0-2.0 s      | Medical or high-stakes decisions             | Higher confidence, slower feedback     |

Start with 0.5-1.0 s chunks for motor imagery and adjust based on confidence, latency, and user experience.

## Aggregation Methods

Streaming produces one posterior per chunk. Trial-level decisions combine those chunk posteriors.

| Method           | Use when                                               |
| ---------------- | ------------------------------------------------------ |
| `weighted_vote`  | You want confidence-weighted decisions across chunks   |
| `posterior_mean` | You want smooth posterior averaging across chunks      |
| `max_confidence` | You trust the single most confident chunk              |
| `unanimous`      | You need conservative decisions only when chunks agree |

Use `weighted_vote` as the default. Switch to `posterior_mean` when you want smoother probabilities, or `max_confidence` when the task has short high-signal windows.

## Quality Gates

Streaming systems should monitor:

* confidence (maximum posterior probability)
* entropy (prediction uncertainty)
* class balance over recent trials
* rejection rate
* per-chunk and per-trial latency

For Python rejection and quality APIs, see [Python SDK API Reference](/python-sdk/api-reference). For end-to-end real-time setup, see [Real-Time BCI Setup](/inference-configuration/real-time-setup).

## Next Read

* Python `StreamingSession` examples and STS state handling.
* Julia streaming API for local chunk processing.
* Hardware, LSL, BrainFlow, and acquisition-loop guidance.
* Offline trial processing and diagnostics.