Brain-computer interfaces (BCIs) are finally arriving at their ‘foundation model’ moment. Zyfra, a research lab focused on large-scale models, recently released ZUNA, a 380M-parameter foundation model built specifically for EEG signals. ZUNA is a masked diffusion autoencoder designed to perform channel infilling and super-resolution for any electrode layout. The release includes weights under the Apache-2.0 license and an MNE-compatible inference stack.
The problem with ‘brittle’ EEG models
For decades, researchers have struggled with the ‘Wild West’ of EEG data. Different datasets use different numbers of channels and inconsistent electrode positions. Most deep learning models are trained on fixed channel montages, so they fail when applied to new datasets or recording setups. In addition, EEG measurements are often contaminated by noise from electrode shifts or subject movement.
ZUNA’s 4D Architecture: Spatial Intelligence
ZUNA tackles the generalization problem by treating brain signals as spatially grounded data. Instead of assuming a fixed grid, ZUNA injects spatiotemporal structure through a 4D Rotary Positional Encoding (4D RoPE).
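To make the idea concrete, here is a minimal sketch of one common way to extend rotary encodings to several dimensions: split the feature vector into four groups and rotate each group by an angle derived from one coordinate (x, y, z, or time). The article does not describe ZUNA's exact formulation, so the function name `rope_4d`, the axial split, and the `base` frequency are all assumptions made for illustration.

```python
import numpy as np

def rope_4d(x, coords, base=10000.0):
    """Rotate feature pairs of x by angles derived from 4D coordinates.

    x:      (n_tokens, dim) array, dim divisible by 8
    coords: (n_tokens, 4) array of (x, y, z, t) positions
    NOTE: an axial multi-dimensional RoPE sketch, not ZUNA's actual code.
    """
    n_tokens, dim = x.shape
    assert dim % 8 == 0, "need an even number of features per axis"
    per_axis = dim // 4        # features assigned to each coordinate axis
    n_freq = per_axis // 2     # each rotation acts on a pair of features
    freqs = base ** (-np.arange(n_freq) / n_freq)  # (n_freq,)

    out = np.empty_like(x, dtype=float)
    for axis in range(4):
        lo, hi = axis * per_axis, (axis + 1) * per_axis
        a, b = x[:, lo:lo + n_freq], x[:, lo + n_freq:hi]
        theta = coords[:, axis:axis + 1] * freqs   # (n_tokens, n_freq)
        cos, sin = np.cos(theta), np.sin(theta)
        # Standard 2D rotation applied to each (a, b) feature pair.
        out[:, lo:hi] = np.concatenate(
            [a * cos - b * sin, a * sin + b * cos], axis=1)
    return out
```

Because each pair is only rotated, the dot product between a rotated query and a rotated key depends on the difference of their coordinates rather than on absolute positions. That property is what lets an attention layer reason about relative electrode placement instead of a fixed channel index.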
The model tokenizes multichannel EEG into short temporal windows of 0.125 seconds, or 32 samples each. Each token is mapped to a 4D coordinate: its 3D scalp location (x, y, z) and its coarse-time index.
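The tokenization described above can be sketched as follows. This is an illustrative reading of the article rather than ZUNA's code: the function name `tokenize`, the array layouts, and the use of raw electrode positions as coordinates are assumptions.

```python
import numpy as np

SAMPLES_PER_TOKEN = 32  # 0.125 s per token, as stated in the article

def tokenize(eeg, positions):
    """Cut each channel into 32-sample tokens and attach 4D coordinates.

    eeg:       (n_channels, n_samples) array
    positions: (n_channels, 3) array of scalp (x, y, z) locations
    Returns tokens (n_channels * n_windows, 32) and coords (same rows, 4).
    """
    n_channels, n_samples = eeg.shape
    n_windows = n_samples // SAMPLES_PER_TOKEN
    # Drop any trailing samples that do not fill a whole window.
    trimmed = eeg[:, :n_windows * SAMPLES_PER_TOKEN]
    # Rows are ordered channel by channel, window by window.
    tokens = trimmed.reshape(n_channels * n_windows, SAMPLES_PER_TOKEN)
    xyz = np.repeat(positions, n_windows, axis=0)
    t = np.tile(np.arange(n_windows), n_channels)[:, None]
    coords = np.concatenate([xyz, t], axis=1)  # (x, y, z, coarse-time index)
    return tokens, coords
```

Since each token carries its own coordinates, nothing in this representation depends on how many channels a recording has or in what order they are stored.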

Diffusion as a generative engine
ZUNA uses a diffusion approach because EEG signals are continuous and real-valued. The model pairs a diffusion decoder with an encoder that compresses signal information into a latent bottleneck.
During training, Zyfra used a heavy channel-dropout objective. They randomly removed 90% of the channels in the encoder input and replaced them with zeros. The model was tasked with reconstructing these ‘hidden’ signals from the information in the remaining 10% of channels. This forced the model to learn deep cross-channel correlations and a powerful internal representation of brain activity.
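The masking step of this objective can be sketched in a few lines. Only the corruption of the encoder input is shown, not the diffusion loss itself; the function name `drop_channels` and the choice to always keep at least one channel visible are assumptions for illustration.

```python
import numpy as np

def drop_channels(eeg, drop_fraction=0.9, rng=None):
    """Zero out a random subset of channels in the encoder input.

    eeg: (n_channels, n_samples) array
    Returns the corrupted input and a boolean mask of dropped channels.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_channels = eeg.shape[0]
    n_drop = int(round(drop_fraction * n_channels))
    n_drop = min(n_drop, n_channels - 1)  # assumption: keep one channel visible
    dropped = rng.choice(n_channels, size=n_drop, replace=False)
    mask = np.zeros(n_channels, dtype=bool)
    mask[dropped] = True
    encoder_input = eeg.copy()
    encoder_input[mask] = 0.0  # dropped channels are replaced with zeros
    return encoder_input, mask
```

A training step would then ask the model to reconstruct `eeg[mask]` from `encoder_input`, so the reconstruction target is exactly the set of channels the encoder never saw.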
Massive data pipeline: 2 million channel-hours
Data quality is the heartbeat of any foundation model. Zyfra assembled a harmonized corpus spanning 208 public datasets. This collection includes:
- 2 million channel-hours of EEG recordings.
- More than 24 million non-overlapping 5-second samples.
- A wide range of channel counts, from 2 to 256 per recording.
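As a rough sanity check, the figures above imply an average channel count per sample. The calculation below takes the rounded numbers from the article at face value (exactly 24 million samples and 2 million channel-hours), so the result is only an order-of-magnitude estimate.

```python
n_samples = 24_000_000      # 5-second samples (the article says "more than")
sample_seconds = 5
channel_hours = 2_000_000   # approximate total from the article

# Total wall-clock recording time covered by the samples.
recording_hours = n_samples * sample_seconds / 3600
# Channel-hours divided by recording hours gives the mean channel count.
mean_channels = channel_hours / recording_hours

print(f"{recording_hours:,.0f} hours of recording")    # 33,333 hours
print(f"{mean_channels:.1f} channels on average")      # 60.0 channels
```

An implied average of roughly 60 channels sits comfortably inside the stated 2 to 256 range.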
The preprocessing pipeline standardized all signals to a common sampling rate of 256 Hz. The team used MNE-Python to apply a 0.5 Hz high-pass filter and an adaptive notch filter to remove line noise. Signals were then z-score normalized to zero mean and unit variance while preserving spatial structure.
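The article does not say whether the normalization statistics were computed per channel or over the whole recording. The sketch below assumes the latter, which is one plausible reading of "preserving spatial structure": a single mean and standard deviation per recording leaves the relative amplitudes between electrodes intact. The function name `zscore_recording` and the `eps` guard are illustrative.

```python
import numpy as np

def zscore_recording(eeg, eps=1e-8):
    """Normalize a recording with one global mean and standard deviation.

    eeg: (n_channels, n_samples) array
    NOTE: global statistics are an assumption; per-channel z-scoring would
    instead force every channel to the same scale.
    """
    mean = eeg.mean()
    std = eeg.std()
    return (eeg - mean) / (std + eps)
```

With this choice, a channel that was twice as strong as its neighbor before normalization is still twice as strong afterwards.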
Benchmark: beating the spherical spline
For years, the standard method for filling in missing EEG data has been spherical-spline interpolation. While splines capture local smoothness well, they carry no learned prior and fail when the gap between sensors becomes too large.
ZUNA consistently outperforms spherical-spline interpolation on several benchmarks, including the ANPHY-sleep dataset and the BCI2000 motor-imagery dataset. The performance gap widens significantly at high dropout rates. In extreme 90% dropout scenarios (essentially 10x upsampling), ZUNA maintains high reconstruction fidelity while spline methods degrade rapidly.


Key takeaways
- Universal Generalization: ZUNA is a single 380M-parameter model that works with any EEG system, regardless of the number or position of electrodes. Unlike previous models limited to fixed layouts, it generalizes to diverse datasets and novel channel layouts.
- 4D Spatiotemporal Intelligence: The model uses a 4D Rotary Positional Encoding (4D RoPE) to map brain signals in 3D space (x, y, z) and time.
- Superior Channel Reconstruction: Trained as a masked diffusion autoencoder, ZUNA outperforms traditional spherical-spline interpolation. It excels at ‘super-resolution’, maintaining high accuracy even when 90% of the brain signals are missing or corrupted.
- Comprehensive Training Scale: The model was trained on a harmonized corpus of 208 datasets, totaling approximately 2 million channel-hours and 24 million unique 5-second samples. This scale allows it to learn deep cross-channel correlations that simple geometric methods miss.

