emg2qwerty: A Large Dataset with Baselines for Touch Typing using Surface Electromyography
emg2qwerty is the largest public surface electromyography (sEMG) dataset to date, comprising 1,135 sessions from 108 participants performing touch typing on a QWERTY keyboard. The dataset captures wrist-based sEMG signals (32 channels, 2000 Hz sampling rate) synchronized with keystroke ground truth, totaling 346.4 hours of data and 5.26 million keystrokes. Designed to enable keyboard-free text input through decoding of typing intent from neuromuscular activity, the dataset supports research in sequence-to-sequence learning, cross-user generalization, domain adaptation, and neuromotor interfaces for AR/VR and accessibility applications.
AI-generated description, may include mistakes.
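The headline figures in the description can be combined into a few per-session and per-second averages, which give a rough sense of scale. The sketch below uses only the numbers quoted above. It assumes that the 346.4 hours is total recording time and that all 32 channels are sampled at 2000 Hz for that whole duration; neither assumption is confirmed by the description, and the function name `derived_stats` is purely illustrative.

```python
# Figures quoted in the dataset description.
SESSIONS = 1135
PARTICIPANTS = 108
HOURS = 346.4
KEYSTROKES = 5.26e6
CHANNELS = 32
SAMPLE_RATE_HZ = 2000


def derived_stats():
    """Return back-of-the-envelope averages derived from the quoted totals.

    Assumption: HOURS is total recording time and every channel is sampled
    at SAMPLE_RATE_HZ for the full duration.
    """
    seconds = HOURS * 3600
    return {
        "minutes_per_session": HOURS * 60 / SESSIONS,
        "sessions_per_participant": SESSIONS / PARTICIPANTS,
        "keystrokes_per_second": KEYSTROKES / seconds,
        "samples_per_channel": seconds * SAMPLE_RATE_HZ,
        "total_samples": seconds * SAMPLE_RATE_HZ * CHANNELS,
    }


if __name__ == "__main__":
    for name, value in derived_stats().items():
        print(f"{name}: {value:,.2f}")
```

These are averages over the whole dataset, so individual sessions and participants may differ substantially from them.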
Coming soon. Per-file data-quality summaries are precomputed by the NEMAR processing pipeline. The static aggregate is on the way — tracked at nemar-cli#511.