Transcribing audio recordings of piano to MIDI data
The technique is based on the "Onsets and Frames" paper from Google Magenta (link). We followed the same approach, but we wanted a system that could, at least in theory, transcribe in real time, which means it must not look into the future. We therefore use a unidirectional recurrent network instead of a bidirectional one. The results are less convincing than those Google obtained, but onset detection (determining the time and pitch at which each note begins) works reasonably well.
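To make the real-time argument concrete, here is a minimal NumPy sketch of why a unidirectional recurrent network can run frame by frame while a bidirectional one cannot. It is a toy illustration, not the network used in this project: the plain tanh cell, the random weights, the layer sizes and the function names `rnn_forward` and `birnn_forward` are all assumptions made for the example.

```python
import numpy as np

def rnn_forward(x, w_in, w_rec):
    """Run a plain tanh RNN left to right over x, shape (frames, features)."""
    h = np.zeros(w_rec.shape[0])
    outputs = []
    for frame in x:
        # The state at frame t depends only on frames 0..t.
        h = np.tanh(frame @ w_in + h @ w_rec)
        outputs.append(h)
    return np.stack(outputs)

def birnn_forward(x, w_in, w_rec):
    """Bidirectional variant: concatenate a forward and a backward pass."""
    forward = rnn_forward(x, w_in, w_rec)
    # The backward pass starts at the last frame, so it needs the whole recording.
    backward = rnn_forward(x[::-1], w_in, w_rec)[::-1]
    return np.concatenate([forward, backward], axis=1)

rng = np.random.default_rng(0)
frames, features, hidden = 20, 8, 4  # arbitrary toy sizes
w_in = rng.normal(size=(features, hidden)) * 0.3
w_rec = rng.normal(size=(hidden, hidden)) * 0.5

# Two inputs that share the first 10 frames and differ afterwards.
x = rng.normal(size=(frames, features))
x_future_changed = x.copy()
x_future_changed[10:] = rng.normal(size=(frames - 10, features))

uni_a = rnn_forward(x, w_in, w_rec)
uni_b = rnn_forward(x_future_changed, w_in, w_rec)
bi_a = birnn_forward(x, w_in, w_rec)
bi_b = birnn_forward(x_future_changed, w_in, w_rec)

# Compare the outputs for the shared first 10 frames.
uni_causal = np.allclose(uni_a[:10], uni_b[:10])
bi_causal = np.allclose(bi_a[:10], bi_b[:10])
print("unidirectional output unaffected by future frames:", uni_causal)
print("bidirectional output unaffected by future frames:", bi_causal)
```

In this sketch the unidirectional outputs for the first 10 frames are identical for both inputs, so they could have been emitted as soon as each frame arrived. The bidirectional outputs for those same frames change when later audio changes, which is why that design has to wait for the full recording.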
Example
original
onset transcription
full note transcription
Link to PDF of presentation.
The resulting system was used as part of the “Dear Glenn” project, presented at Ars Electronica 2019 (link).