
Extract the (MIDI) notes from a single-voiced audio file.

The melody analysis task detects the note pitches in a monophonic audio file, such as a recording of a vocal or single-voiced instrument. It can also return a detailed pitch contour curve. The result can be used to generate a score-like representation of the audio or to convert audio to MIDI. Furthermore, the results can be re-used in the elastiqueTune task to modify individual notes. For more details on changing individual notes, see process/elastiqueTune.
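A melody analysis request is just a URL. The sketch below builds one; the endpoint path and the parameter names `access_id` and `input_file` are assumptions based on typical sonicAPI usage, so check the API reference for the exact request format.

```python
from urllib.parse import urlencode

# Hypothetical endpoint -- verify against the sonicAPI reference.
BASE_URL = "https://api.sonicapi.com/analyze/melody"

def build_melody_request(access_id: str, file_id: str,
                         detailed_result: bool = False,
                         fmt: str = "json") -> str:
    """Build the analysis request URL for a previously uploaded file."""
    params = {
        "access_id": access_id,  # your API access key (assumed name)
        "input_file": file_id,   # file_id from the upload step (assumed name)
        "detailed_result": str(detailed_result).lower(),
        "format": fmt,           # xml / json / jsonp / xmlp
    }
    return f"{BASE_URL}?{urlencode(params)}"

url = build_melody_request("YOUR-ACCESS-ID", "abc123", detailed_result=True)
```

Requesting the resulting URL (e.g. with any HTTP client) triggers the analysis and returns the result in the chosen format.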


Parameter Type Description
detailed_result optional bool [false, true] default: false Include additional data (raw pitch curve) in the task result.

This can produce a very long list of pitch marks in the analysis result, so enable it only if you really need it.



The response gives you a status code, the file_id and the corresponding download URL of the resulting file (audio for process tasks, XML for analyze tasks). For processing reports, use the /file/status request with the parameter format=xml/json/jsonp/xmlp.

Name Description
status The status code of the task.
file_id The unique identifier of the file.
href The direct download link to the file including the file_id.

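Extracting these three fields from a JSON-formatted response might look like the sketch below. The nesting of the sample payload is illustrative only (an assumption, not the documented schema); adapt the field access to the response your request actually returns.

```python
import json

# Illustrative response layout -- the real schema may nest differently.
sample = json.loads("""
{
  "status": {"code": 0},
  "file": {
    "file_id": "abc123",
    "href": "https://api.sonicapi.com/file/download?file_id=abc123"
  }
}
""")

status_code = sample["status"]["code"]      # 0 usually means success
file_id = sample["file"]["file_id"]         # unique identifier of the result
download_url = sample["file"]["href"]       # direct download link
```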

The result contains the estimated key and tuning frequency of the input file, a list of all detected notes, and optionally the raw pitch curve (if the detailed_result parameter was set to true).

Name Type Description
key string The name of the extracted key.

The algorithm detects both major (Maj) and minor (min) scales built on 12 different root notes (C, C#, D, D#, E, F, F#, G, G#, A, A#, B). There are no flat root notes because enharmonic equivalence is assumed (C sharp equals D flat).

key_index integer A representation of the key as an index.

The key_index can take the values 0...23. The first 12 indices indicate the major keys and the remaining 12 the minor keys, both starting from C (example: d# min is 15).

tuning_frequency float The melody's tuning frequency in Hz.

The tuning frequency is the frequency of the concert pitch A4 and typically equals 440 Hz. It may deviate by a few Hertz for some songs.

notes List of notes extracted from the music.

Each note has several attributes with information on the position, duration and pitch (see below).
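The mapping between key_index and key name is fully determined by the description above, so it can be reconstructed in a few lines. This is a sketch; the exact capitalization of the key string in the actual result (e.g. "d# min") may differ.

```python
# Root notes in chromatic order, sharps only (enharmonic equivalence).
ROOTS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def key_name(key_index: int) -> str:
    """Map a key_index (0..23) to its name: 0..11 major, 12..23 minor."""
    if not 0 <= key_index <= 23:
        raise ValueError("key_index must be in 0..23")
    quality = "Maj" if key_index < 12 else "min"
    return f"{ROOTS[key_index % 12]} {quality}"
```

For example, index 15 falls in the minor range (15 - 12 = 3, i.e. the fourth chromatic root), giving D# minor, which matches the d# min example above.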

Each note has the following attributes:

Name Type Description
midi_pitch float The average pitch (perceived frequency) of the note as a MIDI pitch index.

Example: C4 (middle C) is represented by the midi_pitch 60.0; a midi_pitch of 60.5 would be 50 cents higher. All MIDI pitches are given with respect to a tuning frequency of 440 Hz. If a frequency deviation such as vibrato is present during a note, the average pitch will be the note's centre frequency.

onset_time float The beginning time of the note (in seconds)
duration float The duration of the note (in seconds)
volume float The volume or velocity of the note (normalized to the range between 0 and 1)
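Converting a (possibly fractional) midi_pitch back to a frequency uses the standard equal-temperament relation, with A4 = MIDI 69 at the tuning frequency. This is the conventional MIDI formula, not something stated in the result itself:

```python
def midi_to_hz(midi_pitch: float, tuning: float = 440.0) -> float:
    """Convert a fractional MIDI pitch to frequency in Hz.

    A4 (MIDI 69) maps to the tuning frequency; one semitone is a
    factor of 2**(1/12), so +0.5 in midi_pitch is +50 cents.
    """
    return tuning * 2.0 ** ((midi_pitch - 69.0) / 12.0)
```

For example, midi_to_hz(60.0) gives about 261.63 Hz for middle C, and midi_to_hz(81.0) gives exactly one octave above A4, i.e. 880 Hz.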

Each pitch_mark in the optional pitch_curve result has these attributes:

Name Type Description
tonal string "true" for tonal sections, "false" if the signal is either silent or non-tonal.
midi_pitch float The instantaneous pitch at the pitch mark position as MIDI pitch index.
time float The position of the pitch mark (in seconds)
volume float The instantaneous volume at the pitch mark position (normalized to the range between 0 and 1)
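Because tonal is a string, pitch marks for silent or non-tonal sections must be filtered out before doing statistics on the curve. A minimal sketch with made-up sample values (the dictionary keys mirror the attribute names above; the actual result structure may differ):

```python
# Illustrative pitch_curve entries -- values are invented for the example.
pitch_curve = [
    {"tonal": "true",  "midi_pitch": 60.1, "time": 0.00, "volume": 0.8},
    {"tonal": "false", "midi_pitch": 0.0,  "time": 0.01, "volume": 0.0},
    {"tonal": "true",  "midi_pitch": 60.3, "time": 0.02, "volume": 0.7},
]

# Keep only tonal marks; note the string comparison, not a bool check.
tonal_marks = [m for m in pitch_curve if m["tonal"] == "true"]
mean_pitch = sum(m["midi_pitch"] for m in tonal_marks) / len(tonal_marks)
```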

One-click example

You can analyze the melody of this saxophone recording with a single click on the button on the demo page. The button is just a link with a specially constructed URL: by requesting this URL, the input file is imported into the system, processed by sonicAPI, and the analysis result is displayed in your browser. The Live Demo generates additional example code.