The Standard MIDI File (SMF) is a file format designed to store the data that a sequencer records and plays. The format stores the standard MIDI messages plus a time-stamp for each message.
The format was designed to be generic so that any sequencer could read or write such a file without losing the most important data, and flexible enough for a particular application to store its own proprietary, "extra" data without disturbing other applications.
Data is always saved within a chunk. There can be many chunks inside of a MIDI file. Each chunk can be a different size (number of bytes in the chunk). A chunk is simply a group of related bytes.
Each chunk begins with a 4-character ID (4 ASCII bytes) which tells what "type" of chunk this is. The next 4 bytes form a 32-bit length (size) of the chunk, stored most significant byte first.
All chunks must begin with these two fields (8 bytes), which are referred to as the chunk header.
Because all data is saved within chunks, the format can accommodate proprietary data chunks. An example of this is the additional data chunks in Yamaha keyboard style files (CASM, OTS, and MDB).
NOTE: The Length does not include the 8 byte chunk header. It simply tells you how many bytes of data are in the chunk following this header.
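The chunk layout described above can be sketched in a few lines of Python. This is only an illustration, not a complete reader: the function name read_chunk and its return shape are invented for the example, and the input is the MThd example used in this article.

```python
import struct

def read_chunk(data, offset=0):
    """Return (chunk_id, payload, next_offset) for the chunk at `offset`.

    Hypothetical helper for illustration only.
    """
    # 4 ASCII bytes of ID, then a 32-bit length, most significant byte first.
    chunk_id, length = struct.unpack_from(">4sI", data, offset)
    start = offset + 8      # data begins right after the 8 byte chunk header
    end = start + length    # Length counts the data bytes only, not the header
    if end > len(data):
        raise ValueError("chunk length runs past the end of the data")
    return chunk_id.decode("ascii"), data[start:end], end

# The example MThd chunk: 8 byte header followed by 6 data bytes.
example = bytes.fromhex("4D546864 00000006 0001 0002 0060")
chunk_id, payload, next_offset = read_chunk(example)
print(chunk_id, len(payload), next_offset)
```

The returned next_offset is where the next chunk header (or the end of the file) should be found.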
The MThd chunk header (with bytes expressed in hex): 4D 54 68 64 00 00 00 06
The first 4 bytes make up the ASCII ID of MThd, i.e. the ASCII values for 'M', 'T', 'h', and 'd'.
The next 4 bytes tell us that there should be 6 more data bytes in the chunk (and after that we should find the next chunk header or the end of the file).
The first two data bytes tell the Format. There are 3 different formats of MIDI files.
• Format 0: One single track containing MIDI data on possibly all 16 MIDI channels.
• Format 1: One or more simultaneous (i.e. all start from an assumed time of 0) tracks, perhaps each on a single MIDI channel.
• Format 2: One or more sequentially independent single-track patterns.
The next 2 bytes tell how many tracks are stored in the file. Of course, for format type 0, this is always 1. For the other 2 types, there can be numerous tracks.
The last two bytes give the resolution upon which the time-stamps are based, expressed as the number of Pulses (i.e. clocks) Per Quarter Note (abbreviated as PPQN).
For example, if your sequencer has a resolution of 96 PPQN, this field would be (in hex): 00 60
An example of a complete MThd chunk (with bytes expressed in hex):
4D 54 68 64 | MThd ID                              |
00 00 00 06 | Length of MThd chunk is 6            |
00 01       | Format type is 1                     |
00 02       | There are 2 MTrk chunks in this file |
00 60       | Pulses Per Quarter Note is 96        |
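As a rough illustration of the field layout just described, the 14 bytes of the example MThd chunk could be unpacked as below. This is a sketch under stated assumptions: the function name parse_mthd and the dictionary keys are made up for the example, and the last field is treated as a plain PPQN value as in the text.

```python
import struct

def parse_mthd(chunk):
    """Decode a complete MThd chunk (8 byte header plus 6 data bytes).

    Hypothetical helper; assumes the final field is a PPQN value.
    """
    # ID (4 bytes), Length (32-bit), then three 16-bit fields, all big-endian.
    chunk_id, length, fmt, ntracks, ppqn = struct.unpack(">4sIHHH", chunk[:14])
    if chunk_id != b"MThd":
        raise ValueError("not an MThd chunk")
    if length != 6:
        raise ValueError("this sketch expects exactly 6 data bytes")
    if fmt == 0 and ntracks != 1:
        raise ValueError("a format 0 file must contain exactly one track")
    return {"format": fmt, "tracks": ntracks, "ppqn": ppqn}

header = parse_mthd(bytes.fromhex("4D546864 00000006 0001 0002 0060"))
print(header)
```

For the example bytes, the fields decode to format 1, 2 tracks, and 96 PPQN, matching the table above.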
After the MThd chunk, you should find an MTrk chunk, as this is the only other currently defined MIDI chunk. An MTrk chunk contains all of the MIDI data (with timing bytes).
There will be as many MTrk chunks in the file as the MThd chunk indicates.
The MTrk header begins with the ID of MTrk, followed by the Length (i.e. number of data bytes to read for this track). The Length will likely be different for each track.
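One way to gather the tracks is to walk the file chunk by chunk, keep the MTrk chunks, and use the Length to step over anything else. The sketch below assumes that approach; iter_chunks and collect_tracks are hypothetical names, "Xtra" is an invented chunk ID standing in for a proprietary chunk, and the track data is a placeholder (a delta-time of 0 followed by an End of Track meta event).

```python
import struct

def iter_chunks(data):
    """Yield (chunk_id, payload) for each chunk in `data`."""
    offset = 0
    while offset + 8 <= len(data):
        chunk_id, length = struct.unpack_from(">4sI", data, offset)
        start = offset + 8
        yield chunk_id, data[start:start + length]
        offset = start + length   # Length lets us skip chunks we do not know

def collect_tracks(data):
    """Return the data of every MTrk chunk, checked against the MThd count."""
    chunks = iter_chunks(data)
    first_id, header = next(chunks)
    if first_id != b"MThd":
        raise ValueError("file does not start with an MThd chunk")
    declared = struct.unpack_from(">H", header, 2)[0]   # number of tracks
    tracks = [payload for cid, payload in chunks if cid == b"MTrk"]
    if len(tracks) != declared:
        raise ValueError("MThd declares %d tracks, found %d" % (declared, len(tracks)))
    return tracks

# Build a tiny in-memory file: MThd, MTrk, an unknown chunk, MTrk.
mthd = bytes.fromhex("4D546864 00000006 0001 0002 0060")
track = b"MTrk" + struct.pack(">I", 4) + bytes.fromhex("00FF2F00")
extra = b"Xtra" + struct.pack(">I", 2) + b"\x01\x02"   # invented chunk ID
tracks = collect_tracks(mthd + track + extra + track)
print(len(tracks))
```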
A MIDI track contains a series of events. For example, the first event in the track may be to sound a middle C note. The second event may be to sound the E above middle C.
These two events may both happen at the same time. The third event may be to release the middle C note, and it may occur a few musical beats after the first two events.
Each event has a "time" when it must occur, and the events are arranged within a chunk in the order that they occur.
In a MIDI file, an event's "time" precedes the data bytes that make up the event itself, i.e. the time-stamp comes before the message.
A given event's time-stamp is measured relative to the previous event. For example, if the first event occurs 4 clocks after the start of play, then its "delta-time" is 04. If the next event occurs simultaneously with that first event, its delta-time is 00.
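Because each delta-time is relative to the previous event, the absolute position of an event is the running sum of the delta-times up to and including it. A small sketch using the three events described earlier; the 192 clock gap before the note release is an assumed value (two quarter notes at 96 PPQN), and the event labels are plain strings rather than real MIDI messages.

```python
from itertools import accumulate

def absolute_times(deltas):
    """Convert a list of delta-times into absolute clock positions."""
    return list(accumulate(deltas))

# (delta-time, description) pairs; the descriptions are illustrative only.
events = [
    (4, "sound middle C"),
    (0, "sound the E above middle C"),   # same instant as the first event
    (192, "release middle C"),           # assumed gap: 2 beats at 96 PPQN
]

times = absolute_times([delta for delta, _ in events])
for when, (_, what) in zip(times, events):
    print(when, what)
```

The first two events land on clock 4 and the third on clock 196.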
A delta-time is stored as a series of 1 to 4 bytes, which is called a variable length quantity.
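The text above does not spell out the encoding, so the following sketch relies on how the SMF specification defines a variable length quantity: each byte carries 7 bits of the value, most significant group first, and every byte except the last has its top bit set. The names read_vlq and write_vlq are hypothetical.

```python
def read_vlq(data, offset=0):
    """Decode a variable length quantity; return (value, next_offset)."""
    value = 0
    for i in range(4):                          # at most 4 bytes
        byte = data[offset + i]
        value = (value << 7) | (byte & 0x7F)    # append the low 7 bits
        if (byte & 0x80) == 0:                  # top bit clear: last byte
            return value, offset + i + 1
    raise ValueError("variable length quantity longer than 4 bytes")

def write_vlq(value):
    """Encode a value in the range 0..0x0FFFFFFF as 1 to 4 bytes."""
    if not 0 <= value <= 0x0FFFFFFF:
        raise ValueError("value does not fit in 4 bytes")
    out = [value & 0x7F]                        # last byte, top bit clear
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)       # earlier bytes, top bit set
        value >>= 7
    return bytes(reversed(out))

print(read_vlq(bytes.fromhex("04")))        # the delta-time 04 from the text
print(read_vlq(bytes.fromhex("8100")))      # a two-byte quantity
print(write_vlq(128).hex())
```

With 7 usable bits per byte, 4 bytes can hold values up to 0x0FFFFFFF.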