ZIM file format

From openZIM
Revision as of 12:09, 22 February 2009 by Tntnet (talk | contribs) (documentation updated)

The ZIM file format is based on the zeno file format. It starts with a header, which is described here:

Lengths and offsets are given in bytes; all types are little-endian.

Field Name    Type     Offset  Length  Description
rMagicNumber  integer  0       4       Magic number to recognise the file format, must be 1439867043
rVersion      integer  4       4       Version of the file format for backwards compatibility: wp2006=2, wp2007=3, ZIM=4
rCount        integer  8       4       Total number of articles
(unnamed)     integer  12      4       Deprecated
rIndexPos     integer  16      8       Position of the article index
rIndexLen     integer  24      4       Length of the article index
headerLen     integer  28      4       Length of the header (currently 60)
rIndexPtrPos  integer  32      8       Position of the directory pointer list
rIndexPtrLen  integer  40      4       Length of the directory pointer list (always 4*rCount)
rMainPage     integer  44      4       Article index of the main page, or 0xffff if there is no main page
rLayoutPage   integer  48      4       Article index of the layout page, or 0xffff if there is no layout page
(unnamed)     integer  52      8       Deprecated
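
As an illustration, the header layout above could be decoded with Python's struct module. This is a minimal sketch rather than a reference implementation: it assumes that every field is an unsigned integer (the table only says "integer"), and the names ZimHeader, parse_header, deprecated1 and deprecated2 are hypothetical and do not appear in the format description.

```python
import struct
from collections import namedtuple

# "<" selects little-endian with no padding; I = 4-byte and Q = 8-byte
# unsigned integer. Treating the fields as unsigned is an assumption.
HEADER_FORMAT = "<IIIIQIIQIIIQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 60 bytes, matching headerLen
MAGIC = 1439867043

# Hypothetical field names, listed in the order of the table.
ZimHeader = namedtuple(
    "ZimHeader",
    "magic version count deprecated1 index_pos index_len header_len "
    "index_ptr_pos index_ptr_len main_page layout_page deprecated2",
)


def parse_header(data):
    """Decode the fixed-size header from the first bytes of a file."""
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least %d bytes for the header" % HEADER_SIZE)
    header = ZimHeader._make(struct.unpack_from(HEADER_FORMAT, data, 0))
    if header.magic != MAGIC:
        raise ValueError("unexpected magic number: %d" % header.magic)
    return header
```

A caller would typically read the first 60 bytes of the file in binary mode and pass them to parse_header; checking that index_ptr_len equals 4 * count is a cheap sanity test on the result.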

Each article in the ZIM file has a directory entry. Because directory entries vary in size, the file also contains an index pointer list: a list of 4-byte offsets, each of which points to one directory entry.
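
Reading the pointer list might look like the following Python sketch. It assumes the offsets are unsigned and little-endian like the header fields, and it returns them as stored, since the text above does not say what position the offsets are measured from. The function name read_pointer_list and its parameters are hypothetical.

```python
import struct


def read_pointer_list(data, ptr_pos, count):
    """Return `count` 4-byte little-endian offsets stored at `ptr_pos`.

    `data` holds the file contents as bytes; `ptr_pos` and `count`
    correspond to rIndexPtrPos and rCount from the header.
    """
    end = ptr_pos + 4 * count
    if ptr_pos < 0 or end > len(data):
        raise ValueError("pointer list extends beyond the available data")
    # One "I" (unsigned 4-byte integer) per article, e.g. "<3I" for 3 articles.
    return list(struct.unpack_from("<%dI" % count, data, ptr_pos))
```

The n-th element of the returned list is the stored offset of the directory entry for article n, so a variable-size entry can be located without scanning the entries before it.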