Difference between revisions of "ZIM file format"

From openZIM
(add header length field)
(documentation updated)

Revision as of 12:09, 22 February 2009

The ZIM file format is based on the zeno file format. It starts with a header, which is described here:

Lengths are given in bytes; all types are little-endian.

Field Name     Type     Offset  Length  Description
rMagicNumber   integer  0       4       Magic number to recognise the file format, must be "1439867043"
rVersion       integer  4       4       wp2006=2, wp2007=3, ZIM=4, version of the file format for backwards compatibility
rCount         integer  8       4       total number of articles
(unnamed)      integer  12      4       deprecated
rIndexPos      integer  16      8       position of the article index
rIndexLen      integer  24      4       length of the article index
headerLen      integer  28      4       length of header (currently 60)
rIndexPtrPos   integer  32      8       position to the directory pointerlist
rIndexPtrLen   integer  40      4       length of directory pointerlist (always 4*rCount)
rMainPage      integer  44      4       article index of main page or 0xffff if no main page
rLayoutPage    integer  48      4       article index of layout page or 0xffff if no layout page
(unnamed)      integer  52      8       deprecated
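
The header layout above can be read with a fixed-size little-endian unpack. The following Python sketch is only an illustration, not the openZIM implementation: the names `HEADER_FORMAT`, `ZimHeader`, `parse_header`, `deprecated1` and `deprecated2` are hypothetical, and it assumes that every "integer" field is unsigned, which the table does not state.

```python
import struct
from collections import namedtuple

# Little-endian, no padding. Assumption: all integer fields are unsigned.
# 4 x u32, u64, 2 x u32, u64, 3 x u32, 8 raw bytes (deprecated field).
HEADER_FORMAT = "<4IQ2IQ3I8s"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 60 bytes, matching headerLen
MAGIC_NUMBER = 1439867043

# Field names follow the table; deprecated1/deprecated2 are made-up names
# for the two unnamed deprecated fields.
ZimHeader = namedtuple("ZimHeader", [
    "rMagicNumber", "rVersion", "rCount", "deprecated1",
    "rIndexPos", "rIndexLen", "headerLen",
    "rIndexPtrPos", "rIndexPtrLen",
    "rMainPage", "rLayoutPage", "deprecated2",
])


def parse_header(data):
    """Unpack the first HEADER_SIZE bytes of `data` into a ZimHeader."""
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least %d bytes for the header" % HEADER_SIZE)
    header = ZimHeader._make(struct.unpack_from(HEADER_FORMAT, data, 0))
    if header.rMagicNumber != MAGIC_NUMBER:
        raise ValueError("not a ZIM file: bad magic number")
    return header
```

A caller would typically read the first 60 bytes of the file in binary mode and pass them to `parse_header`. Reading `headerLen` from the parsed result, rather than hard-coding 60, leaves room for the header to grow in later versions.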

Each article in the ZIM file has a directory entry. Because directory entries vary in size, the file also contains an index pointerlist: a list of 4-byte offsets, each of which points to one directory entry.
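
As a rough illustration, the pointerlist can be decoded as a run of little-endian 4-byte integers. The sketch below is hypothetical: the function name `read_pointer_list` and its parameters are invented for this example, the offsets are assumed to be unsigned, and the text does not say what the offsets are relative to, so the function returns them unchanged.

```python
import struct


def read_pointer_list(data, ptr_pos, ptr_len):
    """Return the 4-byte offsets stored at data[ptr_pos:ptr_pos + ptr_len].

    ptr_pos and ptr_len correspond to rIndexPtrPos and rIndexPtrLen
    from the header. Assumption: each offset is an unsigned integer.
    """
    if ptr_len % 4 != 0:
        raise ValueError("pointerlist length must be a multiple of 4")
    if ptr_pos + ptr_len > len(data):
        raise ValueError("pointerlist extends past the end of the data")
    count = ptr_len // 4  # one offset per article
    return list(struct.unpack_from("<%dI" % count, data, ptr_pos))
```

Because `rIndexPtrLen` is always `4*rCount`, the returned list has one offset per article, and the offset for article number `n` is simply element `n` of the list.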