Difference between revisions of "ZIM file format"

59 bytes added, 09:02, 17 October 2010
no edit summary
Line 93: Line 93:
The indirection from titles via URLs to directory entries exists for two reasons:
* the title pointer list is only half the size, as 4 bytes are enough for each entry
* accessing directory entries by title also makes use of cached directory entries, which are referenced by the URL pointers, as implemented in zimlib.
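
The two-step lookup described above can be sketched as follows. This is only an illustration, not zimlib's code: the function name and the sample data are hypothetical, and it assumes that each title pointer is a 4-byte index into the URL pointer list and that each URL pointer is an 8-byte file offset.

```python
import struct


def resolve_title_index(title_ptrs: bytes, url_ptrs: bytes, title_index: int) -> int:
    """Return the file offset of the directory entry for the given title index."""
    # Each title pointer is 4 bytes, little endian, and names an entry
    # in the URL pointer list rather than a file offset.
    (url_index,) = struct.unpack_from("<I", title_ptrs, title_index * 4)
    # Assumption: each URL pointer is an 8-byte little-endian file offset.
    (dirent_offset,) = struct.unpack_from("<Q", url_ptrs, url_index * 8)
    return dirent_offset


# Hypothetical data: three directory entries listed in URL order,
# with a title order that differs from the URL order.
url_ptrs = struct.pack("<3Q", 1000, 1040, 1100)
title_ptrs = struct.pack("<3I", 2, 0, 1)
first_by_title = resolve_title_index(title_ptrs, url_ptrs, 0)
```

With these sample values the title pointer list occupies 12 bytes and the URL pointer list 24 bytes, which is the halving mentioned in the first reason.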
 
== Cluster pointer list (clusterPtrPos) ==
 
The cluster pointer list is a global list of 8-byte offsets that point to all data clusters in a ZIM file.
 
{|{{Prettytable}}
! Field Name            !! Type    !!Offset!!Length!! Description
|-
| <1st Cluster>          || integer ||    0 ||    8 || Pointer to the <1st Cluster>
|-
| <2nd Cluster>          || integer ||    8 ||    8 || Pointer to the <2nd Cluster>
|-
| <nth Cluster>          || integer ||(n-1)*8||  8 || Pointer to the <nth Cluster>
|-
| ...                    || integer || ...  ||    8 || ...
|}
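
As a minimal sketch of the layout in the table above, the list can be decoded from a byte buffer like this. The function name is hypothetical, and the cluster count is assumed to be known from elsewhere (for example from the file header); the little-endian byte order follows the convention stated for this format.

```python
import struct
from typing import List


def parse_cluster_pointers(raw: bytes, cluster_count: int) -> List[int]:
    """Decode cluster_count 8-byte little-endian offsets from raw."""
    needed = cluster_count * 8
    if len(raw) < needed:
        raise ValueError(f"need {needed} bytes, got {len(raw)}")
    # "<" selects little endian, "Q" is an unsigned 8-byte integer.
    return list(struct.unpack_from(f"<{cluster_count}Q", raw, 0))


# Hypothetical pointer list with three clusters.
raw = struct.pack("<3Q", 2048, 4096, 9000)
pointers = parse_cluster_pointers(raw, 3)
# The pointer to the nth cluster sits at byte offset (n-1)*8, here for n = 2.
(second,) = struct.unpack_from("<Q", raw, (2 - 1) * 8)
```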


== Directory entries ==
Directory entries hold the meta information about all articles, images and other objects in a ZIM file.


Lengths are given in bytes; all data is little endian.
There are two types of directory entries: article entries and redirect entries. If the first two bytes are 0xffff, the directory entry is a redirect.
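
The redirect check described above could look like the following sketch. The function and constant names are hypothetical; only the 0xffff test on the first two bytes is taken from the text.

```python
import struct

REDIRECT_MARKER = 0xFFFF  # value of the first two bytes of a redirect entry


def is_redirect(dirent: bytes) -> bool:
    """Return True if the raw directory entry is a redirect entry."""
    if len(dirent) < 2:
        raise ValueError("a directory entry needs at least 2 bytes")
    # Read the first two bytes as an unsigned little-endian 16-bit integer.
    (first_field,) = struct.unpack_from("<H", dirent, 0)
    return first_field == REDIRECT_MARKER


# Hypothetical entries: only the first two bytes matter for this check.
redirect_entry = b"\xff\xff" + b"\x00" * 10
article_entry = b"\x05\x00" + b"\x00" * 10
```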


=== article entry ===
{|{{Prettytable}}
! Field Name !! Type !! Offset !! Length !! Description
Line 163: Line 146:
| parameter || data || || see extra len || extra parameters
|-
|}


== Clusters ==
The clusters contain the actual article data. This file section contains a list of clusters, each of which contains a list of blobs. A blob is the data of one specific article, so a blob is addressed by the cluster number and the blob number within that cluster. The cluster number is used to look up the file offset in the cluster pointer list.
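
The first half of this addressing scheme, going from a cluster number to a file offset, can be sketched as below. The names BlobAddress and cluster_file_offset are hypothetical, cluster numbers are assumed to count from zero, and locating the blob inside the cluster is left out because it depends on the cluster layout.

```python
import io
import struct
from typing import BinaryIO, NamedTuple


class BlobAddress(NamedTuple):
    """Address of one blob: which cluster, and which blob inside it."""
    cluster_number: int
    blob_number: int


def cluster_file_offset(f: BinaryIO, cluster_ptr_pos: int, address: BlobAddress) -> int:
    """Look up the file offset of the cluster that holds the addressed blob."""
    # Assumption: cluster numbers start at 0, so entry i begins at i * 8
    # bytes after the start of the cluster pointer list (clusterPtrPos).
    f.seek(cluster_ptr_pos + address.cluster_number * 8)
    raw = f.read(8)
    if len(raw) != 8:
        raise ValueError("cluster pointer list is truncated")
    return struct.unpack("<Q", raw)[0]


# Hypothetical in-memory file: 16 bytes of padding, then two cluster pointers.
fake_file = io.BytesIO(b"\x00" * 16 + struct.pack("<2Q", 32, 500))
offset = cluster_file_offset(fake_file, 16, BlobAddress(cluster_number=1, blob_number=0))
```

Reading a single 8-byte entry on demand avoids loading the whole pointer list when only one article is needed.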

