Difference between revisions of "ZIM file format"

25 bytes removed ,  07:44, 23 April 2021
→‎Clusters: Clarification around the cluster compression types
(Remove the idea that titlePtrPos may be set to zero.)
(→‎Clusters: Clarification around the cluster compression types)
Line 188: Line 188:
The first byte of the cluster identifies some information about the cluster.
The first byte of the cluster identifies some information about the cluster.


The four low bits identify whether the cluster is compressed (4) or not (0):
The four low bits identify the cluster compression type:
* The default is uncompressed indicated by a value of 0 or 1 (obsoleted, inherited by Zeno).
* No compression is indicated by a value of 1
* Compressed clusters are indicated by a value of 4 ([[LZMA2 compression]] (or more precisely XZ, since there is an XZ header)) and 5 (Zstandard compression).
* Compressed clusters are indicated by a value of 4 ([[LZMA2 compression]] (or more precisely XZ, since there is an XZ header)) or 5 (Zstandard compression).
* There have been other compression algorithms used before (2: zlib, 3: bzip2) which have been removed.
* There have been other compression algorithms used before which have been removed: 2 for zlib and 3 for bzip2.
The fifth bit identifies if the cluster is extended or not:
* 0 is an obsolete code for no compression (inherited from Zeno)

The fifth bit identifies whether the cluster is extended or not:
* By default (5th bit == 0) the cluster is not extended. This means that the offsets are stored as 4-byte integers, so the content stored in the cluster cannot exceed 4 GB.
* By default (5th bit == 0) the cluster is not extended. This means that the offsets are stored as 4-byte integers, so the content stored in the cluster cannot exceed 4 GB.
* If the cluster is extended (5th bit == 1), the offsets are stored as 8-byte integers, so the content stored in the cluster can exceed 4 GB.
* If the cluster is extended (5th bit == 1), the offsets are stored as 8-byte integers, so the content stored in the cluster can exceed 4 GB.
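
As a rough illustration of the layout described above, the following Python sketch splits the cluster information byte into its compression code and its extended flag. The names `parse_cluster_info`, `ClusterInfo` and `COMPRESSION_NAMES` are hypothetical and do not come from any ZIM library; the masks assume that the "four low bits" are `0x0F` and that the "fifth bit" is the bit with value `0x10`.

```python
from typing import NamedTuple

# Compression codes as listed in the text above (labels are illustrative).
COMPRESSION_NAMES = {
    0: "none (obsolete code inherited from Zeno)",
    1: "none",
    2: "zlib (removed)",
    3: "bzip2 (removed)",
    4: "LZMA2 (XZ)",
    5: "Zstandard",
}


class ClusterInfo(NamedTuple):
    compression: int   # value of the four low bits
    extended: bool     # True when the fifth bit is set
    offset_size: int   # 4 bytes for normal clusters, 8 for extended ones


def parse_cluster_info(info_byte: int) -> ClusterInfo:
    """Decode the first byte of a cluster (hypothetical helper)."""
    if not 0 <= info_byte <= 0xFF:
        raise ValueError("cluster information must fit in one byte")
    compression = info_byte & 0x0F        # four low bits
    extended = bool(info_byte & 0x10)     # fifth bit
    return ClusterInfo(compression, extended, 8 if extended else 4)


if __name__ == "__main__":
    info = parse_cluster_info(0x14)
    label = COMPRESSION_NAMES.get(info.compression, "unknown")
    print(label, info.extended, info.offset_size)
```

A reader would pass the first byte of a cluster to `parse_cluster_info` and then use `offset_size` to decide how wide each offset in the cluster is.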
Line 204: Line 206:
! Field Name !! Type !!Offset!!Length!! Description                 
! Field Name !! Type !!Offset!!Length!! Description                 
|-
|-
| cluster information || integer || 0 || 1 || Four low bits: 0: default (no compression), 1: none (inherited from Zeno), 4: LZMA2 compressed, 5: zstd compressed
| cluster information || integer || 0 || 1 || Four low bits: 1: no compression, 4: LZMA2 compressed, 5: zstd compressed
Fifth bit: 0: normal (OFFSET_SIZE=4), 1: extended (OFFSET_SIZE=8)
Fifth bit: 0: normal (OFFSET_SIZE=4), 1: extended (OFFSET_SIZE=8)
|-
|-
