(→URLs) |
(→Clusters: Add 5 for Zstandard compression) |
||
(3 intermediate revisions by 2 users not shown) | |||
Line 13: | Line 13: | ||
! Field Name !! Type !! Offset !! Length !! Description | ! Field Name !! Type !! Offset !! Length !! Description | ||
|- | |- | ||
| magicNumber || integer || 0 || 4 || Magic number to recognise the file format, must be | | magicNumber || integer || 0 || 4 || Magic number to recognise the file format, must be 72173914 (0x044D495A) | ||
|- | |- | ||
|majorVersion | |majorVersion | ||
Line 199: | Line 199: | ||
The first byte of the cluster identifies some information about the cluster. | The first byte of the cluster identifies some information about the cluster. | ||
The four low bits identify whether the cluster is compressed (4) or not (0). The default is uncompressed, indicated by a value of 0 or 1 (obsolete, inherited from Zeno), while compressed clusters are indicated by a value of 4, which indicates [[LZMA2 compression]] (or more precisely XZ, since there is an XZ header). Other compression algorithms were used before (2: zlib, 3: bzip2) but have been removed. The zimlib uses [http://tukaani.org/xz/ xz-utils] as a C++ implementation of lzma2, for Java see [http://tukaani.org/xz/java.html XZ-Java]. | The four low bits identify whether the cluster is compressed (4 or 5) or not (0). The default is uncompressed, indicated by a value of 0 or 1 (obsolete, inherited from Zeno), while compressed clusters are indicated by a value of 4, which indicates [[LZMA2 compression]] (or more precisely XZ, since there is an XZ header), or 5, which indicates Zstandard compression. Other compression algorithms were used before (2: zlib, 3: bzip2) but have been removed. The zimlib uses [http://tukaani.org/xz/ xz-utils] as a C++ implementation of lzma2, for Java see [http://tukaani.org/xz/java.html XZ-Java]. | ||
The fifth bit identifies if the cluster is extended or not: | The fifth bit identifies if the cluster is extended or not: | ||
Line 261: | Line 261: | ||
| W || categories per article, category list - see [[Category Handling]] | | W || categories per article, category list - see [[Category Handling]] | ||
|- | |- | ||
| X || | | X || search indexes | ||
|} | |} | ||
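
The magic number check and the cluster information byte described in the revised text can be sketched in Python as follows. This is a minimal illustration and not the zimlib implementation: the names <code>check_magic</code>, <code>decode_cluster_info</code> and <code>COMPRESSION_NAMES</code> are hypothetical, and little-endian byte order for the 4-byte magic number is an assumption that the excerpt above does not state.

```python
import struct

# Magic number from the header table: 72173914 (0x044D495A).
ZIM_MAGIC = 72173914

# Values of the four low bits of the cluster information byte,
# as listed in the text above. The labels are illustrative only.
COMPRESSION_NAMES = {
    0: "none",
    1: "none",             # obsolete, inherited from Zeno
    2: "zlib (removed)",
    3: "bzip2 (removed)",
    4: "xz/lzma2",
    5: "zstandard",
}


def check_magic(header: bytes) -> bool:
    """Return True if the first 4 bytes hold the ZIM magic number.

    Assumption: the integer is stored little-endian.
    """
    (magic,) = struct.unpack_from("<I", header, 0)
    return magic == ZIM_MAGIC


def decode_cluster_info(info_byte: int):
    """Split the first byte of a cluster into (compression, extended)."""
    compression = info_byte & 0x0F      # four low bits
    extended = bool(info_byte & 0x10)   # fifth bit
    return COMPRESSION_NAMES.get(compression, "unknown"), extended
```

Under the little-endian assumption, the magic number appears on disk as the bytes <code>5A 49 4D 04</code>, that is, the ASCII text "ZIM" followed by <code>0x04</code>. Masking the byte rather than comparing it whole keeps the compression type readable even when the extended flag is set.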