Mgautierfr (talk | contribs) (Change spec description to new namespace usage.) |
(→Clusters: Clarification around the cluster compression types) |
||
(2 intermediate revisions by one other user not shown) | |||
Line 29: | Line 29: | ||
|- | |- | ||
| titlePtrPos || integer || 40 || 8 || position of the directory pointerlist ordered by Title | | titlePtrPos || integer || 40 || 8 || position of the directory pointerlist ordered by Title | ||
This is considered obsolete; readers should use <code>X/listing/titleordered/v0</code> instead and fall back to <code>titlePtrPos</code> if that entry is not present. | This is considered obsolete; readers should use <code>[[Search indexes#Title index v0|X/listing/titleordered/v0]]</code> instead and fall back to <code>titlePtrPos</code> if that entry is not present. | ||
|- | |- | ||
| clusterPtrPos || integer || 48 || 8 || position of the cluster pointer list | | clusterPtrPos || integer || 48 || 8 || position of the cluster pointer list | ||
Line 75: | Line 73: | ||
The URL pointer list is a list of 8 byte offsets to the directory entries. | The URL pointer list is a list of 8 byte offsets to the directory entries. | ||
The directory entries are always ordered by URL. Ordering is simply done by comparing the URL strings. | The directory entries are always ordered by "full" URL (<code><namespace><path></code>). Ordering is simply done by comparing the URL strings. | ||
Since directory entries have variable sizes this is needed for random access. | Since directory entries have variable sizes this is needed for random access. | ||
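As an illustration of the random access described above, here is a minimal Python sketch that reads the URL pointer list from an in-memory copy of the file. The function name <code>read_url_pointers</code> and the parameters <code>url_ptr_pos</code> and <code>entry_count</code> are hypothetical, and little-endian byte order is an assumption that this section does not state.

```python
import struct


def read_url_pointers(data: bytes, url_ptr_pos: int, entry_count: int) -> list[int]:
    """Return the 8-byte directory-entry offsets stored at url_ptr_pos.

    Assumption: integers are little-endian ("<Q" = unsigned 64-bit).
    """
    return list(struct.unpack_from(f"<{entry_count}Q", data, url_ptr_pos))


if __name__ == "__main__":
    # Four bytes of padding followed by three made-up offsets.
    sample = b"\x00" * 4 + struct.pack("<3Q", 100, 250, 1000)
    print(read_url_pointers(sample, 4, 3))
```

Because every pointer has a fixed width of 8 bytes, the n-th directory entry can be located without scanning the variable-size entries before it.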
Line 94: | Line 92: | ||
== Title Pointer List (titlePtrPos) == | == Title Pointer List (titlePtrPos) == | ||
The title pointer list is a list of entry indices ordered by title. The title pointer list actually points to entries in the URL pointer list. | The title pointer list is a list of entry indices ordered by title (<code><namespace><title></code>). The title pointer list actually points to entries in the URL pointer list. | ||
Note that the title pointers are only 4 bytes. They are not offsets in the file but entry numbers. | Note that the title pointers are only 4 bytes. They are not offsets in the file but entry numbers. | ||
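The two-step lookup (title pointer, then URL pointer, then directory entry offset) could be sketched as follows. The helper name and its parameters are hypothetical, and little-endian byte order is assumed rather than taken from this section.

```python
import struct


def title_index_to_entry_offset(data: bytes, title_ptr_pos: int,
                                url_ptr_pos: int, title_index: int) -> int:
    """Resolve the title_index-th title pointer to a directory-entry offset.

    Assumption: integers are little-endian.
    """
    # Title pointers are 4-byte entry numbers, not file offsets.
    (entry_number,) = struct.unpack_from("<I", data, title_ptr_pos + 4 * title_index)
    # The entry number indexes the URL pointer list, whose items are 8-byte offsets.
    (entry_offset,) = struct.unpack_from("<Q", data, url_ptr_pos + 8 * entry_number)
    return entry_offset


if __name__ == "__main__":
    url_pointers = struct.pack("<3Q", 100, 250, 1000)   # placed at offset 0
    title_pointers = struct.pack("<3I", 2, 0, 1)        # placed at offset 24
    sample = url_pointers + title_pointers
    print(title_index_to_entry_offset(sample, 24, 0, 0))
```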
Line 190: | Line 188: | ||
The first byte of the cluster identifies some information about the cluster. | The first byte of the cluster identifies some information about the cluster. | ||
The first fourth low bits identifies if the cluster | The four low bits identify the cluster compression type: | ||
* | * No compression is indicated by a value of 1 | ||
* Compressed clusters are indicated by a value of 4 ([[LZMA2 compression]] (or more precisely XZ, since there is an XZ header)) | * Compressed clusters are indicated by a value of 4 ([[LZMA2 compression]] (or more precisely XZ, since there is an XZ header)) or 5 (Zstandard compression). | ||
* There have been other compression algorithms used before | * There have been other compression algorithms used before which have been removed: 2 for zlib and 3 for bzip2. | ||
The | * 0 is an obsolete code for no compression (inherited from Zeno) | ||
The fifth bit identifies whether the cluster is extended: | |||
* By default (5th bit == 0) the cluster is not extended. This means that the offsets are stored as 4-byte integers, so the content stored in the cluster cannot exceed 4 GB. | * By default (5th bit == 0) the cluster is not extended. This means that the offsets are stored as 4-byte integers, so the content stored in the cluster cannot exceed 4 GB. | ||
* If the cluster is extended (5th bit == 1), the offsets are stored as 8-byte integers, so the content stored in the cluster can exceed 4 GB. | * If the cluster is extended (5th bit == 1), the offsets are stored as 8-byte integers, so the content stored in the cluster can exceed 4 GB. | ||
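The bit layout of the cluster information byte described above can be decoded as in the following minimal sketch. The names <code>parse_cluster_info</code> and <code>COMPRESSION_NAMES</code> are hypothetical; the numeric codes come from the list above.

```python
# Compression codes stored in the four low bits of the first cluster byte.
COMPRESSION_NAMES = {
    0: "none (obsolete code)",
    1: "none",
    2: "zlib (removed)",
    3: "bzip2 (removed)",
    4: "lzma2/xz",
    5: "zstd",
}


def parse_cluster_info(info_byte: int) -> tuple[str, int]:
    """Return (compression name, offset size in bytes) for a cluster info byte."""
    compression = info_byte & 0x0F        # four low bits: compression type
    extended = bool(info_byte & 0x10)     # fifth bit: extended cluster flag
    name = COMPRESSION_NAMES.get(compression, f"unknown ({compression})")
    return name, 8 if extended else 4


if __name__ == "__main__":
    print(parse_cluster_info(0x01))
    print(parse_cluster_info(0x15))
```

Masking first and testing the flag separately keeps the two fields independent, so an extended cluster reports the same compression name as its non-extended counterpart.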
Line 206: | Line 206: | ||
! Field Name !! Type !!Offset!!Length!! Description | ! Field Name !! Type !!Offset!!Length!! Description | ||
|- | |- | ||
| cluster information || integer || 0 || 1 || Four low bits: | | cluster information || integer || 0 || 1 || Four low bits: 1: no compression, 4: LZMA2 compressed, 5: zstd compressed | ||
Fifth bit: 0: normal (OFFSET_SIZE=4) 1: extended (OFFSET_SIZE=8) | Fifth bit: 0: normal (OFFSET_SIZE=4) 1: extended (OFFSET_SIZE=8) | ||
|- | |- |