(Add see also link to Metadata) |
(Add links to Content team/ZIM Metadata Convention.) |
||
(5 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
<blockquote>This page was originally located at https://github.com/openzim/overview/wiki/ZIMs-Naming-Convention </blockquote>This page explains the naming convention used both for the ZIM `Name` metadata and the ZIM filename, for ZIMs published by openZIM. | <blockquote>This page was originally located at https://github.com/openzim/overview/wiki/ZIMs-Naming-Convention </blockquote>This page explains the naming convention used both for the ZIM `Name` metadata and the ZIM filename, for ZIMs published by openZIM. Other metadata conventions are documented in [[Content team/ZIM Metadata Convention]]. | ||
This is an openZIM convention, i.e. other publishers are free to follow the same convention or develop their own. | This is an openZIM convention, i.e. other publishers are free to follow the same convention or develop their own. | ||
Line 26: | Line 26: | ||
The <code>_</code> character is reserved as separator between the parts. | The <code>_</code> character is reserved as separator between the parts. | ||
The parts must only contain alphanums or <code>-</code> or <code>.</code> characters. | The parts must be all lowercase and only contain alphanums (<code>a-z</code>, no accented or special characters) or <code>-</code> or <code>.</code> characters (regex is <code>[a-z0-9\-\.]</code>). | ||
{| class="wikitable" | {| class="wikitable" | ||
|+Components of ZIM <code>Name</code> Metadata | |+Components of ZIM <code>Name</code> Metadata | ||
Line 40: | Line 39: | ||
|- | |- | ||
|<code>lang</code> | |<code>lang</code> | ||
|ISO-639 language code or <code>mul</code> <sup>2</sup> | |ISO-639 language code, a code based on it, or <code>mul</code> <sup>2</sup> | ||
|<code>en</code>, <code>fr</code>, <code>zh</code>, <code>mul</code> | |<code>en</code>, <code>fr</code>, <code>zh</code>, <code>nds-nl</code>, <code>mul</code> | ||
|- | |- | ||
|<code>selection</code> | |<code>selection</code> | ||
Line 49: | Line 48: | ||
* <sup>1</sup> By default, use the web domain name associated with the content (including for YouTube channels, ...). Project names are exceptions (basically valid only if we at least have a dedicated category for this project); use domain names if unsure or, better, ask on Slack. Should the domain name contain characters that are illegal under our convention, it will be encoded with Punycode (see e.g. https://www.punycoder.com/). | * <sup>1</sup> By default, use the web domain name associated with the content (including for YouTube channels, ...). Project names are exceptions (basically valid only if we at least have a dedicated category for this project); use domain names if unsure or, better, ask on Slack. Should the domain name contain characters that are illegal under our convention, it will be encoded with Punycode (see e.g. https://www.punycoder.com/). | ||
*2 Whenever possible, prefer to use the ISO-639-1 (2 chars) language code. When the ISO-639-1 code does not exist or is ambiguous (leading to conflict of ZIM Name between two different | *<sup>2</sup> Whenever possible, prefer to use the ISO-639-1 (2 chars) language code. When the ISO-639-1 code does not exist or is ambiguous (leading to a conflict of ZIM Name between two different contents), using the ISO-639-3 code is recommended. When the ISO-639-3 code is missing or still ambiguous (leading to a non-unique ZIM Name for two distinct contents), we can add more precision with a dash-variant (<code>-{variant}</code>) after the ISO code, like <code>nds-nl</code> or <code>en-simple</code>. When multiple languages are present inside the ZIM, <code>mul</code> is to be used whenever possible. Again, we could use variants of it with a dash to ensure uniqueness of ZIM Name among contents when <code>mul</code> is ambiguous. Note that the ZIM <code>Language</code> metadata lists all the languages (ISO-639-3) instead of using <code>mul</code>. See discussion at https://github.com/openzim/overview/issues/51#issuecomment-2904587084 for details. | ||
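The character rule for the parts, as stated in the newer revision above, can be checked mechanically. The following is a minimal sketch, not an official openZIM tool: the function name <code>invalid_parts</code> is hypothetical, the <code>+</code> quantifier and full-match anchoring are assumptions added so that empty parts are rejected, and the sample names in the usage lines are illustrative only. It does not check the number or meaning of the parts.

```python
import re

# Character class quoted from the convention. The "+" quantifier and the
# use of fullmatch() are assumptions of this sketch, not part of the text.
PART_RE = re.compile(r"[a-z0-9\-\.]+")


def invalid_parts(name: str) -> list[str]:
    """Return the `_`-separated parts of `name` that break the character rule.

    An empty list means every part is lowercase and uses only a-z, 0-9,
    "-" or ".". Empty parts (e.g. from a doubled "_") are reported too.
    """
    return [part for part in name.split("_") if not PART_RE.fullmatch(part)]


if __name__ == "__main__":
    # Illustrative names only; they are not taken from the convention page.
    for candidate in ("example.org_nds-nl_all", "Example.org_en_all"):
        bad = invalid_parts(candidate)
        print(candidate, "ok" if not bad else f"invalid parts: {bad}")
```

Splitting on <code>_</code> first keeps the separator rule and the per-part character rule independent, so a reported part points directly at the offending component.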
=== ZIM filename === | === ZIM filename === | ||
Line 56: | Line 55: | ||
The <code>_</code> character is reserved as separator between the parts. | The <code>_</code> character is reserved as separator between the parts. | ||
The parts must only contain alphanums or <code>-</code> or <code>.</code> characters. | The parts must be all lowercase and only contain alphanums (<code>a-z</code>, no accented or special characters) or <code>-</code> or <code>.</code> characters (regex is <code>[a-z0-9\-\.]</code>). | ||
{| class="wikitable" | {| class="wikitable" | ||
|+Components of ZIM filename | |+Components of ZIM filename | ||
Line 88: | Line 86: | ||
=== See also === | === See also === | ||
[[Content team/ZIM Metadata Convention]] | |||
[[Metadata]] | [[Metadata]] |