Difference between revisions of "Content team/ZIM Naming Convention"

better explain the lang + use domain instead of project in ZIM Name format
(Copy page from Github wiki to openZIM wiki)
 
(better explain the lang + use domain instead of project in ZIM Name format)
Line 1: Line 1:
<blockquote>This page was originally located at https://github.com/openzim/overview/wiki/ZIMs-Naming-Convention </blockquote>
<blockquote>This page was originally located at https://github.com/openzim/overview/wiki/ZIMs-Naming-Convention </blockquote>This page explains the naming convention used for both the ZIM `Name` metadata and the ZIM filename, for ZIMs published by openZIM.


This is an openZIM convention, i.e. other publishers are free to follow the same convention or develop their own.
=== Context ===
=== Context ===


Line 21: Line 22:


=== ZIM <code>Name</code> Metadata ===
=== ZIM <code>Name</code> Metadata ===
Format: '''<code>{project}_{lang}_{selection}</code>'''
Format: '''<code>{domain}_{lang}_{selection}</code>'''


The <code>_</code> character is reserved as the separator between the parts.
The <code>_</code> character is reserved as the separator between the parts.
Line 34: Line 35:
!Example
!Example
|-
|-
|<code>project</code>
|<code>domain</code>
|Domain name (or project) <sup>1</sup>
|Domain name (or project) <sup>1</sup>
|<code>android.stackexchange.com</code>, <code>wikipedia</code>
|<code>android.stackexchange.com</code>, <code>wikipedia</code>
|-
|-
|<code>lang</code>
|<code>lang</code>
|ISO-639-1 (2 chars) language code
|ISO-639 language code or <code>mul</code> <sup>2</sup>
|<code>en</code>, <code>fr</code>, <code>zh</code>, <code>mul</code><sup>2</sup>
|<code>en</code>, <code>fr</code>, <code>zh</code>, <code>mul</code>
|-
|-
|<code>selection</code>
|<code>selection</code>
Line 47: Line 48:
|}
|}


* <sup>1</sup> Domain name by default; project names are exceptions (basically valid only if we at least have a dedicated category for this project). Use domain names if unsure or, better, ask on Slack. Should the domain name contain characters that are illegal under our convention, it will be encoded with Punycode (see e.g. https://www.punycoder.com/).
* <sup>1</sup> By default, use the web domain name associated with the content (including for YouTube channels, ...). Project names are exceptions (basically valid only if we at least have a dedicated category for this project). Use domain names if unsure or, better, ask on Slack. Should the domain name contain characters that are illegal under our convention, it will be encoded with Punycode (see e.g. https://www.punycoder.com/).
* <sup>2</sup> <code>mul</code> is to be used for multiple-language ZIMs. Note that the ZIM <code>Language</code> metadata lists the languages (ISO-639-3) instead of using <code>mul</code>
* <sup>2</sup> Whenever possible, prefer the ISO-639-1 (2 chars) language code. When the ISO-639-1 code does not exist or is ambiguous (leading to a ZIM Name conflict between two different ZIMs), using the ISO-639-3 code is recommended. When multiple languages are present inside the ZIM, <code>mul</code> is to be used. Note that the ZIM <code>Language</code> metadata lists all the languages (ISO-639-3) instead of using <code>mul</code>.
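
The <code>{domain}_{lang}_{selection}</code> format and the two footnotes above can be illustrated with a minimal Python sketch. It is not part of any openZIM tool: the function names <code>encode_domain</code> and <code>build_zim_name</code> are hypothetical, the selection value <code>all</code> is only an illustrative placeholder, and the language check merely assumes a 2- or 3-letter lowercase code (or <code>mul</code>) without verifying it against the ISO-639 registry. Punycode encoding relies on Python's built-in <code>idna</code> codec.

```python
import re

# Assumption: a lang part is a 2-letter (ISO-639-1) or 3-letter (ISO-639-3)
# lowercase code; "mul" also matches this pattern. The code is not checked
# against the actual ISO-639 registry.
LANG_RE = re.compile(r"^[a-z]{2,3}$")


def encode_domain(domain: str) -> str:
    """Return an ASCII form of the domain, Punycode-encoding non-ASCII labels.

    Uses Python's built-in "idna" codec; plain ASCII domains and project
    names such as "wikipedia" pass through unchanged.
    """
    return domain.lower().encode("idna").decode("ascii")


def build_zim_name(domain: str, lang: str, selection: str) -> str:
    """Hypothetical helper assembling a Name as {domain}_{lang}_{selection}."""
    domain = encode_domain(domain)
    for label, part in (("domain", domain), ("lang", lang), ("selection", selection)):
        if not part:
            raise ValueError(f"{label} must not be empty")
        # "_" is reserved as the separator between the parts.
        if "_" in part:
            raise ValueError(f"{label} must not contain '_': {part!r}")
    if not LANG_RE.match(lang):
        raise ValueError(f"lang must be a 2- or 3-letter code or 'mul': {lang!r}")
    return "_".join((domain, lang, selection))


if __name__ == "__main__":
    print(build_zim_name("wikipedia", "en", "all"))
    print(build_zim_name("android.stackexchange.com", "en", "all"))
    # A non-ASCII domain is converted to its Punycode (xn--...) form first.
    print(build_zim_name("bücher.example", "mul", "all"))
```

Rejecting <code>_</code> inside each part keeps the resulting name unambiguous to split back into its three components.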


=== ZIM filename ===
=== ZIM filename ===
Line 79: Line 80:
* <sup>1</sup> It doesn't need to be equal to the `Name` metadata, but the requirements are identical.
* <sup>1</sup> It doesn't need to be equal to the `Name` metadata, but the requirements are identical.


=== Zimfarm ===
=== Implementation on the Zimfarm ===
Depending on the scraper, setting the <code>Name</code> metadata in the Zimfarm can be mandatory (follow the instructions above) or optional. When optional, the scraper usually sets it correctly according to the convention. If it does not, open a ticket on the scraper repo and set it manually in the recipe until it is fixed.
Depending on the scraper, setting the <code>Name</code> metadata in the Zimfarm can be mandatory (follow the instructions above) or optional. When optional, the scraper usually sets it correctly according to the convention. If it does not, open a ticket on the scraper repo and set it manually in the recipe until it is fixed.

