(Relax constraints on lang in ZIM name (and filename)) |
|||
Line 39: | Line 39: | ||
|- | |- | ||
|<code>lang</code> | |<code>lang</code> | ||
|ISO-639 language code or <code>mul</code> <sup>2</sup> | |ISO-639 language code, a code based on one, or <code>mul</code> <sup>2</sup> | ||
|<code>en</code>, <code>fr</code>, <code>zh</code>, <code>mul</code> | |<code>en</code>, <code>fr</code>, <code>zh</code>, <code>nds-nl</code>, <code>mul</code> | ||
|- | |- | ||
|<code>selection</code> | |<code>selection</code> | ||
Line 48: | Line 48: | ||
* <sup>1</sup> By default, use the web domain name associated with the content (including for YouTube channels, ...). Project names are exceptions (basically valid only if we at least have a dedicated category for this project); use domain names if unsure, or best, ask on Slack. Should the domain name contain characters that are illegal under our convention, it will be encoded with Punycode (see e.g. https://www.punycoder.com/). | * <sup>1</sup> By default, use the web domain name associated with the content (including for YouTube channels, ...). Project names are exceptions (basically valid only if we at least have a dedicated category for this project); use domain names if unsure, or best, ask on Slack. Should the domain name contain characters that are illegal under our convention, it will be encoded with Punycode (see e.g. https://www.punycoder.com/). | ||
*2 Whenever possible, prefer to use the ISO-639-1 (2 chars) language code. When the ISO-639-1 code does not exist or is ambiguous (leading to conflict of ZIM Name between two different | *<sup>2</sup> Whenever possible, prefer to use the ISO-639-1 (2 chars) language code. When the ISO-639-1 code does not exist or is ambiguous (leading to a conflict of ZIM Name between two different contents), using the ISO-639-3 code is recommended. When the ISO-639-3 code is missing or still ambiguous (leading to a non-unique ZIM Name for two distinct contents), we can add more precision with a dash-variant (<code>-{variant}</code>) after the ISO code, like <code>nds-nl</code> or <code>en-simple</code>. When multiple languages are present inside the ZIM, <code>mul</code> is to be used whenever possible. Again, we can use variants of it with a dash to ensure uniqueness of the ZIM Name among contents when <code>mul</code> is ambiguous. Note that the ZIM <code>Language</code> metadata lists all the languages (ISO-639-3) instead of using <code>mul</code>. See discussion at https://github.com/openzim/overview/issues/51#issuecomment-2904587084 for details. | ||
=== ZIM filename === | === ZIM filename === |