(Huge rewrite of the whole page) |
(mwoffliner docker images moved from dockerhub to github container registry (try 2)) |
||
(6 intermediate revisions by 2 users not shown) | |||
Line 13: | Line 13: | ||
If your content matches openZIM [[Content team#Publishing|publishing policies]], you may ask the Kiwix team to create a ZIM file for you. | If your content matches openZIM [[Content team#Publishing|publishing policies]], you may ask the Kiwix team to create a ZIM file for you. | ||
This | The main limitation is that you have no control over when the ZIM will be available. | ||
Kiwix does its best to create ZIMs in a timely manner, but since this is a free service, resources are limited. | |||
It is also possible to pay Kiwix to create ZIMs, in which case the service will of course be much quicker and more responsive. | |||
To request such a ZIM, simply follow the process described in the [https://github.com/openzim/zim-requests/ zim-requests GitHub repository]. | To request such a ZIM, simply follow the process described in the [https://github.com/openzim/zim-requests/ zim-requests GitHub repository]. | ||
=== | === YouZimit === | ||
[https:// | [https://zimit.kiwix.org zimit.kiwix.org] is a website where you can ask an automated system to create a ZIM of any website. | ||
Once the ZIM is produced, a download link will be provided to your email address. | Once the ZIM is produced, a download link will be provided to your email address. | ||
Line 35: | Line 39: | ||
If your use case matches all these limitations, it is clearly the quickest solution to get a ZIM (even if the processing capabilities are limited and your job might end up in a waiting queue for a few hours). | If your use case matches all these limitations, it is clearly the quickest solution to get a ZIM (even if the processing capabilities are limited and your job might end up in a waiting queue for a few hours). | ||
It should be noted for now advanced use of [https:// | Note that, for now, advanced use of [https://zimit.kiwix.org zimit.kiwix.org] requires some technical skills and expert knowledge to configure the advanced options. This process should be enhanced in 2024 to provide more explanations and guide the user through the configuration process. | ||
== Ops style == | == Ops style == | ||
Line 52: | Line 56: | ||
MWoffliner is a tool which allows you to "dump" a Wikimedia project (Wikipedia, Wiktionary, ...) to local storage. It should also work for any [https://mediawiki.org MediaWiki] instance. It goes through all articles (or a selection if specified) of the project and writes HTML/pictures to your local filesystem as plain HTML/JS/CSS/... files or in a ZIM file. | MWoffliner is a tool which allows you to "dump" a Wikimedia project (Wikipedia, Wiktionary, ...) to local storage. It should also work for any [https://mediawiki.org MediaWiki] instance. It goes through all articles (or a selection if specified) of the project and writes HTML/pictures to your local filesystem as plain HTML/JS/CSS/... files or in a ZIM file. | ||
It is distributed via [https://www.npmjs.com/package/mwoffliner npm] and [https:// | It is distributed via [https://www.npmjs.com/package/mwoffliner npm] and [https://github.com/openzim/mwoffliner/pkgs/container/mwoffliner Docker]. | ||
If you are a developer, you can download it directly from its [https://github.com/openzim/mwoffliner git repository]. | If you are a developer, you can download it directly from its [https://github.com/openzim/mwoffliner git repository]. | ||
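As a rough illustration of running MWoffliner from its Docker image, the sketch below only assembles a command line; it does not run anything. The image name <code>ghcr.io/openzim/mwoffliner</code> is inferred from the container registry link above, and the flags <code>--mwUrl</code>, <code>--adminEmail</code> and <code>--outputDirectory</code> are assumed from the MWoffliner README; check <code>mwoffliner --help</code> for the options your version supports. The helper name <code>mwoffliner_docker_command</code> and the example URL, email and paths are hypothetical.

```python
import shlex

# Assumption: image location inferred from the GitHub container registry link.
GHCR_IMAGE = "ghcr.io/openzim/mwoffliner"


def mwoffliner_docker_command(mw_url, admin_email, host_out_dir, container_out_dir="/out"):
    """Build the argv list for one MWoffliner run inside Docker (hypothetical helper)."""
    return [
        "docker", "run", "--rm",
        # Mount a host directory so the output survives the container.
        "-v", f"{host_out_dir}:{container_out_dir}",
        GHCR_IMAGE,
        "mwoffliner",
        # Assumed flags: wiki base URL, contact email, output location.
        f"--mwUrl={mw_url}",
        f"--adminEmail={admin_email}",
        f"--outputDirectory={container_out_dir}",
    ]


if __name__ == "__main__":
    cmd = mwoffliner_docker_command(
        "https://en.wikipedia.org/", "admin@example.org", "/tmp/zim-out"
    )
    # Print a shell-quoted command that can be reviewed before running it.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting issues if it is later passed to <code>subprocess.run</code>.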
Line 64: | Line 68: | ||
== Devs style == | == Devs style == | ||
If you have development skills, you can create your own tool to create a ZIM from your content. | If you have development skills, you can create your own tool to create a ZIM from your content. Such a tool is called a scraper, even though most of them do not "scrape" a website but use specific techniques such as APIs or exported databases. | ||
The libzim library (openZIM implementation of the ZIM specification, to read and write ZIM files, written in C++) has bindings available for many programming languages: Python, Node.js, Java. | The libzim library (openZIM implementation of the ZIM specification, to read and write ZIM files, written in C++) has bindings available for many programming languages: Python, Node.js, Java. | ||
Since most openZIM scrapers are written in Python, there is even a python-scraperlib library providing higher-level functions to simplify common scraper tasks. | Since most openZIM scrapers are written in Python, there is even a python-scraperlib library providing higher-level functions to simplify common scraper tasks. There is also a dedicated [[How-to create a Python scraper]] page. | ||
== Older tools == | == Older tools == |