The '''Content team''' gathers people in charge of providing books in the ZIM format ("books" being understood here as web content stored as single web archives).
== Purpose ==
Provide web-based educational content to people without internet access, and make the experience as seamless as possible. Access and discovery must be user-friendly and market-ready, and the content must be up to date and as portable as technically possible.
== Goals ==
* Book curation must remain focused on educational material, broadly construed;
* Books should have proper visual formatting;
* Books should be up to date, like custom apps;
* The Kiwix Library should allow easy and friendly discovery of content.
== Responsibilities ==
* Content Requests
** Collaborate with requesters to qualify requests properly, and keep them informed
** Ensure we are allowed and able to fulfill requests
** Initiate new recipes and manage the first publication of any new book
** Collaborate with the scraper dev team if necessary
** Keep the tickets up to date
* Scraping
** Ensure Zimfarm works properly and contribute to its improvement with the dev team
** Analyze failures and unexpected behaviors
** Ensure recipes run properly, fix configuration when necessary, and contribute to scraper improvements with the dev team
** Ensure workers are online and properly configured
** Ensure the scrape lifecycle is correct (reasonable pipeline size, running scrapes progressing appropriately, not too many failures)
* Library management
** Ensure ZIM filenames and locations (paths) are correct
** Ensure ZIM metadata is correct
** Ensure ZIMs are recent and kept up to date (as far as possible)
** Ensure the library is coherent and user-friendly
== Policies ==
=== Publishing ===
* Content has to be legal in Switzerland
* Content should not promote [https://en.wikipedia.org/wiki/Fringe_theory fringe theories]
* Content should preferably be [https://en.wikipedia.org/wiki/Free_content free content]
* If not free, content should be:
** Open content, OR
** Educational content, OR
** Covered by an authorization of reproduction
* Any content we publish should:
** have (almost) no user-visible errors
** have correct metadata
** be easily discoverable in the public library
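The publishing rules above combine hard requirements with a fallback for non-free content. The following is a minimal sketch of one possible reading of that logic. The `Content` record and all of its field names are hypothetical, invented here for illustration, and are not part of any Kiwix or Zimfarm tool. Treating the fringe-theory rule as a hard requirement is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class Content:
    """Hypothetical description of a candidate source (illustrative fields only)."""
    legal_in_switzerland: bool
    promotes_fringe_theory: bool
    is_free_content: bool
    is_open_content: bool = False
    is_educational: bool = False
    has_reproduction_authorization: bool = False


def may_publish(content: Content) -> bool:
    """Return True if the content passes the publishing policy as read here."""
    # Hard requirements (assumed): legal in Switzerland, no fringe-theory promotion.
    if not content.legal_in_switzerland or content.promotes_fringe_theory:
        return False
    # Free content is preferred and needs no further justification.
    if content.is_free_content:
        return True
    # Non-free content needs at least one of the three fallbacks.
    return (
        content.is_open_content
        or content.is_educational
        or content.has_reproduction_authorization
    )
```

For example, `may_publish(Content(True, False, False, is_educational=True))` evaluates to `True`, while the same record without any fallback flag evaluates to `False`.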
=== Content Requests ===
* Allow everybody to request new content, or changes to or deletion of existing content
* Track the lifecycle of our content portfolio in full transparency
* New content should be assessed and vetted against the publishing policy (see above)
* Content requests should be closed:
** when fully implemented (user visible)
** when refused or impossible to implement
* ZIM metadata should be provided for new content
* Start scraping only once all prerequisites are satisfied
=== Scraping ===
* Scraping leadership means the initiative should come from the content team
* The first analysis of an error should be done by the content team
* If an error in the scraper is suspected:
** The issue should be reported in the corresponding scraper code repository
** Analysis of the scraper problem does not in any way supersede the content request
* ZIM quality should be vetted against the publishing policy
* Any recipe should first run successfully in dev before being put into production
* Hardware resources should be used sparingly
=== Library Management ===
=== Custom Apps ===
== Processes ==
=== Content Requests ===
=== Scraping ===
=== Library Management ===
=== Custom Apps ===
== Workflows ==
=== To create a new recipe for YouTube files ===
'''It is recommended to clone an existing YouTube recipe.'''
* Name the recipe according to the [https://github.com/openzim/overview/wiki/Naming-Convention naming conventions].
* In the Language field, choose the language of the website you are creating the recipe for.
* In the Category field, choose (other).
* In the Warehouse path field, always choose (/.hidden/.dev) first so that the resulting file can be tested. Once the file has been tested and everything is correct, update the recipe with the proper path (videos).
* Make sure the Status is set to Enabled.
* Set Periodicity to monthly or quarterly.
* In the Offliner field, choose Youtube.
* In the Platform field, choose Youtube.
* Leave the rest unchanged.
'''In the YouTube command flags:'''
* In Playlist mode: choose (Not Set) if the recipe is for a whole channel.
* If the recipe is for a playlist, choose (Set).
* In Type: choose (Channel) or (Playlist), depending on the file you need.
* In Youtube ID: type the ID of the channel or the playlist.
* For the API Key: there is a list of keys, mostly organized by channel or playlist size. Ask for the list and choose the appropriate key.
* In Zim Name: enter the recipe name, following the [https://github.com/openzim/overview/wiki/Naming-Convention naming conventions].
* In Title: type the name you want for the output file.
* In Description: type a short description of the ZIM file you need.
* Leave Optimisation Cache URL as it is (cloned from the old recipe).
* Leave the rest of the fields empty or as they are in the cloned recipe.
* Finally, click on (Update offliner details) at the bottom.
* Review all your entries once again, then go back to the top of the page and click on (Request).
* After about an hour, check whether the recipe failed or succeeded (or check the next day if the source website is large).
* If it succeeded, go to [https://dev.library.kiwix.org/ dev.library.kiwix.org] and check the file you created: check its size and that it works properly. If the file does not appear, wait a bit, as updates are made every 15 minutes.
* If the file looks good and complete, go back to your recipe and, in the Warehouse path field, change (/.hidden/.dev) to the proper category for your file's content (Wikipedia, Wikihow, etc.).
* Click on (Update offliner details), then click on (Request) again.
* Finally, check the file at https://library.kiwix.org/. If all is good, do not forget to go back to the initial ticket (most likely at zim-requests), post the link to the output file, and close the ticket.
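The workflow above is a two-phase process: the recipe is first requested with the dev warehouse path, then moved to its final path once the output has been checked. The following is a minimal sketch of that idea. The dictionary keys, the function names, and the example values are all hypothetical and only mirror the form fields listed above; they are not the Zimfarm API or its real configuration schema. The `/videos` path is assumed from the "(videos)" mention in the steps.

```python
DEV_PATH = "/.hidden/.dev"  # test location named in the workflow


def new_youtube_recipe(name, language, youtube_id, is_playlist, title, description):
    """Build a hypothetical settings dict mirroring the form fields (illustrative keys)."""
    return {
        "name": name,
        "language": language,
        "category": "other",
        "warehouse_path": DEV_PATH,  # a new recipe always starts in dev
        "status": "enabled",
        "periodicity": "monthly",  # "quarterly" is the other option mentioned
        "offliner": "youtube",
        "platform": "youtube",
        "flags": {
            "playlist_mode": is_playlist,
            "type": "playlist" if is_playlist else "channel",
            "id": youtube_id,
            "zim_name": name,
            "title": title,
            "description": description,
        },
    }


def promote(recipe, production_path):
    """Return a copy moved from the dev path to its final path.

    Refuses to act on a recipe that is not currently in dev, so the
    test phase cannot be skipped by accident.
    """
    if recipe["warehouse_path"] != DEV_PATH:
        raise ValueError("recipe is not in the dev path")
    return {**recipe, "warehouse_path": production_path}


# Usage with placeholder values (not a real channel or recipe name):
draft = new_youtube_recipe(
    "example_en_all", "en", "PLACEHOLDER_ID", False, "Example", "An example channel"
)
final = promote(draft, "/videos")  # only after the dev file has been checked
```

Keeping `promote` as a separate step that returns a copy makes the dev-first rule explicit: the draft stays unchanged, and the final path is only set after a deliberate second action.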
== Members ==
* [https://github.com/Popolechien Popolechien], manager in line
* [https://github.com/RavanJAltaie Ravan], content manager
* [https://github.com/benoit74 Benoit74], scrapers lead dev
== See also ==
* [[Content strategy]]