Difference between revisions of "Content team"

1 byte added ,  14:47, 2 August 2024
no edit summary
Line 94: Line 94:


=== Scraping ===
==== Create a Youtube recipe ====
To create a new recipe that scrapes videos from a YouTube channel/username or from one or more playlists, it is recommended to clone an existing YouTube recipe.
* In "Content settings":
# Create the recipe name as per [https://github.com/openzim/overview/wiki/Naming-Convention the naming conventions].
# In the Language field, choose the language(s) of the YouTube page you are creating the recipe for.
# In the Category field, choose (other).
# In the Warehouse path field, always choose "/.hidden/.dev" for the first run, so that the resulting ZIM file can be tested.
# Once the file has been tested and everything is correct, update the recipe with the proper path, "videos". Otherwise, tune the recipe and relaunch a task.
# Make sure the Status is set to Enabled.
# Set Periodicity to monthly or quarterly. Use monthly by default.
* In "Task settings":
# In the Offliner field, choose Youtube.
# In the Platform field, choose Youtube.
# Leave the remaining fields unchanged.
* In "Scraper settings: youtube command flags":
# In Playlist mode: choose (Not Set) if the recipe is for a whole channel, or (Set) if it is for a playlist.
# In Type: choose (Channel) or (Playlist), depending on the file you need.
# In Youtube ID: type the ID of the channel or the playlist.
# In API Key: several keys are available, allocated mostly according to channel or playlist size. Ask for the list and choose the appropriate key.
# In ZIM Name: enter the recipe name as per [https://github.com/openzim/overview/wiki/Naming-Convention the naming conventions].
# In Title: type the name you want for the output file.
# In Description: type a short description of the ZIM file.
# Leave Optimisation Cache URL as it is (cloned from the old recipe).
# Leave the rest of the fields empty or as they are in the cloned recipe.
# At the bottom of the page, click on (Update offliner details).
# Review all your entries once again, then go back to the top of the page and click on (Request).
# After about an hour (or the next day if the source is large), check whether the task succeeded or failed.
# If it succeeded, go to [https://dev.library.kiwix.org/ dev.library.kiwix.org] and check the created file: check its size and check that it works properly. If the file does not appear, wait a bit, as updates are made every 15 minutes.
# If the file looks good and complete, go back to your recipe and, in the Warehouse path field, change "/.hidden/.dev" to the proper path for your content ("videos" for a YouTube recipe).
# Click on (Update offliner details) and then click on (Request) again.
# Finally, check the file in the [https://library.kiwix.org/ Kiwix Content Library]. If all is good, do not forget to go back to [https://github.com/openzim/zim-requests/issues the initial ticket], post the link to the output file, and close the ticket.
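
The checklist above can be summarised in a short Python sketch. It does not talk to Zimfarm at all: the class, its field names, and the placeholder values are hypothetical and only mirror the form fields described in the steps. It is meant as a way to review a recipe on paper before clicking (Request), and to show the two-stage warehouse path (test in "/.hidden/.dev" first, then switch to "videos").

```python
from dataclasses import dataclass, replace

DEV_PATH = "/.hidden/.dev"   # staging path used for the first run (from the steps above)
FINAL_PATH = "videos"        # final path for YouTube recipes (from the steps above)


@dataclass(frozen=True)
class YoutubeRecipe:
    """Hypothetical mirror of the recipe form; not a Zimfarm data structure."""
    name: str                      # recipe name / ZIM Name, per the naming conventions
    languages: tuple               # language(s) of the YouTube page
    youtube_id: str                # channel or playlist ID
    id_type: str                   # "channel" or "playlist"
    category: str = "other"
    warehouse_path: str = DEV_PATH
    periodicity: str = "monthly"
    enabled: bool = True

    @property
    def playlist_mode(self) -> bool:
        # (Set) for a playlist, (Not Set) for a whole channel
        return self.id_type == "playlist"


def checklist_problems(recipe: YoutubeRecipe) -> list:
    """Return readable problems; an empty list means the checklist passes."""
    problems = []
    if not recipe.name:
        problems.append("recipe name is empty")
    if not recipe.languages:
        problems.append("no language selected")
    if recipe.id_type not in ("channel", "playlist"):
        problems.append("Type must be channel or playlist")
    if not recipe.youtube_id.strip():
        problems.append("Youtube ID is empty")
    if recipe.periodicity not in ("monthly", "quarterly"):
        problems.append("Periodicity should be monthly or quarterly")
    if not recipe.enabled:
        problems.append("Status is not Enabled")
    if recipe.warehouse_path not in (DEV_PATH, FINAL_PATH):
        problems.append("unexpected warehouse path")
    return problems


def promote(recipe: YoutubeRecipe, dev_file_checked: bool) -> YoutubeRecipe:
    """Move from the staging path to the final path, only after the dev file was reviewed."""
    if recipe.warehouse_path != DEV_PATH:
        raise ValueError("recipe is not on the staging path")
    if not dev_file_checked:
        raise ValueError("check the file on the dev library first")
    return replace(recipe, warehouse_path=FINAL_PATH)


# Placeholder values for illustration only.
draft = YoutubeRecipe(
    name="example_recipe",
    languages=("eng",),
    youtube_id="EXAMPLE_ID",
    id_type="playlist",
)
print(checklist_problems(draft))
print(promote(draft, dev_file_checked=True).warehouse_path)
```

Keeping the promotion step as a separate function reflects the order of the procedure: the path only changes once the file on the dev library has been checked.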


==== Change a recipe/ZIM warehouse path ====
Line 165: Line 205:


''Note'': Moving a file to the archive must be treated as a file deletion.


== Members ==
