(Update ZIM deletion procedure) |
(Fix formatting of recipe periodicity) |
||
(5 intermediate revisions by 3 users not shown) | |||
Line 68: | Line 68: | ||
* Any recipe should first run successfully in dev before being put in production | * Any recipe should first run successfully in dev before being put in production | ||
* Hardware resources should be saved | * Hardware resources should be saved | ||
** Handling of server-side errors | |||
*** Requests for HTML content that return HTTP 4xx or HTTP 5xx should either lead to a scraper error (exit), or the content could be replaced by a placeholder: | |||
**** Explaining that this is a server-side error and not a scraper error | |||
**** Sharing a few details about the nature of the error | |||
**** Explaining that this might be temporary | |||
**** Ideally linking to our ticketing system for further information. This implies that the list of tolerated errors is clearly documented in the code. | |||
*** A low tolerance, expressed both as a percentage of the total number of pages AND as a fixed value, should be hardcoded in the scraper | |||
*** The list of errors should be shared at the end of the scraping process | |||
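The tolerance rule above can be sketched as follows. This is only an illustration, not code from any existing scraper: the class name <code>ErrorBudget</code>, the two limit values, and the reading that exceeding ''either'' limit fails the run are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorBudget:
    """Tracks server-side errors against two hardcoded limits (hypothetical values)."""

    max_ratio: float = 0.01  # assumption: at most 1% of all pages may fail
    max_count: int = 50      # assumption: at most 50 pages may fail in total
    total_pages: int = 0
    errors: list = field(default_factory=list)

    def record(self, url: str, status: int) -> None:
        """Count one fetched page and remember it if the server answered 4xx/5xx."""
        self.total_pages += 1
        if 400 <= status < 600:
            self.errors.append((url, status))

    def exceeded(self) -> bool:
        """True once either the percentage limit or the fixed limit is exceeded."""
        if self.total_pages == 0:
            return False
        ratio = len(self.errors) / self.total_pages
        return ratio > self.max_ratio or len(self.errors) > self.max_count

    def report(self) -> str:
        """Summary of all tolerated errors, to be shared at the end of the run."""
        lines = [
            f"{len(self.errors)} server-side error(s) out of {self.total_pages} page(s)"
        ]
        lines += [f"  HTTP {status}: {url}" for url, status in self.errors]
        return "\n".join(lines)
```

A scraper would call <code>record()</code> for each page, exit with an error as soon as <code>exceeded()</code> returns true, and print <code>report()</code> when it finishes.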
=== Library Management === | === Library Management === | ||
Line 133: | Line 141: | ||
# Click on Update offliner details and then click on Request again. | # Click on Update offliner details and then click on Request again. | ||
# Finally, check the file in [https://library.kiwix.org/ Kiwix Content Library]. If all is good, do not forget to go back to [https://github.com/openzim/zim-requests/issues the initial ticket] and put the link of the output file and close the ticket. | # Finally, check the file in [https://library.kiwix.org/ Kiwix Content Library]. If all is good, do not forget to go back to [https://github.com/openzim/zim-requests/issues the initial ticket] and put the link of the output file and close the ticket. | ||
==== Choose proper recipe periodicity ==== | |||
'''''This is a draft proposal''''' | |||
When we configure a recipe on the Zimfarm, we have to decide on the periodicity at which the recipe will be run. | |||
The following rules apply, unless an exception is justified: | |||
* by default, the periodicity is quarterly | |||
* recipes linked to content which is very regularly updated might switch to monthly updates; this is typically the case for all recipes linked to Wikimedia wikis | |||
* recipes known to take a lot of time to complete, to consume a lot of resources, or to be linked to content that is not regularly updated should be switched to a bi-annual or annual periodicity (at the discretion of the recipe maintainer) | |||
* recipes in DEV (pushing to /.hidden/dev) have a manual periodicity: | |||
** the person setting up the recipe will take care of updating the ZIM when needed; a manual process helps to avoid side effects during testing, such as testers not all testing the same ZIM | |||
** we aim to keep the time during which a recipe is in DEV to a minimum | |||
** we have too many recipes in DEV which are failing and not yet disabled; if the update is automated, they will continuously waste resources | |||
* recipes building ZIMs for a specific customer have a manual periodicity by default, unless we have a clear maintenance contract paying us to update ZIMs at a given interval, or unless the ZIM in question is of general interest (but then we usually do not consider this ZIM to be linked to a specific customer) | |||
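As a rough sketch, the rules above can be read as a small decision function. The function name, the flag names and the returned labels are hypothetical and are not Zimfarm API values; the order in which the rules are checked is also an assumption, since the text does not say which rule wins when several apply.

```python
def choose_periodicity(
    *,
    is_dev: bool = False,
    customer_specific: bool = False,
    has_maintenance_contract: bool = False,
    general_interest: bool = False,
    frequently_updated: bool = False,
    heavy_or_rarely_updated: bool = False,
) -> str:
    """Return a periodicity label for a recipe (hypothetical labels)."""
    # Recipes in DEV are always updated by hand.
    if is_dev:
        return "manually"
    # Customer recipes stay manual unless a contract or general interest applies.
    if customer_specific and not (has_maintenance_contract or general_interest):
        return "manually"
    # Long-running, resource-hungry or rarely changing content: bi-annual here,
    # annual being equally acceptable at the maintainer's discretion.
    if heavy_or_rarely_updated:
        return "bi-annually"
    # Very regularly updated content, e.g. Wikimedia wikis.
    if frequently_updated:
        return "monthly"
    # Default rule.
    return "quarterly"
```

For a customer recipe with a maintenance contract, the interval would in practice come from the contract itself; this sketch simply falls back to the general rules in that case.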
==== Change a recipe/ZIM warehouse path and/or a ZIM name ==== | ==== Change a recipe/ZIM warehouse path and/or a ZIM name ==== | ||
Line 202: | Line 224: | ||
It is hence mandatory that, whenever a recipe/ZIM needs to be deleted, a ticket is opened on GitHub in [https://github.com/openzim/zim-requests openzim/zim-requests] and assigned to both @benoit74 and @rgaudin for proper coordination: | It is hence mandatory that, whenever a recipe/ZIM needs to be deleted, a ticket is opened on GitHub in [https://github.com/openzim/zim-requests openzim/zim-requests] and assigned to both @benoit74 and @rgaudin for proper coordination: | ||
# Add a delete marker on storage (if <code>zim/zimit/my_zim.zim</code> needs to be removed from catalog, you | # Add a delete marker on storage (if <code>zim/zimit/my_zim.zim</code> needs to be removed from catalog, you have to "touch" <code>zim/zimit/my_zim.delete</code>) | ||
#Wait for library catalog to be regenerated | #Wait for library catalog to be regenerated | ||
#Check that there are no more in-progress Orders in the Kiwix Hotspot Imager that include those ZIMs | #Check that there are no more in-progress Orders in the Kiwix Hotspot Imager that include those ZIMs | ||
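The delete marker from the first step is just an empty file next to the ZIM, with <code>.delete</code> in place of <code>.zim</code>. A minimal sketch of that step is shown below; the helper names and the <code>storage_root</code> parameter are hypothetical, and how the storage is actually mounted or accessed is not covered here.

```python
from pathlib import Path


def delete_marker_for(zim_path: str) -> Path:
    """Map zim/zimit/my_zim.zim to zim/zimit/my_zim.delete."""
    return Path(zim_path).with_suffix(".delete")


def add_delete_marker(storage_root: Path, zim_path: str) -> Path:
    """Create the empty marker file, equivalent to "touch" on the storage."""
    marker = storage_root / delete_marker_for(zim_path)
    marker.touch()  # assumes the directory holding the ZIM already exists
    return marker
```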
Line 210: | Line 232: | ||
==== Demo a ZIM ==== | ==== Demo a ZIM ==== | ||
From time to time, we need to demo a ZIM to a customer before releasing it into the wild. We have a demo instance at https:// | From time to time, we need to demo a ZIM to a customer before releasing it into the wild. We have a demo instance at https://clients.library.kiwix.org/ | ||
Configuration is done through the file at https://github.com/kiwix/operations/blob/main/zim/ | Configuration is done through the file at https://github.com/kiwix/operations/blob/main/zim/clients-library/demos.yaml ; should you need to create a new demo, modify or delete an existing one, simply open a PR with your modifications on this file and ask @rgaudin or @benoit74 for review. | ||
Every ZIM can be referenced either by full path or by path up-to-the-date, in which case the most recent one will be automatically selected at each configuration redeployment. | Every ZIM can be referenced either by full path or by path up-to-the-date, in which case the most recent one will be automatically selected at each configuration redeployment. |
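To illustrate the two ways of referencing a ZIM, here is a small sketch of how a reference could be resolved. It is not the actual deployment code: the function name is hypothetical, and it assumes that ZIM file names end with a <code>YYYY-MM</code> date before the <code>.zim</code> extension, so that the most recent file is the one that sorts last.

```python
import re

# Assumption: the part after the "path up-to-the-date" is exactly YYYY-MM.zim
DATE_SUFFIX = re.compile(r"\d{4}-\d{2}\.zim")


def resolve_zim(reference: str, available: list[str]) -> str:
    """Resolve a full path as-is, or a dateless prefix to its most recent ZIM."""
    if reference.endswith(".zim"):
        # Full path: it must exist exactly as written.
        if reference not in available:
            raise LookupError(f"no such ZIM: {reference}")
        return reference
    # Path up-to-the-date: keep only files whose remainder is a date suffix.
    candidates = sorted(
        path
        for path in available
        if path.startswith(reference)
        and DATE_SUFFIX.fullmatch(path[len(reference):])
    )
    if not candidates:
        raise LookupError(f"no ZIM matches: {reference}")
    # YYYY-MM sorts chronologically, so the last candidate is the newest.
    return candidates[-1]
```

Checking the remainder against the date pattern prevents a prefix such as <code>foo_</code> from accidentally matching a different flavour such as <code>foo_maxi_2024-06.zim</code>.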