Difference between revisions of "Content team"

* Any recipe should first run successfully in dev before being put in production
* Hardware resources should be saved
** Handling of server-side errors
*** Requests for HTML content that return HTTP 4xx or HTTP 5xx should either lead to a scraper error (exit), or the content could be replaced by a placeholder:
**** Explaining that this is a server-side error and not a scraper error
**** Sharing a few details about the nature of the error
**** Explaining whether this might be temporary
**** Ideally linking to our ticketing system for further information
*** This implies that the list of tolerated errors is clearly documented in the code.
*** A low tolerance, expressed both as a percentage of the total number of pages AND as a fixed value, should be hardcoded in the scraper
*** The list of errors should be shared at the end of the scraping process
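
The error-handling rules above could be sketched roughly as follows. This is only an illustration, not the scraper's actual code: the threshold values, the set of tolerated status codes, the ticketing URL and all names (<code>ErrorTracker</code>, <code>placeholder_html</code>, <code>ScraperError</code>) are assumptions. It also assumes that "AND" means both limits apply, so exceeding either one makes the scraper exit.

```python
from dataclasses import dataclass, field
from html import escape

# All values below are illustrative assumptions, not values from this page.
MAX_ERROR_RATIO = 0.01  # tolerate errors on at most 1% of all pages...
MAX_ERROR_COUNT = 50    # ...AND on at most 50 pages in absolute terms
TOLERATED_STATUSES = frozenset({404, 410, 500, 502, 503, 504})
TICKET_URL = "https://example.org/tickets"  # hypothetical ticketing system URL


class ScraperError(Exception):
    """Raised when the scraper must exit instead of tolerating an error."""


def placeholder_html(url: str, status: int) -> str:
    """Build replacement content for a page the server failed to deliver."""
    # Assumption of this sketch: 5xx is treated as possibly temporary, 4xx as not.
    hint = "It may be temporary." if status >= 500 else "It is likely permanent."
    return (
        "<p>This page is missing because of a server-side error, "
        "not a scraper error.</p>"
        f"<p>HTTP {status} was returned for {escape(url)}. {hint}</p>"
        f'<p>See <a href="{escape(TICKET_URL)}">our ticketing system</a>.</p>'
    )


@dataclass
class ErrorTracker:
    total_pages: int
    errors: list = field(default_factory=list)

    def record(self, url: str, status: int) -> str:
        """Register a failed request; return a placeholder or raise ScraperError."""
        if status not in TOLERATED_STATUSES:
            raise ScraperError(f"HTTP {status} for {url} is not a tolerated error")
        self.errors.append((url, status))
        count = len(self.errors)
        # Both limits apply: exceeding either one aborts the scrape.
        if count > MAX_ERROR_COUNT or count > MAX_ERROR_RATIO * self.total_pages:
            raise ScraperError(f"{count} tolerated errors exceed the allowed limit")
        return placeholder_html(url, status)

    def report(self) -> list:
        """List every tolerated error, to be shared at the end of the scrape."""
        return [f"HTTP {status} {url}" for url, status in self.errors]
```

A scraper would call <code>record()</code> for each failed request, store the returned placeholder in place of the page content, and print <code>report()</code> once scraping finishes. Keeping <code>TOLERATED_STATUSES</code> as a single constant is one way to keep the list of tolerated errors documented in the code.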


=== Library Management ===
