Difference between revisions of "Roadmap"

From openZIM
Line 4:
  * Finalizing ZIM file format
  ** ZIM file header:
- *** add Pointer to UrlPointerList (IndexPointerList will be named "TitlePointerList")
+ *** add Pointer to UrlPointerList (IndexPointerList will be named "TitlePointerList") - DONE
- *** add Pointer to MimeTypeList
+ *** add Pointer to MimeTypeList - DONE
  ** ZIM file structure:
- *** add UrlPointerList (article list ordered by URL)
+ *** add UrlPointerList (article list ordered by URL) - DONE
- *** add MimeTypeList to store MimeTypes in a zero-terminated list
+ *** add MimeTypeList to store MimeTypes in a zero-terminated list - DONE
- *** make new integer compression (UTF-8 compression / ZInt compression)
+ *** make new integer compression (UTF-8 compression / ZInt compression) - DONE
  *** break version number into major / minor number
  ** Directory Entry:
- *** drop QUnicode on article titles
+ *** drop QUnicode on article titles - DONE
- *** add URL
+ *** add URL - DONE
- *** add rev_id int compressed
+ *** add rev_id int compressed - DONE (not int compressed)
  ** Index Namespace (X)
- *** switch to new int compression
+ *** switch to new int compression - DONE

  ;Later
Line 28:

  * Cluster compression
- ** add LZMA compression
+ ** add LZMA compression - DONE [[LZMA compression]]
  ** switch to compression streaming (only keep in memory what is really needed)

Revision as of 23:20, 1 January 2010

See also current Status and next steps.

Until end of 2009
  • Finalizing ZIM file format
    • ZIM file header:
      • add Pointer to UrlPointerList (IndexPointerList will be named "TitlePointerList") - DONE
      • add Pointer to MimeTypeList - DONE
    • ZIM file structure:
      • add UrlPointerList (article list ordered by URL) - DONE
      • add MimeTypeList to store MimeTypes in a zero-terminated list - DONE
      • make new integer compression (UTF-8 compression / ZInt compression) - DONE
      • break version number into major / minor number
    • Directory Entry:
      • drop QUnicode on article titles - DONE
      • add URL - DONE
      • add rev_id int compressed - DONE (not int compressed)
    • Index Namespace (X)
      • switch to new int compression - DONE
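Two of the file-structure items above, the zero-terminated MimeTypeList and the new integer compression, can be illustrated with a short Python sketch. It is not taken from the ZIM specification. The list layout (NUL-terminated strings, with an empty string closing the list) is an assumption, and the integer coding shown is a generic 7-bits-per-byte variable-length scheme used as a stand-in, not the actual ZInt format. The function names are hypothetical.

```python
def parse_mime_type_list(buf: bytes, offset: int = 0) -> list[str]:
    """Read NUL-terminated strings starting at offset.

    Assumption: an empty string (a lone NUL byte) marks the end of the list.
    """
    mime_types = []
    pos = offset
    while True:
        end = buf.index(b"\x00", pos)  # ValueError if the list is unterminated
        if end == pos:  # empty string: end of list (assumed convention)
            break
        mime_types.append(buf[pos:end].decode("utf-8"))
        pos = end + 1
    return mime_types


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low bits first.

    Stand-in for "ZInt compression"; the real on-disk format may differ.
    """
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode one integer; return (value, position after the last byte read)."""
    result = 0
    shift = 0
    pos = offset
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
```

With such a scheme small values occupy a single byte, which is the usual motivation for variable-length integers in index data; whether ZInt makes the same trade-off is not stated here.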
Later
  • Layout Namespace (A / B)
    • A - HTML body
    • B - HTML header template
    • reader sets flag when loading library if it wants to get HTML body or full layout using header template
    • MIME types used
      • html-body
      • html-layout
  • Cluster compression
    • add LZMA compression - DONE (see LZMA compression)
    • switch to compression streaming (only keep in memory what is really needed)
  • Packaging
    • Debian maintainer, contact by Tommi
    • RPM? - maybe openSuSE buildservice
    • static binaries should be updated regularly
    • Emmanuel adds Microsoft Visual Studio project file to SVN
  • Category Namespace (U / V)
    • U contains standard article text
    • V contains article pointers to articles within that category
  • Metadata Namespace (M) - these fields should be available as variables for layout templates
    • "language" - ISO Code 639-3
    • "creator"
    • "date" - YYYYMMDD
    • "description"
    • "relation"
    • "source" - URL
  • ZIM export running on http://download.wikimedia.org/. This is not likely to happen until after the WMF fundraiser finishes, which will be after 1/2010. -- Tomasz 15:06, 22 November 2009 (UTC)
    • work in MW API
    • dumper has to add license name and link to the HTML content
  • Updating
    • tool to merge two ZIM files
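The "compression streaming" item under Cluster compression (only keep in memory what is really needed) can be sketched with Python's standard `lzma` module. This is an illustration of the general idea, not the openZIM implementation: the helper name `iter_decompressed` and the chunk size are hypothetical, and the example uses the XZ container produced by `lzma.compress` rather than whatever framing ZIM clusters use.

```python
import io
import lzma


def iter_decompressed(stream, chunk_size=4096):
    """Yield decompressed data in pieces of at most chunk_size bytes.

    Only one compressed chunk and one output piece are handed to the
    caller at a time, instead of the whole decompressed cluster.
    """
    decomp = lzma.LZMADecompressor()
    while not decomp.eof:
        if decomp.needs_input:
            data = stream.read(chunk_size)
            if not data:
                raise EOFError("compressed stream ended early")
        else:
            # The decompressor still has buffered input to work through.
            data = b""
        out = decomp.decompress(data, max_length=chunk_size)
        if out:
            yield out


# Usage: decompress a synthetic payload piece by piece.
payload = bytes(range(256)) * 100
compressed = lzma.compress(payload)
pieces = list(iter_decompressed(io.BytesIO(compressed), chunk_size=512))
```

A reader that only needs one article from a cluster could stop consuming the generator as soon as the required byte range has been seen.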
April 2009