Difference between revisions of "Google Summer of Code 2010"

From openZIM
Line 2: Line 2:
* Website: http://socghop.appspot.com/
* FAQ: http://socghop.appspot.com/document/show/gsoc_program/google/gsoc2010/faqs
;Subpages
* [[/Organisation Application]]


=== Timeline ===
Line 71: Line 74:
|}
|}


== Mentors ==
=== Mentors ===
We will do the mentoring in a team. As we are all working on this project during our free time and all have real jobs during the day, we spread the tasks involved in mentoring so we can back up each other.


Line 84: Line 87:
openZIM provides a solution - or at least a part of it - to most of the Wikimedia Offline proposals in the Wikimedia Strategy.


=== MediaWiki extension to create ZIM files ===
=== WMF Strategy, Outline for Recommendation #1 - content reuse from WMF projects ===
;Proposal 1: The File Format
This is our target goal. With ZIM we provide a very efficient format which works for a variety of data types (hypertext, images and other files, categories...) and on a broad variety of platforms and devices (e.g. it is optimized for very limited devices like handhelds). As an openly documented standard with GPLed libraries available, it can easily be integrated into any reader application (several are already available). ZIM files from all sources can be exchanged between reader applications, which enables easy reuse of Wikimedia content.
This is our target goal. The file format is working and has been improved several times, also with the help of the Wikimedia Foundation. The Foundation has decided to include ZIM as a regular dump file format. Other work has to be done before this can happen, namely fixing the static dumps; Tomasz Finc is already working on that on Wikimedia's side.
 
The Foundation has decided to include ZIM as a regular dump file format. Other work has to be done before this can happen, namely fixing the static dumps; Tomasz Finc is already working on that on Wikimedia's side.
 
Currently, creating ZIM files involves a lot of manual work. Several publishers use different tools, such as Perl scripts, to gather the content from MediaWiki and put it into a database. The libzim currently has an interface to read this content from a database or from other ZIM files in order to create new ZIM files. Automated ZIM file creation is therefore still a gap.


With this idea we plan to speed up the development process from another perspective. A "one-click function" to create ZIM files was one of the first ideas mentioned by Erik Möller (deputy CEO of WMF) when the WMF decided to adopt ZIM. This can be achieved by using MediaWiki's internal structures to create clean HTML from a list of articles and by writing an interface that lets libzim read from this extension. The libzim interface can then be used for the final hook-up to the static dump once it has been fixed, bringing the regular ZIM dump at WMF a step closer.
;Idea: MediaWiki extension to create ZIM files
# Having a "one-click function" to export selected articles from MediaWiki was one of the first ideas the Wikimedia Foundation had when they decided to adopt ZIM. So this idea will become reality, enabling all users of MediaWiki to easily produce ZIM files, spreading the file format more widely and making it more usable outside Wikimedia.
# While the static dumps are being fixed we can prepare the next step in automated ZIM file creation. The libzim interface for the MediaWiki extension can simply be reused.


;Proposal
=== WMF Strategy, Outline of offline recommendation #2: Use of cellphones ===
As described in #1, the file format is very efficient and has been optimized for use on devices with limited resources.


;Specification:
;Idea: zimreader for mobile phones
We don't have specific knowledge of how to write code for cell phones, but we have all the components - file format, free implementation, library - in place. So we had the idea of finding somebody to write a sample application for the cell phone of his / her choice as a start.
 
;WMF Strategy, Outline for Recommendation #3 - Schools
ZIM is already used at schools; the Kiwix project (a user of openZIM) in particular is very active in that area. Having a standardized format as described in #1 helps to provide schools with a large variety of compatible content which is updated frequently.
 
Besides that, openZIM is working with Linux4Africa, who are about to adopt the ZIM file format, using the openZIM zimreader application as a local webserver on their school servers.
 
=== Idea: MediaWiki extension to create ZIM files ===
* content selection by user:
** add a selector to each article "include this article"
Line 111: Line 128:
** retrieve HTML content of an article - only content section


=== zimreader for mobile phones ===
=== Idea: zimreader for mobile phones ===
Make an HTML viewer that uses zimlib to show contents on a mobile phone.


Line 117: Line 134:
* Symbian
* J2ME
== Organisation Application ==
* '''Organization Name:''' openZIM
* '''Description:'''
*:The openZIM project has two different targets:
*:* the ZIM file format, an open, standardized file format to store Wiki content efficiently for offline usage
*:* an open source implementation of the ZIM file format consisting of the zimlib, zimwriter and zimreader
*:The openZIM project is sponsored by Wikimedia CH and supported by the Wikimedia Foundation.
*:Currently we support MediaWiki and are working towards ZIM file creation for all Wikimedia projects.
* '''Home page:''' http://openzim.org/
* '''Main Organization License:''' GNU General Public License (GPL)
* '''Why is your organization applying to participate in GSoC 2010? What do you hope to gain by participating?'''
*:
* '''Did your organization participate in past GSoCs? If so, please summarize your involvement and the successes and challenges of your participation.'''
*: no
* '''If your organization participated in past GSoCs, please let us know the ratio of students passing to students allocated, e.g. 2006: 3/6 for 3 out of 6 students passed in 2006.'''
*: n/a
* '''If your organization has not previously participated in GSoC, have you applied in the past? If so, for what year(s)?'''
*: no
* '''What is the URL for your ideas page?''' http://openzim.org/Google_Sommer_of_Code_2010
* '''What is the main development mailing list for your organization? This question will be shown to students who would like to get more information about applying to your organization for GSoC 2010. If your organization uses more than one list, please make sure to include a description of the list so students know which to use.'''
*: dev-l@openzim.org
* '''What is the main IRC channel for your organization?''' irc://irc.freenode.net/openzim
* '''Does your organization have an application template you would like to see students use? If so, please provide it now. Please note that it is a very good idea to ask students to provide you with their contact information as part of your template. Their contact details will not be shared with you automatically via the GSoC 2010 site.'''
*:* name
*:* location (city, country)
*:* age
*:* personal website
*:* mail address
*:* IRC nickname
*:* Wikipedia username
*:* your experience with mediawiki (usage, coding)
*:* why did you choose this project for GSoC (motivation, expectations)?
*:* would you be able to present your project at the Wikimania Conference, July 9 - 11, 2010?
* '''What criteria did you use to select the individuals who will act as mentors for your organization? Please be as specific as possible:'''
*:* technical competence in order to support the students to get their job done
*:* time left to be available for the students to answer their questions
*:* willingness to deal with students and provide the above
* '''What is your plan for dealing with disappearing students?'''
*:* trying to maintain contact
*:* informing them about consequences of their absence
*:* if possible discuss if they come back or if we cancel the project
*:* if the above gives no results note the situation in the evaluation
* '''What is your plan for dealing with disappearing mentors?'''
*:very unlikely case as the main developer is part of the mentoring team
*:* trying to maintain contact
*:* if necessary there is at least one fall-back available who will take over the responsibility
* '''What steps will you take to encourage students to interact with your project's community before, during and after the program?'''
*: they will be full members of the development team, so they are involved in all processes of the project, at least through the mailing list
*: they are free to participate in any developers' meeting or other openZIM event, such as exhibitions, all of which are sponsored by the project team (at least accommodation)
* '''What will you do to ensure that your accepted students stick with the project after GSoC concludes? '''
*:As written above, they are full members of the project team and have all rights and privileges that come with that. If they like the project - which we hope, but cannot influence - we think that they will stick with it.
* '''Is there anything else you would like to tell the Google Summer of Code program administration team?'''
*:We like the idea of GSoC and thank you very much for this opportunity for us to find skilled developers!
* '''Backup Admin (Link ID):''' tntnet

Revision as of 08:11, 12 March 2010

Process

Subpages

Timeline

February 8: Program announced. Life is good.
March 8, ~12 noon PST / 19:00 UTC: Mentoring organizations can begin submitting applications to Google.
March 12, 4 PM PDT / 23:00 UTC: Mentoring organization application deadline.
March 13-17: Google program administrators review organization applications.
March 18, ~12 noon PDT / 19:00 UTC: List of accepted mentoring organizations published on the Google Summer of Code 2010 site.
March 18-29: Would-be student participants discuss application ideas with mentoring organizations.
March 29, ~12 noon PDT / 19:00 UTC: Student application period opens.
April 9, 12 noon PDT / 19:00 UTC: Student application deadline.
Interim Period: Mentoring organizations review and rank student proposals; where necessary, mentoring organizations may request further proposal detail from the student applicant.
April 21: All mentors must be signed up and all student proposals matched with a mentor (07:00 UTC). Student ranking/scoring deadline; please do not add private comments with a nonzero score or mark students as ineligible (unless doing so as part of resolving duplicate accepted students) after this deadline (17:00 UTC). IRC meeting to resolve any outstanding duplicate accepted students (timing TBD, will be announced well in advance).
April 26, ~12 noon PDT / 19:00 UTC: Accepted student proposals announced on the Google Summer of Code 2010 site.
Community Bonding Period: Students get to know mentors, read documentation, get up to speed to begin working on their projects.
May 24: Students begin coding for their GSoC projects; Google begins issuing initial student payments provided tax forms are on file and students are in good standing with their communities.
Interim Period: Mentors give students a helping hand and guidance on their projects.
July 12, ~12 noon PDT / 19:00 UTC: Mentors and students can begin submitting mid-term evaluations.
July 16, 12 noon PDT / 19:00 UTC: Mid-term evaluations deadline; Google begins issuing mid-term student payments provided passing student survey is on file.
Interim Period: Mentors give students a helping hand and guidance on their projects.
August 9: Suggested 'pencils down' date. Take a week to scrub code, write tests, improve documentation, etc.
August 16, ~12 noon PDT / 19:00 UTC: Firm 'pencils down' date. Mentors, students and organization administrators can begin submitting final evaluations to Google.
August 20, 12 noon PDT / 19:00 UTC: Final evaluation deadline; Google begins issuing student and mentoring organization payments provided forms and evaluations are on file.
August 23: Final results of GSoC 2010 announced
August 30: Students can begin submitting required code samples to Google
October (date TBD): Mentor Summit at Google: Representatives from each successfully participating organization are invited to Google to greet, collaborate and code. Our mission for the weekend: make the program even better, have fun and make new friends.

Mentors

We will do the mentoring in a team. As we are all working on this project during our free time and all have real jobs during the day, we spread the tasks involved in mentoring so we can back up each other.

  • Tommi Mäkitalo (tntnet) - main openZIM developer, will take care of technical questions
  • Manuel Schneider (x80686) - openZIM project leader, will take care of all the paperwork, evaluations etc., is available through IRC during the day

For coordination there will be regular IRC and Mumble meetings with the students and mentors. These meetings should take place at least twice a week; the schedule will be fixed as soon as the participants are known.

Ideas

see also http://strategy.wikimedia.org/wiki/Task_force/Recommendations/Offline

openZIM provides a solution - or at least a part of it - to most of the Wikimedia Offline proposals in the Wikimedia Strategy.

WMF Strategy, Outline for Recommendation #1 - content reuse from WMF projects

This is our target goal. With ZIM we provide a very efficient format which works for a variety of data types (hypertext, images and other files, categories...) and on a broad variety of platforms and devices (e.g. it is optimized for very limited devices like handhelds). As an openly documented standard with GPLed libraries available, it can easily be integrated into any reader application (several are already available). ZIM files from all sources can be exchanged between reader applications, which enables easy reuse of Wikimedia content.

The Foundation has decided to include ZIM as a regular dump file format. Other work has to be done before this can happen, namely fixing the static dumps; Tomasz Finc is already working on that on Wikimedia's side.

Currently, creating ZIM files involves a lot of manual work. Several publishers use different tools, such as Perl scripts, to gather the content from MediaWiki and put it into a database. The libzim currently has an interface to read this content from a database or from other ZIM files in order to create new ZIM files. Automated ZIM file creation is therefore still a gap.

Idea
MediaWiki extension to create ZIM files
  1. Having a "one-click function" to export selected articles from MediaWiki was one of the first ideas the Wikimedia Foundation had when they decided to adopt ZIM. So this idea will become reality, enabling all users of MediaWiki to easily produce ZIM files, spreading the file format more widely and making it more usable outside Wikimedia.
  2. While the static dumps are being fixed we can prepare the next step in automated ZIM file creation. The libzim interface for the MediaWiki extension can simply be reused.

WMF Strategy, Outline of offline recommendation #2: Use of cellphones

As described in #1, the file format is very efficient and has been optimized for use on devices with limited resources.

Idea
zimreader for mobile phones

We don't have specific knowledge of how to write code for cell phones, but we have all the components - file format, free implementation, library - in place. So we had the idea of finding somebody to write a sample application for the cell phone of his / her choice as a start.

WMF Strategy, Outline for Recommendation #3 - Schools

ZIM is already used at schools; the Kiwix project (a user of openZIM) in particular is very active in that area. Having a standardized format as described in #1 helps to provide schools with a large variety of compatible content which is updated frequently.

Besides that, openZIM is working with Linux4Africa, who are about to adopt the ZIM file format, using the openZIM zimreader application as a local webserver on their school servers.

Idea: MediaWiki extension to create ZIM files

  • content selection by user:
    • add a selector to each article "include this article"
    • provide a method to include categories / all articles with selected categories
    • choose if images should be included as well or not
    • choose if selected articles / category should be exported or the whole wiki
  • content selection by extension:
    • make a list of all selected articles
    • make a list of all categories needed
    • make a list of all involved images / files
    • gather a list of involved MIME types
  • ZIM header:
    • send the list of content to zimlib
    • create meta data for zimlib
    • send MIME type list
  • ZIM content:
    • provide an interface for zimlib where it can fetch article content
    • retrieve HTML content of an article - only content section
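
The "content selection by extension" and "ZIM header" steps above could be sketched roughly as follows. This is only an illustration in Python, not the planned extension and not the zimlib API: the Article and Manifest types and the build_manifest and fetch_content functions are hypothetical names invented for this sketch, and MIME types are simply guessed from file names with the standard library.

```python
from __future__ import annotations

import mimetypes
from dataclasses import dataclass, field


@dataclass
class Article:
    """Hypothetical stand-in for a wiki page the user selected for export."""
    title: str
    html: str                                             # content section only
    categories: list[str] = field(default_factory=list)
    files: list[str] = field(default_factory=list)        # images / files used


@dataclass
class Manifest:
    """The lists that would be handed over before any content is fetched."""
    articles: list[str]
    categories: list[str]
    files: list[str]
    mime_types: list[str]


def build_manifest(selected: list[Article], include_images: bool = True) -> Manifest:
    """Collect articles, needed categories, involved files and MIME types."""
    articles = sorted({a.title for a in selected})
    categories = sorted({c for a in selected for c in a.categories})
    files = sorted({f for a in selected for f in a.files}) if include_images else []
    # Articles themselves are exported as HTML.
    mime_types = {"text/html"} if articles else set()
    for name in files:
        guessed, _ = mimetypes.guess_type(name)
        mime_types.add(guessed or "application/octet-stream")
    return Manifest(articles, categories, files, sorted(mime_types))


def fetch_content(selected: list[Article], title: str) -> str:
    """Callback-style lookup: return the HTML content section of one article."""
    for article in selected:
        if article.title == title:
            return article.html
    raise KeyError(title)
```

Keeping the manifest separate from the content callback mirrors the split above between "ZIM header" (lists and metadata sent first) and "ZIM content" (articles fetched on request).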

Idea: zimreader for mobile phones

Make an HTML viewer that uses zimlib to show contents on a mobile phone.

Possible target platforms:

  • Symbian
  • J2ME