Difference between revisions of "Libzim"

2,662 bytes added ,  19:11, 21 January 2010
no edit summary
Line 2: Line 2:


== Programming ==
=== Introduction ===
zimlib is written in C++. To use the library, you need the zimlib include files and have to link against libzim. Both are installed when zimlib is built with the usual "./configure; make; make install".


Line 10: Line 12:
The main class, which accesses the file, is zim::File. It actually holds a reference to an implementation, so copies of the class just reference the same file. You open a file by passing the file name to the constructor as a std::string.


The API tries to resemble the standard C++ library, so that a ''zim::File'' works like a container of instances of zim::Article. It has a ''const_iterator'', which is created using zim::File::begin(). Dereferencing the iterator gives the zim::Article. The iterator may be incremented to point to the next article until it reaches zim::File::end(). An iterator equal to end() must be neither dereferenced nor incremented.


When the iterator is created using zim::File::beginByTitle(), the articles are ordered by title. Otherwise they are ordered by url.
Line 37: Line 39:
   }
}
</source>
You may save that file under the name "zimlist.cpp" and compile it using the command:
'''g++ -o zimlist zimlist.cpp -lzim'''. The library is named after the source file because many linkers resolve symbols in command-line order. You get a program which lists the urls and titles of the file named ''wikipedia.zim''. Of course it is better to pass that name as a parameter via argc/argv, but this should be an easy task for you, so the listing leaves that out.
In subsequent examples I show only the code needed to use the library. The main function with the error catcher should always be in place.
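For readers who want a starting point for the argc/argv variant, here is a minimal sketch. The helper name ''fileNameFromArgs'' is hypothetical and not part of zimlib; it only extracts the file name and throws an exception, so that the existing error catcher in main can report a missing argument.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of zimlib): returns the file name given as
// the first command-line argument, or throws if no argument was passed.
std::string fileNameFromArgs(int argc, const char* const argv[])
{
  if (argc < 2)
  {
    // argv[0] is normally the program name; fall back to a fixed name if absent.
    const char* program = argc > 0 ? argv[0] : "zimlist";
    throw std::runtime_error(std::string("usage: ") + program + " <zim file>");
  }
  return argv[1];
}

// Intended use inside the try block of main(int argc, char* argv[]):
//   zim::File file(fileNameFromArgs(argc, argv));
```

Because the helper throws a std::runtime_error, a missing argument ends up in the same catch block as errors raised while reading the file.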
== Finding articles ==
Articles are addressed either by index or by namespace and url or title. The index is normally not that useful, so let us look at how to find a specific article.
''zim::File'' has the methods ''find'' and ''findByTitle''. Both take 2 parameters: a char for the namespace (which is normally 'A' for articles) and a string, which specifies the url (in find) or the title (in findByTitle). Each returns a const_iterator pointing to the lexicographically next article. Be aware that the returned iterator may be equal to end(), so you should check that before dereferencing it.
;Sample: find an article by title and print the content:
<source lang=c>
zim::File::const_iterator it = file.findByTitle('A', "Wikipedia");
if (it == file.end())
  throw std::runtime_error("article not found");
if (it->isRedirect())
  std::cout << "see: " << it->getRedirectArticle().getTitle() << std::endl;
else
  std::cout << it->getData();
</source>
Incrementing the iterator walks through the file in url or title order, depending on which method created the iterator.
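Since the returned iterator points to the lexicographically next article, it may refer to a different article when no exact match exists. The following sketch illustrates that behaviour with a std::map as a stand-in for the file. It does not use the zimlib API: ''TitleIndex'', ''hasExactTitle'' and ''titlesWithPrefix'' are hypothetical names, and std::map::lower_bound is assumed to behave like the search described above.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in for a ZIM file: maps titles to article text, ordered by title.
// This is NOT the zimlib API; lower_bound only mimics the
// "lexicographically next" behaviour described for findByTitle.
typedef std::map<std::string, std::string> TitleIndex;

// Returns true only if an entry with exactly this title exists.
bool hasExactTitle(const TitleIndex& index, const std::string& title)
{
  TitleIndex::const_iterator it = index.lower_bound(title);
  // Check for end() first, then compare the title that was actually found.
  return it != index.end() && it->first == title;
}

// Collects all titles starting with the given prefix by incrementing the
// iterator from the first candidate until the prefix no longer matches.
std::vector<std::string> titlesWithPrefix(const TitleIndex& index,
                                          const std::string& prefix)
{
  std::vector<std::string> result;
  for (TitleIndex::const_iterator it = index.lower_bound(prefix);
       it != index.end() && it->first.compare(0, prefix.size(), prefix) == 0;
       ++it)
    result.push_back(it->first);
  return result;
}
```

With a real zim::File, the equivalent of the exact-match check would presumably compare the requested title with the getTitle() of the article found. The prefix loop shows why title order is handy for building a simple title search.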
The method zim::Article::getData() returns the actual data as an instance of zim::Blob. This class has a method data(), which returns a pointer (const char*) to the beginning of the data, and a method size(), which returns the size of the article data. Be aware that the data is not zero terminated. Zim files can contain binary data like images, which may have zero bytes in the data.
The data is valid as long as the article is valid.
If you really want zero terminated data because you know that it does not contain zero bytes, you may use the Blob to create a std::string and use its c_str() method.
To make life easier, an ostream operator for zim::Blob is implemented.
;Sample: get a zero-terminated string out of the article data:
<source lang=c>
zim::Article article = ...; // get the article somewhere
zim::Blob blob = article.getData();
std::string stringdata(blob.data(), blob.size());  // copies exactly size() bytes
const char* zptr = stringdata.c_str();  // c_str() guarantees that the pointer points to zero-terminated data
</source>

