Libarc

Libarc is a C++ library for accessing the contents of GZIP-compressed ARC files. These ARC files are generated by the Internet Archive's Heritrix web crawler.

The library can open and scan GZIP-compressed ARC files. It also provides an iterator that walks over the contents of an ARC file, member by member.

A media type can be specified to limit the members that the iterator returns. For each member, a single API call gives access to the information in the member's URL record, the response headers from the HTTP server, and the member's data.
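To illustrate the kind of information a URL record carries and how filtering by media type works, here is a minimal sketch in standard C++. It does not use libarc and does not reproduce its API: the names UrlRecord, parse_url_record and filter_by_media_type are hypothetical. It assumes the ARC version 1 URL record layout (URL, IP address, archive date, content type, archive length, separated by spaces) and operates on record lines that have already been decompressed.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical type for illustration only; this is NOT libarc's API.
// Fields follow the ARC version 1 URL record layout:
//   URL IP-address Archive-date Content-type Archive-length
struct UrlRecord {
    std::string url;
    std::string ip_address;
    std::string archive_date;        // YYYYMMDDhhmmss
    std::string content_type;        // media type, e.g. "text/html"
    std::size_t archive_length = 0;  // size in bytes of the member body
};

// Parse one space-separated version 1 URL record line into rec.
// Returns false if the line does not consist of exactly five fields
// with a numeric length in the last position.
bool parse_url_record(const std::string& line, UrlRecord& rec) {
    std::istringstream in(line);
    if (!(in >> rec.url >> rec.ip_address >> rec.archive_date
             >> rec.content_type >> rec.archive_length)) {
        return false;
    }
    std::string extra;
    return !(in >> extra);  // reject trailing fields
}

// Keep only the well-formed records whose content type equals media_type.
std::vector<UrlRecord> filter_by_media_type(
        const std::vector<std::string>& lines,
        const std::string& media_type) {
    std::vector<UrlRecord> matches;
    for (const std::string& line : lines) {
        UrlRecord rec;
        if (parse_url_record(line, rec) && rec.content_type == media_type) {
            matches.push_back(rec);
        }
    }
    return matches;
}
```

Calling filter_by_media_type with "text/html" over a set of record lines returns only the HTML members, which mirrors the effect of restricting an iterator to one media type.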

In addition to the API reference documentation, there are two other sources: Programming with libarc - This describes the libarc API

The license and copyright are held by Basis Technology Corp.