
libextractor - Extracting Metadata from Any Type of File on Linux

GNU libextractor is a library for extracting metadata from files of arbitrary type. It is designed to use helper libraries to perform the actual extraction, and to be trivially extensible by linking against external extractors for additional file types.

Currently, libextractor supports the following formats: HTML, PDF, PS, OLE2 (DOC, XLS, PPT), OpenOffice (sxw), StarOffice (sdw), DVI, MAN, FLAC, MP3 (ID3v1 and ID3v2), NSF(E) (NES music), SID (C64 music), OGG, WAV, EXIV2, JPEG, GIF, PNG, TIFF, DEB, RPM, TAR(.GZ), ZIP, ELF, S3M (Scream Tracker 3), XM (eXtended Module), IT (Impulse Tracker), FLV, REAL, RIFF (AVI), MPEG, QT and ASF.  Also, various additional MIME types are detected.

Installation:
wget http://ftpmirror.gnu.org/libextractor/libextractor-0.6.0.tar.gz
tar xvfz libextractor-0.6.0.tar.gz
cd libextractor-0.6.0
./configure --prefix=/usr/local
make
make install
After installing libextractor, the extract tool can be used to obtain metadata from documents.

Using libextractor:
To extract the metadata from a given file, such as an FLV video, pass the file name to the extract tool.
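As a minimal sketch of that usage, the small wrapper below checks that the file exists and that extract is on the PATH before running it. The function name show_metadata and the file name video.flv are placeholders introduced here for illustration and are not part of libextractor. The keywords printed depend on the file and on which plugins were built.

```shell
# Hypothetical helper: print the metadata of one file with the `extract`
# tool installed by libextractor.
show_metadata() {
    # Refuse to continue if the argument is not a regular file.
    if [ ! -f "$1" ]; then
        echo "no such file: $1" >&2
        return 1
    fi
    # Make sure the extract binary is reachable (e.g. /usr/local/bin is on PATH).
    if ! command -v extract >/dev/null 2>&1; then
        echo "extract not found; is libextractor installed and on PATH?" >&2
        return 127
    fi
    # Run extract with its default plugins on the given file.
    extract "$1"
}

# Example call (video.flv is a placeholder file name):
# show_metadata video.flv
```

Running extract directly on the file has the same effect. The wrapper only adds the two checks, which helps when the tool was installed under a prefix that is not yet on the PATH.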



