RDFParser Class

Languages: PHP4+
Version: 2.1.0

Loads and parses RDF/RSS files. These are XML files that are used by many websites to syndicate headlines. The class provides an interface to the loaded file and makes it simple to generate lists/tables of the source site's headlines.

The documentation page contains installation instructions, a rundown of the API and some example code showing a typical use of the class. This documentation is also included in the tarball, in both HTML and plain text formats.

The class uses PHP's built-in SAX parser and as such should be able to parse any valid XML document that matches one of the RSS or RDF standards. Even most non-standard files should work, as long as they vaguely resemble RSS/RDF.
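To illustrate the SAX approach described above, here is a minimal sketch in Python rather than the class's own PHP. It is not the RDFParser source: the names HeadlineHandler and parse_headlines are hypothetical, and the choice to collect only the title and link of each item is an assumption made for the example. Matching on lower-cased local names, ignoring any namespace prefix, is one way such a parser could stay tolerant of files that only vaguely resemble RSS/RDF.

```python
import xml.sax


class HeadlineHandler(xml.sax.ContentHandler):
    """Collects {'title': ..., 'link': ...} for every <item> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._item = None    # dict for the item currently being read, if any
        self._field = None   # name of the field currently being read, if any
        self._text = []      # character chunks for the current field

    @staticmethod
    def _local(name):
        # Drop any namespace prefix and ignore case so loosely formed feeds still match.
        return name.split(":")[-1].lower()

    def startElement(self, name, attrs):
        name = self._local(name)
        if name == "item":
            self._item = {"title": "", "link": ""}
        elif self._item is not None and name in self._item:
            self._field = name
            self._text = []

    def characters(self, content):
        # SAX may deliver one text node in several chunks, so accumulate them.
        if self._field is not None:
            self._text.append(content)

    def endElement(self, name):
        name = self._local(name)
        if name == "item" and self._item is not None:
            self.headlines.append(self._item)
            self._item = None
        elif self._field is not None and name == self._field:
            self._item[self._field] = "".join(self._text).strip()
            self._field = None


def parse_headlines(xml_bytes):
    """Return the list of headline dicts found in an RSS/RDF-like document."""
    handler = HeadlineHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.headlines
```

Because the handler only reacts to events as the parser streams through the file, nothing beyond the current item is held in memory, which is the usual reason a SAX parser copes well with large feeds.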

Recent News

Version 2.1.0 released (2009-06-11 19:21:36 vrai)

Basic Atom support has been added to the RDFParser, primarily to support Freshmeat, which dropped its RSS/RDF feeds in the latest redesign. Only the subset of Atom that is common with RSS/RDF is supported, so the external interface to the RDFParser class has not changed at all. No changes are required for code using the class: the filenames of Atom files can be passed to the parser exactly as if they were RSS/RDF files.
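As a rough picture of what "the subset of Atom that is common with RSS/RDF" can mean, the Python sketch below maps each Atom entry onto the same title/link shape an RSS item would produce. It is an illustration only, not the class's PHP code: the function name atom_entries_as_items is hypothetical, it uses a tree parser instead of SAX for brevity, and the decision to take the first "alternate" link is an assumption.

```python
import xml.etree.ElementTree as ET

# Namespace of Atom 1.0 elements, in ElementTree's {uri}name notation.
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def atom_entries_as_items(xml_bytes):
    """Return Atom entries as RSS-style {'title': ..., 'link': ...} dicts."""
    root = ET.fromstring(xml_bytes)
    items = []
    for entry in root.iter(ATOM_NS + "entry"):
        title = entry.findtext(ATOM_NS + "title", default="").strip()
        link = ""
        for link_el in entry.findall(ATOM_NS + "link"):
            # Atom carries the URL in the href attribute; rel defaults to "alternate".
            if link_el.get("rel", "alternate") == "alternate":
                link = link_el.get("href", "")
                break
        items.append({"title": title, "link": link})
    return items
```

Presenting entries in the same shape as items is what lets calling code stay unchanged when the feed format switches.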

Version 2.0.1 released (2009-03-01 12:54:39 vrai)

A fairly minor, but useful, change for this release. It turns out the PHP4 XML parser ignores the encoding specified in the XML declaration (the "<?xml ... ?>" line at the start of the file). This change performs a simple parse of that line before the real XML parser is created and extracts the file encoding. The encoding is then passed explicitly to the XML parser, which can use it to handle Unicode characters correctly.

While a small change, this makes the parser much more useful for non-ASCII character sets.
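The pre-parse step described for 2.0.1 can be sketched as follows. This is a Python illustration, not the class's PHP code: the function name sniff_declared_encoding, the regular expression and the UTF-8 fallback are all assumptions made for the example. It also assumes the declaration itself is written in an ASCII-compatible encoding and is not preceded by a byte order mark.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the start of the data.
_DECL_ENCODING = re.compile(
    rb"""^\s*<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def sniff_declared_encoding(head, default="UTF-8"):
    """Return the encoding named in the XML declaration, or default if none is declared.

    head is the first chunk of the file as bytes; the first line is enough.
    """
    match = _DECL_ENCODING.match(head)
    return match.group(1).decode("ascii") if match else default
```

The returned name would then be handed to whatever creates the real parser, so that the parser no longer has to guess the encoding.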

Version 2.0.0 released (2006-01-29 19:51:39 vrai)

The first update to this project for over two years, and it's fairly major. The old PHP3-compliant code has been dropped and replaced by a complete rewrite that uses PHP4's built-in SAX (XML) parser. While the loss of compatibility with PHP3-only servers is regrettable, the new code is much faster than before. The old regular-expression-based code chugged noticeably when faced with large RSS files (such as Freshmeat's); the new code parses even large files in a few hundredths of a second (on an AMD XP 2800 with Apache 2.0.55 / PHP 4.4.0). In addition, the new code is much more flexible and can cope with any XML file that is vaguely RSS/RDF-like.

To clean up the project page, the documentation and example code have been moved to a separate file. The documentation, in both HTML and plain text formats, is also included in the tarball.

Version 1.0.1 released (2003-11-30 20:41:01 vrai)

This is a minor update: the handling of CDATA enclosures has been fixed (they are no longer ignored) and an optional argument has been added to the class constructor allowing the use of 'raw' mode when handling these enclosures. Full details are in the README file.

Version 1.0.0 released (2003-11-23 13:12:17 vrai)

Initial version of RDFParser now available for download.

Project Files

Click on the file name to download the file. Some browsers may require you to right-click on the file and select "Save Link As".
Version  File                     Description                               MD5
2.1.0    rdfparser-2.1.0.tar.bz2  Added simple Atom support.                8e137d43b2eab010305d28e311b98f90
2.0.1    rdfparser-2.0.1.tar.bz2  Added support for Unicode encodings.      ce84a13c47af7988ac6a6682aba40a6d
2.0.0    rdfparser-2.0.0.tar.bz2  PHP4 rewrite to use built-in SAX parser.  d87071d4a9e570428f5f67dfcacf5083
1.0.1    rdfparser-1.0.1.tar.bz2  Fixed handling of CDATA enclosures.       9a91cdb3b4308500d0922d5d08be8e0e
1.0.0    rdfparser-1.0.0.tar.bz2  Initial release.                          3f859370b18060b1fb8146424a833946

Site design and original content Copyright © Vrai Stacey. Unless otherwise stated, source code contained on this site is published under the GNU General Public License (GPL).