Motivation for the Akara project
I've been involved in XML, and especially in XML processing with Python, since the very beginning. I've seen it all, and pretty much implemented it all. At first my motivation was that I found XML the best hope I'd seen for semi-structured databases, so I focused on the basics. By now, XML is really just plumbing. There are many XML processing libraries in Python, and even the standard library finally has a respectable one in ElementTree.
So why a new XML pipeline and processing project, especially one as ambitious as Akara? The first answer is that it's not just about XML, but I'll come back to that. If we do focus on the XML processing kit, the fact is that most XML processing tools, not just in Python but in general, are entirely focused on the plumbing. The idea is to see XML as a temporary inconvenience, rather than as a strategic technology. This is often justified, because most uses of XML by far are products of poor judgment, where other technologies would have been far better suited. But for those cases where XML is well suited, which I'll briefly characterize as where traditional, granular data combines with rich, prose-like expression, the current crop of tools is inadequate.
As an architect, I want to be able to deal with XML above the level of plumbing. I want to deal with it at the level of expression. Used correctly, XML is not an inconvenience, and I want to deal with it as richly and naturally as possible, because the data in XML will outlive my code by a long, long time. At the same time, I want to make it easy to connect stuff that should be in XML with stuff that's best suited to other formats. I want to be able to pipeline XML processing with HTML, JSON, RDBMS, and all that, again without too much coupling to code.
What that means is that I need tools that allow me to work with XML in as declarative a manner as possible, and to operate at the level of pattern, pattern dispatch and data modeling. I want to be able to overlay semantic technology over XML as naturally as possible (and my evolution with RDF is similar to mine with XML: I don't want to focus on RDF itself, but on the higher level of semantic annotations, of which RDF is just one important format). I want to start by getting the details right, such as proper handling of mixed content, and to keep perspective with broad and powerful modeling techniques such as Schematron Abstract Patterns.
Amara 2.x is designed from the ground up with the above in mind: to treat data as much as possible in the data domain, rather than in the code domain. In practice, you still need good code interfaces, but the key balance to strike is in the nature of the resulting code. I want code that uses Amara 2.x to be compact, so there's less to maintain, but I also want it to be structured such that it could easily be translated from one system of declarations to another, and to reuse standard declaration systems (such as XSLT patterns) where appropriate. The result is an XML processing library that's truly different from anything else out there. Whether you like it or not is a matter of taste, but it's clearly not just a product of NIH (not-invented-here) syndrome.
Akara is designed from the ground up to be a lightweight vehicle for Amara 2.x in a RESTful context. It is a framework for tying together all those processing pipelines using Web conventions.
