Akara Use Cases
End User
Generally, this user is someone who views, searches and creates content for the purposes of the domain Akara was installed for. For the time being, these use cases are somewhat focused on the medical domain, although hopefully they translate easily to other domains.
1. John wants to keep track of how his contracts are working out. He wants to keep his client data alongside his financial information, and eventually to query what sort of progress he is making towards better estimates and what kinds of projects he should be taking on. John hires Eric to create a system to help him keep track of the projects. Eric installs Akara and consults with John to establish a way to categorize different aspects of his contracts. Eric then writes a set of XForms for taking in project information during and after a project. When John submits the forms, a new RDF file is created in Akara and its metadata is indexed for querying later. Eric also sets up a form page where John can change a set of parameters to compare the results. Eric also shows John how he can upload files into Akara such as invoices, notes, and contact information.
2. Maria works at a large software company and wants to keep track of her notes and documentation. She writes a Python script to process her notes for specific terms, phrases and acronyms, which are then indexed in Akara. A few years later, Maria is asked to speak at a conference on the "social" impact of "hypertext" on "users". She puts together a quick query using Akara's command line tool and finds the articles that help her put a presentation together.
3. Dr. Elise is a surgeon and performs a coronary artery bypass graft operation (CABG, open heart surgery) on her patient. Things go very well. After the surgery, Dr. Elise, the other doctors and the nurses meet to take notes on what happened. After the notes are compiled, they are decomposed into more structured information and entered into Akara using some XForms. The patient information such as preconditions, symptoms and any other helpful information is included alongside details of the surgery. All this information is saved in Akara as an XML file that gets transformed into an RDF document. Later, when an intern is to observe a similar procedure, the intern is able to find the details of the surgery (not the patient) and learn some helpful tips. (Paraphrased from a conversation with John L. Clark. Thanks, John!)
Developers
Developers use Akara as a library and/or as a prepackaged service.
1. Mike wants to set up a document storage service for his custom CMS. Currently his users are using simple forms to input content, but they want to start adding things like Word documents and PDFs to the system (the nerve!). Mike has some libraries he can use to crack open and read data from the files, but he doesn't want to deal with the storage and indexing himself. He also figures that the storage could get pretty hefty, wants to get into this cloud computing thing, and turns to S3. Mike realizes the latency between his machine in the office and S3 blows, so he installs Akara on an EC2 instance (after all, they have persistent storage now). He now implements his storage operations using HTTP. He also takes his custom file reading tools and creates extension functions that are used in an XSLT to index the files as they are loaded. Finally, Mike implements a query interface (like Google, no less) that formats Versa queries that are then sent along to the server.
2. Eric is building a multitenant AtomPub implementation and needs a way to store and index Atom entries as they are created. Eric builds a custom HTTP interface for AtomPub that simply proxies the requests to an Akara instance, where he has wrapped the indexing service to index the required elements of the entry. Eric also wraps the queries to the resource manager so that entries are given up-to-date Atom author information when they are requested. Eric's boss is very impressed with his work and gives him a raise! Then his boss mentions that he wants the entire web site to be moved to the same system so it can be included in the same queries. Eric says, "Sure... for a bonus!". The boss sneers and says OK. Eric then creates a quick script to scp the files to a development box where they are imported and indexed into Akara. A job well done, Eric gives himself a high five, gets embarrassed and heads out to catch a movie at the Alamo.
3. Sara works for a company that releases software for publishing Word, FrameMaker and DITA documents online. Sara suggests they move to a SOA in the near future and proposes using Akara. Sara sets up the basic web site for handling the traffic and implements a pipeline of transformations using an XProc file and Akara. Each time a stage in the pipeline finishes, the products of the transforms are saved in Akara and a summary of the resources is passed along to the next stage by sending a trigger request to a custom transform manager. The transform manager then keeps chugging through the document transforms until they are all finished. One last web trigger is fired to let Sara know things completed. Sara, who is surprisingly devious, tries to hold the documents hostage! Fortunately, Patty, the sysadmin, is able to foil Sara's plans by removing her account from the policy manager and giving back the client permissions, all before the client even knows. Patty also takes careful notes and logs, which she adds to Akara as well. This turns out to be extremely helpful, because a few years later Patty quits to work for Google and Sara, who has changed her name to Michelle, comes back to try her plan again!!! Fortunately Brian, Patty's replacement, has been reading up on the problem and has implemented a custom policy manager that denies developers access to the production Akara instance. Phew!
Architectural Notes
Akara performs two basic functions. The first is storing files. This follows almost the same pattern as a *nix filesystem (ACLs, modes, etc.). The second is querying the files based on RDF metadata. Beyond this, other features have been established, such as applying an XSLT resource to an available XML resource and querying the RDF metadata for use in a template. This has been further extended to support basic web application development by allowing XSLT resources to have access to GET/POST params from a request, in coordination with writing the result of XSLT transforms to the store.
This model is very flexible in that documents added to the store can easily be queried using powerful RDF tools, while keeping content in a denormalized form simplifies its most common use. This differs from traditional data management systems, where data is often normalized before storing to allow easy composition. Akara takes a different approach: it treats the original document as the essential source, while indexes provide the typical normalized view of the data along with an efficient pointer back to the original resource.
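As a rough illustration of that split between originals and indexes, here is a minimal sketch. It assumes nothing about Akara's real API: the `TinyStore` class, its method names and the metadata keys are all made up for the example.

```python
from collections import defaultdict


class TinyStore:
    """Toy model: originals kept as-is, metadata kept in a separate index."""

    def __init__(self):
        self.documents = {}            # path -> original bytes, never rewritten
        self.index = defaultdict(set)  # (predicate, value) -> set of paths

    def put(self, path, content, metadata):
        # Store the document in its original, denormalized form.
        self.documents[path] = content
        # Index each metadata statement as a pointer back to the path.
        for predicate, value in metadata.items():
            self.index[(predicate, value)].add(path)

    def find(self, predicate, value):
        # The index gives the normalized view; results are resource pointers.
        return sorted(self.index.get((predicate, value), set()))


store = TinyStore()
store.put("/notes/cabg.xml", b"<notes/>",
          {"procedure": "CABG", "author": "Elise"})
store.put("/notes/valve.xml", b"<notes/>",
          {"procedure": "valve repair", "author": "Elise"})

print(store.find("author", "Elise"))
print(store.documents["/notes/cabg.xml"])
```

The point of the sketch is only that a query returns paths, and the untouched original is then fetched by path.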
Due to this flexibility and the continued development of Akara (4SS specifically), the storage server functionality has become somewhat coupled with other facets such as rendering and handling HTTP interfaces. The goal of these use cases, then, is to define how typical usage could benefit from separating those features out of Akara.
Akara as a Database
Persistence
Akara has the ability to be utilized as a generic data store, similar to something such as an ORM for a web framework. These tasks define what Akara does in terms of providing a basic storage mechanism.
- Store generic XML content
- Store generic Plain Text content
- Store semantic XML content (RDF, Atom, RSS, ODF, HTML, etc.)
- Store processing specific XML content (XSLT, XUpdate, XML Catalog, Schematron, XSD, etc.)
- Store any other data as binary content (Images, etc.)
Queries
For all storage tasks metadata is drawn from the original source in order to allow querying of that metadata. This suggests the following tasks.
- Query for resource pointers based on conditions using RDF tools (Versa, GRDDL, SPARQL, etc.) (Note: I really don't know any of these languages so this could be incorrect.)
- Improve query performance through more effective indexes and keys
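To make "query for resource pointers based on conditions" a bit more concrete, here is a small sketch in plain Python. It is not Versa or SPARQL and not Akara's query engine; the triples, the `match` helper and the `resources_where` helper are all hypothetical, and only show the general shape of pattern matching over metadata statements.

```python
# Each statement is (subject, predicate, object); subjects are resource paths.
TRIPLES = [
    ("/projects/alpha.rdf", "client", "Acme"),
    ("/projects/alpha.rdf", "status", "finished"),
    ("/projects/beta.rdf", "client", "Acme"),
    ("/projects/beta.rdf", "status", "active"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    pattern = (subject, predicate, obj)
    return [
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]


def resources_where(triples, **conditions):
    """Return sorted subjects that satisfy every predicate=value condition."""
    result = None
    for predicate, value in conditions.items():
        subjects = {s for s, _, _ in match(triples, predicate=predicate, obj=value)}
        result = subjects if result is None else result & subjects
    return sorted(result or [])


print(resources_where(TRIPLES, client="Acme", status="active"))
```

A real index would avoid the linear scan in `match`, which is where the "more effective indexes and keys" task comes in.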
Akara as a Filesystem
Unlike a database, Akara focuses on storing items in a hierarchical or tree based fashion. This is directly analogous to a filesystem. These tasks describe Akara's functionality in terms of parallels to a filesystem.
- Create containers recursively for collections of resources
- Allow defining a resource as "executable" such as XSLT
- Allow defining a resource as restricted based on ACLs
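The container and ACL tasks can be sketched with a toy tree. Everything here is an assumption made for illustration: the `Container` class, the `mkdir -p` style helper, and in particular the rule that the nearest explicit ACL on the path decides access, which may not be how Akara's ACLs actually behave.

```python
class Container:
    """Toy container node; children may be containers or resources (bytes)."""

    def __init__(self, acl=None):
        self.children = {}
        self.acl = acl  # None means "inherit from the parent"; else a set of users


def make_containers(root, path):
    """Create every missing container along path, like `mkdir -p`."""
    node = root
    for name in path.strip("/").split("/"):
        child = node.children.get(name)
        if child is None:
            child = node.children[name] = Container()
        node = child
    return node


def can_read(root, path, user):
    """Walk the path; the nearest explicit ACL on the way decides access."""
    node, allowed = root, root.acl
    for name in path.strip("/").split("/"):
        node = node.children[name]
        if isinstance(node, Container) and node.acl is not None:
            allowed = node.acl
    return allowed is None or user in allowed


root = Container()
invoices = make_containers(root, "/clients/acme/invoices")
invoices.acl = {"john"}
invoices.children["2009-01.pdf"] = b"invoice bytes"

print(can_read(root, "/clients/acme/invoices/2009-01.pdf", "john"))
print(can_read(root, "/clients/acme/invoices/2009-01.pdf", "eric"))
print(can_read(root, "/clients/acme", "eric"))
```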
One aspect that is left out of this list is the idea of symlinks. Since Akara is meant to be used on the web, we will assume that "symlinking" will be possible through traditional linking mechanisms and through providing multiple URLs that point to different representations of the resource. This is a subtle difference, but it is still effectively in line with how a filesystem behaves. For example, if you cd into a directory via a symlink, ".." is relative to how you entered the directory, even though the folder you cd'd into is essentially the same.
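The same ".." behaviour shows up with plain URL resolution, which is why links can stand in for symlinks. The sketch below uses only the standard library; the two URLs are hypothetical and simply assume a server that maps both of them to the same stored resource.

```python
from urllib.parse import urljoin

# Two hypothetical URLs that a server could map to the same stored resource.
canonical = "http://example.org/projects/alpha/notes.xml"
alias = "http://example.org/clients/acme/notes.xml"

# ".." is resolved against the URL the client used, not against the
# resource's "real" location in the store.
parent_via_canonical = urljoin(canonical, "..")
parent_via_alias = urljoin(alias, "..")

print(parent_via_canonical)
print(parent_via_alias)
```

The two parents differ even though the resource is the same, just as with a directory entered through a symlink.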
Akara as a Server
Akara provides an interface to its storage mechanism through a server. This is very similar to something like MySQL, but is in fact closer to CouchDB or to a combination of Amazon S3 and Amazon SimpleDB. For this reason Akara essentially provides a set of services that function within the scope of the storage features. The primary feature beyond basic resource serving is the application of XSLT to resources. More specifically, the ability to apply XSLT that has access to the store as a whole is what sets it apart from a simple XSLT transformation service.
Akara Resource Service
The resource service allows direct access to resources within the store based on the "path" of the item.
- Serve resources using the source document's correct content type
- Serve appropriate caching headers and honor conditional requests (ETag, Cache-Control, If-None-Match, If-Modified-Since, etc.)
- Enforce ACLs using established HTTP authentication schemes (401 Unauthorized, Digest/Basic Auth, WSSE, etc.)
- Provide collections of resources when the resource requested is a container (AtomPub Collection, directory listing)
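As a sketch of the content type and caching tasks above, the function below decides between a full response and a 304. It is not Akara code: the `serve` function is hypothetical, the content type is guessed from the path rather than taken from the source document's stored type, and If-None-Match is treated as a single tag rather than a list.

```python
import hashlib
import mimetypes


def serve(path, content, if_none_match=None):
    """Return (status, headers, body) for a GET of one stored resource."""
    # Guess the content type from the path; fall back to a generic binary type.
    content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    # A strong ETag derived from the bytes; any stable digest would do.
    etag = '"%s"' % hashlib.sha256(content).hexdigest()
    headers = [("ETag", etag), ("Cache-Control", "private, max-age=0")]
    if if_none_match == etag:
        # The client's cached copy is current: send no body.
        return "304 Not Modified", headers, b""
    headers.append(("Content-Type", content_type))
    headers.append(("Content-Length", str(len(content))))
    return "200 OK", headers, content


status, headers, body = serve("/docs/report.xml", b"<report/>")
etag = dict(headers)["ETag"]
status_again, _, body_again = serve("/docs/report.xml", b"<report/>",
                                    if_none_match=etag)
print(status, status_again)
```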
Akara Transformation Service
Currently Akara provides a robust transformation service through the use of query string parameters attached to a resource URL.
- Serve a resource with a specific XSLT applied to it
- Pass xsl:param values using HTTP variables
- Provide access to the store in the XSLT via extension elements and functions
This functionality could essentially be compared to the relationship PHP has with Apache. The XSLT effectively allows a resource-oriented means of creating dynamic pages in a deployed instance of Akara. This is somewhat different from building a web application using Akara as a library.
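To show what "query string parameters attached to a resource URL" might look like in practice, here is a small sketch that splits a request URL into the resource, the stylesheet and the remaining parameters. The `xslt` parameter name, the example URL and the `parse_transform_request` function are assumptions for illustration, not Akara's documented interface, and the transform itself is left out.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_transform_request(url, stylesheet_key="xslt"):
    """Split a request URL into (resource path, stylesheet, xsl params)."""
    parts = urlsplit(url)
    stylesheet = None
    params = {}
    for key, value in parse_qsl(parts.query):
        if key == stylesheet_key:
            stylesheet = value      # which XSLT resource to apply
        else:
            params[key] = value     # everything else becomes an xsl:param
    return parts.path, stylesheet, params


path, stylesheet, params = parse_transform_request(
    "http://example.org/projects/alpha.xml?xslt=/styles/summary.xslt&year=2009"
)
print(path, stylesheet, params)
```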
Akara Application Service
This facet of Akara does not exist yet, but is envisioned to be a canonical example of how to use the Akara library. It will use the storage repository and implement the specific URL interfaces provided by the existing 4SS. The essence of this is a WSGI application that handles the requests and dispatches to the appropriate services and WSGI middleware to return a response. In the world of frameworks this could be considered a thin controller for Akara. The purpose of this application service is to provide an end user view of the Resource Service values using the Transformation Service features.
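Since this service is still only an idea, the following is just one way the "thin controller" could look as plain WSGI. The two service functions are stand-ins that return text instead of touching a store, and the `xslt=` check and the `require_user` middleware are invented for the example.

```python
def resource_service(environ, start_response):
    # Stand-in for serving a stored resource as-is.
    body = b"resource: " + environ["PATH_INFO"].encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


def transform_service(environ, start_response):
    # Stand-in for applying an XSLT to a stored resource.
    body = b"transformed: " + environ["PATH_INFO"].encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


def dispatcher(environ, start_response):
    """Thin controller: choose a service based on the query string."""
    if "xslt=" in environ.get("QUERY_STRING", ""):
        return transform_service(environ, start_response)
    return resource_service(environ, start_response)


def require_user(app):
    """WSGI middleware: reject requests that carry no REMOTE_USER."""
    def wrapped(environ, start_response):
        if not environ.get("REMOTE_USER"):
            start_response("401 Unauthorized", [
                ("Content-Type", "text/plain"),
                ("WWW-Authenticate", 'Basic realm="store"'),
            ])
            return [b"authentication required"]
        return app(environ, start_response)
    return wrapped


application = require_user(dispatcher)


def call(app, **environ):
    """Invoke a WSGI app with a hand-built environ; return (status, body)."""
    seen = {}

    def start_response(status, headers):
        seen["status"] = status

    body = b"".join(app(environ, start_response))
    return seen["status"], body


print(call(application, PATH_INFO="/a.xml", QUERY_STRING="xslt=/s.xslt",
           REMOTE_USER="john"))
print(call(application, PATH_INFO="/a.xml", QUERY_STRING=""))
```

Because every layer is an ordinary WSGI callable, `application` could be handed to any WSGI server unchanged.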
Akara Storage Service
Akara currently uses XML-RPC (and SOAP to some extent) for providing an API to persist and query the store. Ideally this should be a RESTful interface to the store that provides the typical CRUD operations.
- Allow POSTing new items to the store for persistence
- Use HTTP headers as needed for metadata
- Allow POSTing metadata that defines where a resource can be pulled from
- Allow PUTting items for updates
- Allow DELETEing items
- Allow GETting items and containers as collections of items
This is relatively similar to AtomPub, with the biggest difference being a wider array of content types providing semantic information about the resources.
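The CRUD mapping above can be sketched without any HTTP machinery at all. This is a toy, in-memory stand-in, not the proposed service: the `StorageService` class, the numeric item names and the exact status codes are assumptions, and metadata headers are ignored.

```python
import itertools


class StorageService:
    """Toy in-memory store answering (method, path, body) requests."""

    def __init__(self):
        self.items = {}                  # path -> bytes
        self._ids = itertools.count(1)   # source of new item names

    def handle(self, method, path, body=b""):
        if method == "POST":
            # POST to a container creates a new item under it.
            new_path = "%s/%d" % (path.rstrip("/"), next(self._ids))
            self.items[new_path] = body
            return 201, new_path
        if method == "GET" and path.endswith("/"):
            # A container is returned as the collection of its item paths.
            return 200, sorted(p for p in self.items if p.startswith(path))
        if path not in self.items:
            return 404, None
        if method == "GET":
            return 200, self.items[path]
        if method == "PUT":
            # PUT only updates items that already exist.
            self.items[path] = body
            return 204, None
        if method == "DELETE":
            del self.items[path]
            return 204, None
        return 405, None


service = StorageService()
status, location = service.handle("POST", "/notes/", b"first draft")
service.handle("PUT", location, b"second draft")
print(service.handle("GET", location))
print(service.handle("GET", "/notes/"))
print(service.handle("DELETE", location))
print(service.handle("GET", location))
```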
Akara Clients
Akara currently provides a set of clients ranging from a command line client to direct access to the Python objects. Generally the clients should be able to perform the full suite of operations on the store, assuming the user has the credentials to do so on the specific resource.
It would be ideal from a development standpoint to have a single Python library that performs operations over the RESTful interface to the Storage Service. In this way the command line clients, XSLT interfaces and any remote Python scripts have access to the Akara repository utilizing HTTP. More speed could be attained by reducing layers of HTTP. For example, the server could accept request objects directly via socket and cPickle. This might be crazy talk, but the idea is that if Akara can effectively always work within the scope of a WSGI application in terms of interfaces, then the ability to use HTTP and web based paradigms can be the singular focus.
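A single client library of that kind could be little more than a thin wrapper that builds HTTP requests. The sketch below uses the standard library's `urllib.request`; the `StoreClient` class, its method names and the base URL are hypothetical, and the requests are only constructed, not sent.

```python
import urllib.request


class StoreClient:
    """Hypothetical client: one method per store operation, all over HTTP."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")

    def _request(self, method, path, body=None, content_type=None):
        headers = {"Content-Type": content_type} if content_type else {}
        return urllib.request.Request(
            self.base_url + path, data=body, headers=headers, method=method
        )

    def create(self, container, body, content_type):
        return self._request("POST", container, body, content_type)

    def update(self, path, body, content_type):
        return self._request("PUT", path, body, content_type)

    def delete(self, path):
        return self._request("DELETE", path)

    def fetch(self, path):
        return self._request("GET", path)


client = StoreClient("http://localhost:8080/store/")
req = client.update("/notes/1", b"<note/>", "application/xml")
print(req.get_method(), req.full_url)
```

Sending a request would then be a call to `urllib.request.urlopen(req)`, so the command line tool and remote scripts could share the same few methods.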
Comparisons to Other Systems
The essence of Akara most resembles that of the Bright Content store, which uses Atom as the metadata construct instead of RDF. The same concept of indexes and storage of all kinds of files is still present. One relatively large difference between BC and Akara is the use of AtomPub. Specifically, AtomPub supports paging of collections, which may not be entirely present in Akara's containers. Another difference due to AtomPub is that a collection representation carries the majority of the basic metadata for its entries. I'm not sure whether this is or is not the case in Akara. I'm also not sure if this pattern could or should be translated to Akara. The largest difference between BC and Akara comes in terms of interfaces. BC does not provide a client interface at all for manipulating it as a store. It expects AtomPub clients to be interoperable with its interface. Akara, on the other hand, has a very robust client tool in its command line client.
