Friday, January 2, 2009

Where are the REST frameworks?

I like REST. I’ll admit that I didn’t understand all of Dr. Roy Fielding’s Dissertation (he is, after all writing to experts in the field of API design) but I think that I’ve managed to understand enough of it that Rails’ implementation of REST leaves a lot to be desired. Rails is not alone, many services are releasing APIs that they are calling REST. Dr. Fielding recently addressed this in a post on his site. Like his thesis, he is talking to API designers, not to Joe and Jane Developer who needs to get the latest web site off the ground. Perhaps more practically, Subbu Allamaraju gave the issue of REST and hyperlinking a thorough exfoliation over at InfoQ.

If you read Subbu’s article and compare it to the standard Rails 2.x web application you will quickly see that difference. In a rails application, if you GET a JSON representation of our quintessential blog all you are going to be told is that the owner_id is 1232; your code has to know that Owners are Users and that they can be retrieved from http://api.example.com/users/{user_id}. This is like putting an <img img_id=123> in your html and expect the browser to know to retrieve the file from http://www.example/resources/images/123.

There is the comparison that most people who think they are writing REST APIs are forgetting; HTTP. If you are writing a REST API then you are writing something VERY much like HTTP. None of us are likely to need to create a REST API. I work for a financial institution and we might have a need to create a REST API like HTTP that is geared towards exchanging financial transactions between internal systems, our partners and our clients, but I suspect that would be unnecessary.

No most of us have no need for creating a new REST API. What we should be talking about is publishing RESTful media types. If I’m going to build the archetypical Blog application, then I need media types for a Blog, a Post, a Comment, a Tag, an Author and what ever else.

A Note: I’m making a distinction that I don’t think Dr. Fielding would make. Dr. Fielding makes it clear that the linking mechanism is central to a REST API, and since HTTP leaves much of the linking mechanism (but certainly not all) up to the user agent, a true REST API requires a media types. However, the vast majority of the work that we do is over HTTP so we have no need, in general, to worry about that half of a REST API. I think it greatly clarifies things if we simply talk about creating REST Media Types rather than APIs. I just noticed that I’m not the first one to realize this (after Dr. Fielding of course).

What about the Frameworks

I’ve been trying to find a Ruby framework that can handle REST Media Types right out of the gate. Waves, which its resource oriented architecture seems like a good place to start. They are right at the start of putting together what they call foundations for REST services. I’m writing this article in part to influence how they go about that process.

REST ≠ MVC

One of the biggest problems is that people are trying to bolt REST onto MVC frameworks. The problem is it is probably the wrong programming model to use. The controller for a REST service is absolutely trivial; you can only call a maximum of 4 methods on any given resource, and they are always handled the same, so the controller is completely anemic and should fall into the framework itself. Views are also the wrong paradigm since they only deal with out going representations; you really need some mechanism for specifying the transformation of a resources into a representation (that is its media type), and back. All that you really have left are the resources (assuming they’re like models) and the transformations.

REST and Your Web Application

I’m probably going to annoy a lot of people with this. A web browser (as they are currently implemented) is NOT a RESTful client. You can use XHR object supplied by a browser’s JavaScript engine as a REST client, but the browser itself interacting with plain old HTML cannot act RESTfully. In part, its because your browser won’t make PUT or DELETE requests (you have to fake it with extra parameters on a POST). Mostly its a matter of usability.

For a non-trivial use case, you are going to want to something wizard like. That means you are going to have to create a number of intermediate, transient representations to maintain that state. Its an affectation in this case. The transient representations serve no purpose but to maintain this fiction that its REST.

I honestly think that you should really simply leave the your REST API as an API and build a separate application to act as your User Interface. You’ll make it easier on yourself in the end

REST and MVC

Like I said earlier, REST is not MVC. But that doesn’t mean that its not related to MVC. MVC is very well suited to building a User Interface. As I said earlier, you should build a separate application for your User Interface. Well, there is no reason why you can’t use a Resource as the Model in your MVC application. Rails already does this; ActiveResource allows you to use a REST Resource instead of a database table for your model.

This is where things get really interesting.

If you are already using resource representations for the models in your MVC user interface, why limit yourself to a single domain? There is no reason why you couldn’t use multiple domains. This allows for what I call DMVC: Distributed Model View Controller. By distributing your models over a number of applications (via REST media types) you can get get some great long term efficiencies by the serendipitous, and unplanned, reuse of REST APIs.

For example, if you have a non-trivial content management system, you are going to want to move your content changes from the author to an editor to perhaps a legal reviewer and maybe finally to a marketing manager before those changes are “live”. At the same time, you may have defect tracking system where defects move from reported, through any number of stages to completed. Traditionally, you would include something like acts_as_statemachine in each of your rails apps to get this functionality.

But if you apply the REST architectural style, you can create 3 independent applications: Content Management, Defect Management, Workflow. The Workflow applciation would have something like a State resource which would be associated with some other resource without knowing anything about that resource. If the business logic was sufficiently complex (for instance, a change in the State resource would change the associated resource) you could create another application that combined a State resource with a (let’s say) Defect resource to expose StatefulDefect resources.

Also, because you have decoupled your the workflow from both your CMS and your defect tracking, you can reuse both to create a project management application where the content is not work flowed but managed wiki style and defects are not workflowed rather raised and closed since you have hourly builds to the project server (or what ever).

We already know how to optimize and scale web servers, and the same can be done here. Or, if this is a purely internal set of services (or even largely) you could create a more efficient connection mechanisms/schemes than HTTP (I’m not sure what those would be). Or you might try more efficient media types (for example using JSON or even a binary format).

REST, Associations & ORM

One of the more valuable parts of Rails is the ActiveRecord ORM. It allows you to say that a Blog belongs_to a User and a User has_many blogs. Associations between these objects are by ID as they refer to other records in the database.

But REST blows that apart. Your Blog and your User are no longer stored in the same database. We refer to every unique resource by a URL so it may live somewhere totally unaccessible to us. If I have blogs and you have users, I can say that my blog belongs to a user, but I cannot expect your users will know anything about my blogs. Now perhaps the representation of the user that you give me includes a list of blogs that the user owns, in which case I can PUT back a representation that includes the URI of my Blog, but my code cannot rely on that information being known.

You also get some interesting behaviours.

All you have is a URI. You have NO idea what that resource is. As a result, its conceivable that I might try to create a blog that is “owned” by an image. Or the URI point to a resource that is a legitimate owner of the blog but its media types are not those that the blog application understands. For the most part, that’s OK because the blog application is really just an API, its not our User Interface. The User Interface application will need to have coping mechanisms for this but I’m not sure that’s a problem. If all you’re producing is an API, then its not even your problem.

More importantly, you get all sorts of strangely compelling polymorphism. Its liberating not to know what you’re associated with since it guarantees that you’ll become associated with all sorts of unexpected things.