Creating a web service for public consumption can be difficult. This post is an overview of the decisions that have gone into the next release of Iron Money’s API, drawn from my past two years of experience. This is not a guide to architecting an API to match the section about REST in Fielding’s dissertation; it is a look at the decisions I have made while designing a web service for real-world use.
Why REST?
REST, SOAP, and XML-RPC are often unfairly compared to one another. REST is an architectural style for APIs while SOAP and XML-RPC are protocols; it would be fairer to compare HTTP, SOAP, and XML-RPC, because HTTP is the protocol on which RESTful APIs are usually built. SOAP and XML-RPC certainly have their benefits (mainly because of the tools built to work with SOAP and XML-RPC APIs), but the simplicity of designing a web service with RESTful principles in mind makes it easier to get an API up and running over HTTP.
I highly encourage you to use your own API for whatever you’re building: a web app, a mobile app, etc. Using it as the base for what you’re building can provide a stable foundation (your API is going to be well documented and tested, right?). This is something you should do for any API, but both SOAP and XML-RPC can bring additional complexity, dependencies, and overhead to building on top of an API.
Resource design
A RESTful API exposes resources for clients to use. For example, an API might expose a list of accounts at the URI https://example.com/accounts/ (/accounts/ for short).
A resource doesn’t need to have a logical name or structure; it could simply be a random string or number if you wanted. However, giving resources a logical structure helps people grasp the API more easily. For example, your API could make a single account addressable at /accounts/account-id/.
Finding a logical structure for resources won’t always be easy. When designing the resources for a search feature, you might be able to get away with something as simple as /books/phrase/, but if you want to let people search for books published sometime after the year 2000, you will quickly find yourself between a rock and a hard place.
My preferred method to allow for searching within a resource is to append parameters to the resource. For example, to perform the above hypothetical book search, I might make the resource /books/?published-date-min=2001.
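As a rough sketch of the query-parameter approach, here is how a client and a server might build and read such a URI in Python. The function names build_search_uri and parse_search_uri are placeholders I made up for illustration; only the published-date-min parameter comes from the example above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_uri(resource, **filters):
    """Append search filters to a resource path as query parameters."""
    # Python keyword names cannot contain hyphens, so underscores are
    # converted to the hyphenated parameter style used in this post.
    params = {name.replace("_", "-"): value for name, value in filters.items()}
    query = urlencode(params)
    return f"{resource}?{query}" if query else resource


def parse_search_uri(uri):
    """Split a URI into its resource path and a dict of search filters."""
    parts = urlsplit(uri)
    # parse_qs returns a list per parameter; this sketch keeps the first value.
    filters = {name: values[0] for name, values in parse_qs(parts.query).items()}
    return parts.path, filters


uri = build_search_uri("/books/", published_date_min=2001)
path, filters = parse_search_uri(uri)
```

The resource itself stays /books/; the parameters only narrow what comes back, so adding another filter later does not require a new URI structure.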
I highly recommend making anything that resembles a resource addressable as one. For example, if a book has a list of tags associated with it, create a /books/book-id/tags/ resource and make each tag addressable at /books/book-id/tags/tag-id/. This will make it easier to change data and keep track of the changes; the hows and whys will become apparent in the rest of the post.
DELETE, GET, POST, and PUT methods
HTTP uses the verbs DELETE, GET, POST, and PUT to perform actions on resources. The above examples would have all used GET to get the resources.
POST would be used for adding a resource. For example, one might make a POST request to /books/ to add a new book. The message body would include the data for the new resource.
PUT would be used for editing a resource. For example, one might make a PUT request to /books/book-id/ to edit a book. The message body would include the new data for the resource.
DELETE would be used for deleting a resource. For example, one might make a DELETE request to /books/book-id/ to remove the book from the /books/ resource.
Giving each resource its own URI matters partly because it makes the above methods easier to apply. In my previous example, I created a new resource for a single tag of a book (/books/book-id/tags/tag-id/). If I wanted to remove the tag from the book, I could simply DELETE /books/book-id/tags/tag-id/. Without a separate resource for a book’s tags, I would have needed to PUT /books/book-id/ and provide some way to indicate that a tag should be deleted (or resend the entire tag list).
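To make the mapping between methods and resources concrete, here is a minimal in-memory sketch in Python. It is not how Iron Money’s API is implemented; the Library class, its handle method, and the choice of PUT to attach a tag are all assumptions made for illustration. Responses are (status code, body) pairs.

```python
class Library:
    """Toy dispatcher for /books/, /books/<id>/ and /books/<id>/tags/<tag>/."""

    def __init__(self):
        self.books = {}  # book id -> {"title": str, "tags": set of str}
        self._next_id = 1

    def handle(self, method, path, body=None):
        parts = [p for p in path.split("/") if p]
        if not parts or parts[0] != "books":
            return 404, None
        if len(parts) == 1:
            return self._collection(method, body)
        book = self.books.get(parts[1])
        if book is None:
            return 404, None
        if len(parts) == 2:
            return self._book(method, parts[1], body)
        if len(parts) == 4 and parts[2] == "tags":
            return self._tag(method, book, parts[3])
        return 404, None

    def _collection(self, method, body):
        if method == "GET":
            return 200, sorted(self.books)
        if method == "POST":  # add a new book to the collection
            book_id = str(self._next_id)
            self._next_id += 1
            self.books[book_id] = {"title": body["title"], "tags": set()}
            return 201, book_id
        return 405, None

    def _book(self, method, book_id, body):
        if method == "GET":
            return 200, self.books[book_id]
        if method == "PUT":  # edit an existing book
            self.books[book_id]["title"] = body["title"]
            return 200, self.books[book_id]
        if method == "DELETE":
            del self.books[book_id]
            return 204, None
        return 405, None

    def _tag(self, method, book, tag):
        if method == "PUT":  # assumption: PUT attaches the tag to the book
            book["tags"].add(tag)
            return 204, None
        if method == "DELETE":
            if tag not in book["tags"]:
                return 404, None
            book["tags"].remove(tag)
            return 204, None
        return 405, None
```

Because the tag has its own URI, removing it is a single call such as handle("DELETE", "/books/1/tags/memoir/"), with no need to resend the book.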
Status codes
Status codes let clients know the status of a response. The most commonly known response code is 404 Not Found, which an API might return if a client tries to access /books/123/ and book 123 couldn’t be found.
You can find a full list of status codes and their descriptions in the HTTP/1.1 specification. It’s not worth going into more detail about the HTTP status codes because the specification is fairly clear about when to use each code.
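If you are working in Python, the standard library already knows the codes and their reason phrases, so there is no need to hard-code them. The two helper functions below are hypothetical names of my own; HTTPStatus is the real standard-library enum.

```python
from http import HTTPStatus


def status_line(code):
    """Return the numeric code with its standard reason phrase."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


def is_client_error(code):
    """4xx codes mean the client's request was at fault."""
    return 400 <= code < 500


print(status_line(404))  # 404 Not Found
```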
ETag and Last-Modified
ETags, or entity tags, are identifiers for a specific version of a resource. ETags serve a number of useful purposes. They are handy for caching; whenever a resource is changed, the ETag should change as well. They can also be used for conditionally getting or modifying resources.
The If-Match header can be included by the client to only perform a method on a resource if the resource’s current ETag matches the ETag provided by the client. For example, a client might make a PUT request to a resource and include If-Match: "1" in the request to indicate that the PUT should not be executed unless the resource has a current ETag of "1".
The If-None-Match header is similar to the If-Match header except that the request will only be executed if the ETag does not match the resource’s current ETag. For example, a client may make a GET request to a resource and only want the results if the ETags don’t match (which would indicate that the resource changed).
The Last-Modified header can be included in a response to indicate the date and time that a resource was last modified. It’s very similar to an ETag, except that it’s less precise (since it’s only precise to a single second). You can use If-Modified-Since and If-Unmodified-Since to make conditional requests similar to those made with If-None-Match and If-Match.
I highly recommend supporting the ETag and Last-Modified headers; they can make client applications more efficient and lighten the load on your servers if used correctly.
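Here is a simplified sketch in Python of how a server might evaluate these conditional headers. It assumes one strong ETag per request and ignores the * wildcard, comma-separated lists, and weak validators, all of which a real implementation would need to handle. Deriving the ETag from a hash of the body is just one possible choice, and the function names are placeholders.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(current_etag, if_none_match=None):
    """Return 304 when the client's copy is still current, else 200."""
    if if_none_match is not None and if_none_match == current_etag:
        return 304  # Not Modified: no body needs to be sent
    return 200


def conditional_put(current_etag, if_match=None):
    """Return 412 when the resource changed since the client last saw it."""
    if if_match is not None and if_match != current_etag:
        return 412  # Precondition Failed: the PUT is not executed
    return 200
```

A client that cached a resource sends its stored ETag in If-None-Match and gets a cheap 304 back when nothing changed; a client that edits a resource sends the ETag in If-Match so it cannot silently overwrite someone else’s change.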
Format
If your API is available in only one format (such as just XML), then you might not worry about including a format in your API. However, if you do plan to have multiple formats available, you’ll need to include the format in the resource URI.
Choosing where to put your format information is really a personal choice. You might decide to have it at the root of your resources (e.g. /json/books/), or append it as a parameter (e.g. /books/?format=json), or include it as an extension (e.g. /books.json).
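Whichever placement you pick, the server has to separate the format from the resource before routing the request. The Python sketch below recognises all three styles purely for illustration; a real API would normally commit to one. The split_format name, the KNOWN_FORMATS set, and the JSON default are assumptions.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit

KNOWN_FORMATS = {"json", "xml"}  # assumption: the formats this API offers


def split_format(uri, default="json"):
    """Return (resource path, format) for any of the three URI styles."""
    parts = urlsplit(uri)
    path = parts.path

    # Style 1: query parameter, e.g. /books/?format=json
    fmt = parse_qs(parts.query).get("format", [None])[0]
    if fmt in KNOWN_FORMATS:
        return path, fmt

    # Style 2: root prefix, e.g. /json/books/
    segments = path.strip("/").split("/")
    if segments[0] in KNOWN_FORMATS:
        return "/" + "".join(s + "/" for s in segments[1:]), segments[0]

    # Style 3: extension, e.g. /books.json
    stem, ext = posixpath.splitext(path)
    if ext[1:] in KNOWN_FORMATS:
        return stem + "/", ext[1:]

    return path, default
```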
Versioning
Like formatting, if you don’t version your API, you probably won’t worry about including a version number in your resource URIs. However, I highly suggest versioning your API for your own sanity; if you don’t, you’ll probably be worried about making backwards-incompatible changes, which can stagnate your API’s development.
Like the format parameter, choosing where to put your versioning information is really a personal choice. You might decide to have it at the root of your resources (e.g. /v1/books/), or append it as a parameter (e.g. /books/?v=1).
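Extracting the version works much the same way. This sketch assumes integer versions and a default of version 1 when none is given; split_version is a made-up helper name.

```python
import re
from urllib.parse import parse_qs, urlsplit


def split_version(uri, default=1):
    """Return (resource path, version) for /v1/books/ or /books/?v=1."""
    parts = urlsplit(uri)

    # Root prefix style: /v1/books/
    match = re.match(r"^/v(\d+)(/.*)$", parts.path)
    if match:
        return match.group(2), int(match.group(1))

    # Query parameter style: /books/?v=1
    values = parse_qs(parts.query).get("v")
    if values and values[0].isdigit():
        return parts.path, int(values[0])

    return parts.path, default
```

Once the version is a plain integer, the server can keep the old behaviour behind version 1 while backwards-incompatible changes go into version 2.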
Linking to resources
Linking to resources is a RESTful concept that some APIs support. The idea is to include the URIs to resources so that clients don’t need to hard-code any resource URIs into their applications. For example, a client might make a request to / to get a list of resources and discover that the books resource is at /books/.
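A minimal sketch of that discovery step in Python might look like the following. The shape of the root document (a "links" object keyed by resource name) is an assumption of mine; there are several competing conventions for how links are represented.

```python
import json

# Hypothetical body returned by a GET request to /
ROOT_DOCUMENT = json.dumps({"links": {"accounts": "/accounts/", "books": "/books/"}})


def discover(root_body, name):
    """Look up a resource URI by name instead of hard-coding it."""
    links = json.loads(root_body)["links"]
    return links[name]


books_uri = discover(ROOT_DOCUMENT, "books")
```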
I am not a fan of including the resource URIs in the responses because I think it feels like offloading a problem for the API provider onto the client. The API provider should probably not break URIs for clients, especially when redirecting can be so easy to implement. Additionally, I don’t like the overhead of including URIs in responses.
Including links to resources may be the right thing to do for your web service, but it’s my opinion that doing so can be wasteful.
Multi-resources
REST is not the end-all of API designs. If you would like to allow a client to edit multiple resources at a single time, they can send multiple PUT requests to multiple resources to accomplish the task. However, if you want to provide any sort of atomicity in your API, you’ll have to pull some tricks to make it work.
One option is to allow clients to open transactions with your API. One possible way of doing this is to provide a /transactions/ resource to create a transaction, then perform the PUT requests, and then close the transaction with a final PUT to /transactions/transaction-id/, but this isn’t a very elegant solution (how long does a client have to end the transaction before everything is rolled back?).
Another way of providing the same functionality is to provide a “multi” resource that a client can send all of their changes to. For example, one might make a /multi/ resource that accepts the following JSON:
{
  "resources": {
    "/books/": {"isbn": "978-0684833637", "method": "POST", "title": "A Moveable Feast"},
    "/books/978-0-7432-7356-5/": {"method": "PUT", "title": "The Great Gatsby"}
  }
}
This isn’t a very satisfactory method of solving the problem because it isn’t very RESTful, but it’s a messy problem that I haven’t been able to find a good solution for.
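For what it’s worth, the all-or-nothing behaviour is easy to sketch on the server side. The Python below applies the changes to a copy of the data and only swaps the copy in if every change succeeds. The store is a plain dict keyed by URI, and using the ISBN to build the URI of a newly POSTed book is an assumption for this sketch; a real service would lean on its database’s transactions instead.

```python
import copy


def apply_multi(store, payload):
    """Apply every change in payload to store, or none of them."""
    working = copy.deepcopy(store)  # changes go to a copy first
    for uri, change in payload["resources"].items():
        method = change["method"]
        fields = {k: v for k, v in change.items() if k != "method"}
        if method == "POST":
            # Assumption: a new book's URI is derived from its ISBN.
            new_uri = f"{uri}{fields['isbn']}/"
            if new_uri in working:
                raise ValueError(f"{new_uri} already exists")
            working[new_uri] = fields
        elif method == "PUT":
            if uri not in working:
                raise KeyError(f"{uri} does not exist")
            working[uri].update(fields)
        else:
            raise ValueError(f"unsupported method {method}")
    # Only reached if every change succeeded: commit the copy.
    store.clear()
    store.update(working)
    return store
```

If any change raises, the original store is left untouched, which is the atomicity that separate PUT requests cannot provide.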
Authentication and Authorization
Authentication and authorization might be something you need to handle in your API. Authentication is about identification while authorization is about access. Your API may be completely public (everyone is authorized) or have private resources (in which case you need to identify a client and check whether or not they can perform their request).
Authentication is taken care of by HTTP with Basic access authentication or Digest access authentication. Authorization, on the other hand, is something that needs to be handled by your application.
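To show what Basic access authentication actually puts on the wire, here is a short Python sketch that builds and parses the Authorization header value. The helper names are my own. Note that the credentials are only Base64-encoded, not encrypted, so Basic authentication should be used over HTTPS.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of the Authorization header for Basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def parse_basic_auth(header):
    """Recover (username, password) from an Authorization header value."""
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    decoded = base64.b64decode(token).decode("utf-8")
    # The username cannot contain a colon, so split on the first one.
    username, _, password = decoded.partition(":")
    return username, password
```

The server still has to look the user up and decide what they may access; that second step is the authorization part.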
I highly recommend using OAuth as the basis for authorization in your application. While it has its faults, it does the job fairly well, has support from a number of large companies, and has a number of tools already built around it. While you could roll your own authorization protocol, using OAuth is probably a better choice.
More
I highly recommend watching Google’s Intro to REST and reading RESTful Web Services, a book about creating RESTful web services.