Archive for the 'Web Services' Category

More on Using ETags and UUIDs for Internal Caching

A while back, I wrote about ETags and how they could be used within a web application for caching. Since then, I’ve had some time to work with using ETags and UUIDs in two applications—Iron Money and a yet-to-be-launched site.

Both of these sites use ETags and UUIDs for their resources; when a resource is created, it’s given a UUID and an ETag; when it’s updated, the ETag is changed. They also use Memcached for caching resources; when a resource is created, the Memcached key is the UUID and ETag; when the resource is updated, the resource is simply stored again. Memcached already throws out objects that haven’t been fetched recently, so the old resources eventually fall out of the cache.

When a resource is requested, both sites simply look up the ETag in the database and then check the cache to see if the resource is already stored. If it’s not, then they go through the regular process of fetching the resource from the database. This is particularly efficient because the database can easily have an index on the UUID and ETag, making it very cheap (less than a millisecond) to look up the most recent ETag before fetching from the cache.
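To make the lookup concrete, here is a minimal Python sketch of the flow described above. It is not Iron Money's actual code: the three dictionaries stand in for the database's ETag index, the full database rows, and Memcached, and the function names (cache_key, save, fetch) are made up for illustration.

```python
import uuid

# Stand-ins (assumptions): a dict for the database's UUID -> ETag index,
# a dict for the full database rows, and a dict for Memcached.
etag_index = {}
database = {}
cache = {}

def cache_key(resource_id, etag):
    # The key combines the UUID and the ETag, so each version gets its own entry.
    return f"{resource_id}:{etag}"

def save(resource_id, etag, data):
    """Create or update a resource: write the row, record the ETag, cache it."""
    database[resource_id] = data
    etag_index[resource_id] = etag
    cache[cache_key(resource_id, etag)] = data

def fetch(resource_id):
    """Look up the current ETag first, then try the cache, then the database."""
    etag = etag_index[resource_id]      # the cheap, indexed lookup
    key = cache_key(resource_id, etag)
    if key in cache:
        return cache[key]
    data = database[resource_id]        # the regular, slower path
    cache[key] = data
    return data

account_id = str(uuid.uuid4())
save(account_id, "etag-1", {"name": "Checking"})
save(account_id, "etag-2", {"name": "Joint Checking"})
current = fetch(account_id)  # read from the cache under the "etag-2" key
```

Nothing ever deletes the "etag-1" entry here; in a real cache it would simply stop being requested and eventually be evicted.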

For resources that are composed of other resources (say, a list of accounts), the process is the same; the ETag for the main resource is fetched (the /accounts/ ETag), and Memcached stores a list of UUIDs for the resource (all of the account UUIDs). Then, each resource is fetched individually with the same process (get the ETag, check Memcached for the resource). This means that the resources are not stored multiple times in the cache, but the entire list of resources can still be cached.

Overall, this system works very well; most resources stay in cache and there are no concerns regarding ACID compliance. However, one issue that I have run into is with the aforementioned lists of resources. For small lists of resources, the system works very well; it takes about a millisecond to get the ETag for the list of resources, and then takes about a millisecond for each resource.

This, however, does not work well with very large lists of resources. Even with everything in cache, a millisecond for each resource in a list can take too long if there are 100 or more resources to fetch from the cache.

So what’s the solution? If you’re willing to trade memory for speed, then there’s a rather simple solution: store the ETag for each resource in the cached list, alongside its UUID. With both the UUID and the ETag in the list, you already have every cache key you need and can skip the per-resource ETag lookup, which makes everything faster.
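Here is a rough Python sketch of that idea, with a plain dictionary standing in for Memcached. The "uuid:etag" key format and the function names are assumptions for illustration. It also assumes the list's own ETag changes whenever one of its members changes, so a cached list never points at stale member ETags.

```python
# Stand-in for Memcached (assumption); keys are "uuid:etag" strings.
cache = {}

def store_list(list_id, list_etag, members):
    """Cache a list as (uuid, etag) pairs instead of bare UUIDs."""
    cache[f"{list_id}:{list_etag}"] = [(m_id, m_etag) for m_id, m_etag in members]

def fetch_list(list_id, list_etag):
    """Return the cached members after a single ETag lookup for the list itself."""
    pairs = cache.get(f"{list_id}:{list_etag}")
    if pairs is None:
        return None  # the caller falls back to the database
    keys = [f"{m_id}:{m_etag}" for m_id, m_etag in pairs]
    # Many Memcached clients can fetch several keys in one round trip;
    # the loop below only imitates that with a dict.
    return [cache.get(key) for key in keys]
```

Any member that has fallen out of the cache comes back as None, so the caller still needs a fallback for individual resources.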

Implementing Basic Text Search

I added text search to Iron Money in early March and had a few requirements: I wanted to implement the AND and OR operators, I wanted to support quotes for exact matches, I wanted to support a “not” operator (“-”), I didn’t want any stop words, and I needed to be able to search encrypted text. Here are some examples of queries I wanted to support:

  • a b [both a and b]
  • a or b [a or b]
  • “a b” [the string “a b”]
  • -a b [strings that contain b but not a]

I decided that implementing my own text search would be my best option with those requirements. I wasn’t able to find any good articles about implementing text search, so I’ve decided to write up my thoughts here. I can’t guarantee that this is the best solution for you, but given the above constraints it was the best option for Iron Money.

The first thing I do is parse the string like a CSV with PHP’s str_getcsv(). str_getcsv() is a function for parsing CSV strings; I set the delimiter to the space character and leave the enclosure at its default, the double quote ("). If you’re writing this in another language, any CSV parser should do the trick; just make sure it splits the search text by spaces and respects quoted values.
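As an example of the "any CSV parser" route, here is how the same split might look in Python using the standard csv module. The function name split_query is made up, and dropping empty fields is my own choice for handling repeated spaces.

```python
import csv

def split_query(query):
    """Split a search string on spaces while keeping "quoted phrases" together."""
    rows = csv.reader([query], delimiter=" ", quotechar='"')
    # Consecutive spaces produce empty fields, so drop them.
    return [field for field in next(rows) if field]
```

Note that this parser only treats a quote as an enclosure when it opens a field, so a negated phrase such as -"a b" still comes back in two pieces and needs to be rejoined afterwards.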

After I get an array of strings with str_getcsv(), I run through that array and add each string to a brand new array (we’ll call it $parts). As I run through the strings, I check to see if the string starts with -" (a hyphen followed by a double quote); this indicates that the next string in the array is going to be part of the same NOT value. Once I get to the next string, I append the rest of the NOT value to the previous string that was added to the $parts array and continue with the rest of the array. We end up with $parts as a flat array of strings.

With $parts in hand, I check each string in $parts to find if any of the strings are the word “or”. If the string is “or”, then I know that the string before and after the “or” belong to that “or”; thus, both of them are added to a special $or_conditions array, and I remove the three strings from the $parts array.

At this point we have two arrays: $parts has all of our AND conditions, while $or_conditions has all of our OR conditions. My goal, however, is to have an array of all of the conditions that a string must meet; this includes values that it needs to contain, values that it should not contain, and a list of OR conditions which the string must meet. Thus, I create a $conditions array which will contain all of the conditions a string must meet to match the search.

We continue along by creating a new array ($condition) with two values: a $contains array and a $does_not_contain array. We run through all of the strings in $parts; remember, all of the strings in the $parts array are AND conditions. We check each string in $parts to see if it starts with the “-” character; if it does, we put the string in $does_not_contain; otherwise, we put the string in $contains. After running through all of the strings in $parts, we’ll end up with a single $condition array that we add to the $conditions array; it represents all of the AND conditions.

Now we move on to the $or_conditions array. For each of the $or_conditions, we’ll create a new $condition that has a $contains array, a $does_not_contain array, and a value called “operator” that indicates that this is an OR condition. We use the same process as before to modify the $condition array, and then add it to the $conditions array.

At this point in time, you have a $conditions array that has a list of conditions a string must meet to match the search. Each value of the $conditions array has a $contains and $does_not_contain array, and it might have an “operator” value to indicate if it’s an OR condition.
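The steps above can be sketched in Python roughly as follows. This is an illustration rather than Iron Money's PHP: the function names are invented, "or" is matched case-insensitively as an assumption, the leading "-" is stripped before a term goes into does_not_contain, and chained ORs (a or b or c) are not handled.

```python
def merge_not_phrases(tokens):
    """Rejoin a negated phrase such as -"a b" that was split into '-"a' and 'b"'."""
    parts, in_phrase = [], False
    for token in tokens:
        if in_phrase:
            parts[-1] += " " + token
        elif token.startswith('-"'):
            parts.append(token)
            in_phrase = True
        else:
            parts.append(token)
            continue
        # A closing quote ends the negated phrase: drop the quotes, keep the "-".
        if len(parts[-1]) > 2 and parts[-1].endswith('"'):
            parts[-1] = "-" + parts[-1][2:-1]
            in_phrase = False
    return parts

def split_or(parts):
    """Pull each `x or y` triple out of parts; what is left are the AND terms."""
    and_terms, or_pairs = [], []
    i = 0
    while i < len(parts):
        if i + 2 < len(parts) and parts[i + 1].lower() == "or":
            or_pairs.append((parts[i], parts[i + 2]))
            i += 3
        else:
            and_terms.append(parts[i])
            i += 1
    return and_terms, or_pairs

def to_condition(terms, operator=None):
    """Sort terms into contains / does_not_contain, optionally marked as OR."""
    condition = {"contains": [], "does_not_contain": []}
    for term in terms:
        if term.startswith("-") and len(term) > 1:
            condition["does_not_contain"].append(term[1:])
        else:
            condition["contains"].append(term)
    if operator:
        condition["operator"] = operator
    return condition

def build_conditions(tokens):
    """Return one AND condition followed by one condition per OR pair."""
    and_terms, or_pairs = split_or(merge_not_phrases(tokens))
    conditions = [to_condition(and_terms)]
    conditions += [to_condition(pair, operator="or") for pair in or_pairs]
    return conditions
```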

Now, we get all of the potential candidates to search through. In Iron Money’s case, we have an array of transactions that we need to sift through; if the “name” or “description” attribute matches the text search, then the candidate is returned in the search results.

To determine whether or not a value matches the search conditions, we run through the $conditions array that we created earlier. For each $condition, we check to see if the candidate strings match the $condition—if the strings have the $contains text and don’t have the $does_not_contain text. If the candidate is a match, we add it to the search results.
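A matching pass over the candidates might look like the Python sketch below. It assumes each condition is a dictionary with "contains" and "does_not_contain" lists plus an optional "operator" of "or", that matching is a case-insensitive substring test, and that a transaction is a hit when either its name or its description satisfies every condition. Those details are my assumptions, not a description of Iron Money's code.

```python
def matches(text, conditions):
    """Return True if text satisfies every condition in the list."""
    haystack = text.lower()
    for condition in conditions:
        results = [term.lower() in haystack for term in condition["contains"]]
        results += [term.lower() not in haystack
                    for term in condition["does_not_contain"]]
        if condition.get("operator") == "or":
            satisfied = any(results)   # OR: one satisfied term is enough
        else:
            satisfied = all(results)   # AND: every term must be satisfied
        if not satisfied:
            return False
    return True

def search(transactions, conditions):
    """Keep the transactions whose name or description matches."""
    return [t for t in transactions
            if matches(t["name"], conditions)
            or matches(t["description"], conditions)]
```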

Manually implementing text search can be a real pain; I think it’s almost always better to use existing, proven solutions for text-matching needs. However, if you’re in a situation like I was with Iron Money, hopefully the above will be useful when implementing a robust text-matching algorithm.

More on ETags

In a previous post, I mentioned how including the ETag and Last-Modified headers in an API can be a real benefit to third-party clients. What I didn’t mention is how ETags can provide a large benefit inside your API as well. This post is about implementing ETags in your application and using them to make your application more efficient.

ETags, or entity tags, are identifiers for a resource. In short, they are similar to version identifiers for resources; when a resource changes, its ETag changes. When a client GETs a resource, they can store the associated ETag; then, the next time they GET the resource, they can ask for the resource only if it has changed; or, if they fancy, the client could even make a request that should only be executed if they have the latest ETag for a resource!

Strong and weak

One thing I didn’t mention in my original post is that there are two different kinds of ETags: strong and weak ETags. A strong ETag changes whenever a resource changes at all; if any bit in the resource changes, a new ETag is required. Weak ETags, on the other hand, change whenever the resource semantically changes; a resource won’t necessarily have a new ETag if it means the same thing as its previous version.

So how should you generate your ETags? Strong ETags would probably best be generated right after the message body is generated or if entire message bodies are stored by the API; alternatively, for a more effective use of ETags, you could regenerate them whenever the resource changes, although that might be an expensive solution. [In practice, what Iron Money does is create a hash from the data and then store the hash with the data in MySQL. Your mileage may vary.]

Weak ETags, on the other hand, could be generated from the internal representation of resources in your API. Whenever the resource is changed, simply generate the weak ETag and store it along with the resource’s data.
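To illustrate the difference, here is one possible way to generate each kind in Python. The choice of SHA-256 and of canonical JSON as the "internal representation" are assumptions for the example; the W/ prefix is how HTTP marks a weak ETag.

```python
import hashlib
import json

def strong_etag(body: bytes) -> str:
    """Hash the exact bytes of the message body; any changed bit changes the tag."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'

def weak_etag(resource: dict) -> str:
    """Hash a canonical form of the internal data, ignoring formatting differences."""
    canonical = json.dumps(resource, sort_keys=True, separators=(",", ":"))
    return 'W/"' + hashlib.sha256(canonical.encode("utf-8")).hexdigest() + '"'
```

With this sketch, two serializations of the same data that differ only in whitespace get different strong ETags, while the weak ETag stays the same as long as the underlying values do.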

If you provide a resource that lists a bunch of child resources (e.g. a /books/ resource that lists all of the book resources), I highly recommend creating an ETag for the resource from all of the ETags of the child resources; this way, you don’t need to generate the entire list when one of the child resources changes.
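One way to do that, sketched in Python: hash the child ETags together in a stable order. The function name and the use of SHA-256 are assumptions; the point is only that the list's ETag can be computed without rendering the list.

```python
import hashlib

def collection_etag(child_etags):
    """Derive a list resource's ETag from its children's ETags alone."""
    digest = hashlib.sha256()
    for etag in child_etags:
        digest.update(etag.encode("utf-8"))
        digest.update(b"\n")  # separator so ["ab", "c"] and ["a", "bc"] differ
    return '"' + digest.hexdigest() + '"'
```

Because the children are hashed in order, reordering the list also changes its ETag; sort the child ETags first if order should not matter.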

Generating ETags for resources that are searchable can be problematic. You could generate the list of child resource ETags and create the ETag from that; however, this requires you to perform the search even when the request is conditional. Another way of handling the problem is to use the same ETag for the search as you would use for the entire resource (e.g. the ETags for /books/ and /books/?author=Steinbeck would be the same). I recommend doing the latter; it saves you time and shouldn’t hinder clients’ use of your API.

Using ETags internally for caching

The benefits of ETags to third-party clients are rather obvious, but how can ETags be used within your application?

If you’re using a caching system (like Memcached), you’re probably storing your resources by their UUID or by some sort of resource and primary key identifier. Whenever a resource is created, you put it in the cache; whenever a resource is updated, you delete it from the cache or update the cache, and whenever a resource is deleted, you delete it from the cache.

This might work well for your application. However, cache invalidation can be difficult, especially if your application requires ACID capabilities.

ETags can help your application with both of these issues. If you store each resource by its primary identifier and its ETag, you can always be sure that each cache lookup will only return the version of the resource you want. For example, if you’re simply getting a resource, you can look up its ETag in your database, then retrieve the entire resource from the cache and be sure that, if it’s in the cache, then it’s the correct version that you want. Additionally, you don’t have to worry about cache invalidation, because there is no invalidation—you simply let your cache system remove the old versions of resources as the cache fills up.

ETags: because they’re good for the Internet

ETags are surprisingly powerful once you start to make the most of them. They help HTTP clients cache resources and make conditional requests, and they can help your application with its caching needs as well.

Iron Money Public Beta

I’ve been working on a personal finance web app called Iron Money for the past couple of years. In September 2008 I released it as a private beta; you could sign up and use the API, but to actually use the web app you’d need my approval. Last week, I made the Public Beta available for everyone to use—no special approval, just sign up and go.

This post is an in-depth look at the kind of decisions that went into making the current version of Iron Money.

Simplify

I focused on making this release as simple as possible. I wanted a simple, well-tested foundation for future versions of Iron Money; thus, I actually cut features from the private beta. I wasn’t happy about cutting some features: I think the undo/redo feature was great from a usability perspective and the budgeting feature was something that was very useful. By the end of September, however, I knew that if I was going to get the public beta out the door, I’d need to cut down on things for the initial release and focus on the features that were core to Iron Money’s purpose: keep users up to date on their finances.

The simplification theme extended to the features that were kept as well. For example, I changed how categories worked so that they were easier to use (Iron Money used to have four default categories, two of which could have children categories; now, it has three default categories, all of which can have children categories).

Everything has a UUID

One of the things that changed fairly early on was my move from using MySQL’s auto-incrementing primary keys as IDs to using UUIDs for IDs. The primary reason for this change was my desire to make sharding users’ data easy. With UUIDs, I don’t need a central table dictating the IDs for everything and I don’t need a complicated auto-increment scheme; instead, I can add as many shards as I want as Iron Money’s user base grows.

Everything has an ETag

Iron Money now keeps track of an ETag for every resource. In a future post I’ll talk more about how ETags are used by Iron Money, but creating an ETag for each resource allows Iron Money to support better caching through the API. Whenever something is added, edited, or deleted from Iron Money, at least one resource’s ETag is updated.

api.ironmoney.com

Iron Money now uses a new domain exclusively for the API. This was mainly brought on for security reasons: I didn’t want any cookies set on ironmoney.com to be available on api.ironmoney.com. Furthermore, I think it’ll make it easier to run the API on a separate server in the future.

OAuth permissions

Iron Money’s API now supports a permission system that is fairly flexible: instead of authorizing an application to access all of your Iron Money data, the application can limit itself to just what it needs to access. For example, an application can ask for read permission to your accounts if it wants to show you your latest balances, without requiring read or write access to anything else.

The permissions system is a win for both third-party applications and users. Users have a clearer understanding of exactly what they’re authorizing an application to do, while apps can prove that they are only reading your information (instead of making it just a trust issue).

Almost everything is cached

Memcached (http://memcached.org/) is an amazing caching system; if you’re not using it in your web app or API right now, you’re missing out on faster reads. Basically, whenever data is written to Iron Money, it updates its cache and almost every subsequent read is from memory.

Simple can be difficult

Rewriting Iron Money over the past few months has been a lot of work, but I think the changes above will pay off in the long run with a faster, easier to use Iron Money.

API First and Only Features

A few days ago, Kellan Elliott-McCrea asked: Anyone have thoughts/experiences with API first, or API only features?

Yes. I initially worked on the API for Iron Money while working on the iPhone app. While I never finished the iPhone app (it’s currently sitting in a Subversion repository while I work on Iron Money’s Public Beta), Iron Money’s API has had API-first and API-only features (and will continue to in future versions).

API-first

API-first features, or features that come to a web service’s API before making it to the web app, can be useful for “testing” new features. For features that can be thoroughly thought through, adding the feature to an API first can be an easy way of letting third-party developers start working with the feature and see what kind of issues and functionality should be added before adding the feature to the web app.

An example of an API-first feature in Iron Money’s private beta was the multi-transaction editing functionality. Editing multiple transactions at the same time was too complex a task to add into the initial interface, but adding it to the API was trivial; eventually, the web app “caught up” with the API with regard to this functionality.

API-only

API-only features, or features that are only exposed through an API (and not through the web app), are similar to API-first features; neither is exposed in the web app first. However, API-only features might serve two additional purposes: to provide advanced, potentially more complex functionality, and to make more things possible for third-party apps.

An example of an API-only feature in Iron Money is its notification functionality. It doesn’t make sense for a user to be able to write and send themselves a notification, but it does make sense to allow third-party developers to send their users a notification.

While a feature is API-only, I highly recommend making sure that it doesn’t break the web app (or any other apps for that matter). One way an API might break a web app is if the functionality in the API is more advanced than in the web app; the web app might need to support displaying what can be done through the API.

Let’s say, for example, you’re building a calendar API and web app. You might initially build the web app to support very simple event repetition (say, every day, week, month, and year). However, you might allow the API to do more complex repeats (say, every three days or every January of every other year).

What you want to avoid is having a user set up their events in another calendar app (which does support the API’s more advanced event repetition) and viewing it in your simplified web app. The web app should not break because the event was created through the API; your web app should still display the event correctly.

An API before a web app

There’s one more concept I’d like to discuss: building your API before building your web app. Brent Simmons mentioned this idea in a post about the iPhone App Store Gold Rush, in which he theorized that APIs and iPhone apps might be built together (instead of building the web app, API, and client apps in tandem).

As I mentioned earlier, I built Iron Money’s first public API while writing an iPhone app for it, all of which came before writing the web app. I would avoid doing this again because, as Erik F. Kastner brought up on Twitter, it can limit your ability to make the right thing. I’ve found that engineering the perfect solution ahead of time can result in suboptimal solutions—even if it looks really good in theory.

Here’s an example of this problem in action. In Iron Money’s private beta, transactions could be categorized into four different types of categories: Income, Spending, Transfer, and Uncategorized. Everything would go into Uncategorized by default so that users could have a quick overview of what still needed to be categorized. The Uncategorized category was a sensible default category for all transactions.

However, as I found out while writing the web app, this is an inefficient way of solving the problem of tracking a user’s money. Categorization, as a feature, is mainly useful for searching and reporting. For reporting things such as a person’s income and spending, the web app needed to figure out what was income and what was spending. For transactions in the Uncategorized category, it would guess that positive amounts were income and negative amounts were spending.

This guessing didn’t make things any easier on the web app or any other app built on top of the API; everyone would have to implement their own guessing algorithm. Furthermore, the default graphs for categories weren’t very helpful to users because everything was lumped together by default. At the end of the day, what seemed like the best idea while writing the API turned out to be a suboptimal decision.

I think that building an API as a foundation for your web app and client apps is a good idea; building an API with third-party applications in mind is also a good idea, even if you don’t expect to make it public in the near future. However, I don’t believe that making a brand new API available to the public is a good idea until you’ve had enough time to ensure that your API promotes the best experience for both your users and API consumers.

Creating RESTful Web Services

Creating a web service for public consumption can be difficult. This post is an overview of the decisions that have gone into the next release of Iron Money’s API from my past two years of experience. This is not a guide to architecting an API to match the section about REST in Fielding’s dissertation; this is a look at the decisions I have made while designing a web service for real world use.

Why REST?

REST, SOAP, and XML-RPC are often unfairly compared to one another. REST is an architectural style for APIs while SOAP and XML-RPC are protocols; it would be fairer to compare HTTP, SOAP, and XML-RPC because HTTP is an implementation of REST. SOAP and XML-RPC certainly have their benefits (mainly because of the tools built to work with SOAP and XML-RPC APIs), but the simplicity of designing a web service with RESTful principles in mind makes it easier to get an API up and running over HTTP.

I highly encourage you to use your own API for whatever you’re building: a web app, a mobile app, etc. Using it as the base for what you’re building can provide a stable foundation (your API is going to be well documented and tested, right?). This is something you should do for any API, but both SOAP and XML-RPC can bring additional complexity, dependencies, and overhead to building on top of an API.

Resource design

A RESTful API exposes resources for clients to use. For example, an API might expose a list of accounts at the URI https://example.com/accounts/ (/accounts/ for short).

A resource doesn’t need to have a logical name or structure; it could simply be a random string or number if you wanted. However, making a resource have a logical structure helps people grasp the API easier. For example, your API could make a single account addressable at /accounts/account-id/.

Finding a logical structure for resources won’t always be easy. When designing the resources for a search feature, you might be able to get away with something as simple as /books/phrase/, but if you want to let people search for books published sometime after the year 2000, you will quickly find yourself between a rock and a hard place.

My preferred method to allow for searching within a resource is to append parameters to the resource. For example, to perform the above hypothetical book search, I might make the resource /books/?published-date-min=2001.
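As a small illustration, here is how a client might build that kind of URI and how a server might read the parameter back, using Python's standard urllib.parse. The helper name search_uri and the underscore-to-hyphen convention are assumptions made for the example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def search_uri(resource, **filters):
    """Append search filters to a resource URI as query parameters."""
    # Underscores in keyword names become hyphens to match the URI style above.
    params = {name.replace("_", "-"): value for name, value in filters.items()}
    return resource + "?" + urlencode(params) if params else resource

uri = search_uri("/books/", published_date_min=2001)
filters = parse_qs(urlsplit(uri).query)  # what the server side would see
```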

I highly recommend making anything that resembles a resource addressable as one. For example, if a book has a list of tags associated with it, create a /books/book-id/tags/ resource and make each tag addressable at /books/book-id/tags/tag-id/. This will make it easier to change data and keep track of the changes; the hows and whys will become apparent in the rest of the post.

DELETE, GET, POST, and PUT methods

HTTP uses the verbs DELETE, GET, POST, and PUT to perform actions on resources. The above examples would have all used GET to get the resources.

POST would be used for adding a resource. For example, one might make a POST request to /books/ to add a new book. The message body would include the data for the new resource.

PUT would be used for editing a resource. For example, one might make a PUT request to /books/book-id/ to edit a book. The message body would include the new data for the resource.

DELETE would be used for deleting a resource. For example, one might make a DELETE request to /books/book-id/ to remove the book from the /books/ resource.

One reason why it’s important to give each resource its own URI is because it makes it easier to apply the above methods to the resources. In my previous example, I created a new resource for a single tag of a book (/books/book-id/tags/tag-id/). If I wanted to remove the tag from the book, I could simply DELETE /books/book-id/tags/tag-id/. If I had tried to do that without a separate resource for a book’s tags, I would have needed to PUT /books/book-id/ and make it possible to indicate that a tag needed to be deleted (or resend the entire tag list).
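To show how the four methods map onto resources, here is a toy Python handler for /books/ and /books/book-id/ backed by a dictionary. Everything in it (the handle function, the in-memory store, the generated IDs) is hypothetical; a real API would sit behind an HTTP server and a database.

```python
import itertools

books = {}                  # in-memory stand-in for the real data store
_ids = itertools.count(1)   # hypothetical ID generator

def handle(method, path, body=None):
    """Return (status, payload) for a request to /books/ or /books/<id>/."""
    segments = [s for s in path.split("/") if s]
    if segments[:1] != ["books"] or len(segments) > 2:
        return 404, None
    if len(segments) == 1:              # the /books/ collection
        if method == "GET":
            return 200, list(books.values())
        if method == "POST":            # add a new book
            book_id = str(next(_ids))
            books[book_id] = body
            return 201, {"id": book_id}
        return 405, None
    book_id = segments[1]               # a single /books/<id>/ resource
    if book_id not in books:
        return 404, None
    if method == "GET":
        return 200, books[book_id]
    if method == "PUT":                 # replace the book's data
        books[book_id] = body
        return 200, body
    if method == "DELETE":
        del books[book_id]
        return 204, None
    return 405, None
```

Because each book has its own URI, deleting one is a single DELETE request rather than a PUT that re-sends the whole collection.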

Status codes

Status codes let clients know the status of a response. The most commonly known response code is 404 Not Found, which an API might return if a client tries to access /books/123/ and book 123 couldn’t be found.

You can find a full list of status codes and their descriptions in the HTTP/1.1 specification. It’s not worth going into more detail about the HTTP status codes because the specification is fairly clear about when to use each code.

ETag and Last-Modified

ETags, or entity tags, are identifiers for a resource, and they serve a number of useful purposes. ETags are handy for caching; whenever a resource is changed, the ETag should change as well. ETags can also be used for conditionally getting or modifying resources.

The If-Match header can be included by the client to only perform a method on a resource if the resource’s current ETag matches the ETag provided by the client. For example, a client might make a PUT request to a resource and include If-Match: "1" in the request to indicate that the PUT should not be executed unless the resource has a current ETag of "1".

The If-None-Match header is similar to the If-Match header except that the request will only be executed if the ETag does not match the resource’s current ETag. For example, a client may make a GET request to a resource and only want the results if the ETags don’t match (which would indicate that the resource changed).
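On the server side, the two headers boil down to a small check before the request is executed. The Python sketch below is a simplification: it compares ETags as plain strings (ignoring the strong/weak distinction), assumes the resource already exists, and uses an invented function name. The status codes are the standard ones: 412 Precondition Failed, and 304 Not Modified for a GET or HEAD whose ETag still matches.

```python
def parse_etags(header):
    """Split a header such as '"1", "2"' into individual ETags."""
    return [tag.strip() for tag in header.split(",")]

def check_preconditions(method, current_etag, if_match=None, if_none_match=None):
    """Return the status code a precondition forces, or None to proceed normally."""
    if if_match is not None:
        if if_match.strip() != "*" and current_etag not in parse_etags(if_match):
            return 412  # Precondition Failed: the client has a stale version
    if if_none_match is not None:
        if if_none_match.strip() == "*" or current_etag in parse_etags(if_none_match):
            # Reads are told nothing has changed; other methods are refused.
            return 304 if method in ("GET", "HEAD") else 412
    return None
```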

The Last-Modified header can be included in a response to indicate the date and time that a resource was last modified. It’s very similar to an ETag, except that it’s less precise (since it’s only precise to a single second). You can use If-Modified-Since and If-Unmodified-Since to make requests similar to those made with If-None-Match and If-Match, respectively.

I highly recommend supporting the ETag and Last-Modified headers; they can make client applications more efficient and lighten the load on your servers if used correctly.

Format

If your API is available in only one format (such as just XML), then you might not worry about including a format in your API. However, if you do plan to have multiple formats available, you’ll need to include the format in the resource URI.

Choosing where to put your format information is really a personal choice. You might decide to have it at the root of your resources (e.g. /json/books/), or append it as a parameter (e.g. /books/?format=json), or include it as an extension (e.g. /books.json).

Versioning

Like formatting, if you don’t version your API, you probably won’t worry about including a version number in your resource URIs. However, I highly suggest versioning your API for your own sanity; if you don’t, you’ll probably be worried about making backwards-incompatible changes, which can stagnate your API’s development.

Like the format parameter, choosing where to put your versioning information is really a personal choice. You might decide to have it at the root of your resources (e.g. /v1/books/), or append it as a parameter (e.g. /books/?v=1).

Linking to resources

Linking to resources is a RESTful concept that some APIs support. The idea is to include the URIs to resources so that clients don’t need to hard-code any resource URIs into their applications. For example, a client might make a request to / to get a list of resources and discover that the books resource is at /books/.

I am not a fan of including the resource URIs in the responses because I think it feels like offloading a problem for the API provider onto the client. The API provider should probably not break URIs for clients, especially when redirecting can be so easy to implement. Additionally, I don’t like the overhead of including URIs in responses.

Including links to resources may be the right thing to do for your web service, but it’s my opinion that doing so can be wasteful.

Multi-resources

REST is not the end-all of API designs. If you would like to allow a client to edit multiple resources at a single time, they can send multiple PUT requests to multiple resources to accomplish the task. However, if you want to provide any sort of atomicity in your API, you’ll have to pull some tricks to make it work.

One option is to allow clients to open transactions with your API. One possible way of doing this is to provide a /transactions/ resource to create a transaction, then perform the PUT requests, and then close the transaction with a final PUT to /transactions/transaction-id/, but this isn’t a very elegant solution (how long does a client have to end the transaction before everything is rolled back?).

Another way of providing the same functionality is to provide a “multi” resource that a client can send all of their changes to. For example, one might make a /multi/ resource that accepts the following JSON:

{
  "resources": {
    "/books/": {
      "isbn": "978-0684833637",
      "method": "POST",
      "title": "A Moveable Feast"
    },
    "/books/978-0-7432-7356-5/": {
      "method": "PUT",
      "title": "The Great Gatsby"
    }
  }
}

This isn’t a very satisfactory method of solving the problem because it isn’t very RESTful, but it’s a messy problem that I haven’t been able to find a good solution for.
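For what it's worth, here is a Python sketch of how a /multi/ handler could give all-or-nothing behavior for the JSON above. The in-memory store, the apply_multi name, and the rule that a POSTed book is stored under its ISBN are all assumptions; a real implementation would more likely lean on a database transaction.

```python
import copy

def apply_multi(store, request):
    """Apply every change in a /multi/ request, or none of them."""
    working = copy.deepcopy(store)      # changes go to a copy first
    for uri, change in request["resources"].items():
        change = dict(change)
        method = change.pop("method")
        if method == "POST":
            # Assumption: new books are keyed by ISBN under the collection URI.
            working[uri + change["isbn"] + "/"] = change
        elif method == "PUT":
            if uri not in working:
                raise KeyError(f"{uri} does not exist")
            working[uri].update(change)
        elif method == "DELETE":
            del working[uri]
        else:
            raise ValueError(f"unsupported method {method}")
    # Commit only after every change has succeeded.
    store.clear()
    store.update(working)
```

If any change raises, the original store is left untouched, which is the atomicity the separate PUT requests could not provide.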

Authentication and Authorization

Authentication and authorization might be something you need to handle in your API. Authentication is about identification while authorization is about access. Your API may be completely public (everyone is authorized) or have private resources (in which you need to identify a client and check whether or not they can perform their request).

Authentication is taken care of by HTTP with Basic access authentication or Digest access authentication. Authorization, on the other hand, is something that needs to be handled by your application.

I highly recommend using OAuth as the basis for authorization in your application. While it has its faults, it does the job fairly well, has support from a number of large companies, and has a number of tools already built around it. While you could roll your own authorization protocol, using OAuth is probably a better choice.

More

I highly recommend watching Google’s Intro to REST and reading RESTful Web Services, a book about creating RESTful web services.