REST Misconceptions Part 1 - the URI Confusion
This is the first post in a series about REST, in which I intend to debunk some commonly repeated mistakes and bad advice aimed at practitioners of RESTful services. The central, and arguably most important, element of any API is the resource. On the Web, resources are identified and addressed by URLs. REST defines some clear rules about these identifiers, yet we break them all the time. Let’s see how this happens and what the consequences are.
In this series:
- Misuse of URIs
- Not linked enough
- More than links
- Resources are application state
- REST “documentation”
What are RESTful identifiers?
Are the URLs below RESTful?
Okay, it was a trick question, because there is no such thing. If you haven’t yet, you should definitely watch this talk, in which Stefan Tilkov discusses various REST mistakes, including the misguided notion of RESTful URIs. Misunderstanding the role of the identifier in Representational State Transfer is a significant and very common mistake. Here’s what Roy Fielding writes about identifiers in section 6.2.4 of his dissertation (emphasis mine).
Semantics are a by-product of the act of assigning resource identifiers and populating those resources with representations. At no time whatsoever do the server or client software need to know or understand the meaning of a URI — they merely act as a conduit through which the creator of a resource (a human naming authority) can associate representations with the semantics identified by the URI.
In other words, the identifier is just a pointer used to access resource representations and has no meaning itself. It is a string and all resource semantics are included in the representation. Neither clients nor servers should have to parse and derive any meaning from the identifier. Nor should they have to construct identifiers from pieces.
There are many ways the identifier is misunderstood by REST practitioners. In this post I will focus on the design of the URI itself but I will be raising this subject in subsequent posts because it appears to me that it’s the root cause of most of the confusion surrounding REST.
There is a multitude of discussions, articles and blog posts promoting the idea that one URI can be better than another. Outside of REST this may be true. It may also be true when combined with some aspects of the HTTP protocol, but otherwise it is false.
All URIs are made equal
Otherwise it is just like saying that one number is better than another, judging by face value alone. Here are some of the common arguments.
I’ve seen this one a lot. A URI should be hackable, readable, meaningful, predictable and whatnot. Sure, there is no harm in a hackable URI as long as it is not actually being hacked. There are very good articles on this by Mark Seemann and Ben Morris.
The problem arises when clients actually start hacking URIs to construct server requests. Libraries have even appeared to help developers build identifiers manually. One example is Restangular, which its authors describe as an “AngularJS service to handle Rest API Restful Resources properly and easily”.
Sure, it may be easy, but it’s definitely not proper handling of a REST API. Such a practice introduces coupling between the client and a specific URL scheme. First and foremost, it violates the self-descriptiveness of a REST resource, because knowledge from outside the representation is required. What is worse, such a design is brittle, because the client will break whenever the URL structure changes.
A hackable and readable identifier has only superficial value to the developer and should have no value to the client.
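The coupling described above can be sketched in a few lines of Python. Everything here is hypothetical and exists only for illustration: the two in-memory servers, their paths and the `books` link relation. One client mints the URL from knowledge it holds out of band, the other follows a link found in a representation.

```python
# Two versions of a hypothetical in-memory server: identifier -> representation.
# Version 2 changed its URL scheme; the link in the entry point changed with it.
SERVER_V1 = {
    "/": {"links": {"books": "/books"}},
    "/books": {"titles": ["Hamlet", "Macbeth"]},
}
SERVER_V2 = {
    "/": {"links": {"books": "/catalog/items"}},
    "/catalog/items": {"titles": ["Hamlet", "Macbeth"]},
}


def minting_client(server: dict) -> list:
    """Builds the URL from out-of-band knowledge of the URL scheme."""
    url = "/" + "books"  # hard-coded assumption about the server's scheme
    return server[url]["titles"]


def link_following_client(server: dict) -> list:
    """Knows only the entry point and the name of the link relation."""
    entry = server["/"]
    return server[entry["links"]["books"]]["titles"]


for name, server in (("v1", SERVER_V1), ("v2", SERVER_V2)):
    try:
        minted = minting_client(server)
    except KeyError:
        minted = "broken: URL no longer exists"
    print(name, "minting:", minted, "| following:", link_following_client(server))
```

The link-following client keeps working against both versions, while the minting client fails as soon as the server renames its collection.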
‘Plural or singular’
Some of my colleagues were surprised, but this really is a thing. There are heated discussions on the web about whether a collection’s URI should be plural, but what about collection members? Say there is a collection of books under the /books resource. I could add a book by executing the POST method on the collection resource.
In response I would return the status code 201 Created and a Location header. But what should the header’s value be? Should it be /book/Hamlet? The answer is: it doesn’t matter. The client should follow links, not mint URLs.
There are proponents of URI schemes in which related resources are assigned a predictable hierarchy of identifiers. Actually, the above example is a good start, and such a design is quite intuitive. This is, after all, how the directory tree of a filesystem works. One could expand the bookstore address space to include book chapters and so on.
Such a scheme encourages hacking URIs, which I brought up as my first example of URI abuse. Is that good design? Maybe it’s easier to assign such identifiers when they are created. But this structure should not matter further on. That is because the client should follow links, not mint URLs.
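As a rough Python sketch of following links instead of minting URLs, the client below adds a book to the collection and then uses whatever the Location header says. The `BookStore` class is a hypothetical in-memory stand-in for a server, and the identifiers it assigns are arbitrary on purpose; none of this comes from a real library.

```python
import itertools


class BookStore:
    """Hypothetical in-memory stand-in for an HTTP server."""

    def __init__(self, member_prefix: str):
        self._member_prefix = member_prefix  # the server's private naming choice
        self._ids = itertools.count(1)
        self._resources = {}

    def post(self, uri: str, body: dict):
        """Add a member to the /books collection; answer 201 Created plus Location."""
        if uri != "/books":
            return 404, {}, None
        location = f"{self._member_prefix}{next(self._ids)}"
        self._resources[location] = dict(body)
        return 201, {"Location": location}, None

    def get(self, uri: str):
        if uri in self._resources:
            return 200, {}, self._resources[uri]
        return 404, {}, None


def add_and_fetch(server: BookStore, book: dict) -> dict:
    """Client: create the book, then follow the Location header as-is."""
    status, headers, _ = server.post("/books", book)
    assert status == 201
    status, _, representation = server.get(headers["Location"])
    assert status == 200
    return representation


# The same client works whether the server names members /books/1 or /book/1.
for prefix in ("/books/", "/book/"):
    print(add_and_fetch(BookStore(prefix), {"title": "Hamlet"}))
```

The client never inspects or assembles the member URI, so the server is free to pick any naming scheme it likes, hierarchical or not.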
Nouns not verbs
There is a rumor that a URL must not contain verbs. Of course, an identifier such as http://book.store/books/Hamlet/reserve may be an indication of bad design that tries to imitate an RPC-style API. But that is only true if HTTP verbs are not used correctly, for example when the client sends a PUT request to that URL.
This is a bad idea, because PUT must be idempotent. The client should be able to retry the request without risk. But replace it with POST and you’re fine. Also, in most cases it’s possible to change the verb to a noun and be done with it. There’s a nice post on that subject. To make the identifier more RESTful, the URI can be changed to http://book.store/books/Hamlet/reservation and used in just the same way.
But is the identifier more RESTful? No, again, because there is no such thing. It’s how the resource is used that determines good API design, not the URI.
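To make the idempotency argument concrete, here is a small Python sketch. The `Reservations` class, its methods and the `/reservations` collection are hypothetical and only model the semantics discussed above: repeating a PUT leaves the server in the same state, while repeating a POST may not.

```python
class Reservations:
    """Hypothetical in-memory model of reservation resources."""

    def __init__(self):
        self.by_uri = {}
        self._next_id = 1

    def put(self, uri: str, body: dict) -> int:
        """PUT replaces the state at the target URI, so repeating it is harmless."""
        created = uri not in self.by_uri
        self.by_uri[uri] = dict(body)
        return 201 if created else 200

    def post(self, collection_uri: str, body: dict) -> str:
        """POST lets the server decide; here every call creates a new resource."""
        uri = f"{collection_uri}/{self._next_id}"
        self._next_id += 1
        self.by_uri[uri] = dict(body)
        return uri


store = Reservations()

# Retrying PUT is safe: one reservation no matter how many times it is sent.
for _ in range(3):
    store.put("/books/Hamlet/reservation", {"reader": "alice"})
print(len(store.by_uri))  # 1

# Retrying POST is not: each attempt creates another reservation.
for _ in range(3):
    store.post("/reservations", {"reader": "alice", "book": "Hamlet"})
print(len(store.by_uri))  # 4
```

Note that nothing in this behaviour depends on whether the path ends in a verb or a noun; it follows from which method is used and how the server implements it.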
Content negotiation is not the URI
I don’t even know how to name this. How many times have you seen an API that requires the client to include the format in the URI:
Or worse yet
Does such a pair identify two separate resources? Of course not, they both identify the book The Tragedy of Hamlet, Prince of Denmark. One could argue, of course, that they identify documents about the /books/Hamlet resource, and that’s true. But these documents are two completely different resources. And trying to update the book’s title by sending the request to the XML document’s identifier should change only the title of the XML document, if anything.
To update the book itself, the actual identifier should be used. And to request a specific format, there is content negotiation. The server may then redirect to the document or add the Content-Location header, but that is a different matter entirely. The distinction between a resource and its document has been described by none other than Sir Tim Berners-Lee himself.
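A minimal Python sketch of content negotiation on a single identifier might look like the following. It is a simplification under stated assumptions: the document URIs `/books/Hamlet.json` and `/books/Hamlet.xml`, the `RENDERERS` table and both helper functions are hypothetical, and the Accept parsing ignores wildcards such as `*/*`.

```python
import json

BOOK = {"title": "The Tragedy of Hamlet, Prince of Denmark"}

# Hypothetical renderers: media type -> (document URI, function producing the body).
RENDERERS = {
    "application/json": ("/books/Hamlet.json", lambda b: json.dumps(b)),
    "application/xml": (
        "/books/Hamlet.xml",
        lambda b: f"<book><title>{b['title']}</title></book>",
    ),
}


def parse_accept(header: str) -> list:
    """Return the media types of an Accept header, highest q-value first."""
    ranked = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        ranked.append((q, parts[0]))
    # sort() is stable, so equally ranked types keep their header order
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [media_type for q, media_type in ranked if q > 0]


def get_book(accept: str):
    """GET /books/Hamlet: one resource, representation chosen from Accept."""
    for media_type in parse_accept(accept):
        if media_type in RENDERERS:
            document_uri, render = RENDERERS[media_type]
            headers = {"Content-Type": media_type, "Content-Location": document_uri}
            return 200, headers, render(BOOK)
    return 406, {}, ""  # Not Acceptable


print(get_book("application/xml;q=0.9, application/json"))
```

The client always asks for the book itself and states its preferred format in a header; the format-specific document URI shows up only as information in the response.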
Query string parameters
Okay, I do understand some of the arguments for the RESTful URI, but I’m totally stumped by this one. Here’s what I read:
Query args (everything after the ?) are used on querying/searching resources (exclusively)
Where did that come from? How is /books?id=1234 worse than /books/1234? Please scroll back up and see again what Roy Fielding wrote about URIs. It’s the mapping from identifier to resource representation that matters, not whether the URI has a query parameter instead of a path segment. If the server assigned it that way, so be it.
Another point is whether a filtered collection is a resource in its own right. For example, let’s search the collection for other books by Shakespeare. Does it mean that /books?author=Shakespeare is not a resource? It’s probably a matter of semantics and personal taste, but if someone chose to treat it as a separate (derived) resource, then clearly the query string parameter is part of the identifier. Deal with it.
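The same point in a short Python sketch: from the server’s side, the query string is simply part of the lookup key. The `RESOURCES` table and its contents are hypothetical; `urlsplit` comes from the standard library.

```python
from urllib.parse import urlsplit

# Hypothetical server-side table: the whole identifier is the lookup key.
RESOURCES = {
    "/books/1234": {"title": "Hamlet"},
    "/books?id=1234": {"title": "Hamlet"},
    "/books?author=Shakespeare": {"titles": ["Hamlet", "Macbeth", "Othello"]},
}


def identifier(uri: str) -> str:
    """Return the part of the URI that identifies the resource on the server.

    Only the fragment is dropped, because it is never sent to the server;
    the query string stays, as it is part of the identifier.
    """
    parts = urlsplit(uri)
    return parts.path + ("?" + parts.query if parts.query else "")


def lookup(uri: str) -> dict:
    return RESOURCES[identifier(uri)]


# Path segment or query parameter: both are just identifiers the server assigned.
print(lookup("http://book.store/books?id=1234") == lookup("http://book.store/books/1234"))
print(lookup("http://book.store/books?author=Shakespeare#top"))
```

Whether the server assigns `/books/1234` or `/books?id=1234` makes no difference to a client that received the identifier in a link.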
Good URI design
A RESTful URI doesn’t exist, but this doesn’t mean that there aren’t good and bad ways to design identifiers. Let’s just accept the fact that URI design is usually not directly related to the REST style anyway, and that on its own it never makes an API better or worse in that regard.
Proper use of URIs
What really matters is how a URI is used. I’ve already touched on that subject when I mentioned the use of verbs. Putting the format in the URI instead of relying on content negotiation also qualifies as misuse. But worst of all, particularly in terms of REST APIs, is that representations are not linked. Without links, it is no wonder that clients have to hack and mint URIs all the time. Linking, a simple yet rarely employed practice, will be the topic of my next post.