Web service designers have tried for some time now to correlate CRUD (Create, Retrieve, Update and Delete) semantics with the Representational State Transfer (REST) verbs defined by the HTTP specification–GET, PUT, POST, DELETE, HEAD, etc.
Developers often try to correlate these two concepts–CRUD and REST–using a one-to-one mapping of verbs from the two spaces, like this:
- Create = PUT
- Retrieve = GET
- Update = POST
- Delete = DELETE
“How to Create a REST Protocol” is an example of a very well-written article about REST that nonetheless makes this faulty assumption. (In fairness to the author, he may well have merely “simplified REST for the masses”, as his article doesn’t specifically state that this mapping is the ONLY valid mapping. Indeed, he states that the reader should not assume the mapping indicates a direct mapping to SQL operations.)
In the article, “I don’t get PUT versus POST”, the author clearly understands the semantic differences between PUT and POST, but fails to understand the benefits (derived from the HTTP protocol) of the proper REST semantics. Ultimately, he promotes the simplified CRUD to REST mapping as laid out above.
But such a trivial mapping is inaccurate at best. The semantics of these two verb spaces have no direct correlation. This is not to say you can’t create a CRUD client that can talk to a REST service. Rather, you need to add some additional higher-level logic to the mapping to complete the transformation from one space to the other.
While Retrieve really does map to an HTTP GET request, and likewise Delete really does map to an HTTP DELETE operation, the same cannot be said of Create and PUT or Update and POST. In some cases, Create means PUT, but in other cases it means POST. Likewise, in some cases Update means POST, while in others it means PUT.
The crux of the issue comes down to a concept known as idempotency. An operation is idempotent if a sequence of two or more of the same operation results in the same resource state as would a single instance of that operation. According to the HTTP 1.1 specification, GET, HEAD, PUT and DELETE are idempotent, while POST is not. That is, a sequence of multiple attempts to PUT data to a URL will result in the same resource state as a single attempt to PUT data to that URL, but the same cannot be said of a POST request. This is why a browser always pops up a warning dialog when you back up over a POSTed form. “Are you sure you want to purchase that item again!?” (Would that the warning was always this clear!)
After that discussion, a more realistic mapping would seem to be:
- Create = PUT iff you are sending the full content of the specified resource (URL).
- Create = POST if you are sending a command to the server to create a subordinate of the specified resource, using some server-side algorithm.
- Retrieve = GET.
- Update = PUT iff you are updating the full content of the specified resource.
- Update = POST if you are requesting the server to update one or more subordinates of the specified resource.
- Delete = DELETE.
NOTE: “iff” means “if and only if”.
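The mapping above can be sketched as a small decision function. This is only an illustration in Python: the names `Crud` and `choose_method`, and the boolean flag, are hypothetical and not part of any HTTP library.

```python
from enum import Enum


class Crud(Enum):
    CREATE = "create"
    RETRIEVE = "retrieve"
    UPDATE = "update"
    DELETE = "delete"


def choose_method(op: Crud, full_content_at_known_url: bool) -> str:
    """Pick an HTTP method for a CRUD operation using the mapping above.

    `full_content_at_known_url` is True when the client knows the exact URL
    of the resource and is sending that resource's complete content.
    """
    if op is Crud.RETRIEVE:
        return "GET"
    if op is Crud.DELETE:
        return "DELETE"
    # Create and Update both split on the same question.
    return "PUT" if full_content_at_known_url else "POST"
```

The point of the sketch is that Create and Update never decide the verb on their own; the deciding question is whether the request carries the full content of an exactly specified resource.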
Analysis
Create can be implemented using an HTTP PUT, if (and only if) the payload of the request contains the full content of the exactly specified URL. For instance, assume a client issues the following Create OR Update request:
PUT /GrafPak/Pictures/1000.jpg HTTP/1.1
...
<full content of 1000.jpg ... >
This command is idempotent because sending the same command once or five times in a row will have exactly the same effect; namely that the payload of the request will end up becoming the full content of the resource specified by the URL, “/GrafPak/Pictures/1000.jpg”.
On the other hand, the following request is NOT idempotent because the results of sending it either once or several times are different:
POST /GrafPak/Pictures HTTP/1.1
...
<?xml version="1.0" encoding="UTF-8"?>
<GrafPak operation="add" type="jpeg">
<![CDATA[ <full content of some picture ... > ]]>
</GrafPak>
Specifically, sending this command twice will result in two “new” pictures being added to the Pictures container on the server. According to the HTTP 1.1 specification, the server’s response should be something like “201 Created” with Location headers for each response containing the resource (URL) references to the newly created resources–something like “/GrafPak/Pictures/1001.jpg” and “/GrafPak/Pictures/1002.jpg”.
The value of the Location response header allows the client application to directly address these new picture objects on the server in subsequent operations. In fact, the client application could even use PUT to directly update these new pictures in an idempotent fashion.
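To make the difference concrete, here is a toy in-memory stand-in for the server side of the two requests above. It is a sketch, not a real web service: the `PictureStore` class, its method names, and the choice to start numbering at 1001 are all assumptions made for illustration.

```python
from typing import Dict, Tuple


class PictureStore:
    """Toy in-memory model of the GrafPak picture container (hypothetical)."""

    def __init__(self, first_id: int = 1001) -> None:
        self.resources: Dict[str, bytes] = {}
        self._next_id = first_id

    def put(self, url: str, content: bytes) -> int:
        """Create or fully replace the resource at exactly `url`."""
        created = url not in self.resources
        self.resources[url] = content
        return 201 if created else 200

    def post(self, container: str, content: bytes) -> Tuple[int, str]:
        """Add a new subordinate of `container`; the server chooses its URL."""
        location = f"{container}/{self._next_id}.jpg"
        self._next_id += 1
        self.resources[location] = content
        return 201, location  # status code and Location header value


store = PictureStore()

# PUT twice: the resource state after the second request is unchanged.
store.put("/GrafPak/Pictures/1000.jpg", b"jpeg bytes")
state_after_first_put = dict(store.resources)
store.put("/GrafPak/Pictures/1000.jpg", b"jpeg bytes")
put_is_idempotent = store.resources == state_after_first_put

# POST twice: each request creates another picture with its own URL.
_, first_location = store.post("/GrafPak/Pictures", b"jpeg bytes")
_, second_location = store.post("/GrafPak/Pictures", b"jpeg bytes")
```

Repeating the PUT leaves one resource with the same content, while repeating the POST leaves two new resources, each reported back to the client through its own Location value.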
What it comes down to is that PUT must create or update a specified resource by sending the full content of that same resource. POST operations, on the other hand, tell a web service exactly how to modify the contents of a resource that may be considered a container of other resources. POST operations may or may not result in additional directly accessible resources.
I read this article and it explains the PUT and POST clearly.
Yes, thanks very much, John. This is the most lucid explanation I have yet found, and since it concurs with both the HTTP spec and my suspicion that a one-to-one mapping was not the right topology for this problem, I’m embracing it.
[...] Update/May 27th, 2009: There has been some great comments regarding the use of PUT versus POST. So I did some additional research and found this interesting post. [...]
This is more of a user-interface issue due to the way POST forms are implemented in most browsers. Although POST has certain semantics, the UI’s design translates each click into a separate request, when a large minority of users have learned that to perform an action you have to double-click. Further, the browser gives no feedback that the action was sent, so if the interface is unresponsive, many users will attempt to click again. So even when you set a form to be a POST form, you still have to work around these problems, at least by offering some way to confirm repeat submissions.
[...] PUT or POST: The REST of the Story « Open Sourceryjcalcote.wordpress.com [...]
I fully agree. WebDAV for instance doesn’t use POST at all.
Many of the CRUD systems I’ve seen support the concept of an “upsert”, where the CRUD implementation will determine from the primary key value whether the row should be inserted or updated. How would the CRUD client determine in this case which verb to use since it doesn’t know if the operation would be idempotent?
Scott: I’d say you have a problem there, because a RESTful client must be able to tell whether or not an operation will be idempotent in order to choose a proper verb. However, that said, there’s nothing wrong with using POST in this case. Ideally, you’d always use PUT in an idempotent operation, in order to maximize efficiency within the web architecture – that is, in order to ensure proper caching semantics on the operation for caching proxies, etc. However, if you just can’t tell if an operation will be idempotent, then your only alternative is to use POST, thereby defeating proxy caching, and other web arch scalability features. Ultimately, I’d probably choose NOT to use any “upsert” features in the client if I thought the scalability loss would be worse than the server-side efficiency loss.
Isn’t it simpler than this? At least in the CRUD ‘upsert’ implementations I’ve seen, the decision to update vs. insert is based on the presence of the primary key: if it exists, update, otherwise insert.
Can’t the CRUD client simply PUT to the specific URI if the primary key is known, to update, and POST to the parent URI otherwise, to create?
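The rule proposed in this comment could look something like the following on the client side. This is a sketch only; `plan_upsert` and the URI layout (`parent/key`) are hypothetical names and conventions chosen for the example.

```python
from typing import Optional, Tuple


def plan_upsert(parent_uri: str, primary_key: Optional[str]) -> Tuple[str, str]:
    """Return (HTTP method, target URI) for an upsert-style save."""
    if primary_key is not None:
        # The resource is directly addressable: send its full content there.
        return "PUT", f"{parent_uri}/{primary_key}"
    # No key yet: ask the server to create a subordinate and name it.
    return "POST", parent_uri
```

For instance, `plan_upsert("/dishes", "42")` yields a PUT to `/dishes/42`, while `plan_upsert("/dishes", None)` yields a POST to `/dishes`.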
Incredible explanation. Thank you for the specific examples of the difference between PUT and POST, and the clear language. I admit that I used to understand REST through CRUD-y glasses. You’ve cleared them up (sorry for the pun).
Good explanation of the difference between PUT and POST.
However I still have trouble understanding how you would make use of PUT’s idempotency when you’re constructing a web app that does *not* deal with the uploading of complete, directly-addressable resources. For example, imagine a meal-planning application, with an interface that lets you create and edit dishes (consisting of one or more ingredients), assign the dishes to meals, etc.
The ingredients, meals, and dishes do not have any kind of implicit or obvious identification scheme before I add them into the application; when I do so, they are assigned IDs by the system. PUT isn’t going to work unless I know what the ultimate URL of the (for example) dish is going to be…
…and even if I do, in this case, the dish is really just one piece of real data (dish name) with several ingredients attached to it.
I apologize if I haven’t stated this clearly.
Hi Matt. I think I understand your question. Client-side applications can handle editing of resources in multiple ways. One way is to download the complete resource, allow the user to edit, and then completely replace the original resource. I would consider this an idempotent operation, because if you PUT that resource back where it came from multiple times, you still end up with the same (modified) resource – perhaps a recipe, in this case.
This scheme can be taken to any number of levels of granularity, depending on the level of resource granularity you wish to implement. If you wish to address and edit individual ingredients, for example, then you’d have to allow for a URI for each ingredient, with the parent URI representing the recipe container.
How these URI’s are manipulated is in the contract between client and server, but the PUT vs. POST rules are still in effect. You use PUT when you can directly address a resource, and update the entire resource. You use POST when you are modifying a resource incrementally, or creating a new unnamed resource, expecting the server to return a URI for the newly created resource.
The real trick to understanding web idempotency is knowing that caching proxies handle PUT and POST requests differently. PUT requests are cached, and can be reapplied multiple times by these proxies without regard to the content, because they know it doesn’t hurt to re-PUT, if needed. They simply don’t cache POST requests because they know they can’t reapply a POST without causing problems. The same concept applies to data coming from servers to clients. Caching proxies will not (necessarily) re-query the server for PUT results, but can return cached copies because of the nature of PUT. These concepts are at the heart of web architecture scalability features.
I read some very poor explanations of “POST vs PUT” before I came upon your explanation. It makes clear sense now. The take-home message for me is that PUT is an idempotent operation, POST isn’t. So if I pass in an object that the server is supposed to find a unique ID for and add to a parent object, that’s a POST. If you invoke that method n times you’ll create n new objects as children of the parent. If I update (or create) an object by passing in a _complete_ copy of it, that’s a PUT.
Thanks
Although your article cleared this up for me quite well, I have one nagging question about the idempotent nature of PUT. Even if you PUT a complete copy of a resource to a particular URI, repeating this operation over time may or may not be safe, depending on the situation. As I understand them, idempotent operations can be repeated as many times as you like, but the 2nd and later repetitions are effectively “no-ops” that change nothing on the server, and return the exact same result the 2nd..nth times as the 1st. If it’s possible for some other agent to PUT a resource @ the same URI **in-between the 1st agent’s repeated PUTs**, then it’s no longer a “safe no-op” for any agent to invoke the same PUT on that URI repeatedly.
Unless I’ve taken a wrong turn somewhere, it seems like PUT vs. POST is a question about caching:
a) use PUT if it’s safe to invoke the exactly same method call repeatedly and cache the result
b) use POST otherwise
Basically you want to permit as much caching in your application as will be safe and beneficial, but no more than that.
Am I making sense, or did I get it wrong somehow?
^^ I can’t edit that, so I provide a correction here: ^^
I meant to say “If it’s possible for some other agent to PUT a **different** resource @ the same URI…”
(i.e. clobber the resource @ that URI with a different one).
@questionizer: Yes, I understand your concern. HTTP is not a transactional protocol. This scenario is possible:
agent 1: read resource
agent 2: read resource
agent 2: write resource with updates
agent 1: write resource with updates
Thus, agent 2’s updates are lost. And yes, you are correct in assuming that PUT is about caching and POST is about control. The crux of web scalability is found in caching. Thus, caching is critical to the web. However, this fact hasn’t stopped many services out there from implementing their interfaces in terms of POST when they felt that transactions were more important than scalability.
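The four-step scenario above can be replayed with a plain dictionary standing in for the server. The recipe resource, its fields, and its URI are invented for the example; no real HTTP traffic is involved.

```python
# A dictionary plays the role of the server's resource store (hypothetical data).
server = {"/recipes/1": {"name": "Soup", "servings": 2}}

# agent 1: read resource
agent1_copy = dict(server["/recipes/1"])
# agent 2: read resource
agent2_copy = dict(server["/recipes/1"])

# agent 2: write resource with updates (a full-content PUT)
agent2_copy["servings"] = 4
server["/recipes/1"] = agent2_copy

# agent 1: write resource with updates (another full-content PUT,
# built from the stale copy read before agent 2's write)
agent1_copy["name"] = "Tomato soup"
server["/recipes/1"] = agent1_copy

final_state = server["/recipes/1"]
```

The final state carries agent 1's new name but the original serving count, so agent 2's change has been silently overwritten. Each PUT on its own is still idempotent; the loss comes from the interleaving.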
[...] if you recall from my October 2008 post, PUT or POST: The REST of the Story, POST is designed to be used to create new resources whose URL is not known in advance, whereas PUT [...]
[...] a POST, I was curious how this really works. After endless searching, I found a blog with an explanation that makes sense. Unfortunately there were far too few [...]
It is an excellent resource about REST concepts, concise enough to provide quality information at a glance.
Congratulations.
[...] something, you use “POST.” A RESTful web service uses other HTTP verbs as well, namely PUT and DELETE, and can also implement OPTIONS to show which methods are appropriate for a [...]
[...] http://jcalcote.wordpress.com/2008/10/16/put-or-post-the-rest-of-the-story/ [...]
Hi,
this is a great article, but what is against your article is the ZEND FRAMEWORK implementation.
The Zend_Rest_Route routes in the following way:
POST – create a resource
PUT – update a resource
Why did they choose this way?
Personally I feel that idempotence should be the only important semantic component of PUT.
The spec makes stronger assertions about its semantics than this, namely that the body of a PUT is supposed to be a *full* replacement for the resource at the URL in question. But I think this is a mistake; I don’t think any clients or middleware take advantage of this or would reliably be able to take any useful advantage of these added semantics, and by constraining the use of PUT to these cases, you deny people a useful way to signal idempotence for other more general kinds of requests.
Idempotence is something which client and middleware can usefully take advantage of – eg a browser can safely re-try an idempotent request whereas it can’t safely retry a general POST.
Allowing PUT to be used for requests which are idempotent but not technically full replacement updates (eg for partial updates) would allow middleware to know that they’re safe to repeat.
In fact a bunch of frameworks (eg Rails) already do allow partial updates via PUT, despite this being technically incompatible with the RFCs. This used to bother me but now I’m fine with it.
Yes, I agree with your sentiment about the importance of idempotency over that of other features of the PUT method (as you can probably tell by the original post). The article came about as a result of my research to determine the best interface to use to add audit records to a ReSTful log service. I settled on PUT over POST because it’s important to an audit service that records not be repeated in an audit log. POST doesn’t technically allow me to determine if a record has already been added – it only allows me to add it again. I wasn’t really interested in address-ability (who wants to address a particular record in a log of millions anyway), so I had a problem: how to ensure that my PUT of a record (or set of records) is idempotent – that is, doesn’t happen again if the client has to re-PUT due to I/O errors. Ultimately, the solution was simple – keep track of the MD5 signature of the last 500 records or so in a persistent hash table. If I find a match, I toss out a request and return an appropriate client error (which the client then recognizes as “redundant PUT”). Thanks for your comments
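A minimal sketch of the duplicate check described in this reply might look like the following. It keeps the digests in memory only, whereas the reply describes a persistent hash table; the class name `RecentRecordFilter` and its `accept` method are hypothetical, and the capacity of 500 simply mirrors the figure mentioned above.

```python
import hashlib
from collections import OrderedDict


class RecentRecordFilter:
    """Remember MD5 digests of the most recent records to spot repeats."""

    def __init__(self, capacity: int = 500) -> None:
        self.capacity = capacity
        self._seen = OrderedDict()  # digest -> None, oldest entry first

    def accept(self, record: bytes) -> bool:
        """Return True if the record is new, False if it is a recent repeat."""
        digest = hashlib.md5(record).hexdigest()
        if digest in self._seen:
            return False  # caller would answer with a "redundant PUT" error
        self._seen[digest] = None
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # forget the oldest digest
        return True
```

A server using something like this would append the record only when `accept` returns True. Because only a bounded window of digests is kept, a record repeated after more than `capacity` other records would be accepted again.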
Semantics has to do with language. If you object to the semantics then give us a linguistic analysis not a technical work around to befit your misguided semantic objections.
> … the same cannot be said of Create and PUT or Update and POST. In some cases, Create means PUT, but in other cases it means POST. Likewise, in some cases Update means POST, while in others it means PUT.
Says who? Thanks but no thanks.
PUT = INSERT. POST = UPDATE.
The semantics are perfectly alright. The logic is simple. It just works. Everything else follows.
My comments are not so much semantic objections to popular opinion as they are a technical analysis of Roy Fielding’s definitions. The http standard interface has one design goal and you have another. It’s that simple. But, there’s nothing wrong with your approach. It just won’t scale to the same degree as Fielding’s original design. His design is about scaling to billions of users not creating a web interface for database management. This fact has been misunderstood by many, leading naturally to the definitions you suggest.
[...] essence of the subtlety is captured in the introductory paragraphs of “Put or Post: The Rest of the Story” (an article that was discovered and shared by one of the [...]