API Design - Third Edition

Creative Commons Attribution-Share Alike 3.0 United States License
For license details, see: http://creativecommons.org/licenses/by/3.0/deed.en_US

If you haven't checked it out already, we highly encourage you to check out the API craft group on Google (groups.google.com/group/api-craft). It's filled with folks who are doing wonderful things around API design, implementation, and the business of APIs.

This webcast, and all the webcasts that we've done in the past and will do in the future, are available at youtube.com/apigee. The slides are available on slideshare.net/apigee.

I'm Brian Mulloy, you can find me on Twitter as @landlessness, and I'm Kevin Swiber, you can find me on Twitter as @kevinswiber.

Donald Norman talks a lot about design. There is a particular article he uses to talk about the differences between simplicity and features. We always get pounded with the idea that we need more features, when we really need to keep it simple. Norman has a quote that I really like that says, "The real issue is about design: designing things that have the power required for the job while maintaining understandability, the feeling of control, and the pleasure of accomplishment."

This quote isn't about the issue of simplicity, it's about design. We are going to give some pointers about how to take a complex process and add usability to it for the end developer as well. If you think about the process of building a cathedral, it may have complexity, but to the users it has to be beautiful. That's our goal: to have a seamless design that is pleasant to use.

On the agenda today, we are going to recap the previous API presentations and discuss:

* Modeling
* Security
* Message Design
* Hypermedia
* Transactions

The initial discussion about design is covered in Apigee's eBook on Web API Design (http://pages.apigee.com/web-api-design-ebook.html). We focused on some of the aspects of API design represented on this cheat sheet.
Over 10,000 people have downloaded this book and it's our intention to update it with the material we will cover today. If you look at previous editions of this content, you can see that we focused on URI design and we hinted about versioning, errors, and client considerations. Next, we'd like to talk about designing messages, especially response messages, and take a look at some other things, like what's happening with hypermedia.

Modeling

How do we get started with an API? We talk about API modeling because a lot of folks have questions about what we can do with modeling. Before we get too deep into responses and the different methodologies that apply, let's spend a moment talking about what API modeling really is.

In the software world, we do modeling all the time. We can use the same concepts and apply them to your API. We recommend including all the stakeholders involved in your API team: marketing, business analysts, software engineers, key business people, and so on. Develop a language that everyone understands and use that language to build out your resources and parts of your messages, to keep everyone on the same page. Also, document this. Some people have UML or some other version of their own modeling language. That's totally up to you, as long as you continue to iterate.

The key point here is, don't go off on your own. As soon as you say, "I am going to solve this for the company all by myself," you'll put in hours of work only to realize that marketing doesn't use some of your terms, or that the company wants to go in another direction with its resources and API. So definitely include the whole API team; they are an important part of your strategy.

Security

Throughout the Apigee webcasts, we have talked about security, but we are going to touch on it again because it is an important element of API design.
* Twitter's streaming API has the option of using HTTP basic authentication, which is essentially the username and password Base64 encoded, so it's not encrypted (it's effectively cleartext).
* Amazon decided to use their own authorization/authentication, so they built on top of HTTP authorization to create an AWS authorization type.
* Google uses OAuth, specifically OAuth2 and bearer tokens.

Our recommendation is to go the Google route and use OAuth2. HTTP basic is only so secure, and lots of folks are moving away from it. And creating your own is a huge hassle. Typically, companies don't have the security resources to verify their design is rock solid. OAuth has been around the bases quite a bit and there are even more secure options coming in the future for access tokens.

Message Design

How do we approach message design? First of all, regardless of how you design the message, you have to decide what kind of format you want it to be in. We strongly suggest that you support both JSON and XML. There are folks who have XML and JSON libraries that they love and do great things with. Unless you have strong business requirements that would limit you to XML or JSON, support both. And, if you're not sure which one to make the default, pick JSON.

Now, we are going to talk about how you can model your response message, for both a single resource item and a collection of resources, by looking at how three large APIs do this.

Single Resource

What you see here is a response from Twitter, Foursquare, and Instagram for each of the primary objects from those systems. For example, with Twitter, you are looking at a tweet; with Foursquare, you are looking at a check-in; and, with Instagram, you are looking at a media image. What we are looking at is the top-level view of each of these. Twitter is verbose at the top level; Foursquare has included meta, notifications, and response; and Instagram has included meta and data.
If you open the attributes on Foursquare and Instagram, you can see that the core object is revealed at the heart of each of these. Of course, some objects are more verbose than others, but what can we learn from this as we go about designing our primary resource?

Here are the properties we think you ought to include: a property for meta, where you can add information about the object, and the actual object name itself, as the primary area where you hold data (e.g., the resource name), plus global data, like notifications. If you contrast that recommendation with the slide, Foursquare put their information under response and then under check-in. I like that they use check-in but I don't like that they have it buried under response, because they already separated the response from the rest of the metadata by having a meta tag there. Instagram just calls it data, which we consider a throwaway word.

We think it is really clever what Foursquare has done with notifications, because you can imagine that every response you get back from the API includes some kind of global state or information that's changed, which you can take advantage of by piggybacking that (notification) information on the API response. How that translates for most app developers is that, when using this API, they see notifications come back with every response, so they can pick up and act on this information. Then, they can place these notifications at the top of the screen or somewhere interesting in their app.

That's our guidance on working with resources. Take the best of what Foursquare and Instagram offer: include a metadata area, name your primary resource, and include notifications or other global information.

Collection of Resources

Now let's look at cases where you have more than one resource. Twitter, Foursquare, and Instagram all do basically the same thing, which is to throw the objects into an array of multiple objects.
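
The single-resource guidance above (a meta area, a named primary resource, and global notifications) might be sketched as follows. This is only an illustration: the `dog` resource, the contents of `meta`, and the shape of each notification are assumptions made for the example, not something any of the three APIs defines.

```python
import json


def single_resource_envelope(name, resource, notifications=None, status=200):
    """Wrap one resource in a meta / named-resource / notifications envelope."""
    return {
        "meta": {"status": status},            # information about the response (assumed contents)
        name: resource,                        # keyed by the resource name, not a generic "data"
        "notifications": notifications or [],  # global state piggybacked on every reply
    }


dog = {"id": 1234, "name": "Rover"}  # hypothetical resource
body = single_resource_envelope(
    "dog", dog, notifications=[{"type": "message", "text": "Welcome back"}]
)
print(json.dumps(body, indent=2))
```

Keying the payload by the resource name keeps the top level self-describing, while the `notifications` list gives the app one predictable place to look for global changes.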
Twitter, Foursquare, and Instagram all include the same information in the collection that they include in a single response. When you are thinking about how to make this decision for your API, the instinct is to provide a limited subset of the data in the collection and then, when the developer drills down on an item, provide the full verbosity, with the idea that you are preserving bandwidth. I have warmed up to the idea of including the exact same information in both the collection and the single response because there is a sense of predictability; it seems more deterministic, and you can count on it being there. The one exception is that Foursquare includes one extra property called source on the single resource that they don't include on the collection, though I am unsure what the point is.

So when you put this together, my thought is that you include the metadata, global notifications and other information that impacts the app, and a plural version of the primary resource (e.g., dogs). By including all the things you include in a single resource, you are off and running. If you have a ridiculous model, where the single resource is huge and you are worried about bandwidth, then you can trim down the resource that goes into the collection. And you can use partial selection to help you narrow that down. You can see previous versions of these webinar talks for information on partial selection.

Search Results

How do we represent search results in a response? Here is a quote from the Facebook API documentation: "Selecting results is not the same as searching." Sometimes it's easy to exaggerate why these are separate activities. Essentially, the idea with searching is that you can put some sort of natural language string into a query box, and it can search across resource types. So instead of limiting yourself to, for example, photos, you are saying give me anything about *** and it gives you results across all resource types.
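
The difference between selecting and searching can be illustrated with a small in-memory sketch. Everything here is hypothetical (the `CATALOG` data, the `select` and `search` helpers, and the substring matching used in place of a real search engine); it only shows that selecting filters a single resource type while searching runs one free-text query across all of them.

```python
# Hypothetical in-memory data standing in for two resource types.
CATALOG = {
    "dogs": [{"id": 1, "name": "Rover the retriever"}],
    "photos": [
        {"id": 7, "caption": "Rover at the beach"},
        {"id": 8, "caption": "Sunset"},
    ],
}


def select(resource_type, **filters):
    """Selecting: filter ONE resource type by exact attribute values."""
    return [
        item
        for item in CATALOG[resource_type]
        if all(item.get(key) == value for key, value in filters.items())
    ]


def search(q):
    """Searching: match a free-text string against EVERY resource type."""
    needle = q.lower()
    hits = []
    for resource_type, items in CATALOG.items():
        for item in items:
            text = " ".join(str(value) for value in item.values()).lower()
            if needle in text:
                hits.append({"type": resource_type, "item": item})
    return hits


print(select("photos", id=8))  # one resource type, structured filter
print(search("rover"))         # free text, results from several resource types
```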
Let's take a look at how three APIs approach this. We'll look at Bing, Google Custom Search, and Reddit. Without taking a lot of time to drill down into what is coming from the responses, just keep in mind that there is metadata about the results and the query, and there is the actual data itself that is being returned from the search.

It can get overwhelming quickly when trying to determine what to include in a search response and what to leave out. Our suggestion is to mostly follow Google Custom Search. The key pieces of metadata they include sit under a couple of different nested objects: a meta property, a limit and offset for pagination, the total results available, the query string that initiated the search, and (to show off how fast the search engine is) a search time. And, of course, the results property holds the array of all the actual results that were returned from your search.

Links

How do we represent links in the response? Links are cues inside of your message that point to a different resource. If you look at the Netflix API, they use something called Web Linking, which has an actual spec. You can see an href that is pointing to a URL, with a rel value (person.actor) and a title value (Elijah Wood).

The GitHub API was planned to look more like the Netflix API in beta but they changed it up a bit. Now the organization is represented as a URL, so if you look at a repository that belongs to an organization, you would go to that URL to get to that particular resource.

Our recommendation is to follow Netflix and the Web Linking spec: play around with it, and point developers to it when they interact with your API.

How do you help people understand whether they should put links in their API responses or not? What are the heuristics for including links? Links are helpful to developers. If you are looking at a response message and you see a relationship, let's say between a film and an actor, you can follow it from that message and continue your exploration of the API.
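
A link entry with href, rel, and title, as described above, might look like this when embedded in a JSON-style response. This is a sketch under assumptions: the `api.example.com` URLs, the film itself, and the `links_with_rel` helper are hypothetical and are not taken from the Netflix API.

```python
# Hypothetical film representation carrying link entries with href, rel and title.
film = {
    "title": "Example Film",
    "links": [
        {
            "href": "https://api.example.com/people/100637",  # assumed URL
            "rel": "person.actor",
            "title": "Elijah Wood",
        },
        {
            "href": "https://api.example.com/films/42",       # assumed URL
            "rel": "self",
            "title": "Example Film",
        },
    ],
}


def links_with_rel(resource, rel):
    """Return every link entry whose relation type equals `rel`."""
    return [link for link in resource.get("links", []) if link["rel"] == rel]


# A client explores by relation type instead of building URLs from IDs.
for link in links_with_rel(film, "person.actor"):
    print(link["title"], "->", link["href"])
```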
When using links, I find myself going back to the documentation less and less, so I can just focus on exploring an API. That's one huge benefit. To contrast this, I think the alternative is templates: instead of giving back the link, you give back the ID and the developer has to plug the ID into the template in the right spot. So if I saw people, I could do a GET on /people and plug that ID (100637) in at the end (/people/100637). That takes some context switching if you are doing exploratory work, and it takes some different coding structures to make sure what you are doing couples properly. Another thing that is correlated with this is going out of band: if you don't include the link in the response, then the developer is going to have to go out of band and use some documentation outside of the API to figure out what to do next. We all know that context switching has a big cost.

Actions

How do we represent actions in a response? When we say actions, we are talking about state transitions that are available from a particular resource. Let's look at some examples.

GitHub documents this, so we can see we are getting away from the message and going towards documentation to figure out how to do it. In this case, we are doing an update to a repository on GitHub. They document the HTTP method to use, the URI template to use, and the input fields.

I couldn't find a good example of this out in the wild in public APIs, but we have an (HTML) form-based API example. It is doing exactly what GitHub is doing and applying that inline. If you are familiar with HTML, you have a form tag, a method (there is a URL that you use to execute that method), and a list of fields.

We recommend the form-based approach. We are seeing this a lot more in the wild, and I think this is something we need to keep our eye on. Moving towards the form-based approach allows hidden fields, which are big in HTML forms today, and that carries over into your API.
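
A form-based action carried inline in a response could be sketched like this. The structure (a name, a method, an href, and a list of fields, one of them hidden) mirrors an HTML form; the specific field names, the `api.example.com` URL, and the `build_request` helper are assumptions for illustration only.

```python
# Hypothetical action as it might arrive inside a response message.
action = {
    "name": "update-repo",
    "method": "PATCH",
    "href": "https://api.example.com/repos/octo/hello",  # assumed URL
    "fields": [
        {"name": "description", "type": "text"},
        {"name": "revision", "type": "hidden", "value": "7"},  # server-supplied state
    ],
}


def build_request(action, user_input):
    """Turn an action plus user input into (method, href, data).

    Hidden fields are echoed back untouched; other fields are filled from
    user_input, and anything the action does not declare is ignored.
    """
    data = {}
    for field in action["fields"]:
        if field["type"] == "hidden":
            data[field["name"]] = field["value"]
        elif field["name"] in user_input:
            data[field["name"]] = user_input[field["name"]]
    return action["method"], action["href"], data


method, href, data = build_request(action, {"description": "New text", "extra": "x"})
print(method, href, data)
```

Because the method, target, and fields all arrive with the response, the client never has to look up a URI template in the documentation, and the hidden field lets the server round-trip state without the client understanding it.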
That means you have data going back and forth, keeping that stateless nature of your API, so the server gets what it needs without affecting your clients at all.

To make this come to life: when the action comes back in my HTTP response, my code is going to parse it. Then, I am going to introspect all the properties and build my form based on them. At the end, the method (e.g., PATCH) is going to be sent back to this href. If it's an HTML form, it's obvious it's going there, but if I build an iOS app, it's going to be an API call with that data.

Metadata

How do we represent metadata in a response? Flickr includes some metadata inline with an item, while Dropbox has an entire metadata resource. So, if you look at a file or a folder in Dropbox, there is a lot of data associated with it. We like having metadata in the response itself but, in some cases, it may make sense to have an entire meta resource, for example when your metadata becomes so significant that it should be modeled in your API.

Hypermedia

What can we learn from hypermedia types? A lot of the thinking around hypermedia addresses message design. For example, Atom/AtomPub is one of the first media types to be used with APIs; it is used with feeds, editing, articles in a list, etc. One important thing that AtomPub showed us is how to build a protocol around link relations. In the slide, you can see link rel="edit" in bold. They are using that standard link relation to communicate, to the developer or client that is consuming this API, where to go to edit this particular resource.

Then there is XHTML. The cool thing about valid XHTML is that it is valid XML, which means that parsers like it quite a bit. It's difficult to build an HTML parser because HTML allows a lot to go on, while XML is well formed. This slide is from a site called rstat.us. They are implementing an ALPS specification. ALPS gives meaning to classes, such as avatars, users, and user-list search.
In XHTML, we are including that information in the markup, so you really get that semantic connection. Some people might say this is a weird use of XHTML, but it exists. One gentleman on our api-craft group (groups.google.com/group/api-craft) has been active about pushing XHTML as a standard.

We also have a few JSON-based hypermedia types, like HAL. HAL is suited to augmenting your existing APIs. If you have a JSON message out there, you can add _links to it (which will hopefully not collide with any existing _links properties you have in your JSON object) and start adding some of those hypermedia elements. The link relation becomes the key in each key-value pair inside the _links object, so we have a link to self, next, and prev (previous), iterating through orders in the collection. So there are options for augmenting your existing API without breaking clients.

Collection+JSON is a media type that is trying to model AtomPub and what it does, except it's in JSON, so it is a little more palatable to some folks. It also allows for extensions. In it, you can see that queries are modeled, but there are also write templates you can use, so it's getting back to those actions and to editing the different resources that you have.

Then, there's Siren. Siren has a mixture of all of these. The goal is to take the best of what we know about hypermedia and make a media type that is suitable for APIs. Here, we see actions, just like what we talked about when we were using form-based APIs; entities, which imply a link relationship, like we talked about with the Web Linking spec; and also links for navigational purposes: next, previous, etc.

How do we accept binary data? This question comes up a lot on the api-craft group (groups.google.com/group/api-craft) because there's no single way to do it.
It also seems that folks only need to do this sometimes but, when they need to do it, it is very important that they get it done soon.

One method to use is multipart/form-data. This is exactly what your browser does when you are filling out a form and you need to submit a file. There is a Content-Disposition header you put in each part to name the field, and you have the option of assigning a different Content-Type to each part you send. Here, you can see there is a JPEG image being sent along. You can do the same thing in your API.

Another option is to inline a Base64-encoded version of the data. The downside is that, when you Base64 encode it, it gets a little bit bigger but, at least, you can put it in a textual representation. Your client does the Base64 encoding and sends it straight to the server, which decodes it.

The last option is a two-step process: create a new resource to hold the metadata you have, and then submit a file that attaches to it.

Opt for multipart/form-data first. There are a lot of tools out there to help you do this as a client. I think multipart/form-data is the way to go but, no matter which one you choose, just be consistent. Don't have three or four different ways in your API to upload a file. It's just not worth the headache.

How do we support caching? Specifically, what are the design implications around caching? There are three main ways to do caching. One is expiration: use a Cache-Control header to say the response is private, which means only the client can cache it and no proxy in the middle can, along with a max-age in seconds, for example 30 days (2592000). This is a direct message to the client saying hold on to this; it should be good for that length of time.

Another method is ETags. ETags are sometimes calculated as a checksum of a file or of different data elements that you have.
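
The expiration and ETag ideas can be sketched in a few lines. This is a minimal illustration, not a full HTTP server: the `cache_headers` and `respond` helpers are hypothetical, and SHA-256 is just one reasonable choice of checksum. The `If-None-Match` check shows how a server can use the ETag a client sends back on a later request.

```python
import hashlib

THIRTY_DAYS = 30 * 24 * 60 * 60  # 2592000 seconds


def cache_headers(body: bytes) -> dict:
    """Build expiration and validation headers for a response body."""
    etag = '"' + hashlib.sha256(body).hexdigest() + '"'  # checksum-style ETag
    return {
        "Cache-Control": f"private, max-age={THIRTY_DAYS}",  # clients only, no shared proxies
        "ETag": etag,
    }


def respond(body: bytes, if_none_match=None):
    """Return (status, headers, body), honoring an If-None-Match request header."""
    headers = cache_headers(body)
    if if_none_match == headers["ETag"]:
        return 304, headers, b""  # unchanged: no body is sent
    return 200, headers, body     # first request, or the representation changed


status, headers, _ = respond(b'{"dog": "Rover"}')
print(status, headers["Cache-Control"])
print(respond(b'{"dog": "Rover"}', if_none_match=headers["ETag"])[0])
```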
Some databases keep a revision number on each document they store, or you can manage your own keys for ETags, much like Twitter does with their own key generator. So, when you pull down a resource representation, it will have an ETag associated with it. If you want to make sure you have the latest, you can do a GET at that same URL and include the header If-None-Match with the ETag that you received. If that ETag no longer matches, you get the latest representation back; if it still matches, the server can answer 304 Not Modified with no body.

A third option is using the Last-Modified header as a hint of how fresh the representation is. If you get Last-Modified back with your representation, the next time you go to get it you can send an If-Modified-Since header with that same date, and you only get the representation back if it's newer. It's still a network call, but you don't get the full body of the message back.

There's no one particular caching method to use at all times; it really depends on your use cases. A common pitfall is that we don't think about the client. We think about caching in terms of saving resources and saving bandwidth on the server, but we don't think about it on the client. A lot of times the client doesn't need to make another network call at all, so saving that time is valuable.

Do we need a JavaScript API? We see this come up in a number of ways, where the API and maybe the domain system itself have some odd complexities around them. So maybe you do an API for payment processing and accept payments from credit cards, checking accounts, PayPal, Google Wallet, and so on. For your developer to learn how to use your API and then express how to use it through a user interface to the end user can be very difficult. Oftentimes there's pressure on the API to give hints about UI elements and how to use them. That's always a mistake. You want a clean separation between the API and the consumer; however, some people think of it as an SDK and some as an API.
There is so much understanding of JavaScript, at least at a rudimentary level, that this presents itself as an anomalous thing. When you have the resources, the question becomes: should we build the next API or the JavaScript API? I'm feeling more and more that, if you have an API and you want adoption and the ability to explain it, then you should go ahead and create a JavaScript library, SDK, or API, whatever you want to call it.

If you want an example of folks who are doing this well, then look to LinkedIn. Their JavaScript API does a great job of handling all of the authentication stuff so that developers don't have to get into the weeds around that. And the way that the JavaScript library maps onto the core API is transparent and fits nicely.

We recommend going ahead and tackling the JavaScript API. Should you do the iOS API and the Android API? Maybe, depending on your priorities, but if I had to do just one I would do the JavaScript.

What about posting data? We talked about actions and state changes to our representations. Let's talk about how you serialize data. What form does it take when we take the request and send it to the server? One option is to use application/x-www-form-urlencoded, which looks just like a query string. Another option is to take XML (application/xml) as the request body, but it can get verbose sometimes, as XML tends to do. Finally, there's JSON (application/json).

We favor the application/x-www-form-urlencoded version because it's simple and easy. It's not wrong to use the alternatives, but favor simplicity.

Transactions

How do we handle transactions? We are not going to do a deep dive on transactions; we are just going to hint at it. It's likely we will do a webcast on transactions down the road.

One way you could do this is to model the transaction directly in your API. For this example, we are going to use the concept of a shopping cart. A user would create a shopping cart with a POST and get a 201 Created back.
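
Modeling the transaction as a resource can be pictured with a small in-memory sketch of a shopping cart that is created, filled, and then committed. Only the 201 on creation comes from the discussion here; the `CartStore` class, the `/carts/{id}` path, and the 200 and 409 status codes are assumptions chosen for illustration.

```python
import itertools


class CartStore:
    """Hypothetical in-memory stand-in for a cart resource behind an API."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.carts = {}

    def create_cart(self):
        """POST /carts: start the transaction."""
        cart_id = next(self._ids)
        self.carts[cart_id] = {"items": [], "state": "open"}
        return 201, {"Location": f"/carts/{cart_id}"}, cart_id

    def add_item(self, cart_id, item):
        """Add an item while the transaction is still open."""
        cart = self.carts[cart_id]
        if cart["state"] != "open":
            return 409  # assumed status: the cart was already committed
        cart["items"].append(item)
        return 200

    def checkout(self, cart_id):
        """Commit the transaction; an empty or closed cart is rejected."""
        cart = self.carts[cart_id]
        if cart["state"] != "open" or not cart["items"]:
            return 409
        cart["state"] = "committed"
        return 200


store = CartStore()
status, headers, cart_id = store.create_cart()
print(status, headers["Location"])
print(store.add_item(cart_id, "mittens"), store.checkout(cart_id))
```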
Then, from the product catalog, they add some items (mittens) to their cart. And they check out. So you create the transaction, add items to it, and do the commit. Problems can happen, like getting errors back, but we'll talk more about that in the future.

Summary

In summary, we talked about the previous webcasts on URI design. We have an eBook on this and there is a cheat sheet in the slide deck to run you through our suggestions for design.

Start with API modeling. It's a good idea to create a common language and a common understanding among your API team.

Use the latest version of OAuth for security. It's the best model.

And keep in mind, when you are designing your response messages, that the message design is there for developers to parse those resources into objects in their language. It's a good idea to be terse, to get to the data quickly, while also including metadata, notifications, etc.

We gave a quick overview of hypermedia, with some of the different specs out there. There is a lot to think about regarding links and how you design your messages.

Finally, we touched briefly on transactions and we'll come back to this again in the future.