Creative Commons Attribution-Share Alike 3.0 United States License
For license details, see: http://creativecommons.org/licenses/by/3.0/deed.en_US
API Design, Third Edition
If you haven't checked it out already, we highly encourage you to check out the API Craft group on Google (groups.google.com/group/api-craft). It's filled with folks who are doing wonderful things around API design, implementation, and the business of APIs. This webcast, along with all the webcasts we've done in the past and will do in the future, is available at youtube.com/apigee. The slides are available on slideshare.net/apigee. I'm Brian Mulloy, you can find me on Twitter as @landlessness, and I'm Kevin Swiber, you can find me on Twitter as @kevinswiber.
Donald Norman talks a lot about design. There is a particular article in which he talks about the difference between simplicity and features. We always get pounded with the idea that we need more features, when we really need to keep it simple. Norman has a quote that I really like: "The real issue is about design: designing things that have the power required for the job while maintaining understandability, the feeling of control, and the pleasure of accomplishment." This quote isn't about simplicity, it's about design. We are going to give some pointers on how to take a complex process and make it usable for the end developer as well.
If you think about the process of building a cathedral, it may be complex, but to the people who use it, it has to be beautiful. That's our goal: a seamless design that is pleasant to use. On the agenda today, we are going to recap the previous API presentations and discuss:
* Modeling
* Security
* Message Design
* Hypermedia
* Transactions
The initial discussion about design is covered in Apigee's eBook on Web API Design (http://pages.apigee.com/web-api-design-ebook.html). We focused on some of the aspects of API design represented on this cheat sheet. Over 10,000 people have downloaded this book and it's our intention to update it with the material we will cover today. If you look at previous editions of this content, you can see that we focused on URI design and we hinted at versioning, errors, and client considerations. Next, we'd like to talk about designing messages, especially response messages, and take a look at some other things, like what's happening with hypermedia.
Modeling
How do we get started with an API?
We talk about API modeling because a lot of folks have questions about what we can do with modeling. Before we get too deep into responses and the different methodologies that apply, let's spend a moment talking about what API modeling really is.
In the software world, we do modeling all the time. We can use the same concepts and apply them to your API. We recommend including all the stakeholders in your API team: marketing, business analysts, software engineers, key business people, etc. Develop a language that everyone understands and use that language to build out your resources and the parts of your messages, to keep everyone on the same page. Also, document this. Some people use UML or their own modeling language. That's totally up to you, as long as you continue to iterate. The key point here is, don't go off on your own. As soon as you say, "I am going to solve this for the company all by myself," you'll put in hours of work only to realize that marketing doesn't use some of your terms, or that the company wants to go in another direction with its resources and API. So definitely include the API team; they are an important part of your strategy.
Security
Throughout Apigee webcasts, we have talked about security, but we are going to touch on it again because it is an important element of API design.
* Twitter's streaming API has the option of using HTTP Basic authentication, which is essentially the username and password Base64 encoded, so it's not encrypted (it's effectively cleartext).
* Amazon decided to use their own authorization/authentication, so they built on top of HTTP authorization to create an AWS authorization type.
* Google uses OAuth, specifically OAuth 2 and bearer tokens.
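To see why Base64 encoding is not encryption, here is a minimal Python sketch. The credentials (alice / s3cret) and the function names are made up for illustration; the only point is that anyone who can read the header can recover the password without any key.

```python
import base64

def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")

def decode_basic_auth(header_value):
    """Recover the credentials from the header; no key or secret is needed."""
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password

# Hypothetical credentials, for illustration only.
header = basic_auth_header("alice", "s3cret")
print(header)
print(decode_basic_auth(header))
```

This is why HTTP Basic should only ever travel over an encrypted connection.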
Our recommendation is to go the Google route and use OAuth 2. HTTP Basic is only so secure, and lots of folks are moving away from it. Creating your own scheme is a huge hassle, and companies typically don't have the security resources to verify that their design is rock solid. OAuth has been around the bases quite a bit, and there are even more secure options coming in the future for access tokens.
Message Design
How do we approach message design?
First of all, regardless of how you design the message, you have to decide what format you want it to be in. We strongly suggest that you support both JSON and XML. There are folks who have XML libraries they love and folks who have JSON libraries they love. Unless you have strong business requirements that would limit you to XML or JSON, support both. And, if you're not sure which one to make the default, pick JSON.
Now, we are going to talk about how you can model your response message, for both a single resource item and a collection of resources, by looking at how three large APIs do this.
Single Resource
What you see here is a response from Twitter, Foursquare, and Instagram for the primary object in each of those systems. For example, with Twitter, you are looking at a tweet; with Foursquare, you are looking at a check-in; and with Instagram, you are looking at a media image. What we are looking at is the top-level view of each of these. Twitter is verbose at the top level; Foursquare has included meta, notifications, and response; and Instagram has included meta and data. If you open the attributes on Foursquare and Instagram, you can see that the core object is revealed at the heart of each of these.
Of course, some objects are more verbose than others, but what can we learn from this as we go about designing our primary resource? Here is what we think you ought to do: include a property for meta, where you can add information about the object, and include a property named after the object itself as the primary area where you hold data (e.g., the resource name), along with global data, like notifications.
If you contrast that recommendation with the slide, Foursquare put their information under response and then under check-in. I like that they use check-in, but I don't like that they have it buried under response, because they already separated the response from the rest of the metadata by having a meta property there. Instagram just calls it data, which we consider a throwaway word.
We think what Foursquare has done with notifications is really clever. You can imagine that every response you get back from the API includes some kind of global state or information that's changed, and you can take advantage of that by piggybacking that (notification) information on the API response. How that translates for most app developers is that, when using this API, they see notifications come back with every response, so they can include and act on this information. Then, they can place these notifications at the top of the screen or somewhere interesting in their app.
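As a rough illustration of that shape, here is a minimal Python sketch of a single-resource response with a meta area, a notifications area, and the resource under its own name. The property values and the helper name single_resource_envelope are assumptions made up for this example, not the actual Foursquare or Instagram formats.

```python
import json

def single_resource_envelope(resource_name, resource, meta=None, notifications=None):
    """Wrap one resource under its own name, next to meta and notifications."""
    return {
        "meta": meta or {},
        "notifications": notifications or [],
        resource_name: resource,
    }

# Hypothetical check-in resource, for illustration only.
body = single_resource_envelope(
    "checkin",
    {"id": "1234", "venue": "Coffee Shop"},
    meta={"code": 200},
    notifications=[{"type": "message", "text": "You earned a badge"}],
)
print(json.dumps(body, indent=2))
```

Naming the key after the resource (checkin rather than data or response) tells the developer what they are holding without another layer of nesting.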
That's our guidance on working with resources. Take the best of what Foursquare and Instagram offer: include a metadata area, name your primary resource, and include notifications or other global information.
Collection of Resources
Now let's look at cases where you have more than one resource. Twitter, Foursquare, and Instagram all do basically the same thing, which is to put the objects into an array.
Twitter, Foursquare, and Instagram all include the same information in the collection that they include in a single response. When you are thinking about how to make this decision for your API, your first instinct may be to provide a limited subset of the data in the collection and then, when the client drills down on a single item, provide the full detail, with the idea that you are preserving bandwidth. But I have warmed up to the idea of including exactly the same information in both the collection and the single response, because there is a sense of predictability: it seems more deterministic, and you can count on the data being there.
The one exception is that Foursquare includes one extra property called source on the single resource that they don't include in the collection, though I am unsure what the point is. So when you put this together, my thought is that you include the metadata, global notifications, and other information that impacts the app, and put the resources under a plural version of the primary resource name (e.g., dogs). By including all the things you include in a single resource, you are off and running.
If you have a ridiculous model, where the single resource is huge and you are worried about bandwidth, then you can trim down the resource that goes into the collection. And you can use partial selection to help you narrow that down.
You can see previous versions of these webinar talks for information on partial selection.
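Here is a minimal Python sketch of that collection shape, with an optional partial selection. The fields argument, the dogs data, and the helper names are assumptions for illustration; how a client actually asks for a subset of fields is up to your API.

```python
def select_fields(resource, fields):
    """Keep only the requested fields; no selection keeps everything."""
    if not fields:
        return dict(resource)
    return {name: resource[name] for name in fields if name in resource}

def collection_envelope(plural_name, resources, fields=None, meta=None, notifications=None):
    """Same envelope as a single resource, with an array under the plural name."""
    return {
        "meta": meta or {},
        "notifications": notifications or [],
        plural_name: [select_fields(r, fields) for r in resources],
    }

# Hypothetical resources, for illustration only.
dogs = [
    {"id": 1, "name": "Rex", "color": "brown", "bio": "A very long biography"},
    {"id": 2, "name": "Fido", "color": "black", "bio": "Another long biography"},
]

full = collection_envelope("dogs", dogs)
trimmed = collection_envelope("dogs", dogs, fields=["id", "name"])
print(trimmed["dogs"])
```

By default each item in the collection is identical to the single resource; the trimmed form is only for the case where bandwidth really matters.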
Search Results
How do we represent search results in a response?
Here is a quote from the Facebook API documentation: "Selecting results is not the same as searching." Sometimes it's easy to exaggerate why these are separate activities. Essentially, the idea with searching is that you can type some sort of natural language string into a query box, and it can search across resource types. So instead of limiting the search to, for example, photos, you are saying "give me anything about ***" and it gives you results across all resource types.
Let's take a look at how three APIs approach this. We'll look at Bing, Google Custom Search, and Reddit.
Without taking a lot of time to drill down into what is coming back in the responses, just keep in mind that there is metadata about the results, there is the query, and there is the actual data being returned from the search. It can get overwhelming quickly when trying to determine what to include in a search response and what to leave out.
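To make those three parts concrete, here is a minimal Python sketch of a search response that keeps metadata about the query and the result set in one place and the matching data in another. The property names (meta, query, limit, offset, total_results, search_time, results) and the naive in-memory search are assumptions for illustration, not any provider's actual format.

```python
import time

def search(documents, query, limit=10, offset=0):
    """Naive substring search across every resource type."""
    started = time.perf_counter()
    needle = query.lower()
    matches = [doc for doc in documents if needle in doc["text"].lower()]
    return {
        "meta": {
            "query": query,
            "limit": limit,
            "offset": offset,
            "total_results": len(matches),
            "search_time": time.perf_counter() - started,
        },
        "results": matches[offset:offset + limit],
    }

# Hypothetical documents of mixed resource types.
documents = [
    {"type": "photo", "text": "Mittens in the snow"},
    {"type": "comment", "text": "Love those mittens"},
    {"type": "user", "text": "Kevin"},
]

response = search(documents, "mittens", limit=1)
print(response["meta"]["total_results"], len(response["results"]))
```

The client can page through the rest by repeating the query with a larger offset.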
Our suggestion is to mostly follow Google Custom Search. The key pieces of metadata they included are under a couple of different nested objects, including a meta property, a limit and offset for pagination, the total results available, the query string that initiated the search, and (to show off how fast the search engine is) a search time. And, of course, in the results property is the array of all the actual results that were returned from your search.
Links
How do we represent links in the response?
Links are cues inside of your message that point to a different resource. So, if you look at the Netflix API, they use something called a web link, which has an actual spec. You can see an href that points to a URL, a rel value (person.actor) that describes the relationship, and a title value (Elijah Wood). The GitHub API was planned to look more like the Netflix API while in beta, but they changed it up a bit. Now the organization is given as a URL, so if you look at a repository that belongs to an organization, you would go to that URL to get to that particular resource.
Our recommendation is to follow Netflix and the Web Linking spec. Play around with it, and point developers to it when they interact with your API.
How do you help people understand whether they should put links in their API responses or not? What are the heuristics for including links?
Links are helpful to developers. If you are looking at a response message and you see a relationship, let's say between a film and an actor, you can follow it from that message and continue your exploration of the API. When using links, I find myself going back to the documentation less and less, so I can just focus on exploring an API. That's one huge benefit. To contrast this, I think the alternative is templates: instead of giving back the link, you give back the ID, and the developer has to plug the ID into the template in the right spot.
So if I saw people, I could do a GET on /people and plug that ID (100637) in at the end (/people/100637). That takes some context switching if you are doing exploratory work, and it takes some different coding structures to make sure what you are doing couples properly. Another thing that is correlated with this is going out of band: if you don't include the link in the response, then the developer is going to have to go out of band and use some documentation outside of the API to figure out what to do next. We all know that context switching has a big cost.
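The difference between the two approaches can be sketched in a few lines of Python. The film objects, the actor_id property, and the function names are hypothetical; the link uses the href, rel, and title attributes described above.

```python
# Template approach: the response only carries an ID, and the client needs a
# URI template taken from out-of-band documentation to know where to go next.
PEOPLE_TEMPLATE = "/people/{id}"
film_with_id = {"title": "Example Film", "actor_id": 100637}

# Link approach: the response itself says where the related resource lives.
film_with_link = {
    "title": "Example Film",
    "links": [
        {"href": "/people/100637", "rel": "person.actor", "title": "Elijah Wood"},
    ],
}

def url_from_template(film):
    """The client must know both the template and which property to plug in."""
    return PEOPLE_TEMPLATE.format(id=film["actor_id"])

def urls_from_links(film, rel):
    """The client only needs to know the link relation it is interested in."""
    return [link["href"] for link in film["links"] if link["rel"] == rel]

print(url_from_template(film_with_id))
print(urls_from_links(film_with_link, "person.actor"))
```

Both paths end at the same URL, but only the template version breaks if the server later changes its URI layout.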
Actions
How do we represent actions in a response?
When we say actions, we are talking about state transitions that are available from a particular resource. Let's look at some examples.
GitHub documents this, so we can see we are getting away from the message and going towards documentation to figure out how to do this. In this case, we are doing an update to a repository on GitHub. They document the HTTP method to use, the URI template to use, and the input fields. I couldn't find a good example of this out in the wild in public APIs, but we have an (HTML) form-based API example. It is doing exactly what GitHub is doing and applying that inline. If you are familiar with HTML, you have a form tag, a method, a URL that you use to execute that method, and a list of fields.
We recommend the form-based approach. We are seeing this a lot more in the wild, and I think this is something we need to keep our eye on. Moving towards the form-based approach allows hidden fields, which are big in HTML forms today, to come into your API. That means you have data going back and forth, keeping the stateless nature of your API, to get what you need on the server without affecting your clients at all.
To make this come to life: when the action comes back in my HTTP response, my code is going to parse it. Then, I am going to introspect all the properties and build my form based on them. At the end, the method (e.g., PATCH) is going to go back to this href. If it's an HTML form, it's obvious it's going there, but if I build an iOS app, it's going to be an API call with that data.
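Here is a minimal Python sketch of that client-side flow. The action format (name, method, href, fields with name/type/value) and the example.com URL are assumptions for illustration; the point is that the client discovers the method, target, and fields from the message instead of from documentation.

```python
def build_request(action, user_input):
    """Turn a form-style action plus user input into a request description."""
    data = {}
    for field in action["fields"]:
        name = field["name"]
        if field.get("type") == "hidden":
            # Server-supplied state that the client simply echoes back.
            data[name] = field["value"]
        elif name in user_input:
            data[name] = user_input[name]
    return {"method": action["method"], "url": action["href"], "data": data}

# Hypothetical action as it might arrive inside a response message.
action = {
    "name": "update-repository",
    "method": "PATCH",
    "href": "https://api.example.com/repos/42",
    "fields": [
        {"name": "description", "type": "text"},
        {"name": "revision", "type": "hidden", "value": "7"},
    ],
}

request = build_request(action, {"description": "New description", "unknown": "x"})
print(request)
```

Input that the action does not declare (unknown here) is dropped, and the hidden revision value travels back to the server without the user ever seeing it.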
Metadata
How do we represent metadata in a response?
Flickr includes some metadata inline with an item, while Dropbox has an entire metadata resource. So, if you look at a file or a folder in Dropbox, there is a lot of data associated with it.
We like having metadata in the response itself but, in some cases, it may make sense to have an entire meta resource, for example, when your metadata becomes so significant that it should be modeled in your API.
Hypermedia
What can we learn from hypermedia types?
A lot of the thinking around hypermedia addresses message design. For example, Atom/AtomPub is one of the first media types to be used with APIs; it is used for feeds, editing, articles in a list, etc. One important thing that AtomPub showed us is how to build a protocol around link relations.
In the slide, you can see (link rel="edit" ...) in bold. They are using that standard link relation to communicate, to the developer or client that is consuming this API, where to go to edit this particular resource.
The cool thing about valid XHTML is that it is valid XML, which means that parsers like it quite a bit. It's difficult to build an HTML parser because HTML allows a lot to go on, while XML is well formed.
This slide is from a site called rstat.us. They are implementing the ALPS specification. ALPS gives meaning to classes, such as avatars, users, and user-list search. In XHTML, we are including that information so you really get that semantic connection. Some people might say this is a weird use of XHTML, but it exists. One gentleman on our api-craft group (groups.google.com/group/api-craft) has been active in pushing XHTML as a standard.
We also have a few JSON-based hypermedia types, like HAL. HAL is suited to augmenting your existing APIs. If you have a JSON message out there, you can add _links to it (which will hopefully not collide with any existing _links property in your JSON object) and start adding some of those hypermedia elements.
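As a minimal sketch of that augmentation in Python: the orders message and the URLs are made up for illustration, and the helper name add_hal_links is hypothetical. The _links object maps each link relation to an object with an href, which is the HAL convention.

```python
import copy

def add_hal_links(message, links):
    """Return a copy of an existing JSON message with a HAL-style _links object."""
    if "_links" in message:
        raise ValueError("message already has a _links property")
    augmented = copy.deepcopy(message)
    augmented["_links"] = {rel: {"href": href} for rel, href in links.items()}
    return augmented

# Hypothetical existing response for one page of an orders collection.
orders_page = {"total": 30, "orders": [{"id": 20}, {"id": 21}]}

with_links = add_hal_links(
    orders_page,
    {"self": "/orders?page=2", "next": "/orders?page=3", "prev": "/orders?page=1"},
)
print(with_links["_links"]["next"]["href"])
```

Existing clients that ignore unknown properties keep working, because nothing that was already in the message has moved.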
We see that the link relation becomes the key in each key-value pair inside the _links object, so we have links to self, next, and prev (previous) for iterating through orders in the collection. So there are options for augmenting your existing API without breaking clients.
Collection+JSON is a media type that is trying to model what AtomPub is doing, except in JSON, so it is a little more palatable to some folks. It also allows for extensions. In it, you can see that there are queries modeled, but there are also write templates you can use, so it's getting back to those actions and editing the different resources that you have.
Then, there's Siren. Siren has a mixture of all of these. The goal is to take the best of what we know about hypermedia and make a media type that is suitable for APIs. Here, we see actions, just like what we talked about with form-based APIs; entities, which imply a link relationship, like we talked about with the Web Linking spec; and links for navigational purposes (next, previous, etc.).
How do we accept binary data?
This question comes up a lot on the api-craft group (groups.google.com/group/api-craft) because there's no single way to do it. It also seems that folks only need to do this occasionally but, when they need to do it, it is very important that they get it done soon. One method to use is multipart/form-data. This is exactly what your browser does when you are filling out a form and you need to submit a file.
There is a Content-Disposition header you put in each part to name the field, followed by its value, and you have the option of assigning a different Content-Type to each part you send. Here, you can see there is a JPEG image being sent along. You can do the same thing in your API.
Another option is to inline a Base64-encoded version of it. The downside is that, when you Base64 encode it, it gets a little bit bigger, but at least you can put it in a textual representation. You can have your client do the Base64 encoding and send it straight to the server, which decodes it.
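These two options can be compared with a short Python sketch. The field names (caption, photo), the fixed boundary string, and the fake image bytes are assumptions for illustration; a real client would normally let an HTTP library build the multipart body and pick a random boundary.

```python
import base64
import json

CRLF = b"\r\n"

def encode_multipart(fields, files, boundary):
    """Build a multipart/form-data body by hand.

    fields: {name: text value}
    files:  {name: (filename, content type, raw bytes)}
    The boundary must not occur anywhere inside the data being sent.
    """
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"),
            b"",
            value.encode("utf-8"),
        ]
    for name, (filename, content_type, data) in files.items():
        lines += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode("utf-8"),
            f"Content-Type: {content_type}".encode("ascii"),
            b"",
            data,
        ]
    lines += [delimiter + b"--", b""]
    return CRLF.join(lines)

image = bytes(range(256)) * 4  # stand-in for real JPEG bytes

multipart_body = encode_multipart(
    {"caption": "My dog"},
    {"photo": ("dog.jpg", "image/jpeg", image)},
    boundary="example-boundary",
)

# Inline alternative: the same bytes, Base64 encoded inside a JSON document.
encoded_image = base64.b64encode(image).decode("ascii")
inline_body = json.dumps({"caption": "My dog", "photo": encoded_image}).encode("utf-8")

print(len(image), len(encoded_image))
```

The multipart body carries the raw bytes untouched, while the Base64 text is roughly a third larger than the original image.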
The last option is a two-step process: create a new resource to hold the metadata you have, then submit a file that attaches to it. Opt for multipart/form-data first. There are a lot of tools out there to help you do this as a client. I think multipart/form-data is the way to go but, no matter which one you choose, just be consistent. Don't have three or four different ways in your API to upload a file. It's just not worth the headache.
How do we support caching?
Specifically, what are the design implications around caching? There are three main ways to do caching.
One is expiration: use a Cache-Control header to say that the response is private, which means only the client can cache it and no proxy in the middle can, along with a max-age in seconds, for example 30 days (2592000). This is a direct message to the client saying hold on to this; it should be good for that length of time.
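The arithmetic behind that header value is easy to check with a small Python sketch. The helper names are made up for illustration, and the parser only handles the simple "name=value" directive form used here.

```python
from datetime import timedelta

def cache_control(max_age, private=True):
    """Build a Cache-Control header value from a timedelta."""
    visibility = "private" if private else "public"
    return f"{visibility}, max-age={int(max_age.total_seconds())}"

def max_age_seconds(header_value):
    """Read max-age back out of a Cache-Control value, or None if absent."""
    for directive in header_value.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age":
            return int(value)
    return None

header = cache_control(timedelta(days=30))
print("Cache-Control:", header)
```

A client that stored the response 10 days ago would compare its age (864000 seconds) against max_age_seconds(header) and skip the network call entirely.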
Another method is ETags. ETags are sometimes calculated as a checksum of a file or of the different data elements that you have. Some databases keep a revision number on each document, which you can use, or you can manage your own keys for ETags, much like Twitter does with their own key generator. So, when you pull down a resource representation, it will have an ETag associated with it. If you want to make sure you have the latest, you can do a GET on that same URL and include the header If-None-Match with the ETag that you received. If the ETag no longer matches, then you get the latest response; if it still matches, the server returns 304 Not Modified with no body.
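Here is a minimal Python sketch of that exchange, with the server side reduced to a single function. The checksum-based ETag, the dog resource, and the function names are assumptions for illustration, and the comparison is simplified to a single ETag (a real If-None-Match header can list several).

```python
import hashlib
import json

def make_etag(body):
    """ETag computed as a checksum of the representation bytes."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'

def handle_get(resource, if_none_match=None):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    body = json.dumps(resource, sort_keys=True).encode("utf-8")
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""  # client copy is still current
    return 200, {"ETag": etag}, body

# Hypothetical resource, for illustration only.
dog = {"id": 1, "name": "Rex"}

status, headers, body = handle_get(dog)
status_again, _, body_again = handle_get(dog, if_none_match=headers["ETag"])

dog["name"] = "Max"  # the resource changes on the server
status_changed, new_headers, _ = handle_get(dog, if_none_match=headers["ETag"])

print(status, status_again, status_changed)
```

The second request still costs a round trip, but the 304 response carries no body, so the client reuses what it already has.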
A third option is using the Last-Modified header as a hint of how fresh the representation is. If you get Last-Modified back with your representation, the next time you go to get it you can include an If-Modified-Since header with that same date, and you only get the representation back if it's newer. It's still a network call, but you don't get the full body of the message back when nothing has changed.
There's no one caching method to use at all times; it really depends on your use cases. A common pitfall is that we don't think about the client. We think about caching in terms of saving resources and bandwidth on the server, but we don't think about it on the client. A lot of times the client doesn't need to make another network call at all, and saving that time is valuable.
Do we need a JavaScript API?
We see this come up in a number of ways, where the API, and maybe the domain itself, has some odd complexities. Say you do an API for payment processing that accepts payments from credit cards, checking accounts, PayPal, Google Wallet, and so on. It can be very difficult for your developer to learn how to use your API and then express it through a user interface to the end user. Oftentimes there's pressure on the API to give hints about UI elements and how to use them. That's always a mistake.
You want a clean separation between the API and the consumer; however, some people think of it as an SDK and some as an API. There is so much understanding of JavaScript, at least at a rudimentary level, that this presents itself as an anomalous thing. When you have the resources, the question becomes: should we build the next API or the JavaScript API? I'm feeling more and more that, if you have an API and you want adoption and the ability to explain it, then go ahead and create a JavaScript library, SDK, or API, whatever you want to call it.
If you want an example of folks who are doing this well, then look to LinkedIn. Their JavaScript API does a great job of handling all of the authentication stuff so that developers don't have to get into the weeds around that. And the way that the JavaScript library maps onto the core API is transparent, and it fits nicely.
We recommend going ahead and tackling the JavaScript API. Should you do the iOS API and the Android API? Maybe, or even definitely, depending on your priorities, but if I had to do just one, I would do the JavaScript one.
What about posting data?
We talked about actions and state changes to our representations. Let's talk about how you serialize data. What form does it take when we build the request and send it to the server?
One option is to use application/x-www-form-urlencoded, which looks just like a query string.
Another option is to take XML (application/xml) as the request body, but it can get verbose, as XML tends to do.
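To compare these two serializations, here is a minimal Python sketch using only the standard library. The field names and values and the item element name are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlencode

# Hypothetical fields to send, for illustration only.
fields = {"name": "mittens", "color": "red & white", "quantity": 2}

# application/x-www-form-urlencoded: same syntax as a URL query string.
form_body = urlencode(fields)

# application/xml: one element per field under a made-up <item> root.
root = ET.Element("item")
for key, value in fields.items():
    ET.SubElement(root, key).text = str(value)
xml_body = ET.tostring(root, encoding="unicode")

print(form_body)
print(xml_body)

# The server side can parse the form body with the same library.
parsed = parse_qs(form_body)
```

Note that everything comes back from parse_qs as lists of strings, so the server still has to convert quantity to a number itself.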
Finally, there's JSON (application/json). We favor the application/x-www-form-urlencoded version because it's simple and easy. It's not wrong to use the alternatives, but favor simplicity.
Transactions
How do we handle transactions?
We are not going to do a deep dive on transactions; we are just going to touch on them. It's likely we will do a webcast on transactions down the road.
One way you could do this is to model the transaction directly in your API. For this example, we are going to use the concept of a shopping cart. A user would create a shopping cart with a POST and get a 201 Created back. Then, from the product catalog, they add some items (mittens) to their cart. And they check out.
So: create the transaction, add items to it, and do the commit. Problems can happen, like getting errors back, but we'll talk more about that in the future.
Summary
In summary, we recapped the previous webcasts on URI design. We have an eBook on this, and there is a cheat sheet in the slide deck to run you through our suggestions for design.
Start with API modeling. It's a good idea to create a common language and a common understanding among your API team. Use the latest version of OAuth for security. It's the best model. And keep in mind, when you are designing your response messages, that developers will parse those resources into objects in their own language. It's a good idea to be terse so they can get to the data quickly, while also including metadata, notifications, etc.
We gave a quick overview of hypermedia, with some of the different specs out there. There is a lot to think about regarding links and how you design your messages. Finally, we touched briefly on transactions and we'll come back to this again in the future.