>>Joel Webber: Good afternoon. My name is Joel Webber on the Google Web Toolkit
team.
>>Adam Schuck: And hi. I'm Adam Schuck from Sydney, Australia.
I work on Google Wave.
>>Joel Webber: We're going to talk a little bit today about performance in GWT as
it applies to the lessons learned by the Google Wave team.
Also, don't forget, there is a live Wave going on at the following bit.ly link if you're
not already there. There's been a little turbulence there, but
it should be sorted out now.
>>Adam Schuck: Okay. So you're all sitting in the audience today, and you're obviously
interested in performance, but it's important to know, why does it matter?
Why architect for performance? Well, it turns out, speed of your application
really matters. Users want a fast application.
And we've certainly seen this with Google.com. We've learned the faster it is, the more people
use it. Typically, we talk in three powers of ten
in terms of time limits that matter when you're writing a web application; point one of a
second, one second, and ten seconds. And these are rough guidelines of important
times that you want to keep. "Point one of a second" is what you want to
do if you want the application to feel instantaneous. So, if your user is typing, everything should
happen in less than a tenth of a second. "One second" is what you want if the user's
flow of thought should stay uninterrupted. So, if I'm clicking some buttons and anything
takes longer than one second, I might get a little bit distracted.
One second's important there. And "ten seconds" is what you really want
to do if you want to keep the user's attention focused.
So, if anything in your application takes longer than ten seconds, "I've already Alt-tabbed;
I'm checking Facebook; watching videos on YouTube. Oh, it's finally loaded.
What was I doing again?" Ten seconds is a really good guideline that, if anything in
your application takes longer than that…what was I talking about again?
>>Joel Webber: Exactly.
>>Adam Schuck: Exactly.
>>Joel Webber: So, we're going to talk about this, because I mentioned earlier, from two
perspectives: myself working on GWT, perspective of the tool builder.
That is, what can we do to make GWT faster, either through the compiler or through its
libraries, so that your app is faster with you doing little or no work?
>>Adam Schuck: And I'll be presenting from the Google Wave perspective.
Google Wave, if you weren't already aware, is built in GWT.
And we put a lot of work into making our application faster.
We know we have a long way to go, but we're a lot faster than we used to be.
And I'll give a demo of that in a second. But, the reason that the two of us are here
on stage today is because the GWT team and the Wave team work very closely together.
We both absolutely believe in the importance of speed.
And, as a result, whenever we discover something, we share it with the GWT team and vice versa.
So we're trying to help…together… make GWT better, so you don't have to worry about
these problems. And for those of you who haven't actually
seen Google Wave since Google I/O last year, maybe you haven't dusted off the accounts
you received. Just a quick reminder: hands up in the crowd
if you've ever opened a really large wave in Google Wave.
Okay. Keep your arm up if you found it slow. Wow, people who didn't even have their hand
up before put their hands up for that. [laughing]
>>Joel Webber: That's because they never succeeded in opening the Wave.
>>Adam Schuck: Bingo, okay. I should have said "attempted."
Okay. You all know the problem. I am about to do a magic trick, which will
surprise some of you; I'm going to try and open one.
And, of course, I've made an offering to the demo gods today.
>>Joel Webber: My fingers are crossed.
>>Adam Schuck: This wave has 370 messages in it, and I'm going to click -- 3-2-1.
[click] That was less than a second, right?
>>Joel Webber: Within a second. It's close.
>>Adam Schuck: All right. Put your hand back up if that's not fast enough for you.
Good. We're going to make it faster for all of you.
It's not fast enough for me. I forgot to put my hand up as well.
So, previously, a wave like that would actually grind your browser to a halt.
It would take a long time before you could even see the content.
So we're going to talk a little bit about some of the tricks that we've been using to
try and make Wave faster.
>>Joel Webber: So we're going to talk about this from four perspectives.
There's really four major parts you need to consider when you're trying to make your app
fast. First is "start up", right? We're talking
about web apps. Web apps need to start fast.
That's what we expect of them. When they don't, that's bad.
And people typically won't stay with an app that doesn't start fast, especially if they're
new to it. "Fetching data"… these are web applications.
Therefore, they're distributed in practically all real instances.
Distributed networks are typically slow. The Internet certainly is often quite
slow. And you don't want to fetch more data than
you need or too often. "Rendering": we're talking about the web browser.
The web browser has pretty unusual performance characteristics, and you need to get data
on the screen fast. Then, "user interactions," these are the things
that fall into that 100 millisecond or point one of a second limit that Adam talked about
earlier that, if you don't fall into that range, your app feels sluggish.
You may have heard the word "sluggish" a lot if you were here this morning for Kelly Norton's
talk. People think sluggish apps feel bad.
They think they're low quality. They just don't feel right.
And then, Adam's going to talk a little bit about performance measurement.
That is, once you've got your app as fast as you want it to be, or at least under some
reasonable threshold, and you've released it, you want to keep it fast.
When you make it faster, you want to keep it faster.
So you want to catch regressions.
>>Adam Schuck: Okay. So we mentioned four main areas where your application can be faster.
The first of these is "start up." You want your application to load quickly.
And by "quickly," we mean, much less time than ten seconds. Oops.
>>Joel Webber: Sorry.
>>Adam Schuck: [chuckle] We're competing. "Concurrency control presently not as good
as Google Wave." Who would have thought? Okay, start up.
Where does the time go when you're starting up your application?
There's four main places. Fetching the script. So, if I have a really bad dial-up network
connection, it can take a lot of time to download the JavaScript.
Secondly, "evaluating the script." So, if I have a slow computer, it might take
a while for the browser to evaluate this JavaScript that I've just downloaded.
"Fetching the initial data." So now, "Okay, we've started up the application.
We've evaluated it. Now, we need to know what, in Wave's case,
what waves do they have in their inbox? What are their contacts? What are their folders?
Etc. And once we've done that, of course, "building
the application structure." So we need to put all the panels in place,
wire everything up, get a communication channel running.
All of these four things in your start-up sequence can add up to quite a slow start-up.
Let's look at what a typical out-of-the-box GWT start-up sequence looks like.
We're looking at four roundtrips. Four roundtrips, which, for people who live
in Australia, can actually add up. There we go. Another Aussie, hurray! [laughing]
It can really add up if you have a high-latency connection.
Now, the four roundtrips. First one is, we download the host page.
So this is initial HTML creates the structure. Secondly, GWT, we download what's called a
"selection script." This script in the browser figures out what
browser are they using, what language do they want to talk, etc.
Now, we know exactly what version of your application we want to download.
So now I know, "Okay. Let's download the French Safari version of Google Wave."
Once we've done that, then we actually fetch that version of JavaScript.
We evaluate it. So, if there's a lot of JavaScript, it takes
a while to evaluate, especially on a slow machine. Now that we've finally done that, we've got
the communication channel up and running. Only now can we start fetching the actual
data, so, the inbox, the contacts, etc. And that, of course, requires a roundtrip, not
just to your front end, but all the way through your back end.
This adds up to quite a bit of time. Turns out, we can do a lot better than this.
We can actually reduce the four roundtrips down to two roundtrips.
And we can actually do one, but I'll tell you why that's not actually better.
So, the first optimization we make is, instead of downloading the selection script and, in
the browser, figuring out what version of your GWT application to run, we decide from
the http headers of your first request which version we should serve.
So, the server, the front end, can say, "Aha! We know from the user agent they're using
Safari." And we can use various http headers to figure
out, "Aha! They're in France." Maybe we know their user ID from the cookies,
etc., so we know they prefer Swiss French, and so on.
So we save one round trip there.
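As a sketch of that server-side selection, here is roughly what a front end could do with the User-Agent and Accept-Language headers. The permutation names and file-naming scheme here are illustrative, not Wave's actual code:

```java
// Sketch of server-side permutation selection, replacing the client-side
// "selection script" roundtrip. The browser/locale names are illustrative.
public class PermutationSelector {

    /** Picks a compiled GWT permutation file from request headers. */
    public static String select(String userAgent, String acceptLanguage) {
        String browser = "gecko1_8"; // default permutation if nothing matches
        String ua = userAgent == null ? "" : userAgent.toLowerCase();
        if (ua.contains("webkit") || ua.contains("safari")) {
            browser = "safari";
        } else if (ua.contains("msie")) {
            browser = "ie8";
        } else if (ua.contains("opera")) {
            browser = "opera";
        }

        String locale = "en";
        if (acceptLanguage != null && !acceptLanguage.isEmpty()) {
            // Take the first (most preferred) tag, e.g. "fr-CH,fr;q=0.9" -> "fr-CH"
            locale = acceptLanguage.split("[,;]")[0].trim();
        }
        return browser + "_" + locale + ".cache.js";
    }
}
```

With this in place, the host page can reference the right script directly and the browser-side selection roundtrip disappears.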
And second to that, we also kick off downloading the initial data in parallel.
So, instead of waiting for everything to start up, we start downloading. While we're fetching
the JavaScript, we start fetching the initial data. Note that we didn't bundle the JavaScript
into that very first download. And the reason we don't want to do that is
because your application will typically change on a weekly basis, whereas that initial HTML
page will probably change on every request. And so, the JavaScript caches a lot better
this way. And, in this particular example, you see the
initial data actually returns to the browser before the JavaScript is up and running, which
is great. It means the second the JavaScript is ready,
we can show a completely loaded application. So that's using two roundtrips.
And note that we don't want to have to wait for the communication channel, the RPC channel,
to be up and running. So we send down the data to the browser in
a lightweight format, perhaps Json. So that's how we can really speed things up
in your start-up sequence.
>>Joel Webber: So, I'm going to tell you a little bit about code splitting.
Now, Adam just told you how you can reduce the number of http requests intrinsic to your
start-up process. That's certainly going to make a good bit
of difference. But then, we have another problem. We have
applications. People tend to like to add more features to
applications, and applications that have more features tend to get bigger.
As you may know, GWT is a monolithic compiler, tends to compile all of your app as one big
ball of JavaScript that you then have to download to the client.
Well, if that thing is growing without bound, you're eventually going to hit a brick wall.
It's going to get really, really slow on start-up. So, with the Wave team's input and a strong
desire, we came up with --
>>Adam Schuck: -- very strong. [laughing]
>>Joel Webber: -- very strong desire. We came up with what we called code splitting.
This is a compiler feature. It is semi-automated, but it is driven by
the compiler: a way of splitting your application into multiple chunks automatically.
You have to define "split points." These are the places where the compiler is allowed
to split the code into fragments, so that the user
takes only a small hitch at each one.
But then, from there on out, it optimizes everything automatically.
Your goal is to have one fragment that contains no more than is strictly needed for your
initial view: whatever's visible in your initial page or initial set-up of your application.
And to show a quick demo of that, we're going to go to the --
to go to the --
>>Adam Schuck: --showcase.
>>Joel Webber: -- simple GWT showcase example. And this is a bit of a contrived example,
because the showcase is naturally divided into a million little pages. Well, 30 or 40,
let's say. And what Adam's showing here is the network
graph from Chrome where initially we see this Indi 5 blob here is the initial page, which
contains really common code and enough code to get the first page displayed. In this case,
that one checkbox sample. From there on out, it will fetch subsequent
fragments. And this app has been really aggressively
split up more to demonstrate the point. You wouldn't probably split it up quite this
fine-grained in practice. But the idea is that, anything that's not
needed immediately gets fetched later. So, in this case, there are a few things that
get fetched fairly early on. So, what Adam's going to do is, click another
spot in here that will show us a different sample that was not downloaded, that was not
part of original code. And, if you watch the network graph on the
right as he does this, you'll see there is a --
>>Adam Schuck: -- a PNG. I haven't actually clicked "Basic Text" yet.
So let's click this.
>>Joel Webber: Okay, just down at the bottom, we fetched two new fragments.
Once again, because this is slightly over-split.
But again, it's just a demonstration. This allows you to build an app that is essentially
of arbitrary complexity. As long as you can wait on the compile times;
yeah, it's been known that it can get a little slow, but we'd rather you wait than your users
wait. So you can build really large apps, really
complex apps, and still keep the start-up very fast.
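In GWT code, a split point is expressed with GWT.runAsync(), and the compiler cuts the program at those calls. As a library-free sketch of the behavior behind that, assuming nothing about GWT's real internals, the pattern is just "fetch the fragment on first use, then reuse it":

```java
import java.util.function.Supplier;

// Model of a split point: the "download" is simulated by a Supplier so the
// load-once behavior is visible. In real GWT this is GWT.runAsync() and the
// browser fetches a generated JavaScript fragment instead.
public class SplitPoint {
    private final Supplier<Runnable> download; // stands in for the network fetch
    private Runnable fragment;                 // cached after the first use
    public int fetches = 0;                    // how many times we hit the network

    public SplitPoint(Supplier<Runnable> download) {
        this.download = download;
    }

    /** Fetch the fragment on first use, then run the code it contains. */
    public void runAsync() {
        if (fragment == null) {
            fetches++;
            fragment = download.get();
        }
        fragment.run();
    }
}
```

Each feature behind a split point costs one extra fetch the first time the user reaches it, and nothing afterwards; the initial download shrinks by everything moved behind such points.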
>>Adam Schuck: Okay, and from the Wave point of view, as Joel mentioned, we were adamant
that we didn't want to have to stop adding new features to Wave--
>>Joel Webber: --and they were adding a lot.
>>Adam Schuck: We were adding a lot of features. [laughing]
Maybe we should have stopped. But, we kept adding features, and the code
just kept growing. And we really felt it's important that users
only see the exact code you need for the exact features that we're showing upfront.
We heavily optimized our start-up sequence so that we show a loaded inbox first.
We believe perceived latency is the most important thing to optimize for.
Users should get the impression the application is loading fast.
And in our opinion, we decided, we want people to be able to read their new content as quickly
as possible. Perhaps we'll change that to optimize for
showing individual waves later. But, we've currently structured our code to
optimize for a full inbox. And if you're trying to do something like this,
trying to reduce the amount of code that needs to be downloaded to start your
application, the first question you're going to ask is, "What is actually in that initial
download?" And there's a really neat tool created by the GWT team to answer that question,
[chuckle] of course, called "compile reports." And I've got one lying around right here.
So here's a compile report for Google Wave. And you'll actually see the initial download
size is quite large. We actually found this gives better performance
in our particular case. To Joel's point of download everything you
need to see what you want the user to see upfront.
And we're talking roughly three megabytes full-code size, less than a meg initial download.
That's huge, but bear in mind that's pre-compression. And GWT code does compress very well.
And you can click on this link here, "report," and see exactly what is in your initial download.
So I can see here java.util is the number
it contributes 37k. I can see there's a search panel.
That's a relief. We're trying to show the search panel, the
inbox, as quickly as possible. And we can see, aha! You know, the contacts
panel is not loaded. That is part of the contacts panel, but it's
just some resources. It's not the full contacts panel.
So we can see how much these things contribute. And your next question after, "What is in
the initial download?" will be, "Why is it in the initial download?" And to give an example,
if I click on "framed panel" here, it shows
me exactly what Java classes are in the initial download, and how big they are,
and the exact call stack from your onModuleLoad
all the way through to the class. So this is a really useful tool for identifying
"why", not just "what", is in your initial download.
Also, to talk a little bit more about the Wave start-up sequence, we mentioned it's important
to fetch your initial data as quickly as possible, in parallel, in fact, with loading your JavaScript.
So we have a very good idea of what information the user wants in their initial start-up of
the Google Wave Client. We anticipate this on the servers.
We use chunked transfer encoding on the server, and we start sending
small amounts of information down to the browser as it becomes available.
So we have the "contacts" data, the "inbox" data, the "wave data," and we
could send it all down as we get it. But note that you actually block the client
if it has to evaluate that JavaScript. So rather, what we do, because we're
optimizing for an inbox loading quickly, is we send that particular piece of data quick,
and then, everything else as we get it. As mentioned, we code split to optimize for
an initial download that shows an inbox. And overall, our results over the last six months
that we've been thinking about this problem: we've made the median load time two seconds,
where it used to be five seconds; and for our users in Africa or Australia, the 90th percentile
used to be 16 seconds, and now it's 7 seconds. These are ballparks.
And you can see now, we are within that 10 second time limit that I mentioned as being
so important. And, of course, we're doing more work to make
this faster. We want to reduce that code size.
>>Joel Webber: So earlier, Adam mentioned optimization of the start-up process; reducing
http requests by doing script selection on the server.
This is one thing that was built by another team within Google that's used by Wave and
a number of other products. And we're going to be open sourcing this as
soon as time permits, so you won't have to roll your own custom linker in order to do this
yourself. And we'll bring that into GWT proper so you
can all take advantage of it. So, second point, fetching data.
Again, as I mentioned, with fetching data we're talking about going over the Internet.
The Internet is slow; or at least, it's always slower than you want it to be.
There's really two things this boils down to.
There is, fetching data you don't need, that is, fetching more data than is needed to actually
display what the user is looking for and needs to see.
And there's fetching it in too many http requests. The latter is a subtle point that, if you
fetch a small amount of data spread out over a lot of http requests, then it's still going
to be slow, because the http overhead is eating you alive.
So, you have to consider both of these things and have different strategies for dealing
with them. First one is a little subtle.
It's not as simple to deal with. If you've ever built client-server applications,
like LAN applications, say back in the early 90s when everyone was doing that, you tended
to sort of have a 2-tiered process, right? You have a database in the back end; you have
a really chatty communication on the front end.
You munge all the data and display it in the UI.
That was at least very common. It's fairly easy to do.
Problem with that is that it's chatty. You download a lot of data you don't need
typically, and that doesn't work so well for the Internet.
So, what we found really works well as a strategy is to design your RPC system, be it GWT RPC, JSON
fetches, or whatever, to support the UI directly. Make sure the UI can ask questions of the
server in terms that will give it precisely what it needs and no more.
Don't fetch anything you don't need.
Be careful of types. If you're used to the GWT RPC system, it's really nice, because it's
automated. It will walk your reference graph and figure out how to download all the objects
referenced by another object and so forth, transitively. But that can also lead to situations
where you download stuff you didn't realize, because it's being so helpful that it sends
a bunch of stuff to satisfy all the references. You have to be careful with that.
So watch your payload size, and watch which objects you're sending out over the wire.
Sometimes it makes sense to use specialized DTOs to the client, that may not be the same
ones you use in the back end, so that you don't trigger this problem.
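As an illustration of that last point, here is a hypothetical trimmed DTO; the field names and the ServerWave entity are invented for the example, not Wave's real types:

```java
// A client-facing DTO carrying only what the UI renders. Serializing the
// full server-side entity would drag its whole reference graph (participants,
// deltas, attachments) over the wire transitively.
public class WaveSummaryDto {
    public final String waveId;
    public final String title;
    public final int unreadCount;

    public WaveSummaryDto(String waveId, String title, int unreadCount) {
        this.waveId = waveId;
        this.title = title;
        this.unreadCount = unreadCount;
    }

    /** Flatten the heavyweight entity at the server boundary. */
    public static WaveSummaryDto from(ServerWave wave) {
        return new WaveSummaryDto(wave.id, wave.title, wave.unread);
    }

    // Stand-in for the back-end entity with a heavyweight reference graph.
    public static class ServerWave {
        public String id;
        public String title;
        public int unread;
        public Object participants; // the UI's inbox view never needs this
    }
}
```

The cost of the extra class is repaid by a payload that contains exactly what the inbox view displays and nothing reachable only through the back-end graph.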
The other issue, as I mentioned before, is the http requests.
So, this is a super-simplified, imagined version of the Wave client-server
protocol. It doesn't actually work this way, for reasons
that will become clear in a moment. Now, you might say, "Well, I'll just have
each piece of my UI talk to an RPC interface, get data that it needs from the back end."
This is written in terms of the GWT RPC semantics roughly, but the same thing applies no matter
what you're using: JSON, XML, or what have you.
The problem with this is that, very often, a single user interaction leads to a bunch
of requests. And the reason this happens is that, say,
imagine I'm in Wave: I click on a wave. I've got to fetch the header, to display that.
I've got to fetch the wave itself, so I can display that.
But also, because I want to keep the rest of the UI up to date, I use this opportunity
to also go fetch the status of my contacts. I have to fetch an updated version
of the inbox, perhaps as deltas, or what have you.
You get the idea. I'm fetching a bunch of stuff from one user
interaction. Well, this would translate into four http requests, and they can get serialized
because many browsers have a 2-connection limit.
Even the modern browsers will fall back to a 2-connection limit on a slow connection so they don't
saturate the link. So that ends up being really, really bad in
practice. Your UI really has two problems: one, it's slow.
Two, it feels really slow, because these things come in at different rates.
So, this pops in, that pops in, that pops in, that pops in.
The whole UI kind of jiggles and feels awkward and sluggish.
The easy way to fix this, well, relatively easy, is to actually restructure your RPC
interface such that you break it into "request objects" and "response objects."
You build a single interface, at least for the purpose of most of the UI, that allows
you to batch all of those requests.
So, all the different parts of the UI that need information will populate an array or
list of request objects; that goes to the server; the server processes all those and
sends back the response objects; and the piece of code doing the transfer goes off and farms
those responses out to the appropriate parts of the UI.
And a really simple way of doing this is simply
to use a deferred command. If you're more familiar with JavaScript, it's
equivalent to using "setTimeout(0)," more or less.
And that allows you to aggregate your requests based on a single user interaction.
The idea is that a deferred command will run after the current event handler is done, which
almost invariably corresponds to a single user action.
And when that thing fires, you take all the requests that have been batched up by different
parts of the UI, and it fires them off to the server to be handled.
That allows the server to optimize things, so like hot database connections and so forth
can be used for all these different requests being made.
And it allows the different parts of the UI to be written independently, because they
all just use this simple add request interface to get their data.
They don't need to know about each other, which keeps your code simpler.
And that can lead to, again, a single http request for user action, which is your goal.
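A minimal sketch of that batching pattern, with the server reduced to a function and all names invented for the example: UI parts queue requests independently during one event handler, and a single deferred flush makes the one roundtrip.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Model of request batching. In a real GWT app, flush() would be invoked by
// a deferred command (the equivalent of setTimeout(0)), so it runs once after
// the current event handler has let every part of the UI queue its request.
public class RequestBatcher {
    private final List<String> pending = new ArrayList<>();
    private final Function<List<String>, List<String>> server; // one RPC endpoint
    public int roundtrips = 0;

    public RequestBatcher(Function<List<String>, List<String>> server) {
        this.server = server;
    }

    /** Each part of the UI adds its request without knowing about the others. */
    public void addRequest(String request) {
        pending.add(request);
    }

    /** One roundtrip carries every queued request; responses come back together. */
    public List<String> flush() {
        roundtrips++;
        List<String> responses = server.apply(new ArrayList<>(pending));
        pending.clear();
        return responses;
    }
}
```

Because each UI part only ever calls addRequest(), the parts stay decoupled from each other while the wire still sees a single request per user action.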
Okay, third part, rendering. Again, we're talking about a web browser here,
unusual performance characteristic, even the best of them.
And when your rendering is slow, you really run into two problems.
One, you run into all the problems that come with being slow. You just don't want your
app to be slow; your users won't like it.
But also, on the rendering side, on the client side, you block your UI thread.
And that's particularly bad because, on most browsers, that means the user can't even
switch tabs or windows. And they can't do anything. They can't click
the back button; they can't close the window until the UI thread is free.
So if you have some rendering that takes several seconds, which is not unheard of if
you're not careful, then it's a really, really bad user experience.
And that's what Kelly Norton's talk was referring to, the sluggishness; really, really important
thing to avoid. This happens really in two places:
when you're creating widgets, and when you're getting data on to the screen.
"Creating widgets" really means creating the scaffolding, a place where you're going to
put all your data, and then, there's actually populating the data.
And really, when I say "widgets," I'm specifically referring to GWT widgets.
The same thing applies for JavaScript widgets and so forth, but I'll speak in terms of widgets.
And there's two cases where you end up with widgets that you don't need.
Either you create them too early, that is, you've created widgets that are not visible
yet, and therefore, they were a waste of time. You could have amortized that and created
them when they were needed. Or you've created ones you didn't need at
all. So, creating widgets that do simple
things like layout, and I'll go into more detail on how you can deal with this later, is often
unnecessary. You can often just get away with standard
HTML. It's often easier as well.
And a simple case, which I mentioned earlier: creating widgets
too early. So imagine the simple case of a tab panel.
Got five tabs. You've got one thing that's visible, four
things by definition not visible initially, four things that I probably shouldn't have
created yet. A really simple pattern for this.
If I just say "lazy initialization," you can probably guess what I mean and figure out
how to do it. There is a lazy panel that is a simple way
of doing this that's built-in, but the pattern is very, very easy to imagine.
Just wait until something is shown, then create the widget.
And, in fact, if there's a large hierarchy of widgets, this can make a huge difference.
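As a sketch of that lazy pattern (GWT ships a LazyPanel widget that does this; the generic stand-in below is invented to show just the mechanism): the wrapped widget is only built the first time it is shown.

```java
import java.util.function.Supplier;

// Model of the LazyPanel pattern: construction of the child is deferred until
// the container first becomes visible, e.g. when its tab is selected.
public class LazyContainer<W> {
    private final Supplier<W> create; // how to build the child when needed
    private W child;                  // stays null until first shown
    public int creations = 0;

    public LazyContainer(Supplier<W> create) {
        this.create = create;
    }

    /** Called when the container becomes visible; builds the child only once. */
    public W show() {
        if (child == null) {
            creations++;
            child = create.get();
        }
        return child;
    }

    public boolean isBuilt() {
        return child != null;
    }
}
```

With five tabs wrapped this way, the four hidden tabs cost nothing at start-up; each pays its construction cost only when first selected.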
So, I also mentioned that you should avoid creating widgets that you simply don't need.
And the reason widgets are slow, and I say they're slow relative to their HTML
counterparts, is because they do extra work. They do extra work to deal with event handling,
avoiding memory leaks, so forth and so on.
And if you don't need it, then there's no reason to run that code.
I often like to say, "The fastest code is the code you never run."
Only widgets that actually need to handle events really even need to exist.
And in some cases, even those don't need to exist.
So, if you can aggregate event handling at a higher level, at an outer widget that contains
some simpler constructs inside, then you can optimize the creation of these
things a lot further. We use this in things like Tree and some of the table widgets.
And there is an easy way now to create HTML
elements where you may have used widgets before, in UiBinder.
I'll cover that very briefly and go into more detail later.
So, use UiBinder to replace widgets with HTML; again, it's often the easier thing to do.
In GWT 2.1, we have added, or are in the process of adding, a series of cell-based
lists, trees, and tables. We call these "data presentation widgets."
You may have seen Bruce Johnson demonstrate these this morning.
These are extremely fast ways of rendering large collections of data.
Because, oftentimes, if you're talking about a "table" or "tree" or something like that,
all the individual items are actually very simple.
There's no reason to use heavyweight widgets for all of those.
I won't go into great detail on this right now, but tomorrow morning, Ray Ryan and I
are giving a talk describing these in much more detail, and how you can use them in your
apps to speed them up drastically. So UiBinder, for those not familiar, this
is a simple XML structure that we introduced in 2.0 that allows you to use XML to describe
what would normally have been a lot of Java boilerplate.
But a side effect of this is, it also makes it really easy to mix widgets with HTML elements,
so you just create HTML if that's what you want.
It also makes it easier to mix CSS with your widgets, associate them directly, and also
avoid slow CSS patterns. So, things like descendant selectors; in this
case here, you see a div at the end of a descendant selector.
That can be an extremely slow pattern to match. This structure, which I'll show in a moment,
really makes it easy to avoid these kinds of cases, which are very common in traditional
CSS.
>>Adam Schuck: And aside from the performance benefits of using UiBinder, there's also a
really nice code cleanliness quality to using UiBinder.
The Google Wave team, we proudly boasted a 4,000-line CSS file which nobody but one person
on the team knew how it worked, and everyone else feared deeply.
And I think we managed to very successfully carve that up into lots of 50-line CSS files
scattered throughout our code base, living in the same package as the code that uses
them. This also has the added performance benefit
that they get pulled down with code splitting at the exact point in time that they're used.
So code cleanliness and performance working together.
It's really great. They're not competing goals in this particular
example. What you're looking at here is an actual UiBinder
template from Google Wave's code base. This is a contact.
So the little avatar image with the person speaking.
And using UiBinder, we can pull all this CSS in place and connect this up to our Java.
And Joel's going to tell us how.
>>Joel Webber: So two things to note here. There won't be a pop quiz; you don't have
to memorize this. What I've highlighted first here is, you notice
these UI field attributes. What these do is describe how you're actually
going to bind the individual elements created for you into your code.
Because you could simply write something like this with an innerHTML call, put it in
a string in your Java code, but it's really hard to actually get the elements back out again
once they've been created. UiBinder automates this process.
And while there are no leaf widgets in this case, there could be widgets mixed into this.
So, if you had a g:Button element as a leaf in here, that would work
as well. So it, again, allows you to mix them freely.
In this case here, you'll notice, because they're all HTML elements,
it allows you to quickly get references to the pieces you need so that you can populate
them with data. But there are no widgets in here, because
the only events that are handled are mouse and click events at the top level.
So that's just handled in the outermost widget: keeps it simple, keeps it fast.
The other thing I mentioned earlier is CSS. So this uses standard class attributes for
binding CSS to elements. Same thing goes if you're binding it to widgets.
And I've got this ui:style element right here.
separate boxes. These are individual rules. They're all simple
rules, a single class selector, that are extremely fast to match.
So you're not angering the CSS performance gods by making complex selectors.
And UiBinder takes care of the process of obfuscating and name spacing these things
so they don't conflict with one another. So, you can have something called ".name"
and not worry about somebody on some other team on another continent creating
another ".name" which conflicts and causes your UI to do strange things.
>>Adam Schuck: So UiBinder, very good for both performance and cleanliness of code.
So there's really no excuse not to use it.
>>Joel Webber: The term for this virtue is "the pit of success," where doing the easy thing is also the fast thing, or the correct thing.
>>Adam Schuck: All right. Shall we demo?
>>Joel Webber: We shall.
>>Adam Schuck: Okay.
>>Joel Webber: Okay. So, you may have seen this enormously fascinating application at
the keynote this morning. What's really fascinating about this app is
not that it's about expense reports, sort of a dry subject.
What's fascinating about it is that it's fast, assuming that my application hasn't cooled
down in the meantime. Here we go. It's normally fast.
That's App Engine; that's not the GWT side. I'm not speaking about App Engine.
It is… importantly, there are practically no widgets in this demo. At last count, I think there are approximately 25 total widgets for the core screen superstructure and a couple of the individual elements.
But the rest of it is all rendered using the new list tables, trees, data presentation
widgets we're adding in 2.1. And what this allows you to do importantly
is things like this, that we probably would not have even bothered to try before because
it would be too slow using a complex widget structure.
So as I type here, you'll notice that it is highlighting as I type.
That seems like a reasonable thing to desire. I'm sure UI guys would love it, but it would
have been a pain before. This now is extremely fast to render.
And there's no reason that you should worry about any complexity in your UI using these.
It should always be reasonably performing, in this case extremely fast.
This also brings us to another question that has come up a lot. Let's say I'm writing an app, and I'm finding that it's getting kind of slow. I'm using a lot of widgets, and I'm starting to wonder if maybe I should be rendering my HTML on the server. You know, that seems like a really complex thing to do, but it's a question that often comes up.
And it may have been true at one point. But what we've found is that, when we built these widgets, we carefully optimized them to use innerHTML wherever possible, with optimized string manipulations and so forth, in a way that you don't have to be exposed to. So you don't have to worry about it. And we found that, in the vast majority of cases, it is at least as fast as rendering static HTML from the server, without the added network overhead.
And in many cases, it actually outruns downloading HTML from the server.
The only cases we've found where it can still come out behind are, first, when you're talking about progressively rendering static HTML; but that only applies to your first page. So if you have a page that has extremely low-latency requirements, you can render it from the server and sort of decorate the HTML and so forth. But that really only helps you on the first page.
After that, you're still working on the Client. And the other case is when you have extremely
complex calculations going on to produce the actual HTML that you're rendering.
That really only comes up in rare cases. I think Wave has a few of them when it comes
to Wave rendering. And I've seen a couple of other apps.
But usually, you should almost never start there.
You should move any kind of rendering to the server only begrudgingly.
This pretty much takes care of the vast majority of those cases as far as we can tell, so far.
>>Adam Schuck: And the Wave team is looking into using a server-side rendering approach. We definitely think that, as mentioned, in the start-up sequence you've got to do two JavaScript roundtrips to get the data. If you can really just send down the HTML, call "set HTML" on the page, and then, when the JavaScript is fired up, swap in some content, we do believe that we can get some performance gains.
>>Joel Webber: But again, do that begrudgingly. It's a lot of work.
>>Adam Schuck: We're doing it begrudgingly. [laughing]
It's a lot of work.
>>Joel Webber: One other thing to set the stage for is what I usually refer to as "on-demand
rendering" or "on-the-draw widgets" if you will.
Often referred to as "infinite scrolling." The idea is to render only what's literally
on the screen. So if you have a large scrollable area, you fake out the scrollable part, and then you render elements or widgets as they come into view. And we haven't actually implemented this yet
except in some internal demos; but this sets the stage for doing that quickly and efficiently.
So, we think this will make a huge difference for really, really large collections.
And Wave has implemented it for both the inbox, or search view (all the items that you see in that scrolling view), and also for waves themselves. So waves that are 300 or 400 blips long. If you rendered all of those upfront, it would
be fairly slow, as you've probably seen in the past.
But now, it's much, much faster as a result of rendering only the visible elements.
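The arithmetic at the heart of on-demand rendering is small. Here is a hypothetical sketch, assuming fixed-height rows (real implementations such as Wave's must handle variable-height blips, so this is only the core idea):

```java
// Hypothetical sketch of the windowing arithmetic behind on-demand
// ("infinite scroll") rendering, assuming fixed-height rows. Given the
// scroll offset and viewport height, only rows in the returned range
// are actually rendered; everything else is just empty scroll space.
public class VisibleRange {
    public final int first; // index of the first row to render
    public final int count; // number of rows to render

    VisibleRange(int first, int count) {
        this.first = first;
        this.count = count;
    }

    public static VisibleRange compute(int scrollTop, int viewportHeight,
                                       int rowHeight, int totalRows) {
        int first = scrollTop / rowHeight;
        // +2 keeps one extra row at each edge so scrolling never shows
        // a blank gap before the next render pass catches up.
        int count = Math.min(totalRows - first, viewportHeight / rowHeight + 2);
        return new VisibleRange(first, Math.max(count, 0));
    }
}
```

With 400 rows of 30px in a 600px viewport, only about 22 rows ever exist in the DOM, no matter how far you scroll.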
>>Adam Schuck: That's right. So we're seeing, we calculate roughly a 4x improvement.
And statistically, that's looking at the averages. So we've got very different sizes of waves.
But now it really is much faster if you want to jump to the end of a wave, as I showed you at the start of this talk, because we don't worry about rendering all the content in a linear order. Okay. If I may?
>>Joel Webber: You may.
>>Adam Schuck: So, we're now up to the fourth of the four areas where we think you can improve
your application's performance. So we've talked about start-up.
We've talked about fetching the data, we've talked about rendering.
Now to talk about another one of these sub-tenth of a second hot spots: user interactions.
When a user is interacting directly with your application, clicking buttons, typing, we're really shooting for that tenth-of-a-second barrier.
In fact, if you attended the Speed Tracer talk this morning, you saw a really great deep dive into this topic. We're going to stay somewhat high-level. And with these interactions, there are a few places
where time can build up. Slow event handlers block the UI thread. So if we do too much processing while we're handling an event, it could potentially take more than a tenth of a second. Click events: if I click, I want to see a
change immediately. That's what I'm used to.
Mouse events: if I'm dragging or hovering, you don't want that "sluggishness," as we're using the word. Key events: typing, navigation. And also window resizes. All of these things you should make fast.
One general tip is, if there is a browser feature, rather than you having to write code,
use it. Turns out, writing less code also means it's
going to be faster. So we're moving towards using CSS animations,
for example, rather than JavaScript. And a lot of the new GWT widgets are doing
exactly that, which is, relying on clever CSS positioning rules to improve the performance.
[pause] We're talking about perceived latency.
So you want to keep the application feeling responsive at all points in time.
And remember, even though you're writing your code in Java using GWT, you are actually in
fact in a single-threaded environment. What this means is that, if you want an event to be processed, say I click a button, and you're still doing some JavaScript processing, we have to wait for that processing to complete.
So it's best to parcel your JavaScript into lots of small fragments so that we can keep
the application responsive at all points in time.
And if you look at this diagram I have here, and if you don't mind zooming in on a few things: from our experiments, we found that the on-mouse-down event tends to fire about half a second before the on-mouse-up event. So there's a really easy optimization to make there: do your processing on mouse-down rather than mouse-up, and you'll give people the impression everything's half a second faster just by switching the event you use.
Okay? Another important trick, which we use throughout the Wave user interface, is what we call "optimistic UI," optimistic user interface, where we assume everything is going to work. If I type, we assume it will correctly get persisted to the server; it worked. Rather than roundtripping: update, okay, we managed to persist it, now show it to the user.
It's a really good tip for all of you to use in your applications.
Give the user the impression that everything worked and give them buttons like "undo" or
"error notifications" after the fact. That way, you're optimizing for the case where
everything works. If you're writing a good application, everything
will work fast. And one way to do this is to use GWT's deferred command. That way, if I click a button, we do the thing that's associated with it immediately. So, if I'm typing in Wave, show the text. And then, in a deferred command, because we need to yield to the browser so it can refresh, we send off requests and use a batching approach, as Joel mentioned. So, really keep the application feeling as responsive as possible by deferring any kind of busy work, anything that will take longer than a tenth of a second.
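The pattern can be sketched as follows. This is a hypothetical single-threaded illustration, not Wave's actual code; the queue here stands in for GWT's deferred command mechanism and the browser event loop:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch of "optimistic UI plus deferred command": update
// the visible state immediately, and queue the slower work (persisting
// to the server) to run only after the browser has had a chance to
// repaint. The queue stands in for GWT's deferred command mechanism.
public class OptimisticUi {
    final StringBuilder visibleText = new StringBuilder(); // "the screen"
    final List<String> sentToServer = new ArrayList<>();   // batched sends
    final Queue<Runnable> deferred = new ArrayDeque<>();   // event-loop queue

    void onKeyTyped(char c) {
        visibleText.append(c);            // 1. show the keystroke instantly
        deferred.add(() ->                // 2. persist it later, in a batch
                sentToServer.add(String.valueOf(c)));
    }

    // Simulates the browser finishing the current event and repainting,
    // then running whatever work was deferred.
    void runEventLoop() {
        while (!deferred.isEmpty()) {
            deferred.poll().run();
        }
    }
}
```

The key property is the ordering: the screen update always happens before the expensive work, so the user never waits on the network to see their own keystroke.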
>>Joel Webber: The other thing that Adam mentioned a moment ago was leaning on the browser, leaning
on native code. If you don't have to write it, that's also
good. And leaning on native code is always faster.
So resizing, again, we resize a panel within something like Wave, or resize the window
itself. If that's sluggish, the app feels awkward
and uncomfortable. It just doesn't feel good.
And in GWT 2.0, we introduced these layout panels that lean on the browser's native layout very heavily and very carefully, so that you can get predictable layout that is also very, very fast. We won't do a deep dive on that right now,
but again, tomorrow morning's talk, we'll go into that in quite some detail.
You should definitely be using these. They help you move to standards mode, they help you keep things fast, and they make your life a lot easier.
>>Adam Schuck: Okay, so we've told you four places and some tips on how to make your application
much faster. Now, here's a hypothetical.
You spend a few months; you use Speed Tracer and some great tools; you figure out exactly
why your application is slow; you make it faster.
And somehow, another developer on your team touches one line of code, and the initial download doubles in size, or clicks are being handled sluggishly.
This is not that hypothetical. This happened very much on the Google Wave
team. And we learned our lesson hopefully so none
of you have to make the exact same mistake as us.
Things will get slower unless you are paying attention to them.
So it's very important that you do latency-regression testing. Most of you would have TVs in your office showing the state of your continuous build: it's green or it's red. It's just as important to track how fast your
application is. So, there are two main areas you want to track. First, plot your download size. As we mentioned, if you have a slow Internet
connection, the size actually does make a big difference.
Plot your initial size, since we care about start-up time, but also your total, to see if there are any ridiculous jumps in how much code is being added, any inefficient JavaScript. And then, the other aspect is to measure.
Actually measure the performance in milliseconds, in seconds, of your application.
For Wave, we care about the Client loading fast, waves opening fast, search performing
quickly. There's a lot of other things, but really, it's best to focus on a small number
of key metrics for your application. And measure them both in production conditions, which is what your users experience. It's important to note, you don't want your whole team saying, "Yeah, it's fast," and all your users saying, "Wow, it's slow today." So it's really important you know what your users are feeling; and also measure in lab conditions. And I'll explain why in a second.
Plotting your download size: very important. As mentioned earlier, on the Wave team we love adding new features. So up, up, up, up, up, up went our initial download. Then we politely asked the GWT team to create code-splitting for us, and much later, you see a sharp drop. And actually, later on, we found out it's not optimal to just make it as small as possible, because we want the inbox to be optimized. The other interesting thing in this chart,
the other reason you want to track it, it's not just to celebrate whenever it goes down,
but whenever you spot a regression. So if Joel just zooms in on this little hill
we have here. Somebody in our team accidentally checked
in an Adobe Photoshop PNG with all the layers. Rather than a nice, compressed, crushed PNG
file. We noticed this on the chart. Everyone went,
"Why did it go up?" "Aha! We know now." We made it go down.
So it's great to track this, because you find when you make mistakes, and we're all human
here. It's much better to set up the system; rather
than assume you're not going to make mistakes and trust each other, dare I say it.
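The chart-watching described here can be automated with a trivial check. A hypothetical sketch; the threshold is an invented example, not a figure from the Wave team:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a download-size regression check: flag any
// build whose compiled output grew by more than a chosen percentage
// over the previous build. The threshold is an invented example.
public class SizeWatch {
    // Returns the indices of builds whose size jumped more than
    // thresholdPct percent over the immediately preceding build.
    public static List<Integer> regressions(int[] bytes, double thresholdPct) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 1; i < bytes.length; i++) {
            double growthPct = 100.0 * (bytes[i] - bytes[i - 1]) / bytes[i - 1];
            if (growthPct > thresholdPct) {
                bad.add(i);
            }
        }
        return bad;
    }
}
```

Run against the series of per-commit build sizes, a check like this would have caught the layered-PNG commit the moment it landed, instead of waiting for someone to notice the chart.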
Timing is very important. So, it's important to track what your users
feel. And you can see, it's noisy data.
But we're looking at here, creation of a wave from the search panel.
And when we rolled out this on-demand rendering, we saw a sharp drop, which is great.
It's really useful to see this information, because every week you get to see a jump in
your chart, and you go, "Aha! We're faster." Everyone rejoices, pop open a few beers, and
everyone's happy. Then, the next question comes, "Why is it
faster?" And everyone puts up their hand and says, "It's because of what I did."
Why does this happen? Because we only push a new version of our
application every week, so, we don't actually know why it got faster, why it got slower.
And it's very important to know what your users are experiencing, but the next question
can't be answered by this data. This is why we do lab timings.
So, we actually track this: we run, every hour in a controlled environment, opening waves 20 times, and we take the median of the data. And what you're looking at here is the client load in Chromium.
This is in a controlled environment. So, pretty close to "runs every time somebody
commits code," which is important. And we can see here…the chart went up.
Somebody made our client load ten percent slower.
In the old days, we'd find this out one week later when we got those big charts.
Now, we know exactly what happened. Somebody does a binary search, runs a test in the lab: "Aha!" And then, a little bit later, we fix the regression. It's really important to track this.
This is how you keep your application fast once you've made it fast.
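Reducing each run's 20 samples to a median, as described above, is the usual way to keep one slow outlier (a GC pause, a disk hiccup) from masking the real trend. A minimal sketch:

```java
import java.util.Arrays;

// Hypothetical sketch of how a lab-timing harness might reduce its 20
// samples per run to one robust number: the median, which ignores the
// occasional GC pause or disk hiccup that would drag a mean around.
public class LabTiming {
    public static double median(double[] samplesMs) {
        double[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }
}
```

With samples of 100ms, 105ms, and one 3000ms outlier, the median stays at 105ms where the mean would report over a second.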
And the other point of interest, in the top-right chart, is wave opening in Chrome.
So it's good news when it goes up, because we fix it.
And it's good news when it goes down, because we made something faster.
And now, when we're celebrating on a Friday afternoon that we made everything faster and
someone looks around the room, only one person gets to put their hand up.
So we now know who's going to fix bugs and we know who gets the credit.
And it turns out, for motivating people on your team: if you just measure something, it's going to get faster. That's just something we've noticed on the Wave team. Some other tools that are really interesting:
Speed Tracer, which, if you went to the talk this morning, you'll have seen, is a really fantastic profiler. Look into it. If you missed the talk, watch it on YouTube. It's great for deep diving on the rendering and
user-interaction type behavior. So, we have user events.
And I believe all of the links for everything I've talked about and Joel has talked about
today will be in the Wave, so you don't need to furiously write down URLs. We also have the GWT Inspector widget.
As Joel mentioned, widgets are bad in general.
>>Joel Webber: -- in large quantities.
>>Adam Schuck: In large quantities, widgets are bad.
So this is a useful little bookmarklet to tell you how many widgets you have in your
application. If you feel like a laugh, point this at Google
Wave. It's actually orders of magnitude larger than
it should be. We know we have room for improvement.
And Google Page Speed: a great Firefox plug-in that helps you improve your start-up performance. It tells you about things like caching, gzipping, headers, and sending cookies. This is a nice, useful little tool to improve your page speed. And we have time.
>>Joel Webber: We have time.
>>Adam Schuck: Great. So, I'll give you a quick sample of what we
actually measure if I can find it. I believe it's this one. Here we go.
So. I mentioned we do lab timings. We like to measure how quickly waves open.
And we have a special mode of our client which we run using Selenium.
The automated script clicks speed tests and runs all these various tests.
And we're going to start opening a large wave, which, let's see, it has 620 odd participants.
Which is more than we recommend putting on your waves -- not just for performance reasons,
but if you actually want to get any work done, that's a lot of people.
And it's got 175 messages in it. And we run this test every hour.
We get 20 samples. We extract the data from this.
This is what it looks like for Wave to run lab timings.
And I'll stop there. Great. [pause]
>>Joel Webber: Couple of odds and ends to wrap up.
I mentioned before that we work really hard to do things within GWT, so that you don't
have to do anything. Usually, those take the form of compiler optimizations
or library optimizations. In the case of compiler optimizations, we try to make those the default wherever humanly possible, so you don't have a rat's nest of GCC-style command-line options. More recently, we've started discovering some that have this property that they are incorrect, but in a way you probably don't care about: they violate Java semantics in minor ways, but invariably make your apps smaller and/or faster. Those we've had to put behind opt-in options so we don't inadvertently break your code, which I think you would probably not
prefer. And as we discover more, we'll keep adding those, and we will obviously document them so that you can take advantage of them.
And if you were in Ray's talk earlier, I believe it was just last hour, he goes into some detail
on that. And if not, I would definitely recommend checking
it out on YouTube. There's some really interesting stuff going
on in the compiler. So again, to recap, four things you need to
think about: your app needs to start fast, you need to get data fast, you need to get
it on the screen fast, and you need to keep fine-grained user interactions fast…snappy.
>>Adam Schuck: And now that you've made it fast, keep it fast; measure it, really track
it, and celebrate when your charts go up as well as when they go down.
Thank you very much.
[Applause]
>>Adam Schuck: Okay. We're now going to take some questions both from the microphones if
you're in this room, and we're also going to use the Wave, which hopefully has some
questions lined up for us. [pause] Any questions? Nope. All right.
Then we're going to go for the microphone.
>>Joel Webber: We apologize. Dory, the moderator, did a bit of a face-plant there.
So, but feel free to ask with the mic. And do go to the mics please.
It's much easier for the recording.
>>Male #1: Concerning using the CSS files in ClientBundle?
>>Joel Webber: Mhmm…
>>Male #1: My observation, I've played around with it some.
It seems like it makes using existing tools like Firebug, or other tools where you can edit your CSS on the fly, a little more difficult. What's your experience with that? And how have you gotten around it? We solve problems like that with Firebug: if we have CSS problems, we'll play with it, and I find it a little harder to do that with it.
>>Joel Webber: Mhhmm--
>>Adam Schuck:--Mhhmm... So from the Wave perspective?
>>Male #1: Yeah.
>>Adam Schuck: Ok, that's actually an interesting question.
So, of course, there are a few flags you can pass into GWT that allow you to leave the CSS unobfuscated. So there's that. And that does allow you to play with it in hosted mode in the time it takes to refresh. But we have also found in practice that using GWT in conjunction with Firebug and the developer tools is very helpful.
It's much quicker to iterate directly on the HTML.
So that's been our experience. But the GWT team is working very hard to try
and improve the cycle between actually changing the source code.
>>Joel Webber: Right, yes. We recognize the dev mode, hosted mode API call is definitely slower than it should be, and we're definitely working very hard on that.
But also, as Adam said, and I don't recall the exact details off the top of my head, so do look at the docs: if you pass in the unobfuscated flag (I believe it's either a flag or a module you need to include), that will turn off obfuscation of the CSS names, which makes it a lot easier to track the back and forth between, let's say, Firebug editing and the original source.
Not as easy as we'd like it to be and we're looking into ways to make that better, but
we recognize your pain there. Certainly.
>>Male #2: Yes, I am creating a software application that's going to pull about 5,000 names from the database. And so that, you know, like in your example, you just type it in and it pulls the names up. I know you guys say "Stay away from widgets as much as possible," and I've been trying to do that.
>>Joel Webber: --Right.
>>Male #2: But I've been using JSON to do the serialization from the server. And the performance has been okay. I haven't used Speed Tracer to check it yet, but I will.
>>Joel Webber: --Right.
>>Male #2: But I wanted to make sure that, is that the best recommended option where
you're pulling such a large amount of names? And also, if it isn't, what do you recommend
using to pull like a large database down and where they'll appear as soon as you type them
in?
>>Joel Webber: So, two things I'd say. One, most of the serialization formats, with the possible exception of XML (but don't quote me on that, I'm not positive), are roughly equivalent in performance, at least within the same order of magnitude. So using JSON versus GWT RPC versus XML versus protocol buffers, Thrift, whatever: they're going to put you in a relatively similar place in terms of the amount of data.
The place where you really have leverage is in how much data you actually send at a time. The problem with all these formats is, you
can't stream them. You can't say, "Parse the first 37 of these"
unless you actually break it up. And the only place where you can realistically do that is on the server, because something like JSON is parsed all at once, like JavaScript, right? So you can't parse partway through, or at least not the way browsers are built.
So I would say basically, one, split it up. So you're not requesting really large chunks.
And also, really large chunks take a long time to process, in addition to taking longer to download. Split it up. And if you are trying to do something
like type-ahead search, in that sample right there, that was all client side.
Now, that particular sample also does server side search, but only when you actually hit
the enter key. Typically, with the amount of data that you can actually display on the screen, you'd have to work pretty hard to make something like that type-ahead search slow, as long as your data is indexed or ordered in the right way.
So, as long as you just limit it to the amount of data that you can actually get on the screen, the actual work that you're doing will typically stay within pretty reasonable limits. It's only when you get into processing thousands and thousands of things that you really get into problems. And that's the case you want to avoid.
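The two suggestions, paging the data on the server and filtering client-side for type-ahead, can be sketched like this (hypothetical helper names, not code from the sample shown):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (invented helper names) of the two suggestions:
// (1) fetch a big name list from the server in pages rather than one
// huge payload, and (2) once a screenful is on the client, do the
// type-ahead filtering entirely client-side, with no network round
// trip per keystroke.
public class NameSearch {
    // Which slice [start, end) of the full list to request for a page.
    public static int[] pageBounds(int page, int pageSize, int total) {
        int start = page * pageSize;
        return new int[] { start, Math.min(start + pageSize, total) };
    }

    // Client-side type-ahead: keep only names starting with the prefix.
    public static List<String> filter(List<String> names, String prefix) {
        List<String> out = new ArrayList<>();
        String p = prefix.toLowerCase();
        for (String name : names) {
            if (name.toLowerCase().startsWith(p)) {
                out.add(name);
            }
        }
        return out;
    }
}
```

With 5,000 names, requesting one page at a time keeps both the download and the parse small, and the per-keystroke filter only ever touches the data already on the client.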
>>Male #2: Okay, yah.
>>Joel Webber: I hope that makes sense.
>>Male #2: Thank you.
>>Adam Schuck: Definitely optimizing to avoid the network request if you can.
>>Joel Webber: Right here…
>>Male #3: I have a question about the example you showed with the search filter. You type a couple of characters in, and it highlights the right records. I understand how, if you have a lot of widgets, it would take a long time to load.
>>Joel Webber: Mhhmm-
>>Male #3: How does it become slow, if you have a lot of widgets, when doing that kind of filtering: typing it in, seeing highlights, and then hitting "enter" so you end up with only those records? I feel like it's intuitively understandable, but maybe a little more detail: why is it slow in that way if you have a lot of widgets?
>>Joel Webber: Well, there are really two ways that can get slow. The main one is that the things you would expect to be fast in a browser, things like, I don't know, "set text," often aren't when they're called in large numbers. So, a couple of cases: if you saw Kelly's talk earlier this morning, I think there was a case where he managed to create a table that triggered layout thousands of times. Things just get slow for freakish and bizarre
reasons. And these new widgets that we've built are designed in such a way that they give us the opportunity to do the things that we actually know to be fast. In this case, that involves aggregating all those changes into a single innerHTML call, if that is the fastest thing to do. And we're working on optimizing for all the
various cases that come up. You know, if I change 3 rows as opposed to
27 of them. You know, there might be a crossover.
It might be faster to actually go set the individual text, for example.
But it gives us the leverage to do it in the right place in the right time in a way that
won't anger the layout gods, for example.
>>Male #3: By keeping it HTML-centered?
>>Joel Webber: Right. By using innerHTML where we've determined that that's the fastest thing. That's typically the reason, especially on older browsers, which typically optimize for static HTML rendering.
>>Male #4: You mentioned "infinite scroll" and the idea of loading, I guess, just what you see. Is that something that Google Web Toolkit does for us, or is that still on us to set up?
>>Joel Webber: It will as soon as I whip my hacked-up demos into shape. [laugh]
It's certainly implementable right now. It's just a little bit like brain surgery.
But, the point, one of the major points, of building these widgets, is that it allows
us to do that in a really optimal way. So this is sort of the first step.
I wanted to make sure we got these things done before we try to tackle that problem.
Because tackling that problem on top of large numbers of widgets just wouldn't be as effective. So, look for some demos once the dust settles
after IO and after we get 2.1 out the door that will start to address this problem.
And, as you may have noticed if you're following our trunk, we've added this "bikeshed" directory where we can put totally unfinished, half-baked code and samples and things like that. We'll definitely be doing that work in the open in there.
>>Male #4: Thank you.
>>Adam Schuck: I think we might just have time for our last two questions.
>>Male #5: Okay. Yeah, towards the beginning of the talk, you mentioned that it's possible to look at the header information on the server side and then send back just the script that's necessary. I was also looking for something like that
for HTML5 application caches, where you need to know exactly the file name.
So my question is, is there any utility or library or something that lets you map the
user agent name to the appropriate script name?
>>Joel Webber: Right.
>>Adam Schuck: So we're not doing this yet for Wave, but there are other applications
within Google that are doing exactly what you said, selecting the correct version.
I believe that is done in a different way on the client side, but I'm not actually sure.
I have to get back to you on that one. But definitely, it is on the roadmap for GWT
to get this server-side selection mechanism and, of course, an associated app cache mechanism
out into the open source so that everyone can use it.
>>Joel Webber: Right.
>>Adam Schuck: Because this is not just a benefit to Google; it's a benefit to anyone
using GWT. And there's no point in me telling you about
it if you can't use it.
>>Joel Webber: Right. Basically, as soon as we get the server-side selection script ready for use, anything else that involves detecting the user agent on the server would naturally flow from there. So that's sort of the first step.
>>Male #6: I may have misunderstood, but I saw an example in the beginning where you'd request data before anything is loaded in the browser? So how does it know that it needs data?
>>Joel Webber: I'm sorry, could you repeat the question?
>>Male #6: In the beginning you showed that, well, there used to be four calls, now there are two calls, and it looks like data was requested with the first page
>>Adam Schuck: Mhmm…
>>Male #6: and then returned asynchronously. So, how did it know that it needed that data?
>>Adam Schuck: How did we know the user had--?
>>Male #6: --Yeah, because the user isn't known yet, so?
>>Adam Schuck: Okay. So in the Wave example, we are assuming the
user is already authenticated. So we know exactly which user it is and we
know that we're going to download the inbox. Is that the question you're asking?
>>Joel Webber: Using cookies, in other words.
>>Male #6: So, how can it? Before anything is loaded in the browser, you can't just request data, can you?
>>Adam Schuck: So it's the very first request for the HTML page.
>>Joel Webber: Right. Specifically what they're doing, if I understand you correctly, is that, because the request comes with a cookie for the outer HTML page, the server starts fetching from the back end and then sending back the initial data as part of the actual outer HTML page. So, while that page is still streaming down, because of the way the start-up semantics of HTML and JavaScript work, you can start running scripts even though the page isn't fully downloaded, and it takes advantage of that.
>>Male #6: Mhmm.
>>Joel Webber: So the page download actually takes a long time as it's streaming back results.
But, the script requests and other kinds of start-up code can be running while that data
is streaming.
>>Adam Schuck: --That's right.
>>Joel Webber: --That also allows you to wash out the cost of constructing the UI and everything like that. All of that can take place during that streaming.
>>Adam Schuck: I'd highly recommend actually pulling up Firebug or the Chrome developer tools and looking at how Wave starts up, and just seeing how we use the initial HTML page, reusing it for the data as well as to download the JavaScript. Okay, we're out of time.
>>Joel Webber: Thanks everybody.
>>Adam Schuck: Thank you very much.
[applause]