>> JOHN RESIG: I really appreciate you having me here. I wanted to talk about a few different
topics today, all things I'm thinking about at the moment and working on. I wanted to
talk a little bit about JavaScript testing, I wanted to talk a bit about JavaScript performance
analysis, and then wrap up by talking about some of the cool things we're working on for
the next version of jQuery that's going to be out here in about a month. If you have
any questions while I'm going, please raise your hand. Usually if you have a question,
someone else probably does as well, and it'll just be easier that way.
Yes?
>> AUDIENCE MEMBER 1: Can you just give a brief overview of what jQuery is, please?
>> JOHN RESIG: Sure. jQuery is a JavaScript library. It provides a variety of functionality:
DOM selection, traversal, manipulation, Ajax, events, and things of that nature. It tries
to make your life a lot saner by not having to deal with cross-browser issues, and just
generally simplify things.
I want to talk a little bit today about the importance of JavaScript testing. I find that
JavaScript testing is fundamentally very challenging, much more so than desktop application testing.
It's extra important because not only do you get the improved development workflow, but
you also get protection against regressions and against weird browser issues. I want to
really emphasize that JavaScript testing is just fundamentally very different from normal
application testing, because you're now dealing with vast quantities of weird browser
quirks everywhere.
The question that I usually hear, at least when people are getting started in JavaScript
testing, is people wondering what to use to do testing. I ran a survey a couple of months
ago asking people what they use and I got about 1800 responses. For a lot of the frameworks
mentioned, only one person said that they were using it, so there's an incredibly long tail of
testing frameworks. The reason for this is that it's actually really, really easy to
write a testing framework. I wanted to step through and show you how a JavaScript testing
framework is constructed, and why you might consider building your own.
A typical testing framework has these components: you have the full suite, and the suite encompasses
a bunch of tests. One aspect that is usually pretty unique to JavaScript is asynchronous
testing — you can test Ajax, animations, and things of that nature. At
least in that respect, it's very JavaScript specific. But in general, JavaScript testing
doesn't differ that much; the fundamentals of the framework don't differ much from other
testing you might do.
As an example here, this is the absolute minimum JavaScript testing framework. It's a simple
assertion function, and all it does is check to see if the value that's passed in is true
or false. If so, it logs out a statement that is either red or green, depending on whether
it's passing or failing. This is the bare minimum that you need to do testing, but at
the same time, this is all that most testing frameworks are. They're glorified assertion
logging frameworks. They're designed to help make it easier for you to log all of this
out so that you can read it later.
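[A minimal sketch of the kind of assertion function being described, assuming a <ul id="results"> element in the page with "pass" and "fail" CSS classes; a reconstruction, not the actual slide code:]

```js
// The simplest possible "testing framework": a glorified assertion
// logger that appends a green or red line per assertion.
function assert(value, desc) {
  var li = document.createElement("li");
  li.className = value ? "pass" : "fail";
  li.appendChild(document.createTextNode(desc));
  document.getElementById("results").appendChild(li);
}

assert(true, "this assertion passes and logs green");
assert(false, "this assertion fails and logs red");
```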
Some of the structure that starts to come along then is more for your benefit,
as a developer, to be able to understand the results that are coming through. One
of the first improvements that you see is some sort of test grouping: being able
to group a bunch of assertions together under a single unified test. Usually the test itself
is linked to, let's say, a particular method in your API. Obviously it will depend on your
application how you choose to group your assertions. In the end, though, it's really not that hard
to implement test grouping. This is the full code — this is replacing the code from the
previous slide. Again, we just have an assertion, and we have a testing function. The testing
function takes an additional function callback, and then that callback is executed every time
the test function is run. In the end, it doesn't make it that much more complicated, and all
these assertions are then nicely categorized.
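[A sketch of that grouping version, layered on the assert() above and assuming the same <ul id="results"> page structure; again a reconstruction of the slide, not the actual code:]

```js
// Grouping assertions under named tests: test() creates a nested <ul>
// for its callback's assertions, so results are categorized per test.
(function() {
  var results;

  this.assert = function(value, desc) {
    var li = document.createElement("li");
    li.className = value ? "pass" : "fail";
    li.appendChild(document.createTextNode(desc));
    results.appendChild(li);
    return li;
  };

  this.test = function(name, fn) {
    results = document.getElementById("results");
    // Log the test's name, then nest its assertions under it
    results = assert(true, name).appendChild(
      document.createElement("ul"));
    fn();
  };
})();

test("addClass", function() {
  assert(true, "grouped assertion #1");
  assert(1 === 1, "grouped assertion #2");
});
```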
If we wanted to add in asynchronous testing, as I alluded to before, asynchronous testing
is the sort of idiom that you see more in JavaScript than elsewhere. In this
case, for some reason, I'm testing a timeout, which you probably aren't going to do in your own code,
since a timeout is a timeout and you don't really need to test it. In reality, you would probably
be testing an Ajax request to a server and making sure that the correct response comes
back, or something of that nature. You can see the difference here, though, is that there
are two additional function calls. There is the pause and the resume. What the pause is
doing is it's telling the framework to stop executing other tests, then whenever the resume
occurs, to again start executing the rest of the test, to continue moving along.
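[A reconstruction of the pause()/resume() pattern together with the small queued-test implementation the talk describes next; treat it as a sketch, not the actual slide code:]

```js
// Tests are queued; an asynchronous test calls pause() to stop the
// queue, then resume() from its callback to continue running tests.
(function() {
  var queue = [], paused = false;

  this.test = function(fn) {
    queue.push(fn);
    runTest();
  };

  this.pause = function() {
    paused = true;
  };

  this.resume = function() {
    paused = false;
    setTimeout(runTest, 1);
  };

  function runTest() {
    if (!paused && queue.length) {
      queue.shift()();
      if (!paused) {
        runTest();
      }
    }
  }
})();

test(function() {
  pause();
  setTimeout(function() {
    assert(true, "the asynchronous step completed");
    resume();
  }, 100);
});
```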
The implementation for it is painfully simple. This is layered on top of the code from the
previous slide. All told, this is a framework that has assertions, tests, and asynchronous
testing, and it's only about 20 or 30 lines of code. So it's really not that hard to write
your own testing framework, and I tend to encourage you to do it simply because writing
a testing framework is a good way to better understand what you're trying to achieve when
you're testing. At the same time, generally speaking, when you're writing a testing framework
there aren't that many cross-browser issues that you have to deal with. You're just worrying
about logging out information, you aren't trying to develop something that's going to
have to worry about the minutiae of browsers. In that way, I find it to be a good, healthy
JavaScript writing activity at least.
But the reality is that people don't test. In the survey that I ran, about half the respondents
just didn't test their JavaScript at all, and this is really unfortunate. The reality
is that tonight you're probably not going to run home and write a testing framework
from scratch, you're going to want to use something that's already been built and that
people are already, hopefully, familiar with.
There are a bunch of popular testing frameworks that already exist, and these are some of
the top results. The big four were QUnit, JSUnit, Selenium, and YUI Test. All of those
had a significant number of responses, and they all effectively tied in the results.
I want to talk about those four today. Most testing frameworks you see center around the
notion of doing unit testing, and that's roughly the structure that I outlined in the little
sample framework: you have assertions grouped into test groups, etc. Again, the
popular frameworks here are QUnit, JSUnit, and YUI Test.
JSUnit has been around for the longest. It came out in about 2001, and it doesn't really
feel like it's been updated since 2001. It's a very crufty framework, and it's really kind
of frightening if you start to look at the code. It's pretty much just a straight port
of the JUnit stuff over to JavaScript, and it definitely feels like that. Here's some
sample code of initializing tests and then running tests.
When I was checking out JSUnit I was trying to figure out how to get at the total number
of tests that have been run, and this number was embedded in the page, so I was using Firebug
to try and go in and get at this number. I went in and inspected it and clicked the number,
but when I did that it was nested so many layers deep… The thing was, I was like
'OK, it's probably in a table or something', but it wasn't tables, it was all frames and
framesets. I don't know why they didn't use tables, I would have loved tables. There was
frame, frame, frame, and it kept going down. Some people still use JSUnit, but I highly
encourage you to move to a more modern framework. There are very good ones out now. This is
the JSUnit runner.
I tend to recommend YUI Test very strongly to people who are just getting started with
testing, and I'm not just saying that because I'm at Yahoo!, I actually do really like YUI
Test. It's very well written, it has a lot of great features. Probably my favorite feature,
though, is the really excellent event simulation code that's in it. You can use that to simulate
mouse clicks, keyboard typing, things of that nature, so that you can layer that on top
of your application to simulate all that happening. What I found to be surprising is that it rated
so well — so many people used it — but it's only been out for about a year at this
point. I was really impressed. I'm sure YUI Test is probably going to be the most popular
testing framework by this time next year.
Just as an example, this is what the YUI Test code looks like. You set up these larger test
cases which have smaller test cases inside of them, and you can set up, and tear down,
and do initialization, and things of that nature. YUI Test has this little runner widget
that is embedded in the page, and you can see it logs all the results into it. YUI
3 Test is syntactically very similar to YUI 2 Test. There are not that many major differences.
The runner has been spiffed up a little bit, and has some gradients and some rounded corners
now, so it's better.
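[A hedged sketch of what YUI 2 Test usage looks like; the YAHOO.tool and YAHOO.util names follow the YUI 2 documentation, but treat the details as approximate:]

```js
var testCase = new YAHOO.tool.TestCase({
  name: "addClass tests",

  // setUp/tearDown run around every test method
  setUp: function() {
    this.el = document.createElement("div");
  },
  tearDown: function() {
    this.el = null;
  },

  testAddsClass: function() {
    this.el.className = "foo";
    YAHOO.util.Assert.areEqual("foo", this.el.className);
  },

  testSimulatedClick: function() {
    // The event-simulation feature praised above
    YAHOO.util.UserAction.click(this.el);
    YAHOO.util.Assert.isObject(this.el);
  }
});

YAHOO.tool.TestRunner.add(testCase);
YAHOO.tool.TestRunner.run();
```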
QUnit is a testing framework that I helped write to do unit testing on the jQuery project. We've
been working on it for a while now, and we've been coming close to actually having a 1.0
release. QUnit is structured very similarly to the example testing framework that I showed
you before, incidentally, because that's how I like to do testing. It also supports asynchronous
testing. You can do test timeouts, so if you have stuff that's taking too long, it can
time out and be marked as failed. I think more importantly, it's just really simple
and really easy to use, much like jQuery itself.
Here is just an example of the syntax you see when writing a test. There are these module
groupings, with which you can cluster multiple tests together and filter them however you wish.
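[A sketch of QUnit syntax from this era, with module/test plus ok and equals, and stop()/start() for asynchronous tests; the method names have shifted over time, so check the current docs:]

```js
module("core");

test("basic assertions", function() {
  expect(2);                      // declare how many assertions to expect
  ok(true, "a boolean assertion");
  equals(1 + 1, 2, "an equality assertion");
});

test("asynchronous test", function() {
  expect(1);
  stop();                         // pause the runner
  setTimeout(function() {
    ok(true, "ran after a timeout");
    start();                      // resume the runner
  }, 100);
});
```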
And here's an example of the test runner. Actually, the test runner looks a lot better
now. Just in the past week we got a bunch of contributions from some people at the BBC,
and it no longer looks like that.
FireUnit is another unit testing framework that I wrote, and that exists as a Firebug
extension. If you're already using Firebug, this exists as a new panel within Firebug
and it exposes this fireunit namespace that you can hook into and use the testing functions
from. All the results are then logged into this extra panel. It's an alternative to doing
testing within a webpage, and some people like it.
What I think is pretty interesting, though, is that there's actually some standardization
starting to happen within this sphere of JavaScript testing. There's the CommonJS initiative which
has been working to standardize JavaScript on the server-side. Incidentally, they've
kind of grown beyond that original scope and they're now encompassing both the server-side and
the client-side. One aspect of that is that they've been working on developing a standard
testing framework format for JavaScript. They've standardized the names of the methods so that
theoretically, you could use the same testing method names across all frameworks. QUnit
recently adopted this, and I hope other frameworks will too.
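[The standardized assertion names are along these lines, hedged from memory of the CommonJS Unit Testing 1.0 draft; a framework adopting the spec exposes these same names regardless of its internals:]

```js
var assert = require("assert");  // e.g. in a CommonJS environment

assert.ok(true);                      // truthy check
assert.equal(1, "1");                 // loose (==) equality
assert.notEqual(1, 2);
assert.strictEqual(2, 2);             // === equality
assert.deepEqual({ a: 1 }, { a: 1 }); // recursive structural equality
assert.throws(function() { throw new Error("boom"); });
```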
An interesting thing that I see people doing is server-side testing. This is doing testing
of JavaScript that is disconnected from an actual browser; simulating a browser and then
doing testing within it. I'm going to throw a major caveat here, because the problem is
that whenever you attempt to simulate an actual browser, or simulate what a user might be
doing, at best you're going to get an approximation of what the user is actually going to be doing.
Nothing beats actually having a real user do real testing, especially in a real browser.
I just want to throw that out there because I know some people do server-side testing,
but I don't know of anyone who does it exclusively. They usually do it in conjunction with
their normal testing routine.
Here are a few frameworks that are pretty popular. One is Crosscheck. Another one I didn't
mention, but which is also good, is HtmlUnit. Those are both written in Java; they essentially
wrote the entire DOM in Java, running on top of Rhino, so you can run all of this on
the server-side. It's really quite cool. Env.js is a project that I wrote, started a couple
of years back, and it's since grown and become its own beast. It's similar to Crosscheck
in that it provides a full DOM on the server-side, but is written in pure JavaScript. So it's
all in JavaScript, and ideally it would be able to run on multiple platforms, not just
in Java and on Rhino. Blueridge is a more recent one, and this takes Env.js, and a couple
of other frameworks, and creates a full testing pipeline.
Just to give an example, this is an example of Env.js running. This is actually in a console
here. I'm not sure how many people here are familiar with [Mozilla] Rhino, but Rhino is
an implementation of JavaScript written in Java. You usually use it from the jar file
— you load up the jar file and then you can start running JavaScript right there in
the console with no browser. There's no browser involved anywhere here. In here you load
Env.js and start running various pieces of jQuery manipulation. The test suite of
Env.js is actually the jQuery test suite, and Env.js is able to load a number of the
major frameworks and run them successfully.
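[An illustrative Rhino shell session of the kind being described; the file names here are assumptions, not the exact demo:]

```
$ java -jar js.jar
js> load("env.js");
js> window.location = "test/index.html";
js> load("jquery.js");
js> jQuery("body").append("<h2>Hello from Rhino</h2>").find("h2").text();
Hello from Rhino
```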
Another problem that pops up usually when you're doing JavaScript testing is that since
you're going to be testing in multiple browsers, you're going to want to automate that process
because you don't want to have to physically sit there and open up a new browser, open
up a new tab, click refresh and run the tests again. You want all of that to be taken care
of for you, and that's where this whole collection of tools comes in. There are a bunch of browser
drivers that effectively sit and act as a server, spawning browsers that
run your test suite. When they spawn the browser they'll try and collect the results and bring
them back into your central testing area, wherever that may be.
There are some very good ones out there and I'd definitely recommend them, especially
if you want to automate that process. As I mentioned, Selenium does a very good job here.
Selenium has an almost complete pipeline, all the way from having a testing framework
up to automation, including distribution. This is just a little diagram from the Selenium
site showing roughly how browser launching works. You have this server, usually written
in Java, sitting there on the desktop spawning browsers, popping them up and collecting the
results back out.
The problem then becomes: what if you want to integrate this testing into your continuous
integration? You want to make sure that on every single commit you're running against
all these various browsers. This becomes a really challenging problem, and it's also
a problem that doesn't scale very well. You can see here that there are actually two different
tools that are explicitly designed to tackle this problem. One is Selenium Grid, where
you can take your Selenium tests and push them out to Amazon and across their many,
many servers, and run your tests very quickly. This is actually very cool. Again, this is
part of that full pipeline that Selenium has, so it's quite neat. Another tool that I've
been working on is called TestSwarm. It's a little bit backwards; it works more where
users are signing up and electing to help participate, rather than it being forcefully
distributed out. We'll talk a little bit more about that.
TestSwarm came about because in the jQuery project we were having significant problems
scaling out our day to day testing. We needed to be running against about 15 different browsers
— not only that, we had a number of test suites that we needed to run in every single
browser — and it just did not scale well. Us sitting there, opening up tabs or even
trying to automate it with a browser launcher, it was just way too cumbersome. In the end,
we wanted a way that the full community could help us test. This is where TestSwarm came
in. It works sort of like SETI@home, or another distributed project of that nature, where
there's a central server that collects all these test suites, then our clients connect
and help run tests just in their browser. It's constructed very simply — the client
is actually just a basic HTML page doing Ajax requests, asking if there are any new tests
at the server. If so, it opens up the little iFrame with the test suite in it, collects
the results, and sends them back to the server. It works quite well.
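[A hedged sketch of that client loop; the endpoint and response shape here are illustrative, not TestSwarm's actual API:]

```js
// Poll the server for work; if a suite is pending, run it in an
// iframe and let it report back, otherwise wait and poll again.
function pollForWork() {
  jQuery.getJSON("/api?action=getrun", function(run) {
    if (run && run.url) {
      jQuery("<iframe></iframe>")
        .attr("src", run.url)        // the suite reports its results
        .appendTo(document.body);    // back to the server when done
    } else {
      setTimeout(pollForWork, 30000); // nothing to do yet
    }
  });
}

pollForWork();
```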
Here's just a rough diagram showing you how TestSwarm is built out. The test suites are
submitted by various projects. As it stands, the TestSwarm service is exclusive to a couple
of open source projects, but the software is completely open source, so if you wish
to run tests for your own organization you can just download it and run it yourself.
All these tests get distributed out to everyone that's connected, and the results get collected
and displayed. This is the homescreen, if you will, showing you the different clients
that are currently connected to the Swarm.
And this is the ultimate result of TestSwarm. What you have horizontally here is all the
various browsers that are being tested against, and vertically you have the commits that are
coming in. You can see exactly, commit by commit, what was changing at every browser.
You can see that there were errors happening in Opera up until commit number 6432, at which
point it switched to green, because obviously a fix for that landed. TestSwarm aims
to provide the full continuous integration experience so that you can just submit your
test suites and everything else is taken care of for you. You can see that this is what
it looks like when a suite is running. This is running the Prototype test suite. The
suite is broken down into its individual sub-suites, and you can see the browsers
running the tests live.
This is something that I've been working on, and we're hopefully going to have the final
release here in the upcoming months. If you would like to help, definitely go check it
out. It's down at the moment because I'm switching servers, but it'll be back up very soon. One
of the things we're working on doing is hopefully being able to provide some incentives so that,
if you're helping and you're brave enough to be running IE6 on Windows 2003, then we
will shower you with t-shirts and things of that nature. There was a high score board
on the website that was very competitive, people were really gunning for it, and there
was a lot of data there so I had to disable it. But that's definitely going to be coming
back.
Before I talk about measuring JavaScript performance, are there any questions about JavaScript testing?
>> AUDIENCE MEMBER 2: Is TestSwarm coupled to any library for testing? Was it QUnit or
something like that?
>> JOHN RESIG: The question was: is TestSwarm coupled to any particular library or anything?
No, it's not. TestSwarm has built-in hooks for QUnit, Selenium, Dojo, Prototype's Test
Suite, I think YUI Test, and a couple of other ones. Most of the major ones are in there.
But the API for it is very simple — you need to call two methods, the results come
in, and you're all done. I think I showed Prototype running. I've gotten
the Prototype framework running, MooTools, jQuery, Dojo, I've got them all running in
the Swarm.
I wanted to talk a little bit about finding ways to accurately measure JavaScript performance.
This turns out to be a very tricky problem in which there's both a lot of confusion,
and a lot of wrong results all around. There are two major use cases you see when it comes
to measuring JavaScript performance. There's the case where you have identical
code, but you want to compare different platforms. Usually, the people that care about this sort
of thing are the browser vendors. They want to know, given this piece of JavaScript,
which browser runs this code the best? Which browser runs this the fastest? That is one
distinct use case. The other use case is when you have different pieces of code and you
want to compare the relative performance, like your JavaScript frameworks, and you want
to see which one does CSS selectors the fastest or something of that nature.
To start, I wanted to talk about when you have the same code but on different platforms,
so analyzing different JavaScript engines. There already exist a few frameworks out there
for analyzing performance, and actually just before getting up on stage I got an email
from Microsoft and it sounds like they might have just released one. We'll see. There might
be a fourth one here in a little bit. These different frameworks give you different approximations
for the relative performance of JavaScript engines. SunSpider was developed by the WebKit
team, the V8 Benchmark was developed by the V8 team, and I developed Dromaeo in my work
at Mozilla. They all try to provide a certain level of statistical assurance that the results
that you're getting are correct. I wanted to talk a little bit more about that because
it's really important, especially being able to reproduce the results every single time.
When SunSpider was first released, all the results were very finely tuned and designed
to be very balanced, so all the tests… I'm trying to remember. I think they took
Firefox 2 and made sure all the tests ran in about the same amount of time on Firefox
2. Obviously that's changed pretty drastically since then — browsers have gotten much,
much faster, so the tests don't run in the same time any more. They also provided some
level of statistical assurance that the tests that were running were actually running within
this amount of time. They can say that this took 1000 ms to run, plus or minus 5 ms. All
the tests are run by loading a full test into an iFrame and then loading that about five
different times, and doing analysis upon those five results. One problem, though, is that
if you ever try to fix a bug in the test suite, there's no versioning built in, so you have
to throw the whole thing away and release a whole new suite as the next version.
I wanted to explain what I mean by error rate a little bit, because it can be very important.
It's a way of saying how confident you are that the result that you're producing is actually
what you say it is and that it's going to be within this certain realm of numbers, and
that the next time you run this test, you'll be getting something within that range. This
is all going back to the normal distribution. I'm going to be throwing down a little bit
of math here, but I think we can all handle that.
The way the normal distribution works is that the majority of the results are going to be
happening, 80 per cent of the time, right in this center. But you have weird fluctuations
that happen — sometimes it'll run a bit faster, sometimes it'll run a bit slower — and
you can see this in actual results. This is some test data that I did from runs in some
different browsers, and you can see, almost exactly, these normal distributions here.
Blue is one browser, red is a different browser, yellow's another browser. You can see this
tapering; the vast majority of the results are all happening around this specific time,
and then it tapers off to each side. What we want to be able to do in these results,
then, is say that this middle result is happening the majority of the time, and the rest of
the time, less frequently, it's going to be happening either a little bit faster or a
little bit slower.
To be able to determine a level of confidence we use different techniques, and one of them
is called a t-distribution. It's a way of being able to say that the majority of the
results are going to be happening within this portion of the distribution. In this case,
around 90 per cent of the results are going to be happening in this very small portion
of time. Using this, you end up with what's effectively called an error rate. When running
this again, you can promise with a 90 per cent certainty, or 95 per cent certainty,
that you will get a number within this range. You can use this to say: this took 123 ms
plus or minus 5 ms, so maybe next time it will run a bit faster, maybe next time it
will run a bit slower.
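[A sketch of that error-rate calculation: sample mean, standard error, and a t-distribution interval. The 2.776 critical value is for 95 per cent confidence with five samples (four degrees of freedom), matching SunSpider's five runs:]

```js
function errorRate(times) {
  var n = times.length;
  var mean = times.reduce(function(a, b) { return a + b; }) / n;
  var variance = times.reduce(function(sum, t) {
    return sum + Math.pow(t - mean, 2);
  }, 0) / (n - 1);                 // sample variance
  var stdErr = Math.sqrt(variance / n);
  var tValue = 2.776;              // t(0.975, df = 4); use a table for other n
  return { mean: mean, plusMinus: tValue * stdErr };
}

// errorRate([1000, 1004, 998, 1002, 996])
//   -> { mean: 1000, plusMinus: ~3.9 }, i.e. "1000ms, plus or minus 4ms"
```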
That's the technique that all these benchmarks use. The SunSpider one uses the t-distribution,
and so does Dromaeo. The V8 Benchmark works a little bit differently from the SunSpider
benchmark in that the SunSpider one only runs the test five times, but the V8 Benchmark
runs it thousands of times. The way it does this is instead of trying to measure the absolute
time it took to run a test, what it does is try to measure how many runs per second you
can do. This is really nice because tests that run really quickly otherwise have
a massive error rate. For example, if you have a test that runs really fast, one that only
takes a millisecond to run, you might run it and get something like 1 ms, 1 ms, 1 ms, 3 ms, and
if you think about it that's a huge error rate. You don't know what the actual real
result is going to be. It doesn't help that browsers have really poor timing mechanisms.
In general, the results are just very messy.
But I mean, if you look at a test that runs very slowly… In this case there's a
4 ms fluctuation here, but since the numbers are so large it doesn't really matter. The
problem, then, is that tests that run faster need to be run more times, because when you
start running them more times you get more accurate results. You can see here that there's
a slow running test, and there's a lot of fluctuation in the results of a slow running
test — a large absolute fluctuation, maybe plus or minus 40 ms. But since the results
we're talking about are already like 1000 ms, that's just a very small percentage. Whereas
with a fast running test, that is not the case. What we end up with is the faster a
test runs, and consequently the faster browsers become, the more error-prone tests become,
especially tests like SunSpider. Since all the browser vendors are actively trying to
improve the performance of those tests — and they are, they're improving them very much
— it means that the faster they become, the higher that error level is going to be.
This is where that really nice runs per second comes in, that the V8 Benchmark does. This
means that the tests that run the fastest just keep getting run more and more times.
What they do is try to run it as many times as they can within one second, so at the very
least, you aren't going to be running it for more than a second. It won't be freezing up
your browser and everything. This V8 Benchmark introduces this technique, and Dromaeo now
uses it as well. This is really nice because, like I said, it gives you much finer grained
results for the tests that need it the most.
What you end up with is this runs per second number, rather than just an absolute number
of seconds. What you can do with that number is run that whole collection of tests multiple
times. For example, you might end up running it once and you'll end up
with 1000 runs in one second. Then you take that whole thing and run it again, and you
end up with many, many thousands of runs making up a single result. This gives you an incredible
amount of accuracy in your data, and this is what I think is so important. To get at
a final result you would probably use something like a harmonic mean. I've just got it here
as an example. You could end up here with a final number that accurately measures runs
per second.
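[A sketch of the runs-per-second technique and the harmonic-mean summary being described:]

```js
// Run the test as many times as possible within one second.
function runsPerSecond(testFn) {
  var runs = 0,
      start = (new Date()).getTime();
  while ((new Date()).getTime() - start < 1000) {
    testFn();
    runs++;
  }
  return runs;
}

// Summarize several one-second passes: n / sum(1/x_i).
function harmonicMean(samples) {
  var sumOfReciprocals = 0;
  for (var i = 0; i < samples.length; i++) {
    sumOfReciprocals += 1 / samples[i];
  }
  return samples.length / sumOfReciprocals;
}

// e.g. harmonicMean([980, 1020, 1000]) -> a single runs/second figure
```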
Dromaeo uses both of those previous techniques. It uses error calculation and runs per second.
It's also versioned so that if there's a bug in your tests, you can go back and change
it, and it'll make sure that it doesn't try to compare runs against different versions of
the same tests. This ends up being really important because tests do have bugs in them,
and you do need to go back later and change them.
But this all leads up to the other problem area, which is the problem of running different
code on the same platform. The example I gave before was testing multiple libraries
on the same… doing the same thing, but with different pieces of code. The problem
here is that there are very few testing frameworks that exist to cover this sort of area, and
the ones that do exist are very poor. They do not work well, and provide very inaccurate
results.
One of the reasons for the inaccurate results is the problem of garbage collection which
occurs in browsers. Browsers are constantly churning and trying to free up as much memory
as possible. They're going through and saying OK, you're not using that object any more,
we can free that up. This gets expensive, especially if you have many, many tabs open
and there's Flash going on, or who knows what. It can really add up. You'll end up with results
like this: you'll see 10 ms, 13 ms, 11 ms, and then like 400 ms. Obviously your code
didn't suddenly become slower in the mean time, what happened is that the browser did
a garbage collection in the background. Unfortunately, that's not something that you can control.
It's out of your hands; the browser's off doing its thing. So you kind of have to take
this into account when you're doing your tests.
There are obviously multiple techniques that you can use to try and arrive at a result.
Most people that I've seen end up not using the mean, the average, and instead start going
towards something like the mode, trying to figure out the most frequently reported result
that comes out of all these numbers. The thing is that you have to be really, really careful
about discarding bad results, because numbers that are reported inherently have some meaning.
For example, if you ran a test for framework A that causes more garbage collections to
occur, that is probably important information in and of itself. Doing something like reporting
a mode would actually discard those very important spikes and make your results less relevant,
so it's actually very important to not discard these garbage collections. I mean, it's tricky
because sometimes you just want to kind of zero in on the actually important numbers,
but the reality is that you have to encompass all of these.
These spikes also become more relevant when you look at the issues with timer accuracy
in JavaScript. The problem is that when you call getTime in JavaScript, it's actually very,
very imprecise. When a number's reported, you see that they're all reported in milliseconds,
but the thing is that the timer doesn't update all the time. It updates, actually, very infrequently.
Just to show an example, these are some tests run on OS X, and these are the results that
I showed before. You can see that there are really nice distributions of results, and
the results are coming in at every single millisecond. The browser timers on OS X are
very, very precise. They're precise at least down to the millisecond. But if we look at
the results on Windows XP, notice what we're missing here. You can see the results coming
in, and they're coming in at intervals. There's absolutely nothing in between these 15 ms
intervals, and that's because the timer is only updated every 15 ms.
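[A sketch of how you can observe that granularity yourself: spin until getTime() changes and measure the jump:]

```js
// Reports the smallest observable timer step: ~1ms on OS X,
// typically ~15ms on Windows XP.
function timerResolution() {
  var start = (new Date()).getTime(),
      now = start;
  while (now === start) {
    now = (new Date()).getTime();
  }
  return now - start;
}
```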
So it doesn't matter how many times you run it, or what accuracy you're trying to get
at, it's only going to report at these specific times. If you notice, there are no nice little
normal distributions anymore — everything's clustered around single results. This gives
you really, really poor results. If you ever see anyone try to give you JavaScript performance
numbers on Windows, you can just throw those results away because there's no possible way
in which those numbers are going to be interesting or accurate. What's happening here is that
you're going to have an error rate that is so huge, up to 750 per cent. That's 7.5 times
larger than the actual result, and that is just huge.
One interesting thing that I discovered when looking at this was that when Internet Explorer
is running in Wine, the emulation layer actually has a really accurate timer. So if you run
Internet Explorer in Wine on OS X, you can get access to an accurate millisecond level
timer. Naturally, you're then running Internet Explorer in Wine on OS X, and who knows how
that browser actually performs in the real world. In this case, at least, I find it to
be interesting to casually look at to try and get more accurate numbers, but it should
definitely not be used in any sort of real testing situation because it's going to be pretty drastically
different from the actual real Internet Explorer. Just to show you here, you can see the green
numbers here, and that's Internet Explorer running in Wine on OS X. There are no more
stark spikes like that.
The problem is, in the end, how do we get at these good numbers? How do we get at accurate
timing information in the browsers that we want to tackle? The reality is that we need
to go and use the tools that the browser provides: use the profiler that's baked into Firebug,
use the profiler that's in Safari, the profiler that's in Internet Explorer 8. All these are
actually really good tools. They've improved dramatically over the past year or so, and
they're really quite excellent.
Usually the question that comes up now is 'how do I handle Internet Explorer 6 and 7?'
6 and 7 are still very real parts of our development workflows. One tool that I discovered recently
and that I've been very pleased with is called DynaTrace Ajax. It's a standalone tool that
hooks into Internet Explorer, works with 6, 7, and 8, and it does full tracing of the
entire browser. It's really quite amazing. You can see that it even traces across network
requests, like here — this is actually on Yahoo! Maps. You can see that it traces through
JavaScript, traces through the DOM, through the meat of the browser DOM, so you can see exactly
how long it takes to run native browser methods like getElementById, setTimeout, and stuff
like that. It traces through the timeout, traces through the Ajax, and you can see exactly
how long it took to run each individual step, and what ran the other parts.
I've been very, very impressed with this tool. Not only is it a great tool for Internet Explorer,
it's just a great tool in general. I've already made this a part of my testing toolkit. Just
to sort of zoom in here, we have a detachEvent and a setInterval, and both of those are native browser methods in
Internet Explorer. You can actually see how long it took to run those browser methods
themselves, which is something that no other testing tool provides. I was very impressed
with that.
Another way to get at really detailed information is doing Shark profiling. Shark is a tool
that you can use to get at the underlying internals of an application. It's not browser
specific, but it works for browsers. It's very, very low level. There's probably a good
chance it won't be immediately useful to you, but if you find a weird bug in a browser,
and you attach a Shark profile to your bug when you submit it, I can guarantee that they'll
be more likely to respond to you because that is exactly what they want to see. This can
be really good for filing bugs. You get this full [xx]. It kind of looks like what you
see from DynaTrace, but it's not like it at all. You have all these really cryptic names.
This is tracing through Firefox, through their JavaScript engine. Unless you really know
the JavaScript engine internals, it really doesn't make much sense.
That's all I really want to talk about with analyzing performance. Are there any questions,
really quick, before I talk about jQuery 1.4?
>> AUDIENCE MEMBER 3: Is there any way to force browsers to do the garbage collection
before you start the tests?
>> JOHN RESIG: Is there a way to force the browser to do garbage collection before you
start your tests? Not that I know of. Definitely not in a way that's cross-browser. One way
to force it is to close your browser, open the browser again, and then run your tests.
That's the ultimate garbage collection. Usually what you find is that if you're doing…
At Mozilla, for example, when they're doing performance analysis on Firefox, what they
do is they have a whole server farm constantly turning out builds and popping up new copies
of Firefox. To do a performance analysis, what it'll do is pop up a fresh copy of Firefox
— and this is very fresh, just built — and it'll run through the tests, close it completely,
and then open it and start it again. That is the only way you can really get at absolutely
stable numbers, if you're looking for really, really stable.
Don't have stuff running on your computer, as well, because the other stuff can mess
up everything. There was one thing that I heard: if you run iTunes on Windows — or
is it Quicktime? I don't remember the exact details. Maybe it was Windows Media Player
— the timers start to become more accurate, because something taps into the internals of Windows
and tells Windows to start becoming more accurate. Like I said, just kill everything and start
fresh. At the very least you can start to get some decent numbers.
>> JOHN RESIG: What about the newer versions of Windows? Same problem. In fact, I think
it's the same in other versions of the operating system… Just to repeat, the question
was: what about other versions of OS besides Windows XP, like Vista? I haven't tested it
in Windows 7 yet, but I don't have high hopes. I mean, it's a pretty low level thing, and
I think if they change that a lot of other stuff will be changed.
One thing I should note, though, jumping back to those results — that's IE, Opera, and
Safari, but that does not include Firefox and Chrome. Firefox and Chrome have accurate
timers on Windows. They do black magic to conjure up the correct times. I'm not sure
exactly what they're doing, but they're obviously using a better API than what Internet
Explorer's using. So yes, I should just qualify by saying the situation may have since changed.
I tested Safari 3.1 there, and maybe that change from Chrome got backported, and maybe
it's now in Safari 4. I'm not sure. But it's definitely not in IE8, I know that for certain.
>> JOHN RESIG: Internet Explorer 8 definitely has a profiler. I mean, this is a run here.
You do a profile, you hit start profiling, and it captures everything that comes in and
shows you a dump of the results. I don't know if the dump is as pretty looking as the other
browsers, I'm not sure. But worst case, you could try DynaTrace Ajax, and that definitely
has a profiler, that profiles everything. Maybe that could work.
I wanted to talk about some of the interesting problems that we've been working on lately
in jQuery 1.4. We did the first alpha release last Friday, and the final release is currently
slated for mid-January. There are a few interesting problems that I feel we
tackled in 1.4: reducing the overall complexity of the code, and I want to talk about how
we did that, adding in bubbling support for events that don't normally bubble in Internet
Explorer, and doing script loading.
To reduce complexity in jQuery, one of the things that we started to do is kind of take
a step back and stop looking at absolute times. One of the problems was
that we found we were spending too much time trying to compare ourselves relative to other
frameworks, or trying to compare our speed relative to old versions of jQuery, when the
reality is that we should be spending more time improving the overall code flow and code
quality of jQuery itself. Through that, we can get performance improvements.
One of the ways we did that — I mentioned this earlier — was the FireUnit testing
framework that I built for Firebug. One of the things that I added on to FireUnit was
a way to programmatically get at the full data dump of a profile run in Firebug. You
could get a full JSON data structure of all the data that came out in a profile, and from
it you could see exactly how much time, in milliseconds, it took to run a specific
function. More importantly, you can get at the number of function calls that occurred.
When you start getting at the number of function calls, you can start to compute a level
of complexity for your code. I use the term Big O notation here, but it's not real Big O
notation, it's sort of a one-off, as I call it.
I wanted to test the complexity of adding a class, for example, so I ran addClass against
about 95 elements. I don't remember the exact number but it was about that many. Then I
looked at how many function calls occurred. What we found was that for every single time
you called addClass, it would call six different functions for each individual element. That's
a lot of function calls. Function calls in and of themselves don't necessarily indicate
slowness, though; they may just indicate a poorly written piece of code,
and there may be ways of optimizing it further.
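[The FireUnit profiling hook isn't shown in the transcript, but the calls-per-element idea can be sketched in plain JavaScript by wrapping a function to count its calls; an illustration, not jQuery's actual tooling:]

```js
// Wrap a function so every invocation is counted; wrapping internal
// helpers and dividing by element count gives the "calls per element"
// coefficient described above.
function countCalls(obj, name) {
  var orig = obj[name];
  obj[name] = function() {
    obj[name].calls++;
    return orig.apply(this, arguments);
  };
  obj[name].calls = 0;
}

// e.g. count how often jQuery.each is hit during an addClass call:
countCalls(jQuery, "each");
jQuery("div").addClass("test");
var coefficient = jQuery.each.calls / jQuery("div").length;
// coefficient is that one function's contribution to the "6n" figure
```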
The nice thing about this is that it sort of removes the time portion of performance
analysis from the actual work of improving your code. We can see that there are various
levels here. There is a 6n, a 9n. We can see one really bad one here, remove, that was
2n + n squared, and that's a huge number of function calls for every single element. That
was one we looked at right away, because obviously there's a problem going on there. We looked
at it a little bit more and we realized, at least in the case of remove, that it was trying
to traverse through and call things more often than it needed to. We made that improvement,
and everything that used remove also improved: html, empty, both of those also use remove.
We were able to bring the number down from that massive n squared to about 3n.
Just to show you, these are some of the times that came up in jQuery 1.3.2. You can see
a few n squareds. The 'find div' one down there is 16n. There are quite a few up there. But
watch what happens when we switch to the complexity of jQuery 1.4; I'll flip back and forth. Things
become way simpler. All the n squareds are gone, the 16n is now a 5n, that 8n is now
a 0n in that it only calls 2 functions total no matter how many elements you put in. Using
this, we were able to get some really excellent performance improvements.
Yes?
>> AUDIENCE MEMBER 6: Can you do a one sentence definition of what a Big O is?
>> JOHN RESIG: Big O is a way of denoting complexity. For example, the way it's usually
used is, say if something has a complexity of 'n', that means it's linear complexity
and for every single time that you call… How do I explain this? In jQuery, for every
element there will be one function call. So for something that has a complexity of 3n,
for every element there would be 3 function calls. It's just a way of denoting complexity:
usually a bigger number means a more complex equation, the longer it takes to run, and
usually the worse performance you're going to get.
The numbers here are kind of small. With jQuery 1.3.2 it took about 3 seconds to run, and
in jQuery 1.4 it took less than a second to run. We've been seeing improvements of about 3.5
times over the previous version of jQuery. This is all without looking at absolute
time numbers — we were strictly comparing against and improving ourselves,
more a matter of reducing our overall complexity.
Are there any questions really quick, before I move onto event bubbling? No, OK.
Another problem we tackled in 1.4 is the number of issues that Internet Explorer has surrounding
event bubbling. In jQuery 1.3 we had the new live method, and this live method allows you
to do event delegation really, really simply. Event delegation works like this: if an event
occurs someplace down the page, the event bubbles back up, you capture it, and you handle
it. You bind fewer event handlers in your code, and overall it's generally much faster.
The problem, though, is that Internet Explorer doesn't bubble some events. It doesn't bubble
focus, blur, change, and submit, so those are ones we had to implement and override
their lack of bubbling. One of the tools that we used was this method, developed by Juriy,
and it's a way of determining if an event will work on an element. You can use this
method to figure out, for example, will a submit event ever happen on a div? That would
return true in every browser but Internet Explorer, because in every other browser but
Internet Explorer this submit event will bubble up. We were able to use this information to
determine whether this will actually happen in the browser that we want.
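[Roughly the detection technique being described, after Juriy "kangax" Zaytsev's isEventSupported; a close paraphrase rather than the exact published code:]

```js
function isEventSupported(eventName, tagName) {
  var el = document.createElement(tagName || "div");
  eventName = "on" + eventName;
  var supported = (eventName in el);
  if (!supported) {
    // Older Gecko doesn't expose the property until it's set
    el.setAttribute(eventName, "return;");
    supported = typeof el[eventName] === "function";
  }
  el = null;
  return supported;
}

// Will a submit event ever reach a div? True everywhere but IE,
// which is how the lack of bubbling is detected.
isEventSupported("submit");
```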
Incidentally, the easiest ones to fix are the focus and blur events. Internet Explorer
has a whole bunch of focus and blur style events, one of which is focusin and focusout,
and both of those bubble. You can just replace focus and blur with focusin and focusout,
and it works wonderfully. I wish all of them were that easy, but they're not.
Submit was much trickier. In order to make submit work in Internet Explorer, what
you have to do is watch for the click event to occur. If someone clicks the submit button,
or clicks an image submit button, then you can capture that. Additionally, if somebody
hits the enter key in an input, that will trigger a click on the submit button. All this works
great; you can just watch the submit button. The problem is that if you don't have a submit
button in the form, there's no way to get at the submit event unless you attach a keypress
handler. If you attach keypress, you can then figure out when someone's hitting enter in
a text input or a password input.
Even that one wasn't quite so bad when you compare it to the change event. The change
event is a real, real bear. In order to do the change event properly you essentially
have to implement the full change event. You have to track all changes that occur to the
input, you have to track its previous value, and then on blur check to see if it has changed
in the interim. You also have to track if someone's using the keyboard to navigate around
the form. It's a large, convoluted piece of code. One
thing that I thought was interesting is that there's an event in Internet Explorer called
beforeactivate, and beforeactivate is another one of those crazy special Internet Explorer
events. In this case, beforeactivate happens before a radio button is activated, and you
can use that to get at that value before that occurs.
Another new piece of functionality we've been working on is called jQuery.require; I just
recently committed it to a branch, the day before yesterday. It's a way to dynamically
load pieces of jQuery code. In setting out to do this, we wanted to build a script loader
that would just work really, really well, and that we felt wouldn't harm applications
but would actually benefit them. Some of the things we did are just a given: we wanted
to make sure we didn't load duplicate files. We wanted to make sure that if you
ran it, it could be run synchronously, and that if you did a require and then tried to
use some code that depended on the require, it would be loaded ahead of time. Simple stuff like
that, we wanted to make sure of.
The important point is that we made sure that it worked asynchronously. What we mean by
that is that you can load multiple files, those files will be loaded up in the background,
and they'll be downloaded in parallel. So you can download multiple files in parallel
and not block the browser execution. Not only does it make it faster in that your scripts
will be downloading much faster, but it won't freeze the browser while the script is downloading.
So this is really a win-win situation. The scripts will download faster and it won't
prevent the user from doing anything. The way it works is that we load all the scripts
asynchronously before the document.ready event occurs, and then we just delay the document.ready
event until all the scripts are loaded. It really works quite well. I should mention
that because of this, we guarantee that the scripts will continue to load in the correct
order even though we're loading them asynchronously.
We also provide some URL mapping, so you can type a simple thing like jQuery.require("ajax")
and that'll map out to ajax.js. You can also do some basic namespaces, and it'll translate
them into full filenames. And you can specify full namespaces: if you have code living in
a specific directory off somewhere, or on a specific server, you can specify the full
namespace and that'll all get completed and filled out when it's required.
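[A hedged usage sketch; jQuery.require was on a branch at the time of this talk, so the mapping rules shown are illustrative:]

```js
jQuery.require("ajax");          // maps to ajax.js
jQuery.require("fx.queue");      // a basic namespace maps to a filename
jQuery.require("http://example.com/js/plugin.js"); // a full location

jQuery(document).ready(function() {
  // ready is delayed until the requires above have loaded,
  // so their code is safe to use here
});
```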
I think that was the major content I wanted to cover today. I have one little bonus section
that I wanted to discuss super quick.
One interesting thing I've been seeing recently is that there's a lot of people who really
want to start using HTML 5 today, and a lot of people are trying to use the new HTML 5
elements in Internet Explorer, especially older versions of Internet Explorer, and they're
running into a lot of problems. I just wanted to outline the variety of problems that exist
right now, because it's a real minefield. Consider this a follow-up to my 'DOM is a
mess' talk earlier this year.
One of the first problems, the one that everyone encounters, is that they find they can't actually
style the HTML 5 elements in Internet Explorer. It's as if they just don't exist. What you
have to do, as someone found out a while back, is document.createElement an element that
Internet Explorer doesn't know about, like an HTML 5 element, then suddenly you can start
styling it. This works quite well, actually. There's a nice little script that you can
download that has the full list of all the HTML 5 elements in it that you can just stick
in your header, it's like 300 bytes, it's really tiny, and suddenly all your HTML 5
elements can become style-able. So that's really cool.
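[The trick in question, the basis of that tiny downloadable script, is just a loop of createElement calls; this list of tags is representative, not exhaustive:]

```js
var html5Tags = ("abbr article aside audio canvas details figure " +
                 "footer header hgroup mark nav section time video").split(" ");

for (var i = 0; i < html5Tags.length; i++) {
  document.createElement(html5Tags[i]); // now IE will style the tag
}
```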
However, a problem then becomes that if you try to put an unknown element — an HTML
5 element — into another one in Internet Explorer, it'll just barf. It doesn't really
know what to do, and your elements will get pushed out of your container and kind of get
strewn about. Unfortunately there's really no good solution there short of dynamically
constructing the DOM yourself. If you just ship it in the page like this, it won't
work right away.
Another problem is that Internet Explorer thinks that these HTML 5 elements aren't actually
HTML. It thinks that they're maybe some sort of XML — I don't know, it's kind of hard
to figure out what Internet Explorer's thinking sometimes. But if you look and try to see
what the nodeName is, the nodeName is actually case sensitive, which is very different from
HTML. In HTML the node names are always uppercase. The solution here is that you have
to assume in your code that the nodeName might not be all uppercase, which is kind of weird,
since it always should be.
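[A defensive comparison given that quirk: normalize the case yourself instead of assuming uppercase:]

```js
function isSection(elem) {
  // IE can report "section" rather than "SECTION" for unknown
  // elements, so normalize before comparing.
  return elem.nodeName.toUpperCase() === "SECTION";
}
```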
Then finally, the other big problem is that if you try to inject HTML 5 elements using
innerHTML, it will just completely not know what's going on. You will end up with elements
with a nodeName of like 'section'. It creates an element for the opening section and the
closing section, two different elements. It's really bizarre. It's some of the craziest
stuff you see. The solution here is to write a full HTML parser and to parse the HTML yourself,
and construct a DOM. Obviously that's a pretty crazy solution. That's something I would do,
but I don't recommend it.
Those are just some problems that I wanted to outline that are happening right now. If
you try to use HTML 5 elements in Internet Explorer you're going to hit fun problems.
So that's all I wanted to talk about today. Are there any questions about anything I discussed,
or about jQuery or what have you?
Yes?
>> AUDIENCE MEMBER 7: These people who want to use HTML 5, did you tell them this? Did
they come to see the reason and decide not to use HTML 5?
>> JOHN RESIG: The question was: when I show people the list of how HTML 5 elements are
broken, do they see reason and not use HTML 5? No. At least not yet. I'm probably going
to talk about this a little bit more. It's really an unfortunate situation right now.
You can use some elements some of the time, as long as you don't try to put them inside
of each other and you don't try to dynamically create them. I think the situation is probably
a little bit more tenuous than people realize, so I don't know. It's kind of tricky. [laughs]
>> AUDIENCE MEMBER 7: It seems like there's kind of...
>> JOHN RESIG: Yeah. I don't know. It's really up to the authors themselves to decide
whether they derive benefit from it.
Yes?
>> AUDIENCE MEMBER 8: What's your opinion of Chrome Frame? This breaks, or that breaks.
Why not just Chrome Frame?
>> JOHN RESIG: What's my opinion of Chrome Frame? As a solution to Internet Explorer,
or just…?
>> AUDIENCE MEMBER 8: Just as a work around problems.
>> JOHN RESIG: I guess that could work. I would prefer that they just move to a different
browser. I mean, I think the reality of the matter is that I find it hard to see a situation
in which someone would be able to, and willing to, install a plugin for which they are not
capable of upgrading their browser. It seems like the people who have the problem, they're
going to have the same problems either way.
>> AUDIENCE MEMBER 8: They could install Flash.
>> JOHN RESIG: Are you sure? Are you sure the same situation where someone could install
Flash themselves is also the same situation where they could not upgrade a browser? Usually
when you're in that locked-down a situation in a corporate environment, you can't upgrade
a browser, you can't install a plugin, you can't do anything. I mean, you aren't the
administrator of your machine. I imagine most of the companies that would be capable of
using Chrome Frame are also capable of upgrading their browser.
Yes?
>> AUDIENCE MEMBER 9: What you said in the first part about browser [inaudible], what's
your confidence level to do with TaskSpeed and SlickSpeed?
>> JOHN RESIG: The question was: what's my confidence level in tools like TaskSpeed and
SlickSpeed? SlickSpeed is a framework developed by the MooTools developers that analyzes the
performance of CSS selectors in various frameworks. TaskSpeed is a framework developed by MooTools
and Dojo to analyze performance of various tasks like HTML injection, adding and removing
classes, binding events, and stuff like that in various frameworks. They both use the same
basic framework, though, the SlickSpeed framework.
My opinion of them is not very high. Based on what I said, they have no statistical assurance
whatsoever. They run the tests about five times, which is very low — especially for
how long they take to run, it's very poor. In SlickSpeed, for example, you'll find
that most tests run in less than 16 ms, which means you run into that timer granularity issue…
If you go and run SlickSpeed in Internet Explorer, you'll look at the times and it'll be 0, 0,
0, 0, 0, 0, 16, 0, 0, 0. You'll be like wow, I have the fastest browser ever, but it's
not the fastest browser ever. It's just really unfortunate. There needs to exist better ways,
and I've seen a version of SlickSpeed that adds in runs per second, but it only does
that runs-per-second measurement once and then doesn't try to provide any sort
of statistical boundaries on it. It's really quite unfortunate. I hope better tools come
along.
>> AUDIENCE MEMBER 11: Have you looked at Google Closure, and what do you think of it?
>> JOHN RESIG: There are two things. There's Google Closure the library, and it's alright.
I looked through it. I didn't see anything that really blew me away. Then there's Google
Closure the Compiler, which is pretty slick. The Closure Compiler works sort of like YUI
Compressor and Dean Edwards' Packer and various other compression scripts. It has both a basic
compilation mode and an advanced compilation mode. The advanced mode goes much farther
than other toolkits or other frameworks.
That being said, there's a huge asterisk around that, meaning that if you try to just go today
and take jQuery, or take YUI, or take Prototype and put it in an advanced mode and compress
it, all you're going to do is get a broken piece of JavaScript out. The way that the
Closure Compiler is designed is that you put in your framework, you put in your plugins,
you put in your code, and then you compile all of that together into a single codebase,
and that one file you put up on your website. That's the way it's designed to work. Using
it like you would YUI Compressor is going to cause problems, and I don't think a lot
of people realize that yet.
I know when I first used it, I popped jQuery in and was like wow, it's so small. Well,
it's so small because half the code got ripped out in the meantime and it doesn't work anymore.
[laughs] You kind of have to be aware of that, and you have to realize the situation in which
it was built. They built it so that they could put a single JavaScript file on their website
and that would have everything in it. All the non-essentials would be stripped out.
I definitely recommend checking it out. It may be useful for your application, but just
be aware of what's going to happen.
Alright, I think that's it. Thanks again for having me. If you have any questions I'll
be around. Have a good evening.
[applause]