>> JOHN RESIG: I really appreciate you having me here. I wanted to talk about a few different
topics today, all things I'm thinking about at the moment and working on. I wanted to
talk a little bit about JavaScript testing, I wanted to talk a bit about JavaScript performance
analysis, and then wrap up by talking about some of the cool things we're working on for
the next version of jQuery that's going to be out here in about a month. If you have
any questions while I'm going, please raise your hand. Usually if you have a question,
someone else probably does as well, and it'll just be easier that way.
Yes?
>> AUDIENCE MEMBER 1: Can you just give a brief overview of what jQuery is, please?
>> JOHN RESIG: Sure. jQuery is a JavaScript library. It provides a variety of functionality:
DOM selection, traversal, manipulation, Ajax, events, and things of that nature. It tries
to make your life a lot saner by not having to deal with cross-browser issues, and just
generally simplify things.
I want to talk a little bit today about the importance of JavaScript testing. I find that
JavaScript testing is fundamentally very challenging, much more so than desktop application testing.
It's extra important because not only do you get the improved development workflow, but
you also get protection against regressions and against weird browser issues. I want to
really emphasize that JavaScript testing is just fundamentally very different from normal
application testing, because you're now dealing with vast quantities of weird browser
quirks everywhere.
The question that I usually hear, at least when people are getting started in JavaScript
testing, is people wondering what to use to do testing. I ran a survey a couple of months
ago asking people what they use and I got about 1800 responses. For a lot of the frameworks
mentioned, only one person said that they were using it, so there's an incredibly long tail of
testing frameworks. The reason for this is that it's actually really, really easy to
write a testing framework. I wanted to step through and show you how a JavaScript testing
framework is constructed, and why you might consider building your own.
A typical testing framework has these components: you have the full suite, and the suite encompasses
a bunch of tests. One aspect that is usually pretty unique to JavaScript is asynchronous
testing — you can test Ajax, animations, and things of that nature. At
least in that respect, it's very JavaScript specific. But in general, JavaScript testing
doesn't differ that much; the fundamentals of the framework don't differ much from other
testing you might do.
As an example here, this is the absolute minimum JavaScript testing framework. It's a simple
assertion function, and all it does is check to see if the value that's passed in is true
or false. If so, it logs out a statement that is either red or green, depending on whether
it's passing or failing. This is the bare minimum that you need to do testing, but at
the same time, this is all that most testing frameworks are. They're glorified assertion
logging frameworks. They're designed to help make it easier for you to log all of this
out so that you can read it later.
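[A minimal sketch of the kind of assertion function being described, assuming a <ul id="results"> element in the page with "pass" and "fail" CSS classes; a reconstruction, not the actual slide code:]

```js
// The simplest possible "testing framework": a glorified assertion
// logger that appends a green or red line per assertion.
function assert(value, desc) {
  var li = document.createElement("li");
  li.className = value ? "pass" : "fail";
  li.appendChild(document.createTextNode(desc));
  document.getElementById("results").appendChild(li);
}

assert(true, "this assertion passes and logs green");
assert(false, "this assertion fails and logs red");
```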
Some of the structure that starts to come along then is more for your benefit,
as a developer, to be able to understand the results that are coming through. One
of the first improvements that you see is some sort of test grouping: being able
to group a bunch of assertions together under a single unified test. Usually the test itself
is linked to, let's say, a particular method in your API. Obviously it will depend on your
application how you choose to group your assertions. In the end, though, it's really not that hard
to implement test grouping. This is the full code — this is replacing the code from the
previous slide. Again, we just have an assertion, and we have a testing function. The testing
function takes an additional function callback, and then that callback is executed every time
the test function is run. In the end, it doesn't make it that much more complicated, and all
these assertions are then nicely categorized.
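[A sketch of that grouping version, layered on the assert() above and assuming the same <ul id="results"> page structure; again a reconstruction of the slide, not the actual code:]

```js
// Grouping assertions under named tests: test() creates a nested <ul>
// for its callback's assertions, so results are categorized per test.
(function() {
  var results;

  this.assert = function(value, desc) {
    var li = document.createElement("li");
    li.className = value ? "pass" : "fail";
    li.appendChild(document.createTextNode(desc));
    results.appendChild(li);
    return li;
  };

  this.test = function(name, fn) {
    results = document.getElementById("results");
    // Log the test's name, then nest its assertions under it
    results = assert(true, name).appendChild(
      document.createElement("ul"));
    fn();
  };
})();

test("addClass", function() {
  assert(true, "grouped assertion #1");
  assert(1 === 1, "grouped assertion #2");
});
```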
If we wanted to add in asynchronous testing, as I alluded to before, asynchronous testing
is the sort of idiom that you see more in JavaScript than elsewhere. In this
case, for some reason, I'm testing a timeout, which you probably aren't going to do in your own code,
since a timeout is a timeout and you don't really need to test it. In reality, you would probably
be testing an Ajax request to a server and making sure that the correct response comes
back, or something of that nature. You can see the difference here, though, is that there
are two additional function calls. There is the pause and the resume. What the pause is
doing is it's telling the framework to stop executing other tests, then whenever the resume
occurs, to again start executing the rest of the test, to continue moving along.
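[A reconstruction of the pause()/resume() pattern together with the small queued-test implementation the talk describes next; treat it as a sketch, not the actual slide code:]

```js
// Tests are queued; an asynchronous test calls pause() to stop the
// queue, then resume() from its callback to continue running tests.
(function() {
  var queue = [], paused = false;

  this.test = function(fn) {
    queue.push(fn);
    runTest();
  };

  this.pause = function() {
    paused = true;
  };

  this.resume = function() {
    paused = false;
    setTimeout(runTest, 1);
  };

  function runTest() {
    if (!paused && queue.length) {
      queue.shift()();
      if (!paused) {
        runTest();
      }
    }
  }
})();

test(function() {
  pause();
  setTimeout(function() {
    assert(true, "the asynchronous step completed");
    resume();
  }, 100);
});
```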
The implementation for it is painfully simple. This is layered on top of the code from the
previous slide. All told, this is a framework that has assertions, tests, and asynchronous
testing, and it's only about 20 or 30 lines of code. So it's really not that hard to write
your own testing framework, and I tend to encourage you to do it simply because writing
a testing framework is a good way to better understand what you're trying to achieve when
you're testing. At the same time, generally speaking, when you're writing a testing framework
there aren't that many cross-browser issues that you have to deal with. You're just worrying
about logging out information, you aren't trying to develop something that's going to
have to worry about the minutiae of browsers. In that way, I find it to be a good, healthy
JavaScript writing activity at least.
But the reality is that people don't test. In the survey that I ran, about half the respondents
just didn't test their JavaScript at all, and this is really unfortunate. The reality
is that tonight you're probably not going to run home and write a testing framework
from scratch, you're going to want to use something that's already been built and that
people are already, hopefully, familiar with.
There are a bunch of popular testing frameworks that already exist, and these are some of
the top results. The big four were QUnit, JSUnit, Selenium, and YUI Test. All of those
had a significant number of responses, and they all effectively tied in the results.
I want to talk about those four today. Most testing frameworks you see center around the
notion of doing unit testing, and that's roughly the structure that I outlined in the little
sample framework: you have assertions grouped into test groups, etc. Again, the
popular frameworks here are QUnit, JSUnit, and YUI Test.
JSUnit has been around for the longest. It came out in about 2001, and it doesn't really
feel like it's been updated since 2001. It's a very crufty framework, and it's really kind
of frightening if you start to look at the code. It's pretty much just a straight port
of the JUnit stuff over to JavaScript, and it definitely feels like that. Here's some
sample code of initializing tests and then running tests.
When I was checking out JSUnit I was trying to figure out how to get at the total number
of tests that have been run, and this number was embedded in the page, so I was using Firebug
to try and go in and get at this number. I went in and inspected it and clicked the number,
but when I did that it was nested so many layers deep… The thing was, I was like
'OK, it's probably in a table or something', but it wasn't tables, it was all frames and
framesets. I don't know why they didn't use tables, I would have loved tables. There was
frame, frame, frame, and it kept going down. Some people still use JSUnit, but I highly
encourage you to move to a more modern framework. There are very good ones out now. This is
the JSUnit runner.
I tend to recommend YUI Test very strongly to people who are just getting started with
testing, and I'm not just saying that because I'm at Yahoo!, I actually do really like YUI
Test. It's very well written, it has a lot of great features. Probably my favorite feature,
though, is the really excellent event simulation code that's in it. You can use that to simulate
mouse clicks, keyboard typing, things of that nature, so that you can layer that on top
of your application to simulate all that happening. What I found to be surprising is that it rated
so well — so many people used it — but it's only been out for about a year at this
point. I was really impressed. I'm sure YUI Test is probably going to be the most popular
testing framework by this time next year.
Just as an example, this is what the YUI Test code looks like. You set up these larger test
cases which have smaller test cases inside of them, and you can set up, and tear down,
and do initialization, and things of that nature. YUI Test has this little runner widget
that is embedded in the page, and you can see it logs all the results into it. YUI
3 Test is syntactically very similar to YUI 2 Test. There are not that many major differences.
The runner has been spiffed up a little bit, and has some gradients and some rounded corners
now, so it's better.
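[A hedged sketch of what YUI 2 Test usage looks like; the YAHOO.tool and YAHOO.util names follow the YUI 2 documentation, but treat the details as approximate:]

```js
var testCase = new YAHOO.tool.TestCase({
  name: "addClass tests",

  // setUp/tearDown run around every test method
  setUp: function() {
    this.el = document.createElement("div");
  },
  tearDown: function() {
    this.el = null;
  },

  testAddsClass: function() {
    this.el.className = "foo";
    YAHOO.util.Assert.areEqual("foo", this.el.className);
  },

  testSimulatedClick: function() {
    // The event-simulation feature praised above
    YAHOO.util.UserAction.click(this.el);
    YAHOO.util.Assert.isObject(this.el);
  }
});

YAHOO.tool.TestRunner.add(testCase);
YAHOO.tool.TestRunner.run();
```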
QUnit is a testing framework that I helped write to do unit testing on the jQuery project. We've
been working on it for a while now, and we've been coming close to actually having a 1.0
release. QUnit is structured very similarly to the example testing framework that I showed
you before, incidentally, because that's how I like to do testing. It also supports asynchronous
testing. You can do test timeouts, so if you have stuff that's taking too long, it can
time out and be marked as failed. I think more importantly, it's just really simple
and really easy to use, much like jQuery itself.
Here is just an example of the syntax you see when writing a test. There are these module
groupings, with which you can cluster multiple tests together and filter them however you wish.
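[A sketch of QUnit syntax from this era, with module/test plus ok and equals, and stop()/start() for asynchronous tests; the method names have shifted over time, so check the current docs:]

```js
module("core");

test("basic assertions", function() {
  expect(2);                      // declare how many assertions to expect
  ok(true, "a boolean assertion");
  equals(1 + 1, 2, "an equality assertion");
});

test("asynchronous test", function() {
  expect(1);
  stop();                         // pause the runner
  setTimeout(function() {
    ok(true, "ran after a timeout");
    start();                      // resume the runner
  }, 100);
});
```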
And here's an example of the test runner. Actually, the test runner looks a lot better
now. Just in the past week we got a bunch of contributions from some people at the BBC,
and it no longer looks like that.
FireUnit is another unit testing framework that I wrote, and that exists as a Firebug
extension. If you're already using Firebug, this exists as a new panel within Firebug
and it exposes this fireunit namespace that you can hook into and use the testing functions
from. All the results are then logged into this extra panel. It's an alternative to doing
testing within a webpage, and some people like it.
What I think is pretty interesting, though, is that there's actually some standardization
starting to happen within this sphere of JavaScript testing. There's the CommonJS initiative which
has been working to standardize JavaScript on the server-side. Incidentally, they've
kind of grown beyond that original scope and they're now encompassing both the server-side and
the client-side. One aspect of that is that they've been working on developing a standard
testing framework format for JavaScript. They've standardized the names of the methods so that
theoretically, you could use the same testing method names across all frameworks. QUnit
recently adopted this, and I hope other frameworks will too.
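[The standardized assertion names are along these lines, hedged from memory of the CommonJS Unit Testing 1.0 draft; a framework adopting the spec exposes these same names regardless of its internals:]

```js
var assert = require("assert");  // e.g. in a CommonJS environment

assert.ok(true);                      // truthy check
assert.equal(1, "1");                 // loose (==) equality
assert.notEqual(1, 2);
assert.strictEqual(2, 2);             // === equality
assert.deepEqual({ a: 1 }, { a: 1 }); // recursive structural equality
assert.throws(function() { throw new Error("boom"); });
```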
An interesting thing that I see people doing is server-side testing. This is doing testing
of JavaScript that is disconnected from an actual browser; simulating a browser and then
doing testing within it. I'm going to throw a major caveat here, because the problem is
that whenever you attempt to simulate an actual browser, or simulate what a user might be
doing, at best you're going to get an approximation of what the user is actually going to be doing.
Nothing beats actually having a real user do real testing, especially in a real browser.
I just want to throw that out there because I know some people do server-side testing,
but I don't know of anyone who does it exclusively. They usually do it in conjunction with
their normal testing routine.
Here are a few frameworks that are pretty popular. One is Crosscheck. Another one I didn't
mention, but which is also good, is HtmlUnit. Those are both written in Java; they essentially
wrote the entire DOM in Java, running on top of Rhino, so you can run all of this on
the server-side. It's really quite cool. Env.js is a project that I wrote, started a couple
of years back, and it's since grown and become its own beast. It's similar to Crosscheck
in that it provides a full DOM on the server-side, but is written in pure JavaScript. So it's
all in JavaScript, and ideally it would be able to run on multiple platforms, not just
in Java and on Rhino. Blueridge is a more recent one, and this takes Env.js, and a couple
of other frameworks, and creates a full testing pipeline.
Just to give an example, this is an example of Env.js running. This is actually in a console
here. I'm not sure how many people here are familiar with [Mozilla] Rhino, but Rhino is
an implementation of JavaScript written in Java. You usually use it from the jar file
— you load up the jar file and then you can start running JavaScript right there in
the console with no browser. There's no browser involved anywhere here. In here you load
Env.js and start running various pieces of jQuery manipulation. The test suite of
Env.js is actually the jQuery test suite, and Env.js is able to load a number of the
major frameworks and run them successfully.
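[An illustrative Rhino shell session of the kind being described; the file names here are assumptions, not the exact demo:]

```
$ java -jar js.jar
js> load("env.js");
js> window.location = "test/index.html";
js> load("jquery.js");
js> jQuery("body").append("<h2>Hello from Rhino</h2>").find("h2").text();
Hello from Rhino
```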
Another problem that pops up usually when you're doing JavaScript testing is that since
you're going to be testing in multiple browsers, you're going to want to automate that process
because you don't want to have to physically sit there and open up a new browser, open
up a new tab, click refresh and run the tests again. You want all of that to be taken care
of for you, and that's where this whole collection of tools comes in. There are a bunch of browser
drivers that effectively sit and act as a server, spawning browsers that
run your test suite. When they spawn the browser they'll try and collect the results and bring
them back into your central testing area, wherever that may be.
There are some very good ones out there and I'd definitely recommend them, especially
if you want to automate that process. As I mentioned, Selenium does a very good job here.
Selenium has an almost complete pipeline, all the way from having a testing framework
up to automation, including distribution. This is just a little diagram from the Selenium
site showing roughly how browser launching works. You have this server, usually written
in Java, sitting there on the desktop spawning browsers, popping them up and collecting the
results back out.
The problem then becomes: what if you want to integrate this testing into your continuous
integration? You want to make sure that on every single commit you're running against
all these various browsers. This becomes a really challenging problem, and it's also
a problem that doesn't scale very well. You can see here that there are actually two different
tools that are explicitly designed to tackle this problem. One is Selenium Grid, where
you can take your Selenium tests and push them out to Amazon and across their many,
many servers, and run your tests very quickly. This is actually very cool. Again, this is
part of that full pipeline that Selenium has, so it's quite neat. Another tool that I've
been working on is called TestSwarm. It's a little bit backwards; it works more where
users are signing up and electing to help participate, rather than it being forcefully
distributed out. We'll talk a little bit more about that.
TestSwarm came about because in the jQuery project we were having significant problems
scaling out our day to day testing. We needed to be running against about 15 different browsers
— not only that, we had a number of test suites that we needed to run in every single
browser — and it just did not scale well. Us sitting there, opening up tabs or even
trying to automate it with a browser launcher, it was just way too cumbersome. In the end,
we wanted a way that the full community could help us test. This is where TestSwarm came
in. It works sort of like SETI@home, or another distributed project of that nature, where
there's a central server that collects all these test suites, then our clients connect
and help run tests just in their browser. It's constructed very simply — the client
is actually just a basic HTML page doing Ajax requests, asking if there are any new tests
at the server. If so, it opens up the little iFrame with the test suite in it, collects
the results, and sends them back to the server. It works quite well.
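[A hedged sketch of that client loop; the endpoint and response shape here are illustrative, not TestSwarm's actual API:]

```js
// Poll the server for work; if a suite is pending, run it in an
// iframe and let it report back, otherwise wait and poll again.
function pollForWork() {
  jQuery.getJSON("/api?action=getrun", function(run) {
    if (run && run.url) {
      jQuery("<iframe></iframe>")
        .attr("src", run.url)        // the suite reports its results
        .appendTo(document.body);    // back to the server when done
    } else {
      setTimeout(pollForWork, 30000); // nothing to do yet
    }
  });
}

pollForWork();
```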
Here's just a rough diagram showing you how TestSwarm is built out. The test suites are
submitted by various projects. As it stands, the TestSwarm service is exclusive to a couple
of open source projects, but the software is completely open source, so if you wish
to run tests for your own organization you can just download it and run it yourself.
All these tests get distributed out to everyone that's connected, and the results get collected
and displayed. This is the homescreen, if you will, showing you the different clients
that are currently connected to the Swarm.
And this is the ultimate result of TestSwarm. What you have horizontally here is all the
various browsers that are being tested against, and vertically you have the commits that are
coming in. You can see exactly, commit by commit, what was changing at every browser.
You can see that there were errors happening in Opera up until commit number 6432, at which
point it switched to green, because obviously a fix for that landed. TestSwarm aims
to provide the full continuous integration experience so that you can just submit your
test suites and everything else is taken care of for you. You can see that this is what
it looks like when a suite is running. This is running the Prototype test suite. The
suite is broken down into its individual sub-suites, and you can see the browsers
running the tests live.
This is something that I've been working on, and we're hopefully going to have the final
release here in the upcoming months. If you would like to help, definitely go check it
out. It's down at the moment because I'm switching servers, but it'll be back up very soon. One
of the things we're working on doing is hopefully being able to provide some incentives so that,
if you're helping and you're brave enough to be running IE6 on Windows 2003, then we
will shower you with t-shirts and things of that nature. There was a high score board
on the website that was very competitive, people were really gunning for it, and there
was a lot of data there so I had to disable it. But that's definitely going to be coming
back.
Before I talk about measuring JavaScript performance, are there any questions about JavaScript testing?
>> AUDIENCE MEMBER 2: Is TestSwarm coupled to any library for testing? Was it QUnit or
something like that?
>> JOHN RESIG: The question was: is TestSwarm coupled to any particular library or anything?
No, it's not. TestSwarm has built-in hooks for QUnit, Selenium, Dojo, Prototype's Test
Suite, I think YUI Test, and a couple of other ones. Most of the major ones are in there.
But the API for it is very simple — you need to call two methods, the results come
in, and you're all done. I think I showed Prototype running. I've gotten
the Prototype framework running, MooTools, jQuery, Dojo, I've got them all running in
the Swarm.
I wanted to talk a little bit about finding ways to accurately measure JavaScript performance.
This turns out to be a very tricky problem in which there's both a lot of confusion,
and a lot of wrong results all around. There are two major use cases you see when it comes
to measuring JavaScript performance. There's the case where you have identical
code, but you want to compare different platforms. Usually, the people that care about this sort
of thing are the browser vendors. They want to know, given this piece of JavaScript,
which browser runs this code the best? Which browser runs this the fastest? That is one
distinct use case. The other use case is when you have different pieces of code and you
want to compare the relative performance, like your JavaScript frameworks, and you want
to see which one does CSS selectors the fastest or something of that nature.
To start, I wanted to talk about when you have the same code but on different platforms,
so analyzing different JavaScript engines. There already exist a few frameworks out there
for analyzing performance, and actually just before getting up on stage I got an email
from Microsoft and it sounds like they might have just released one. We'll see. There might
be a fourth one here in a little bit. These different frameworks give you different approximations
for the relative performance of JavaScript engines. SunSpider was developed by the WebKit
team, the V8 Benchmark was developed by the V8 team, and I developed Dromaeo in my work
at Mozilla. They all try to provide a certain level of statistical assurance that the results
that you're getting are correct. I wanted to talk a little bit more about that because
it's really important, especially being able to reproduce the results every single time.
When SunSpider was first released, all the results were very finely tuned and designed
to be very balanced, so all the tests… I'm trying to remember. I think they took
Firefox 2 and made sure all the tests ran in about the same amount of time on Firefox
2. Obviously that's changed pretty drastically since then — browsers have gotten much,
much faster, so the tests don't run in the same time any more. They also provided some
level of statistical assurance that the tests that were running were actually running within
this amount of time. They can say that this took 1000 ms to run, plus or minus 5 ms. All
the tests are run by loading a full test into an iFrame and then loading that about five
different times, and doing analysis upon those five results. One problem, though, is that
if you ever try to fix a bug in the test suite, there's no versioning built in, so you have
to throw the whole thing away and release a whole new suite as the next version.
I wanted to explain what I mean by error rate a little bit, because it can be very important.
It's a way of saying how confident you are that the result that you're producing is actually
what you say it is and that it's going to be within this certain realm of numbers, and
that the next time you run this test, you'll be getting something within that range. This
is all going back to the normal distribution. I'm going to be throwing down a little bit
of math here, but I think we can all handle that.
The way the normal distribution works is that the majority of the results are going to be
happening, 80 per cent of the time, right in this center. But you have weird fluctuations
that happen — sometimes it'll run a bit faster, sometimes it'll run a bit slower — and
you can see this in actual results. This is some test data that I did from runs in some
different browsers, and you can see, almost exactly, these normal distributions here.
Blue is one browser, red is a different browser, yellow's another browser. You can see this
tapering; the vast majority of the results are all happening around this specific time,
and then it tapers off to each side. What we want to be able to do in these results,
then, is say that this middle result is happening the majority of the time, and the rest of
the time, less frequently, it's going to be happening either a little bit faster or a
little bit slower.
To be able to determine a level of confidence we use different techniques, and one of them
is called a t-distribution. It's a way of being able to say that the majority of the
results are going to be happening within this portion of the distribution. In this case,
around 90 per cent of the results are going to be happening in this very small portion
of time. Using this, you end up with what's effectively called an error rate. When running
this again, you can promise with a 90 per cent certainty, or 95 per cent certainty,
that you will get a number within this range. You can use this to say: this took 123 ms
plus or minus 5 ms, so maybe next time it will run a bit faster, maybe next time it
will run a bit slower.
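[A sketch of that error-rate calculation: sample mean, standard error, and a t-distribution interval. The 2.776 critical value is for 95 per cent confidence with five samples (four degrees of freedom), matching SunSpider's five runs:]

```js
function errorRate(times) {
  var n = times.length;
  var mean = times.reduce(function(a, b) { return a + b; }) / n;
  var variance = times.reduce(function(sum, t) {
    return sum + Math.pow(t - mean, 2);
  }, 0) / (n - 1);                 // sample variance
  var stdErr = Math.sqrt(variance / n);
  var tValue = 2.776;              // t(0.975, df = 4); use a table for other n
  return { mean: mean, plusMinus: tValue * stdErr };
}

// errorRate([1000, 1004, 998, 1002, 996])
//   -> { mean: 1000, plusMinus: ~3.9 }, i.e. "1000ms, plus or minus 4ms"
```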
That's the technique that all these benchmarks use. The SunSpider one uses the t-distribution,
and so does Dromaeo. The V8 Benchmark works a little bit differently from the SunSpider
benchmark in that the SunSpider one only runs the test five times, but the V8 Benchmark
runs it thousands of times. The way it does this is instead of trying to measure the absolute
time it took to run a test, what it does is try to measure how many runs per second you
can do. This is really nice because tests that run really quickly otherwise have
a massive error rate. For example, if you have a test that runs really fast, one that only
takes a millisecond to run, you might run it and get something like 1 ms, 1 ms, 1 ms, 3 ms, and
if you think about it that's a huge error rate. You don't know what the actual real
result is going to be. It doesn't help that browsers have really poor timing mechanisms.
In general, the results are just very messy.
But I mean, if you look at a test that runs very slowly… In this case there's a
4 ms fluctuation here, but since the numbers are so large it doesn't really matter. The
problem, then, is that tests that run faster need to be run more times, because when you
start running them more times you get more accurate results. You can see here that there's
a slow running test, and there's a lot of fluctuation in the results of a slow running
test — a large absolute fluctuation, maybe plus or minus 40 ms. But since the results
we're talking about are already like 1000 ms, that's just a very small percentage. Whereas
with a fast running test, that is not the case. What we end up with is the faster a
test runs, and consequently the faster browsers become, the more error-prone tests become,
especially tests like SunSpider. Since all the browser vendors are actively trying to
improve the performance of those tests — and they are, they're improving them very much
— it means that the faster they become, the higher that error level is going to be.
This is where that really nice runs per second comes in, that the V8 Benchmark does. This
means that the tests that run the fastest just keep getting run more and more times.
What they do is try to run it as many times as they can within one second, so at the very
least, you aren't going to be running it for more than a second. It won't be freezing up
your browser and everything. This V8 Benchmark introduces this technique, and Dromaeo now
uses it as well. This is really nice because, like I said, it gives you much finer grained
results for the tests that need it the most.
What you end up with is this runs per second number, rather than just an absolute number
of seconds. What you can do with that number is run that whole collection of tests multiple
times. For example, you might end up running it once and you'll end up
with 1000 runs in one second. Then you take that whole thing and run it again, and you
end up with many, many thousands of runs making up a single result. This gives you an incredible
amount of accuracy in your data, and this is what I think is so important. To get at
a final result you would probably use something like a harmonic mean. I've just got it here
as an example. You could end up here with a final number that accurately measures runs
per second.
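[A sketch of the runs-per-second technique and the harmonic-mean summary being described:]

```js
// Run the test as many times as possible within one second.
function runsPerSecond(testFn) {
  var runs = 0,
      start = (new Date()).getTime();
  while ((new Date()).getTime() - start < 1000) {
    testFn();
    runs++;
  }
  return runs;
}

// Summarize several one-second passes: n / sum(1/x_i).
function harmonicMean(samples) {
  var sumOfReciprocals = 0;
  for (var i = 0; i < samples.length; i++) {
    sumOfReciprocals += 1 / samples[i];
  }
  return samples.length / sumOfReciprocals;
}

// e.g. harmonicMean([980, 1020, 1000]) -> a single runs/second figure
```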
Dromaeo uses both of those previous techniques. It uses error calculation and runs per second.
It's also versioned so that if there's a bug in your tests, you can go back and change
it, and it'll make sure that it doesn't try to compare runs against different versions of
the same tests. This ends up being really important because tests do have bugs in them,
and you do need to go back later and change them.
But this all leads up to the other problem area, which is the problem of running different
code on the same platform. The example I gave before was testing multiple libraries
on the same… doing the same thing, but with different pieces of code. The problem
here is that there are very few testing frameworks that exist to cover this sort of area, and
the ones that do exist are very poor. They do not work well, and provide very inaccurate
results.
One of the reasons for the inaccurate results is the problem of garbage collection which
occurs in browsers. Browsers are constantly churning and trying to free up as much memory
as possible. They're going through and saying OK, you're not using that object any more,
we can free that up. This gets expensive, especially if you have many, many tabs open
and there's Flash going on, or who knows what. It can really add up. You'll end up with results
like this: you'll see 10 ms, 13 ms, 11 ms, and then like 400 ms. Obviously your code
didn't suddenly become slower in the mean time, what happened is that the browser did
a garbage collection in the background. Unfortunately, that's not something that you can control.
It's out of your hands; the browser's off doing its thing. So you kind of have to take
this into account when you're doing your tests.
There are obviously multiple techniques that you can use to try and arrive at a result.
Most people that I've seen end up not using the mean, the average, and instead start going
towards something like the mode, trying to figure out the most frequently reported result
that comes out of all these numbers. The thing is that you have to be really, really careful
about discarding bad results, because numbers that are reported inherently have some meaning.
For example, if you ran a test for framework A that causes more garbage collections to
occur, that is probably important information in and of itself. Doing something like reporting
a mode would actually discard those very important spikes and make your results less relevant,
so it's actually very important to not discard these garbage collections. I mean, it's tricky
because sometimes you just want to kind of zero in on the actually important numbers,
but the reality is that you have to encompass all of these.
These spikes also become more relevant when you look at the issues with timer accuracy
in JavaScript. The problem is that when you call getTime in JavaScript, it's actually very,
very imprecise. When a number's reported, you see that they're all reported in milliseconds,
but the thing is that the timer doesn't update all the time. It updates, actually, very infrequently.
Just to show an example, these are some tests run on OS X, and these are the results that
I showed before. You can see that there are really nice distributions of results, and
the results are coming in at every single millisecond. The browser timers on OS X are
very, very precise. They're precise at least down to the millisecond. But if we look at
the results on Windows XP, notice what we're missing here. You can see the results coming
in, and they're coming in at intervals. There's absolutely nothing in between these 15 ms
intervals, and that's because the timer is only updated every 15 ms.
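[A sketch of how you can observe that granularity yourself: spin until getTime() changes and measure the jump:]

```js
// Reports the smallest observable timer step: ~1ms on OS X,
// typically ~15ms on Windows XP.
function timerResolution() {
  var start = (new Date()).getTime(),
      now = start;
  while (now === start) {
    now = (new Date()).getTime();
  }
  return now - start;
}
```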
So it doesn't matter how many times you run it, or what accuracy you're trying to get
at, it's only going to report at these specific times. If you notice, there are no nice little
normal distributions anymore — everything's clustered around single results. This gives
you really, really poor results. If you ever see anyone try to give you JavaScript performance
numbers on Windows, you can just throw those results away because there's no possible way
in which those numbers are going to be interesting or accurate. What's happening here is that
you're going to have an error rate that is so huge, up to 750 per cent. That's 7.5 times
larger than the actual result, and that is just huge.
One interesting thing that I discovered when looking at this was that when Internet Explorer
is running in Wine, the emulation layer actually has a really accurate timer. So if you run
Internet Explorer in Wine on OS X, you can get access to an accurate millisecond level
timer. Naturally, you're then running Internet Explorer in Wine on OS X, and who knows how
that browser actually performs in the real world. In this case, at least, I find it to
be interesting to casually look at to try and get more accurate numbers, but it should
definitely not be used in any sort of real testing situation because it's going to be pretty drastically
different from the actual real Internet Explorer. Just to show you here, you can see the green
numbers here, and that's Internet Explorer running in Wine on OS X. There are no more
stark spikes like that.
The problem is, in the end, how do we get at these good numbers? How do we get at accurate
timing information in the browsers that we want to tackle? The reality is that we need
to go and use the tools that the browser provides: use the profiler that's baked into Firebug,
use the profiler that's in Safari, the profiler that's in Internet Explorer 8. All these are
actually really good tools. They've improved dramatically over the past year or so, and
they're really quite excellent.
Usually the question that comes up now is 'how do I handle Internet Explorer 6 and 7?'
6 and 7 are still very real parts of our development workflows. One tool that I discovered recently
and that I've been very pleased with is called DynaTrace Ajax. It's a standalone tool that
hooks into Internet Explorer, works with 6, 7, and 8, and it does full tracing of the
entire browser. It's really quite amazing. You can see that it even traces across network
requests, like here — this is actually on Yahoo! Maps. You can see that it traces through
JavaScript, traces through the DOM, through the meat of the browser DOM, so you can see exactly
how long it takes to run native browser methods like getElementById, setTimeout, and stuff
like that. It traces through the timeout, traces through the Ajax, and you can see exactly
how long it took to run each individual step, and what ran the other parts.
I've been very, very impressed with this tool. Not only is it a great tool for Internet Explorer,
it's just a great tool in general. I've already made this a part of my testing toolkit. Just
to sort of zoom in here, we have a detachEvent and a setInterval, and both of those are native browser methods in
Internet Explorer. You can actually see how long it took to run those browser methods
themselves, which is something that no other testing tool provides. I was very impressed
with that.
Another way to get at really detailed information is doing Shark profiling. Shark is a tool
that you can use to get at the underlying internals of an application. It's not browser
specific, but it works for browsers. It's very, very low level. There's probably a good
chance it won't be immediately useful to you, but if you find a weird bug in a browser,
and you attach a Shark profile to your bug when you submit it, I can guarantee that they'll
be more likely to respond to you because that is exactly what they want to see. This can
be really good for filing bugs. You get this full [xx]. It kind of looks like what you
see from DynaTrace, but it's not like it at all. You have all these really cryptic names.
This is tracing through Firefox, through their JavaScript engine. Unless you really know
the JavaScript engine internals, it really doesn't make much sense.
That's all I really want to talk about with analyzing performance. Are there any questions,
really quick, before I talk about jQuery 1.4?
>> AUDIENCE MEMBER 3: Is there any way to force browsers to do the garbage collection
before you start the tests?
>> JOHN RESIG: Is there a way to force the browser to do garbage collection before you
start your tests? Not that I know of. Definitely not in a way that's cross-browser. One way
to force it is to close your browser, open the browser again, and then run your tests.
That's the ultimate garbage collection. Usually what you find is that if you're doing…
At Mozilla, for example, when they're doing performance analysis on Firefox, what they
do is they have a whole server farm constantly turning out builds and popping up new copies
of Firefox. To do a performance analysis, what it'll do is pop up a fresh copy of Firefox
— and this is very fresh, just built — and it'll run through the tests, close it completely,
and then open it and start it again. That is the only way you can really get at absolutely
stable numbers, if you're looking for really, really stable.
Don't have stuff running on your computer, as well, because the other stuff can mess
up everything. There was one thing that I heard: if you run iTunes on Windows — or
is it Quicktime? I don't remember the exact details. Maybe it was Windows Media Player
— the timers start to become more accurate, because something taps into the internals of Windows
and tells Windows to start becoming more accurate. Like I said, just kill everything and start
fresh. At the very least you can start to get some decent numbers.
>> JOHN RESIG: What about the newer versions of Windows? Same problem. In fact, I think
it's the same in other versions of the operating system… Just to repeat, the question
was: what about other versions of OS besides Windows XP, like Vista? I haven't tested it
in Windows 7 yet, but I don't have high hopes. I mean, it's a pretty low level thing, and
I think if they change that a lot of other stuff will be changed.
One thing I should note, though, jumping back to those results — that's IE, Opera, and
Safari, but that does not include Firefox and Chrome. Firefox and Chrome have accurate
timers on Windows. They do black magic to conjure up the correct times. I'm not sure
exactly what they're doing, but they're obviously using a better API than what Internet
Explorer's using. So yes, I should just qualify by saying the situation may have since changed.
I tested Safari 3.1 there, and maybe that change from Chrome got backported, and maybe
it's now in Safari 4. I'm not sure. But it's definitely not in IE8, I know that for certain.
>> JOHN RESIG: Internet Explorer 8 definitely has a profiler. I mean, this is a run here.
You do a profile, you hit start profiling, and it captures everything that comes in and
shows you a dump of the results. I don't know if the dump is as pretty looking as the other
browsers, I'm not sure. But worst case, you could try DynaTrace Ajax, and that definitely
has a profiler, that profiles everything. Maybe that could work.
I wanted to talk about some of the interesting problems that we've been working on lately
in jQuery 1.4. We did the first alpha release last Friday, and the final release is currently
slated for mid-January. There are a few interesting problems that I feel we
tackled in 1.4: reducing the overall complexity of the code, and I want to talk about how
we did that, adding in bubbling support for events that don't normally bubble in Internet
Explorer, and doing script loading.
To reduce complexity in jQuery, one of the things that we started to do is kind of take
a step back and stop looking at absolute times. One of the problems was
that we found we were spending too much time trying to compare ourselves relative to other
frameworks, or trying to compare our speed relative to old versions of jQuery, when the
reality is that we should be spending more time improving the overall code flow and code
quality of jQuery itself. Through that, we can get performance improvements.
One of the ways we did that — I mentioned this earlier — was the FireUnit testing
framework that I built for Firebug. One of the things that I added on to FireUnit was
a way to programmatically get at the full data dump of a profile run in Firebug. You
could get a full JSON data structure of all the data that came out in a profile, and from
it you could see exactly how much time, in milliseconds, it took to run a specific
function. More importantly, you can get at the number of function calls that occurred.
When you start getting at the number of function calls, you can start to compute a level
of complexity for your code. I use the term Big O notation here, but it's not real Big O
notation, it's sort of a one-off, as I call it.
I wanted to test the complexity of adding a class, for example, so I ran addClass against
about 95 elements. I don't remember the exact number but it was about that many. Then I
looked at how many function calls occurred. What we found was that for every single time
you called addClass, it would call six different functions for each individual element. That's
a lot of function calls. Function calls in and of themselves don't necessarily indicate
slowness, though; they may just indicate a poorly written piece of code,
and there may be ways of optimizing it further.
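[The FireUnit profiling hook isn't shown in the transcript, but the calls-per-element idea can be sketched in plain JavaScript by wrapping a function to count its calls; an illustration, not jQuery's actual tooling:]

```js
// Wrap a function so every invocation is counted; wrapping internal
// helpers and dividing by element count gives the "calls per element"
// coefficient described above.
function countCalls(obj, name) {
  var orig = obj[name];
  obj[name] = function() {
    obj[name].calls++;
    return orig.apply(this, arguments);
  };
  obj[name].calls = 0;
}

// e.g. count how often jQuery.each is hit during an addClass call:
countCalls(jQuery, "each");
jQuery("div").addClass("test");
var coefficient = jQuery.each.calls / jQuery("div").length;
// coefficient is that one function's contribution to the "6n" figure
```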
The nice thing about this is that it sort of removes the time portion of performance
analysis from the actual work of improving your code. We can see that there are various
levels here. There is a 6n, a 9n. We can see one really bad one here, remove, that was
2n + n squared, and that's a huge number of function calls for every single element. That
was one we looked at right away, because obviously there's a problem going on there. We looked
at it a little bit more and we realized, at least in the case of remove, that it was trying
to traverse through and call things more often than it needed to. We made that improvement,
and everything that used remove also improved: html, empty, both of those also use remove.
We were able to bring the number down from that massive n squared to about 3n.
Just to show you, these are some of the times that came up in jQuery 1.3.2. You can see
a few n squareds. The 'find div' one down there is 16n. There are quite a few up there. But
watch what happens when we switch to the complexity of jQuery 1.4; I'll flip back and forth. Things
become way simpler. All the n squareds are gone, the 16n is now a 5n, that 8n is now
a 0n in that it only calls 2 functions total no matter how many elements you put in. Using
this, we were able to get some really excellent performance improvements.
Yes?
>> AUDIENCE MEMBER 6: Can you do a one sentence definition of what a Big O is?
>> JOHN RESIG: Big O is a way of denoting complexity. For example, the way it's usually
used is, say if something has a complexity of 'n', that means it's linear complexity
and for every single time that you call… How do I explain this? In jQuery, for every
element there will be one function call. So for something that has a complexity of 3n,
for every element there would be 3 function calls. It's just a way of denoting complexity:
usually a bigger number means a more complex equation, the longer it takes to run, and
usually the worse performance you're going to get.
The numbers here are kind of small. With jQuery 1.3.2 it took about 3 seconds to run, and
in jQuery 1.4 it took less than a second to run. We've been seeing improvements of about 3.5
times over the previous version of jQuery. This is all without looking at absolute
time numbers — we were strictly comparing against and improving ourselves,
more a matter of reducing our overall complexity.
Are there any questions really quick, before I move onto event bubbling? No, OK.
Another problem we tackled in 1.4 is the number of issues that Internet Explorer has surrounding
event bubbling. In jQuery 1.3 we had the new live method, and this live method allows you
to do event delegation really, really simply. Event delegation works like this: if an event
occurs someplace down the page, the event bubbles back up, you capture it, and you handle
it. You bind fewer event handlers in your code, and overall it's generally much faster.
The problem, though, is that Internet Explorer doesn't bubble some events. It doesn't bubble
focus, blur, change, and submit, so those are ones we had to implement and override
their lack of bubbling. One of the tools that we used was this method, developed by Juriy,
and it's a way of determining if an event will work on an element. You can use this
method to figure out, for example, will a submit event ever happen on a div? That would
return true in every browser but Internet Explorer, because in every other browser but
Internet Explorer this submit event will bubble up. We were able to use this information to
determine whether this will actually happen in the browser that we want.
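[Roughly the detection technique being described, after Juriy "kangax" Zaytsev's isEventSupported; a close paraphrase rather than the exact published code:]

```js
function isEventSupported(eventName, tagName) {
  var el = document.createElement(tagName || "div");
  eventName = "on" + eventName;
  var supported = (eventName in el);
  if (!supported) {
    // Older Gecko doesn't expose the property until it's set
    el.setAttribute(eventName, "return;");
    supported = typeof el[eventName] === "function";
  }
  el = null;
  return supported;
}

// Will a submit event ever reach a div? True everywhere but IE,
// which is how the lack of bubbling is detected.
isEventSupported("submit");
```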
Incidentally, the easiest ones to fix are the focus and blur events. Internet Explorer
has a whole bunch of focus and blur style events, one of which is focusin and focusout,
and both of those bubble. You can just replace focus and blur with focusin and focusout,
and it works wonderfully. I wish all of them were that easy, but they're not.
Submit was much trickier. In order to make submit work in Internet Explorer, what
you have to do is watch for the click event to occur. If someone clicks the submit button,
or clicks an image submit button, then you can capture that. Additionally, if somebody
hits the enter key in an input, that will trigger a click on the submit button. All this works
great; you can just watch the submit button. The problem is that if you don't have a submit
button in the form, there's no way to get at the submit event unless you attach a keypress
handler. If you attach keypress, you can then figure out when someone's hitting enter in
a text input or a password input.
Even that one wasn't quite so bad when you compare it to the change event. The change
event is a real, real bear. In order to do the change event properly you essentially
have to implement the full change event. You have to track all changes that occur to the
input, you have to track its previous value, and then on blur check to see if it has changed
in the interim. You also have to track if someone's using the keyboard to navigate around
the form. It's a large, convoluted piece of code. One
thing that I thought was interesting is that there's an event in Internet Explorer called
beforeactivate, and beforeactivate is another one of those crazy special Internet Explorer
events. In this case, beforeactivate happens before a radio button is activated, and you
can use that to get at that value before that occurs.
Another new piece of functionality we've been working on is called jQuery.require; I just
recently committed it to a branch, the day before yesterday. It's a way to dynamically
load pieces of jQuery code. In setting out to do this, we wanted to build a script loader
that would just work really, really well, and that we felt wouldn't harm applications
but would actually benefit them. Some of the things we did are just a given: we wanted
to make sure we didn't load duplicate files. We wanted to make sure that if you
ran it, it could be run synchronously, and that if you did a require and then tried to
use some code that depended on the require, it would be loaded ahead of time. Simple stuff like
that, we wanted to make sure of.
The important point is that we made sure that it worked asynchronously. What we mean by
that is that you can load multiple files, those files will be loaded up in the background,
and they'll be downloaded in parallel. So you can download multiple files in parallel
and not block the browser execution. Not only does it make it faster in that your scripts
will be downloading much faster, but it won't freeze the browser while the script is downloading.
So this is really a win-win situation. The scripts will download faster and it won't
prevent the user from doing anything. The way it works is that we load all the scripts
asynchronously before the document.ready event occurs, and then we just delay the document.ready
event until all the scripts are loaded. It really works quite well. I should mention
that because of this, we guarantee that the scripts will continue to load in the correct
order even though we're loading them asynchronously.
We also provide some URL mapping, so you can type a simple thing like jQuery.require("ajax")
and that'll map out to ajax.js. You can also do some basic namespaces, and it'll translate
them into full filenames. And you can specify full namespaces: if you have code living in
a specific directory off somewhere, or on a specific server, you can specify the full
namespace and that'll all get completed and filled out when it's required.
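[A hedged usage sketch; jQuery.require was on a branch at the time of this talk, so the mapping rules shown are illustrative:]

```js
jQuery.require("ajax");          // maps to ajax.js
jQuery.require("fx.queue");      // a basic namespace maps to a filename
jQuery.require("http://example.com/js/plugin.js"); // a full location

jQuery(document).ready(function() {
  // ready is delayed until the requires above have loaded,
  // so their code is safe to use here
});
```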
I think that was the major content I wanted to cover today. I have one little bonus section
that I wanted to discuss super quick.
One interesting thing I've been seeing recently is that there's a lot of people who really
want to start using HTML 5 today, and a lot of people are trying to use the new HTML 5
elements in Internet Explorer, especially older versions of Internet Explorer, and they're
running into a lot of problems. I just wanted to outline the variety of problems that exist
right now, because it's a real minefield. Consider this a follow-up to my 'DOM is a
mess' talk earlier this year.
One of the first problems, the one that everyone encounters, is that they find they can't actually
style the HTML 5 elements in Internet Explorer. It's as if they just don't exist. What you
have to do, as someone found out a while back, is document.createElement an element that
Internet Explorer doesn't know about, like an HTML 5 element, then suddenly you can start
styling it. This works quite well, actually. There's a nice little script that you can
download that has the full list of all the HTML 5 elements in it that you can just stick
in your header, it's like 300 bytes, it's really tiny, and suddenly all your HTML 5
elements can become style-able. So that's really cool.
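[The trick in question, the basis of that tiny downloadable script, is just a loop of createElement calls; this list of tags is representative, not exhaustive:]

```js
var html5Tags = ("abbr article aside audio canvas details figure " +
                 "footer header hgroup mark nav section time video").split(" ");

for (var i = 0; i < html5Tags.length; i++) {
  document.createElement(html5Tags[i]); // now IE will style the tag
}
```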
However, a problem then becomes that if you try to put an unknown element — an HTML
5 element — into another one in Internet Explorer, it'll just barf. It doesn't really
know what to do, and your elements will get pushed out of your container and kind of get
strewn about. Unfortunately there's really no good solution there short of dynamically
constructing the DOM yourself. If you just ship it in the page like this, it won't
work right away.
Another problem is that Internet Explorer thinks that these HTML 5 elements aren't actually
HTML. It thinks that they're maybe some sort of XML — I don't know, it's kind of hard
to figure out what Internet Explorer's thinking sometimes. But if you look and try to see
what the nodeName is, the nodeName is actually case sensitive, which is very different from
HTML. In HTML the node names are always uppercase. The solution here is that you have
to assume in your code that the nodeName might not be all uppercase, which is kind of weird,
since it always should be.
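[A defensive comparison given that quirk: normalize the case yourself instead of assuming uppercase:]

```js
function isSection(elem) {
  // IE can report "section" rather than "SECTION" for unknown
  // elements, so normalize before comparing.
  return elem.nodeName.toUpperCase() === "SECTION";
}
```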
Then finally, the other big problem is that if you try to inject HTML 5 elements using
innerHTML, it will just completely not know what's going on. You will end up with elements
with a nodeName of like 'section'. It creates an element for the opening section and the
closing section, two different elements. It's really bizarre. It's some of the craziest
stuff you see. The solution here is to write a full HTML parser and to parse the HTML yourself,
and construct a DOM. Obviously that's a pretty crazy solution. That's something I would do,
but I don't recommend it.
Those are just some problems that I wanted to outline that are happening right now. If
you try to use HTML 5 elements in Internet Explorer you're going to hit fun problems.
So that's all I wanted to talk about today. Are there any questions about anything I discussed,
or about jQuery or what have you?
Yes?
>> AUDIENCE MEMBER 7: These people who want to use HTML 5, did you tell them this? Did
they come to see the reason and decide not to use HTML 5?
>> JOHN RESIG: The question was: when I show people the list of how HTML 5 elements are
broken, do they see reason and not use HTML 5? No. At least not yet. I'm probably going
to talk about this a little bit more. It's really an unfortunate situation right now.
You can use some elements some of the time, as long as you don't try to put them inside
of each other and you don't try to dynamically create them. I think the situation is probably
a little bit more tenuous than people realize, so I don't know. It's kind of tricky. [laughs]
>> AUDIENCE MEMBER 7: It seems like there's kind of...
>> JOHN RESIG: Yeah. I don't know. It's really up to the authors themselves to decide
whether they derive benefit from it.
Yes?
>> AUDIENCE MEMBER 8: What's your opinion of Chrome Frame? This breaks, or that breaks.
Why not just Chrome Frame?
>> JOHN RESIG: What's my opinion of Chrome Frame? As a solution to Internet Explorer,
or just…?
>> AUDIENCE MEMBER 8: Just as a work around problems.
>> JOHN RESIG: I guess that could work. I would prefer that they just move to a different
browser. I mean, I think the reality of the matter is that I find it hard to see a situation
in which someone would be able to, and willing to, install a plugin for which they are not
capable of upgrading their browser. It seems like the people who have the problem, they're
going to have the same problems either way.
>> AUDIENCE MEMBER 8: They could install Flash.
>> JOHN RESIG: Are you sure? Are you sure the same situation where someone could install
Flash themselves is also the same situation where they could not upgrade a browser? Usually
when you're in that locked-down a situation in a corporate environment, you can't upgrade
a browser, you can't install a plugin, you can't do anything. I mean, you aren't the
administrator of your machine. I imagine most of the companies that would be capable of
using Chrome Frame are also capable of upgrading their browser.
Yes?
>> AUDIENCE MEMBER 9: What you said in the first part about browser [inaudible], what's
your confidence level to do with TaskSpeed and SlickSpeed?
>> JOHN RESIG: The question was: what's my confidence level in tools like TaskSpeed and
SlickSpeed? SlickSpeed is a framework developed by the MooTools developers that analyzes the
performance of CSS selectors in various frameworks. TaskSpeed is a framework developed by MooTools
and Dojo to analyze performance of various tasks like HTML injection, adding and removing
classes, binding events, and stuff like that in various frameworks. They both use the same
basic framework, though, the SlickSpeed framework.
My opinion of them is not very high. Based on what I said, they have no statistical assurance
whatsoever. They run the tests about five times, which is very low — especially for
how long they take to run, it's very poor. In SlickSpeed, for example, you'll find
that most tests run in less than 16 ms, which means you run into that timer granularity issue…
If you go and run SlickSpeed in Internet Explorer, you'll look at the times and it'll be 0, 0,
0, 0, 0, 0, 16, 0, 0, 0. You'll be like wow, I have the fastest browser ever, but it's
not the fastest browser ever. It's just really unfortunate. There needs to exist better ways,
and I've seen a version of SlickSpeed that adds in runs per second, but it only does
that runs-per-second measurement once and then doesn't try to provide any sort
of statistical boundaries on it. It's really quite unfortunate. I hope better tools come
along.
>> AUDIENCE MEMBER 11: Have you looked at Google Closure, and what do you think of it?
>> JOHN RESIG: There are two things. There's Google Closure the library, and it's alright.
I looked through it. I didn't see anything that really blew me away. Then there's Google
Closure the Compiler, which is pretty slick. The Closure Compiler works sort of like YUI
Compressor and Dean Edwards' Packer and various other compression scripts. It has both a basic
compilation mode and an advanced compilation mode. The advanced mode goes much farther
than other toolkits or other frameworks.
That being said, there's a huge asterisk around that, meaning that if you try to just go today
and take jQuery, or take YUI, or take Prototype and put it in an advanced mode and compress
it, all you're going to do is get a broken piece of JavaScript out. The way that the
Closure Compiler is designed is that you put in your framework, you put in your plugins,
you put in your code, and then you compile all of that together into a single codebase,
and that one file you put up on your website. That's the way it's designed to work. Using
it like you would YUI Compressor is going to cause problems, and I don't think a lot
of people realize that yet.
I know when I first used it, I popped jQuery in and was like wow, it's so small. Well,
it's so small because half the code got ripped out in the meantime and it doesn't work anymore.
[laughs] You kind of have to be aware of that, and you have to realize the situation in which
it was built. They built it so that they could put a single JavaScript file on their website
and that would have everything in it. All the non-essentials would be stripped out.
I definitely recommend checking it out. It may be useful for your application, but just
be aware of what's going to happen.
Alright, I think that's it. Thanks again for having me. If you have any questions I'll
be around. Have a good evening.
[applause]