GTAC 2007 - Jason Huggins & Jen Bevan - Extending Selenium

ALAN: All right. So we have Jason Huggins and Jennifer Bevan. Jason joined Google in March 2007 as a test engineer. Prior to Google Jason worked at a software consultancy called ThoughtWorks for six years. While at ThoughtWorks he and a team of ThoughtWorkers created Selenium, the browser automation tool, but you all know what Selenium is-- why did you put that in there? Jason graduated from the University of Notre Dame in South Bend, Indiana, with a major in management information systems. He currently lives in Mountain View with his family. Jason was actually born in New York City but hasn't been back in a really, really long time, and he can't spell his son's name. JASON HUGGINS: I misspelled it. ALAN: Jennifer is a software engineer in Google's test engineering group. She received a PhD in software evolution from UC Santa Cruz in 2007, and is the creator of Kenyon, an open source repository data mining tool. She previously worked in the radio sciences systems group at the Jet Propulsion Laboratory, and received her electrical engineering and computer science bachelor's degree from UC Berkeley. Jason and Jennifer were actually asked to do this talk last week-- last Friday-- so please extend a warm welcome to Jason and Jennifer. JENNIFER BEVAN: Well, I'll be starting off. So essentially we're going to do a talk in two parts here. I'm going to talk about an experience report from our use of Selenium RC and then Jason will move on to the demo, which presumably will not break this time. So as Patrick and Alan were mentioning, the purpose of this conference is to share experiences and to talk about what we're doing. So what we have is, we took Selenium RC and we put it on a grid of machines. And in that process, through deployment and operation, we learned a lot of lessons at the level of testing this at the Google scale of tests, which is a very high level.
Also as I mentioned, the demo is Jason's going to demo the gridified-- and that's his word, that's why I put his name next to it-- version of Selenium RC for the masses. This uses Amazon's Elastic Compute Cloud just on a pay-by-your-credit-card level. So you don't have to have a lot of resources to start parallelizing your tests. And then we'll talk a little bit about future directions. So a little bit of background. Given what the other talks have already said, I'm going to skip over a lot of the Selenium RC part. But specifically, Selenium RC has not as much [INAUDIBLE] support as, say, Selenium Core, and this talk is specifically focused on our work with RC. We have other work on Core but we can discuss that offline. One key thing I would like to mention is that RC does support the injection of, as Simon put it, large chunks of badly written JavaScript into the page. And actually, we use it for good purpose. What we did is, when we created this farm, not only did we take the Selenium RC tests that people have, we also take not so badly written chunks of JS unit tests and use the same machines to run both types of tests. So this farm was created for a very specific purpose, and you might recognize this slide from Patrick's keynote. Essentially what we had is we had a lot of applications and products that already have some sort of a continuous test integration system that's triggered off of their change lists. So they already have a build that's going on. Their system under test is already being built and deployed, and they also have their own monitoring. So what we do is we take care of everything that happens in between when the tests get sent off and then the results get sent back. And essentially, all we did is we just took the Selenium client interface and implemented our version of that that then captures the commands, sends them over to our servers, which is basically multiplexing and monitoring.
That talks to a real client, a vanilla client that talks to the standard Selenium RC, and then it just takes the usual path off to the system under test. Results get sent back and as far as the tests are concerned it's the same test whether it's running on localhost or on the farm. So a couple of these are statistics that we had. I generated this yesterday so I've got five [? points ?]. This is the number of tests we've run on some of our configurations in the last five days. So Saturday, Sunday are low numbers and then Monday, Tuesday, Wednesday. As you can see, the Firefox ones are our top configurations. We do have pre-selected configurations for this farm. If you want something a little bit different we have other ways of doing this. But these are essentially pre-configured machines, which makes the problem of ensuring that you've got the right test environment a lot simpler. And then just in terms of what this allows our people to do-- for example the Gmail tests, they've got one test suite that was taking over 40 minutes to run and they essentially just did one thread per test. Ran it against the farm and it went to 3.5 minutes, which is the length of their longest test. So we've heard a lot of wonderful things about Selenium, and some not so wonderful things. Dealing with Selenium RC at this scale we did find quite a few issues that we needed to fix. There are some variances in what the different browsers and operating systems will give you. We did find reliability issues, scalability issues. And then of course the tests are running into the limitations that RC itself has. Because Selenium has to operate within the context of a browser, if the browser dies then Selenium can't do anything about recovering the state on this. But any changes that we make to Selenium RC to support what we need, they are getting propagated back to the open source repository. Because that way everyone else can benefit and not have to deal with the same issues we have.
So I'm going to go over some of these issues and if you've been seeing these in your own work with your own grids the nice thing to keep in mind is all of these issues are either actively being addressed or have been addressed and just haven't been propagated out to the repository yet. So our top issue was just test isolation. We actually don't run these tests in individual virtual machines at this point. It's still a trade-off. You can have one test per machine but then you have a little bit of overhead on the machines. And for some browsers you really don't need to do that. Firefox supports multiple tests at one time because you have individual profiles. IE on the other hand doesn't, especially IE 6, and so what Selenium RC does is it modifies the registry settings in your WinXP box, say. And the problem with this is if you have multiple tests from different people coming in at the same time who, say, want to log in to the same session as different users you now have a cookie collision. And this shared resource management was not done in RC proper. So we had two approaches. The easy one to implement was we just single-tracked all of the IE tests to all of our machines that had IE configurations. And it works, it's great. We've deployed that. Individual virtual machines: we still have that on the table as an option but that's not what we're doing at this time. In fact what we'd like to do is work with the RC developers to really support fully interleaved tests with your state saved in between. Secondly, another difference is the way that the JavaScript is evaluated within the browser changes from one to the other. A lot of our tests will use the Chrome version of Firefox that essentially lets them bypass all of the security checks. I kind of understated it there. Whereas *iehta, it was sort of expected to do the same thing of bypassing the security checks but it doesn't really do it the same way.
So we had people who had tests that were thought to run on both that didn't. And in fact given that *iehta is still only experimentally supported by RC we just haven't been able to support that level of testing with IE. So we've only deployed it with the version that doesn't do any security bypassing, *iexplore. And essentially through this experience we learned that we can't just say we support IE 6, IE 7, on these platforms. You have to also define what level of support you have. You have to say tests that do this are fully supported, tests that do something a little bit more complicated are moderately supported, and go from there. And building that into your test submission really helps because then if you do have a set of basic IE tests, they don't do anything fancy, you should just be able to run all those at the same time. And you can really optimize that way. So essentially we did have our testers actually modify their tests a little bit to manage their own cookies a little bit more. And we do support IE to some extent, but we still can't get around the security checks. Reliability issues. So this is all about 0.9.2 and the pre-0.9.2 releases. This deadlock was not as pronounced in 0.9.1. However the features from 0.9.2 are things that we need. So it's a problem of two evils right there. So we chose 0.9.2 and we discovered this deadlock because we were running so many tests so fast that some percentage of our tests kept timing out when they really shouldn't have. What our testers are doing now is they have an automatic retry of any failed tests. And because every time you run a test it can be run on a completely different machine, or the same machine, we don't really know. But if it fails twice in a row it is more likely to be a real failure. And you can actually extend that strategy out. You can do multiple configurations and do a voting system if we really wanted to. Right now we're just doing the retry.
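The automatic retry described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `run_with_retry` and `make_flaky_test` are hypothetical names, a test is modeled as a callable that raises on failure, and none of this is the farm's actual code.

```python
from typing import Callable


def run_with_retry(test: Callable[[], None], max_attempts: int = 2) -> bool:
    """Run `test` up to `max_attempts` times and return True on the first pass.

    A test "fails" here by raising an exception. Only when every attempt
    fails is the result reported as a failure, on the theory that two
    failures in a row are more likely to be real than a one-off timeout.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            test()
            return True
        except Exception as exc:
            print(f"attempt {attempt} failed: {exc}")
    return False


def make_flaky_test(failures_before_pass: int) -> Callable[[], None]:
    """Hypothetical stand-in for a test that times out a few times, then passes."""
    state = {"calls": 0}

    def test() -> None:
        state["calls"] += 1
        if state["calls"] <= failures_before_pass:
            raise TimeoutError("simulated timeout")

    return test
```

With the default of two attempts, a test that times out once and then passes is reported as a pass, while one that fails twice in a row is reported as a failure. The voting idea mentioned in the talk would replace the early return with a count of passes across several configurations.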
And this deadlock is actually our top priority. And in fact we do have a patch that just needs to get merged. And that will be the first thing we ship off to the open source repository. One other issue on the reliability front. It does have a memory leak. It does have a connection leak. They're not very fast, it's just that for the length of time and the number of tests that we run we do see a growth in both types of leaks. And effectively we first went with our defensive strategy, which was just, we'll restart it every so often. And that works, and it's not a great solution. However, we had a few other priorities that were more pressing at the time. And so we have our defensive strategy in place but we will be working also to help search and destroy these leaks. Oh yeah, there was one more. Testing. Well see, all of these issues that we found were in code that passed the regression suite. And so, clearly, we needed to add more tests to the regression suite. So as we do fixes we are adding tests for our own code, we are adding tests for the code that was around the code that we changed. And all of these tests are getting pushed out to the repository as well. Yes, it takes about 20 minutes to run the regression suite right now. This would make it take longer except for what Jason's going to demo in a little while. Scalability. Session identification. So the way Selenium RC does session identification right now is it takes a look at what time your open new browser command arrived. And this actually wasn't a fine enough granularity for us. So we made a change for that, and pushed that out. We also were running into a problem where even on Firefox, which in theory supports multiple tests at one time, right around five or six concurrent tests running at the same time we started getting some other odd-- not exactly a deadlock-- but some other odd timing issues.
And so we talked to them about this and there is an unofficial assumption that this would always be run in a virtual machine. And so the tests don't really investigate what happens in this model. And so we are considering both, really optimizing multiple tests on a real machine versus the virtual machine. And all of these changes, like I said, will get pushed out. Performance. I think that I was going to get asked about performance from some of the other talks. And of course you'd want to. It's actually not our priority. Because we are generally going with a massively parallel execution we want to get that part working first, and then we can do more performance optimizations on your tests. But sometimes it's your tests themselves that can be optimized, and sometimes it's Selenium RC. And you don't really know that it's RC's fault at that point. So just a couple other things. And it's a little bit detailed. But as some of you-- if you noticed you were talking about timeouts. Well, when we fixed the deadlock in our local version we suddenly noticed that certain tests that were supposed to have a timeout of about 45 seconds, when the page wasn't available they would time out in four seconds. And clearly that wasn't right. So we found that, fixed that, that's getting pushed. And then the second one. You tend to have-- oh, I don't know-- after we ran it for about half an hour we would have 40 browser sessions left open in certain cases, not all the time. And in a lot of cases you would end up with it being because of the deadlock, or being because the idea that Selenium had of what the browser was doing mismatched what it actually did. But there's also a few other paths, and we found those. So as you can see, a lot of this stuff you wouldn't necessarily hit either at an interactive level or at a mild scope. And in fact we had deployed this and were running it for a couple of weeks before we really started hitting these problems.
As we got more users, as we got more adoption, that became very apparent. So of course, being lazy or just being human, with a grid of these machines-- you really don't want to have to ever go into a single machine and fix something. You don't ever want to have to go and click don't send this report, or get out of the Firefox Talkback mechanism. You never want to have to do this. However, because Selenium RC is constrained to run inside the browser there are going to be certain classes of errors that it cannot by nature handle. And so you really do need to have some sort of out-of-band communication. And we have one method of doing this that is part of our next version of the farm. And essentially that will handle state management, some extra monitoring, and the ability to just kill things when you need to. And now I'm going to switch to Jason, who is there, and he's going to start on the demo. JASON HUGGINS: How much time do I have? It's 11:45 now. When's lunch? Where's Alan? [INAUDIBLE] 12:30, wow. OK. I probably won't take that long. What was that? Oh, right, right. OK. So this is an amazing grasp of the obvious here, but a little quiz. Which one is faster? This one or this one? Right? OK. So obviously, this one. However one of the interesting things about-- you could call it anti-patterns of Selenium adoption-- is everyone does this because this is easier than that. So what happens is people download Selenium, they have one test, then two, then three. They're like yea, yea, and then four, and then 100, and then 4,000. And then they're like, dude, I've got three seconds of unit tests and five hours of Selenium tests. Selenium sucks. Well, no, it's actually a symptom of an architectural design of this little-- shoot, I just gave away my slides-- Anyway, so what you need to do is put in the effort to parallelize or gridify your tests. Yeah, easier said than done. So how do you add more servers? How do you go from this to this? Well, you thank this guy.
You send him a letter and say thank you, thank you, thank you. He's Jeff Bezos, the creator of Amazon.com. And they went out on a limb last year-- it's been around for a little bit less than one year-- and created Amazon Web Services. So it's a whole family of other things, really crazy stuff like Amazon Mechanical Turk. That's some really crazy stuff. But the two things I'll be showing are EC2, the Elastic Compute Cloud, which on the fly gives you as many machines as you want. So you can go from one machine to one million, if they let you. And the other thing is it dovetails with another service called S3, something storage service. And what it does is you upload a virtual machine to their storage service and then when you want to-- so it's just gigabytes stored, one and one half gigabytes stored on some machine somewhere. And when I want to start up a new instance it'll just go grab that instance and then start it up. Or actually, more correctly, grab an image and create an instance out of that. Anyway, also a bit of context. Last year I was nothing but a shill for Apple because they had just come out with the Intel Macs. And my whole little talk was about how Intel-based MacBooks are cool because you could do Windows, Linux and Mac testing all on one machine. So this year I'm nothing but a shill for Amazon. But I never got a thank you note from Apple. I don't know if that will change this year. So just to summarize-- I can zoom in here, maybe. They have made computing a true utility. It looks like a phone bill or an electrical bill. The key thing right here is I'm charged for-- well that's not really interesting because I only got charged $0.03. So it's $0.10 per hour of usage. So I created my account yesterday. This was as of last night. So I owe them $2 now for 19 hours worth of usage. So that's kind of cool. So you're like, wow holy crap, $0.10 an hour. That's dirt cheap.
I'm going to grab a million machines-- however, psychological effect of pricing-- it's actually about $74 per month if you ran them 100% all the time, every day of the month, which is actually probably about the right price for rack space, a co-located server or whatever. The interesting thing about this is that it's not one server all the time per month, it's CPU hours. You can take one server and do it the whole month or you can take 500 servers and just use them for an hour. That might actually be a lot more useful for your build process when you're more interested in a five-minute build than one server for a month. So that's kind of interesting. So time for a quick demo of this thing. All right. Also just briefly, so the last-minute preparation for this: I created the account while other people were presenting yesterday. I went through Amazon's awesome getting started guide. So it actually was pretty useful. So I won't actually go over the steps of creating an account, doing all the stuff like that because it pretty much was paint by number. And then also they've got another website-- this is actually an interesting thing. Jumping ahead a little bit. But there's this notion of public images out there. So some other people in the community they posted-- someone put a Ruby on Rails instance out there. So you can clone that and, boom, you're five seconds from doing Ruby on Rails development. So it would be really cool if there'd be a Selenium instance out there. So if you want to do your own testing, boom, you just bring this online and you can have it. Then ssh back into your network and then you can start doing all kinds of fun stuff. So it's kind of a plug for their services. So I've got two running instances now connecting to some machine somewhere on the globe. I don't know if they're in Seattle or wherever they may be. Let's see if I can get started. I made a little shortcut so-- machine one. OK.
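The pricing arithmetic quoted above ($0.10 per instance-hour, roughly $74 for one server running a whole month, or 500 servers for a single hour) can be checked with a short sketch. The function name `ec2_cost` is hypothetical, and the flat rate is the 2007 figure from the talk rather than current pricing.

```python
HOURLY_RATE_USD = 0.10  # per instance-hour, the figure quoted in the talk


def ec2_cost(instances: int, hours: float, rate: float = HOURLY_RATE_USD) -> float:
    """Cost in dollars of running `instances` machines for `hours` each."""
    return round(instances * hours * rate, 2)


# One server left on for a 31-day month.
one_server_all_month = ec2_cost(1, 24 * 31)

# 500 servers borrowed for a single hour, e.g. for one big parallel build.
burst_of_500 = ec2_cost(500, 1)

# The 19 hours of usage mentioned in the talk.
nineteen_hours = ec2_cost(1, 19)
```

Under these assumptions the month-long server comes to $74.40, the one-hour burst of 500 machines to $50.00, and the 19 hours to $1.90, which is the "about $2" on the bill. Because the billing unit is instance-hours, the burst costs less than keeping one machine up all month.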
So first of all there's a couple command line options that you have. You can describe-- I'll clear this. Simplify my screen here. So first you can kind of just ask Amazon for a list of public images that are available. That's kind of complicated so I'm going to simplify it. These are just the ones I've created in the last 24 hours. So my first version was gtac-vanilla-fc4. They give you by default Fedora Core. Someone in the community has bootstrapped Ubuntu into a Fedora Core and then made it Ubuntu. So I was playing around with that. And then to get ready for this demo I added a whole bunch of stuff like being able to VNC into it to see the remote desktop. And then I took a snapshot of that, and then called that, rather creatively, Ubuntu image two. So yeah, I probably won't remember what that was for. I'll have to watch the video to remember. So let's see, there's a couple other commands that you can see. So if I wanted to just be really crazy right now and put this in a little for loop I could go-- like a nice little one-liner in Python-- run this particular command, this ec2-run-instances, that's a reference back to this. That's a handle to your instance. So this is my golden image. I could put that into here-- I don't know if that shows up-- right there, and then just start up the server. And then about one minute later it'll actually be running. So I'll show you the way to kind of poll that-- also, this is a command line interface. They also provide a SOAP API and a REST API through actual programming languages so you don't have to do it all through the command line. So this part is the sizing. But the interesting thing is these are images and then when you actually start them up you have instances. So I have an instance ID here, an instance ID there. So I have two machines running on the Elastic Cloud. And this is the public IP address of that machine. And if I zoom out, over here-- sorry for the scrolling but-- that is the internal address.
So if you actually had a grid, and you wanted to have all the machines just talk to each other and not go out to the internet to do all their communication, this is an IP address that just the machines can use to talk to each other behind their firewall. So depending on how you're going to be using their services they provide these two IP addresses. So for my purposes, if I'm going to bring up my web app over here I'm really only going to be interested in-- well, I'm going to connect to it remotely and then I'm going to have it connect back. So I didn't really do much with the internal IP addresses. Also I can-- just to show you-- this is where-- hopefully things don't fall down but-- I'm going to ssh into one of the machines. They give you a public/private key-- wow, that was fast. Who am I? I'm root. And I am this machine. That's the internal address. So you can do all kinds of stuff. This is how I actually bootstrapped myself into installing a VNC server, and Gnome, X Windows, and stuff like that. You just do whatever apt-get stuff that you need but you do it through your ssh shell console. And then when you're done you actually take your instance ID and you can just terminate it there, which I'm not going to do. So let's play around with Python really quick. And one of those instances-- let's see if I can get this straight. OK. So the lay of the land here, just nothing but a Gnome desktop-- Is it Nome or Gnome? Nome. Gnome. All right. Gnome-- Gnome desktop and I actually have a console version of Selenium RC just running there, not doing anything interesting, waiting for commands. I'm going to use my scratch pad here. I've got a version of Python running so I'm just going to give it commands. So I'm going to import the Selenium library. I do not want to type this. See if I can get over there. OK there. I created my Selenium object. The cool thing about things like Python and Ruby is that you can get a list of all the commands.
Selenium is kind of like the PHP of web testing applications. Where WebDriver is simple, we've got one big fat global namespace of commands. It kind of stinks. But, hey, PHP is pretty popular. OK. So I'm going to actually start a session, and for my next trick, something will actually show up. Let's see. Select that window, there we go. OK. Watch the X Window. Something's happening. It says, preparing Firefox profile, launching Firefox. It's now starting it, and there you go. I'm now going to do a couple other commands. So more kind of a housecleaning thing. Going to maximize the window, hopefully. Actually I'm just going to do this on the fly. I'm going to open google.com. Let's see. I think I've, hopefully, done this enough times that I can remember this. So I'm going to type-- let's see, what is it? q-- and I'm going to say gtac. And it types gtac into the field. And then I'm going to say selenium.click. And the Google search button is called btnG. So I'm going to click that, and you get to the Greater Toledo Aquatic Club or the Geospatial Training Application Center. Interesting stuff. And then when I'm done with my test-- I could do some assertions but I'm not going to because I'm lazy-- I'm just going to stop the test. And then Selenium just goes back to listening mode. So it's kind of interesting. So that's an interactive way of using Selenium Remote Control with a nice dynamic language like Python or Ruby. Sorry Perl. I think it's available in there, I think it's just a little bit harder to do it interactively. So that's a fun way to play around with the service at first. Now for my next trick I will now do this same thing but in this massive parallel grid system, scaled out, wow, cool. So I'll exit this session and we'll show you how I cheated a little bit. So the way I would approach this problem, usually I'd probably have a program that took all of my tests and created one thread per test.
And then through some kind of magic, [UNINTELLIGIBLE] either polled the service for the next available machine and do a matching game between a thread and an available machine and just send it on its way. Just for simplicity here, just proving the point you could do two tests at once, I just created two test files, test_google_aws1 and test_google_aws2. I'll just show you the first one. Bad programmer, I hard-coded IP addresses in here. But I'm going to show you the test. It was very similar to what I did. This is using Python's unittest harness. So in setup it will actually create a Selenium object and start it. And then in my actual test I will-- can you see this, by the way? Wow, that's small. That's too big. Is that better? So yeah, very simple. Just open, maximize the window, set the speed-- actually that's more for demo purposes. I'm just going to slow it down. Don't actually have to do that for a real test. Again I'm going to type, hello world, click the button, wait for the page to load up to five seconds, otherwise it will time out, and then just actually do an assert. You're saying hello world actually is the title. And then as a teardown I will stop the test. So that is test one, and that is pointing to one of my machines. And if you're live blogging do not attack that machine, please, as I program. Let's see. And then test_google_aws2 is the same thing pointed to a different one. And I'm really cheating here with some multithreaded code. The way I'm actually going to be doing two tests at the same time. Let's see. I'll make this a little bit simpler. I am cheating. I'm adding a nice little ampersand after calling the scripts. That launches it without waiting for it to execute. So a nice little Unix hack. So if you're asking if I'm going to be open sourcing the stuff that I'm showing today, there you go. Because actually it's just the test Google Python script that's already there.
I modified it for an IP address and then did these three lines of code. So I can't possibly fathom how many lines of code that would have been if I had to do that in Java. So I will actually perform my trick, if I can. Set up my screens here. It has to be set just right because-- you'll see in a second. OK. Are you ready to ooh and aah? Ooh, very nice. OK. So I've got the three actors in my little play, machine one, machine two and my little Python console. So I'll go back here and launch that Python test_Google_gridified. Go. Ooh. All right. This is the point where most demos crash and burn miserably. But I succeeded this year, so I do not get a copy from Alan. So machine one, machine two are going. They're typing hello world, they're getting results, they're asserting results and they're quitting. Yea. A demo works. I should just stop right now. So as you notice the test ran and it took 20 seconds. So this is probably the point where you start saying, OK, let's come off our little high here. We just used a big grid, somebody else's machines. We did it all, it all worked. But dude, it took 20 seconds to go to type hello world into Google? So there's some issues here. Well one, I'm going across the country, I think. Maybe their servers are here. But there was time to start up the server. If you actually go all the way back and don't cheat like I did, and do some kind of threading thing where you're going to bring online a server, you have to wait for that time. These are all implications. So there's going to be a trade-off point. If you only have 20 seconds of testing. If somehow magically-- but this is, what, 20 plus 3. Well, so here's the thing, it would have taken 43 seconds but now I got a 50% increase or decrease in time, whatever. So that was pretty good. But there's a trade-off. There's so much overhead in keeping all these plates spinning that you'll have to see where things work.
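The "one thread per test, matched to the next available machine" approach that the demo shortcuts with an ampersand can be sketched as follows. This is a minimal illustration, not the code shown in the talk: `run_on_grid` and `make_fake_test` are hypothetical names, the host names are placeholders, and the fake tests only sleep where a real test would drive Selenium RC on the borrowed host.

```python
import queue
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_on_grid(tests: Dict[str, Callable[[str], bool]],
                hosts: List[str]) -> Dict[str, bool]:
    """Run each test in its own thread, borrowing a host from a shared pool."""
    free_hosts: "queue.Queue[str]" = queue.Queue()
    for host in hosts:
        free_hosts.put(host)

    def run_one(test: Callable[[str], bool]) -> bool:
        host = free_hosts.get()       # blocks until a machine is available
        try:
            return test(host)
        finally:
            free_hosts.put(host)      # hand the machine back for the next test

    # One worker thread per test; the host queue is what limits concurrency.
    with ThreadPoolExecutor(max_workers=max(1, len(tests))) as pool:
        futures = {name: pool.submit(run_one, test) for name, test in tests.items()}
        return {name: future.result() for name, future in futures.items()}


def make_fake_test(seconds: float) -> Callable[[str], bool]:
    """Hypothetical test that only sleeps; a real one would talk to `host`."""
    def test(host: str) -> bool:
        time.sleep(seconds)
        return True
    return test
```

With at least as many hosts as tests, the wall-clock time approaches the length of the longest test rather than the sum of all of them, which is the effect reported earlier for the Gmail suite. With fewer hosts than tests, the extra threads simply wait on the queue, so startup and scheduling overhead decide where the trade-off point lies.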
The other thing about this is that it probably will get expensive if you do this all the time. This happened to me last night. I think I brought some of these things on but I fell asleep, and I'm still getting charged per hour. So you have to make sure that you add into your tests shutting down the instances so they don't grab more of your money even though it's only another $0.10 or something like that, or $0.80 for an eight-hour sleep. So that's pretty much my demo. I just wanted to do a proof of concept. With the best of intentions, I was going to do it like over five hundred machines just to see if I could do it. But Amazon, because the Elastic Cloud is technically in beta-- I blindly clicked through the user license, like everyone does-- and it probably says somewhere in there that you can't just grab 500. Actually that's one thing that is explicit. For regular users you can only bring up 20 instances online unless you email them for permission to bring online more. So I sent them an email last night saying, hey, I'm a guy at Google. I'm presenting at this conference. I want to have 500 machines. Will you let me do it? But I didn't get a response back. So as I went to press they did not write me back. So who knows? Maybe this weekend I'll get my request for 500 machines, and I'll do this again. What was that? Yeah, yeah, yeah. So someone, if they know someone at Amazon, send them a link to the YouTube video and they'll say, poor Jason. All right. So that actually is my demo. I actually didn't know if it was going to be able to work because this started exactly 24 hours ago. So that was it. I could do a whole bunch more complicated things. But I'll just seed the audience with some questions here so it comes up in everybody's talk. Are you going to open source this? So there's actually two parts that need to be open sourced. One thing, the thing that I did just now, you saw the three lines of code. And then all the other stuff, that's just Amazon stuff so ask them.
And then there's the other parts of all of the Selenium, Google grid stuff that you talked about. I'm sure you came up here to give that part of the answer. JENNIFER BEVAN: Right. So for the Google-created Selenium farm, as we've been calling it, we very much want to open source it. When we initially built it we did include some internal technology just to speed up our actual deploy time. And so we're really looking through the best way exactly to take that out and what to put in to let us open source it. But that is what we want to do. And as Jason mentioned, the rest of it you just saw and can replay on YouTube in slow motion. JASON HUGGINS: And before I actually open up to the other questions. Second one, I know everyone has been chomping at the bit to ask this question. Who is Paul Hammant? So first of all, Hi Paul on YouTube, right? So every time I talk about Selenium Paul Hammant comes up to me and says, dude, you forgot to mention me again. JENNIFER BEVAN: Does he say dude? JASON HUGGINS: Yes. Dude. Actually that's his thing. Every time I'm like, "dude!" But he's British so it's kind of odd. So let me show this slide here. That is Paul Hammant. That was taken from his public profile on the thoughtworks.com page so if it's embarrassing to him it's his fault, right? But he is the creator of what is now Remote Control but at the time it was called Selenium Driven. So some things like the [? GSL ?] stuff that was not using Remote Control, that was just using Core. So I'm the creator of Selenium Core and I'm not just the creator, there's a whole bunch of other people at ThoughtWorks that did that. But there was this big, ongoing architectural, philosophical difference of how you do Selenium. So there were these two things, Selenium Core and then when Paul saw the first demos of Core he immediately saw, oh, it's great, but it sucks because you can't drive it from Java. So he wrote the Java drivers to Core, which was called Driven.
And then it got rewritten by Nelson Sproul, Dan Fabulich at BEA, and Patrick Lightbody, who was at Jive Software and then he left. So I'm giving out all of these karma-balancing kudos to other people. So all those other people made the Remote Control version of Selenium what it is. And so, finally, to correct past wrongs, thank you Paul. I don't know who threw that snowball in his face. So anyway, thank you and openqa.org is where you can find Selenium. That's it. AUDIENCE: So I was curious about [INAUDIBLE] Can you hear me now? All right. I'll just speak really loudly. JASON HUGGINS: Or you could come up here. This one works. Yeah, wait-- with the video. I'd have to repeat your question. AUDIENCE: It's a short question. JASON HUGGINS: Or you could. I just will repeat your question if the mike doesn't work. AUDIENCE: Let's try-- aha, there. Thank you. AUDIENCE: In running Selenium in a farm do you always run one instance of the browser per instance of Selenium Remote Control? JASON HUGGINS: With Firefox I think we started with one instance of Selenium RC and then as many instances of Firefox as it can handle. So I don't know, at one point we could get a peak of 20 or 30 or something like that before the wheels started falling off. JENNIFER BEVAN: The wheels started falling off around 20 or so but you can actually start noticing the wheels shaking around five or six right now. We're hoping to boost that number a lot. JASON HUGGINS: But we did not do one server per one test. But we are planning-- it's one of the options of doing one virtual machine, which then does one Selenium server, which then does one test because then your ability to reason about failures is-- you have more ability. When things go multithreaded, it's harder. JENNIFER BEVAN: Although as I mentioned, for IE we do actually single-track the tests, but for other reasons. Yeah. 
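The trade-off in that exchange (several browser sessions per Selenium RC server until the wheels start shaking, versus one test per server for easier reasoning about failures) comes down to a per-server session cap. The following is only a minimal sketch of such a cap, not the farm's actual dispatcher: the `RcServer` class, the `assign` and `release` helpers, and the cap values are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class RcServer:
    """One Selenium RC server and the tests whose browsers it is running."""
    name: str
    max_sessions: int  # hypothetical cap, e.g. 5 for Firefox, 1 for IE
    active: Set[str] = field(default_factory=set)

    def has_room(self) -> bool:
        return len(self.active) < self.max_sessions


def assign(test_id: str, servers: List[RcServer]) -> Optional[str]:
    """Place a test on the least-loaded server that is still under its cap.

    Returns the chosen server's name, or None when every server is full,
    in which case the caller would queue the test and try again later.
    """
    candidates = [s for s in servers if s.has_room()]
    if not candidates:
        return None
    target = min(candidates, key=lambda s: len(s.active))
    target.active.add(test_id)
    return target.name


def release(test_id: str, servers: List[RcServer]) -> None:
    """Free the slot once the test has finished and its browser is closed."""
    for s in servers:
        s.active.discard(test_id)
```

Setting `max_sessions` to 1 on every server gives the one-server-per-test arrangement Jason mentions as an option; raising it trades easier failure analysis for throughput.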
AUDIENCE: I don't know if you guys are using Windows at all for the instances but if you want to test on IE what are you doing about the Windows licenses? Do you have to get a license for each instance of the VM and spawn it up? JENNIFER BEVAN: Well, so for our version we're not actually using VMs so we have just official installs. That is one of the issues if we go to a virtualization model that we will have to deal with. Now I don't know what Amazon's theory is on that. JASON HUGGINS: So yeah, I didn't give enough attention to the trade-offs with the Amazon model. Of course that's only Firefox and I guess potentially Konqueror, depending on how you set it up on Linux on an open source thing. So it's potentially just a matter of sending money to Microsoft to install-- and VMware, or Xen, if you can get that for free, or get that working-- and putting Windows on their grid. I'm sure they wouldn't be too happy with that but it's a simple matter of money. With Apple it's quite not legal. So those are the trade-offs there. Well, yeah, it's possible but not legal. I don't know if it's possible but I've heard rumors. AUDIENCE: And you were a fanboy last year. JASON HUGGINS: Right, right, right, right, right. So yeah, that's the big thing. Yeah, you scale out on this particular thing. But on the grid at Google where we have all the machines and it's-- AUDIENCE: You can have a machine with it. JASON HUGGINS: Right, right. But, oh, one other clarification. There was-- and I don't know if this is old news to people but there is some FUD around Vista's licensing scheme, specifically around virtual machines. They clarified it but then they made it fuzzier. And I guess what it comes down to is you can't use the exact same instance, licensed copy of your host, and use that inside the client-- the virtual machine. I think some people were thinking, oh, Vista is not allowed to be run inside a virtual machine, or Vista consumer, or something like that. 
So you have to go get the platinum-plated enterprise version to be able to do that. That's not true. They clarified it in a really ambiguous way saying they actually gave you more rights, that you could take one license and use that in many different ways. Probably I'll make it sound more-- yeah, it's a tough thing. I can't even make it straight. AUDIENCE: So you're saying if I had a license on a VM, hypothetically, I could bring it up and as long as I don't have that same license on another VM at the same time I can continue to use that one. JASON HUGGINS: But-- JENNIFER BEVAN: [INAUDIBLE] JASON HUGGINS: --since I flew through the user license agreement with the Elastic Cloud I have no idea if I'm breaking the law if I did that. So I would just assume I would be. I would probably want a written letter from Bill Gates that I could do that on Amazon services. And also we don't use the Elastic Cloud inside Google. Just a correct-- AUDIENCE: You mentioned about memory leaks in Selenium RC and how you're working around them by restarting. Can you give me an idea how often you restart? Or is there a certain number of tests, or? JENNIFER BEVAN: Well, actually what we did is we made the assumption that n minus 1 was fairly close to n when n was large. And so since our n is large enough to do this we actually just cycle through continuously. So we always have n minus 1 machines up. And then one of them is getting its tests drained off, those all finish, we restart it, and then we get it back. And generally we could probably do some extra fancy stuff-- putting them into sub pools, cycling through those-- so far we haven't needed to, and we are really hoping to actually fix them before that becomes necessary. AUDIENCE: Well it sounds like you're talking about you restart it after a given set, not in the middle of running a set? JENNIFER BEVAN: Oh, no. 
Definitely let the tests finish, that had already started, and then we just don't send any new test requests off to that machine until it's done processing. It gets restarted, and then we open it up for more tests after that. AUDIENCE: OK. Thank you. AUDIENCE: Hi. I'm wondering about your use of the Chrome browser. Do you always use Chrome or do you only use Chrome in the instances where what you're trying to test is not testable using Firefox? JENNIFER BEVAN: Actually just because the Selenium RC support for Chrome is so good we just use Chrome because effectively it's a superset. Now if someone specifically wanted to test some of the stuff that Chrome bypasses we could certainly set up a configuration for that. We don't have it as one of our current ones but there's nothing that would keep us from doing that. AUDIENCE: I'm just wondering if there are any elevated privileges that are available to Chrome which aren't-- well, which misrepresent what's available to Firefox? JENNIFER BEVAN: Go for it, take this. JASON HUGGINS: Well, the thing is, the same origin policy trips up everyone all the time, everywhere because you think a user can just go to one site and then click a button and go to the next one. But they don't realize Selenium, because it's implemented in JavaScript, is subject to the same origin policy. You can't just go from evilhacker.com to chaste.com and transfer things back and forth. So as far as testing is concerned if you go with the philosophical idea that you are doing what the user is doing you don't want the security there because a user is not limited from going from site to site. So it's a bug for your testing tool to have a limitation that a user, a human being does not. So I would actually say that whatever the *Chrome mode, that we should just string replace-- when you start Firefox it should be in Chrome mode. And then we should then call Chrome mode secure lock down mode or something like that. 
Because if you really want a test that you couldn't have a program type of file upload thing, I think you might want to use a different tool for that. So Chrome is great because there's no security. But there are other workarounds for IE and Safari. There's a thing called-- it's really experimental-- called proxy injection. And there's also some really crazy stuff for SSL support. And it's working but sometimes it's working, sometimes it's not for the apps that I'm trying to point it at. So I think that the future of Selenium RC would be more like an elevated privilege kind of mode into the browser, like native code. Like what the WebDriver is doing where we're actually driving it natively. The downside to that, the native approach, is that you have to get more in bed with the native code. So you have to write C# or C++ for IE, you have to write Objective-C or AppleScript for Safari. Firefox is effectively its own platform. It's a complete black box on any platform it sits on. So you can't use Win32, or AppleScript, or whatever. You have to program to the Firefox platform. So that's why you have to do crazy stuff like telnetting into it because the operating system, as far as it's concerned it's like another virtual machine. Like really crazy stuff. So Selenium solved 80% of the problem with its just simple-- like the JavaScript implemented version. But I made this answer way too long than it needed to be. But, yeah, the security issues are that last-- It was a trade-off, simplicity of implementation, do a JavaScript everywhere, but you lose the ability to do anything everywhere. So eventually we'll be going down to more native, closer to the browser, internals approach. AUDIENCE: Thanks. Can I ask a couple more quick ones? Do you always close a browser after the test or do you sometimes return it to some kind of neutral state and start from there? Because I know that the browser start-up time is a significant part of the test runtime. 
JASON HUGGINS: I've seen both. It depends on what you do. I've seen some places where they keep it up and then tunnel a whole bunch of tests inside of it and then after maybe 10 minutes they'll just shut it down. But I think most people just go ahead and keep it so you can reason about failures. Every test assumes a start up, a launch and a shut down. If you do it in a massive grid hopefully that start up time isn't so huge. But I don't know if you have a take on that. JENNIFER BEVAN: Essentially that's in the control of the test. If the test sends a stop command then it shuts down the browser. So it's well before the actual grid gets involved. So we actually let the tests make that decision for themselves. JASON HUGGINS: And there's an interesting side effect to that. If the test writer-- if you're looking at the grid but that test writer has forgotten to say Selenium.stop you now have machines start percolating extra copies of the-- JENNIFER BEVAN: At which point we shut them down after a while when they've been inactive. JASON HUGGINS: Of course you have the side effect in restarting Selenium anyway because when you stop the server all the browsers that were attached to it disappear. AUDIENCE: So I do have a question about putting the hosting on the grid. So obviously parallelization will help me if I can split my original task into independent sub tasks. Now if I'm testing a web-based application probably all the clients are more or less independent but they are still talking to the same server. And in each test case I expect to find a server in a given state when I start my test case. Now these test cases executed in parallel might have some side effects on the success of each other that I really don't like. Do you suggest also to run a separate web server for each of these applications? JASON HUGGINS: If you go through the little "Getting Started" guide on Amazon they say you don't have just one type of a machine, like a Selenium thing. You could have two types. 
So I could imagine if you want a really thin slice through your stack you could bring up an instance that's nothing but a MySQL server, another one that's nothing but a web server, another one that's nothing but a test harness to those other two. So any time you're actually doing one test you've actually got three instances all just talking amongst themselves. It's one of those things where all of that is a simple matter of programming. I do not solve the problem; you'll have to solve it. So really this is just an advertisement for the possibilities, that you now, as a regular programmer on the street, can have access to a really super powerful grid. But it'll be up to you as far as figuring out what the best design strategy is for isolating your tests properly. AUDIENCE: Then the question is probably this. It seems to be pretty tempting to solve the problem that what I am doing is done not in the most efficient way by throwing more resources at it instead of solving the actual problem. JASON HUGGINS: Yeah. Yeah. I'm guilty as charged for throwing more crap at it and seeing what happens. AUDIENCE: There is feedback on your credit card account, of course. AUDIENCE: So, I have a follow up to that actually, is that I think that to parallelize your test cases you're always going to do things in the non most efficient way. So the question of opening and closing the browser? So opening the browser and closing the browser every time takes longer but doing that can isolate your test cases to where they can be individually run across different machines, and you actually end up being faster? So if you look at just the design of the one test case you may say, well why do you do this every time, there's a start up overhead involved? But it allows 150 tests to each run in their own thread and finish in five minutes as opposed to running sequentially and taking two hours on the machine. 
So I think that there is a most efficient trade-off to actually make use of the ability to have multiple software that ends up making the end result much faster. JASON HUGGINS: Right. This is also part of a complete balanced breakfast. And the most efficient thing is to never run a Selenium test. JENNIFER BEVAN: They're efficient in the global sense and in the full system sense. In general what we do see is for the tests that tend to take a very, very short amount of time we get more of the behavior of one browser session, a whole bunch of stuff. Our JS unit tests tend to run that way and then you would shut it down. Whereas if you're doing any actual application level testing you tend to take Julie's approach and you just go ahead and pay the extra up front cost. JASON HUGGINS: A lot of questions. JENNIFER BEVAN: One more. AUDIENCE: I'll just talk. Jason, does Amazon let you control where the servers are physically? Can I cause a server in London to run my tests and a server from Hong Kong to run my test? Do they give you any control of that aspect? JASON HUGGINS: [UNINTELLIGIBLE] ask question so I don't have to repeat it, even though I heard you. AUDIENCE: You didn't hear me? JASON HUGGINS: No, I heard you. OK, I'll repeat it. AUDIENCE: OK. JASON HUGGINS: So can you control for the-- AUDIENCE: Can you control the physical location of the nodes in that parallelization? I'm thinking a lot of financial systems, what you'd like to do is to create load from Tokyo, from Hong Kong, from London, from Paris, from wherever. And I'm just curious if they give you any control over that. JASON HUGGINS: So my quick answer is I don't know. But my second answer is one of the core developers of Selenium RC, Patrick Lightbody. He quit his job at Jive Software, started a company called [? ROriginate, ?] which was effectively hosted Selenium testing. Not too long ago-- six months ago or so-- he got acquired by a company called Gomez, and they do exactly that. 
So you can, say, run a test from Australia, from behind the Great Wall of China, maybe, from Tokyo, from London, and you can write-- I think they can [UNINTELLIGIBLE] way yeah, you effectively can write a Selenium test. So there actually are some interesting trade-offs here. You can do all this kind of stuff on your own, on Amazon's grid, you don't know exactly where it is. They might actually be able to tell-- maybe if we could do a traceroute and find where it came from, I don't know, you could tell it to go somewhere. But if that's the critical thing that you would want, go talk to Patrick Lightbody at Gomez and he'll be able to hook you up, I think. Sorry, yet another plug for another company. An equal opportunity shill. ALAN: All right. Are there any more questions? I think we're done. All right. Well, thank you very much.
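
The rolling restart Jennifer Bevan describes in the Q&A as a workaround for the Selenium RC memory leaks (keep n minus 1 machines taking tests, let one drain the tests it has already started, restart it, then reopen it) could be sketched roughly as follows. This is an illustration under stated assumptions rather than the farm's real code: `Machine`, `RollingRestarter`, and the `step` method are hypothetical names, and `restart` is a placeholder where a real deployment would restart the Selenium RC server.

```python
from collections import deque


class Machine:
    """A farm machine as seen by the dispatcher (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.accepting = True  # may the dispatcher send new tests here?
        self.running = 0       # tests currently in flight on this machine
        self.restarts = 0

    def restart(self):
        # Placeholder: a real farm would restart the Selenium RC server here.
        self.restarts += 1


class RollingRestarter:
    """Keeps n-1 machines accepting tests while one drains and restarts."""

    def __init__(self, machines):
        self.order = deque(machines)
        self.draining = None

    def available(self):
        """Machines the dispatcher may currently send new tests to."""
        return [m for m in self.order if m.accepting]

    def step(self):
        """Advance the cycle by one tick; intended to be called periodically."""
        if self.draining is None:
            # Pick the next machine in the cycle and stop sending it new tests.
            self.draining = self.order[0]
            self.order.rotate(-1)
            self.draining.accepting = False
        elif self.draining.running == 0:
            # Tests that had already started are done: restart and reopen.
            self.draining.restart()
            self.draining.accepting = True
            self.draining = None
        # Otherwise the machine is still finishing its tests; keep waiting.
```

Because only one machine is ever out of the pool, capacity never drops below n minus 1, and no running test is interrupted by a restart.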