Github Presents • Passion Projects - Live - #2 • heather arthur - Machine learning

[GITHUB PRESENTS Passion Projects Heather Arthur] My name is Julie, I work at GitHub and I'm also the creator and organizer of Passion Projects. [cheering, applause] [laughing] I'm really excited. [laughing, applause] I'm really excited to have all of you guys here and I'm incredibly proud to introduce Heather Arthur, who is our speaker for tonight. So, if we can just give her a nice round of applause and welcome her to the stage, that would be great. [cheering, applause] [Heather Arthur] Alright, so just to give you a little background. I'm Heather, and harthvader on Twitter, harthur on GitHub. And I am a developer at Mozilla. I work on Firefox, so that means I write open source code, all day, every day, which is really awesome. In particular what I work on is the Firefox developer tools. So, these are the tools that are built into the browser that help web developers debug and create their webpages. So, this is just a screenshot of the stuff that I work on but I'm not gonna talk about this today. I'm gonna talk about something else that I'm really interested in, which is machine learning. More particularly, bringing machine learning to JavaScript. So, let's talk about machine learning for a second. So I think machine learning as a phrase sounds really intense. It kinda sounds like a robot apocalypse to be honest. I know, I remember telling one of my friends that I was doing machine learning stuff and they were, they sounded terrified, when I mentioned, and they were, like, "Oh my gosh you're making Cylons or something? So, I think when a lot of people hear machine learning, they think of robots and math formulas and it's actually a little bit like that but not, it's not all like that. You're not always making robots. And it's also, can be really useful even just for everyday applications. 
So, I like to think of machine learning as a collection of algorithms and these algorithms are used to solve problems that people have been really good solving, but programs have been pretty crap at automating in the past. So, examples of this are spam filtering. So, your email deciding whether or not something's spam so you don't have to see it or not. Face detection, recommendations like Netflix and Amazon do. Character recognition, so digitizing books. So there's some of the applications. And if you sat down to write a program to solve any of these, let's say, spam filtering. If you sat down to try to solve that without machine learning, what you might do is, you might look at some emails. You might be like, "Alright, well a lot of these spam emails are in all caps, and I don't see any legit emails in all caps." So, you might make a rule that's, if the email is in all upper case, then it's spam. And then you might look at some more emails and you might be like, "Alright, well, a lot of these spam emails are also mentioning free ipads. And that's kinda sketchy and my other emails don't mention that." So you might make another rule, that says "or if there are the words free iPad in it" then, it's also spam. So what you're gonna end up with is a bunch of painstakingly hand-picked "if" statements, basically. And even then, you can't look at all the emails in the universe. You're gonna miss a lot of things just doing this manually. And what machine learning algorithms do is they find these "if" statements for you and they do this better than you can and with less error. So essentially, machine learning algorithms are, like, better programmers than you. Oh, just for these particular applications. So, to sum it up, on the one hand we have traditional programming, where you just have, you choose the conditionals and all the parameters. And then on the other hand we have machine learning algorithms, where the rules are learned from the data. 
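The hand-written approach described above can be sketched in a few lines of JavaScript. This is only an illustration of the two rules mentioned in the talk (all caps, "free iPad"); the function name `isSpamByHand` is made up here, and a real filter would need far more rules than this.

```javascript
// Hand-written spam rules in the style described in the talk.
// Both rules are illustrative only; this is not a real spam filter.
function isSpamByHand(emailText) {
  // Keep only the letters so punctuation and digits do not affect rule 1.
  const letters = emailText.replace(/[^a-zA-Z]/g, "");
  // Rule 1: the whole message is written in upper case.
  const allCaps = letters.length > 0 && letters === letters.toUpperCase();
  // Rule 2: the message mentions a free iPad (case-insensitive).
  const mentionsFreeIpad = /free ipad/i.test(emailText);
  return allCaps || mentionsFreeIpad;
}

console.log(isSpamByHand("CLAIM YOUR PRIZE NOW"));  // true
console.log(isSpamByHand("Get a free iPad today")); // true
console.log(isSpamByHand("Lunch on Thursday?"));    // false
```

Every new kind of spam needs another hand-picked condition here, which is the maintenance burden that a learned classifier is meant to remove.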
And so, this is actually really powerful. And there are some applications, in fact, where a machine learning solution is the only really manageable solution to that problem. So yeah, basically, machine learning is super important and super useful. So now let's talk about JavaScript for a second. JavaScript is a ubiquitous programming language. This is a screenshot of the top languages on GitHub. I believe this is the percentage of repositories where the main language is JavaScript. And some repositories are multiple languages, but I think it's safe to say JavaScript is squarely in first place. And not only that, but JavaScript is essentially the only language that runs in the browser, and it also runs on the server with Node.js. So yeah, it's a huge language. So it makes sense that we see if we can put the two together, machine learning and JavaScript. [Let's solve a problem] So let's do exactly that. So what we're gonna do is we're gonna solve a problem with machine learning and JavaScript. So this particular problem we're gonna solve is, you get bombarded with a lot of content today on the internet, and sometimes you're wondering if you should bother checking something out. I have this problem a lot. So, you might ask yourself, "Should I bother clicking on this link? Should I bother checking this thing out?" So the answer to that question lies in another question, which is, "Does it have any cat pictures?" [laughter] Of course. So this problem is called cat detection. And actually, this is not an original problem at all. If you search for this, there are research papers and PhD theses on this going back years. So it's clearly an interesting problem. So the exact problem we're gonna solve is, given an image, does it contain a cat? We're gonna break it down a little bit more, into something more manageable. So just given a section of an image, we're gonna see, is that section a cat head or not?
So what we would need to do this, we need something that will take an image and will be, "Yes, this is a cat," or "No, it's not a cat." So what we need is a classifier. So a classifier does exactly that. It takes a piece of data and it tells you which class it's in. For our case, we have two classes: cat and not cat. And classifying algorithms are usually trained on data where you actually already know the class. And some examples of classifiers, there's Bayesian classifiers, for spam. K-nearest neighbors, neural networks, support vector machines. These are all different classification algorithms used for different things. We're gonna use the support vector machine. So, to talk about the support vector machine algorithm. So, it's also another phrase that sounds pretty intense but we'll get to at least figure out the vector part in a second. So these support vector machines, just like I mentioned before, are trained on data where we already know the answer. So it's trained on labeled data. And what I mean by data is, basically, a bunch of vectors of numbers, arrays of numbers. And then, what I mean by labels is, we're gonna give it a one, or a negative one. So in our case, we want to label a vector with a one, if it's a cat, and negative one, if it's not a cat. And also, in our case, we have our labels: we know if it's one, it's a cat, and if it's negative one, it's not. But what is our data gonna look like? So we know it has to be an array of numbers. So we have to figure out how to take an image and get an array of numbers that represents it. [What's the input? Pixels] So, what's our input gonna be? So one thing we can do, is just straight up take the pixels from the image. So we could just take the RGB values and flatten them out into an array, or, we could just take the grayscale intensities of each pixel. In both those cases we're gonna end up with like a ton of information, not all of it's relevant, there's gonna be a lot of noise.
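As a rough illustration of the "just take the pixels" idea, the sketch below turns RGBA pixel data (the layout used by a canvas `ImageData.data` array) into a flat array of grayscale intensities. The function name `toGrayscaleVector` and the plain channel average are assumptions made for this example, not something shown in the talk.

```javascript
// Convert RGBA pixel data (4 bytes per pixel, as in a canvas ImageData.data
// array) into a flat array of grayscale intensities in the range 0..1.
function toGrayscaleVector(rgba) {
  const gray = new Float64Array(rgba.length / 4);
  for (let i = 0; i < gray.length; i++) {
    const r = rgba[4 * i];
    const g = rgba[4 * i + 1];
    const b = rgba[4 * i + 2];
    // Plain average of the three channels (alpha is ignored).
    // Weighted luma formulas are another common choice.
    gray[i] = (r + g + b) / (3 * 255);
  }
  return gray;
}

// Two pixels: pure white, then pure black.
const pixelsExample = new Uint8ClampedArray([255, 255, 255, 255, 0, 0, 0, 255]);
console.log(toGrayscaleVector(pixelsExample)); // values 1 and 0
```

Even a small 48 by 48 patch produces 2,304 numbers this way, which is the "ton of information" problem that motivates a more compact descriptor.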
So we wanna help out the classifier a bit. We wanna give it something it can chew on a bit more easily. So here's another idea, which is we could just look at the edges in the image. So this is the edge of an image. Edges in images are places where the image changes from dark to light or vice versa. And here we've taken the gradient of each of the images above so each of these images on the bottom are the gradient of the image above it. So the gradient is basically, getting the edges from the image. And you see, after we've done this, we've taken out a lot of information, like the colors and whether it's dark or light and we've retained, basically, the only thing we've retained is the shape of the cat head. So, that's really awesome. That's exactly what we want and we don't have a lot of extra information that we don't need. So it's great. So we're gonna do something similar but a little bit more sophisticated. So we're gonna do, is get the histogram of oriented gradients of the image. Another intense sounding word. So it's also, I call it a HOG descriptor. I've never heard it said out loud, I've only read it on the internet and in papers, so I hope that's how you really say it. [laughter] If you're a real machine learning person, but I call it HOG descriptor and this will capture the edges that we were looking for before but also capture the direction of the edges. So, we'll capture the angle of the cat ear. So, and it does it, even better. And a HOG descriptor is an array of numbers. So this is perfect, this is exactly what we need to feed into our classifier. So just giving you an idea. The HOG descriptor is gonna be an array of about a thousand numbers. Alright, so now we have this, we know what data we're gonna give the classifier. So now we actually have to collect it. So this is an important step, getting a lot of data to train our classifier with. 
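To make "edges plus their direction" concrete, here is a minimal sketch of a histogram of gradient orientations for a single grayscale patch. It is a simplification and an assumption of this example: a full HOG descriptor also splits the image into cells and normalizes over blocks, which is omitted here, and the name `orientationHistogram` and the choice of 9 bins are illustrative.

```javascript
// Histogram of gradient orientations for one grayscale patch.
// `gray` is a row-major array of intensities, `width` x `height` in size.
function orientationHistogram(gray, width, height, bins = 9) {
  const hist = new Array(bins).fill(0);
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      // Central differences approximate the horizontal and vertical gradient.
      const gx = gray[y * width + (x + 1)] - gray[y * width + (x - 1)];
      const gy = gray[(y + 1) * width + x] - gray[(y - 1) * width + x];
      const magnitude = Math.hypot(gx, gy);
      if (magnitude === 0) continue; // flat region, no edge here
      // Unsigned orientation in [0, PI): opposite directions share a bin.
      let angle = Math.atan2(gy, gx);
      if (angle < 0) angle += Math.PI;
      if (angle >= Math.PI) angle -= Math.PI;
      const bin = Math.min(bins - 1, Math.floor((angle / Math.PI) * bins));
      // Stronger edges contribute more to their orientation bin.
      hist[bin] += magnitude;
    }
  }
  return hist;
}

// A 4x4 patch with a single vertical edge: dark on the left, light on the right.
const edgePatch = [
  0, 0, 1, 1,
  0, 0, 1, 1,
  0, 0, 1, 1,
  0, 0, 1, 1,
];
console.log(orientationHistogram(edgePatch, 4, 4)); // all of the weight lands in bin 0
```

Concatenating such histograms for many cells of the image is what yields a descriptor on the order of a thousand numbers.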
So what we need to do is we need to get a bunch of cat pictures and a bunch of non-cat pictures and we want a lot of both of them, so the classifier knows about both classes. So for the negatives, those are the images without cats. For those, we're just gonna download a bunch of images from Flickr and crop out random sections of them. And then for the positives, we need thousands of cat pictures and we also just need the cat head too, we need to crop around the cat head. So actually, there happens to be this amazing data set of cat pictures. Each one is annotated with the location of the ears and the eyes and the nose. So this is very convenient. I can't believe this exists but it does. [laughing] So that's very nice for us. So all we have to do now is take each one of these cat pictures, rotate it so that the eyes are level and then we crop the picture so that we're just framing the cat's head, like that. And then also we want each of these pictures to be the same size. So we can feed it to our classifier. Ok, so that's awesome. So now we have, you know, thousands of cats and non-cats. And now, we can start training with that stuff. So, training. This is where JavaScript comes in. So, I'm just gonna go through this a bit. So first of all, there are a couple of packages that we're using from the npm package manager. There's a HOG descriptor package, to do the HOG descriptor. And the SVM package, which will give us our support vector machine. So we're creating a support vector machine and then we go through each of these pictures that we have, and then we first extract the HOG descriptor from it. And then we give it a label. One, if it's a cat, and negative one, if it's not. And then finally once we've gotten all these inputs and their respective labels, we can call svm.train with these inputs and labels. Alright, so that's it.
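The shape of that training step (collect one input array per picture, label it 1 or -1, then train) can be sketched without the actual packages. To keep the example self-contained, the classifier below is a tiny perceptron standing in for the support vector machine; it is NOT an SVM and not the package used in the talk, and the class name `TinyLinearClassifier` and the two-number "descriptors" are hypothetical.

```javascript
// Stand-in linear classifier with a train/predict workflow like the one in
// the talk. This is a perceptron, NOT a support vector machine.
class TinyLinearClassifier {
  constructor() {
    this.weights = [];
    this.bias = 0;
  }

  train(inputs, labels, epochs = 100) {
    this.weights = new Array(inputs[0].length).fill(0);
    this.bias = 0;
    for (let epoch = 0; epoch < epochs; epoch++) {
      let mistakes = 0;
      inputs.forEach((x, i) => {
        if (this.predict(x) !== labels[i]) {
          // Nudge the weights toward the correct side for this example.
          x.forEach((value, j) => { this.weights[j] += labels[i] * value; });
          this.bias += labels[i];
          mistakes++;
        }
      });
      if (mistakes === 0) break; // every training example is classified correctly
    }
  }

  predict(x) {
    const score = x.reduce((sum, value, j) => sum + value * this.weights[j], this.bias);
    return score > 0 ? 1 : -1;
  }
}

// Hypothetical two-number "descriptors"; real HOG descriptors are much longer.
const trainingExamples = [
  { descriptor: [0.9, 0.8], isCat: true },
  { descriptor: [0.8, 0.9], isCat: true },
  { descriptor: [0.1, 0.2], isCat: false },
  { descriptor: [0.2, 0.1], isCat: false },
];
const trainingInputs = trainingExamples.map((e) => e.descriptor);
const trainingLabels = trainingExamples.map((e) => (e.isCat ? 1 : -1));

const catClassifier = new TinyLinearClassifier();
catClassifier.train(trainingInputs, trainingLabels);
console.log(catClassifier.predict([0.85, 0.85])); // 1 (cat)
console.log(catClassifier.predict([0.15, 0.15])); // -1 (not cat)
```

With the real packages, the descriptor would come from the HOG step and the classifier from the SVM package; only the surrounding loop and labeling would look like this sketch.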
So we've trained our support vector machine and just to give you a better idea of what's happening when you do that, what training does, is it creates this state. So this is a JSON object, a JSON object that is the trained state of our support vector machine. I hope you can see. Yeah. So, you know how I said earlier that machine learning algorithms find "if" statements for you? So that's actually not what's happening. In reality, this algorithm is finding floating point values for you. So, the "if" statements are fixed and the floating point values change based on which data you trained it with. So, the values of these numbers here will be combined with your new data that you haven't seen before and will help the support vector machine determine what class it's in. So ok, so we've got our support vector machine trained. So now what we have to do is go back and start solving our problem, which is, the first problem we're trying to solve is just given a section of image, is it a cat head? So all we have to do here is take this image that we're gonna try to classify, we extract the HOG descriptor from it, just like we did with the training image. Then, we just call svm.predict with that descriptor. And that result will be one or a negative one and that's not a label, that's actually just a prediction of what the support vector machine thinks the image is. So then, if it's one, we'll say, "True, it is a cat." And if it's not, "False, it's not a cat." Alright, so great, we've got this function that will take a section and tell us if it's a cat head or not. So now, we can go one level up and finally answer our original question, which is, given an image, does it contain a cat? So what we do, is we take this function that will classify individual sections and we run it on all the sections in the image. So we're testing windows at all different locations and scales throughout the image.
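Testing "windows at all different locations and scales" can be sketched as a pair of nested loops. The window sizes, the doubling between scales, and the half-window stride below are illustrative assumptions, and `slidingWindows` and `detectCats` are names invented for this example; the per-window classifier is passed in by the caller.

```javascript
// Enumerate square windows over an image at several scales.
// minSize, scaleStep, and strideFraction are illustrative values.
function slidingWindows(imageWidth, imageHeight, minSize = 48, scaleStep = 2, strideFraction = 0.5) {
  const windows = [];
  const maxSize = Math.min(imageWidth, imageHeight);
  let size = minSize;
  while (size <= maxSize) {
    const stride = Math.max(1, Math.floor(size * strideFraction));
    for (let y = 0; y + size <= imageHeight; y += stride) {
      for (let x = 0; x + size <= imageWidth; x += stride) {
        windows.push({ x, y, width: size, height: size });
      }
    }
    // Always grow by at least one pixel so the loop terminates.
    size = Math.max(size + 1, Math.floor(size * scaleStep));
  }
  return windows;
}

// Run a per-window classifier over every window and keep the hits.
function detectCats(imageWidth, imageHeight, isCatHead) {
  return slidingWindows(imageWidth, imageHeight).filter(isCatHead);
}

// A 96x96 image gives nine 48-pixel windows plus one 96-pixel window.
console.log(slidingWindows(96, 96).length); // 10
```

In the real pipeline, `isCatHead` would crop the window, extract its descriptor, and ask the trained classifier for a prediction.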
And then after that, around cat heads we'll have a bunch of overlapping detections, so we're gonna combine all those and also, in the process of doing that, we're gonna weed out any detections that don't have enough overlapping detections. They're probably false positives. So, not really cats. So yeah, and then that's it. That will answer our original question. So, let's go to the demo. Ok, so we have Kittydar. I actually did this about a year ago, but this will run this algorithm that we just made. So we're gonna try it out. We're gonna give it a cat picture. Ok, and you see what it did, was it did that thing where it detected at different windows and then it combined those overlaps. So bam! It got, there's a cat. [applause] We're gonna try another one. Yay! Ok, we're gonna try another one. You'll see the limitations of it. So, you can see here that it got two cats, but there are four cats there. It didn't get the two on the sides. It didn't get the one on the left. Yeah, it's left for you. Ok, it didn't get the one on the left because we didn't train the classifier to detect cats that are facing the other way. We just trained it on cats that are facing the camera directly. And then, the one on the right has an abnormal shape. [laughter] And it's also a little tilted, which will make it harder as well. But still, it works pretty well. So that's pretty cool. And I have to mention one other thing that I did. This is running in a webpage but we have to run it in a Web Worker, so in a separate thread. This stuff we're doing is serious number crunching. It's basically the equivalent of doing a while-true loop for a few seconds on your page, so you really wanna make sure you're not running that on the main thread because it will block everything and it will be really bad. So that's one more thing that I did as well. Alright, so what I think is really awesome about this, is this is running in the browser.
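The "combine the overlaps and weed out the lonely ones" step might look something like the sketch below. The greedy grouping, the 50% overlap threshold, and the minimum of two detections per group are assumptions chosen for illustration, and `overlaps` and `combineDetections` are hypothetical names, not Kittydar's actual code.

```javascript
// Do two rectangles overlap by at least `minOverlap` of the smaller one's area?
function overlaps(a, b, minOverlap = 0.5) {
  const w = Math.min(a.x + a.width, b.x + b.width) - Math.max(a.x, b.x);
  const h = Math.min(a.y + a.height, b.y + b.height) - Math.max(a.y, b.y);
  if (w <= 0 || h <= 0) return false; // no intersection at all
  const smaller = Math.min(a.width * a.height, b.width * b.height);
  return (w * h) / smaller >= minOverlap;
}

// Greedily group overlapping detections, drop groups with too few members
// (likely false positives), and return one averaged rectangle per group.
function combineDetections(rects, minNeighbors = 2) {
  const groups = [];
  for (const rect of rects) {
    const group = groups.find((g) => g.some((member) => overlaps(member, rect)));
    if (group) group.push(rect);
    else groups.push([rect]);
  }
  return groups
    .filter((g) => g.length >= minNeighbors)
    .map((g) => {
      const mean = (key) => g.reduce((sum, r) => sum + r[key], 0) / g.length;
      return { x: mean("x"), y: mean("y"), width: mean("width"), height: mean("height") };
    });
}

const rawDetections = [
  { x: 0, y: 0, width: 40, height: 40 },
  { x: 10, y: 10, width: 40, height: 40 },     // overlaps the first one
  { x: 200, y: 200, width: 40, height: 40 },   // isolated, so it is dropped
];
console.log(combineDetections(rawDetections));
// one rectangle: { x: 5, y: 5, width: 40, height: 40 }
```

Requiring several agreeing windows trades a few missed cats for fewer false positives, which matches the behavior seen in the demo.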
I just think that's the coolest thing ever. It's all running on the client side and actually, well, we see the URL harthur.github.com, so you know it's static, and it's really cool. So finally, alright, so, we just saw that running in JavaScript. So a few years ago, or several years ago, this just would not have been possible in JavaScript and there are a few things that have made it possible. So, I guess the big one is just speed. JavaScript has gotten a lot faster the past few years and this kind of number crunching just ran way too slowly in the past. So, that's one thing. Another big thing is Node.js, which came out a few years ago. You wouldn't have been able to do that collection and training stuff we just did without Node.js. We really need a command line and access to the file system to do those things. But I also have to note that you don't have to do each of these steps in the same language. So, you could do your collection or training in another language and then export your trained state as JSON or something and then use it from JavaScript, too. So, that's another option. And actually, for the collection, I think I used a [INDISTINCT] script to get Flickr images, 'cause the API was easier to use there. So yeah, you could still mix it up but Node.js makes it possible to do these steps in JavaScript if you want to. Another thing that's helped out a bit is typed arrays. So in JavaScript, usually, when you have an array, you can put anything that you want into it. You can mix types. You can have indices where you don't put anything in it, and then some have strings, and some you put numbers in, and some random objects. So, it's very lax. And with typed arrays, you're telling JavaScript this array will only contain 64-bit floating point numbers, and so what JavaScript can do is, it can figure out the offset into memory a lot faster using that information because it knows the type of everything in the array. So, it makes it a lot faster.
In fact, it makes it about two times faster for training. So instead of it taking ten hours to train our support vector machine, it takes five hours, which is a big deal if you are trying to iterate and figure stuff out. So, that's one thing. Another thing is that workers really make it possible to run this stuff in the browser. Otherwise, there is no way you'd want to run it in the browser. It would just block everything. So those are some big things that helped out a lot. And as you can see, it's clearly possible to do it now. But there's still stuff that JavaScript is lacking. JavaScript is definitely not your typical language you use to do machine learning. Languages like Python have a ton of really good building block libraries like Mat Lab and stuff like that, which JavaScript is lacking. Matrix math libraries, statistics, image processing, that kind of stuff is just not quite there yet in JavaScript. So that's one thing that still really needs to happen. But also, it'd be nice if it were even faster. But yeah, let's see here. But things are still happening though. There's some libraries out there. There's neural networks, there's support vector machines, clustering, and the natural language processing library, which I really like, called Natural Node, that also has a Bayesian classifier, as well as doing natural language processing, which is kinda similar to machine learning, in some cases. So I wrote the neural network and the clustering a few years ago. Also Andrej Karpathy has been contributing a lot of libraries, too, including the support vector machine implementation, which we just used, so that's cool. And there's some applications in JavaScript, too, using machine learning. My favorite one is this real-time face detection. So this is face detection that runs really quickly. So, that means that you can do it on every frame of a video, like an HTML5 video. So that is super cool.
There's also a hand and eye detection library and the first machine learning I heard of in JavaScript was this OCR, CAPTCHA solving algorithm, which is pretty cool. And that uses neural networks to crack CAPTCHAs. So it'll be really interesting to see where machine learning will go in JavaScript. It's hard to tell right now, but I hope that people will try it out, for sure. And finally, I want to talk about if you guys are interested in machine learning. So, machine learning doesn't always look like this. You're not always doing classification. Sometimes, you're predicting a continuous value, like temperature, stock price. Sometimes, you don't have to do this training stuff at all. Sometimes, you're not using labeled data. So, there's a huge variety of things that machine learning does. So just be aware of that, too. This is just kinda like a small example. And also, there is a free, online, Stanford machine learning class. And that's actually starting on April 22nd, so that's in a few weeks. And that's hosted on Coursera. I actually took it a year ago and it was really good and it really helped fill in a lot of holes for me. So, definitely check it out. And finally, what I think I really learn the most from is just starting with a problem and going from there. So, when I was working on Kittydar, I just kind of, well, first thing I just searched for "cat detection" and then from there, found research papers, I read them, when I didn't understand something, I looked it up on Wikipedia and then they linked to other papers, and I read those, and I actually ended up learning a ton of stuff just from doing that. So much stuff, I never thought I'd know anything about. So there is, yeah, there's a lot of value in that. I really encourage you to, if you're facing a problem you think you can use machine learning on, just go for it and you'll end up learning a lot.
I think that our brains are kind of like these machine learning algorithms, where you can just kind of throw fancy words and research papers at it and eventually, it will just figure things out. [giggle, laughter] That's what I found, at least. So, yeah anyways. So try it out. It's my best suggestion. [image source: Sandra-honestly on deviantart] Ok, thanks. [applause] [GITHUB PRESENTS Passion Projects Heather Arthur] Hi guys, welcome back. I hope you're enjoying all of the yummy snacks that my wonderful co-workers provided for us. Pretty awesome. That was not a plug, I swear. So Heather just gave an amazing talk and it was probably a lot different from Rachel's talk last week, which was more sort of like high-level and organizational, whereas Heather's was super technical. And so I'm just kinda gonna jump in and ask you the question. How did you get into programming? [Heather] I got into programming in high school. There was this three week seminar period class in high school where we could take one of five classes or something, and my friend Alyssa and I decided to take this game programming class. And it was just kind of random. We were just like, "Oh games, that sounds kinda cool." And so we signed up for it and just the first day of class, you know, I saw what the code looked like and I saw what it involved finally for the first time, I had no idea what it entailed -- [Julie] - It was exactly like the movie Hackers, right? [laughter] Yeah, stuff streaming down your screen. No, but I just saw the code and I was just like, "Yeah, this is definitely what I wanna do. This is awesome." And then ever since then, I just, you know. I basically started in school, and then I kept on taking classes, and then majored in it in college, and then, and so on from there. [Julie] That's really awesome. You have a really awesome story in that it's, also, it's completely different from Rachel's, who learned to program kind of after college. 
You went to school for a CS degree at Carnegie Mellon and you continued straight out of college but I think it's really interesting kind of the way that you found Mozilla. Did you kinda wanna talk about that? [Heather] Yeah, so I work at Mozilla now and I've worked there ever since college. And actually I found it, I was at my career fair at my school and I'd known that Mozilla, I mean, I'd used Firefox two years, I knew about Mozilla but I didn't know that they actually hired people. So when I saw their booth [laughing] at my college's career fair, I was like, "Oh my gosh." They, you know, I went and talked to them. And I, I probably, I dunno, I think I sounded like a total idiot. But I guess they liked me, alright. And then I became an intern there, and so interned there one summer, and then, when I graduated, I started working there because I loved it so much. [Julie] And you've been there ever since? [Heather] Yeah, I've been there ever since. [Julie] What's it like to work at an open source company? [Heather] It's awesome. I don't have anything to compare it to. [laughing] It's really awesome, I don't think I could go back to proprietary unless I had to. It's really cool to be able to talk about what you're working on and not have to think, "Oh, maybe I can't talk about that, or something." And I can release, yeah everybody can see my code. The best part is getting contributors, too. You'll be working really hard with your team and there's these bugs that are falling by the way side, nobody's fixing them and a contributor will come along and fix this really critical paper cut bug and, it's just like the most awesome thing. [Julie] It's pretty awesome. 
So GitHub obviously, has a lot of open source projects and we asked, I got the privilege of talking to Heather before her talks, so I cheated a little bit and I got to hear her story a little bit and how she got into programming, obviously, and also open source but we talked a little bit about sort of the role GitHub plays in open source and what it was like to sort of transition into using a tool like GitHub. [Heather] Yeah, it was awesome. I actually, well, I have been doing open source a few years now. I'd been doing it before GitHub was around. And I was hosting my code on Google code and stuff like that. I definitely wasn't getting any contributions. I don't know if people saw it even. But definitely after GitHub, when I started putting my stuff on GitHub, I got so many pull requests and issues and people saw my code. It's just great. [Julie] The best Christmas presents, basically. I think pull requests make the best Christmas presents. [Heather] Yeah, definitely. [laughing] [Julie] Yeah, so yeah, GitHub has been awesome. And also Git, too. It's helped. Absolutely. Very essential. [Julie] If you, if anyone has, I kinda planned to make this more into a discussion, so if anyone has any questions, throw your hands up. Don't be shy. There's a question. Hi Steven. [Steven] So, did you study mathematics and statistics as part of your college track? Or is that something that you started learning afterwards in order to use that to solve problems? [Heather] I did, I was a Computer Science major and I was a Math minor but the Math I was doing was this crazy stuff, Ring theory and stuff so it wasn't really very practical. For instance, we had the option to take a matrix algebra class, or linear algebra. Linear was really theoretical. And I really could've used this matrix stuff, for the stuff I was doing here on the talk. But I didn't take that class. I took the really theoretical linear algebra class that was really abstract. 
So, I really kinda skipped over that practical Math and Statistics stuff actually. So, I really learned that through this. And another thing that I learned, I don't know if you guys found this yourselves but when I was in school taking Trigonometry classes, Trigonometry was the subject where I was like, "Ok, this is like, I'm never gonna use this." You know, the sine angles and stuff. But for doing the cat detection, I had to figure out how to rotate the image the right way and I was like, "Wow, I'm actually using this stuff." And I had to really re-learn that. Yeah, basically, a lot of it I just learned from figuring out problems. Yeah. Even though I did take classes on related stuff. [Julie] So you studied CS in school and I'm always really curious to hear how you chose a language, or how you chose what you would work on next, 'cause you, obviously, you code a lot in Javascript and you also write [CUE2] and HTML and CSS. [Heather] Yeah, so I guess my job has been JavaScript programming for so long. And I really, I mean, I have to write JavaScript for my job. Firefox is written and the parts that I work on are written in JavaScript. And sometimes actually platform stuff, which is C++. So right now, it's really about what I have to use. I can't just call out to Ruby from Firefox or something, or people would be very mad at me. [laughing] But so, that's part of it. But also, I do have a lot of side projects and they're mainly JavaScripts. I really like doing them JavaScripts so that they're in the browser 'cause I love it when things are in the browser. But also, I really, I love Node.JS too a lot. It's been, made it really easy to make command line scripts and one-off things, too. [Julie] So, Kittydar wasn't your first open source project obviously, right? [Heather] No, definitely not. That was, Kittydar was just a demo to explain machine learning. But I think my first, I think the first open source I did was working as an intern on Mozilla stuff. 
But I guess, the first one I did myself was a Firefox add-on that helped web developers pick colors on their webpage and save colors and tag them and stuff like that. [Julie] That was your first experience with machine learning, basically? [Heather] Yeah, actually that was my first experience with machine learning, too. I had this problem where I would be displaying these colors that I collected from this webpage, and I also displayed the color value on top of this color. So if it was a light color, I'd wanna display black text over it and if it was a dark color, I'd wanna display light, white text over it. And there are colors, like bright green, where it's not really immediately clear, where you're like, "Which color do you want, black or white, to display over it?" So I looked up some formulas for doing that and I found them to be pretty spotty. So, I just kind of, in the back of my head I had heard the words "neural networks" before and I thought it sounded super cool. [laughing] 'Cause they do sound really cool. And I figured out actually, I might be able to use them for this. And so I wrote neural networks in JavaScript and it was a Firefox add-on so it had to be JavaScript. And it actually worked pretty well at determining which color to use, black or white, over a random color. So that was my first foray. [Julie] And you still work on open source projects outside of what you work on - at Mozilla, or does it matter-- - Yeah, yeah occasionally whenever I have time. At this point, I'm really just maintaining open source projects, which is a lot of work in itself. I'm trying not to create too many new ones 'cause then I have to maintain them. [laughing] But yeah, mainly work, which is also open source. [Julie] Do we have any other questions? Yeah, the excited guy. [guy] Hi, you mentioned Python, and all those amazing libraries in Python, and last week we had Mozilla launching asm.js, which would enable Python [inaudible] being used on the browser.
Do you think that, what do you think will be [inaudible] this technology [inaudible] ML in general? [Heather] Yeah, I think that I don't know yet. Yeah, I really have no idea what effect that stuff will have and how many people will end up using asm.js and other languages compiling to JavaScript. - I can't-- - [Julie] Predict the future. Right now. [Heather, giggling] Yeah, I can't predict that. I'm not sure. It's not something I would do right now. I wouldn't be like, "Alright, I'm gonna use this Python and convert it to LLVM and have it run through Emscripten, or whatever." - Yeah, I don't know. - [Julie] The answer. We don't know. [Heather, laughing] But it will be really interesting to see and I hope people try it out for sure. Maybe I'll try it out at some point. [Julie] Any other questions? Yes, Tea. Oh, sorry. [laughing, comments from audience] Now fight. [more laughing] [woman] Do you know if citizen science projects like Galaxy Zoo or Planet Hunters are using machine learning behind the scenes because they're collecting so much data that could be used for people basically tagging different photos and things like that - on Facebook? - [Heather] Ok, so... [Julie] That's a hard question to repeat. She's asking you about specific projects and whether they're using machine learning. [Heather] And so was it Galaxy something? [woman] Like citizen science projects like Galaxy Zoo and space-- Planet Hunters and Galaxy Zoo and there's a few others out there, like protein folding ones, and all kinds like that where they're basically collecting a whole bunch of data from people who are entering information about images they're given. And you think on the back end they've been collecting that information and developing machine learning to do it [inaudible]? [Julie] Yeah, I've actually heard of at least one app that's actually a game and it's used to detect protein bindings, or something. And so they're using, yeah, so there's a lot of really cool things.
I mean, that one I think is a native iOS app or something like that, so it's probably running on, what, C or something. I dunno. But I don't know of any, do you know of any others? [Heather] Yeah, I'm not sure. But yeah, most likely they are. I think anytime that you have a lot of data and you're using it to predict stuff, it's probably using machine learning. [Julie] Yeah, definitely. When we talked, we had some really simple examples of machine learning in applications that we all use, so do you wanna talk about those a little bit? [Heather] Yeah, I mean, I wanna talk about, like, spam filtering. Everybody benefits from that. And also, I think a lot of recommendation [inaudible] use machine learning to recommend things for you based on what people like you also use. [Julie] So when Amazon tells you to buy that horse mask, they're obviously using some type of machine learning. [Heather, laughing] Yeah, and any time there's computer vision, that uses a lot of machine learning. So, face detection, self-driving cars, they're probably using it, too. [Julie] Laura, you have a question? She's hiding. I can call people out by name. [Laura] What advice do you give people who are getting into programming? [Julie] The question was, what advice do you have for people who are getting into programming now? [Heather] I've been programming for so long that I wish I could empathize more. I think, for me, I came from an academic background, and so you could always do that, which is just taking online classes, free online classes, like the machine learning one I was talking about. Even that one, you don't have to know programming to take it. The professor, Andrew Ng, is really good at explaining things and you don't need to know programming but you might end up learning some in the meantime. And also I've heard of Codecademy, which does this online JavaScript learning. And I've heard nothing but amazing things about that.
People say how easy it is and how easy it is to understand things immediately from that. So that's another thing I've heard about. So yeah, that's kind of coming from the schooling kind of background. But also I think if you can think of any problem that you just wanna solve, kinda what I was talking about with machine learning, once you try to figure something out, you end up learning so much. And then, before you know it, you'll know a lot of programming. Just by trying to solve something. [Julie] Any others? Oh! [man] How would you describe the approach that you take when approaching a machine learning problem versus traditional programming problems? How do you make those logical leaps to find the algorithm you need? [Heather] So I think he's asking, how do you approach machine learning problems differently from regular problems, maybe? So, I guess, with machine learning, well, I certainly end up doing a lot of research. With regular programming, I kind of know what steps I'm gonna take, more or less. I'm like, "Ok, well, I'm gonna make this object and then it'll probably do this thing and it'll communicate this way." [Julie] There are a lot more conventions and best practices around regular programming problems, whereas machine learning is still very open, depending on the problem and, I mean, there can be an entirely different path [inaudible]. [Heather] Exactly. I think with machine learning, it's more that you have to do a lot of research. You're like, "How the crap do I do this?" And then you end up just searching for it. And also another big thing is you need to figure out where to get your data from, 'cause there's always data. That's a big difference, too. [woman] Hey, I was wondering how you choose whether to use Web Workers or a more traditional RPC Asynchronous framework, and what are the trade-offs that you're finding? [Heather] She's asking how you decide between Web Workers and RPC Asynchronous.
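For readers unfamiliar with the Web Workers side of this question, the following is a rough sketch of moving a heavy computation off the page's main thread. Everything named here is an assumption for illustration: `sumOfSquares` is a made-up stand-in for an expensive task, and `runOffMainThread` is a hypothetical helper, not something from the talk. It uses a worker only when it detects a browser and otherwise just calls the function directly.

```javascript
// Hypothetical CPU-heavy task, standing in for something like training a model.
function sumOfSquares(n) {
  let total = 0;
  for (let i = 1; i <= n; i++) total += i * i;
  return total;
}

// Run task(input) in a Web Worker when a browser provides one, so the page
// stays responsive. Outside a browser, fall back to a direct call.
// The task must be self-contained, since only its source text reaches the worker.
function runOffMainThread(task, input) {
  const inBrowser = typeof document !== "undefined" && typeof Worker !== "undefined";
  if (!inBrowser) {
    return Promise.resolve(task(input));
  }
  // Build the worker script from the function's source text.
  const source = `onmessage = (e) => postMessage((${task.toString()})(e.data));`;
  const url = URL.createObjectURL(new Blob([source], { type: "application/javascript" }));
  return new Promise((resolve, reject) => {
    const worker = new Worker(url);
    const cleanUp = () => { worker.terminate(); URL.revokeObjectURL(url); };
    worker.onmessage = (e) => { cleanUp(); resolve(e.data); };
    worker.onerror = (err) => { cleanUp(); reject(err); };
    worker.postMessage(input);
  });
}

runOffMainThread(sumOfSquares, 1000).then((result) => console.log(result));
```

The RPC route would swap the worker for a network request to a server that does the same computation and sends the result back. Either way the caller waits on an asynchronous result, which is why the two approaches look similar from the page's point of view.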
[woman] Yeah, because it seems to me at least naively, that they have a lot of the same benefits. That you can do data computational tasks or computational heavy tasks without worrying about it using up your [data space.] [Heather] That's interesting, I literally haven't heard of the RPC Asynchronous stuff. I'm not exactly sure what you're talking about actually. [woman] So you send off an RPC call to a server, maybe I'm just using the wrong terminology. [Heather] Oh I see. [woman] And then you let the server handle the processing and you send it back. [Heather] So yeah, it's basically whether doing it in Web Workers and the browser versus having it done on the server. And calling out to, ok, cool. So there's a big difference there. So one thing, I actually, I like doing my webpages all on the client side if I can and having as little server side as possible. Not that I'm saying that's the best thing to do or anything, but I usually do that so I prefer using Web Workers to do something like this. And also it's just one less network request, which is cool. And a lot less dependency on that. So that's the main difference, I guess. It's just whether you want to do the computation on the client or the server and do an extra network request probably. [Julie] We have two more questions. Anna? Will you go first? [Anna] What really drives you? What are you really passionate about? What makes you get up in the morning and be excited to go do your job? [Heather] She's asking what makes me passionate? [Julie] What drives you and what makes you get up in the morning? What are you excited to work on most? [Heather] I'm really excited, mainly, about I love my job, I'm very invested in making our Firefox developer tools better. I think it's really important that we help web developers out. So, that drives me a lot. I'm like, "We need to get this feature done so web developers can use it." So in that sense, a lot of my motivation is about just getting stuff done. 
Getting particular things out there so people can use them. That's the main motivation. Also, some, I mean, just programming in general is really fun so that's a secondary motivation. It's just, "Oh I get to do fun stuff now." So that's cool. [Julie] So one of the things, or one of the reasons we started Passion Projects originally, was we wanted to hear from women who really loved the companies they worked at and I remember during our interview, I asked, you know, what is your favorite thing about working at Mozilla? And I think you said it was the people. Do you wanna talk about that? [Heather] Yeah, definitely, at least again, I haven't worked at other companies but at least at Mozilla, everybody there really cares. It's an open source company. It's also wholly owned by a non-profit and we have a really clear mission and I think that a lot of people that work at Mozilla are really passionate about that. So, it makes for really awesome co-workers and they're also, you know, they're doing open source but they're really used to helping out a lot of people and so they've been extremely helpful and they're also just really nice. They're not, they're not *** basically, which is awesome. [laughing] So I just love everybody that works at Mozilla. I've never met somebody that works at Mozilla that I didn't like a lot. [Julie] And I think that's another thing about when you're getting into programming. Surround yourself with people who wanna help other people. I know, I was really lucky. I started writing code when I was at Yammer and I had just this incredible network of people around me who really wanted to help me learn really quickly. And I got just pointed in the right direction and I mean, if you're self-motivated and you have those types of people around you. It's the perfect formula for success, business. Other things. - [Heather] Yeah. - [Julie] So people and such, and open source companies I think attract those kinds of people. 
People who wanna build things that make other people's workloads more efficient and make them better at what they do. [Heather] Yeah, totally. Open source companies are the best. [Julie] No offense, everyone else. [laughing] Cool, well, thank you so much for letting us rack your brain and also for the amazing talk on Machine Learning. We record all our talks, so we usually post them in the follow up blog post. So anyone who's watching from home and maybe missed the beginning or whichever, we'll catch you up and thank you so much, Heather, for being a part of Passion Projects. [Heather] Oh, thank you for doing this. [applause] [GITHUB PRESENTS Passion Projects Heather Arthur]