Well guys,
I'm going to record our talk.
Thanks Feijao for inviting me to speak
I hope you like it
I'm going to show some stuff that
I find very interesting
Stuff I would like to have known when I graduated.
To motivate you guys from ITAbits and ITAndroids
I'm going to start talking about
a question that this freshman "Radioativo"
I don't know his name, is he not here?
He was asking me:
If I were to build a robot, what matters the most?
Is it hardware or software?
You know, we see so much software available
Java, Eclipse, C#, some programming language
You can start developing right off
Hardware seems much more complicated to get
because you need some equipment, a prototyper,
things you might not have.
It turns out I gave the question a thought
and I can give a clearer answer now, which is:
Even though it's easier to obtain software, and
that it may look less important,
at the end of the day software is *more* important
That's because if you're going to build
a mechanism, let's say this mechanism has
to meet a tight tolerance or else it won't work.
Have you ever heard about laser gyroscope?
A laser gyro is used to measure rotation, angular velocity
It's based on the fact that the speed of light is absolute
So, in order to build a laser gyro,
as the equipment spins, there's an interference counter
which tells you how many spins happened
The mechanics used involves a fantastic precision
And it turns out the full precision is a bit
beyond the specification of some cutting tools
So, how do they do it?
It's a mechanical system
This mechanical system has an average,
which is what you're looking for,
but also has some variation
If you know the mechanic requirement and
your machine has, say, some mean and
standard deviation, you can make 100
gyros and use the one that falls into
the required tolerance
Statistically, there should be one within it
Should you need a lower tolerance, build
10000 and grab only the best
It will be expensive but it's doable
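The selection argument can be sketched numerically. This is a minimal simulation with made-up numbers for the machining process (mean, standard deviation, tolerance); it estimates how many parts you must build so that, on average, one lands inside the tolerance:

```python
import random

def parts_needed(mean, std, tol, n_sim=100_000, seed=0):
    """Estimate how many parts you must machine so that, on average,
    one falls within +/- tol of the target (the mean)."""
    rng = random.Random(seed)
    hits = sum(abs(rng.gauss(mean, std) - mean) <= tol for _ in range(n_sim))
    p = hits / n_sim          # probability one part is within tolerance
    return 1 / p if p > 0 else float("inf")

# A tolerance of 0.1 standard deviations: most parts miss it,
# so you machine a batch and keep the good one
print(parts_needed(10.0, 1.0, 0.1))
```

The tighter the tolerance relative to the process spread, the more parts you must build, which is exactly the "make 100 gyros and keep the best" strategy.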
Now let's turn to an AI software
If you consider a problem, for example, image identification:
Make me a software that identifies a bicycle in a picture
That is a trivial task for us humans:
I can just stare at the picture
You can have as large a budget as you wish
since we don't know how to do this, you
can't do it even with all computers in the world
That means that having a good heuristic,
a well implemented logic is worth more than
the mechanical component itself, because
knowing the algorithm is the difference between
being able or not being able to perform a task
It's not a matter of budget; we just don't know
how to do some things
Following this line of what we know and
don't know how to do, I'll show you the results
of research that was conducted about
how it is that we learn, how our brain works,
which I found extremely interesting
This is a presentation from Stanford's professor
Andrew Ng, founder of their AI lab,
which I'm going to comment myself
There's the audio here
Whoever wants to copy may get it from the web
If you've already seen it, of course,
you already know this.
So what they did was this:
they got the auditory nerve of some animal,
some frog I guess, from what he said,
and this auditory nerve is connected to the
auditory cortex, which is the part of the brain
trained to process this type of signal from
the auditory nerve so that you can hear
So the signal comes here to this auditory part
and then what they did was to cut this link
and they got the visual nerve, cut the link to
the auditory nerve, and rewired the vision
into this auditory part of the brain
Now guess what happened?
The auditory cortex learns to see
A part of the brain of the animal
which was previously trained to hear,
now with the rewiring of the signal of the eye,
this same part of the brain learns to see
And it's not just see meaning there's light or not,
it's see as in recognize objects and do
everything that can be done with vision
It's not just some ghost
A part of the brain that used to hear now can see
[Andrew Ng speaking]
And so not satisfied with that,
researchers intercepted the touch signal
this part here of the brain, and
did the same experiment:
they took off the sensory signal and
rewired vision here.
Guess what happened?
This region of the brain that used
to process touch signals now learned
to see, which is very interesting
If you do research in AI you know that
to train a robot to do some task
that is the ideal situation:
what is this algorithm that runs here
that can learn to process sight, touch and hearing?
It's the same algorithm
This is one of the great secrets people research
It would make things very simple:
Let's say I want my robot to learn to play soccer
I'll just install the same learning algorithm
and I make it play soccer
Now I want it to learn how to drive
I can simply teach it how to drive
and it will learn how to drive
We're not quite at this stage yet,
but this is motivation for many research areas
Let me speed this up a bit
Now I'll show some state of the art research
in this area
This is not exactly AI but it serves
as basis for us to extract ideas
How is it that our brain works?
After all, we're trying to make a computer
replicate what we do
So here what people are trying to do is this:
our tongue has many blood vessels
So blind people are using sensors in their tongue
and the hope is that they can learn to see
using their tongue
This research here is very interesting and
has some very good results
This is also for blind people
What they do is they put these people
in a room with random obstacles
and then they put this buzzer on top of the guy
So based on the sound that bounces back
from the walls, as he touches the walls and notices
how the environment is, with time,
he learns to process this sound of the buzzer,
kinda like a bat, and know where
there are obstacles in a room
And then the guy can go around in an unknown
environment without bumping into anything
just by processing the sound that bounces back
Then, with time, as they get practice in this,
these visually impaired people learn to
perceive the sound that comes back when
they snap their fingers and locate themselves
in an environment they didn't know
This is a direction sense, kinda like the
sonar one, and here they put this third
eye in a frog and did this experiment where
they wire this third eye into a random
place in the frog's brain and so the
frog learned how to see with this
third eye, which is interesting
How can that be?
We don't really know yet, but
we want to get there
So this is what I intend to talk about here:
How is it that we learn?
How can we do it? How does our brain work?
Can we take what we have and put it
inside a computer?
A 2 year old child can perform visual
recognition tasks much better than
the best computer we have today, just so you know
Here's another thing we don't know how to do:
let's say I play two different movies
in two different screens and show that
to a 6 month old child. Then what you do
is you put the audio corresponding to
one of the movies, so that in one screen
movie A is playing and in the other one
movie B is playing.
Say I put the sound of movie B
The 6 month old child, after a few seconds,
will look at the screen B
He knows that that sound goes with what
is being shown at screen B
Today we can't do that using artificial intelligence
We don't know how to do this that
a six month old child can do
So if we discover how it is that we learn
that will be a great evolution
in terms of how we can make AI work
Now let me talk about some of what we do
know how to do
It's a lot of things, there are many research fields
and there are many choices around and
there's actually a lot you can do and earn money
If you want to found your own company and
make lots of money by solving a problem we
already know how to solve,
you just need some structure
That's especially true when we talk about
processing the Portuguese language
There are algorithms out there but not in Portuguese
at least to the extent I know
Well, let me talk a little bit about
some stuff I did
I'll show you some software, patent, some stuff I did
I also brought a quiz just so we can discuss
what is easy to do and what's hard
I brought some real problems,
research in artificial intelligence we see around,
importance of machine learning and
parallel processing, which I study, that is a method
to have, say, 500 times better performance
in a computer using graphics cards,
and this is a recent development
Well, I graduated from ITA, MEC05,
I stayed some time in CTA, IAE, at the
Defense Systems Division and then I
went to Petrobras. My work there involves
technology development for construction and assembly,
especially for large-scale equipment and jobs,
which is a field that, to me, needs some serious
improvement seeing as some of the techniques
are extremely old and much is done the way it is
simply because it has always been like that
So that's a very interesting area
I think I am really lucky to be where I am
While I was here in ITA I used to
develop using MATLAB and then I decided
to create something akin to MATLAB but
not quite the same
So I developed some algorithms, some stuff
with professors here, with Adade, with
Heinzelmann, who I think is not here anymore
And then sometime later I got this
certificate from INPI, which is a
software patent
I made this software in my senior year
It has some algorithms I intended to test
Much of what's here is available in more
modern applications now but
nevertheless it was an important experience
for me not just because I wanted this patent
but also because it helped me understand
the numeric methods which were implemented
in MATLAB
[pause]
I'm hoping nobody saw the answers there
Let me ask you now, what can we do with AI?
This OK symbol means
"We have a commercial solution",
that is, one which works satisfactorily
This symbol here means "well, there's research,
there's a prototype, but not a product"
And this question mark means
"we just have no idea".
Now let me ask, autonomous vehicle driving?
We were talking about this earlier,
anybody has any idea? [audience]
Yes, Google has got its driving license
but still, right now, we don't have
a commercial solution, a satisfactory one
So this one is in an advanced research state
How about split voices from N microphones?
Let's say I put 5 microphones in this
room, or 10 microphones, and 3 pairs
of people talking, can I differentiate
all these dialogs?
[audience]
Yes, right, that is possible but there's no
commercial solution readily available
There's a good amount of research about it
and it already works, it's doable
For you who want to know more, it's done using
SVD, singular value decomposition.
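A toy sketch of the first step of such a pipeline: whitening the microphone mixtures with the SVD so the channels become uncorrelated. The two "voices" and the room mixing matrix here are made up, and a real separation system needs more than this (e.g. ICA on top of the whitened signals):

```python
import numpy as np

t = np.linspace(0, 1, 2000)
# Two toy "voices" and a made-up 2-microphone mixing matrix
sources = np.vstack([np.sin(2 * np.pi * 5 * t),
                     np.sign(np.sin(2 * np.pi * 13 * t))])
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
mics = A @ sources                    # what the two microphones record

# SVD whitening: the rows of Vt are orthonormal, so the rescaled
# channels come out uncorrelated, undoing the room's mixing geometry
U, s, Vt = np.linalg.svd(mics, full_matrices=False)
whitened = Vt * np.sqrt(t.size)

print(np.round(np.cov(whitened), 2))  # approximately the identity matrix
```

The covariance of the whitened channels is close to the identity, which is what "decorrelated" means; separating which channel is which speaker is the harder remaining step.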
Split two voices in a single microphone?
This works like this: I'm recording
a dialog here of me talking to somebody else
Can I make a software that reads this audio
and splits which parts are me speaking
and which parts are him speaking?
Is there anything available for that?
Is that in an advanced research state?
Or is it that we don't know?
[audience]
Well, let me tell you, it looks
to be the same problem, but in
fact we have no idea
If it's a man talking to a woman,
then we can use the frequency,
because male voice has lower frequency
than female voice.
However, if the two people have about the
same voice tone, we don't know how to do it
There's a lot of research but nothing
very convincing, although it looks simple
Next, speech recognition
Like, get this audio of me speaking
and turn it into text
It's not the contrary, I mean,
not Hawking's equipment where he types
and that computer voice speaks
what he wants to say, that's not it
It's read what I'm saying and turn
that into text
[audience]
That exists in Siri, but this has a detail here,
do we have that in Portuguese? I don't know
I'm pretty sure that doesn't exist in Portuguese
[audience]
But this 15-year-old kid's work wasn't quite
like how Siri works
Identify the language of a given text
Let's say I give you a text file which
is written in some language
What I want to know is which language it is
Google does that, this is a solved problem
For you who want to know more about the topic,
that can be solved using
letter bigrams and trigrams,
I don't know if you've heard of that,
but that is how you solve it.
That is, I get many texts in a given language
and count how many times I saw AB, AC, AD
up to AZ, ZZ
I count these in various languages and then
I get to know which are the most likely
letter sequences in each language
So, for example, if a word contains DER
the chance it's a German word is very high
Correct "sessão" to "seção"
[audience]
Yes, Word can do that, but "sessão"
with double S exists, there's that detail
[audience]
It gets green? Yes, there's that,
Word does it, but systems in English do
it better, it's not that good yet in
systems in Portuguese as it is in
English systems
That is, detect that the meaning in the
text fits better using another word
that has the same sound
This exists, but the best systems
are in English
[audience]
Watson? No, Jeopardy is something else
I'll talk about it later
I even have some slides here about
these question answering systems
It's a little bit different,
the technology used is different
What I use to correct "seção" to "sessão"
is a statistical procedure
For example, let's say I have
"sessão de cinema" and "seção de supermercado"
I know that I saw "seção de supermercado"
in many texts using "seção" with "ç"
and I saw few texts containing "sessão"
with double S before "supermercado"
So it's statistically more likely that,
before supermarket, I'll use "seção"
with "ç" and before cinema I'll use
"sessão" with double S
It's a simpler analysis
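The "which spelling goes with which neighbor" statistic can be sketched directly. The corpus counts below are made up to mirror the example in the talk:

```python
from collections import Counter

# Hypothetical tiny corpus: how often each spelling was seen
# followed by "de <word>"
corpus = (
    "sessão de cinema", "sessão de cinema", "sessão de cinema",
    "seção de supermercado", "seção de supermercado",
)
pair_counts = Counter()
for phrase in corpus:
    w = phrase.split()
    pair_counts[(w[0], w[2])] += 1    # (spelling, context word)

def pick_spelling(candidates, context_word):
    """Choose the spelling most often seen before this context word."""
    return max(candidates, key=lambda c: pair_counts[(c, context_word)])

print(pick_spelling(["sessão", "seção"], "cinema"))        # sessão
print(pick_spelling(["sessão", "seção"], "supermercado"))  # seção
```

The system never "understands" cinemas or supermarkets; it only counts which spelling co-occurred with which word more often.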
If you want to do it like Watson,
which was the champion of Jeopardy,
you need a deeper analysis of the context
You need to understand more about
the semantics of the sentence
That is, this adjective belongs to
this sentence and so on
"Fulano" is husband of "Ciclana"
You need to understand that "is"
is a verb used to link,
husband refers to the guy who is
the husband and "Ciclana" is a
complement of husband, get it?
It's not just statistics,
it's a more complex problem
Now, classify some news article as "sports"
I give you an article, and
I want your system to tell if it
has to do with transportation, politics,
what do you guys think?
[audience]
This classification problem, that
we already know how to do
A statistic classifier will do the job,
kinda like how we correct
"sessão" and "seção"
If you want to know more about how
to do this, perhaps the most used
algorithm is called Naive Bayes
It's a statistical algorithm
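A minimal Naive Bayes sketch, with a made-up four-document training set; a real classifier trains on thousands of articles, but the mechanics are the same:

```python
import math
from collections import Counter, defaultdict

# Tiny hypothetical training set: (words, label)
train = [
    ("goal match player team", "sports"),
    ("championship goal score", "sports"),
    ("election vote senate", "politics"),
    ("minister vote law", "politics"),
]

word_counts = defaultdict(Counter)
label_counts = Counter()
for text, label in train:
    label_counts[label] += 1
    word_counts[label].update(text.split())

def classify(text):
    """Naive Bayes: pick the label maximizing P(label) * prod P(word|label)."""
    vocab = {w for c in word_counts.values() for w in c}
    best, best_lp = None, float("-inf")
    for label in label_counts:
        lp = math.log(label_counts[label] / sum(label_counts.values()))
        total = sum(word_counts[label].values())
        for w in text.split():
            # Laplace smoothing so unseen words don't zero the probability
            lp += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        best, best_lp = (label, lp) if lp > best_lp else (best, best_lp)
    return best

print(classify("the team scored a goal"))   # sports
```

"Naive" refers to the assumption that words are independent given the label, which is false but works surprisingly well for this kind of task.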
Given a text, recognize sarcasm and
meaning inversion in the text?
[audience]
We have no idea about how to do it
There's research in this area but
nothing too relevant
Language understanding to this level
is still a very difficult task
for a computer
What's that? Yes, maybe
[audience]
Yes, more or less
The thing is that it's too dependent
on the context, you know
For example, if I say I saw Geicke today
walking by, you'll find it funny
whereas a computer will not
[laughs]
Yes, but you know, this depends
too much on the context, understand?
I mean, unless it is very obvious,
that is, some word in the text has
been modified by various positive adjectives,
which we can detect using parsing, and
in the end it is modified by a negative
adjective, then we can get an idea that
this is not really the case, but that's not
how we people do it
And finally, I give you various pictures
which contain either a cat or a dog,
as many pictures as you want, say,
one million pictures of cats and dogs
Now I give you a new, unseen picture
what's in the picture, a dog or a cat?
This is a task we humans do without
any problem, isn't it?
Make a computer program that does that,
is there research, is there a solution,
what do you think?
[audience]
We have no idea about how to do it
People around have been trying to do it,
but the correct classifications were
only about 50%, that is
[laughs]
Now that's easy: make a software that
has a 50% chance of outputting dog
and it will have the same performance as
some of the research out there
So the message here is that some
tasks that look difficult may in
fact be easy and some tasks we do
on a daily basis are really complicated
to implement using a computer
Let's now turn to real problems
Some of them have been solved
For example, creating a text editor
with spelling correction when you
miss a letter is a solved problem
in any language
All you do is keep a dictionary of
valid words and when an unknown
word appears you highlight it
Word has been doing that
for a long time now
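The dictionary approach really is that simple; a minimal sketch with a made-up eight-word dictionary:

```python
# Minimal dictionary-based spell checking: any word not in the
# dictionary gets flagged, as in a text editor's red underline
DICTIONARY = {"the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"}

def flag_unknown(text):
    """Return the words a text editor would underline."""
    return [w for w in text.lower().split() if w not in DICTIONARY]

print(flag_unknown("the quik brown fox"))   # ['quik']
```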
But there are some problems,
and that's where opportunities lie,
that we can solve with what we know
But the problems are out there
waiting for somebody to put together
an interface, you know, waiting
for somebody to present a solution
What I mean is that we can solve
some things with the science that
is already available
Then there are things that have not been solved,
and there are problems for which we don't
know if a solution exists
And I'm not talking about NP-hard problems,
problems that are too difficult,
there are some problems that we don't know
if they are NP-complete and some we
know are, but it may be the case
that a heuristic exists that can
solve that problem at least to a
satisfactory extent in an acceptable time
There's a lot of state-of-the-art research
in the field of AI about these
intractable problems but you may get
a heuristic, a mechanism to at least
sometimes get the correct answer in
a reasonable amount of time
These are research opportunities
This gives rise to what I call the
Industry technological paradox
This is how it works:
we have some known technologies which
are widely used
And here we have the boundaries of
our knowledge, where researchers work,
where there's research budget
And here there's an area we
don't know
Industry has problems that could be
solved using known technology;
some with a low technology level,
some with a higher level
Industry has some problems that would
be solved using knowledge in the boundary
of what we know
And when I say industry, I'm referring
to Petrobras, where I work,
the Air Force itself, where I stayed
for some time, Vale, Odebrecht, which is
a big contractor, you know
I'll give examples of these problems
in the next slide
And industry also has some problems
which are beyond what we can do,
especially involving materials science
and nanotechnology
The focus of industry, what we see happening,
is that most of the solutions used are here,
in the lowest region of known technology
Industry doesn't use more advanced
known technology, it's very restricted
to basic stuff
For example, control systems:
what you'll see around are PID systems,
you'll know when you have this class,
but PID is not state of the art
Maybe some processes could be controlled
using lead-lag, AI, fuzzy logic which are
known techniques but not used
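For reference, the PID law the talk mentions is short. This is a sketch on a made-up first-order plant (the gains and the plant dynamics are illustrative, not tuned for any real process):

```python
def pid_step(error, state, kp, ki, kd, dt):
    """One step of the textbook PID law: u = Kp*e + Ki*integral(e) + Kd*de/dt."""
    integral, prev_error = state
    integral += error * dt
    derivative = (error - prev_error) / dt
    u = kp * error + ki * integral + kd * derivative
    return u, (integral, error)

# Made-up first-order plant: the temperature drifts toward the input u
setpoint, temp, dt = 50.0, 20.0, 0.1
state = (0.0, 0.0)
for _ in range(500):
    u, state = pid_step(setpoint - temp, state, kp=2.0, ki=0.5, kd=0.1, dt=dt)
    temp += (u - temp) * dt

print(round(temp, 1))   # settles at the 50.0 setpoint
```

The integral term is what removes the steady-state error here; lead-lag, optimal control and the learning-based controllers mentioned above replace or augment this simple law.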
And the focus of academia, which is
where research lies, is this,
the boundary of knowledge
So researchers won't research here
because they understand this is a
solved problem, somebody did it,
and they're not going to publish
a paper about the topic
But then there are these problems in
this region here, see?
These are problems that we could solve
using current technology but that
industry won't use since it's not
its focus, and researchers don't
study because either they won't publish
a paper about the topic or because
they don't know the problem exists
And this region here is very promising
in terms of making money
I'll give you an example: Instagram
This guy sold Instagram for 1 billion
That's a lot of money, isn't it?
Is there anything unknown there?
No, you just take a picture and
apply a filter
We've known how to do this
for a long time
But the guy envisioned how to sell
this to the correct people
He earned 1 billion
Was it worth it or not?
That's good, isn't it?
[laughs]
You can have quite a good barbecue
with all that
And then in this region
"I don't know how to do it",
industry will tell you
"I don't know how to do it"
because it has never been used
and people don't want to take risks
"This thing here has always worked,
so why should I change?"
And the guy from academia will say
"this won't generate a paper"
But many times it's because
he doesn't know the problem exists
So here, coming back to this paradox,
I'll give you some examples of
known technologies which are used,
problems and stuff in the boundary
of knowledge, but in companies
So, for example, electrode welding
This is known and widely used
Welding with controlled short-circuit,
which is a more productive process
and has a better weld quality
This is a known process in industry,
there's even equipment to do it,
but not all industries use it because,
since electrode welding has always worked,
why should I change,
understand what I'm saying?
And here, laser and hybrid laser welding,
this is the boundary of knowledge,
but it could generate for a company
like Petrobras millions in profit if
it existed, because of productivity
For example, if I can produce oil from
a reservoir today, it's worth a lot more
than having it produce a month from now
It doesn't matter if laser welding is
expensive because I get so much money
producing today that I might be able
to afford paying 20, 30, 40 times
more and use this type of welding
Text editors, spreadsheets and
CAD drawings, that is known
There's AutoCAD, Word, and many other
programs that do this
Version control in a collaborative
environment: have you ever used
Google Docs?
Have you used that one like Excel,
that spreadsheet from Google Docs?
When there's more than one person
using it, the cells each person
is using get highlighted
It's an excellent tool for
collaborative work
Now you ask me, do people use that
in industry?
Do all companies use it?
They don't, and it is very useful
because many people can work
simultaneously on the same document
There's no such thing as
I got my copy and I'm working here
and then somebody else got his copy and
worked there
Then, to combine what I did
with what he did, somebody's going
to have to rework both
and put them together, although
this technology already exists and
could have been used
And it's not used in many industries
And one of the things at the boundary
of knowledge is context-dependent search
So, say I have many texts from
many construction jobs
Let's say, in a given construction,
I get a daily job report
So every day, I get a paragraph
describing what happened in the
construction
After 3 years, I have a lot of stuff
to search if I want some information
So if I want to know, when did that
accident happen in which some guy broke
his leg?
Then, if I search for leg in these 3000 documents,
I'll find the leg of a chair, the leg
of some equipment, understand?
It is going to be difficult because,
although the data is there, I can't find
it, because I can't do the disambiguation
between a person breaking a leg and
some equipment breaking its leg
All right, you can do that in some cases
but this is still a research area
which would be very useful for companies
Now this is what I was talking about
when I mentioned control
Much of what is used in industry
is ON/OFF control, which is the
most basic of all
Now Lead-Lag and optimal control are
known methods but they are rarely
applied or not applied at all
And control using machine learning,
this one we still need to
develop more to apply in industry
That's because it's difficult,
even if you have a research that works,
to transform it into a product that
has the required reliability to be
used in a company such as
Petrobras, Vale or Odebrecht
Let me get back here
What do I mean by this?
These problems, for you who want
to create a business and earn a lot
of money, these problems
whose technology is already known, but
that industries can't solve
because it has always been done
some other way, these problems are
an excellent opportunity
to earn money
Maybe not if you want to become a researcher,
but there are many problems such as
these that we can solve,
the knowledge is available,
but the problem persists. Why?
It's because people there don't
worry about changing what's there
because it's been there for a long
time, and sometimes people here in
academia don't know the problem exists
I want to show you some videos here
about research about AI which are
fantastic videos
About this research that I was talking about,
processing Portuguese, that is,
transforming what I'm saying
into text in Portuguese, or making a
sentence recognition system in
Portuguese, or making a system that
can automatically grade answers,
there's a lot of opportunity there
These are things that exist in
English but not in Portuguese
Let me show you some of these videos
The Stanford autonomous vehicle
This video is available in Youtube
Let me forward to the interesting part
For those of you who know, this is the guy,
Sebastian Thrun, who made the
car which won the DARPA Grand Challenge
by crossing a desert
It's a car without a driver
And this is a demonstration of
some of his research
These are some images of the DARPA
Grand Challenge
Let him speak
[video audio]
Notice that this car is being driven
by a machine, there's no driver
There's a guy there watching in
case it's necessary to brake or
something like this, but all driving
is autonomous
Watch this
I'll stop here to comment
Can you see these yellow and red
parts in this drawing?
This is related to the positioning
system of this car using
distance sensors and this green
trajectory here, can you see?
This drawing is based on search
algorithms originally developed in
gaming, many heuristics used here
came from games
And the localization system
of the car has sensors there, but
the basic algorithm is quite simple,
it's called particle filtering, and
the difficult part is the
sensor models, if you care to
know more about this topic
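The particle-filter idea fits in a short sketch. This is a made-up 1-D world (a robot in front of a single wall, sensing its distance to it), far simpler than a car's sensor suite, but the weight-and-resample loop is the core of the technique:

```python
import math
import random

random.seed(1)
WALL = 10.0   # made-up map: one wall; the robot senses its distance to it

def sense(true_pos, noise=0.3):
    return WALL - true_pos + random.gauss(0, noise)

def update(particles, measurement, noise=0.3):
    """One particle-filter step: weight each position guess by how well
    it explains the measurement, then resample in proportion."""
    weights = [math.exp(-((measurement - (WALL - p)) ** 2) / (2 * noise**2))
               for p in particles]
    return random.choices(particles, weights=weights, k=len(particles))

true_pos = 3.0
particles = [random.uniform(0, WALL) for _ in range(2000)]
for _ in range(20):
    particles = update(particles, sense(true_pos))
    # Motion model: small jitter (the robot is standing still here)
    particles = [p + random.gauss(0, 0.05) for p in particles]

estimate = sum(particles) / len(particles)
print(round(estimate, 1))   # close to the true position, 3.0
```

The cloud of particles starts spread over the whole map and collapses onto positions consistent with the measurements; a real robot adds a motion model for its own movement and much richer sensor models.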
So, how about it?
If someone wants to make an
autonomous car here at ITA,
I think it's an interesting challenge
But I would suggest something
different, because you saw that
the city car is at an advanced
research stage now
But making a car that can cross
mud and offroad situations,
and map the environment,
that doesn't have such an advanced
research status and it would
be useful in a country like
Brazil for, say, topography
Now this one here, I don't know
if you've already seen it,
have you seen the Stanford helicopter?
Just to make it clear, this
helicopter has algorithms to
perform complicated maneuvers
that professional pilots take
many years to learn and they
implemented algorithms to perform
these acrobatics automatically
So all you do is press a button and
the helicopter will do what you asked
And this was done using
Machine Learning techniques
This one here is also interesting
We talk about computer vision,
path planning, object mapping,
this one was an initiative to
put this all together and
perform a task
You'll see what this is
I'll turn up the volume
What he asks the robot is:
"robot, please fetch my stapler
from my office"
and he does that using his voice
We have Siri today but that was
not available then
So he issued a voice command
to this robot here
I'll fetch the stapler
That is, it knows where it is
in the map and now it's
looking for the stapler
Where is it?
[audience]
Yes but this is 6x fast forward
[laughs]
That is, well, I find it interesting,
but we have a long way to go;
the true speed is 6x slower
than what we saw
Here I put some stuff about
question answering, a research area,
which is what we were talking about
earlier, about Jeopardy and Watson
I put here
"Which is the sixth most spoken language?"
this is Wolfram Alpha, I don't know
if you know it,
and this is what it understood,
sixth most spoken language,
and this is the answer, Portuguese
"How many calories are there
in an avocado?"
Avocado, and its answer here
So this is an intelligent system
It may look simple, but to do that
the system has to extract information
from text written in natural language,
there's no formal structure like
a programming language or a table
And Google has that too,
I don't know if you've tried to
type a question in Google,
but I put here
"Who is the founder of Apple?"
and it returned here the founders
We were also talking about this
statistic approach to language,
and that there's a lot to be done
in Portuguese
Take a look at some Google translations
"A project report shall be presented
every other week".
This translation here is correct
This "every other" in English means
at every 2 of this period of time
Every other day, every other week,
every other trimester
Now take a look at what it does
in this other sentence:
a project report shall be presented
every trimester other
Why does it do that?
Because "every other week" has been
translated many times as
"a cada duas semanas" and it finds
this statistically probable
Now "every other trimester" wasn't
done correctly because it's improbable
I don't know if you knew this is
how Google translate works, but
it is a statistical approach
to translation.
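The "every other week" behavior can be illustrated with a toy phrase-based translator. The phrase table below is made up; real systems score many candidate segmentations with learned probabilities, but the failure mode is the same: a known phrase translates well, and anything outside the table falls back to word-by-word:

```python
# Hypothetical phrase table learned from a corpus: translation prefers
# the longest known phrase, falling back to word-by-word
phrase_table = {
    "every other week": "a cada duas semanas",
    "every": "cada", "other": "outro", "week": "semana",
    "trimester": "trimestre",
}

def translate(sentence):
    words, out, i = sentence.split(), [], 0
    while i < len(words):
        # Try the longest phrase starting at position i
        for j in range(len(words), i, -1):
            phrase = " ".join(words[i:j])
            if phrase in phrase_table:
                out.append(phrase_table[phrase])
                i = j
                break
        else:
            out.append(words[i])   # unknown word passes through
            i += 1
    return " ".join(out)

print(translate("every other week"))       # a cada duas semanas
print(translate("every other trimester"))  # cada outro trimestre (word-by-word)
```

"Every other trimester" was never seen as a phrase, so it comes out garbled, exactly like the Google examples above.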
Look here, "every other year",
"every other month",
now look at this:
One meeting, yes, the other, no,
and it translated to
every meeting other
But I got another example
after I had prepared this
presentation which is even more
interesting, let me see here
I did this test here:
the bark of a tree is its outer layer,
and the bark of a dog is the action
So I came up with this sentence
intending to be ironic:
Dogs bark a lot, but a tree bark
has never been seen
Now look at this: it translated to
dogs bark a lot, but a tree outer part
has never been seen
[laughs]
So Google and these translation systems
don't understand context, but
nevertheless the probabilistic methods
work well
Now, Machine Learning,
changing the subject a little,
is a fundamental piece in all this
So let's say I measure a set of
points like this; which model
should I fit?
Do I want a straight line?
A straight line probably won't
be very good
Do I want a parabola?
Probably yes
But I can construct a polynomial
which passes through all these points,
but do I want that?
The old way, so to speak,
and by old I mean before 2008,
it's not that old,
was to put all this data into a
statistical training system and
output: "Well, this fourth-order
polynomial has an R-squared statistical
error of so much, 99%"
Nowadays, however, we have more
modern techniques to do this,
especially when you need to generalize
If you want to talk about this later,
I have more material about the topic,
but since I've been speaking
for quite a while, I'll give this
to whoever is interested
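The line-versus-parabola question can be made concrete with least squares. The data here is synthetic (a noisy parabola); the point is that adding degrees always fits the training data at least as well, which is precisely why training error alone can't choose the model:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 30)
y = 2 * x**2 + 1 + rng.normal(0, 0.5, x.size)   # synthetic noisy parabola

def fit_mse(degree):
    """Least-squares polynomial fit; mean squared error on the data itself."""
    coeffs = np.polyfit(x, y, degree)
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

print(fit_mse(1) > fit_mse(2))    # True: the line misses the curvature
print(fit_mse(8) <= fit_mse(2))   # True: more degrees always fit the data
                                  # at least as well, but may not generalize
```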
Other machine learning techniques:
like I was saying, before 2007 and
around 2008 we would propose a model,
take measurements and fit a
least squares model, if you've
heard of least squares, it's a
regression method that allows you
to build a model
Now what do we do today?
We do something called regularization,
which are techniques that allow
us to reject outliers, basically
And we want to learn a model that
is not too specific to what you
observed, and this is
such a common mistake:
This guy goes to the lab and
takes 100 measurements
Now he builds a model that
fits his 100 points perfectly
Yeah, but so what?
I don't care about his 100 points,
what I want to know is whether he
can use his model to predict
future, unseen instance,
that's what matters
And this is the focus of
current Machine Learning
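One such technique, as a minimal sketch: ridge regularization, which adds a penalty on large coefficients so a flexible model can't contort itself around every noisy point. The data and the polynomial degree here are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 15)
y = x**2 + rng.normal(0, 0.1, x.size)    # noisy parabola again

def ridge_poly(lam, degree=9):
    """Ridge regression on polynomial features: minimize
    ||Xw - y||^2 + lam * ||w||^2; lam = 0 is plain least squares."""
    X = np.vander(x, degree + 1)
    w = np.linalg.solve(X.T @ X + lam * np.eye(degree + 1), X.T @ y)
    return w, X

w_plain, X = ridge_poly(0.0)    # free to chase every noisy point
w_ridge, _ = ridge_poly(1.0)    # the penalty shrinks the coefficients

print(np.linalg.norm(w_ridge) < np.linalg.norm(w_plain))   # True
```

The penalized fit has smaller coefficients and a smoother curve, trading a bit of training error for better behavior on unseen points.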
Another focus is learning by
examples
Have you heard of this MNIST dataset?
It's a database of handwritten
digits, like these ones here,
which are used to test
recognition systems
So what does your algorithm
receive as input?
It receives this image, that is,
a matrix of intensities, and
the algorithm has to decide that
this is a 7, this is a 6, and this
can be performed very well using
example based learning, that is,
I give your algorithm an example:
Look, this is a 7. This is a 7 too.
This is a 7 too.
When I give it another 7, I want
it to have learned this structure
And then there are techniques
to do that
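A minimal sketch of example-based learning, with invented 3x3 "digit" patterns standing in for MNIST images:

```python
# Example-based learning in miniature: classify tiny 3x3 "digit" images by
# nearest neighbor against labeled examples. Real systems use datasets like
# MNIST; these 9-pixel patterns are made up just to show the mechanism.

def distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Labeled examples (flattened 3x3 binary images, invented)
examples = [
    ([1, 1, 1, 0, 0, 1, 0, 0, 1], "7"),
    ([1, 1, 1, 1, 0, 1, 1, 1, 1], "0"),
]

def classify(image):
    # Pick the label of the closest stored example
    return min(examples, key=lambda ex: distance(ex[0], image))[1]

# A slightly corrupted "7" should still land on the right example
label = classify([1, 1, 1, 0, 0, 1, 0, 1, 1])
print(label)
```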
So what would be an overview
of modern machine learning?
What is mostly used is informed
search, or heuristic search,
which Google uses to plan paths
for its car
So, for example, the traveling
salesman problem is intractable
It is an NP-hard problem
But there are ways to bound the
solution: for example, if I sum
the distances I get a limit,
an upper bound for the best
possible solution
Based on that, I can create a
system that does the search in
a more intelligent way
It doesn't need to try all
possibilities, it only goes
so far into the search tree
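That bound-and-prune idea can be sketched like this, on an invented four-city instance:

```python
# Sketch of informed (branch-and-bound) search on a tiny traveling-salesman
# instance: any complete tour gives an upper bound, and partial tours that
# already exceed it are pruned instead of fully explored.
# The distance matrix is invented for illustration.

dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
n = len(dist)

# Upper bound from the naive tour 0 -> 1 -> 2 -> 3 -> 0
best = [sum(dist[i][(i + 1) % n] for i in range(n))]

def search(path, length):
    if length >= best[0]:          # prune: this branch can't beat the bound
        return
    if len(path) == n:
        best[0] = min(best[0], length + dist[path[-1]][0])  # close the tour
        return
    for city in range(n):
        if city not in path:
            search(path + [city], length + dist[path[-1]][city])

search([0], 0)
print(best[0])
```

The pruning test is what keeps the search from visiting every permutation; with a good bound, large parts of the tree are never entered.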
Supervised learning:
in the previous example I told
it: what is a zero?
what is a one? a two?
I gave various examples
Then there are techniques for
the computer to learn based on
these examples
Then when a new, unknown example
comes it will know what is a 0,
what is a 1
Now, on the other hand, there's
unsupervised learning, which is
what we were discussing here:
For example, if I have a database
containing various facial
expressions, I want to throw all
that into the algorithm and
have it tell me how many
different facial expressions
there are in that database
Or in the case of handwritten
digits, somebody who writes the
4 with a leg, and somebody who
writes a chair-like 4, these are
different types of 4's
So if I give the algorithm
a bunch of 4's it might not
know beforehand that this
is a chair-like 4 and this is
a legged 4, but there are
techniques to tell,
given a set of examples,
that there's a category here,
which looks similar,
and this other group is also alike
I can't tell what it is,
but I do know this one is
different from this other one
This is a way to learn
Afterwards an expert may come
and tell "this is this,
that is that", and label
the categories
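A toy version of that grouping idea, using k-means on made-up one-dimensional data:

```python
# Unsupervised learning sketch: 1D k-means with k=2. Nobody labels the
# points; the algorithm discovers on its own that there are two groups.
# The data and starting centers are invented.

data = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]

def kmeans(points, c1, c2, iters=10):
    for _ in range(iters):
        # Assign each point to its nearest center
        g1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
        g2 = [p for p in points if abs(p - c1) > abs(p - c2)]
        # Move each center to the mean of its group
        c1 = sum(g1) / len(g1)
        c2 = sum(g2) / len(g2)
    return c1, c2

c1, c2 = kmeans(data, c1=0.0, c2=10.0)
print(round(c1, 2), round(c2, 2))
```

An expert can then look at the two discovered groups and name them afterwards, exactly as described above.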
And reinforcement learning is
used to perform real-time
decision making, such as the
Stanford helicopter
The helicopter receives an
input command and, depending
on which state it ends up in
(its attitude, its
speeds and so on), it computes
a reward for this:
I took this action when I was
in this state, and I ended
up in this state, which is good
So this is a good action
Now I did this here and lost
my helicopter, I lost
my entire research, and that
is not good, I don't even know
when I'm going to be able
to test this again
But what has been studied a lot
recently are methods based on
probability and statistics
That is, given that I'm here
and I can see this wall from
this distance, and this one
from this distance, and this
from this distance,
where am I?
I can answer that if I'm in
this location because I have
a map of this region in my brain
But can we do that
using a computer?
The answer is yes, and I
don't know if you know how the
Google car works, but one method
it uses to know where it is
is by watching the ground
So based on this history of
ground it has already seen
it can locate itself and
know where it is in the road,
and which is its geographic
location, whether it's
left or right
One of Google's car problems
is to drive when there's snow
When there's snow the car can't
see the ground so they don't
have a solution yet to solve
this snow problem
Maybe they'll use Street View
because you can get a tree
here, a traffic sign there
And this set of information,
I mean, a single sign, alone,
doesn't help much in localization,
but the set of signs:
well there's a sign here, then
10 meters from here there's this
other one, then 10 meters more
I see that tree, you put all this
together and you have a
good probability of being able
to locate yourself
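One probabilistic localization sketch in that spirit is a particle filter, here on an invented one-dimensional "map" of wall distances; real systems are far more elaborate:

```python
import random

# Particle-filter sketch for 1D localization: a robot somewhere along a
# corridor measures the distance to the wall ahead (the "map" below is
# invented). Particles are weighted by how well they explain the
# measurement, then resampled; the surviving crowd is the estimate.

random.seed(0)

wall_distance = [5.0, 4.0, 3.0, 9.0, 8.0, 7.0, 6.0, 2.0]  # reading at each cell
true_position = 3
measurement = wall_distance[true_position]                # robot reads 9.0

particles = [random.randrange(len(wall_distance)) for _ in range(200)]

def weight(p):
    # Closer predicted reading -> larger weight
    return 1.0 / (1.0 + (wall_distance[p] - measurement) ** 2)

weights = [weight(p) for p in particles]
particles = random.choices(particles, weights=weights, k=200)  # resample

# Most common surviving particle is the position estimate
estimate = max(set(particles), key=particles.count)
print(estimate)
```

With a single ambiguous landmark this would stay uncertain; it is the combination of measurements, as with the signs and trees above, that pins the position down.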
These are graphs that are
used in searches, and this is
a method of supervised learning,
where this segmentation is
labeled as building, sky
This one here depicts reinforcement
learning, that is, throw this mouse
in this environment
If it gets to the cheese, it gets
a reward which is the cheese,
so as time passes it learns
which is the path it should take
that is a path where
it doesn't walk too much
and still reaches the desired reward
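The mouse-and-cheese setup can be sketched with tabular Q-learning on a toy corridor; all parameters here are illustrative:

```python
import random

# Reinforcement-learning sketch: a "mouse" on a corridor of 5 cells, cheese
# at cell 4. Actions: 0 = left, 1 = right. Reward 1 on reaching the cheese,
# 0 elsewhere. Tabular Q-learning with invented hyperparameters.

random.seed(1)
N, GOAL = 5, 4
Q = [[0.0, 0.0] for _ in range(N)]
alpha, gamma, eps = 0.5, 0.9, 0.2

for _ in range(500):                      # training episodes
    s = 0
    while s != GOAL:
        # Epsilon-greedy: mostly the best known action, sometimes random
        a = random.randrange(2) if random.random() < eps \
            else (0 if Q[s][0] > Q[s][1] else 1)
        s2 = max(0, s - 1) if a == 0 else min(N - 1, s + 1)
        r = 1.0 if s2 == GOAL else 0.0
        # Move Q toward reward + discounted best future value
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# Greedy policy after training: every cell should point right, to the cheese
policy = [0 if Q[s][0] > Q[s][1] else 1 for s in range(N)]
print(policy[:4])
```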
Now it's still very important
to know the problem
In heuristic search, if you want
to have a good heuristic, you
need to know the problem
For example, if you want to
create a program that plays chess,
people who know how to play
chess will develop a much
better chess software than people
who don't, because people who
know will be able to identify
advantageous positions
This guy who can't play chess
will have no clue
So the software of the guy who
can't play chess, if there's
infinite processing power,
will get to the same result
But since that is not the case,
having a good way to evaluate
positions without having
to get to the end of the game
is very valuable
In the case of reinforcement
learning you need to build a
simulator, which involves
knowing the problem
The message here is that it
is mandatory to know the
problem; there's no computer
mechanism so that you throw
anything in there, it recognizes
what should be done and does it
That's not very close to happening
And lastly, what is this parallel
processing that arose recently?
It is a new technology to use
graphics cards and multicore
processors to accelerate certain
processing tasks
For example, in medical imaging
there's been a 45x speedup
In fluid mechanics 17x,
planet interaction, 100x,
some medical research,
up to 400x, gas diffusion,
35x
In summary, let's say I want
to run some application in a
server cluster and I need, say,
10 computers to do the job
If I use a GPU and apply this
technique, it's as if I had
bought 1000 more computers,
because the algorithms are
so much faster
Now look at the impact such
thing would have for banks, for
example: banks need to do secure
key exchange for every transaction,
based on number theory
This is an expensive computational
procedure: generating a prime
number takes a few milliseconds
And then you have to do it
millions of times
Now if you could accelerate this,
you'd go to the bank, install
GPUs in each server, modify the
code and suddenly it's as if
the bank had 100 times
more servers
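The kind of modular arithmetic involved can be sketched with a toy Diffie-Hellman-style exchange; real keys use primes thousands of bits long, which is what makes the computation expensive:

```python
# Sketch of the number theory behind secure key exchange: both sides end up
# with the same shared secret using modular exponentiation. These numbers
# are toys; real systems use thousand-bit primes.

p, g = 23, 5            # public prime and generator (toy-sized)
a, b = 6, 15            # private keys of the two parties

A = pow(g, a, p)        # party 1 publishes g^a mod p
B = pow(g, b, p)        # party 2 publishes g^b mod p

secret1 = pow(B, a, p)  # party 1 computes (g^b)^a mod p
secret2 = pow(A, b, p)  # party 2 computes (g^a)^b mod p

print(secret1, secret2)
```

Each `pow` with a modulus is exactly the operation that has to happen millions of times a day in a bank, which is why accelerating it matters.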
There's a good opportunity
to make money, for example
Now there's a world of possible
research that can be
done with that
For example, you could plan
the sequence of construction
and assemlby
How am I going to build
this equipment?
In Catia, for example, how
should I position this part?
It has a module that can
compute collisions and
stuff like that, but there's
no computation of a
completely automatic trajectory
that avoids all collisions
At least I think it's not
fully developed yet
That's an opportunity
3D reconstruction:
Let's say I get multiple
cameras and try to make a
3D map of this environment
Oh, yeah, I'll hand you
this 3D camera so you can
take a look at it
So, this 3D reconstruction
thing from structured points:
if I measure distances
in a point cloud, how do I
go about reconstructing the 3D?
If I have multiple cameras
to find objects, or if I
have multiple Kinects,
for example, how do
I do that?
Now, parsing and speech
recognition in Brazilian
Portuguese
For example, telephone
attendance systems
In the United States you can
call to buy a ticket, you'll
practically talk to the
answering bot
Here, in the telephone, you go:
Press 1 for this
Press 2 for that
Press 3 for that
Of course you always need
option 7, and you need
10 minutes for each part
of the menu, and then
you still end up having
to talk to a real person
and you hang there waiting
15 more minutes
See, that's something that
doesn't exist in Portuguese
At least I have never made
a call to a company that had
such a system in Portuguese
And, you see, the technology
is known; Siri already
exists in the USA
Why not make one of these in
Portuguese? Is it that
there's no market?
I find that very implausible
Synthesis of voice and
instrument sounds
This is to create music
For example, there's this
Encore software that allows
one to input a score sheet and
it will play the instruments
But hey, why can't we do that
with voice?
Let's say I record a singer
singing various phonemes and I
want to compose my own song
and have this guy sing it,
so that I can test and see
how it goes, can we do it?
Yes, we can
Of course you'll have to
study how to connect
the sounds, there's some
research there, but it's
a very interesting topic,
and it's doable
If one of you is interested...
[audience]
Now, what I was saying
about banks, this arithmetic
modulo N is what is used
in cryptography
Banks and credit cards can
use a considerably lower
amount of servers to perform
the same task
So instead of buying 1000 more
computers they buy 1000 GPUs,
which is way cheaper
Integration with microcontrollers
and sensors, in order to do
integrated product development
For example, you know
the Arduino, which is even
supported by Google
Let's say I want to
make a table that maintains
uniform illumination
I need this because I'm assembling
some electronic circuit and
need to do some precision work
What I can do is I put
many high power LEDs
and measure the light intensity
throughout the surface and
control the power of each LED
so, for example, if there's a
light source to my left, the
illumination of the table
should remain uniform
So I can read data from these
sensors, transfer them to a
computer, process them and
give back to the controller
what should be the values
of each LED intensity
That's hard to do using only
a microcontroller because
the model is not that simple
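A sketch of that control loop, with an assumed linear model of how LED power turns into measured brightness (all constants invented):

```python
# Sketch of the LED-table feedback loop: measured brightness is modeled as
# ambient light plus a contribution proportional to LED power (both made
# up), and a simple proportional controller drives it to a setpoint.

target = 100.0    # desired brightness at the sensor
ambient = 30.0    # light arriving from elsewhere, e.g. a window (assumed)
k_led = 2.0       # brightness gained per unit of LED power (assumed)

power = 0.0
for _ in range(50):
    measured = ambient + k_led * power
    error = target - measured
    power += 0.2 * error          # proportional correction each cycle

final = ambient + k_led * power
print(round(final, 1))
```

If the ambient term rises (a light source appears to the left), the same loop automatically lowers the LED power to keep the table uniform.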
Particle filtering and robot
localization: this is what
we saw about Google's car
There are some limitations:
for example, to drive in snow,
it will have to process
images from the sides, and since
there are many images that's
harder than just
processing ground images
So you need to embed more
processing power, which is also
a good opportunity to use
parallel processing
And nonconvex problems, which
are problems that require
brute force: have you heard
of Rainbow Tables?
It is a scheme used to
break passwords
When you log in to Windows,
it saves your password,
or else it won't remember
your password, obviously
But it doesn't save your
password directly; what it
saves is a hash of the
password, which is a
one-way transformation
You can't get this hash
and reconstruct the password
but since passwords are usually
short, ranging from 6
to 8 characters,
this is what they do:
they scan through all possible
passwords and generate all
corresponding codes
Then the attacker goes to your
computer, retrieves your hash
and reads your password
off that list
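The precomputation idea can be sketched as a naive lookup table; real rainbow tables compress this with hash chains, which this sketch skips:

```python
import hashlib

# Naive lookup-table version of the password-cracking idea: hash every
# possible password once, then invert any stolen hash by lookup. The
# two-letter alphabet keeps the space tiny for illustration.

alphabet = "ab"
table = {}
for c1 in alphabet:                  # enumerate all 2-character passwords
    for c2 in alphabet:
        pw = c1 + c2
        table[hashlib.sha256(pw.encode()).hexdigest()] = pw

stolen_hash = hashlib.sha256(b"ba").hexdigest()  # read off the victim machine
recovered = table[stolen_hash]
print(recovered)
```

With 8 characters over a realistic alphabet the table is enormous, which is exactly why this is a brute-force, parallel-processing-friendly problem.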
Just so you know,
passwords up to eight characters
are already broken
This means that if someone
goes to your password-safe
computer and reads the
folder where your password is,
he can retrieve your password
hash and discover the password
if it has up to
8 characters, and in Windows,
up to Windows XP, even if
the password had more than
8 characters, Windows would
split the password into
blocks of 8 characters and
save the corresponding
hashes, which doesn't help
So it was useless to have
a longer password against
these Rainbow Tables
Now some suggestions:
these are some things I have
been doing, and if you
want to know more I can
give the code and explain
how it works and show details
This is 2D and 3D collision detection
That is, these two vehicles
here don't touch each other
although this part here
is intertwined, but
there's no collision
It's just that they're, like,
one inside the other
So we need exact collision
detection and it's possible
to accelerate that using GPUs
That's not how games work
today, as an example
Games compute various boxes
around the character and
compute collisions using those
So a shot that barely hits
is understood to have hit
It would be interesting
to use this technique in a game
You could write a
Counter-Strike that works
like this: someone got shot
You could check that
the bullet hit the leg
and reduce the speed of the
character, do stuff like that
In Counter-Strike, it's
either hit or miss, unless
it's a headshot, in which case
it renders a head but
computes collision using
a cube, which is easier
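The box-versus-exact difference can be sketched directly; the coordinates are invented:

```python
# Sketch of bounding-box vs. exact collision: a round "head" approximated
# by a box reports a hit for a shot that actually misses the circle.

def hits_box(px, py, cx, cy, r):
    # Axis-aligned box around the circle (what the game checks)
    return abs(px - cx) <= r and abs(py - cy) <= r

def hits_circle(px, py, cx, cy, r):
    # Exact check against the circle itself
    return (px - cx) ** 2 + (py - cy) ** 2 <= r ** 2

# Shot near the box corner: inside the box, outside the circle
box_hit = hits_box(0.9, 0.9, 0, 0, 1.0)
exact_hit = hits_circle(0.9, 0.9, 0, 0, 1.0)
print(box_hit, exact_hit)
```

This is the "shot that barely hits" case: the box says hit, the exact shape says miss, and GPUs make the exact test affordable.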
This is a performance
comparison chart
This line here shows CPU
performance as a function
of number of polygons
And this one here is
the GPU result
And you see that there's
a lot of difference:
if you extrapolate this
blue graph here to compute
this amount of polygons
it won't fit in this
area by a long shot
Just check the
derivative here
OpenCL and cryptography
Parallel processing
Many opportunities and challenges
One of the challenges is to
build a secure system,
if you like cryptography
Even if you implement the
algorithm correctly, it may be
the case that it is insecure
That's because, for example,
a memory card: it implements
a cryptographic key, and
if you get this card and
measure the energy consumption
of this card, each part here
in this graph is a step in
the cryptography algorithm
So if you get the card and
measure its power consumption,
based on these peaks here
it's possible
to extract the bits of the key
So if that's running in
your own PC, you'll be
running the cryptography
in one of multiple cores
If there's a malicious code
running in another processor,
it turns out to be possible
to watch the processor
cache and check how many
times the crypto code needed
to fetch data from memory
instead of using cached data
And then it can figure out
your password
Now, that shouldn't be possible
with OpenCL
Or at least I think it's
not, I don't know, there's
not much research on that
At least I don't know
The thing is that using
parallel processing techniques,
you can control whether you
want to use all processors
at the same time
So when you run the code
in parallel, you can
configure the system to
run only your code
So it's not possible
to do a side channel attack
on a GPU, to the extent
I know
If the GPU is running crypto,
it's all it is doing, and
besides it will be much faster
This regards convex
optimization and quadratic
programming, there are
multiple methods to
solve linear programming like
the simplex method
This is a more general
interior point method
and it can solve more
complex problems
Whoever likes fractals,
fractal generation with
parallel processing can be
800x faster
So for example, these images
here are from NVidia if
I'm not wrong, and this image
here I generated in a software
I wrote to run in the GPU
You can have this
type of result
OpenCL and shape recognition
This here is...
I even sent you the link to
a video in which there's
recognition of some coins
I have it here and I
can play it at the end
if you guys want me to
The question here is
this occlusion part
Can you see this shape?
Say, for example,
this region is filled
and this region is filled,
notice that this whole
region is going to
be one single shape
so if I get just the
boundary, this here is
the boundary, see?
There's no distinction
And one of computer vision
problems is when an object
is behind some other one
How do I keep recognizing
an object when one can
pass behind the other one?
So it's also possible to
accelerate this type of
thing using parallel
processing, and let me show
you this one quickly
This is an example
I'm going to load a shape
So look, I'll load this
geometry, this shape,
this one here, this one,
and this one
Now I'm going to open
this file here
So, you can see that
this part here has no
defined boundary
I can't...
did everyone see this here?
Interesting, isn't it?
This 3D technology
without glasses is something
that may appear in televisions
within some years
But what I was saying is
that there's occlusion here
This part of the image, see?
I can even get the border
but it's a connected border
and I can't... it's not that
I can't, it's that it's not
trivial to read this border
here and split the shapes
that gave rise to it,
which is the star and
these other known shapes
And here I'm going to
use OpenCL, parallel
processing, to find these
geometries here
So this is the result
Even with occlusion,
it can find geometries
Let me load this other
shape here, this one
So this has a very
interesting acceleration
Now let me open that
image again, just so
you can compare
Now check these times
It's almost 1 second
for this filter here,
practically 1 s total time
Let me deactivate parallel
processing and load
that same image
We can talk in the meantime
Now here, just the filter
took 4.5 s, then 3 s for
borders and 3.4 for
edge thinning
So, you see, in this simple
example, not so optimized,
I have a 10x faster algorithm
[audience]
Oh, to find it!
I didn't even measure
that time
You see that it's much
slower, even though I
didn't measure, it's much
slower than before
Well, I've talked a lot,
just to end here, I'd like
to say that today we do
things that would be
unimaginable 3 years ago
So methods we have today
in computer science,
algorithms, there's a lot
we have today that allows
us to do things that would
be unthinkable 3 years ago
So the evolution has been
fast and interesting with
parallel processing, the
computing power is quite strong
There are many unsolved
problems but that we can
solve, and that's an
opportunity to make money
If you want to deal with
some of these problems, you
won't be researching
state-of-the-art stuff,
but you have a good chance
to make money with these
problems, for example, the
ones I showed about welding
and document collaboration
I see many opportunities
regarding natural language
processing, especially in
Portuguese
And there's a world of
applications using parallel
processing, these ones
I showed you are just a
small set; I myself did
other things and I know of
much other research in
this area, and there's a
world of applications
So if you'll accept the
challenge to work with
that, whoever wants to
work with this, do some
interesting research, you
don't need, say, to make
a complete cryptographic
system, but you can make
a small piece, publish an
article, do some other
interesting thing, do your final
project on a subject that
is in the spotlight
instead of just doing some
literature review which may
not help even yourself later on
Whoever wants to do something
interesting can do research, and
if I can help, I will
If someone wants to ask
something, thanks for the
attention, I hope I showed
something interesting here
I hope you liked it and if
you want to ask, if I can
answer, I will; I hope you're OK
with the many "I don't know"s
I'm going to say...
[question]
So the website of this OpenCL
developer... well, first you
have to pick a platform
And today we have 2 of them,
basically: there's CUDA, which
is NVidia's language, I don't
know if you've heard of it,
and OpenCL, the Open Computing
Language, which works on
various platforms
So CUDA is more mature, but
OpenCL can be used in AMD
GPUs, Intel CPUs, so it's
more versatile
If you want to use CUDA, there's
NVidia tutorials which are an
excellent starting point
And for OpenCL what I'd
recommend are the examples in
the website of the standards
body, which is the Khronos Group
Also if you're interested you
may want to take a look at
some of the things I wrote in my
website, there's some
interesting stuff there too
Take a look at some examples,
some applications, create some
simple code, you know, get
some familiarity with it
[question]
Well, when I was here at ITA,
I discussed this with people
who studied computer science
in my class and we talked about
Moore's law, which says that
processing doubles every
18 months and so on, but back
then in 2005 we were already
reaching this stage where
it was not possible to increase
processor clocks too much,
because a physical limit is
being reached, meaning that
current processors won't go
much beyond 4 GHz at this point
So at that time we would say:
well, if this can't be done
anymore, the most logical
solution is to use many cores
And so let's say I have 20 cores
in my computer, I don't want to
run 20 programs
Instead, I want the software I'm
running to run fast
So I started to look at that
see how progress was being made
There was OpenMPI, MPI, which
is a standard used for computer
clusters, high-performance clusters,
but I didn't have a cluster, that
made things harder
And then, 2 or 3 years later,
CUDA already existed, but I thought,
at the time, that being platform
specific was too limiting a factor
And then came OpenCL, we started
to study, we decided to create
a website to consolidate the
things we did and then...
I learned much with the website
itself, up to the point that...
Have you heard of LAPACK?
LAPACK is what runs under MATLAB
It's what MATLAB uses to solve
linear systems, it's the heart
of MATLAB
We wrote a sort of LAPACK, a
simplified version, to run in GPUs
and we got better performance than
LAPACK, which is quite interesting
considering LAPACK has been
developed for more than 10 years
[question]
It was me, Edmundo and Diego,
guys from my class, Edmundo and
Diego studied computer engineering
So this is how it goes,
start with something...
Get a "for" loop, and this for
has an i which goes from 1 to
1000, well that's something
Just by doing this you can get
a 5x faster algorithm
People kill for 20%, 10%
And then you go on to learn
other structures
There are algorithms which can't
be fully parallelized
For example, say I have a really
large vector and I want its sum
What is most logical is to
accumulate that sum in some number
but that can't be done in parallel
because the next value depends
on values I have summed previously
And then what you do are more
complex parallel structures, you
get all this, break this big
vector into pieces, then each
core of the GPU sums a piece, and
then I'm left with, say, 1000
numbers, and I have to sum only
these 1000 numbers instead of
1 million numbers, for example
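That two-stage reduction can be sketched sequentially; on a GPU, each partial sum in stage 1 would run on its own core:

```python
# Sketch of the parallel reduction pattern described above: a big sum can't
# use one shared accumulator in parallel, so split the vector into chunks,
# sum each chunk independently, then sum the much shorter list of partials.

data = list(range(1_000_000))
chunk_size = 1000

# Stage 1: independent partial sums (parallelizable across cores)
partials = [sum(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]

# Stage 2: combine the 1000 partials (tiny compared to the original)
total = sum(partials)

print(total == sum(data), len(partials))
```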
But that was how it went,
I learned slowly, had this notion,
even a strike of luck to see that
parallel processing was a
tendency in a future which is
practically true now
That's somewhat how it was
[question]
Yes, I intend to get my doctorate
here at ITA
[question]
Well what I intend to do here
are robotic systems, robot aided
systems and computer vision to
build large scale equipment,
because in Petrobras there are
many construction jobs, like
the COMPERJ, the petrochemical
complex, refineries in the
northeast, there's the fertilizer
plants, and many large scale
equipments will be built, and
by using robots we should be able
to reduce how long it takes and
the cost to finish these jobs
and get to use them earlier
Like I was saying, this is worth
a lot of money, the opportunity
cost of having a refinery ready
1 month earlier, that is
worth millions
So this is my intention here
with ITA researchers, and it
has applications in large scale
construction, like refineries
and the oil industry
I don't know if that answers
the question
[question]
Oh, right, it's
www.cmsoft.com.br
[question]
Which?
[question]
Oh, yes, we wrote the software
So this is a real-time application,
let me show you the video,
so this here uses parallel processing
to identify circles in real time; this
wouldn't be doable without
parallel processing
In other words, 3 years back we
couldn't use a single computer; I'd
need a cluster to do this here
So, you see, even though there's the
pen in the middle here, it keeps on
identifying the coins
Look here, there's a pen on top
of the coins, but it still sees them
Now here there's this strange
thing, it's unrelated, but...
Now here it understood there was
nothing there, but the pen is
still here, it's on top of the
coins, right on top
See occlusion here
This is a known algorithm, we
parallelized it, we made some
adaptations to make it parallel
[question]
Yes, similar, it's the same
algorithm, and the name of this
algorithm is the Hough Transform
It's just that this is the
generalized version, I'll show
later why, I basically don't need
an equation, I just need a shape
and I can recognize any format,
independently of occlusion, which
is the main advantage of the
technique: I can see an object
even if it is partially obstructed,
that is the application
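A minimal sketch of the Hough-transform voting idea for circles of a known radius, with synthetic edge points and part of the circle removed to imitate occlusion:

```python
import math

# Each edge point votes for every center it could belong to; the
# most-voted center wins even when part of the circle is missing.
# All points and grid sizes are synthetic.

radius = 5.0
true_center = (10, 10)

# Edge points on the circle, with roughly a third of the arc "occluded"
points = [(true_center[0] + radius * math.cos(a / 10.0),
           true_center[1] + radius * math.sin(a / 10.0))
          for a in range(0, 42)]

votes = {}
for (px, py) in points:
    for cx in range(21):              # candidate centers on an integer grid
        for cy in range(21):
            # Vote if this center is about one radius away from the point
            if abs(math.hypot(px - cx, py - cy) - radius) < 0.3:
                votes[(cx, cy)] = votes.get((cx, cy), 0) + 1

found = max(votes, key=votes.get)
print(found)
```

The generalized version used in the coin demo replaces the circle equation with a trained shape template, but the accumulate-votes-and-take-the-peak mechanism is the same, which is why occlusion only removes votes instead of breaking detection.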
[question]
Well, you know, this I showed you
here, if you want to ask me
anything, send an email and
we can work together
Of course I don't live here
in Sao Jose, but you can count
on me, I can't dedicate exclusively
of course, but send me an email,
ask what you want, if I can help
I will help
So, that sound synthesis, I found
it interesting, how do I do that?
I have some stuff I wrote,
some libraries, I'd be OK with
passing it and helping you do it
There's a lot I would like to do
but I can't
So it's interesting for me, because
it's a new application of parallel
processing, and I think the person
who gets to do it will learn a new
technology, which is very promising
Within 1 or 2 years there should be
phones with a small GPU that
will be able to run parallel
processing, so this is something
that has an enormous potential
to grow, so if you want to do
something, if I can help,
do count on me
So this is the occlusion
part, in which I can train this
format here, can you see?
There's no equation, and then
I can find a key among the coins
This here is because of orientation
So this is the algorithm,
its advantage is that it is less
sensitive, or just not sensitive,
to occlusion, but there's still
the problem of scale and rotation
Computer vision problems have
many solutions, this is one of them
Here it's not real time, it's
accelerated, two times faster
But still...
[question]
Yes, quite the same
[question]
Oh, this software, this is open
source software I wrote, it's hosted
in Google Code, you can download
it, we put many examples in the
website about how to process images,
how to create a filter
In Google Code? The examples are in
my website, CMSoft, what's in
Google Code is the source code of it
Yes, that website
Then to process images and compute
border detection, you saw here
how much a difference it makes
It's much faster
Oh, count on me
Well, I hope you liked it,
count on me if you want to do
something in this field of
parallel processing, we can publish
it, post in the website, create
applications, create localization,
I will be glad to help
Of course I can't help all the time
but then there's the weekends,
overnight work, you know
Well I guess I'm not quite up
to that anymore
Well thank you guys for
your attention