>> SINZ: Welcome, everybody, and thanks for inviting us here for this talk. This talk will
be about a technique for bug finding. This technique is called Bounded Model
Checking, and we will also present our tool which implements this technique.
Our tool is called LLBMC, which stands for Low-Level Bounded Model Checker. Okay,
let's start with a very general problem: the battle against software bugs. I want to
give some examples of how software bugs showed up in realistic settings, and what
some people have suggested for how to deal with these software bugs. So let me start back at the
end of 2008. This example is about mobile devices, more particularly about the Microsoft
media player Zune. A strange thing happened on the 31st of December 2008, when
suddenly all of those 30-gigabyte Zune devices stopped working. If you switched
the device off, you could not start it again. That was of course a
problem, because many people wanted to use their devices on New Year's Eve to play
some music at their party, and the device didn't work. The reason for this was a
bug in the clock initialization of these devices, which failed on certain
days due to leap years, and the 31st of December 2008 was just such a day on which
the clock initialization didn't work. Fortunately, on the next day, the 1st of
January 2009, the bug didn't show up anymore because the year had changed, and
the devices started working again. But nevertheless it brought some not-so-good press for Microsoft.
Okay, let's look at another example, which happened around the same time and is
also about mobile devices. This time we're talking about a device from Research In Motion,
the Storm, which was introduced at the end of 2008 and also suffered from
some bugs which many users complained about. And what's perhaps even more interesting than
those bugs is the reaction of the co-CEO Jim Balsillie, who you see here. He said in
an interview that buggy smartphone
software is just the new reality. Which mainly means you cannot do anything about
this, okay? Because there's so much time pressure and [INDISTINCT] of money, you just have to
accept that there will be bugs in your software. Perhaps over time, when you release fixes,
these bugs [INDISTINCT]. And of course this is not a position that we want to take -- we want to
do something about the bugs, not just say that's something we cannot change.
Okay. So let's leave the area of mobile devices and switch to some other area, and
that's about security, and about bugs in operating systems, systems software and applications,
which can be exploited by attackers, for example on the Web, or in other contexts --
in cyber war, for example, such things are also exploited.
The example that I want to show here happened at the end of 2009 and the
beginning of 2010. It was later named Operation Aurora, and it was caused by a zero-day vulnerability
in Internet Explorer -- so there was a security hole in Microsoft Internet
Explorer that was exploited by presumably Chinese hackers to break into computer systems,
among them systems from Google, and they tried to get unauthorized
access to intellectual property from Google. They also hacked some
Gmail accounts of Chinese human rights activists. The reaction of Google afterwards
was to say that what was going on here was a highly sophisticated and coordinated attack.
But the main point here for me is that this was ultimately
caused by a very small error in Internet Explorer. There was just one error behind
this whole thing. So a very small error in one program can cause a very large effect.
And besides Google, other companies were affected as well -- it was said that a total of 35
companies from all different kinds of industry sectors were affected by this attack.
Okay. So this is another example of just a small error which has
a large influence. And this is not a very special case; it is something
which is quite general, as you can see from this figure here.
So, this figure shows the number of vulnerabilities which were caused by buffer overflows. A
buffer overflow is just one particular kind of software error which can be present in a program.
What's shown here is the number of software errors,
or vulnerabilities in software, for which [INDISTINCT] was known and which were
caused by buffer overflow errors. And you see the numbers are quite high, and
they are quite stable: in 2008 you had more than 500 such vulnerabilities
caused by buffer overflow problems, and this number is not really going down very fast.
So it seems that the situation, and how we deal with such defects, is
not getting much better over time. Okay, so what else? One thing which I found quite interesting
was a message which was posted on an Internet forum, the LWN.net forum.
It was posted by some user in 2005, in reaction to a buffer overflow error in the
zlib compression library. And here is what he wrote -- JoeBuck was the alias
that he was using on this forum. He said, "We should have been done with zlib bugs long
ago," because, he says, this is old software, it's 10 or 15 years old. And he continued,
"It should be possible to analyze the code and prove that there are no remaining buffer
overflows." And that's exactly the direction that we also want to follow. Not like
what you have seen before with RIM, for example, just saying this is the new reality -- we
want to do something about it, along these lines here. And what's also interesting is
that he continued, "Too bad summer's over, it would have been nice to ask Google to sponsor
something like that." Perhaps we will have the chance to meet Joe Buck in the future.
Okay, so this closes my list of examples, a more or less random selection of some
problems in software and the effects that they caused. So we now have these defects
in our software; the question is how do we deal with them, and how perhaps should we deal
with them? Let's compare the reality of how they are handled with our vision of
how they perhaps could be handled in the future. So in this picture you have your
nice software, which consists of different modules like this one here, but unfortunately
after a while this software still has some bugs. You see them here -- the
reality is they are still there in the software in some places. So what to do about those
defects now? Typically, you will start to write some test cases and run those test
cases on the software. This means, in some way, that you go with your armory,
your machinery for dealing with bugs, and you just start looking
at some very particular points, shoot your arrows at those points, and see what happens.
And in this example -- that's a bit hard to see -- there was no bug at this place, so
you just think, okay, my software is correct for this particular example, so let's continue
with some other test cases, some other areas of the software to look into.
In some cases you might be lucky, like here, where the arrow hit the bug and you get rid
of it, but with others you are not so lucky. Finally, of course, you will
not know whether there are still bugs left in your software. You might reduce
the number by writing thorough tests, but there might still be something left.
>> [INDISTINCT] that testing can only show the presence of bugs but never the absence?
>> SINZ: Yes, from Dijkstra. This goes in the same direction. So of course, what we would
like much more is some kind of device or oracle which
could tell us exactly where the bugs are -- which places we have to look at, or which test
cases we have to write to find our bugs. Or you can also think of it like some kind of
scanner: the scanner just goes over your software once, like a scanner
at the airport, like this one here, and then it points out all the errors in your
software [INDISTINCT]. And then you can just fix these pinpointed errors in
the program. Now, this is how we think of our tool -- that it works in some way like the scanner
that you have just seen. Okay, so let's now turn to the method that we propose as
such a scanner, and this method is called Bounded Model Checking -- yes.
>> One question for you. So, how do you define a bug in that context? I mean, bugs are very
hard to determine, and generally it's surely undecidable [INDISTINCT] except for the program
inside [INDISTINCT].
>> SINZ: Yes, of course -- as we've said, there are
very different notions, so we mainly think of, let's say, obvious programming errors
like these buffer overflows which you've seen before. A buffer overflow is, for example,
something which you can check without writing any specification, and integer overflows would
be another such class, or division by zero -- I will show some examples.
>> So certain cases where you run into undefined behavior in some form, something like that?
>> SINZ: Yes, undefined behavior, or behavior which makes your program crash, or just some, you
know, maybe illegal things that happen in your program. So this is very much
on the source code level, of course -- it's not a high-level error-finding tool for designs,
for example, or for improving designs, but a very low-level tool. That's
also why we have chosen the name Low-Level Bounded Model Checker for our tool. You will
see a bit more about this type of [INDISTINCT]. Okay, but let's turn now to the technique
that we want to use, this technique of bounded model checking. Let's start with
the underlying ideas of this technique. Perhaps the most important idea is that, with this
technique, we want to analyze all paths of a program and all inputs of a program at the
same time. Compare this with a test case, where you just go through one program
path, using one particular set of input values -- exactly one value for each
variable -- and then you run it through the program. We don't do this;
we are going to check all paths and all inputs at once. And then there
is the restriction you see here: not all arbitrary paths, but only paths up to a fixed
length are what we can check with our tool. And what does it
mean, this fixed length? Fixed length in our case means that the number of loop iterations
is limited, and also the number of method or function calls which
can be made is limited in this approach. This is something which you give to the
tool before you start it. So you would say, analyze all loops up to
1,000 iterations, for example, or analyze function calls up to a nesting depth of 100,
or something like that. And then all paths which are within this limit will
be analyzed, but not the ones which are beyond it. And why are we doing this? The
problem is that checking properties of programs in general is undecidable, but if you have
finite paths, then this means that everything else also gets finite. If you have only finite
program paths, this means that during these finite runs, only a finite part of your
memory and of your variables can be looked at -- so all the data structures
will also be finite in that case, and checking properties about finite things is something
which is decidable. Okay, so this is the first idea: analyze all paths and all
inputs, with this restriction of a fixed limit on path length.
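To make the path-length bound concrete, here is a small hand-written sketch (the function names are my own, for illustration, not taken from LLBMC): bounding a loop at three iterations corresponds to unrolling the loop body three times under its guard, so every execution with at most three iterations is covered, and longer ones fall outside the bound.

```cpp
#include <cassert>

// Original loop: sums the first n elements of an array.
int sum(const int* a, int n) {
    int s = 0;
    for (int i = 0; i < n; ++i)
        s += a[i];
    return s;
}

// What a bound of 3 means conceptually: the loop body is copied three
// times, each copy guarded by the loop condition.  All paths taking at
// most three iterations are captured; a fourth iteration would be
// beyond the bound and is not analyzed.
int sum_bound3(const int* a, int n) {
    int s = 0;
    if (0 < n) { s += a[0];
        if (1 < n) { s += a[1];
            if (2 < n) { s += a[2]; }
        }
    }
    return s;
}
```

For every input with n up to 3 the two functions agree, so a bounded check with this limit proves properties for those runs, while saying nothing about longer ones.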
>> Can you say something more about the typical limits that you use?
>> SINZ: Yes -- well, it's very dependent on the program. During our tests we typically
used something like a hundred or a thousand iterations for loops.
But of course it depends: if you have, for example, one very large buffer
with hundreds of thousands of bytes, and you have one loop which goes over it
to initialize it to zero, then you get a much larger bound; but typically you can also
parameterize this, or just choose a smaller buffer for verification instead of
around 100,000 [INDISTINCT]. In my experience, for function call nesting an
efficient tool can use very low numbers of [INDISTINCT] -- two is often sufficient; three or four can
be used.
>> Well then, [INDISTINCT] you're talking about C programs here.
>> SINZ: Yes. I haven't mentioned this -- it will come on the next slide. It's mainly
C that we are looking at; a little bit, we have also recently started to target C++.
Okay. And mainly system software
or low-level software. Okay, the second main idea is to encode the whole program in a logic,
in a suitable logic, because that's what the solvers can work on. And what's
perhaps important in our case is that we want to transform our program into a stateless
program. Typically your program has a state which evolves over time -- let's say you have
some structure or a class which holds values, and you keep changing those values
over time -- and we try to reformulate this program into something which is stateless. So it
is a bit like a functional program, for example if you have done functional
programming. This makes it much easier to transform into logic. The techniques that
we're using are, for example, SSA form -- static single assignment form -- for variables,
which LLVM also already gives us; then we are using access
functions for memory; and also other techniques like loop unrolling
and function inlining, which help to achieve such a thing. Okay. So the next
main idea is to model the semantics of the individual instructions very precisely. This means
we don't just use an approximation of what an instruction does;
we model exactly the semantics of each instruction in our program, and we achieve
that by modeling everything on the bit level. So, for example, if you have a bit operation
in your program, like a bitwise OR or AND, or some shifting going on, then all this
is modeled on the bit level, so you get exactly which bits in your variables should be set to
one or zero. And this means that the logical encoding is simulating the hardware which
runs the program, which also works on the bit level, or on a gate level [INDISTINCT].
So you're going to a very low level in verification, too, which
might seem a bit strange because the formulas can get very large this way, but it
turns out that the solvers are able to handle this.
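As a small illustration of the stateless, SSA-style form just described (hand-written here; LLVM computes this form automatically): every reassignment gets a fresh name, and the join after a branch becomes an explicit selection.

```cpp
#include <cassert>

// Imperative original: x is a mutable cell whose state evolves.
int f(int a) {
    int x = a;
    x = x + 1;
    if (a > 0)
        x = x * 2;
    return x;
}

// SSA-style, stateless version: each assignment introduces a fresh
// name, and the if/join becomes a phi-like selection.  Every name is
// defined exactly once, so the translation to logic is direct: each
// line is simply an equation over the names.
int f_ssa(int a) {
    int x0 = a;
    int x1 = x0 + 1;
    int x2 = x1 * 2;
    int x3 = (a > 0) ? x2 : x1;  // phi(x2, x1) at the join point
    return x3;
}
```

Note that x2 is computed unconditionally here; in the logical encoding that is harmless, because the equation x2 = x1 * 2 simply holds whether or not the branch is taken, and x3 selects the relevant value.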
>> So, how do you define the memory model? Modern hardware usually has
weak memory models, which are underspecified, really. So what do you do about
that? Do you just assume some deterministic behavior, or...?
>> SINZ: No. Just as you can have nondeterministic behavior, for example,
for the program inputs, you can also have nondeterministic behavior of the memory. For
example -- yes. We have one particular memory model that's built
into our system, but it, for example, doesn't say anything about where memory blocks which
you get from [INDISTINCT], for example, [INDISTINCT]; it just specifies some non-overlapping
constraints for the different blocks [INDISTINCT] you get from allocation [INDISTINCT].
So it's constrained [INDISTINCT].
>> Can you deal with caching [INDISTINCT], things like that? So you know...
>> SINZ: Yes.
>> The synchronization for it is -- the synchronization of it is [INDISTINCT].
>> SINZ: That's something that we are not concentrating on now. We are just checking sequential
programs, not concurrent programs, and because of that we are also not looking at synchronization
[INDISTINCT]. So caching is also something [INDISTINCT].
>> So the low-level model is more like an allocation model?
>> SINZ: Yes, allocation model is perhaps a good term [INDISTINCT]. Okay. So now, if you have
this logic formula which comes out of the transformation of your program, then you use a
solver on it -- a SAT solver, for example, for propositional logic formulas, or an SMT
solver for a decidable but stronger logic in which you can also reason about arrays or
about bitvectors. We are using these solvers as well.
Let me just add two historical notes. This technique of bounded model checking has been
well established for quite some years now in hardware verification; almost all hardware
companies [INDISTINCT] are using the technique to verify circuit designs [INDISTINCT].
And then, around 2004, this technique for verifying hardware was carried over to the software area;
that was done by Clarke, Kroening and Lerda from Carnegie Mellon University. Okay. So
these were the general techniques, some ideas to give you a rough understanding of
how this technique works, and now we will turn to our tool, which is the LLBMC bounded model
checker. As you could already guess, LLBMC works on C programs and
also on some C++ programs -- we only recently started with
that -- and of course it tries to find bugs, or remaining bugs, in a program. And let's look
at some design goals. One goal was that we wanted our tool to handle all programming
language features. This means: with other tools, if you have written a C program, you often
have to change some constructs in the program just to make it run through the verification
tool, and that's something which we didn't want to have. Whatever C
program you throw at the tool, it should just support all language features -- bit
[INDISTINCT] things like that, or memory-based [INDISTINCT],
where you write [INDISTINCT] to memory and then read the same part of the memory as
an array of bytes, for example; such things can also occur. Okay, and to make this easier,
we are doing all verification not on the source level; we are instead using
a compiler intermediate language. We are using the intermediate language of
the LLVM compiler, and the advantage of using such an intermediate language is that it is much
simpler than a high-level programming language like C or C++. In particular, it just looks
like RISC assembler, and you don't have to deal with complicated things like, perhaps,
operator overloading, or some conversions which happen in the background without
being explicitly mentioned in the program. Okay. We also wanted it to be easy to use
for non-experts, so this means that you should not have to write large specifications, which
in some cases could even be larger than the program itself; it should be possible
to use it without any specifications, or with just very few specifications for your program.
And then we wanted to achieve high precision, and high precision in this case means that
we do not want to get false error reports -- or want to limit them as far as possible -- and we do
not want our tool to miss any errors.
>> But doesn't it miss errors by definition, if you're
only checking some finite depth?
>> SINZ: Yes, of course, yes. I think the general problem is that program
verification is in general undecidable,
so you have to make some restrictions. The question is how to make these restrictions
in the best way. There are of course many different possibilities for how you can restrict
the programs or the checks you do to make the problem decidable,
and I think that this technique of bounding paths is something
which makes a lot of sense in practice. Okay, so we are using these bit-precise techniques
here to avoid false error reports. And we want to provide a
comprehensive set of built-in checks; I will show you a bit more about these checks on
the next slide. And scalability is of course a problem with these logic-based tools,
because the underlying problems are of high theoretical complexity [INDISTINCT], so you have
to find some tricks to make it work also with larger programs. One thing is that
we are using these very advanced SMT solvers, for example, and the other thing is that we
also try to get rid of easy sub-problems with some preprocessing and
simplification procedures. Okay, so before I come to the built-in checks, I just want to
show this one slide here, which tells you a little bit about the internal schematic
of our tool. It shows the processing steps: on the left you
start with your source code program, then you run the LLVM compiler front end on that, and
what you get is the LLVM intermediate language [INDISTINCT], which is like a
RISC assembler representation, but still more structured than the usual [INDISTINCT]
that you have for [INDISTINCT]. Okay, so this is LLVM, and now LLBMC takes over
at this point. What it does first is unrolling loops and function
inlining -- this is mainly for handling this finiteness of paths --
and you get a transformed program. We have this transformed intermediate language, which
we then transform to a logic -- so we do the logical encoding -- and then we have a formula
in this particular logic of bitvectors with arrays. We then have some simplifications
which run on this formula, to make it smaller and to get rid of checks
which might be [INDISTINCT]. Then we pass the simplified formula to the SMT solver, and finally
we get the result: either the tool hasn't found any errors in the program,
or, if it has found an error, it will also produce an error trace which shows
the location of the error. Okay, let's turn to the checks that are built in, so without
you having to specify anything, you can use these checks. The first thing is that
we do checks for integer overflows: for example, if you add two large integers
and the result of this addition is not representable in the space of your integer
type, then there is an overflow. Here's another example which is perhaps a little bit harder
to see -- an overflow can also happen here, at this point: we have the negation
of x, and there is one case where this negation can lead to an overflow of signed integers.
We will show this afterwards in our demo, for which particular value that happens.
Another thing that we check for is divisions by zero -- integer divisions by zero.
Something like this here, for example: you have a statement where one variable, x,
is divided by another variable, y, and you wonder whether y can be zero in any case.
In some cases it can be relatively easy to see whether this can happen, but in other cases,
perhaps like this one here, it might not be so easy to see whether such a division by zero
can happen, and how to check for it. Another thing is invalid bit shifts, which is perhaps
a not-so-well-known problem. This is something which the C standard defines: if you have,
let's say, a 32-bit value and you shift it to the left by 32 bits or more, then the
result is undefined. You might expect that it's just filled in with zeros as bits are
shifted out, but that doesn't have to happen -- it's just undefined, and implementations
also do different things; typically the shift width is just taken modulo [INDISTINCT].
That's something we check for. Then we check for all kinds of illegal
memory accesses -- one [INDISTINCT] for example is [INDISTINCT], but also all kinds of
other memory accesses. What we are mainly doing is we check all reads and
writes to memory, no matter where they come from, and check whether they point into an
illegal area [INDISTINCT]. We're checking for invalid frees -- an invalid free would be
something where you free a pointer which doesn't come out of a previous allocation
instruction but is just some other pointer. We're checking for double frees, which is mainly
a particular case of an invalid free, where you free the same memory twice. And then we
have some further memory checks [INDISTINCT] that we have built into our tool, for example.
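To make three of these check classes concrete, here is a hand-written sketch of the conditions behind them (my own guard functions, not LLBMC code): each predicate is true exactly when the corresponding operation would be undefined, which is roughly the property the tool decides for all inputs at once.

```cpp
#include <cassert>
#include <climits>

// Signed addition overflow: detect it *before* performing the addition,
// since the overflowing signed addition itself would be undefined.
bool add_would_overflow(int a, int b) {
    if (b > 0) return a > INT_MAX - b;
    return a < INT_MIN - b;
}

// Integer division: undefined for y == 0, and also for the
// INT_MIN / -1 corner case, whose result INT_MAX + 1 is not
// representable in two's complement.
bool div_is_undefined(int x, int y) {
    return y == 0 || (x == INT_MIN && y == -1);
}

// Bit shift on a value of the given width: shifting by an amount
// greater than or equal to the width is undefined in C; implementations
// often just take the shift amount modulo the width.
bool shift_is_undefined(unsigned width, unsigned amount) {
    return amount >= width;
}
```

A bounded model checker proves such a predicate unreachable (or produces a witness input) without you writing the guards yourself; the sketch only spells out what is being checked.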
Okay. And at this point, I want to pass over to Florian, who will show a little bit about
our tool and how it works in practice [INDISTINCT].
>> MERZ: Great. Let's get started. I will start with the first example, this small
program, and what we want to show is what happens when we run it through our tool.
We will take the file abs.c and the function abs, and we want to check if any
of these built-in properties are violated. We compile from source
code to this LLVM bitcode file -- that's abs.c.bc -- and then we run our tool on
that file, and it does indeed find the bug, and it gives a short synopsis of what the error
is. It tells us what kind of error it was, it tells us on which assembly line it was, and
it tells us what the values were that were involved in this
particular instruction. In this case, we divided zero by minus -- well, the smallest number.
>> Subtract.
>> MERZ: Subtract -- yes. And what you can see here is the precise location at which this
error occurs, where this undefined value was calculated, and you can see that
it is the ret = -x case. What happens here is that this particular value given
to the [INDISTINCT] method is the smallest possible integer, and in this case we
tried to get the absolute value of that, and that is not representable in 32 bits. It would
be INT_MAX plus 1, because two's complement is not symmetric. And that has to show up
in the [INDISTINCT] somehow. So this is what happens: our tool essentially
virtually executed this method for all possible values of x -- that's 2 to the power of 32
different possible values. If you tried to run that manually and test every one of them, it
would take minutes, hours, I don't know, quite a long time actually. We're quite a lot
faster: we can do that in 0.007 seconds. These are quite small examples -- we picked these
small examples so they fit onto one screen -- but we can do a little bit more than that.
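The abs example, reconstructed from the talk (the demo's exact source isn't reproduced here, so the function names are my own): the naive version is undefined for exactly one of the 2^32 inputs.

```cpp
#include <cassert>
#include <climits>

// Reconstruction of the demo's function: naive absolute value.
// For x == INT_MIN, the negation -x would be INT_MAX + 1, which is not
// representable in 32-bit two's complement -- the single input out of
// 2^32 for which the tool reports a signed overflow.
int abs_naive(int x) {
    int ret;
    if (x < 0)
        ret = -x;      // undefined behavior when x == INT_MIN
    else
        ret = x;
    return ret;
}

// One possible fix: widen to a larger type before negating, so that
// INT_MAX + 1 is representable (assumes long long is wider than int).
long long abs_wide(int x) {
    long long v = x;
    return v < 0 ? -v : v;
}
```

The widened version illustrates one way to remove the counterexample; whether to widen, saturate, or reject INT_MIN is a design choice the tool leaves to you.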
The next example would be...
>> Could you show us what it reports if we write -x + 1?
>> MERZ: The code?
>> Just...
>> MERZ: What, this?
>> Yes.
>> [INDISTINCT]
>> MERZ: [INDISTINCT] I assume the numbers changed.
>> It's still the same one.
>> MERZ: It's the same -- yes, it's the same, actually.
>> [INDISTINCT]
>> MERZ: 1 - x? It's 1 -... [INDISTINCT]
>> [INDISTINCT]
>> MERZ: Subtract, 1 - x. So it's still [INDISTINCT]
>> You can play around with the different examples.
>> Maybe you can do it offline.
>> MERZ: Yes. Let's look at our second example. This is a simple division, and we wrapped it in
a function which takes x and y, so x and y can be any values, and you probably guessed
that this is also buggy... Yes. And you can see that, well, obviously it contains
a bug in that division line, and it tells us it was a division by zero, which
assembly line it is, on which line of C code it is, and what the values are. You can see
that y is zero, so obviously that's a division by zero. So what we can do to get
rid of that is catch this particular bug -- version one is still in that file for
comparison, number two is the same one -- and because, you know, division by zero is not defined,
in this case we tried to approximate it by INT_MAX, simply because that's the closest
we can get to any kind of result. Obviously it's a very poor replacement, but better than
crashing. So, we run the tool again, and still we get a bug. And
this one's not that expected: the bug this time is that x is a very large negative
number and y is minus 1, and what happens is very close to the replacement we just made.
If you divide that by minus 1, which is taking, you know, the symmetric value with positive
sign, that is again undefined, so what we do here is...
>> [INDISTINCT] overflow error [INDISTINCT].
>> MERZ: Yes, [INDISTINCT] this is not a division by zero and...
>> Can you say that y will never be zero?
>> SINZ: You can add something which is called an assumption, so you can [INDISTINCT]
assumptions in LLBMC [INDISTINCT].
>> MERZ: Yes. The way we fixed this here is to patch the method by, you know, taking
that case again -- and INT_MAX is the closest one we can get to INT_MAX plus 1, so we take
that one -- and we can verify that one. This time it tells us that no error was found
[INDISTINCT]. For the last example -- because we know you are interested in C++ more than
in C code -- we tried to implement a little bit of C++ support. And we've had about a week
for that. And we got this example. Can anyone see what happens in this case, and why this
will produce a bug?
>> [INDISTINCT] missing virtual destructor?
>> [INDISTINCT]
>> MERZ: And we can actually see that it does indeed detect this particular missing
virtual destructor. What happens is that this cast up here, from derived to base, does
pointer arithmetic, so the pointer o points into the middle of this derived
class and not at the beginning of it. If you then call delete on this -- lacking a
virtual destructor, calling delete is essentially a free -- what we see is a free of a
pointer which was never allocated directly, which points into the middle of an allocated
block, and that is illegal. And what we do is, we don't just statically check for this
missing virtual destructor, but we actually simulate execution of that code, and we
catch that particular example when this does actually happen. Yes. So, these were the
first examples.
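A hedged reconstruction of the C++ situation just described (the demo's exact code isn't shown, so the class names are my own): with multiple inheritance, converting a Derived pointer to its second base adjusts the address, so on typical ABIs it points into the middle of the allocated block; deleting through such a base without a virtual destructor then frees an interior pointer, which is the invalid free the tool reports.

```cpp
#include <cassert>

// Two bases without virtual destructors; Derived inherits from both.
struct A { int a; };
struct B { int b; };
struct Derived : A, B { int d; };

// Returns true if the upcast to the second base yields an interior
// pointer (it does on typical ABIs, since the B subobject sits after
// the A subobject inside Derived).
bool second_base_is_interior() {
    Derived* d = new Derived();
    B* bp = d;  // implicit upcast: address adjusted past the A subobject
    bool interior = static_cast<void*>(bp) != static_cast<void*>(d);
    // delete bp;  // without a virtual destructor in B, this would free
                   // the interior pointer: the invalid free LLBMC catches
    delete d;      // correct: free via the original pointer
    return interior;
}
```

Giving the bases virtual destructors makes deleting through a base pointer legal, because the runtime can then recover the original allocation address.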
>> Thank you very much. >> [INDISTINCT].
>> SINZ: This is just a small selection of these errors. We haven't shown you
anything, for example, about these memory accesses -- illegal memory accesses; perhaps
we can show you something later on, if we have a chance [INDISTINCT], if
you want [INDISTINCT]. Let's just do that later, after the talk. Okay, so I want to switch
now to a slightly different view. The question is now: if we have this tool,
how can you use it in your daily work? Typically, you will have test cases, and the
question is what to do with this tool -- I want to put up some ideas for how you can adjust
your test cases to make a tool like our LLBMC work [INDISTINCT]. Okay, so, what would
be a typical workflow with LLBMC? Of course, the first thing is that you can just
check for these low-level errors which are already built in [INDISTINCT]. You do,
more or less, nothing: you run this tool on your code and see what it outputs. Okay. This
might already find some of those errors, but there might be, let's say, perhaps more
interesting things that you will also want to check -- for example, some functional
correctness properties which are not covered by these low-level checks. The question is now, how
can you come to such functional correctness checks with this tool? As you see here
[INDISTINCT], I call it migrating from testing to verification, towards this systematic way of
bug finding. One possibility is just to take the functional correctness property and
code it a second time, in a totally different way; we will show
you an example of this. Then you have these two implementations, and you just compare
whether on all possible inputs they produce the same output. If this is the case,
then you can at least be a little bit more sure that your program is correct -- if those
two implementations are different enough. Another thing is something which is called parameterized
unit test, or also abstract testing, sort there are different names for this. And so
the name parameterized unit test comes from a Microsoft research paper from 2005 from
which has to do with the tech school for testing software. And they have used this name, this
notion of parameterized unit test there. And now, we've done something similar in the automotive
software checking. But, what is the idea to have this parameterized unit test? Particularly
in the unit test, you just have one particular case that you check. So, one for each variable
you fix, [INDISTINCT] you have to write brand new code with those variables and see what's
the outcome. In such a parameterized test, you will just take one variable or several
variables that now fix the value for those variables. Just leave it open and you want
the tool to check all those variables at once, or all these values for the variables at once.
This goes for example, instead of initializing it to 0, you might know that your program
expects [INDISTINCT] between 0 and 100 and you've just been--did such a range into one
programming checking for all. And the third possibility is to write a checker program.
And the idea here is that if you have a quite complicated algorithm to compute something, it might still be relatively easy to check whether its result is correct. What you can do then is to write a verifier, or checker program, which is obviously correct and can be relatively small. You have your original program, which computes some solution, and then you put them both together and check them together. This means we have a checker function which takes the input of our original problem and the output that your program computes. And then you check your original program together with this checker function and see whether the checker function always says, "Yes, this is the correct output for the respective input."
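One way to picture this checker-program idea is the following sketch (an invented example, not code from the talk): a bisection-based integer square root plays the role of the "complicated" program, and a two-line checker states when its answer is correct. The tool would then be asked whether `isqrt_ok(n, isqrt(n))` can ever be false.

```c
#include <assert.h>

/* The "complicated" program: integer square root by bisection.
   (Invented example; not code from the talk.) */
unsigned isqrt(unsigned n) {
    unsigned lo = 0, hi = 65535; /* the root of any 32-bit value fits in 16 bits */
    while (lo < hi) {
        unsigned mid = lo + (hi - lo + 1) / 2;
        if (mid * mid <= n) lo = mid; else hi = mid - 1;
    }
    return lo;
}

/* The small, obviously correct checker: r is the integer square root of n
   exactly when r*r <= n < (r+1)*(r+1). */
int isqrt_ok(unsigned n, unsigned r) {
    unsigned long long rr = r;
    return rr * rr <= n && (rr + 1) * (rr + 1) > n;
}
```

With a bounded model checker, n would be a nondeterministic input and `isqrt_ok(n, isqrt(n))` an assertion; in a plain test it can only be sampled.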
>> How can you make sure that your checker implements the [INDISTINCT]?
>> SINZ: Yes, of course, that's--I think that's the general problem if you want to do functional correctness checks, whether you are using requirements or other kinds of specifications that your program needs.
>> [INDISTINCT] you are likely to make the same mistakes as in the original [INDISTINCT] make sense to have this [INDISTINCT] at a logical, abstract level, and then the implementation in another language [INDISTINCT].
>> SINZ: There might be other alternatives, but what I had in mind, for example, is: if you are writing something like a logical solver, a SAT solver for example, then it's very hard to compute such a solution, but it is quite easy to check such a solution. And if you write such a checker program, it might be so trivial that you believe that it is correct [INDISTINCT]. That was the idea. Okay. So now it's time for a demo again, and this time Stephan will give a different demo. He will show two of those methods to go from testing to systematic verification. Wait a second. Now.
>> FALKE: This example essentially shows what Carsten just mentioned [INDISTINCT]: verifying an alternative definition of some function [INDISTINCT]. This example is a population count example: you want to figure out how many bits are set in any given bit vector X. And there are some clever ways to do that. This optimized popcount function just does a few bit shifts, logical and bitwise operations [INDISTINCT], and some arithmetic. Now, of course, we're interested to know whether this actually computes what it's supposed to compute, namely the number of set bits in X. One way you can do it is of course to do some unit tests, either [INDISTINCT] directly, but there are 2 to the 32 [INDISTINCT] possible inputs, so unit testing is going to be quite expensive. What you can do instead is define an easier version of the function that is also supposed to compute the population count, one that is so simple that you readily believe that it computes the population count of the input X; this is the reference popcount function. And now, to verify that the optimized function is doing what it's supposed to do, we compute the outputs of both of those functions and check whether the outputs are always the same [INDISTINCT].
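The setup just described can be sketched in C. The optimized body below is a classic bit-twiddling variant standing in for the code on the slide (which the transcript does not show); `agree` is the property a bounded model checker would be asked to prove for every 32-bit x at once.

```c
#include <assert.h>
#include <stdint.h>

/* Optimized popcount: a classic SWAR bit-twiddling variant
   (illustrative stand-in for the demo's code). */
uint32_t popcount_opt(uint32_t x) {
    x = x - ((x >> 1) & 0x55555555u);
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
    return (x * 0x01010101u) >> 24;
}

/* Obviously correct reference implementation: count bits one by one. */
uint32_t popcount_ref(uint32_t x) {
    uint32_t n = 0;
    while (x) { n += x & 1u; x >>= 1; }
    return n;
}

/* Equivalence property checked by the driver. */
int agree(uint32_t x) {
    return popcount_opt(x) == popcount_ref(x);
}
```

With LLBMC, x would be left as a nondeterministic input and `agree(x)` asserted; a plain test can only enumerate or sample the input space.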
And this is done in this driver method: it computes the return value of the optimized implementation and of the reference implementation [INDISTINCT], and then asserts that they are always the same. And then we ask the tool, which is a bounded model checking tool, to find an input X for which the two implementations give different return values. And if it finds such an X [INDISTINCT], then there is a bug [INDISTINCT] in the optimized version. So let's run the tool on this example [INDISTINCT]. Okay. It actually takes a little bit longer now, but it's still much faster than unit testing, because essentially it still has to cover the 2 to the power of 32 [INDISTINCT] many inputs, and it takes a bit more than five seconds. Unit testing would probably take more like an hour or so. And since there is no disagreement between the reference implementation and the optimized implementation, we can conclude [INDISTINCT] that the optimized implementation is actually correct. Okay. The next example I'm going to show is the second method that Carsten just mentioned. That will essentially be a generalized
unit test. We are given a function which is supposed to compute the next larger number, from a given input X, which is a power of two. So, for example, if you give the function [INDISTINCT] 8,000, then the return value should be 8,192, since that is the next power of two. And it again does some clever shifting and bitwise operations, so it's not really obvious that it's working. You could again do unit tests: for example, you could just pass in that number [INDISTINCT], see what the output is, and then [INDISTINCT] check that it's always computing the right [INDISTINCT] result for that input. If you want to be a little more generic than this, you don't want to just test one input, you want to test a [INDISTINCT] range of inputs. And this we can do using this assume statement. This driver method essentially just has assume statements saying that X should be bigger than 8,192 and should be smaller than 16,384. And for all values within that range, [INDISTINCT] npo2 should always give 16,384, since that is the next power of two. So it's more general than unit testing in the sense that it covers a whole range of inputs, not just one input. So, let's see whether the implementation is correct with this generalized test case.
And again, it's quite fast. It doesn't find an error, meaning that for all the possible inputs [INDISTINCT] it always computes the right output, which is nice. But of course it could be even more general than that. In this parametric test case [INDISTINCT] we specify what the output for X should be [INDISTINCT], because now we only assume that X is non-negative and not too big, so that we don't have an overflow. [INDISTINCT]. Under that assumption [INDISTINCT], we now check for all X that are non-negative and not too big that npo2 computes the next larger power of two. And we do that by asserting [INDISTINCT] that the return value is not smaller than the input, and [INDISTINCT] that it is indeed a power of two, which we can also specify using assertions. So now we are checking 2 to the [INDISTINCT] different inputs, and we want to see whether the method is correct for all of them. And again, it doesn't take very long. We don't find a bug, meaning that for all of those inputs the function [INDISTINCT] always computes the correct output. And I guess that actually concludes the second demo.
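The two drivers from this demo can be sketched as follows. The npo2 body is the usual bit-smearing trick, a stand-in for the slide's code (note this variant maps an exact power of two to itself), and the tool's assume/assert pairs are emulated here by exhaustive loops so the sketch is runnable:

```c
#include <assert.h>
#include <stdint.h>

/* Next power of two by bit smearing (illustrative stand-in). */
uint32_t npo2(uint32_t x) {
    x--;
    x |= x >> 1;  x |= x >> 2;  x |= x >> 4;
    x |= x >> 8;  x |= x >> 16;
    return x + 1;
}

int is_pow2(uint32_t v) { return v != 0 && (v & (v - 1)) == 0; }

/* First driver: assume(8192 < x && x < 16384); assert(npo2(x) == 16384).
   The loop stands in for the tool's nondeterministic x. */
int range_test(void) {
    for (uint32_t x = 8193; x < 16384; ++x)
        if (npo2(x) != 16384) return 0;
    return 1;
}

/* Second driver: assume x is positive and small enough that no overflow
   occurs (x <= 2^31), then assert the result is a power of two,
   at least x, and minimal. */
int property_test(uint32_t x) {
    uint32_t r = npo2(x);
    return is_pow2(r) && r >= x && (r == 1 || r / 2 < x);
}
```
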
>> Yes, continue. >> Thank you.
>> SINZ: Okay. So you've seen these examples from our tool so far, but those have all been quite small examples, just to show you what the possible checks of this tool are. Now I want to show some first results that we got from this tool as it is ready for use. We haven't done extensive testing, but nevertheless I have two results here which are a bit larger. The first example that we checked was a function blit, or a program which is called blit, and that's from an embedded systems benchmark suite. This suite is called Powerstone, because it is used for power consumption testing. And that's a bit larger: it has 96 lines of code, but you have to set quite high bounds here. You have to set a loop unrolling bound of a hundred or even a thousand, because those loops are executed many times, and a function inlining limit of two. And then you can detect an illegal bit shift in this benchmark, which presumably wasn't found before by static analysis [INDISTINCT]. This is one of the more strange things that we try to detect, but as I mentioned already: if you shift a 32-bit value by 32 or more bit positions to the left, then the result is undefined. So this would be undefined because [INDISTINCT] this value gets to 44, so you have a shift by 44, which is just undefined. And this took quite some time: with this unrolling it took a bit more than one and a half minutes to find this error here on my computer, an ordinary laptop. And another example
where we found an error was in a routine for MPEG2 video decoding. That's a decoder which is quite some years old, and it was written by the MPEG Software Simulation Group. We took a look at some parts of this decoder, in particular the inverse discrete cosine transform [INDISTINCT], which does heavy arithmetic to transform these blocks of video. The whole decoder consists of a bit more than 10,000 lines of C code, and the parameterized unit test that we wrote for checking this discrete cosine transform consisted of 766 lines of code, including [INDISTINCT]. And what did this routine do? It took as input an eight-by-eight block of shorts [INDISTINCT], passed it to this inverse discrete cosine transform, and checked whether this routine was correct. And we were just looking for these low-level bugs, the built-in checks that we have there. We set a loop unrolling limit of 1,024 and a function inlining limit of 2, and we detected an array-out-of-bounds error in this program. This occurred after heavy arithmetic had computed some particular value that was used as an index into an array, and this index was just too high for the array. But it was not really obvious to see from the code directly whether such values could be computed [INDISTINCT]. This error was found in seven minutes on this platform here. Okay. So, these were two a little
bit larger examples of what our tool can do. So I'm approaching the end of my talk, and I want to summarize now what we have shown you today. It was mainly about this bug finding technique, bounded model checking, and about our tool for this technique, which is LLBMC. You've seen that LLBMC can verify C programs of, I would say, up to several thousand lines of code; this example was relatively arithmetic-heavy, and in other cases it should scale up to a bit more. So, say, several thousand lines of code are possible with this tool, and it's especially suited for low-level system code. Earlier work has already shown that it works quite well, for example in those case studies which we have done with this general method of bounded model checking of software, also with our tool. This is some work that we have done in the past: for example, in various parts of the Linux kernel we have found some errors with this technique. Also, equivalence checking of cryptographic routines, where we compared two [INDISTINCT] implementations. And we are also using this technique in the embedded systems, automotive domain, to functionally verify certain devices in a car. Okay, so this is the state now. And if we look at the future,
what is still missing, or what we would like to do in the future: something which is still missing for supporting all of C and all of C++ is floating point arithmetic, which is conceptually not that hard to implement, but it takes some time. The other thing is support for more complex or complicated languages than C, like C++ or perhaps [INDISTINCT]. Something that also has to be done in the future is to provide specifications for widely used library functions--for example, the C functions that read from files or write data to files. That's something which we have not yet covered with our tool and which we have to do in the future, and of course, performance improvements are always good to have. So that's what we are planning to work on in the future. But I think that this technique is already quite usable for many programs, especially if you are using techniques like generalizing a test case, for example. In that case you typically do not need to model the whole program that you want to verify, but just a small function, and then it might be sufficient. Okay. So, this concludes my talk, our talk. If you want further information, you can either get it from our research group's website here [INDISTINCT] for verification, or directly from the website for our tool, and we are planning to make this tool available in April this year. So, [INDISTINCT]. Thank you.
>> So I guess I have to ask the main question that is always asked in talks about model checking, which is: what do you do about higher-order [INDISTINCT] programs? Like, as soon as you get function pointers [INDISTINCT] or objects?
>> SINZ: So, we can handle function pointers with our tool, but the technique to handle them is mainly to inline all the possible functions that could occur at a certain place. So if you have a function pointer, you try to find out the set of possible functions that might be called at that place.
>> So you basically defunctionalize the program and then do the first-order [INDISTINCT].
>> SINZ: Yes. Then you can just try to make it a bit more scalable [INDISTINCT].
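The inlining strategy described in this exchange amounts to defunctionalization: the indirect call is replaced by a switch over the candidate functions, each of which the checker can then inline and analyze case by case. A minimal sketch (all names invented, not LLBMC's actual transformation output):

```c
#include <assert.h>

int inc(int x) { return x + 1; }
int dbl(int x) { return 2 * x; }

/* Original code: an indirect call through a function pointer. */
int apply(int (*f)(int), int x) { return f(x); }

/* Defunctionalized version: the pointer is replaced by a tag, and the
   indirect call by a switch over the candidate functions. */
enum fn_tag { FN_INC, FN_DBL };

int apply_defun(enum fn_tag tag, int x) {
    switch (tag) {
    case FN_INC: return inc(x);
    case FN_DBL: return dbl(x);
    }
    return 0; /* unreachable if the candidate set is complete */
}
```

The precision question discussed below is exactly which tags belong in the enum: taking every function with a matching signature is sound but over-approximate.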
>> How does that work together with your parameterized unit tests? So you kind of parameterize over function pointers [INDISTINCT]?
>> SINZ: I think that in principle you don't have to fix it to one function in your program. So [INDISTINCT] you could just make a big switch statement within your program which switches between different functions, but you have to specify which particular functions these are.
>> Right. But you need to know the whole program at that point, right?
>> SINZ: Yes, that's right.
>> Which means you can't really do the unit testing, the unit checking, then.
>> SINZ: I don't know, but I would say that you need the whole source code. So you cannot assume, for example, that your function pointer points to something external.
>> That's what I'm talking about. If I want to check a [INDISTINCT], it's not defined what the whole source code is, right?
>> So, for example, a method that finds a fixed point of a function: it takes a function as a parameter, and there is no reasonable bound on these functions.
>> I mean that's an opportunity [INDISTINCT].
>> SINZ: Yes. [INDISTINCT] if you do not know anything about the functions, then you cannot handle this with our method.
>> But it is the same with unbounded data structures [INDISTINCT], right? So if you pass a tree parameter [INDISTINCT] and compute something on that, then you can't just check all trees that are possible.
>> SINZ: Yes, that's also the case, but I would say what is most important here is that you are bounding your program traces, and this also means that you bound your data structures. So this means you, for example, check all trees up to a size of 100 nodes or 1,000 nodes or something like that. Then you can say something about these. Many errors already show up when limited to smaller examples. That's the idea behind this technique and why it is expected to work: errors also show up in smaller examples.
>> So in theory, obviously, you don't have false positives, but in practice you probably do, because there is other code that just doesn't care about some things. Say you write a unit test and you pull out another function that has two parameters and you divide, and you just know that in your code [INDISTINCT] you never divide by zero. Have you ever, over arbitrary examples, looked at how high your false positive rate is?
>> SINZ: No. As I said, this is relatively new, so we haven't done intensive testing. But what I can say about false positives is this: if you have everything together, if you have the whole source code, and if the bounds are sufficient, then this method is complete and will not show any false positives, as long as we have the whole program here [INDISTINCT]. But for larger examples, that's often not the case. Typically you have library calls, for example, and for these library calls you do not want to include the whole source code of the [INDISTINCT], and then you would just give specifications or some approximation of what this function does. And as soon as you are using those [INDISTINCT] approximations, you also can get some false error reports. So I would say that the method in general is complete and doesn't produce such false errors, but in practical cases, as soon as you have to use library functions, such cases can happen. But we hope that it's still better than with other techniques like abstract interpretation, for example, which can introduce even coarser over-approximations.
>> That would be an interesting thing, because most tools that I have seen that find bugs actually have the problem of too many false positives. In the end it just takes too much time to dig through all those reports to actually find the real bugs.
>> SINZ: Yes, but I would say that this is an advantage of this bounded model checking method, that it should be better than, for example, this [INDISTINCT] of computing ranges for variables. Using these precise techniques, we can really often avoid this kind of too-coarse approximation in checking. But I also know of studies which compare different verification tools, also the [INDISTINCT] tools, and the false positive rates can be really quite high among these tools. We would have to make actually more sophisticated [INDISTINCT] in practice. In the cases that I've shown you on the last slide, for example, [INDISTINCT] this didn't turn out to be such a big problem. Since you mentioned it: we also did, together with Bosch, one kind of research project where we tried to combine abstract interpretation--that was a commercial tool that they are already using at Bosch, which is called Polyspace, and which produces quite some number of such false [INDISTINCT] positives--with our method. We did it in such a way that we first ran this less precise abstract interpretation method, and then checked the possible errors that this tool reported again with our tool. We could get rid of, I think, more than 70% of all those errors that were reported by the initial tool. [INDISTINCT].
>> You said that you spent about a week getting the basic C++ support in there. So how much time do you think it would take to support all, or let's say most, of C++?
>> SINZ: Yes, it's always hard to guess. Not decades, I hope. I think in a year or so you could really do quality work on that. [INDISTINCT], maybe you could say more on what the main points are that would have to be done?
>> MERZ: Getting it running in general could be achieved quite quickly, because on the LLVM level there's not that much of a difference between C++ code and C code; it's all just assembler. But the first larger change that we had to do was to support function pointers. And what we do right now is to inline all possible functions that would fit, for example. We over-approximate and inline a lot more functions than would actually be necessary. So what we'd have to do in that case is to go back and look which functions it could actually be, and that would most of the time be the ones of the derived classes, for example; that would actually be the overriding functions of the same base function. We don't do that right now; we just take all functions that have a matching signature. So to get that information, which is completely lost on the LLVM level, we'd have to spend some more work on it. But it's not like this is a theoretical problem; it's more an engineering problem that we just have to do and implement.
>> Isn't there also a theoretical problem
in the sense that the search space explodes quickly when you have an object-oriented program, because of all the functions you might call?
>> MERZ: That's especially a problem because we take in all of the functions that roughly match; that's a lot more, and it's the over-approximation that makes it inefficient. It runs--you can use it--but it won't be efficient. To get it efficient and to reduce that over-approximation to exactly those functions that can actually be called in that particular context, to reduce it to that number, would be a little bit of work.
>> So, you can--I suppose you can restrict it by looking at the type of a function. Can you restrict it by looking at which class it is defined in?
>> MERZ: You can, for example, just see what happens when that particular function is called. You do a second search at the location where it is called, and instead of calling the function you say, "Oh no, let's just stop here and tell us: what was the value of the function pointer at this particular place?" Then you can look up which function that value corresponds to, take note of it, and add it to the list of functions that could be called at this particular point. And you can cache that information for later on, so you can say, for example: in this function, in this line, I'm calling one of this set of functions. So you could do things like that to reduce the number of possible functions that can be called.
>> But you're still--going back to my earlier point--you're still bound to your whole-program approach, right? You can't get around that?
>> SINZ: We'd have to generate all possible function definitions [INDISTINCT].
>> So, like checking an object against the library in that way.
>> SINZ: That is not something that is the intention of this. I mean, you would really have to write some program which uses the library and then test the whole program [INDISTINCT].
>> MERZ: What you'd do in the unit test is actually create the memory of the object and initialize its vtable and all that stuff. And instead of having one specific object at that location, you could say, "Okay, this could be one of these objects: of this class, or of this class." You get some kind of nondeterminism [INDISTINCT] at that point, but you're still restricted to subsets. So it's largely a unit test, but a little bit more general, and that reduces the complexity. But if you just say it's some object and I don't know anything about it, then, well, yes, most likely it will be too hard to solve.
>> Maybe there's a solution: maybe you need to provide some method of putting metadata on function declarations to tell the tool the [INDISTINCT] necessary information [INDISTINCT], like [INDISTINCT]; that might help. Then you can do verification [INDISTINCT].
>> MERZ: I think even if you add this kind of annotation to the function itself--let's say, okay, this function is always called with a correct object of this, of this, of this type--well, you simply create that object. So you have some kind of test driver function that calls the function you want to check, and that builds up--let's say, for example, you have some kind of tree data structure with different kinds of classes in that tree--you build up a small tree that fits the kind of tree that you want to check. But you don't do it explicitly, like node number five is of this class; you can say node number five is of one of these classes. You create a nondeterministic [INDISTINCT], and then you pass that nondeterministic tree into the function, and you know that it will later be restricted to these kinds of trees. So I think that's the best approach you can take with this: take a unit test and make it more general, more and more general, until you hit that kind of performance bound, until you can't make it more general without making it inefficient. And I think that's the best way to go: to start off with something very specific and make it more general as far as possible. You will cover lots more different cases with that.
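In plain C terms, the "node number five is of one of these classes" idea can be mimicked with a function-pointer table and a selector that is left open. A minimal sketch (all names invented; the exhaustive loop stands in for the tool's nondeterministic choice of class):

```c
#include <assert.h>

/* A miniature "class hierarchy": one vtable slot, two "classes". */
typedef struct { int (*area)(int); } shape_vtable;

int square_area(int s)   { return s * s; }
int triangle_area(int s) { return s * s / 2; }

static const shape_vtable vtables[] = {
    { square_area },   /* "class" 0 */
    { triangle_area }, /* "class" 1 */
};

/* Property under test: must hold for any shape the object might be. */
int area_nonneg(const shape_vtable *v, int s) {
    return v->area(s) >= 0;
}

/* Generalized unit test: instead of fixing one object, check the property
   for every class the object could belong to. */
int test_all_classes(int s) {
    for (unsigned i = 0; i < sizeof vtables / sizeof vtables[0]; ++i)
        if (!area_nonneg(&vtables[i], s)) return 0;
    return 1;
}
```

As in the discussion above, the precision (and cost) of the check depends on how tightly the candidate set in `vtables` is restricted.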
>> So, are you looking into running your tool on your tool's own source code? I mean, that's what most compiler people do with their compilers.
>> [INDISTINCT]
>> MERZ: You could get it running for the more basic data types that we use quite quickly, actually. It would take, let's say, another one or two weeks [INDISTINCT] just to get it to do it.
>> [INDISTINCT] we have checked that.
>> MERZ: The problem is we use C++, and we only added C++ support for testing in the last weeks. [INDISTINCT]
>> [INDISTINCT].
>> [INDISTINCT] limited complexity basic data types? [INDISTINCT]
>> About this compression library or?
>> Yes.
>> SINZ: Okay. We are looking at some specific part of that, for example, to check the checksum computation [INDISTINCT], and that was mainly because of library functions [INDISTINCT]. But these checksum computations, for example, that's something that you can check. We also have some tests which we can show you later. Thank you.