>> SINZ: Welcome, everybody, and thanks for inviting us here for this talk. This talk will
be about a technique for bug finding. This technique is called Bounded Model
Checking, and we will also present our tool which implements this technique.
Our tool is called LLBMC, which stands for Low-Level Bounded Model Checker. Okay,
let's start with a very general problem: the battle against software bugs. I want to
give some examples of how software bugs showed up in realistic settings, and what
some people have suggested for how to deal with these software bugs. So let me start back at the
end of 2008. This example is about mobile devices, more particularly about the Microsoft
media player Zune. A strange thing happened on the 31st of December 2008, when
suddenly all of those 30-gigabyte Zune devices stopped working. If you switched
the device off, you could not start it again. That was of course a
problem, because many people wanted to use their devices on New Year's Eve to play
some music at their party, and the device didn't work. The reason for this was a
bug in the clock initialization of these devices, which failed on certain
days due to leap years, and the 31st of December 2008 was just such a day on which
the clock initialization didn't work. Fortunately, on the next day, the 1st of
January 2009, the bug didn't show up anymore because the year had changed, and
the devices started working again. But nevertheless it brought some not-so-good press for Microsoft.
Okay, let's look at another example, which happened around the same time and is
also about mobile devices. This time we're talking about a device from Research In Motion,
the Storm, which was introduced at the end of 2008 and also suffered from
some bugs which many users complained about. And what's perhaps even more interesting than
those bugs is the reaction of the co-CEO Jim Balsillie, who you see here. He said in
an interview that buggy smartphone
software is just the new reality. Which mainly means you cannot do anything about
this, okay? Because there's so much time pressure and [INDISTINCT] of money, you just have to
accept that there will be bugs in your software. Perhaps over time, when you release fixes,
these bugs [INDISTINCT]. And of course this is not a position that we want to take -- we want to
do something about the bugs, not just say that's something we cannot change.
Okay. So let's leave the area of mobile devices and switch to some other area, and
that's about security, and about bugs in operating systems, systems software and applications,
which can be exploited by attackers, for example on the Web, or in other contexts --
in cyber war, for example, such things are also exploited.
The example that I want to show here happened at the end of 2009 and the
beginning of 2010. It was later named Operation Aurora, and it was caused by a zero-day vulnerability
in Internet Explorer -- so there was a security hole in Microsoft Internet
Explorer that was exploited by presumably Chinese hackers to break into computer systems,
among them systems from Google, and they tried to get unauthorized
access to intellectual property from Google. They also hacked some
Gmail accounts of Chinese human rights activists. The reaction of Google afterwards
was to say that what was going on here was a highly sophisticated and coordinated attack.
But the main point here for me is that this was ultimately
caused by a very small error in Internet Explorer. There was just one error behind
this whole thing. So a very small error in one program can cause a very large effect.
And besides Google, other companies were affected as well -- it was said that a total of 35
companies from all different kinds of industry sectors were affected by this attack.
Okay. So this is another example of just a small error which has
a large influence. And this is not a very special case; it is something
which is quite general, as you can see from this figure here.
So, this figure shows the number of vulnerabilities which were caused by buffer overflows. A
buffer overflow is just one particular kind of software error which can be present in a program.
What's shown here is the number of software errors,
or vulnerabilities in software, for which [INDISTINCT] was known and which were
caused by buffer overflow errors. And you see the numbers are quite high, and
they are quite stable: in 2008 you had more than 500 such vulnerabilities
caused by buffer overflow problems, and this number is not really going down very fast.
So it seems that the situation, and how we deal with such defects, is
not getting much better over time. Okay, so what else? One thing which I found quite interesting
was a message which was posted on an Internet forum, the LWN.net forum.
It was posted by some user in 2005, in reaction to a buffer overflow error in the
zlib compression library. And here is what he wrote -- JoeBuck was the alias
that he was using on this forum. He said, "We should have been done with zlib bugs long
ago," because, he says, this is old software, it's 10 or 15 years old. And he continued,
"It should be possible to analyze the code and prove that there are no remaining buffer
overflows." And that's exactly the direction that we also want to follow. Not like
what you have seen before with RIM, for example, just saying this is the new reality -- we
want to do something about it, along these lines here. And what's also interesting is
that he continued, "Too bad summer's over, it would have been nice to ask Google to sponsor
something like that." Perhaps we will have the chance to meet Joe Buck in the future.
Okay, so this closes my list of examples, a more or less random selection of some
problems in software and the effects that they caused. So we now have these defects
in our software; the question is how do we deal with them, and how perhaps should we deal
with them? Let's compare the reality of how they are handled with our vision of
how they perhaps could be handled in the future. So in this picture you have your
nice software, which consists of different modules like this one here, but unfortunately
after a while this software still has some bugs. You see them here -- the
reality is they are still there in the software in some places. So what to do about those
defects now? Typically, you will start to write some test cases and run those test
cases on the software. This means, in some way, that you go with your armory,
your machinery for dealing with bugs, and you just start looking
at some very particular points, shoot your arrows at those points, and see what happens.
And in this example -- that's a bit hard to see -- there was no bug at this place, so
you just think, okay, my software is correct for this particular example, so let's continue
with some other test cases, some other areas of the software to look into.
In some cases you might be lucky, like here, where the arrow hit the bug and you get rid
of it, but with others you are not so lucky. Finally, of course, you will
not know whether there are still bugs left in your software. You might reduce
the number by writing thorough tests, but there might still be something left.
>> [INDISTINCT] that testing can only show the presence of bugs but never the absence?
>> SINZ: Yes, from Dijkstra. This goes in the same direction. So of course, what we would
like much more is some kind of device or oracle which
could tell us exactly where the bugs are -- which places we have to look at, or which test
cases we have to write to find our bugs. Or you can also think of it like some kind of
scanner: the scanner just goes over your software once, like a scanner
at the airport, like this one here, and then it points out all the errors in your
software [INDISTINCT]. And then you can just fix these pinpointed errors in
the program. Now, this is how we think of our tool -- that it works in some way like the scanner
that you have just seen. Okay, so let's now turn to the method that we propose as
such a scanner, and this method is called Bounded Model Checking -- yes.
>> One question for you. So, how do you define a bug in that context? I mean, bugs are very
hard to determine, and generally it's surely undecidable [INDISTINCT] except for the program
inside [INDISTINCT].
>> SINZ: Yes, of course -- as we've said, there are
very different notions, so we mainly think of, let's say, obvious programming errors
like these buffer overflows which you've seen before. A buffer overflow is, for example,
something which you can check without writing any specification, and integer overflows would
be another such class, or division by zero -- I will show some examples.
>> So certain cases where you run into undefined behavior in some form, something like that?
>> SINZ: Yes, undefined behavior, or behavior which makes your program crash, or just some, you
know, maybe illegal things that happen in your program. So this is very much
on the source code level, of course -- it's not a high-level error-finding tool for designs,
for example, or for improving designs, but a very low-level tool. That's
also why we have chosen the name Low-Level Bounded Model Checker for our tool. You will
see a bit more about this type of [INDISTINCT]. Okay, but let's turn now to the technique
that we want to use, this technique of bounded model checking. Let's start with
the underlying ideas of this technique. Perhaps the most important idea is that, with this
technique, we want to analyze all paths of a program and all inputs of a program at the
same time. Compare this with a test case, where you just go through one program
path, using one particular set of input values -- exactly one value for each
variable -- and then you run it through the program. We don't do this;
we are going to check all paths and all inputs at once. And then there
is the restriction you see here: not all arbitrary paths, but only paths up to a fixed
length are what we can check with our tool. And what does it
mean, this fixed length? Fixed length in our case means that the number of loop iterations
is limited, and also the number of method or function calls which
can be made is limited in this approach. This is something which you give to the
tool before you start it. So you would say, analyze all loops up to
1,000 iterations, for example, or analyze function calls up to a nesting depth of 100,
or something like that. And then all paths which are within this limit will
be analyzed, but not the ones which are beyond it. And why are we doing this? The
problem is that checking properties of programs in general is undecidable, but if you have
finite paths, then this means that everything else also gets finite. If you have only finite
program paths, this means that during these finite runs, only a finite part of your
memory and of your variables can be looked at -- so all the data structures
will also be finite in that case, and checking properties about finite things is something
which is decidable. Okay, so this is the first idea: analyze all paths and all
inputs, with this restriction of a fixed limit on path length.
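To make the path-length bound concrete, here is a small hand-written sketch (the function names are my own, for illustration, not taken from LLBMC): bounding a loop at three iterations corresponds to unrolling the loop body three times under its guard, so every execution with at most three iterations is covered, and longer ones fall outside the bound.

```cpp
#include <cassert>

// Original loop: sums the first n elements of an array.
int sum(const int* a, int n) {
    int s = 0;
    for (int i = 0; i < n; ++i)
        s += a[i];
    return s;
}

// What a bound of 3 means conceptually: the loop body is copied three
// times, each copy guarded by the loop condition.  All paths taking at
// most three iterations are captured; a fourth iteration would be
// beyond the bound and is not analyzed.
int sum_bound3(const int* a, int n) {
    int s = 0;
    if (0 < n) { s += a[0];
        if (1 < n) { s += a[1];
            if (2 < n) { s += a[2]; }
        }
    }
    return s;
}
```

For every input with n up to 3 the two functions agree, so a bounded check with this limit proves properties for those runs, while saying nothing about longer ones.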
>> Can you say something more about the typical limits that you use?
>> SINZ: Yes -- well, it's very dependent on the program. During our tests we typically
used something like a hundred or a thousand iterations for loops.
But of course it depends: if you have, for example, one very large buffer
with hundreds of thousands of bytes, and you have one loop which goes over it
to initialize it to zero, then you get a much larger bound; but typically you can also
parameterize this, or just choose a smaller buffer for verification instead of
around 100,000 [INDISTINCT]. In my experience, for function call nesting an
efficient tool can use very low numbers of [INDISTINCT] -- two is often sufficient; three or four can
be used.
>> Well then, [INDISTINCT] you're talking about C programs here.
>> SINZ: Yes. I haven't mentioned this -- it will come on the next slide. It's mainly
C that we are looking at; a little bit, we have also recently started to target C++.
Okay. And mainly system software
or low-level software. Okay, the second main idea is to encode the whole program in a logic,
in a suitable logic, because that's what the solvers can work on. And what's
perhaps important in our case is that we want to transform our program into a stateless
program. Typically your program has a state which evolves over time -- let's say you have
some structure or a class which holds values, and you keep changing those values
over time -- and we try to reformulate this program into something which is stateless. So it
is a bit like a functional program, for example if you have done functional
programming. This makes it much easier to transform into logic. The techniques that
we're using are, for example, SSA form -- static single assignment form -- for variables,
which LLVM also already gives us; then we are using access
functions for memory; and also other techniques like loop unrolling
and function inlining, which help to achieve such a thing. Okay. So the next
main idea is to model the semantics of the individual instructions very precisely. This means
we don't just use an approximation of what an instruction does;
we model exactly the semantics of each instruction in our program, and we achieve
that by modeling everything on the bit level. So, for example, if you have a bit operation
in your program, like a bitwise OR or AND, or some shifting going on, then all this
is modeled on the bit level, so you get exactly which bits in your variables should be set to
one or zero. And this means that the logical encoding is simulating the hardware which
runs the program, which also works on the bit level, or on a gate level [INDISTINCT].
So you're going to a very low level in verification, too, which
might seem a bit strange because the formulas can get very large this way, but it
turns out that the solvers are able to handle this.
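As a small illustration of the stateless, SSA-style form just described (hand-written here; LLVM computes this form automatically): every reassignment gets a fresh name, and the join after a branch becomes an explicit selection.

```cpp
#include <cassert>

// Imperative original: x is a mutable cell whose state evolves.
int f(int a) {
    int x = a;
    x = x + 1;
    if (a > 0)
        x = x * 2;
    return x;
}

// SSA-style, stateless version: each assignment introduces a fresh
// name, and the if/join becomes a phi-like selection.  Every name is
// defined exactly once, so the translation to logic is direct: each
// line is simply an equation over the names.
int f_ssa(int a) {
    int x0 = a;
    int x1 = x0 + 1;
    int x2 = x1 * 2;
    int x3 = (a > 0) ? x2 : x1;  // phi(x2, x1) at the join point
    return x3;
}
```

Note that x2 is computed unconditionally here; in the logical encoding that is harmless, because the equation x2 = x1 * 2 simply holds whether or not the branch is taken, and x3 selects the relevant value.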
>> So, how do you define the memory model? Modern hardware usually has
weak memory models, which are underspecified, really. So what do you do about
that? Do you just assume some deterministic behavior, or...?
>> SINZ: No. Just as you can have nondeterministic behavior, for example,
for the program inputs, you can also have nondeterministic behavior of the memory. For
example -- yes. We have one particular memory model that's built
into our system, but it, for example, doesn't say anything about where memory blocks which
you get from [INDISTINCT], for example, [INDISTINCT]; it just specifies some non-overlapping
constraints for the different blocks [INDISTINCT] you get from allocation [INDISTINCT].
So it's constrained [INDISTINCT].
>> Can you deal with caching [INDISTINCT], things like that? So you know...
>> SINZ: Yes.
>> The synchronization for it is -- the synchronization of it is [INDISTINCT].
>> SINZ: That's something that we are not concentrating on now. We are just checking sequential
programs, not concurrent programs, and because of that we are also not looking at synchronization
[INDISTINCT]. So caching is also something [INDISTINCT].
>> So the low-level model is more like an allocation model?
>> SINZ: Yes, allocation model is perhaps a good term [INDISTINCT]. Okay. So now, if you have
this logic formula which comes out of the transformation of your program, then you use a
solver on it -- a SAT solver, for example, for propositional logic formulas, or an SMT
solver for a decidable but stronger logic in which you can also reason about arrays or
about bitvectors. We are using these solvers as well.
Let me just add two historical notes. This technique of bounded model checking has been
well established for quite some years now in hardware verification; almost all hardware
companies [INDISTINCT] are using the technique to verify circuit designs [INDISTINCT].
And then, around 2004, this technique for verifying hardware was carried over to the software area;
that was done by Clarke, Kroening and Lerda from Carnegie Mellon University. Okay. So
these were the general techniques, some ideas to give you a rough understanding of
how this technique works, and now we will turn to our tool, which is the LLBMC bounded model
checker. As you could already guess, LLBMC works on C programs and
also on some C++ programs -- we only recently started with
that -- and of course it tries to find bugs, or remaining bugs, in a program. And let's look
at some design goals. One goal was that we wanted our tool to handle all programming
language features. This means: with other tools, if you have written a C program, you often
have to change some constructs in the program just to make it run through the verification
tool, and that's something which we didn't want to have. Whatever C
program you throw at the tool, it should just support all language features -- bit
[INDISTINCT] things like that, or memory-based [INDISTINCT],
where you write [INDISTINCT] to memory and then read the same part of the memory as
an array of bytes, for example; such things can also occur. Okay, and to make this easier,
we are doing all verification not on the source level; we are instead using
a compiler intermediate language. We are using the intermediate language of
the LLVM compiler, and the advantage of using such an intermediate language is that it is much
simpler than a high-level programming language like C or C++. In particular, it just looks
like RISC assembler, and you don't have to deal with complicated things like, perhaps,
operator overloading, or some conversions which happen in the background without
being explicitly mentioned in the program. Okay. We also wanted it to be easy to use
for non-experts, so this means that you should not have to write large specifications, which
in some cases could even be larger than the program itself; it should be possible
to use it without any specifications, or with just very few specifications for your program.
And then we wanted to achieve high precision, and high precision in this case means that
we do not want to get false error reports -- or want to limit them as far as possible -- and we do
not want our tool to miss any errors.
>> But doesn't it miss errors by definition, if you're
only checking some finite depth?
>> SINZ: Yes, of course, yes. I think the general problem is that program
verification is in general undecidable,
so you have to make some restrictions. The question is how to make these restrictions
in the best way. There are of course many different possibilities for how you can restrict
the programs or the checks you do to make the problem decidable,
and I think that this technique of bounding paths is something
which makes a lot of sense in practice. Okay, so we are using these bit-precise techniques
here to avoid false error reports. And we want to provide a
comprehensive set of built-in checks; I will show you a bit more about these checks on
the next slide. And scalability is of course a problem with these logic-based tools,
because the underlying problems are of high theoretical complexity [INDISTINCT], so you have
to find some tricks to make it work also with larger programs. One thing is that
we are using these very advanced SMT solvers, for example, and the other thing is that we
also try to get rid of easy sub-problems with some preprocessing and
simplification procedures. Okay, so before I come to the built-in checks, I just want to
show this one slide here, which tells you a little bit about the internal schematic
of our tool. It shows the processing steps: on the left you
start with your source code program, then you run the LLVM compiler front end on that, and
what you get is the LLVM intermediate language [INDISTINCT], which is like a
RISC assembler representation, but still more structured than the usual [INDISTINCT]
that you have for [INDISTINCT]. Okay, so this is LLVM, and now LLBMC takes over
at this point. What it does first is unrolling loops and function
inlining -- this is mainly for handling this finiteness of paths --
and you get a transformed program. We have this transformed intermediate language, which
we then transform to a logic -- so we do the logical encoding -- and then we have a formula
in this particular logic of bitvectors with arrays. We then have some simplifications
which run on this formula, to make it smaller and to get rid of checks
which might be [INDISTINCT]. Then we pass the simplified formula to the SMT solver, and finally
we get the result: either the tool hasn't found any errors in the program,
or, if it has found an error, it will also produce an error trace which shows
the location of the error. Okay, let's turn to the checks that are built in, so without
you having to specify anything, you can use these checks. The first thing is that
we do checks for integer overflows: for example, if you add two large integers
and the result of this addition is not representable in the space of your integer
type, then there is an overflow. Here's another example which is perhaps a little bit harder
to see -- an overflow can also happen here, at this point: we have the negation
of x, and there is one case where this negation can lead to an overflow of signed integers.
We will show this afterwards in our demo, for which particular value that happens.
Another thing that we check for is divisions by zero -- integer divisions by zero.
Something like this here, for example: you have a statement where one variable, x,
is divided by another variable, y, and you wonder whether y can be zero in any case.
In some cases it can be relatively easy to see whether this can happen, but in other cases,
perhaps like this one here, it might not be so easy to see whether such a division by zero
can happen, and how to check for it. Another thing is invalid bit shifts, which is perhaps
a not-so-well-known problem. This is something which the C standard defines: if you have,
let's say, a 32-bit value and you shift it to the left by 32 bits or more, then the
result is undefined. You might expect that it's just filled in with zeros as bits are
shifted out, but that doesn't have to happen -- it's just undefined, and implementations
also do different things; typically the shift width is just taken modulo [INDISTINCT].
That's something we check for. Then we check for all kinds of illegal
memory accesses -- one [INDISTINCT] for example is [INDISTINCT], but also all kinds of
other memory accesses. What we are mainly doing is we check all reads and
writes to memory, no matter where they come from, and check whether they point into an
illegal area [INDISTINCT]. We're checking for invalid frees -- an invalid free would be
something where you free a pointer which doesn't come out of a previous allocation
instruction but is just some other pointer. We're checking for double frees, which is mainly
a particular case of an invalid free, where you free the same memory twice. And then we
have some further memory checks [INDISTINCT] that we have built into our tool, for example.
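To make three of these check classes concrete, here is a hand-written sketch of the conditions behind them (my own guard functions, not LLBMC code): each predicate is true exactly when the corresponding operation would be undefined, which is roughly the property the tool decides for all inputs at once.

```cpp
#include <cassert>
#include <climits>

// Signed addition overflow: detect it *before* performing the addition,
// since the overflowing signed addition itself would be undefined.
bool add_would_overflow(int a, int b) {
    if (b > 0) return a > INT_MAX - b;
    return a < INT_MIN - b;
}

// Integer division: undefined for y == 0, and also for the
// INT_MIN / -1 corner case, whose result INT_MAX + 1 is not
// representable in two's complement.
bool div_is_undefined(int x, int y) {
    return y == 0 || (x == INT_MIN && y == -1);
}

// Bit shift on a value of the given width: shifting by an amount
// greater than or equal to the width is undefined in C; implementations
// often just take the shift amount modulo the width.
bool shift_is_undefined(unsigned width, unsigned amount) {
    return amount >= width;
}
```

A bounded model checker proves such a predicate unreachable (or produces a witness input) without you writing the guards yourself; the sketch only spells out what is being checked.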
Okay. And at this point, I want to pass over to Florian, who will show a little bit about
our tool and how it works in practice [INDISTINCT].
>> MERZ: Great. Let's get started. I will start with the first example, this small
program, and what we want to show is what happens when we run it through our tool.
We will take the file abs.c and the function abs, and we want to check if any
of these built-in properties are violated. We compile from source
code to this LLVM bitcode file -- that's abs.c.bc -- and then we run our tool on
that file, and it does indeed find the bug, and it gives a short synopsis of what the error
is. It tells us what kind of error it was, it tells us on which assembly line it was, and
it tells us what the values were that were involved in this
particular instruction. In this case, we divided zero by minus -- well, the smallest number.
>> Subtract.
>> MERZ: Subtract -- yes. And what you can see here is the precise location at which this
error occurs, where this undefined value was calculated, and you can see that
it is the ret = -x case. What happens here is that this particular value given
to the [INDISTINCT] method is the smallest possible integer, and in this case we
tried to get the absolute value of that, and that is not representable in 32 bits. It would
be INT_MAX plus 1, because two's complement is not symmetric. And that has to show up
in the [INDISTINCT] somehow. So this is what happens: our tool essentially
virtually executed this method for all possible values of x -- that's 2 to the power of 32
different possible values. If you tried to run that manually and test every one of them, it
would take minutes, hours, I don't know, quite a long time actually. We're quite a lot
faster: we can do that in 0.007 seconds. These are quite small examples -- we picked these
small examples so they fit onto one screen -- but we can do a little bit more than that.
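The abs example, reconstructed from the talk (the demo's exact source isn't reproduced here, so the function names are my own): the naive version is undefined for exactly one of the 2^32 inputs.

```cpp
#include <cassert>
#include <climits>

// Reconstruction of the demo's function: naive absolute value.
// For x == INT_MIN, the negation -x would be INT_MAX + 1, which is not
// representable in 32-bit two's complement -- the single input out of
// 2^32 for which the tool reports a signed overflow.
int abs_naive(int x) {
    int ret;
    if (x < 0)
        ret = -x;      // undefined behavior when x == INT_MIN
    else
        ret = x;
    return ret;
}

// One possible fix: widen to a larger type before negating, so that
// INT_MAX + 1 is representable (assumes long long is wider than int).
long long abs_wide(int x) {
    long long v = x;
    return v < 0 ? -v : v;
}
```

The widened version illustrates one way to remove the counterexample; whether to widen, saturate, or reject INT_MIN is a design choice the tool leaves to you.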
The next example would be...
>> Could you show us what it reports if we write -x + 1?
>> MERZ: The code?
>> Just...
>> MERZ: What, this?
>> Yes.
>> [INDISTINCT]
>> MERZ: [INDISTINCT] I assume the numbers changed.
>> It's still the same one.
>> MERZ: It's the same -- yes, it's the same, actually.
>> [INDISTINCT]
>> MERZ: 1 - x? It's 1 -... [INDISTINCT]
>> [INDISTINCT]
>> MERZ: Subtract, 1 - x. So it's still [INDISTINCT]
>> You can play around with the different examples.
>> Maybe you can do it offline.
>> MERZ: Yes. Let's look at our second example. This is a simple division, and we wrapped it in
a function which takes x and y, so x and y can be any values, and you probably guessed
that this is also buggy... Yes. And you can see that, well, obviously it contains
a bug in that division line, and it tells us it was a division by zero, which
assembly line it is, on which line of C code it is, and what the values are. You can see
that y is zero, so obviously that's a division by zero. So what we can do to get
rid of that is catch this particular bug -- version one is still in that file for
comparison, number two is the same one -- and because, you know, division by zero is not defined,
in this case we tried to approximate it by INT_MAX, simply because that's the closest
we can get to any kind of result. Obviously it's a very poor replacement, but better than
crashing. So, we run the tool again, and still we get a bug. And
this one's not that expected: the bug this time is that x is a very large negative
number and y is minus 1, and what happens is very close to the replacement we just made.
If you divide that by minus 1, which is taking, you know, the symmetric value with positive
sign, that is again undefined, so what we do here is...
>> [INDISTINCT] overflow error [INDISTINCT].
>> MERZ: Yes, [INDISTINCT] this is not a division by zero and...
>> Can you say that y will never be zero?
>> SINZ: You can add something which is called an assumption, so you can [INDISTINCT]
assumptions in LLBMC [INDISTINCT].
>> MERZ: Yes. The way we fixed this here is to patch the method by, you know, taking
that case again -- and INT_MAX is the closest one we can get to INT_MAX plus 1, so we take
that one -- and we can verify that one. This time it tells us that no error was found
[INDISTINCT]. For the last example -- because we know you are interested in C++ more than
in C code -- we tried to implement a little bit of C++ support. And we've had about a week
for that. And we got this example. Can anyone see what happens in this case, and why this
will produce a bug?
>> [INDISTINCT] missing virtual destructor?
>> [INDISTINCT]
>> MERZ: And we can actually see that it does indeed detect this particular missing
virtual destructor. What happens is that this cast up here, from derived to base, does
pointer arithmetic, so the pointer o points into the middle of this derived
class and not at the beginning of it. If you then call delete on this -- lacking a
virtual destructor, calling delete is essentially a free -- what we see is a free of a
pointer which was never allocated directly, which points into the middle of an allocated
block, and that is illegal. And what we do is, we don't just statically check for this
missing virtual destructor, but we actually simulate execution of that code, and we
catch that particular example when this does actually happen. Yes. So, these were the
first examples.
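A hedged reconstruction of the C++ situation just described (the demo's exact code isn't shown, so the class names are my own): with multiple inheritance, converting a Derived pointer to its second base adjusts the address, so on typical ABIs it points into the middle of the allocated block; deleting through such a base without a virtual destructor then frees an interior pointer, which is the invalid free the tool reports.

```cpp
#include <cassert>

// Two bases without virtual destructors; Derived inherits from both.
struct A { int a; };
struct B { int b; };
struct Derived : A, B { int d; };

// Returns true if the upcast to the second base yields an interior
// pointer (it does on typical ABIs, since the B subobject sits after
// the A subobject inside Derived).
bool second_base_is_interior() {
    Derived* d = new Derived();
    B* bp = d;  // implicit upcast: address adjusted past the A subobject
    bool interior = static_cast<void*>(bp) != static_cast<void*>(d);
    // delete bp;  // without a virtual destructor in B, this would free
                   // the interior pointer: the invalid free LLBMC catches
    delete d;      // correct: free via the original pointer
    return interior;
}
```

Giving the bases virtual destructors makes deleting through a base pointer legal, because the runtime can then recover the original allocation address.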
>> Thank you very much. >> [INDISTINCT].
>> SINZ: This is just a small selection of these errors. We haven't shown you
anything, for example, about these memory accesses -- illegal memory accesses; perhaps
we can show you something later on, if we have a chance [INDISTINCT], if
you want [INDISTINCT]. Let's just do that later, after the talk. Okay, so I want to switch
now to a slightly different view. The question is now: if we have this tool,
how can you use it in your daily work? Typically, you will have test cases, and the
question is what to do with this tool -- I want to put up some ideas for how you can adjust
your test cases to make a tool like our LLBMC work [INDISTINCT]. Okay, so, what would
be a typical workflow with LLBMC? Of course, the first thing is that you can just
check for these low-level errors which are already built in [INDISTINCT]. You do,
more or less, nothing: you run this tool on your code and see what it outputs. Okay. This
might already find some of those errors, but there might be, let's say, perhaps more
interesting things that you will also want to check -- for example, some functional
correctness properties which are not covered by these low-level checks. The question is now, how
can you come to such functional correctness checks with this tool? As you see here
[INDISTINCT], I call it migrating from testing to verification, towards this systematic way of
bug finding. One possibility is just to take the functional correctness property and
code it a second time, in a totally different way; we will show
you an example of this. Then you have these two implementations, and you just compare
whether on all possible inputs they produce the same output. If this is the case,
then you can at least be a little bit more sure that your program is correct -- if those
two implementations are different enough. Another thing is something which is called parameterized
unit test, or also abstract testing, sort there are different names for this. And so
the name parameterized unit test comes from a Microsoft research paper from 2005 from
which has to do with the tech school for testing software. And they have used this name, this
notion of parameterized unit test there. And now, we've done something similar in the automotive
software checking. But, what is the idea to have this parameterized unit test? Particularly
in the unit test, you just have one particular case that you check. So, one for each variable
you fix, [INDISTINCT] you have to write brand new code with those variables and see what's
the outcome. In such a parameterized test, you will just take one variable or several
variables that now fix the value for those variables. Just leave it open and you want
the tool to check all those variables at once, or all these values for the variables at once.
This goes for example, instead of initializing it to 0, you might know that your program
expects [INDISTINCT] between 0 and 100 and you've just been--did such a range into one
programming checking for all. And the third possibility is to write a checker program.
And the idea here is that if you have a quite complicated algorithm to compute something, it might still be relatively easy to check whether its result is correct. What you can do then is to write a verifier, or checker program, which is obviously correct and can be relatively small. You have your original program, which computes some solution, and then you put them both together and check them together. This means we have a checker function which takes the input of our original problem and the output that your program computes. And then you check your original program together with this checker function and see whether the checker function always says, "Yes, this is the correct output for the respective input."
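One way to picture this checker-program idea is the following sketch (an invented example, not code from the talk): a bisection-based integer square root plays the role of the "complicated" program, and a two-line checker states when its answer is correct. The tool would then be asked whether `isqrt_ok(n, isqrt(n))` can ever be false.

```c
#include <assert.h>

/* The "complicated" program: integer square root by bisection.
   (Invented example; not code from the talk.) */
unsigned isqrt(unsigned n) {
    unsigned lo = 0, hi = 65535; /* the root of any 32-bit value fits in 16 bits */
    while (lo < hi) {
        unsigned mid = lo + (hi - lo + 1) / 2;
        if (mid * mid <= n) lo = mid; else hi = mid - 1;
    }
    return lo;
}

/* The small, obviously correct checker: r is the integer square root of n
   exactly when r*r <= n < (r+1)*(r+1). */
int isqrt_ok(unsigned n, unsigned r) {
    unsigned long long rr = r;
    return rr * rr <= n && (rr + 1) * (rr + 1) > n;
}
```

With a bounded model checker, n would be a nondeterministic input and `isqrt_ok(n, isqrt(n))` an assertion; in a plain test it can only be sampled.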
>> How can you make sure that your checker implements the [INDISTINCT]?
>> SINZ: Yes, of course, that's--I think that's the general problem if you want to do functional correctness checks, whether you are using requirements or other kinds of specifications that your program needs.
>> [INDISTINCT] you are likely to make the same mistakes as in the original [INDISTINCT] make sense to have this [INDISTINCT] at a logical, abstract level, and then the implementation in another language [INDISTINCT].
>> SINZ: There might be other alternatives, but what I had in mind, for example, is: if you are writing something like a logical solver, a SAT solver for example, then it's very hard to compute such a solution, but it is quite easy to check such a solution. And if you write such a checker program, it might be so trivial that you believe that it is correct [INDISTINCT]. That was the idea. Okay. So now it's time for a demo again, and this time Stephan will give a different demo. He will show two of those methods to go from testing to systematic verification. Wait a second. Now.
>> FALKE: This example essentially shows what Carsten just mentioned [INDISTINCT]: verifying an alternative definition of some function [INDISTINCT]. This example is a population count example: you want to figure out how many bits are set in any given bit vector X. And there are some clever ways to do that. This optimized popcount function just does a few bit shifts, logical and bitwise operations [INDISTINCT], and some arithmetic. Now, of course, we're interested to know whether this actually computes what it's supposed to compute, namely the number of set bits in X. One way you can do it is of course to do some unit tests, either [INDISTINCT] directly, but there are 2 to the 32 [INDISTINCT] possible inputs, so unit testing is going to be quite expensive. What you can do instead is define an easier version of the function that is also supposed to compute the population count, one that is so simple that you readily believe that it computes the population count of the input X; this is the reference popcount function. And now, to verify that the optimized function is doing what it's supposed to do, we compute the outputs of both of those functions and check whether the outputs are always the same [INDISTINCT].
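The setup just described can be sketched in C. The optimized body below is a classic bit-twiddling variant standing in for the code on the slide (which the transcript does not show); `agree` is the property a bounded model checker would be asked to prove for every 32-bit x at once.

```c
#include <assert.h>
#include <stdint.h>

/* Optimized popcount: a classic SWAR bit-twiddling variant
   (illustrative stand-in for the demo's code). */
uint32_t popcount_opt(uint32_t x) {
    x = x - ((x >> 1) & 0x55555555u);
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
    return (x * 0x01010101u) >> 24;
}

/* Obviously correct reference implementation: count bits one by one. */
uint32_t popcount_ref(uint32_t x) {
    uint32_t n = 0;
    while (x) { n += x & 1u; x >>= 1; }
    return n;
}

/* Equivalence property checked by the driver. */
int agree(uint32_t x) {
    return popcount_opt(x) == popcount_ref(x);
}
```

With LLBMC, x would be left as a nondeterministic input and `agree(x)` asserted; a plain test can only enumerate or sample the input space.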
And this is done in this driver method: it computes the return value of the optimized implementation and of the reference implementation [INDISTINCT], and then asserts that they are always the same. And then we ask the tool, which is a bounded model checking tool, to find an input X for which the two implementations give different return values. And if it finds such an X [INDISTINCT], then there is a bug [INDISTINCT] in the optimized version. So let's run the tool on this example [INDISTINCT]. Okay. It actually takes a little bit longer now, but it's still much faster than unit testing, because essentially it still has to cover the 2 to the power of 32 [INDISTINCT] many inputs, and it takes a bit more than five seconds. Unit testing would probably take more like an hour or so. And since there is no disagreement between the reference implementation and the optimized implementation, we can conclude [INDISTINCT] that the optimized implementation is actually correct. Okay. The next example I'm going to show is the second method that Carsten just mentioned. That will essentially be a generalized
unit test. We are given a function which is supposed to compute the next larger number, from a given input X, which is a power of two. So, for example, if you give the function [INDISTINCT] 8,000, then the return value should be 8,192, since that is the next power of two. And it again does some clever shifting and bitwise operations, so it's not really obvious that it's working. You could again do unit tests: for example, you could just pass in that number [INDISTINCT], see what the output is, and then [INDISTINCT] check that it's always computing the right [INDISTINCT] result for that input. If you want to be a little more generic than this, you don't want to just test one input, you want to test a [INDISTINCT] range of inputs. And this we can do using this assume statement. This driver method essentially just has assume statements saying that X should be bigger than 8,192 and should be smaller than 16,384. And for all values within that range, [INDISTINCT] npo2 should always give 16,384, since that is the next power of two. So it's more general than unit testing in the sense that it covers a whole range of inputs, not just one input. So, let's see whether the implementation is correct with this generalized test case.
And again, it's quite fast. It doesn't find an error, meaning that for all the possible inputs [INDISTINCT] it always computes the right output, which is nice. But of course it could be even more general than that. In this parametric test case [INDISTINCT] we specify what the output for X should be [INDISTINCT], because now we only assume that X is non-negative and not too big, so that we don't have an overflow. [INDISTINCT]. Under that assumption [INDISTINCT], we now check for all X that are non-negative and not too big that npo2 computes the next larger power of two. And we do that by asserting [INDISTINCT] that the return value is not smaller than the input, and [INDISTINCT] that it is indeed a power of two, which we can also specify using assertions. So now we are checking 2 to the [INDISTINCT] different inputs, and we want to see whether the method is correct for all of them. And again, it doesn't take very long. We don't find a bug, meaning that for all of those inputs the function [INDISTINCT] always computes the correct output. And I guess that actually concludes the second demo.
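The two drivers from this demo can be sketched as follows. The npo2 body is the usual bit-smearing trick, a stand-in for the slide's code (note this variant maps an exact power of two to itself), and the tool's assume/assert pairs are emulated here by exhaustive loops so the sketch is runnable:

```c
#include <assert.h>
#include <stdint.h>

/* Next power of two by bit smearing (illustrative stand-in). */
uint32_t npo2(uint32_t x) {
    x--;
    x |= x >> 1;  x |= x >> 2;  x |= x >> 4;
    x |= x >> 8;  x |= x >> 16;
    return x + 1;
}

int is_pow2(uint32_t v) { return v != 0 && (v & (v - 1)) == 0; }

/* First driver: assume(8192 < x && x < 16384); assert(npo2(x) == 16384).
   The loop stands in for the tool's nondeterministic x. */
int range_test(void) {
    for (uint32_t x = 8193; x < 16384; ++x)
        if (npo2(x) != 16384) return 0;
    return 1;
}

/* Second driver: assume x is positive and small enough that no overflow
   occurs (x <= 2^31), then assert the result is a power of two,
   at least x, and minimal. */
int property_test(uint32_t x) {
    uint32_t r = npo2(x);
    return is_pow2(r) && r >= x && (r == 1 || r / 2 < x);
}
```
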
>> Yes, continue. >> Thank you.
>> SINZ: Okay. So you've seen these examples from our tool so far, but those have all been quite small examples, just to show you what the possible checks of this tool are. Now I want to show some first results that we got from this tool as it is ready for use. We haven't done extensive testing, but nevertheless I have two results here which are a bit larger. The first example that we checked was a function blit, or a program which is called blit, and that's from an embedded systems benchmark suite. This suite is called Powerstone, because it is used for power consumption testing. And that's a bit larger: it has 96 lines of code, but you have to set quite high bounds here. You have to set a loop unrolling bound of a hundred or even a thousand, because those loops are executed many times, and a function inlining limit of two. And then you can detect an illegal bit shift in this benchmark, which presumably wasn't found before by static analysis [INDISTINCT]. This is one of the more strange things that we try to detect, but as I mentioned already: if you shift a 32-bit value by 32 or more bit positions to the left, then the result is undefined. So this would be undefined because [INDISTINCT] this value gets to 44, so you have a shift by 44, which is just undefined. And this took quite some time: with this unrolling it took a bit more than one and a half minutes to find this error here on my computer, an ordinary laptop. And another example
where we found an error was in a routine for MPEG2 video decoding. That's a decoder which is quite some years old, and it was written by the MPEG Software Simulation Group. We took a look at some parts of this decoder, in particular the inverse discrete cosine transform [INDISTINCT], which does heavy arithmetic to transform these blocks of video. The whole decoder consists of a bit more than 10,000 lines of C code, and the parameterized unit test that we wrote for checking this discrete cosine transform consisted of 766 lines of code, including [INDISTINCT]. And what did this routine do? It took as input an eight-by-eight block of shorts [INDISTINCT], passed it to this inverse discrete cosine transform, and checked whether this routine was correct. And we were just looking for these low-level bugs, the built-in checks that we have there. We set a loop unrolling limit of 1,024 and a function inlining limit of 2, and we detected an array-out-of-bounds error in this program. This occurred after heavy arithmetic had computed some particular value that was used as an index into an array, and this index was just too high for the array. But it was not really obvious to see from the code directly whether such values could be computed [INDISTINCT]. This error was found in seven minutes on this platform here. Okay. So, these were two a little
bit larger examples of what our tool can do. So I'm approaching the end of my talk, and I want to summarize now what we have shown you today. It was mainly about this bug finding technique, bounded model checking, and about our tool for this technique, which is LLBMC. You've seen that LLBMC can verify C programs of, I would say, up to several thousand lines of code; this example was relatively arithmetic-heavy, and in other cases it should scale up to a bit more. So, say, several thousand lines of code are possible with this tool, and it's especially suited for low-level system code. Earlier work has already shown that it works quite well, for example in those case studies which we have done with this general method of bounded model checking of software, also with our tool. This is some work that we have done in the past: for example, in various parts of the Linux kernel we have found some errors with this technique. Also, equivalence checking of cryptographic routines, where we compared two [INDISTINCT] implementations. And we are also using this technique in the embedded systems, automotive domain, to functionally verify certain devices in a car. Okay, so this is the state now. And if we look at the future,
what is still missing, or what we would like to do in the future: something which is still missing for supporting all of C and all of C++ is floating point arithmetic, which is conceptually not that hard to implement, but it takes some time. The other thing is support for more complex or complicated languages than C, like C++ or perhaps [INDISTINCT]. Something that also has to be done in the future is to provide specifications for widely used library functions--for example, the C functions that read from files or write data to files. That's something which we have not yet covered with our tool and which we have to do in the future, and of course, performance improvements are always good to have. So that's what we are planning to work on in the future. But I think that this technique is already quite usable for many programs, especially if you are using techniques like generalizing a test case, for example. In that case you typically do not need to model the whole program that you want to verify, but just a small function, and then it might be sufficient. Okay. So, this concludes my talk, our talk. If you want further information, you can either get it from our research group's website here [INDISTINCT] for verification, or directly from the website for our tool, and we are planning to make this tool available in April this year. So, [INDISTINCT]. Thank you.
>> So I guess I have to ask the main question that is always asked in talks about model checking, which is: what do you do about higher-order [INDISTINCT] programs? Like, as soon as you get function pointers [INDISTINCT] or objects?
>> SINZ: So, we can handle function pointers with our tool, but the technique to handle them is mainly to inline all the possible functions that could occur at a certain place. So if you have a function pointer, you try to find out the set of possible functions that might be called at that place.
>> So you basically defunctionalize the program and then do the first-order [INDISTINCT].
>> SINZ: Yes. Then you can just try to make it a bit more scalable [INDISTINCT].
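The inlining strategy described in this exchange amounts to defunctionalization: the indirect call is replaced by a switch over the candidate functions, each of which the checker can then inline and analyze case by case. A minimal sketch (all names invented, not LLBMC's actual transformation output):

```c
#include <assert.h>

int inc(int x) { return x + 1; }
int dbl(int x) { return 2 * x; }

/* Original code: an indirect call through a function pointer. */
int apply(int (*f)(int), int x) { return f(x); }

/* Defunctionalized version: the pointer is replaced by a tag, and the
   indirect call by a switch over the candidate functions. */
enum fn_tag { FN_INC, FN_DBL };

int apply_defun(enum fn_tag tag, int x) {
    switch (tag) {
    case FN_INC: return inc(x);
    case FN_DBL: return dbl(x);
    }
    return 0; /* unreachable if the candidate set is complete */
}
```

The precision question discussed below is exactly which tags belong in the enum: taking every function with a matching signature is sound but over-approximate.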
>> How does that work together with your parameterized unit tests? So you kind of parameterize over function pointers [INDISTINCT]?
>> SINZ: I think that in principle you don't have to fix it to one function in your program. So [INDISTINCT] you could just make a big switch statement within your program which switches between different functions, but you have to specify which particular functions these are.
>> Right. But you need to know the whole program at that point, right?
>> SINZ: Yes, that's right.
>> Which means you can't really do the unit testing, the unit checking, then.
>> SINZ: I don't know, but I would say that you need the whole source code. So you cannot assume, for example, that your function pointer points to something external.
>> That's what I'm talking about. If I want to check a [INDISTINCT], it's not defined what the whole source code is, right?
>> So, for example, a method that finds a fixed point of a function: it takes a function as a parameter, and there is no reasonable bound on these functions.
>> I mean that's an opportunity [INDISTINCT].
>> SINZ: Yes. [INDISTINCT] if you do not know anything about the functions, then you cannot handle this with our method.
>> But it is the same with unbounded data structures [INDISTINCT], right? So if you pass a tree parameter [INDISTINCT] and compute something on that, then you can't just check all trees that are possible.
>> SINZ: Yes, that's also the case, but I would say what is most important here is that you are bounding your program traces, and this also means that you bound your data structures. So this means you, for example, check all trees up to a size of 100 nodes or 1,000 nodes or something like that. Then you can say something about these. Many errors already show up when limited to smaller examples. That's the idea behind this technique and why it is expected to work: errors also show up in smaller examples.
>> So in theory, obviously, you don't have false positives, but in practice you probably do, because there is other code that just doesn't care about some things. Say you write a unit test and you pull out another function that has two parameters and you divide, and you just know that in your code [INDISTINCT] you never divide by zero. Have you ever, over arbitrary examples, looked at how high your false positive rate is?
>> SINZ: No. As I said, this is relatively new, so we haven't done intensive testing. But what I can say about false positives is this: if you have everything together, if you have the whole source code, and if the bounds are sufficient, then this method is complete and will not show any false positives, as long as we have the whole program here [INDISTINCT]. But for larger examples, that's often not the case. Typically you have library calls, for example, and for these library calls you do not want to include the whole source code of the [INDISTINCT], and then you would just give specifications or some approximation of what this function does. And as soon as you are using those [INDISTINCT] approximations, you also can get some false error reports. So I would say that the method in general is complete and doesn't produce such false errors, but in practical cases, as soon as you have to use library functions, such cases can happen. But we hope that it's still better than with other techniques like abstract interpretation, for example, which can introduce even coarser over-approximations.
>> That would be an interesting thing, because most tools that I have seen that find bugs actually have the problem of too many false positives. In the end it just takes too much time to dig through all those reports to actually find the real bugs.
>> SINZ: Yes, but I would say that this is an advantage of this bounded model checking method, that it should be better than, for example, this [INDISTINCT] of computing ranges for variables. Using these precise techniques, we can really often avoid this kind of too-coarse approximation in checking. But I also know of studies which compare different verification tools, also the [INDISTINCT] tools, and the false positive rates can be really quite high among these tools. We would have to make actually more sophisticated [INDISTINCT] in practice. In the cases that I've shown you on the last slide, for example, [INDISTINCT] this didn't turn out to be such a big problem. Since you mentioned it: we also did, together with Bosch, one kind of research project where we tried to combine abstract interpretation--that was a commercial tool that they are already using at Bosch, which is called Polyspace, and which produces quite some number of such false [INDISTINCT] positives--with our method. We did it in such a way that we first ran this less precise abstract interpretation method, and then checked the possible errors that this tool reported again with our tool. We could get rid of, I think, more than 70% of all those errors that were reported by the initial tool. [INDISTINCT].
>> You said that you spent about a week getting the basic C++ support in there. So how much time do you think it would take to support all, or let's say most, of C++?
>> SINZ: Yes, it's always hard to guess. Not decades, I hope. I think in a year or so you could really do quality work on that. [INDISTINCT], maybe you could say more on what the main points are that would have to be done?
>> MERZ: Getting it running in general could be achieved quite quickly, because on the LLVM level there's not that much of a difference between C++ code and C code; it's all just assembler. But the first larger change that we had to do was to support function pointers. And what we do right now is to inline all possible functions that would fit, for example. We over-approximate and inline a lot more functions than would actually be necessary. So what we'd have to do in that case is to go back and look which functions it could actually be, and that would most of the time be the ones of the derived classes, for example; that would actually be the overriding functions of the same base function. We don't do that right now; we just take all functions that have a matching signature. So to get that information, which is completely lost on the LLVM level, we'd have to spend some more work on it. But it's not like this is a theoretical problem; it's more an engineering problem that we just have to do and implement.
>> Isn't there also a theoretical problem
in the sense that the search space explodes quickly when you have an object-oriented program, because of all the functions you might call?
>> MERZ: That's especially a problem because we take in all of the functions that roughly match; that's a lot more, and it's the over-approximation that makes it inefficient. It runs--you can use it--but it won't be efficient. To get it efficient and to reduce that over-approximation to exactly those functions that can actually be called in that particular context, to reduce it to that number, would be a little bit of work.
>> So, you can--I suppose you can restrict it by looking at the type of a function. Can you restrict it by looking at which class it is defined in?
>> MERZ: You can, for example, just see what happens when that particular function is called. You do a second search at the location where it is called, and instead of calling the function you say, "Oh no, let's just stop here and tell us: what was the value of the function pointer at this particular place?" Then you can look up which function that value corresponds to, take note of it, and add it to the list of functions that could be called at this particular point. And you can cache that information for later on, so you can say, for example: in this function, in this line, I'm calling one of this set of functions. So you could do things like that to reduce the number of possible functions that can be called.
>> But you're still--going back to my earlier point--you're still bound to your whole-program approach, right? You can't get around that?
>> SINZ: We'd have to generate all possible function definitions [INDISTINCT].
>> So, like checking an object against the library in that way.
>> SINZ: That is not something that is the intention of this. I mean, you would really have to write some program which uses the library and then test the whole program [INDISTINCT].
>> MERZ: What you'd do in the unit test is actually create the memory of the object and initialize its vtable and all that stuff. And instead of having one specific object at that location, you could say, "Okay, this could be one of these objects: of this class, or of this class." You get some kind of nondeterminism [INDISTINCT] at that point, but you're still restricted to subsets. So it's largely a unit test, but a little bit more general, and that reduces the complexity. But if you just say it's some object and I don't know anything about it, then, well, yes, most likely it will be too hard to solve.
>> Maybe there's a solution: maybe you need to provide some method of putting metadata on function declarations to tell the tool the [INDISTINCT] necessary information [INDISTINCT], like [INDISTINCT]; that might help. Then you can do verification [INDISTINCT].
>> MERZ: I think even if you add this kind of annotation to the function itself--let's say, okay, this function is always called with a correct object of this, of this, of this type--well, you simply create that object. So you have some kind of test driver function that calls the function you want to check, and that builds up--let's say, for example, you have some kind of tree data structure with different kinds of classes in that tree--you build up a small tree that fits the kind of tree that you want to check. But you don't do it explicitly, like node number five is of this class; you can say node number five is of one of these classes. You create a nondeterministic [INDISTINCT], and then you pass that nondeterministic tree into the function, and you know that it will later be restricted to these kinds of trees. So I think that's the best approach you can take with this: take a unit test and make it more general, more and more general, until you hit that kind of performance bound, until you can't make it more general without making it inefficient. And I think that's the best way to go: to start off with something very specific and make it more general as far as possible. You will cover lots more different cases with that.
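In plain C terms, the "node number five is of one of these classes" idea can be mimicked with a function-pointer table and a selector that is left open. A minimal sketch (all names invented; the exhaustive loop stands in for the tool's nondeterministic choice of class):

```c
#include <assert.h>

/* A miniature "class hierarchy": one vtable slot, two "classes". */
typedef struct { int (*area)(int); } shape_vtable;

int square_area(int s)   { return s * s; }
int triangle_area(int s) { return s * s / 2; }

static const shape_vtable vtables[] = {
    { square_area },   /* "class" 0 */
    { triangle_area }, /* "class" 1 */
};

/* Property under test: must hold for any shape the object might be. */
int area_nonneg(const shape_vtable *v, int s) {
    return v->area(s) >= 0;
}

/* Generalized unit test: instead of fixing one object, check the property
   for every class the object could belong to. */
int test_all_classes(int s) {
    for (unsigned i = 0; i < sizeof vtables / sizeof vtables[0]; ++i)
        if (!area_nonneg(&vtables[i], s)) return 0;
    return 1;
}
```

As in the discussion above, the precision (and cost) of the check depends on how tightly the candidate set in `vtables` is restricted.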
>> So, are you looking into running your tool on your tool's own source code? I mean, that's what most compiler people do with their compilers.
>> [INDISTINCT]
>> MERZ: You could get it running for the more basic data types that we use quite quickly, actually. It would take, let's say, another one or two weeks [INDISTINCT] just to get it to do it.
>> [INDISTINCT] we have checked that.
>> MERZ: The problem is we use C++, and we only added C++ support for testing in the last weeks. [INDISTINCT]
>> [INDISTINCT].
>> [INDISTINCT] limited complexity basic data types? [INDISTINCT]
>> About this compression library or?
>> Yes.
>> SINZ: Okay. We are looking at some specific part of that, for example, to check the checksum computation [INDISTINCT], and that was mainly because of library functions [INDISTINCT]. But these checksum computations, for example, that's something that you can check. We also have some tests which we can show you later. Thank you.