alright, and the miscellaneous here is just...
the reason it's called x86 is because the chips
were called the 8086, and then
80186, 80286, 80386,
had to get that out of the way
you know for mac people like me
you know: 'i don't know why it's called x86, i've been using
motorola chips and
i.b.m. chips, and powerpc things
all my life.' just had to get that out of the way.
so, here's what you can learn in this class.
we're going to take things like a simple C program and we're going to say,
you know, the background prerequisite for this class is that you must be able to
understand a program like this, so there's nothing
new or scary about the, for instance,
number sign,
other characters, or nothing strange about the syntax hopefully.
so
this hello world and specifically when we are returning this '1234' at the end,
ok, that's the same as this sort of assembly;
and you don't need to necessarily understand what it means right now but
you can see in this there's a 'hello world' and you can see it,
and you can see 'printf', and the hex '1234',
and then our goal here is to figure out what all that stuff surrounding it is, what's the point of the rest of it.
this was if i compiled that 'hello world' in visual studio with c++ 2005 with the buffer overflow protection turned off, to make it simpler,
this is the assembly you generate.
That is the same thing as this. And so if i do it on Ubuntu with GCC 4.2.4,
instead i get this. again,
we have only one call here, but it's a call to 'puts', and so ok, that may be an equivalent to 'printf', and there's hex '0x1234',
and i do not see 'hello world'.
So there's a bunch of stuff here that needs explaining, basically.
And that is also the same thing as this,
which was compiled on mac osx 10.5.6. like i said, 10.6 and greater is 64 bit code,
but on 10.5 using GCC 4.0.1
you get this code
so again, looks like 'puts' here,
hex'0x1234', and don't know where our 'hello world' string went.
But basically, if i turn on optimization and go back to that
visual studio and then turn on optimization,
this is what
it can boil down to.
so there is a push of the string 'hello world',
essentially it is not the string but you push the pointer to the string.
there's the push of a string, the 'printf', there's a 'pop' we do not know
what the point of that is
but there's '1234'
and there is the return.
this is the optimized code; there are far fewer instructions, and it maps well to the C code:
you've got a string 'hello world', the 'printf', '1234', and the return.
So, the good news is,
and this is why i think a class like this can be fairly effective
getting you bootstrapped on to x86.
Good news is that, by at least some people's measure of code complexity:
only 14 assembly instructions account for 90% of code.
so this was a blackhat talk where the guy was trying to build a discriminator to say 'this is or is not x86, based on instruction frequencies', and amongst that work he showed that those 14 instructions account for 90% of code. take all the instructions, pick those 14, and that's 90% of it.
and so
I generally find that 20-30 instructions
is pretty much good enough as far as i'm concerned to make it so that you don't have to keep going back and forth to the
reference manual.
so once you go through this class,
then ideally you won't have to go back and forth. you will still have
to look up some special stuff, as it comes up,
but we go into looking stuff up as well.
and actually if you look back and count all the instructions,
you'll notice the 'hello world' variants actually use 11 distinct instructions,
and most of them are due to the complex Ubuntu one.
there are a lot of different instructions, such as 'lea', 'push', 'mov', etc.
So, now we're going to do a couple refreshers so that we are all on the same page here.
this picture is taken from the Intel manual.
these are the fundamental data types. what Intel calls a 'byte' you may call a 'char' in C (pronounced 'char' as in 'church'). a 'byte' is a 'char'. a 'word' in Intel's notion is what you might declare as a 'short'. note it's called a 'word' because this goes back to when these were 16 bit machines; the original 8086 was a 16 bit processor, not 32 bit.
so to them, a 'word' of data was 16 bits.
so that's why then the 32 bit value is called a 'doubleword'.
so when they moved from 16 bit to 32 bit,
well, this is twice as much as what we were used to dealing with,
and that's why it's called a 'doubleword'. and then you've got 'quadword', etc.
and also, it's good to know: if you're used to things like GCC,
or writing stuff for POSIX, etc., you may be used to using or seeing straight up C types: 'char', 'short', 'long', 'int', etc.
and if you are used to Windows programming,
Windows will like type def
something like 'long' into a 'doubleword' and call it 'dword'.
so in Windows programming, you may see 'dword', 'qword', things like that. that goes back to this Intel notion of it's a 'doubleword', it's a 'quadword', etc.
Alright. so, alternative radices:
this is something you definitely
want to memorize.
basically if you do any sort of
dealing with assembly, you are really going to
need to know the hex and binary translation. you are going to want to know how to go back and forth between those,
and for your own purposes you may still want to know how to go back to decimal.
but the nice thing about this is if you look here,
you will see this entire range, 0 to 15, on the binary side,
and everything up to 15 is represented by a single character in hex.
8 bits is a 'byte', and 4 bits is called a 'nibble';
and a single hexadecimal character is essentially a 'nibble'.
so a hex 'F' is 4 bits,
and then hex 'F' is always going to translate to '1111' in binary.
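the nibble table the lecture is describing can be sketched like this (a Python illustration only; the class itself is about assembly, and nothing here comes from the slides beyond the hex/binary mapping):

```python
# Print the hex-nibble to binary-nibble table described above:
# every hex digit 0-F corresponds to exactly one 4-bit binary pattern.
for value in range(16):
    print(f"{value:2d}  hex {value:X}  binary {value:04b}")

# Hex F is always the nibble 1111, as stated in the lecture.
assert format(0xF, "04b") == "1111"
```

memorizing those sixteen pairs is what lets you translate a long hex string to binary one nibble at a time.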
and so
when you memorize these translations between
hex nibbles and binary nibbles,
you can essentially go back and forth pretty quick and that can become
necessary when you're doing something like bitwise operations, where you're
XORing two strings of bits together: you need to go down the line
and do each of those bits one at a time, using the XOR operation, the AND operation, or the OR operation, one bit at a time.
right so if you see something like, i'm going to go over to the board here for a second,
so you will frequently see 32 bit values as a string of hex nibbles.
if i take each of these, in turn, and i know what their translations are into binary, it can be useful for doing the math yourself for an 'and' or an 'or' operation.
so let's say i was doing this and frequently what you'll see for bit masking is something like 'ff0ff30f' something like that.
if you need to do this sort of thing,
what you will do is look one nibble at a time. so, in this case you will say '2' in binary is '0010' and 'F' in binary is '1111',
and then you just go down the line
and do your 'ands': 1 and 0 is 0, 1 and 1 is 1, … and then you turn that back into hex,
and in this case it is hex '2'. in many cases you may want to do this mentally, and in other cases just say 'ok, i'll just let the debugger deal with it and see what happens afterwards'
but this is why knowing your conversion between radices is very important.
so you can do this quickly when it's necessary, so when you just see something for instance subtracting 'C', you can know 'oh, it's subtracting 12', something like that.
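the board example above, doing an AND one nibble at a time, can be sketched as follows; the mask 0xFF0FF30F is the one mentioned in the lecture, while the input value 0x12345678 is a made-up example:

```python
# AND-ing hex '2' with hex 'F' one bit at a time, as on the board:
a, b = 0b0010, 0b1111          # hex 2 and hex F
assert a & b == 0b0010         # result is hex 2 again

# The same idea applied nibble-by-nibble to a whole 32-bit value,
# using the bit mask from the lecture; the input is just an example.
value = 0x12345678
mask = 0xFF0FF30F
print(hex(value & mask))       # -> 0x12045208
```

doing it mentally is the same process: translate each nibble to binary, AND the bits down the line, and translate back to hex.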
i'm just going to get into negative numbers on the slide here, and not go into digression why negative numbers are represented this way.
negative numbers are the "two's complement" of the positive number.
a "two's complement" is defined as a "one's complement" plus one,
and "one's complement" is defined as flip all the bits.
i'll use these examples and not rewrite them. so i said "two's complement" is defined as a "one's complement" plus one, and "one's complement" is defined as flip all the bits.
let's say i have the number one, so it is '00000001' in binary or 0x01 in hex.
if i "one's complement" it,
then i flip everything so it is '11111110'. in hex, i could take those four bits and turn them back into 'E' and i can take these top four bits and turn those into 'F', because they are '1111'.
so that is the "one's complement" value, and if i add one to that, i get '11111111'
and that is 'FF' and how you represent '-1'. so this is like a one byte sort of number here,
it is 8 bits, so one byte positive number is one, and the negative number is 'FF'
and so basically between 0x01 and 0x7F those are all your positive numbers that can be represented through one byte,
and from 0x80 to 0xFF those are all the negative numbers. so 'FF' is '-1', 'FE' is '-2',
and you count backwards from 'FF' to get your negative numbers and the exact same holds when you have 32 bits worth of things 'FFFF….' is '-1',
then you subtract one to get '-2'. so that sort of makes sense, '-1-1 = -2'. so i just show the simple example here,
if this is '4',
so we've got '00000100' and if i flip all the bits '11111011' is going to be the "one's complement" of '4',
and i take that plus one: one plus one is zero, carry the one; one plus one again is zero, carry the one; then zero plus one is one,
so i get '11111100'
and that is what we see over here so 0xFC is '-4' and so if you start at 0xFF
which is '-1', you count backwards:
0xFE is '-2', 0xFD is '-3', and 0xFC is '-4'. again, this is why you want to know your conversions back and forth.
you need to be able to count backwards, forwards, change the radices in any direction,
jump over to decimal, subtract 5, go back over to hex, etc.
so counting backwards and forwards,
if you spend any amount of time with it you will pick it up, but it's the kind of thing you need to know in order to deal with this stuff frequently. or you can always have your calculator open, and that works too.
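the negation walkthrough above (flip the bits, then add one) can be sketched as a small function; this is just an illustration of the rule the lecture states, for one-byte and 32-bit values:

```python
def twos_complement(x, bits=8):
    """Negate x the way the lecture describes: take the one's
    complement (flip all the bits), then add one, truncated to
    the given number of bits."""
    mask = (1 << bits) - 1
    ones = ~x & mask          # one's complement: flip all the bits
    return (ones + 1) & mask  # two's complement: add one

assert twos_complement(1) == 0xFF    # -1 as one byte
assert twos_complement(4) == 0xFC    # -4 as one byte, matching the board
assert twos_complement(1, bits=32) == 0xFFFFFFFF  # -1 in 32 bits
```

counting backwards from 0xFF then falls out naturally: 0xFF is -1, 0xFE is -2, 0xFD is -3, and so on.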
so, a little about architecture, 'cause architecture was in the title, so i have to talk about it a little bit.
Intel is a CISC or 'complex instruction set computer' architecture.
in CISC architecture, you keep layering on more and more instructions.
if you have some new thing that you find lots of people are doing frequently, like if a compiler needs to generate a specific set of instructions,
so you can make a new single instruction that does all of that work in one instruction. so with CISC you keep adding more things in. in particular, on Intel we have 'variable length' instructions; that means you can have one byte instructions, or they can theoretically be up to 16. i am not sure what
the longest valid one is; i think it is 15, but the 16 comes from some naive computation i made based on what the manual says at the beginning about how instructions are put together. theoretically 16 could be the biggest, but i think it is a little smaller than that. so instructions will vary between one and 16 bytes.
for anyone out in the internet watching this if and when it becomes public please feel free to tell me what the maximum instruction is. i'm going to make a plea to the people out in the internet to please correct me.
i am sure they will correct me with gusto!
the other major architecture is called RISC, or 'reduced instruction set computer', and as you might guess from the name,
it is a pushback from the stuff that was going on with CISC where they would keep adding instructions.
this came out of some academic work where they said 'hey, most of the time we are only using a small subset of the instructions.
the compiler writers don't know how to use all these things and can't figure out the best way to generate code from high level code
that uses all the instructions placed into the architecture, so we are going to try to figure out what the reduced set of instructions is so we can still get
pretty much everything done, and maybe we'll need to use a couple more instructions, but we'll still be able to get everything done'.
so most all of the other major architectures are RISC.
PowerPC, ARM, SPARC, MIPS. and
these also typically have more registers, so whereas
Intel has only a very few registers that you can use,
RISC machines have a ton of registers. because they have fewer instructions,
they need
to be holding more things in registers and
juggling more data.
and frequently they have fixed size
instructions. whereas Intel instructions can be between one and 16 bytes,
RISC architectures are typically fixed at whatever data size is for
that particular CPU. so if the data size is 32 bit architecture, the instructions are 32 bits. 64 bit architecture, the instructions are 64 bits.
So, a little bit about 'endianness'.
so this comes from uh...
Jonathan Swift's "Gulliver's Travels", and it has to do with pointless wars,
between uh...
England and France, so
there's an allegory: England and France were going to war over nothing particularly meaningful.
in "Gulliver's Travels",
Gulliver came upon people who were having wars based on whether you should crack your hard-boiled egg at the 'big end' or the 'little end'. take your hard-boiled egg, hit it on one end or the other, and eat it.
what it means is that it does not matter; there is no functional difference. 'endianness' is the
same thing with
computer architectures: there's no functional difference
in which way we do things. but if you
have a 'little endian' architecture,
what that means is
you store the 'little end', the least significant byte, first in memory.
so if you have
low addresses in memory, let's say your addresses start at zero:
if you are doing 'little endian' for the number '12345678',
you would take '78',
that is one byte, and put that at address zero.
and then you would put '56', then '34', then '12'. so your 'little end' goes to the lower addresses in memory.
with 'big endian' it is the exact opposite.
you just start with you know
address zero would have '12', then '34', etc.
Intel is 'little endian'; you have to keep that
in mind.
things are stored
'big endian' in registers.
so when you are looking at a value in a register, the value '12345678' is '12345678' in the register.
but if you go back out and look at the value in RAM, and RAM is displaying the value as a sequence of sequential bytes, it will look like '78563412',
and you will have to mentally flip that back around.
be aware of that: when you look at RAM, stuff will be stored 'little endian' on Intel.
when you're looking at the value in a register, it will be a value that you as a human,
an English speaking individual, would read
with the most significant bits and bytes at the left hand side, like English speakers are used to.
and uh... so the other architectures that i was talking about, besides Intel,
they tend to be
either 'big endian' or 'bi-endian', meaning it can go either way.
so PowerPC technically
you can flip it back and forth between 'big endian' and 'little endian', but by default they are typically always 'big endian'.
this is a simple picture of what i was describing just now. if we pretend this is our RAM,
and uh...
we say the RAM has a sequence of byte size cells stored in it.
then, on Intel you may have hex 'FEEDFACE'
and uh...
this will be stored with its 'little end' first, right.
so 'CE' is the least significant byte,
that goes into the lowest address (0x0). etc.
and 'big endian' is the exact opposite.
so the 'big end' goes at the lower address.
so that's a good way to visualize this.
in the register, everything always reads as hex 'FEEDFACE',
but in memory if you are trying to read memory out:
on 'big endian' it would make sense: you'd look at memory, maybe displayed horizontally from left to right, and you'd be like 'oh, FEEDFACE, right',
but on Intel you have to think 'ok, stuff is going to be backwards'. and another thing, just to clarify:
'endianness' has nothing to do with anything below the byte level. sometimes people get confused and think 'my least significant bits are flipped around as well, so within a single byte the least significant bit gets swapped from one side to the other'; no, it only has to do with the byte level.
'CE' here is 'CE' there; you do not do any bit level flipping around, only byte level. so when i say byte level, that means if you have a short that is 2 bytes, those 2 bytes can be flipped around in RAM, because it is 'little endian' in RAM, while in a register it would still be 'big endian'.
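the two byte orderings described above can be demonstrated with Python's struct module, where '<' packs little endian (Intel style) and '>' packs big endian; 0xFEEDFACE is the value from the slide:

```python
import struct

# Little endian (Intel): least significant byte at the lowest address.
little = struct.pack("<I", 0xFEEDFACE)
assert little == bytes([0xCE, 0xFA, 0xED, 0xFE])

# Big endian: the 'big end' goes at the lowest address instead.
big = struct.pack(">I", 0xFEEDFACE)
assert big == bytes([0xFE, 0xED, 0xFA, 0xCE])

# The bytes themselves are untouched: 0xCE stays 0xCE in both;
# only the byte order differs, never the bits within a byte.
```

reading Intel RAM in a debugger is exactly the 'little' case: you see CE FA ED FE and mentally flip it back to FEEDFACE.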
uh... digging more into registers,
what it is, is a small memory storage area built in to the CPU proper (the processor),
but it is still volatile memory.
that means if you power off your CPU, you will lose the data in those registers.
and so on Intel we have 8 'general purpose' registers,
plus the instruction pointer, which is a pointer that we will sort of be dealing with
indirectly.
when you use a 'call' instruction, that is moving the
instruction pointer, because
the instruction pointer always just
points at the next instruction that we want to execute.
and so we can't just move
values into it directly;
we use certain instructions which are going to manipulate it.
so if we 'call', use a call instruction to go to some sub-routine,
well you are implicitly changing the instruction pointer
because you are saying 'i am pointing you at the new code that i want to execute'.
you've got the 8 'general purpose' registers where you can store anything you want in them. they have
different conventions that are
used on them, but
ostensibly they're general purpose,
and uh...
the instruction pointer.
and two of the 8 are not really 'general purpose', so you end up with 6 that are 'general purpose'.
this class is all about x86-32,
so in 32 bit architecture,
the registers are 32 bits long, which is 4 bytes.
on x86-64, the registers are 64 bits (8 bytes).
and actually, just to call attention to this right here,
down at the bottom of some of the presentation slides is a note like this 'Book p.25',
and things like that
looks like we don't have the books distributed; i have a note here
saying i will mail all the books to students when the last three arrive,
so it looks like we were short on books,
and they weren't handed out here so as not to
play favorites or something.
it's not that
the class is about teaching to the book, but it was
something where
i wanted something that could give you a different perspective on x86,
basically, other than the manuals.
i didn't bring the manuals, because they're big and thick,
but as i usually like to say,
to take this class you don't have to read the manuals.
there's a smaller book
which will give you an alternate perspective, so if you want to go back
through and see how the book describes stuff, as opposed to how i
described it,
i kind of call out where in the book you can find the different instructions
and different architectural
references.
Intel has register conventions
which are basically intel's guidance where they say for compiler makers
or anyone who wants to hand code assembly
here's how you should deal with
the registers on the system
you can think of it like syntax guidance in C, or something like that. people just put out guidance that says 'do it this way, so when someone else comes along and tries to read your code, they will understand it.' similarly, you don't have to use these register conventions. however, if the compiler maker uses these conventions, then when someone comes along they'll be able to understand the code. that's useful for
even the compiler maker, so that they
can try to understand
if something's going wrong with the code they're generating, things like that.
so
anyways
in this class i marked these as green. also, i should say
these physical slides are out of date; i've updated them for the class, but they had already printed the hand-outs for the class.
so update them if necessary.
green are the ones we will actually see in this class. all of these are conventions, but you need to be aware they are not always followed.
it's more just like:
if you see something being used a certain way, you can say 'aha, it looks
like they're using it per the convention', but
at other times they're just using it as some scratch register.
for instance
'EAX' is green and it says in this class you will see 'EAX' being used to return the value of some function. so if i call into a function, and the function returns zero, one, or negative one, or something like that.
the assembly, as you're going to see, is going to have some move into the 'EAX' register right before a function returns.
that means the register is being used, per the convention,
to hold the return value before you exit the function.
that is something we will actually see in this class.
'EBX', on the other hand, i put as…
i simplified some of this: 'EAX' is also used as the 'accumulator', meaning that you can add stuff into it and build up values in 'EAX', but i kind of left that out here because you do not see that in this class.
again, i have the Intel Architecture Reference here; if you want to see what
they say about register convention use, go look it up, but i have simplified these for our purposes.
we will see 'EAX' used to store return values in this class.
'EBX' pointing to the base of the data section, we are not going to see.
i think we may see that in the 'life of binaries' class, but i'm not sure yet because i'm still working on the material.
i know it is used in GCC generated code, and also position independent code,
meaning code that can go anywhere in RAM. Windows does not generate position independent code;
Linux and GCC do have the ability and i think 'EBX' is frequently used when you are generating position independent code. it is used for that but we will not see it in this class.
'ECX' on the other hand,
can be used as a counter
for the certain operations that are going to be
doing the same thing over and over again
in that case, 'ECX' is used as a counter and is decremented by one
each time
you go through the operation. so you may do:
one copy, decrement ECX; one copy, decrement ECX; and when ECX reaches zero, that means stop copying.
that's one convention you can see 'ECX' used for, and if it's used
for that, it will be obvious, because there will be the repeat-prefixed
instructions, which
we'll probably get to by the end of today.
'EDX' as IO pointer: you're going to see that in Intermediate
x86 when we talk about hardware IO,
but not in this class. 'ESI' and 'EDI' can be used for source and destination: 'SI' source, 'DI' destination, for the repeating instructions or string operations i was talking about, where something is looping and doing the same thing over and over.
it may, for instance, copy from source to destination and decrement ECX: copy from ESI to EDI, decrement ECX. that's how these may be used by convention. we are definitely going to see 'ESP' being used as the stack pointer.
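the ESI/EDI/ECX convention just described, as used by a repeated byte copy, can be sketched conceptually in Python; this is only a model of the loop the hardware performs, not real instruction semantics, and the memory contents are made up:

```python
def rep_movsb(mem, esi, edi, ecx):
    """Conceptual sketch of a repeated byte copy: move one byte from
    [esi] to [edi], advance both pointers, decrement ecx, and stop
    when ecx reaches zero."""
    while ecx != 0:
        mem[edi] = mem[esi]
        esi += 1
        edi += 1
        ecx -= 1
    return esi, edi, ecx

# Pretend RAM: a 12-byte string followed by 12 empty bytes.
ram = bytearray(b"hello world\x00" + b"\x00" * 12)
rep_movsb(ram, esi=0, edi=12, ecx=12)   # copy the 12-byte string
assert ram[12:24] == b"hello world\x00"
```

when you see ESI, EDI, and ECX all set up right before a repeat-prefixed instruction, this loop is essentially what is about to happen.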
we will get into the stack a lot, but basically the stack is just a data area. as you add data to the stack, the stack pointer will keep moving to point at the top data in the stack. 'EBP' is the base pointer: 'BP' for base pointer.
that is, when we have a stack, each function may get its own new stack frame, which is its own area where it can control the variables in that area, things like that, basically.
'EBP' by convention can point to the base of the stack frame and say "here's where this function's stack starts; anything between it and the top of the stack is owned by whatever function you're in currently".
'EIP' as i've said before, we don't have direct control over this, we have indirect control over this.
and that um...
this points to the next instruction that the CPU will actually execute. so you cannot just load some address into 'EIP', but if you use a 'jump' or 'call' instruction, that implicitly changes 'EIP' to go to wherever you jumped or called to, etc.
any questions at this point?
anyone on the phone have any questions?
moving on. beyond conventions for how registers are actually used, there are conventions for how functions agree not to destroy each other's registers, basically. this has to do with what are called 'caller-save registers' and 'callee-save registers'. 'caller-save registers' means: 'i am a function and i want to call another function; i am the caller. i'm going to call this other function, but i know that this other function, by this convention, is allowed to modify certain registers as much as it wants. so if i call this function,
i should assume that it's going to destroy these registers that i'm currently using.'
the registers are 'eax', 'edx', and 'ecx'. so if i'm going to call another function, i have to assume it's going to destroy whatever values i have stored in 'eax', 'edx', or 'ecx'.
therefore, i as the caller am responsible: if i have a value in one of these registers that i do not want destroyed, i had better save a copy before i call the next function.
and so basically uh...
for 'caller-save registers'
if the compiler generates code where the code currently has something in 'eax' or 'edx',
the compiler also needs to generate code that says before you call a function,
save a copy of that register off to the stack
and when you're done calling the function, restore it back from the stack so that nothing actually changed. that is 'caller-save registers'.
'callee-save registers' is just the exact opposite.
the caller is allowed to assume that the callee will not make any changes to 'ebp', 'ebx', 'esi', or 'edi'.
there is this guarantee that if i call a function, it will not smash my values in these four registers.
therefore, if the callee wants to use those registers, it is responsible for saving and restoring them. it can use the registers, but it needs to save them right away at the beginning,
then go ahead and use them, write whatever values it wants, and then right at the end it needs to restore them. then it returns to the caller, and the callee has made no changes to the registers, basically.
these are conventions that divide up the responsibility for who is in charge of saving which registers. the caller is in charge of a few of the registers, the callee is in charge of the others.
thereby, the compiler can try to generate code that does not use any of the callee saved registers, or something like that.
the reason why i have to talk about this is because when you are looking at assembly code, you can potentially see some of these registers being saved,
at the very beginning you will see these being saved, and you will have to say 'ok, is this a callee-save register?' and if so, that is why they are saving it: 'i can see it saving it at the start, and restoring it at the end'.
this is just for your own purposes, to recognize the point of these seemingly pointless instructions. 'i see that it saved it and restored it... it didn't do anything': there was no net change. it came in, saved it, restored it, returned; no net change.
what's the point of that? it's the convention that they have.
so it's a limited number of registers,
and it's these six right here, your real general purpose ones
'eax' through 'edx',
'esi', and 'edi': those are your real general purpose registers,
and therefore we have this convention to make sure they're not smashed by different functions.
i've been talking about 'eax', 'ecx', etc., and those are the 32 bit versions of the registers: bits zero through 31, 32 bits.
and the point here is: originally Intel was 16 bit, like i said. so in the beginning there was 'AX' ('A' for accumulator, 'B' for base, 'C' for counter); there was 'AX', 'BX', 'CX', 'DX', and they were the 16 bit versions of the registers.
when they moved to 32 bit, they tacked on this 'E', and it is 'Extended AX'. so you need to know that there are other, smaller forms of the registers that you can still access.
you may see assembly code trying to access just the 'AH' register, and you're going to say ''AH' register? we did not talk about that; we know 'EAX'.' so this is for your own purposes, to know that there are subdivisions of the larger registers.
you may see someone move something into a 32 bit register, they move something in EAX, but then they may only pull out a byte from 'AH' or 'AL'.
(inaudible question). you will see in the instructions if they are trying to grab some subportion of the register, they will refer to it as 'AX' or 'AL' or 'AH', etc.
and keep in mind, there is no name for the upper 16 bits of 'EAX'; there is no register named for anything up there. there's only the lower 16 ('AX'), and that lower order 16 as two different bytes ('AH' and 'AL'). you can access each of those independently.
so for these four registers, you've got all these forms where you can go down to byte form, etc.
for the next set of registers, 'SP' to 'ESP': originally you had 'SP' (stack pointer) and 'BP' (base pointer), which were extended to 'ESP' and 'EBP'.
and you can see none of these have shorter forms where you can access just a byte of them at a time. this is for your own purposes to know, if you ever see any of these short forms of them referenced, it's just some subset of this 32 bit register.
similarly, when you go to 64 bit, there's a longer register with 'R' as the prefix, such as 'RAX', where 'EAX' would be a subdivision, and 'AX', 'AH', and 'AL' would be subdivisions.
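the subregister layout just described (EAX containing AX, which splits into AH and AL) can be modeled with masks and shifts; the value 0x12345678 is a made-up example:

```python
# EAX = 0x12345678 is a hypothetical example value.
eax = 0x12345678

ax = eax & 0xFFFF          # AX: the low 16 bits of EAX
al = eax & 0xFF            # AL: the low byte of AX
ah = (eax >> 8) & 0xFF     # AH: the high byte of AX
# Note: the upper 16 bits of EAX (0x1234 here) have no register name.

assert ax == 0x5678
assert al == 0x78
assert ah == 0x56
```

so when code moves something into EAX and then reads only AH or AL, it is pulling out exactly these byte-sized slices.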
there is one special purpose register which i'm going to talk about a bit, except i'm going to leave out a lot of the details on this one; i'm only going to ask you to remember a little bit about it.
there is a register called EFLAGS. it is 32 bits ('E' for extended; there used to be a 16 bit version called just FLAGS). this register is a series of single bits.
and each of those bits, almost all of them, have specific names. there can be something for instance called the 'zero flag' and there will be a specific bit positioned within EFLAGS which corresponds to the 'zero flag'.
what happens is, off to the side whenever the CPU is doing some calculations, the results of calculations will be indicated in this EFLAGS register.
let's say you're doing an add instruction, and you added one to negative one, the result would be zero, and the 'zero flag' would get set.
i had a question in the last class, which was actually pretty good: 'why don't we just check the register? if we want to see whether the value is zero, why don't we just go and check that the register is zero?'
the convenience of having the centralized EFLAGS, which gets a bunch of flags set after every instruction, is that if we want to check whether some value is zero,
we do not know whether that result is in register EAX, EBX, ECX, etc. by having a central place where, if the result is zero, the 'zero flag' gets set, you can just compare against a single flag and say
'if zero flag is set, then i'm going to do one thing. if not, i'm going to do another thing'. so the EFLAGS is used for most all of your, probably all, conditional logic.
(inaudible question) every single instruction, behind the scenes, the hardware is updating these flags, toggling things on and off, basically, based on the past instruction. the other thing i should say is that the manual will say 'this instruction potentially modifies these flags in the EFLAGS register',
so a subtract instruction will modify a certain number of flags, a multiply instruction will modify a certain number of flags, and an XOR or an AND will each have some set of flags they can potentially modify based on the output result. that's why it is a potential modification,
because the result could be zero or it could not be zero. you can't just say the AND instruction always sets the 'zero flag', because if the result was not zero it doesn't set it. after every instruction, the flags in this register are updated.
therefore, there will be other instructions which will check these flags to say 'if the flag is set, i'm going to do something. if it is not set, i am going to do something else'. that is where conditional logic comes from, basically.
the only two things i want you to worry about in this class, just so you can have some notion, are the 'zero flag' ('ZF'), where if the result of the last instruction was zero then this flag gets set to one, otherwise the flag will be zero; and the 'sign flag' ('SF'), which is set equal to the most significant bit of the result.
going back to our numbers on the board, we said before that 'FFFFFFFF' equals negative one, down through '80000000' equals negative two billion something. you have a four billion range with your 32 bits.
and we have '00000000' equals zero, and '7FFFFFFF' equals positive two billion. these are our ranges for positive and negative numbers in 32 bits. if you take the most significant nibble and turn it into binary:
'8' is 1000, 'F' is 1111, '7' is 0111, and '0' is 0000. the commonality here is, for negative numbers they always start with '1' at their most significant bit. and for positive numbers they always start with '0' for their most significant bit.
the 'sign flag' has to do with, if the result is '854321' or whatever, it will be something that starts with '1' (because binary of '8' is 1000). if that is a negative value, the 'sign flag' will be set to '1'. and if it is a positive number, the sign flag will be set to '0'.
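just to make that concrete, here's a rough sketch in C of how those two flags are derived from a 32-bit result. the helper names (zf_of, sf_of) are illustrative, not anything from the manual:

```c
#include <stdint.h>

/* ZF = 1 when the result of the last instruction was zero */
int zf_of(uint32_t result) {
    return result == 0;
}

/* SF = the most significant bit (bit 31) of the result */
int sf_of(uint32_t result) {
    return (result >> 31) & 1;
}
```

so adding one to negative one gives zero, and zf_of reports 1; anything starting with nibble '8' through 'F' has bit 31 set, and sf_of reports 1.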
negativeness or positiveness is all in the eye of the code. the hardware doesn't care about negative or positive, that is a notion we impose through a sequence of bits. the hardware only knows there is a sequence of bits.
and that's why for instance
if you have an add instruction or something like that (i said after an add instruction it updates a bunch of bits in this EFLAGS register), the hardware actually does the add as if it were adding both signed numbers and unsigned numbers,
and it sets the flags accordingly in EFLAGS. basically, if this had been an unsigned add, maybe the value would have been considered negative at the end,
but if it was considered positive then the flags would be set differently. you could have some border case here, where you add this minus one; i shouldn't do this because it's going to overflow and not be real, but you can think of these as two border cases, where if i add these things together, they are not going to be positive anymore. these are positive numbers but they are too big, and can potentially overflow into negative. if i pick something in the middle range, if i took 5 something plus 4 something, the addition of those would be 9 something, which is in the negative range.
so the hardware doesn't care, the hardware only does addition of bits, the way bits are added. behind the scenes the hardware does instructions as if they were dealing with both positive values and negative values,
sets the flags accordingly here, and the compiler is the one in charge of putting instructions which interpret those values as positive or negative. we'll kind of see that later.
you'll see where if you make a value signed, you'll get a certain sequence of instructions, whereas if you make it unsigned you'll get a different sequence. because hardware doesn't care, only humans care.
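as a sketch of that idea: the same bit pattern compares differently depending on whether the compiler emits signed or unsigned comparison instructions. the helper names below are illustrative; the flag names in the comments are the ones those conditional jumps typically test:

```c
#include <stdint.h>

/* signed view: the compiler would emit something like JL, which
   consults the sign/overflow flags */
int less_than_signed(uint32_t a, uint32_t b) {
    return (int32_t)a < (int32_t)b;
}

/* unsigned view: the compiler would emit something like JB, which
   consults the carry flag */
int less_than_unsigned(uint32_t a, uint32_t b) {
    return a < b;
}
```

so 0xFFFFFFFF is less than zero when treated as signed (it's negative one), but not when treated as unsigned (it's about four billion). the hardware set flags for both interpretations; the compiler chose which ones to test.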
any questions about EFLAGS? so, ZF gets set to '1' if the result is zero, and SF gets set to whatever the most significant bit is.
alright! first instruction. 'NOP' (the no-op instruction) does nothing. it's not entirely pointless, though; it has its uses.
you will see this little star when we have a new instruction. NOP does nothing, just there to potentially pad bytes or something like that.
you put the instructions in, and maybe the Intel optimization guidelines say they want all functions to start on a 16-byte boundary, so if the previous function did not end exactly on a 16-byte boundary, you can throw in a few NOPs until the next function starts on a 16-byte boundary.
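the padding arithmetic there can be sketched in a couple of lines of C. the function name is illustrative, not part of any toolchain:

```c
/* how many NOP bytes to insert after a function ending at end_addr so
   that the next function starts on a 16-byte boundary */
unsigned nop_padding(unsigned end_addr) {
    return (16 - (end_addr % 16)) % 16;
}
```

so a function ending exactly on a boundary needs no padding, and one ending one byte past it needs fifteen NOPs.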
they are also used to make the canonical buffer overflows more reliable, but that is next class.
all right so
the late breaking NOP news, amaze your friends by citing this awesome x86 trivia: the NOP instruction quoted from the manual,
the one-byte NOP instruction is an alias mnemonic for the exchange (XCHG) EAX, EAX instruction. you can guess what XCHG EAX, EAX does: it takes whatever is in EAX and exchanges it with whatever is in EAX.
it does nothing, taking the same value and putting it back into the same register. i had never looked at the NOP instruction, you look at it and say 'it does nothing!', but behind the scenes it is actually this exchange instruction.
for the people who know x86 and haven't taken this class, you can go up to them and say 'that NOP instruction, how is it implemented behind the scenes? what does it actually do?' and they will not know, because no one ever looks up NOP in the manual; it does nothing. there you go, your first assembly instruction: NOP.
now we're gonna talk about the stack a bunch. the stack is a conceptual area of RAM. by conceptual area i mean, one, a stack is a certain type of data structure, and two, it can be anywhere in RAM; it's just up to the operating system to decide where to put it.
so when the operating system starts the program, it reserves some chunk of RAM and says 'this is where i'm going to put the stack'. and a stack as a data structure is last in, first out, or first in, last out, however you want to call it.
it's a last in, first out data structure where data is pushed onto the top of the stack, and then it's popped off the top of the stack before you can get anything else.
the way the computer science classes talk about it, you're at a buffet and there's a stack of plates: if i put one plate on and another plate on, i have to take the second plate off before i can get to the first plate.
i can't jump under and pull them out, because they're on a spring and they've got the guards on the sides. you must take off the stack the thing that you put on it last.
now, by convention on x86 the stack grows toward lower memory addresses. you need to have some mental flexibility with this, because you'll see it represented different ways by different people,
but in this class i'm gonna be drawing low addresses at the bottom and high addresses at the top, like what you saw with our 'endianness' diagram before.
so zero is at the bottom and 'FFFF' is at the top. what you need to know about the stack is that it grows toward low addresses. that means things are going to be added going downwards, and the top of the stack is the thing which is lowest numerically.
so it goes towards low addresses and top of the stack lowest numerical address. we'll get used to it once we start drawing a bunch of pictures of it.
so we talked about register conventions thus far, and we said ESP always points at the top of the stack. specifically, it is pointing at the data at the top of the stack, and it's pointing at the lowest address,
the least significant byte of a potentially little-endian-stored 4-byte or 2-byte value, etc. the point is that ESP is always following the top of the stack, and
therefore
anything that's at a lower address than ESP we're considering undefined; that's not really there as far as the stack is concerned. there may be real data there, but it's all stuff that should never be accessed by the program. the compiler generates things so that it only ever accesses data that's at the top of the stack or at higher addresses, basically.
and so, amongst its many purposes, the stack keeps track of which functions were called before the current one. going back to the notion of a stack frame, i can say function main, for instance, is right here, at higher addresses.
when main calls a subroutine, function one, a new stack frame is started at lower addresses. that keeps growing toward low addresses, so the stack keeps track of which sequence of frames has occurred. as you call a function it gets a new stack frame, which is added lower numerically but is the top of the stack,
and then when you return from a function you take away that stack frame and you go back up numerically, but lower on the stack. we'll be showing plenty of pictures of this. the stack is where you hold local variables, parameters passed to the next function being called,
and the caller- or callee-save registers, stuff like that; it's all stored on the stack. really, understanding this is very important, so ask questions as we go along if i confuse you too much. low, high, numerical, that sort of thing.
our second instruction here is going to be push. according to the manual you can push a word (a sixteen-bit value), a double word, or a quad word onto the stack, but for our purposes in this class we're really only going to be talking about dwords, pushing four bytes onto the stack at once.
and the value which you're pushing can either be an immediate, which is a constant value that's hard-coded into the assembly instruction, or it can be a value in a register. so you can say push with some constant, and it'll push that 32-bit value, or you can say push eax, and whatever is in eax, a 32-bit register, gets stuck onto the stack.
(inaudible question) and so, a side effect of the push instruction: we're putting something onto the stack, and i said the stack pointer always points at the top of the stack. so if we push anything onto the stack, the stack pointer needs to move down by four bytes, because we put a four-byte thing onto the stack. we're going to decrement esp by four so that it is still pointing at the top of the stack, the last data on the stack; esp should always be pointing at it. so data will get added to RAM wherever the stack pointer is, and then esp will be subtracted by four.
so this is sort of a visualization of it: lower addresses are at the bottom of the diagram, higher addresses up here. this is the stack before we execute this instruction, push eax.
and we're just gonna say, okay, right now eax has the value three, just picking some arbitrary starting value. we're gonna say right now esp has the value '0012FF8C'. this is a RAM address, so this is just your main memory, virtual memory.
esp is pointing right here, which is the top of the stack. right now the thing that's on top of the stack is the number two. when i execute this push eax instruction, esp gets moved down by four, and this value three which was in eax
gets put onto the stack right here. so esp still points to the top of the stack; the three got put into RAM, and esp changed. i said implicitly, behind the scenes, this esp register changed: here it was '8C' and here it's '88'. '8C' minus four is '88': 'C' is 12, minus 4 is eight.
the corollary to push is pop. if you want to take something off the stack you just use pop, and pop takes whatever's on the top of the stack, whatever esp is pointing to right now, takes it off and puts it into whatever register you specified, and then increments esp by four in order to move it up and point at the new top of the stack.
it's the exact same thing backwards. if i then call pop eax, where eax has 'FFFFFFFF' in it at the start, esp is pointing at 0x0012FF88. now the thing is, like i said, everything below esp is considered undefined, but in reality there is data there;
it's just that you should never ever be accessing it; the compiler should never generate anything that accesses it. therefore we say that when i pop this '3' into the eax register, this value right here becomes undefined. there's nothing there as far as you're concerned, but actually the value is still there; you just copied the value into the register and moved the stack pointer.
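a toy model of push and pop in C can make those side effects concrete: RAM as a byte array, esp as an index into it. all the names here (ram, esp, push32, pop32) are illustrative, not real APIs:

```c
#include <stdint.h>
#include <string.h>

static uint8_t ram[64];
static uint32_t esp = 64;               /* stack starts at the high end */

void push32(uint32_t value) {
    esp -= 4;                           /* ESP moves down toward low addresses */
    memcpy(&ram[esp], &value, 4);       /* value lands at the new top of stack */
}

uint32_t pop32(void) {
    uint32_t value;
    memcpy(&value, &ram[esp], 4);       /* read the top of the stack */
    esp += 4;                           /* ESP moves back up; the old bytes are
                                           still in ram[] but now "undefined" */
    return value;
}
```

note that pop32 never erases anything: the bytes stay in ram[], and only the stack pointer moves, which matches what happens on real hardware.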
any questions on push or pop?
now we're going to talk a little bit about calling conventions, and this is again one of those things that matters to the people who are interested in reverse engineering and things like that.
so calling conventions have to do with how you pass parameters to functions, how you get return values back, and things like that. so if i have function main and i want to call some subroutine, the question is how do i pass some parameter to it. there are only two calling conventions we're really going to get into
in this class; there are a bunch more which you can look up on wikipedia and other places. here we're gonna talk about 'cdecl', or C declaration, and the standard call calling conventions.
so 'cdecl', C declaration: this is the most common calling convention, typically the default for most C code, and some C++ code may default to this also. the thing here is that function parameters, whatever you are going to pass into the function,
are pushed onto the stack using push instructions from right to left. so say i've got some function like 'printf', where you have a format string and then some number of parameters afterwards that get interpolated into that format string.
so with 'printf', '%d' and then my_var: with the cdecl convention you push things from right to left, so you would push the value in my_var onto the stack, then you would push the address of the string '%d', and then you would call the function. so you push them right to left and then you call the function in cdecl.
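a tiny model of that push order: pushing right to left means the leftmost argument ends up at the lowest address, on top of the stack, which is where the callee looks first. the names (argstack, top, push_arg) are illustrative, and 0x4000 just stands in for the address of the "%d" string:

```c
#include <stdint.h>

static uint32_t argstack[8];
static int top = 8;                    /* index grows downward, like ESP */

void push_arg(uint32_t v) { argstack[--top] = v; }
```

after push_arg(my_var_value) then push_arg(format_string_address), the format string address is on top and the value sits just above it, exactly the left-to-right order printf expects to find them in.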
now, there has to be an agreement between the person calling the function and the function itself. if you have a mismatch in your calling conventions, the called function will get messed up, because it'll be expecting data passed in a certain way, like right to left, and you may be passing it in a different way. both the caller and the callee must agree on the calling convention.
and so the called function, the one which is accepting these parameters, the first thing it does is save the old frame pointer. so it takes the old frame pointer and says 'i'm going to save that address onto the stack, and then i'm going to set up my own new stack frame. i'm going to point ebp here, and say everything below this is mine. and i saved a copy of the previous guy's, so that when i'm done i can go back and put that back to being the case.'
and then eax, or edx:eax, returns the result. this is the convention i was talking about in terms of returning something back from the function which was called.
you're going to stick the value in eax, or if the value is 64 bits wide rather than 32, you put half in eax and half in edx. the last bullet is the one you want to put a star next to:
this is where cdecl and standard call differ from each other. in cdecl, the caller is responsible for cleaning up the stack for those parameters that it passed to the callee. what that means is, right at the beginning we said function parameters are passed on the stack from right to left, so you push, push, and call.
and so in cdecl, the caller, the one who pushed them onto the stack, is the one responsible for popping them off and getting rid of them, cleaning up the stack.
down here in standard call, the only difference is that the callee is responsible for cleaning up any stack parameters. that means: i'm a function, i've got two parameters, and therefore i know i may as well go ahead and clean those two parameters off the stack before i return to the guy who called me. so in standard call, the callee is responsible for cleaning them up. and you'll see standard call being used by the Win32 API type of stuff.
you'll actually be able to tell, based on the assembly, whether something looks like it's using standard call or cdecl. either you'll see something push parameters on the stack, a call instruction, and then the stack pointer being moved to get rid of the parameters immediately after the call instruction; if you see that, you have to think, okay, that's cdecl: the caller pushed them on, the caller cleans them up.
if you see it pushing stuff on the stack and calling something, but there's no cleanup after the call, and then you go into the function that got called and there's a special return instruction which implicitly removes stuff from the stack, then that's standard call. but we haven't talked about return instructions yet.
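as a sketch of who removes the parameter bytes: in cdecl the callee's plain RET removes nothing and the caller does something like ADD ESP, n afterwards, while in stdcall the callee's RET n removes them itself. the function names below are illustrative; each callee model just reports how many parameter bytes its return removed:

```c
/* plain RET: the callee removes none of the parameter bytes */
unsigned callee_ret_cdecl(unsigned arg_bytes)   { (void)arg_bytes; return 0; }

/* RET n: the callee removes all n parameter bytes itself */
unsigned callee_ret_stdcall(unsigned arg_bytes) { return arg_bytes; }

/* bytes the caller still has to remove after the call returns,
   e.g. via ADD ESP, 8 */
unsigned caller_cleanup(unsigned arg_bytes, unsigned cleaned_by_callee) {
    return arg_bytes - cleaned_by_callee;
}
```

either way the stack ends up balanced; what differs is which side of the call holds the cleanup instruction, which is exactly the telltale you look for in the disassembly.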
(inaudible question) did the function which gets called have to push it back on? so actually, the function which was called never took the stuff off the stack. basically it's there on the stack, in the previous function's stack frame, and this function will read that data out of the previous guy's stack frame and just leave it there. and then when it returns, based on cdecl or standard call, it'll either just destroy its own stack frame, or destroy its own stack frame plus those parameters. it doesn't copy them off and hold them in registers; they're sitting there on the stack the entire time, and it's just a question of who removes them from the stack.
now we're going to talk about the call instruction, so we can see some of this being used in practice. our fourth instruction so far.
call's job is to transfer control to a different function; it changes the EIP, essentially. we know that programs are organized as a series of functions, where this one has some functionality and you want to call it from that one. call is the actual assembly instruction which implements this.
in the absence of a call, basically, the CPU keeps incrementing EIP and just keeps moving down: it does this instruction, then the next instruction, and the next instruction, and the next instruction, until we get to things like this which alter where you go. the rest of the time it just goes top to bottom, executing instructions: it adds however big the instruction was, so if the instruction was one byte, it adds one byte to EIP and executes the next instruction, and if the next instruction was five bytes, it adds five to EIP and executes the next instruction.
so the normal thing the CPU does is just execute instructions in sequence.
now, when the call instruction is executed, the CPU is not just going to go to the next instruction in sequence; it's gonna go wherever that call instruction tells it to go. it's implicitly modifying the EIP.
there are some side effects to the call instruction. it doesn't just change the EIP so that you start executing somewhere else: you call a function, and you want to have some way to return to the function which called it. if you want to just go somewhere and never come back, then you use a jump, or in C the notion is 'goto'; 'goto' says go there, don't worry about ever coming back. but we know in C, when i call a function, i expect it to do something, and when it's done it comes back to the next C line after the call.
and so, in order to implement that capability to go somewhere and then still come back to wherever you used to be, behind the scenes call pushes the address of the next instruction onto the stack. so when you call something, you're going to change the EIP to something else, but you're also going to put the address of the next instruction on the stack, so that later on, when that something else is done, it can take that address off the stack using the return instruction and go back to wherever it came from.
the thing that's pushed on the stack is sort of a little letter that says 'here's where i came from, please come back here later; when you're done, this is where you need to go back to'. and the target of the call is where you're going to. the return instruction, which we'll see next, uses that thing which was pushed onto the stack to change the EIP back to that destination address.
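the call/ret pair can be sketched as a toy model in C: call pushes the address of the instruction after itself and jumps, ret pops that address back into EIP. all names here (eip, retstack, sp, do_call, do_ret) are illustrative:

```c
#include <stdint.h>

static uint32_t retstack[8];
static int sp = 8;                          /* grows downward */
static uint32_t eip;

void do_call(uint32_t call_addr, uint32_t call_len, uint32_t target) {
    retstack[--sp] = call_addr + call_len;  /* the 'come back here' note */
    eip = target;                           /* transfer control */
}

void do_ret(void) {
    eip = retstack[sp++];                   /* pop return address into EIP */
}
```

so a five-byte call at 0x401000 to 0x401234 leaves 0x401005 on the stack, and ret lands right back at 0x401005.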
there are multiple ways you can specify the target of the call instruction: you can give either an absolute address or a relative address. an absolute address says 'i want to go to address 0x00401234', like, this is definitely the address i want to go to. the relative version says 'i want to go to some address that's, you know, hex 50 bytes past the end of this call instruction'. so relative addresses are relative to the end of the call instruction, which is the address of the next instruction.
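the relative-target arithmetic is short enough to write out. the function name is illustrative; the displacement is a signed value, so you can call backwards too:

```c
#include <stdint.h>

/* a relative call encodes a signed displacement measured from the end of
   the call instruction, i.e. the address of the next instruction */
uint32_t rel_call_target(uint32_t call_addr, uint32_t call_len, int32_t disp) {
    return call_addr + call_len + (uint32_t)disp;
}
```

so a five-byte call at 0x401000 with displacement 0x50 lands at 0x401055.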
you'll see them when we get to stepping through some actual code.
so you use the call instruction to go to a procedure, and you use the return instruction to come back. the return is implicitly using that thing on the stack that was pushed by the call instruction: call puts it on the stack, and then return takes it off and says 'okay, i'm gonna pop it off the stack and go back to wherever that address was'. and there are two forms of this, and they have to do with the cdecl versus standard call thing.
so in the cdecl case, you basically just take whatever's on the top of the stack and return there, and that's it; the function that called you is responsible for cleaning up the stack.
in the standard call case, since the callee is responsible for cleaning up the stack parameters, the callee also needs to get rid of a certain number of elements off the stack, however many parameters there were. if it had three parameters, it knows there are twelve bytes that need to be removed from the stack to get rid of those three parameters. so the second form of return is something you'll maybe see: at the very end of a function you see something like a return 8 or return 12 or return hex 20 or something like that.
that implies this is a standard call function, because it's taking whatever's on the top of the stack and putting it in the EIP, but then it's also getting rid of hex 20 bytes or hex 8 bytes, etc., off the stack. the type of return instruction will kind of imply what type of calling convention was used in that function.
as will the code that calls it, basically. we're almost to where we can talk about some pictures. the next one is the move instruction, and it comes in a variety of forms: you can have something in one register and you want to put it in another register,
so move one register to another register, or a register to memory, or memory to a register, or an immediate to a register, or an immediate to memory. again, an immediate is a constant which is hard-coded into the actual instruction stream,
so it'll be the move, and then in line with the rest of the bytes for that instruction will be the value to be moved, the 'one two three four' or something like that. we won't talk about the actual bytes which define an instruction until the very end, but
in the back of your mind you can have the idea that there's going to be some sequence of bytes which defines an instruction, and an immediate is a hard-coded constant. the big thing you need to know about the move instruction is that there's no memory-to-memory form.
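the workaround for that restriction is always the same: stage one value in a register first. the C below mirrors the instruction sequence, with 'reg' standing in for EAX; the function name is illustrative:

```c
#include <stdint.h>

/* adding two in-memory values requires a register as a middleman,
   because there is no memory-to-memory MOV or ADD */
void add_mem_to_mem(uint32_t *dst, const uint32_t *a, const uint32_t *b) {
    uint32_t reg;
    reg = *a;        /* mov eax, [a]   : memory to register */
    reg += *b;       /* add eax, [b]   : register += memory */
    *dst = reg;      /* mov [dst], eax : register to memory */
}
```

three instructions instead of the one hypothetical memory-to-memory add you might wish existed.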
this is something which bit me a lot when i was first learning x86 assembly. if i was trying to hand-code a few quick assembly instructions, i would want to say, well, 'i know i have this value in this C variable stored in memory somewhere, and i have this other value in that variable, and i want to just add that memory plus that memory and have the result be destined for the first memory location.' but you can't actually do moves or adds or subtracts like that. Intel generally does not give you forms of instructions which do memory-to-memory operations. so you typically have to take something out of memory, move it into a register, a memory-to-register transfer, and then take the register and the memory and move or add those. the main thing to remember is that you don't have any memory-to-memory form. and i'm going to make reference later to a form that i'm going to call, in this class, the r/m32 form;
that's how you specify a memory address. each of these places where i say you can move memory to a register, the memory will be specified with an r/m32 address form, and i'll talk about that later. so we start showing pictures of some of our stack frames. in my pictures, lower addresses are at the bottom, higher addresses are at the top, and the stack grows toward low addresses, so as i keep adding stuff to the stack frame i expect it to grow toward the bottom. and here we're just going to pretend that
main is the first function called in any given program. it's not; there's some initialization code which happens before main, but we're just going to pretend the very first thing executed in any program is main. so we're going to say, when the OS starts your program, it has main, and the first thing main does is reserve space on the stack for its local variables. it's going to do some subtraction from ESP in order to make a place for its local variables. now let's say that main wants to call some subroutine, so main is going to be the caller of the subroutine.
it's going to perform some caller-saves on some registers if necessary; it won't always do this, but if it's got values in these registers and it wants to make sure that the thing it's going to call doesn't smash them, it's going to perform caller-saves on those registers. the next thing it's going to do is push onto the stack the arguments that it wants to pass to the callee. like i said before, you go from right to left, pushing each of your parameters onto the stack. so that's going to be the rightmost parameter, the next-rightmost, and then the leftmost parameter, depending on how many parameters there are.
this is main's stack frame right before it actually calls the function. it has not yet executed the call instruction: it's pushed stuff onto the stack, and it wants to execute the call instruction next to call some subroutine, but it hasn't yet. now, when we execute the actual call instruction, the next thing that gets pushed onto the stack is that saved address of the instruction after the call, so that when the callee is done, it can just look at this value and say 'okay, that's where i need to go back to'.
it's sort of a little pointer that says 'please come back to this location when you're done'. okay, so now we've executed the call instruction and this got pushed onto the stack. the first thing that the subroutine does (and i need to go back and rename the subs so they don't look like a subtract function or something; this is just a generic subroutine)
is save the frame pointer for the previous frame. right here, this is the entire main stack frame, and we said by convention EBP always points at the start of the stack frame. we'll show it in later pictures, but you can think of EBP as pointing at the very beginning of the stack frame, saying 'this is where my stack frame starts'. so the first thing sub is going to do is take that pointer, which points at the top of main's local variables, and save it onto the stack, so that when it's done and ready to destroy its own stack frame, it just takes that saved pointer and puts it back into EBP, so that EBP again points at main's frame instead of its own. it saves the previous guy's frame pointer, and then it sets up its own frame pointer and says 'my frame starts right here;
everything from here down is going to be the subroutine's stack frame.'
Ami's question was: that's nice and you can do it by convention, but there's nothing enforcing that it needs to save the frame pointer of the previous thing. and that's correct.
basically, if the compiler wants to use this frame pointer convention, the compiler will generate instructions at the beginning of all these functions. we'll see this by the end; you'll get sick of these instructions. there are two instructions at the beginning of all these functions which save the frame pointer and point the new frame pointer at the beginning of the new frame. they don't have to do this: you can set compiler options on your code which say 'i don't even want to use frame pointers', and then you just have one huge stack and no notional difference between frames.
and that can mess with you as an analyst, if you're trying to analyze things that don't have this nice division between stack frames. but typically, in the default case, unless people explicitly don't want to use frame pointers (if you ever look at compiler options you'll see something like an 'omit frame pointer' option; if they use that, then yeah, you won't have separate stack frames), by default, in all the stuff we're going to see in this class, the compiler will always generate the thing that says save the previous guy's frame pointer and set up a new frame pointer starting at my frame.
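those two prologue instructions (push ebp; mov ebp, esp, usually followed by a sub esp, N for locals) and the matching epilogue can be modeled in C. this is a toy model where the stack is an array and esp/ebp are indices; all names are illustrative:

```c
#include <stdint.h>

static uint32_t stack[16];
static int esp = 16;                  /* grows downward */
static uint32_t ebp = 0xFFFF;         /* pretend caller's frame pointer */

void prologue(int local_slots) {
    stack[--esp] = ebp;               /* push ebp: save caller's frame pointer */
    ebp = (uint32_t)esp;              /* mov ebp, esp: new frame starts here */
    esp -= local_slots;               /* sub esp, N: room for local variables */
}

void epilogue(void) {
    esp = (int)ebp;                   /* mov esp, ebp: drop the locals */
    ebp = stack[esp++];               /* pop ebp: restore caller's frame pointer */
}
```

after prologue, everything below ebp is the new function's frame; epilogue undoes it exactly, which is why mismatched prologues and epilogues corrupt the stack.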
and so the next thing the subroutine is going to do is save any callee-save registers that it wants to use. if it needs to use some of those callee-save registers, it will go ahead and save them.
actually, the order this happens in can be different. i'm thinking of the last class: in GCC they actually save the callee-save registers after they make the local variables, so you can't guarantee that the local variables will be first and the callee-saves next, but somewhere in the stack frame there are gonna be some local variables and there are gonna be callee-save things. my note to myself is that i need to go confirm again that visual studio saves things in that order.
so the subroutine is gonna save any callee-save registers that it needs to, and it's going to create some space on the stack for local variables. typically, right at the beginning of a function, the compiler knows how many local variables you declared. you may have, you know, two ints or something like that; the compiler knows you've got two ints, that's eight bytes, therefore it's just gonna allocate eight bytes at the beginning. so typically the compiler will generate instructions to allocate all the space for your local variables on the stack at the start, and the instructions it generates will read and write into that local variable space based on where the compiler put the different local variables.
so now i have to kind of slide the stack frame up, and we're going to focus just on the subroutine's stack frame; we'll push main's frame off the top. now we're going to say, okay, that was nice: the subroutine had its local variables, had its callee-save space, and it can do its processing, etc.
but when the subroutine decides that it wants to call another subroutine, sub two, then it needs to go through the exact same process that main did. it's gonna take any caller-save registers that it needs to save and push those on the stack, to make sure that sub two doesn't destroy its registers, and then it's gonna push any arguments to sub two, right to left, onto the stack, and then it's going to call it.
i think i didn't put the actual call on the slide; i just said this is kind of
the last point, basically.
so basically at this point, you know, once it's put those arguments onto the stack, it
would call sub2, and we could go through the exact same sequence, right:
sub2 would
save the frame pointer, then
sub2 would save its callee-save registers, and then when sub2 is done
it would destroy its callee-save registers, destroy its frame pointer, and return back up. and when sub is done, sub would basically clean up these arguments, clean up its callee-save registers, clean up its local variables,
and so when any given function is done, it basically just destroys these things in
sequence,
and then it takes that saved frame pointer...
we go back all the way here.
so let's say that sub...
i think the reason i didn't put pictures in is because i can just go backwards like
that.
excuse the slides in reverse order. right, when
sub is done calling sub2,
it's going to go ahead and, you know, get rid of its...
when it's done calling sub2, it gets rid of those parameters that it passed to sub2, and when sub is actually done executing, it's going to get rid of its local variables, get rid of its callee-save
registers,
which have now disappeared.
and it's gonna take that saved frame pointer and replace EBP
with this value that points back up again at main's frame,
and it's gonna take this saved
return address, and using a return instruction,
that's going to pass control back to the
instruction after the call instruction, which was in main.
main is back to this, and main would clean up...
you know, if main is using the standard calling convention,
main would clean up these arguments.
any questions on, um...
on the stack frame
creation?
basically, what these pictures are supposed to show you is
this is kind of the maximum stuff that you'll ever see
in one stack frame.
but really,
at any given point,
you know, these things might not all be there.
the caller-save registers and arguments to the callee may not be there because
you're not calling anyone.
you probably are going to have some local variables if you do anything of consequence, right?
but you may not, actually. i guess in some of our sample code we don't even have
local variables, so
you may not even see local variables,
you may not see callee-save things,
you may not even see a frame pointer, so it's possible to have a completely empty
frame.
but generally speaking, this is the maximum kind of stuff that you can
actually expect to see in any given frame.
so any questions about this so far?
so our first example is going to be a super simple thing where a stack
frame has nothing except
a saved frame pointer and a return
address. for instance, we won't even have any local variables or callee-save
things.
and then once we get to example two,
we're gonna start seeing passing values to functions. we'll go through much more concrete things than this coming up. one other point i wanted to make here is that stack frames are actually sort of organized as a linked list, via this EBP saved frame pointer thing that i was talking about.
i said the first thing sub does is it saves the frame pointer of the previous frame, right? and so this used to be pointing at the top of that frame, and now we're saving it, and we're pointing EBP at the top of our current frame.
and so it's essentially a backwards pointer to the top of the previous frame, and as i said, the compiler is generating these instructions to always do that at the very beginning of every function.
then as you go down, you know, sub saves the EBP right at the top of its frame, and that points back at the top of the previous guy's frame; and then sub2 saves EBP at the very beginning, and it points back at the previous one. the stack frames are basically a linked list organizing things, where the most recent portion, which is the current function you're executing, is at,
you know, the top of the stack, at numerically lower addresses, and the previous frames, whoever called you and whoever called the guy who called you, are each links back in this chain of stack frames. that's kind of the point of having the EBP frame pointer point at the top of your frame,
and why you always save it at the beginning: it's to maintain this sort of linked list. and we'll see later, when we're using the debugger, that typically most debuggers are going to have a view where it'll say 'show me the call stack',
where it's trying to say 'show me which function i'm in, which function called me, and which function called that'. and basically all the debugger is doing is taking those EBPs and walking back through the chain. how it figures out which function is which is because the first value in the frame is that saved EBP,
and the last value in the previous frame is that saved instruction pointer that the call instruction put there. so when the debugger is going back, it can look at that chain and say "okay, what's the value right above the saved EBP?
i know that's going to be something that was pushed by a call instruction, and therefore i'm going to look at that address and say which function is that address in. this is an address that's within the bounds of main, therefore that's main's frame;
this is an address which is within the bounds of sub, therefore that's sub's frame."
i'm just going to describe what you're going to see with example one, and then we're gonna take our ten minute break.
and so with example one, we have exceedingly simple C code. we have main, and main calls the subroutine sub, and the only thing the subroutine does is return 0xbeef. right, so that's all it does: it doesn't have any local variables or anything else, it just has this immediate value, hex beef, and returns it.
and then main doesn't even do anything with that return value. it just calls the function, doesn't even take the return value, and then just returns 0xf00d anyway. so main will always return 0xf00d; calling this subroutine does nothing functionally.
you know, it does nothing, but what it does do is execute a call instruction. it will go into that, then we will see a new stack frame being made, and then we'll see that stack frame being destroyed. so this will be the most basic level of stack frame creation, destruction, function calling, etc.
these are all instructions we've learned thus far. right, we said push pushes stuff onto the stack; mov, like this, could move register to register in this case.
call is just saying i want to change my instruction pointer to whatever address is given; mov can also take an immediate and put it into a register. pop takes something off the stack. and return takes whatever saved return address was put onto the stack by the call instruction, pops that off, and goes there.
ten minute break, back at ten thirty, and then we'll start going into this in excruciating detail, where we will look at every single instruction.