DEF CON 21 - Marion Marschalek - A Thorny Piece of Malware and Me

>> MARION MARSCHALEK: Hello. Welcome to my talk about a thorny piece of malware. My name is Marion. I'm a malware analyst. I'm sparing you my second name because people seem to have problems pronouncing it anyway. And I work for the Austrian software company Ikarus Security Software. My talk is going to be about one specifically thorny piece of malware I analyzed, and I'm going to start out with some fancy fun facts about that sample. And the rest of the talk is all going to be about analysis issues I had when looking into it. I'm going to present two anti‑analysis techniques I encountered in the sample and two more analyst headaches that still pose problems for reverse engineers. The first one is exception handling that can obfuscate the execution path. And the second one is junk code that I encountered in there, which was pretty nasty at first glance but turned out to be easy to get past after all. Then I'm going to talk about binary analysis of C++ executables and about reverse engineering multi‑threaded applications. All right. Let's start. Now, all together, this is my favorite piece of malware. Now why would it be my favorite piece of malware? Well, I reversed it in and out from top to bottom, and I really had a lot of fun. It is a challenging piece of malware but not impossible to get through, even for beginners. It's not packed or encrypted but still provides enough interesting topics to research. But what does it do after all? Well, all together, I summarized it there. It's an Asian, multi‑threaded, non‑polymorphic, file‑infecting spy bot. What does it do? Like what spy bots do, it can produce screenshots. It can produce screen captures and send them to the C&C server. It can delete files. It can copy files. It can execute files. Most of all, it can update, so it can download a new version of itself and execute that one. So basically it can do anything the one in control wants it to. Anyway. What are the interesting facts about it?
The sample uses structured exception handling to obfuscate its execution path. That means by throwing deliberate exceptions, the malware author can pass execution control from one place in the executable to another one, namely the exception handler. And the interesting thing about exception handlers is that an exception handler can define a new entry point that's going to be executed after the exception. How does this work? Well, the best documentation I could find was still the one written in 1997 by a guy called Matt Pietrek. He's one of my big heroes now because he already documented this issue back then, namely in "A Crash Course on the Depths of Win32 Structured Exception Handling." I recommend that article. Actually, exception handling is implemented as a chain of exception handlers which is located on the stack and intertwined with the function stack frames that are around there. And it all starts with the thread information block because every thread has its own chain of exception handlers. A reverse engineer can find this through the FS register at offset 0, which points to an exception registration structure which looks more or less like this. This structure contains a pointer to the handler which could eventually handle the thrown exception and a pointer to the previous registration block, which looks just the same, and eventually, at the end of the chain, there comes a default handler and, well, minus one. All right. Now this is all placed on the stack, inside the function stack frames. And there's a whole science about building the stack and unwinding the stack, but what's really interesting for a malware author is, of course, that he can register his own exception handlers, deliberately throw exceptions and point the execution flow to some other piece of the code. Now, this looks more or less like this.
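[Editor's sketch, not the speaker's slide: the handler chain described above can be modeled in a few lines of Python. This is only a toy model under stated assumptions. The class and function names, the two addresses, and the simplification that a handler returns True to accept an exception are all invented for illustration; real registration records live on the stack of a 32‑bit Windows thread and are reached through FS:[0].]

```python
from dataclasses import dataclass
from typing import Callable, Union

END_OF_CHAIN = -1  # the "minus one" that terminates the chain


@dataclass
class Context:
    """Stand-in for the saved thread context; only the instruction pointer is modeled."""
    eip: int


@dataclass
class Registration:
    """Model of one registration record: link to the previous record plus a handler."""
    prev: Union["Registration", int]
    handler: Callable[[Context], bool]  # assumption: True means "handled, resume"


def dispatch(head: Union[Registration, int], ctx: Context) -> int:
    """Walk the chain from the newest record; return the address execution resumes at."""
    record = head
    while isinstance(record, Registration):
        if record.handler(ctx):
            return ctx.eip
        record = record.prev
    raise RuntimeError("no handler accepted the exception")


HIDDEN_ENTRY = 0x00401337  # hypothetical address, for illustration only


def default_handler(ctx: Context) -> bool:
    """Placeholder for the handler at the end of the chain; in this model it declines."""
    return False


def obfuscating_handler(ctx: Context) -> bool:
    """Does no real error handling: it only redirects the instruction pointer."""
    ctx.eip = HIDDEN_ENTRY
    return True


def run_demo() -> int:
    chain = Registration(prev=END_OF_CHAIN, handler=default_handler)
    chain = Registration(prev=chain, handler=obfuscating_handler)  # newest record first
    faulting_context = Context(eip=0x00401000)  # hypothetical faulting address
    return dispatch(chain, faulting_context)
```

[In this model, run_demo() resumes at HIDDEN_ENTRY rather than at the faulting address, which is the whole trick: the deliberate exception is just a disguised jump.]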
If you're inside of a binary and can spot something like FS:[0] and see the structure where a new exception handler is linked into the list, that most likely has to do with exception handling. Now I told you there's a pointer in there pointing to the handler code, which would be the first switch to some other point in the executable for execution. And inside of this handler, someone can change the execution flow to a completely different point inside of the executable. The magic thing about this is that an exception is treated like a software interrupt, which means every time an exception occurs, the whole context structure of the thread that's running is saved away and loaded back into the CPU when the exception handler is finished. And the interesting thing there is that someone can change this context structure and point the instruction pointer somewhere completely different. So yeah, I know there's a lot of people in here getting excited when they hear they can point instruction pointers somewhere. Right. And I know today a lot of things have changed, especially concerning C++. In Visual C++, exception handling is still based on the structured exception handling I showed you before. But the thing that has mainly changed is that now every function has its own exception handler and uses a FuncInfo structure which contains information about try blocks and catch blocks and I think I need to take a break. >> SPEAKER: You would be correct. (Applause) >> SPEAKER: We have a little tradition here at DEF CON. Let me tell you all about it. It involves ‑‑ >> AUDIENCE: Drinking. >> SPEAKER: Louder. >> AUDIENCE: Drinking. >> SPEAKER: Why? Why are we making her drink? >> AUDIENCE: First timer. >> SPEAKER: We need someone from the audience. Do we have any first timers here in the audience? No? So there ‑‑ really? Nobody is a first timer? None? Wait. Okay. Who's everybody pointing at? All right. Get up here. I can't believe this is the only guy. That's amazing. Cheers! Welcome to DEF CON!
(Applause) >> SPEAKER: Have a good time. >> MARION MARSCHALEK: Thank you. Sheesh. >> SPEAKER: Where were you? >> MARION MARSCHALEK: Right there. Okay. Now where was I? All right. Visual C++ structured exception handling. It's still based ‑‑ sorry. (Coughing) (Laughing) It's still based on the principle of structured exception handling I showed you before, but now every function has its own dedicated exception handler, which is compiler generated and uses a structure called the FuncInfo structure that contains a lot of information, namely information for unwinding functions, about try blocks and catch blocks and, well, the pointers to the exception handlers that eventually handle exceptions. Right. There's a built‑in function called __CxxFrameHandler, which this FuncInfo structure is handed over to and which then performs, well, the matching of exceptions to the exception handlers to be executed, of course. And still, as I mentioned, the important thing there is that the exception handler can define a new entry point. Now, I pointed to a really nice diagram that we see here. Interesting. Right? It's from OpenRCE and was painted by Igor Skochinsky, who did a lot of research on structured exception handling. And I've provided some scratchbook painting here to show you. It's really not pretty, but let's look through that. There is a pointer to the exception handler on the stack, right, that's the compiler‑generated exception handler. There's the __CxxFrameHandler function, where the FuncInfo structure ends up. The FuncInfo structure points to a try block map. The try block map points to a handler array. The handler array points to a handler offset which points down to a handler. You got the thought, right? Well, I provided some screenshots here, from IDA Pro. Let's get back to the bot. In practice, this would look like this. For example, there's a registration sequence. I hope people can read this. Maybe. I don't know.
But there is the FS:[0] flying by, and a new exception handler is registered at the beginning of the function. And sometime later, there's an exception happening. If you can read that call, this will almost never work, right? Because the memory address that is put into ECX there is most likely not going to be valid. So there's the exception and there is the registered exception handler, which causes the system to execute the compiler‑generated handler. There the FuncInfo structure is handed over to __CxxFrameHandler, which then performs the magic. Let's look into this FuncInfo structure. In this FuncInfo structure, there's the values that Igor thankfully pointed out in his diagram. So there's the try block map and the handler array, and finally, the pointer to the handler that the user registered. So there's the user‑generated handler. And in there, you can find the offset to the new entry point. If you have a look at the user‑generated handler, it is really obvious that this handler is just registered for obfuscation because there's nothing else happening there than setting this offset for the entry point. Right. So much about exceptions. The second point, the junk code in the file. There was really quite a lot of junk code in the sample, which is pretty scary for young analysts if you see a lot of XORs and a lot of shifting operations and a lot of loops that actually don't produce any useful information in there. So I was kind of overwhelmed by this junk code until I found the principle of the junk code in my sample. There's a whole lot of research about junk code in binaries. And the principle of this junk code is pretty simple: opaque predicates. Now an opaque predicate is just, well, a branch statement that always evaluates to true or always evaluates to false. So only one of the branches is ever going to be executed, and the other branch gets the junk code. So well, in the sample analysis, it looks somewhat like this.
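[Editor's sketch, not the speaker's slide: a minimal Python illustration of an opaque predicate. The predicate below, that x*x + x is always even, is a textbook example chosen for illustration and is not the compare instruction from the sample. All function names are invented.]

```python
def opaque_always_true(x: int) -> bool:
    """x*x + x equals x*(x+1), a product of two consecutive integers, so it is always even."""
    return (x * x + x) % 2 == 0


def obfuscated_checksum(data: bytes, x: int) -> int:
    """A one-line checksum padded with a junk branch that can never run."""
    if opaque_always_true(x):
        return sum(data) & 0xFF  # productive code
    # Junk code: XORs, shifts and a loop that look busy but are unreachable.
    acc = x
    for byte in data:
        acc = ((acc ^ byte) << 3) & 0xFFFFFFFF
    return acc


def clean_checksum(data: bytes) -> int:
    """What is left once the analyst has recognized the predicate and cut the dead branch."""
    return sum(data) & 0xFF
```

[Whatever value x holds at run time, obfuscated_checksum and clean_checksum return the same result, so once the predicate is understood the whole junk branch can be ignored.]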
On the right side, you see the first screenshot. On the left side, there's a simplified version. And if you think through that, the compare statement in there is never going to set the zero flag, so the jump‑not‑zero is always going to take the green branch. Right. You think that is simple? It's true. It was like this all throughout the sample. It was just as simple. So what did the analyst do? I just marked the nodes along the green branch. If you can see this graphic. I'm not sure. All right. So the yellow boxes are the productive code and the white boxes are just junk code. So this was really pretty simple to get by. Analyst headaches. I spent a lot of time with the sample ‑‑ especially because of the threads in there. People have seen the movie Drag Me to Hell, and that's what I've been through with that application. The author of the sample actually has all my respect because he produced this in C++. This is a simplified version of the threads that I found in there. There's actually a lot more, but it boils down basically to one thread that manages the whole instance, namely the, well, the bot instances that could start up in the system, because eventually there's more than one infected file that could start up. Second thread, there was the file infector, always infecting processes that would start up. Then the machinery that would handle the sending side of the bot, which could send messages and data to the C&C, and on the other side, the receiving side of the bot. And, of course, the C&C command switching. Now how did I get to that information? That was pretty tricky, and I spent a lot of trial and error time in there. But actually what I did was, in the first step, I realized that I have to spot the really interesting threads because there's a lot of timing and synchronization going on.
After doing this, I had to spot the inter‑thread communication and the synchronization methods, which actually told me a lot about what the threads were about, as they were triggered by specific events. I will talk a little bit more about this pretty soon. In the next step, of course, I had to analyze somewhat the function base of the threads to really find out what they do, what information they generated and where this information would eventually flow to. Knowing all this, in the last step I could draw this big picture of where information is generated and which thread accepts information, processes it and eventually takes any action. All right. So if you go back to that diagram, I found four different methods of synchronization in there, which were events for triggering the file infector and for managing the different instances that were started; thread messages, which were mainly used at the receiving side of the bot; an I/O completion port, which was used to manage the ‑‑ so you had the receiving side of the bot, the thread messages for the receiving side of the bot; and the critical sections for data exchange between the threads. When I had that, I could paint the threads around the synchronization methods. All right. Now here comes the last nastiness for today. C++. All right. There's actually a lot about reversing C++. There's a whole science, and for people who are interested in that, I collected a lot of links on that research on the last slide of this presentation. But I actually want to talk about virtual function calls. Virtual function calls are very interesting to reverse because they're indirect calls, and they're only fully determinable at run time. They implement the multiple inheritance features of C++, so one of these virtual function calls can actually call into several different methods at run time. They're translated using virtual function tables, which help a lot in reversing these sorts of binaries. I provided an example here.
In this example, there's a virtual function table actually loaded into the register EAX. And at offset 4 of the virtual function table, there's the method that's going to be called with this call instruction. That was really sort of a catch me if you can. Actually, I collected another example from OpenRCE and Igor Skochinsky because he did a lot of research on this as well. Here's one class A where there's two virtual functions defined in there. Underneath this class definition, you can see the memory layout of class A, where there's a virtual function pointer actually pointing to the virtual function table of class A. Now a virtual function table is something that only classes have that have virtual functions defined in there. All right. Here's the second class B, which also has a really similar layout with two virtual functions defined in there. And the other interesting thing is class C, because class C inherits from class A and class B and implements one virtual function of each. Now, the layout of class C is somewhat bigger because, as it inherits from other classes, it includes their class layouts and also the virtual function pointers in there to the virtual function tables. These virtual function tables are now adapted to fit the needs of class C and point to the actual function offsets that class C implemented. All right. This is really dry to look at as code. Back to business. Here's the C&C command switching function, which is a really good example for virtual function calls. There you see little yellow boxes. This is all memory allocation for objects that are going to be instantiated in the green boxes. And then you see one pink box, which is the virtual function call which was actually used to call into the bot functions. The bot functions are implemented as derived classes of one bot action super class. And all had one function overloaded, sorry, implemented, that was the bot action. Now here, another IDA Pro example with the move file object.
Here in yellow you see the object instantiation. I'm sorry. The memory allocation, where there's space reserved for the object that is going to be instantiated in the green box. And what you see there is a call to a constructor. Now this constructor actually has a call into the super class constructor, as it works with derived objects. And there you see the first vftable. I will talk about this in a second. As I mentioned, this constructor has a call into the base class constructor, and there's another virtual function table where there is space reserved for two virtual functions. Now in IDA Pro I checked the cross‑references of this base class constructor. There are 23 cross‑references, so, I guess, surprise, there's like 23 bot actions that can be taken by the bot. All right. Along this ‑‑ the final step is the call into the method of the move file object. What you see there is that the virtual function table of the move file object is loaded into the register and the function offset is called ‑‑ if you have a look at the virtual function table of the move file object, at offset 4 there is the move file function. So, theory proved. Using these virtual function tables, you can, not easily but pretty fast, determine which functions are going to be called at these virtual function calls. All right. This was my presentation. Here are the promised links. The sample can be found online, and that's the first link. And, well, if there's any questions, you can contact me on Twitter, or I'm going to be out in the hallway to answer your questions or receive criticism or anything you want to tell me now. Thank you. (Applause)
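[Editor's sketch, appended after the talk: the way an analyst resolves a virtual call from a virtual function table can be mimicked in Python. Everything here is an assumption made for illustration: the class and function names are invented, the tables are plain lists standing in for arrays of 4‑byte pointers in a 32‑bit binary, and only two slots are modeled because the talk mentions a base table with room for two virtual functions.]

```python
POINTER_SIZE = 4  # 32-bit x86: each virtual function table slot is 4 bytes wide

# Hypothetical tables as an analyst might note them down; all names are invented.
VFTABLES = {
    "BotAction": ["BotAction::first_virtual", "BotAction::run_action"],
    "MoveFile": ["BotAction::first_virtual", "MoveFile::run_action"],
    "DeleteFile": ["BotAction::first_virtual", "DeleteFile::run_action"],
}

BASE_CLASS = {"MoveFile": "BotAction", "DeleteFile": "BotAction"}


def construct(class_name: str) -> dict:
    """Mimic the constructor chain: the base constructor runs first, then the derived one."""
    obj = {"vfptr": None, "vfptr_history": []}
    chain = []
    current = class_name
    while current is not None:
        chain.append(current)
        current = BASE_CLASS.get(current)
    for name in reversed(chain):
        obj["vfptr"] = VFTABLES[name]  # each constructor overwrites the table pointer
        obj["vfptr_history"].append(name)
    return obj


def resolve_virtual_call(obj: dict, offset: int) -> str:
    """Resolve `call [reg + offset]`, where reg holds the object's table address."""
    if offset % POINTER_SIZE != 0:
        raise ValueError("offset is not aligned to a table slot")
    return obj["vfptr"][offset // POINTER_SIZE]
```

[For example, resolve_virtual_call(construct("MoveFile"), 4) picks the second slot of the derived table. The vfptr_history list shows why two tables appear in the constructor code: the base class table is written first and then replaced by the derived one.]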