>> ANDY DAVIS: Hi. Thanks for sticking around for my talk. I know a lot of people have headed
off already but, yeah, thanks a lot. So I'm going to talk a little bit about USB. I have
only got 20 minutes and I have got plenty of slides to get through so I'm going to ***
through them at a reasonable pace. Okay. So the general thrust of what I'm going
to be talking about is ‑‑ let's say you've got a device that you want to assess the security
of via USB, so you want to identify any vulnerabilities in the drivers, the USB drivers, at all the
different levels in the USB stack. But you want to find out as much as possible
about that driver stack and the drivers that are installed prior to doing any kind of active
fuzzing because, as we will get into a bit later, fuzzing USB is a much slower process
than fuzzing network services, that kind of thing.
So the more information you can gather prior to starting fuzzing, the more effective the
process becomes. The other kind of thrust behind the research
is around embedded devices and certainly more and more within our company we are getting
asked to test black boxes that may be part of automotive solutions, SCADA, that kind
of thing, where you are just given a black box with a bunch of interfaces in it and no
more information than that. And using these kind of techniques, you can identify the operating
system that's on board and any applications that are on board as well. So kind of fingerprinting
techniques. And to wrap up, I will show a demo which will demonstrate some of the techniques.
Okay. So information gathering, why do we care? If you can connect to a device, surely
you already know the platform. Well, as I said, with the whole embedded system, that's
not necessarily the case. And as I also mentioned, you really want to gain as much information
about the drivers that are installed before you start fuzzing.
So a little bit of background on USB, what we're talking about is the enumeration phase.
When a device is connected, the host knows nothing about it. It goes through this phase of asking the device questions
and gaining an idea of what its capabilities are. Now, if we are trying to gain information
about the host, we've got a bit of a problem here because the way USB works, it is a master/slave‑type
setup. As the device, you are the slave, so you can't ask the questions. You can only
answer them. So you have got to kind of answer the questions in ways that will prompt questions
from the host to try and deduce what information you want about the configuration of that host.
It is a pretty complex process and there are lots of different implementations of this information
exchange across different USB hosts and, therefore, we can take advantage
of that to try to work out what the host is. So here's a particular enumeration phase,
the arrows there indicating the direction of the traffic. So there is a bunch of different
descriptors. The descriptors just contain information about configuration, the data
structures essentially. So you get a device descriptor. The host sets
an address. It then requests a device descriptor again, configuration descriptor, string descriptor,
a bunch of configuration descriptors. The host then sets the configuration and then
you can start using the device. But hang on. Why was that device descriptor
initially requested twice? Well, when it first connects, there is no address set for the
device. So it needs to know some information about it with the first request. It sets an
address. Then resets the device and goes through the process again. There are also multiple
requests for some of the other descriptors from different layers within the USB stack.
So as the information gets processed by different parts of the USB stack, some of that information
needs to be requested again. You can also get class‑specific descriptors. For example,
hub descriptors, HID descriptors, "HID" being mice, keyboards, human‑interface stuff.
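As an illustration of what one of these descriptors looks like on the wire, here is a minimal Python sketch that packs and unpacks the 18-byte standard device descriptor. The field names follow the USB specification; the helper names and the default values (USB 2.0, 64-byte control endpoint, string indexes 1 to 3) are assumptions chosen for the example, not anything shown in the talk.

```python
import struct
from collections import namedtuple

# Layout of the 18-byte standard USB device descriptor (little-endian).
DEVICE_DESCRIPTOR = struct.Struct("<BBHBBBBHHHBBBB")

DeviceDescriptor = namedtuple("DeviceDescriptor", [
    "bLength", "bDescriptorType", "bcdUSB", "bDeviceClass", "bDeviceSubClass",
    "bDeviceProtocol", "bMaxPacketSize0", "idVendor", "idProduct", "bcdDevice",
    "iManufacturer", "iProduct", "iSerialNumber", "bNumConfigurations",
])


def build_device_descriptor(vid, pid, device_class=0x00):
    """Pack a device descriptor for an emulated device (illustrative defaults)."""
    return DEVICE_DESCRIPTOR.pack(
        18,            # bLength: size of this descriptor
        0x01,          # bDescriptorType: DEVICE
        0x0200,        # bcdUSB: USB 2.0
        device_class,  # 0x00 means the class is given per interface
        0x00, 0x00,    # sub class, protocol
        64,            # bMaxPacketSize0
        vid, pid,      # the values a host uses to look up a driver
        0x0100,        # bcdDevice: device release number
        1, 2, 3,       # string descriptor indexes
        1,             # bNumConfigurations
    )


def parse_device_descriptor(raw):
    """Unpack the first 18 bytes of a captured device descriptor."""
    return DeviceDescriptor._make(DEVICE_DESCRIPTOR.unpack(raw[:18]))
```

Calling `parse_device_descriptor(build_device_descriptor(0x1234, 0xABCD))` round-trips the vendor I.D. and product I.D., which are the two fields the host later uses to look up a driver.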
So a bunch of different USB stack implementations. I will *** through these slides because as
I said, I only have 20 minutes. Typical components. You've got host controller hardware at the
lowest level. This is implementing things like timing, electrical signals, all that
kind of stuff that needs to be implemented in hardware. You have host controller driver
which provides an abstraction layer to the hardware for the USB core drivers. And it
is those core drivers that perform the actual enumeration. You then have the class drivers
which are mass storage drivers, printer drivers, that kind of thing and then application software.
So, you know, you plug in a USB camera and it pops up some photo editing software to
display your photos. That's what I'm talking about, application software, stuff like that.
A bunch of different implementations which I will *** through. Okay. Interacting with
USB, how are we actually going to communicate with USB to try and gain the information that
we want? We need to be able to capture and replay all the USB traffic. We need to have
complete control of the generated traffic that we want to create. We don't want to be
bound by the spec. So, for example, if you were going to purely use some test
equipment to do this, a lot of the test equipment will be written to just comply with
the USB spec because people generally want to use it to test their kit, whereas we want
to use it to generate unusual traffic. Each of the different USB classes has got
its own detailed spec document to explain how that class‑specific data transfer occurs.
We don't want to have to go to the spec every time we do packet capture. So if we've got
a class decoder for each of the USB classes, that's really useful. We want to be able to
support different speeds. USB 3.0 speed will be fantastic.
Hey, guys. >> Hey.
(applause) >> What do we call this? Shot the n00b. What
do we need on stage? Right there, right there. So first time at DEF CON? This is your first
time at DEF CON? >> Yeah.
>> He is going to say yes anyway, right? >> Wait, wait, wait. Test. What is the Dark
Tangent's real name? >> (speaker off microphone.)
>> We do not use real name at DEF CON. >> Off the stage. No. I feel bad now. But
don't ever say that again. >> Do we have them poured yet?
>> We are fast, it is a fast talk. >> This is our final one.
>> I'm going to miss this. >> This is the last one, so drink if you got
them. >> To you guys.
(applause) >> This is the gold‑plated solution right
here. >> Gold‑plated solution. I don't get it.
>> What does that even mean? >> Can anybody in the audience tell us what's
going on in the talk? >> Is this about monster cables?
>> You have your work cut out for you. >> I just noticed speakers have been putting
their phones up and counting down. You're screwed, my friend.
>> ANDY DAVIS: We try to be diligent. Okay. I tried to be diligent and I failed.
Thanks, guys. >> You're welcome.
>> ANDY DAVIS: Okay. So, yeah, if you have got plenty of cash and you want to spend $20,000
on a USB testing solution, then go ahead and buy the Ellisys. I couldn't afford that.
It is pretty useful having some kind of test solution for the class decoders like I mentioned.
I have got one of those Packet Masters. The much cheaper approach which you can get
away with for kind of 95% of the stuff I'm going to talk about is to use a Face Dancer
USB board. This is a fantastic solution developed by Travis Goodspeed. It is awesome. If you
haven't played with one, get a hold of it and have a play with it. It is fantastic.
The solution to do everything I'm talking about is to use both. Basically you're generating
the traffic using the Face Dancer USB board because you actually have full control over
the device that you're emulating. So you can send any packets you like. And you can use
the Packet Master or any other appliance to capture the traffic coming back and deal
with the class decodes for you. Also, the Packet Master has microsecond timing which
can potentially be useful as we'll talk about in a minute.
Okay. So there is a bunch of targets here. There is no reason other than the fact that
these were the devices I had lying around at home. I wanted to find out if you could
use some of these techniques to differentiate between a bunch of different operating systems.
These are ones that I had free. So we want to be able to identify what the
different class types are that are supported. So kind of standard USB drivers. Most OSs
come with a driver for HID, so you can plug a keyboard in. Also a mass storage
device. We also want to enumerate all these specifically installed drivers. And if there
are any other devices that are already connected, going back to the embedded system type situation,
you might have a black box that has an HSPA modem inside connected by USB internally.
So if you can identify those devices, that's pretty cool.
Where is the information stored? Well, information of the classes is stored within some of these
descriptors I mentioned, within the device descriptor and also within interface descriptors.
It is normally in interface descriptors because devices with multiple interfaces might have
different classes of interface. And the information that relates the device
driver to specific devices is the vendor I.D. and the product I.D. that's stored within
the device descriptor. So it's kind of like a lookup table. So after it's gone through
the enumeration process, it can go off and say am I aware of this device? Look it up
with a vendor I.D. and product I.D. Using the Face Dancer, you can go through
a brute force approach to try and identify those. You don't need to go through the entire
bit space of each of those 16‑bit values because everyone who wants to use a vendor
I.D. or product I.D. has to register those with the USB implementation ‑‑ bleh ‑‑
alcohol, the USB Implementers Forum. So you can download that list and the umap tool
uses that list. To identify what drivers are installed, it's
just a case of emulating each of the device types when you connect to the host. So you're
virtually connecting to the host using the Face Dancer. You bring it up and say today
I'm a printer. Next time I'm a USB camera, that kind of thing. The host will respond.
And if it goes all the way through to the set configuration command there and stops,
there's no driver installed. If it continues and starts talking the protocol associated
with that class, you know the driver is installed. Simple as that.
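The driver check described here can be sketched roughly as follows in Python. The event format (dicts with a `kind` key) and the function name are hypothetical, invented for this example; it is not how the tool from the talk represents traffic. The only protocol facts relied on are that SET_CONFIGURATION is standard request 0x09 and that bits 5 and 6 of `bmRequestType` distinguish standard, class and vendor requests.

```python
SET_CONFIGURATION = 0x09  # standard request code from the USB specification


def driver_appears_installed(events):
    """Guess whether a class driver bound to the emulated device.

    `events` is an ordered list of host actions seen by the emulated device
    (a hypothetical format for this sketch):
      {"kind": "control", "bmRequestType": int, "bRequest": int}
      {"kind": "data", "endpoint": int}   # traffic on a non-control endpoint
    """
    configured = False
    for ev in events:
        if ev["kind"] == "control":
            # Bits 5..6 of bmRequestType: 0 = standard, 1 = class, 2 = vendor.
            req_type = (ev["bmRequestType"] >> 5) & 0x03
            if req_type == 0 and ev["bRequest"] == SET_CONFIGURATION:
                configured = True
            elif configured and req_type != 0:
                return True  # class or vendor request after configuration
        elif ev["kind"] == "data" and configured:
            return True  # host started using the class endpoints
    # Enumeration stopped at (or before) SET_CONFIGURATION: no driver talked.
    return False
```

A real implementation would also need a timeout, since "the host stops talking" can only be observed by waiting.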
I talked about trying to identify the connected devices purely by sniffing the USB bus. If
you look for traffic on other addresses, you will see those are other devices connected.
The way the structure of the USB works, as long as you are on the same tier of the kind
of tiered‑star arrangement of hubs, you will see all traffic associated with other devices
connected into that hub. Potentially I've been thinking about a scenario
whereby you could control other devices. So the Face Dancer allows you to be a host as
well as a device. So if you've identified there's a device on address 4 and you start
sending get device descriptor requests to that, will it start revealing information
to you? I think it probably will. It's currently untested and it is something that's part of
my future research plans. So a bunch of fingerprinting techniques, seriously,
we are running out of time so I will *** through this. OS identification, here we've
got a Linux‑based TV set‑top box and Windows 8, and the same mass storage device was plugged
into each. All the traffic was captured. You can quite clearly see that the types of class
commands that are used and the order in which they're requested are completely different
for those two OSs. And it's worth pointing out that every time
you plug into these specific OSs that's the pattern of requests and replies. So, yeah,
you can see ‑‑ quite clearly fingerprint them there.
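One way to turn "the type and order of class commands" into a fingerprint is to compare the observed command sequence with stored sequences. The Python sketch below does this with the standard library's `difflib.SequenceMatcher`. The two signatures are made-up placeholders built from real SCSI command names, not the sequences measured in the talk, and `best_match` is a hypothetical helper.

```python
from difflib import SequenceMatcher

# Placeholder signatures: NOT measured data, just two different orderings.
SIGNATURES = {
    "host_a": ["INQUIRY", "READ CAPACITY", "MODE SENSE"],
    "host_b": ["INQUIRY", "TEST UNIT READY", "READ FORMAT CAPACITIES",
               "READ CAPACITY"],
}


def best_match(observed, signatures=SIGNATURES):
    """Return (name, score) of the stored sequence most similar to `observed`."""
    scores = {
        name: SequenceMatcher(None, observed, seq).ratio()
        for name, seq in signatures.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]
```

A similarity score rather than an exact match tolerates small differences between captures, although the talk notes that the pattern was the same on every plug-in for the systems tested.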
Application identification, so I talked about the different applications that automatically
spawn when you plug in a USB device. Here we are talking about a USB camera that's
plugged in. On Linux, you have got all these different requests and replies within ‑‑
this is in class‑specific data after the enumeration has occurred. Whereas, on Windows
8 you have got completely different commands. And what Windows 8 actually tries to do is
modify the property of one of those images and within that device property request, there
is a whole bunch of text with very specific information about the version of the OS, i.e., Windows
6.2.9200, which is basically the latest version of Windows 8.
So there are a bunch of other patterns that I identified based on different requests,
different numbers of requests, specific requests for certain OSs, that kind of stuff. So quite
easy to identify OS versions ‑‑ sorry, OS types.
Timing information. I talked about this microsecond‑timing capability with the Packet Master device.
So I've got five different captures here of the same enumeration phase for the same device.
And if you look at the amount of time it takes to perform each enumeration, the times are
wildly different when you are talking in microseconds. So there's a massive difference across the entire
enumeration. However, I noticed that if you choose specific ‑‑
the time between specific requests, so, for example, in this case, between string descriptor,
the request for string descriptor 0 and 2, there's actually very little difference in
the timing between those. And this kind of implies if you already know the OS, there's
the potential for discovering some speed information here. Again, this is work in progress. And
I hope to talk a bit more about that in the future once I have done further research.
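The comparison described here, total enumeration time versus the gap between two specific requests, could be computed along these lines. This is a Python sketch under assumptions: each capture is a list of `(timestamp_us, label)` tuples, and the labels are whatever names the analyst gives to the requests. Neither the format nor the function names come from the talk.

```python
from statistics import mean, pstdev


def first_time(capture, label):
    """Timestamp (in microseconds) of the first event carrying `label`."""
    for timestamp_us, name in capture:
        if name == label:
            return timestamp_us
    raise KeyError(label)


def interval_spread(captures, start_label, end_label):
    """Mean and population standard deviation of one interval across captures."""
    intervals = [
        first_time(c, end_label) - first_time(c, start_label) for c in captures
    ]
    return mean(intervals), pstdev(intervals)
```

A small standard deviation for the interval between the two string descriptor requests, next to a large one for the whole enumeration, would reproduce the observation made in the talk.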
Some OSs have got their own specific descriptors you see. So if you see Microsoft OS descriptors,
you obviously know it is a Microsoft‑based OS.
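As a concrete case, Windows hosts probe for Microsoft OS descriptors by requesting a string descriptor at index 0xEE, so an emulated device can flag that request when it arrives. The Python sketch below parses the 8-byte setup packet; the function name is hypothetical, and it only covers the version 1.0 string-descriptor probe.

```python
import struct

GET_DESCRIPTOR = 0x06      # standard request code
STRING_DESCRIPTOR = 0x03   # descriptor type, high byte of wValue
MS_OS_STRING_INDEX = 0xEE  # index used for the Microsoft OS string descriptor


def is_ms_os_string_request(setup):
    """True if an 8-byte setup packet asks for string descriptor 0xEE."""
    bm_request_type, b_request, w_value, _w_index, _w_length = struct.unpack(
        "<BBHHH", setup[:8]
    )
    return (
        bm_request_type == 0x80           # device-to-host, standard, device
        and b_request == GET_DESCRIPTOR
        and (w_value >> 8) == STRING_DESCRIPTOR
        and (w_value & 0xFF) == MS_OS_STRING_INDEX
    )
```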
Responses to invalid data. There is a whole bunch of different invalid data that you can
send within these descriptors that are the responses to the requests during both enumeration
and also in the class‑specific communication. So things like minimum/maximum values, logically
incorrect values, missing data, you know, short strings, long strings, all that kind
of stuff. So there are some situations that I identified whereby you get unusual behavior
that can be used for fingerprinting. But it's more useful as part of test cases for fuzzing
to identify bugs. So, for example, Windows, all versions ‑‑
I say all versions from XP up to current day. If you send a specific logically incorrect
HID report descriptor, this happens. So not too great from any kind of enumeration perspective.
But it's a bug. And when I show you umap in a few minutes'
time, one of the test cases within umap will trigger this. So when I release it all,
you will be able to play with this and find out what this bug is.
Also, the order of the descriptor requests can be used for identification, too.
So let's do a demo of umap and hopefully the demo gods are going to be with us. Right.
So what I've got here is my laptop is connected via a Face Dancer board here which is down
here connected to a laptop that's running Linux. So if I run umap ‑‑ there is
a whole bunch of commands and things you can do with umap. I will show you a few examples
of the things you can do. First of all, let's list the different USB
device classes that umap is aware of. And most of these we can emulate. First of all,
we want to say ‑‑ let's pick one of those and say is it supported. So does it support
the audio class? So what it's doing is it is going through and it's systematically virtually
plugging itself in and saying I'm audio class sub class undefined protocol, protocol undefined.
No, not supported. Next one. Right. Now I'm another audio device, audio control, da, ta,
da, ta, da. It is going through each one and none of these are supported so far. Okay.
So it doesn't seem to support any audio devices, this Linux box.
So if we try ‑‑ go back to ‑‑ let's try an image class device. That's type 3.
So we go. That's supported. So it said now I'm a camera, this is the particular class, sub
class and protocol that I'm using and it's come back and started talking image class
language. And, therefore, we know that that's supported and we can fuzz it later on.
Also, within the talk, I mentioned the vendor I.D. and product I.D. that's associated with
specific drivers that have been installed. If you know a VID, a PID and what it equates
to, it maintains a database of all that information. So as a quick example, it has looked up that
VID and PID. The various operating system identification
techniques I talked about, it goes through and uses those. So if I do capital O, it will
go off and try some of those. It systematically is plugging in and enumerating, looking at
the order of the packets going back. Okay. Cool. That's right. Good.
Let's say you just want to be a specific device, you want to emulate a device of a certain
class. I'm thinking of a scenario ‑‑ let's say you're doing a job where you know
that you've got a USB‑based end‑point protection system in play and you know that
USB mass storage devices are allowed but only from a certain vendor.
So you can say I'd like to be a mass storage class device with this vendor I.D. and product
I.D. and it will pop that right up and you can start using that.
Also, image class, let's do an image class example because it doesn't come by default
with Face Dancer. Okay. So that's now connected and said, hey, I'm a camera and that's all
the interactions of the OS. And, yeah. There you go.
So on to the fuzzing side of things, there's a whole bunch of test cases I've implemented.
So generic test cases can be used to fuzz any USB device and then based on the spec
for each of the USB‑device classes, I've gone through and developed fuzz test cases
for a bunch of those as well. So you can see all the layers in there.
Let's do a quick fuzz attempt. We know it supports the image class. So if we say fuzz
the image class, so it takes each one of those and it is basically, you know, going through
and saying, yep, I'm an image class device with this particular test case that we are
starting to fuzz. You can't see what's going on on my laptop but it is chucking up loads
of errors. In a minute, it will actually chuck up a really serious error. You won't be able
to see it from there anyway, so you have to trust me.
It is a great way of identifying bugs. And just through testing of umap, I have identified
a whole bunch of bugs in different OSs. It is not quite ready for primetime yet. It needs
a little bit of tweaking and it needs me to implement the remainder of the device classes
before I make it public. But once I do, it will go live on our GitHub so you can download
it. So if I just stop that and go back to my wrapup.
So overview, you can enumerate supported device classes, operating system information can
be enumerated. So if you want to emulate any of the device classes and you know a specific
vendor I.D. and product I.D., you can do that. You can then fuzz any USB host implementation
to identify bugs within that. And as you can see, with the classes, sub classes and protocols,
this is an enormous attack surface. Lots and lots of this stuff is implemented in OSs and
people just don't know it's there. It's just not used. But it is there as part of the attack
surface and it can be fuzzed. I mentioned end point protection systems.
You can assess the configuration of them if they are USB based. Potentially circumvent
them if you know specific devices you want to emulate. And, yeah, as I said, completely
comprehensively test USB host implementations. Wow, that's the quickest I have ever done
>> Thank you very much. That was great. Okay. There's a big crowd in here and there are
people coming in and I don't know why! Why are you coming in?
>> (speaker off microphone.) >> You're wrong. You could stay but you wouldn't
be staying for a damn thing because why would you ever believe anything that you read at
DEF CON? The program is wrong. It's time to go get in the queue for tracks 2 and 3. Good
luck out there, folks! Thanks for coming to Party Track. I will see you next year.
(applause)