My name is Tom Cross, and I'm the Director of Security Research here at Lancope. What I'm
going to talk about today is the utility that audit trails of network activity can have
in trying to track attack activity that's happened within a network and clean that activity up.
I want to start by talking about the subject of forensics as it applies to computer security.
I recently went down to the SANS Digital Forensics and Incident Response Summit in Austin, Texas,
with my colleague, Charles Herring, and some of the material you're going to see in this
presentation comes from a talk that we gave at that summit.
One of the things that I encounter often when I talk to computer security people is that
they ... some people have a very narrow definition of what forensics means in the context of
computer security, and I think that it's actually something that is a consequence of the nature
of the kinds of attack activity that people have been experiencing over the past decade.
I think that the kinds of attack activity that people are experiencing today have changed,
and our understanding of forensics and incident response needs to change with it.
Many people see forensics narrowly as being analysis of hard drive contents, particularly
in a context where there's a desire to collect evidence for a criminal prosecution, and the
idea is to have a chain of evidence with respect to the data that's on the disk and to be able
to search the disk for data that may have been deleted and reconstruct that data, and
search that data for evidence of a crime. That practice of forensics is obviously important,
but it is applied usually in cases where the owner of the computer is the person you suspected
of committing the crime. Usually when we talk about computer security, we think of attacks
that are launched against computer networks by outside attackers or external adversaries.
The problem is that often it's not possible to prosecute those people. It's very difficult
to get access to them or to identify who they are.
When people are dealing with computer security as a problem in general, they tend not to
focus on evidence collection. They tend not to focus on prosecution because it's not something
they can easily avail themselves of under the circumstances. Instead they tend to focus
on protecting their network from attacks, and they tend to focus on cleaning up attacks
that have happened.
When a computer network is breached, people
have a sort of binary view of that. In the past, people have been very focused on protecting
the perimeter from breaches, and when breaches occur, it's game over. You failed to stop
the attack from happening, computers have been compromised, and the only thing to do
at this point is to clean them up. This point of view comes from dealing with
attacks that are broadly targeted and financially motivated, where the attacker is not really
interested in your organization specifically; they're interested in breaking into as many
organizations as they can. These pieces of malware, once they got on your machine,
were just there to collect credit card numbers or other financial data that is easy to
liquidate, so they weren't necessarily there to search around within your network.
There was not a lot of analysis that needed to take place when an incident like this was
discovered. You just needed to clean the malware off the computer and get back up and running
again, and I think that that is where things are really starting to change. I think that
we're seeing more sophisticated, targeted attacks that are hitting a variety of different
kinds of organization, and those organizations are also simultaneously more aware of that
kind of attack activity happening than they probably were say five years ago.
When you have a sophisticated, targeted attacker who has compromised your network
and taken over computers in your network, you may need to ask some questions about what
was happening there beyond simply cleaning the malware up and getting the infected computers
back online. You need to understand how that attacker was able to infect your environment.
You need to understand what different assets you have that were compromised and make sure
that you have a comprehensive understanding of that attacker's behavior before you can
feel confident that you've really removed them from the network.
That process of analyzing the attacker's activity in your environment and trying to create a
complete picture of everything that they're attempting to do, that process is also something
that I would call forensics, or incident response. As we have dealt with more and more sophisticated
attackers, that process has become more and more important. It's important, first of all,
because these attackers often have multiple infection points in our network. They're using
multiple different kinds of malware with different command-and-control protocols to control the
environment. If you find one and you clean it up, you can rest assured that there are
others that you haven't found. Without doing that in-depth analysis, it's really difficult
to piece together a comprehensive picture of the compromise and to feel confident that
you've rooted it out of your environment. The second thing is that what we've learned
through handling these attacks over time is that, when you analyze them and you understand
how they occurred, there are pieces of information that fall out of that analysis that might
help you detect future attacks by the same adversary. We talk about advanced persistent
threat. Persistent attackers are not going to be deterred because you found some of their
malware and cleaned it up. They're going to continue to target you.
The ability to learn from their techniques and to apply what you have learned to look
for continued attacks by them is a critical part of how you actually protect your network
against future attacks.
As a part of that analysis, I think it's important to consider the timeline of an intrusion.
You may have a user in your environment who goes out to the internet and accesses something
malicious, and their computer gets infected. That's hopefully something that you would
discover. You might have multiple means of doing that. You might have an IDS system that
fired. You could have a gateway advanced malware analysis system that's taking documents that
come in, analyzing them, and telling you if they appear to be malicious. You may discover
that this has happened, but likely you'll discover that it has happened after it's over
and that computer has become infected. Once you discover that the infection exists,
the question is: how much time does it take you to actually reach that computer and deactivate
it and remove it from your network. Often people chuckle when I put this timeline in
this chart up in front of them, because I'm showing the incident response team disabling
the infected machine in seven minutes. Most people don't have a responsiveness that is
anywhere near that. It can take days to get access to an infected machine and disconnect it,
and to work with the business owners that are associated with whatever that computer
is doing in the environment and get them to understand that you've got to go in and shut
it down, and that that business process is going to be halted while that analysis is
taking place. This is a ... a seven-minute response time
is a really efficient, well-oiled machine with respect to responding to breaches.
The question that matters is what happened in your environment between the time that
that malware infected that host and when you were able to deactivate it. The fact is that,
if the attacker had minutes or hours or days to operate on that machine during that window
of time, then they may have pivoted from that initial infection point to other points in
your environment. Really coming up with a full understanding of the incident involves
analyzing those factors and figuring out exactly what happened during that window of time.
In order to do that, you need access to data, and there's a wide variety of data sources
that could potentially be valuable to you. This is really where I'm talking about the
concept of incident response and forensics expanding. It's not just about analyzing a
hard drive anymore and figuring out what's on the hard drive, because the reality is
that your attacker is not leaving a record on that hard drive for you of everything that
they did in your environment. You've got to look at other data sources in order to get
that complete picture. I think there are three data sources that
are of critical importance. One of them is logs. End points have log information, network
security devices have log information, servers have log information; and of course, a well-run
environment has all of that log information going to a central database where it is stored
and where it is searchable. That's an incredibly valuable resource. However, once a computer
is compromised, you can't trust the logs coming off of it anymore. The first thing
an attacker is going to do when they control a computer is to get control of the logging
process so that their activities are no longer logged. Furthermore, logs have a tendency
to focus on events that devices consider to be interesting. For example, an IPS system
is only going to log attacks that it detected. If your attacker is hitting you with zero-day
vulnerabilities that the IPS system doesn't know how to detect, obviously, those things are
not going to be logged, so a log can miss some critical pieces of information that are
part of the complete picture of what happened in your network.
Obviously an ideal thing to get a complete picture of what happened on your network would
be to have packet capture happening everywhere within your environment, and then to be able
to store those packet captures forever; but the fact is that that's not realistic. Packet
captures are very powerful if you have access to them, but they're also very expensive to
store. The fact is that you're likely to only store a few days or maybe a couple weeks of
packet capture if you're really heavily invested in it.
The other thing is that you're not likely to be doing it pervasively throughout your
network. You're likely to be doing it at an access point that exists between your network
and the outside world, and maybe you've got a little bit of packet capture scattered around
within your internal environment, but it's very difficult to capture every single packet
that happens everywhere. This is where NetFlow comes in. I think NetFlow
is a very powerful tool for collecting an audit trail of what's happened in your environment,
and it's a good complement to packet capture in certain ways because it can see things
that packet captures aren't going to see. The first thing is that NetFlow is compressed.
It's just header information regarding the transactions that happened, and so it's easy
to store much more NetFlow for a much longer period of time than packet capture for the
same investment in disk space. It really boils down to how much time you want to store, how
much history you want to store. With NetFlow you can potentially store months, whereas,
with the same amount of disk space, you might have only gotten days of packet capture.
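To make the storage trade-off concrete, here is a rough back-of-the-envelope sketch. Every number in it is an assumption chosen for illustration (the disk budget, the daily capture volume, and especially the ratio of flow-record size to raw packet size), so substitute measurements from your own network before drawing conclusions.

```python
def retention_days(disk_bytes: float, bytes_per_day: float) -> float:
    """Days of history that fit on a disk at a given daily ingest volume."""
    return disk_bytes / bytes_per_day


# All figures below are illustrative assumptions, not measurements.
TB = 10**12
disk = 10 * TB            # assumed storage budget
pcap_per_day = 2 * TB     # assumed full packet capture volume per day
flow_ratio = 0.01         # assumed: flow records are ~1% the size of the traffic they summarize
flow_per_day = pcap_per_day * flow_ratio

pcap_days = retention_days(disk, pcap_per_day)
flow_days = retention_days(disk, flow_per_day)

print(f"packet capture retention: {pcap_days:.0f} days")
print(f"flow record retention:    {flow_days:.0f} days")
```

Under these assumed figures the same disk holds days of packets but well over a year of flow records, which is the "days versus months" difference described here.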
The other thing is that it's really easy to get NetFlow pervasively from your environment.
You can get NetFlow from down in the access and distribution layers of your switching
fabric, and so that enables you to see ... if you look at this network map, if the computer
at the very bottom of this network map, the bottom right corner of the chart, gets infected,
a system up at the firewall level might be able to identify transactions that came from
that computer and went out to the internet; but it isn't going to record transactions
that happened between those end point nodes down there at the access level, and those
transactions may be critically important when you're putting together what happened during
a security incident to understand how the attacker pivoted from his initial point of
infection to other machines within your environment. For those two reasons, I think NetFlow is
a critical ingredient in the recipe of how you defend your networks against attacks.
It has a lot of unique value alongside syslog and packet capture.
Let me talk about certain things you can do with NetFlow. Obviously, once you're collecting
NetFlow, you can do real-time detection. That's not really what this talk is about, this talk
is about forensic audit trails, but understanding what you can do in real time helps you appreciate
what you can do with the history. Obviously, if you're getting all the network transactions
and you're doing real-time monitoring, you can get a picture for what's happening, and
you can attempt to detect activity that is suspicious.
For example, you can detect data exfiltration. If large amounts of bytes are moving out of your
network onto the internet, that's something that's going to be visible ... or moving around
within your network ... that's something that's going to be visible via NetFlow because you
get an idea of what transactions are taking place and how many bytes are moving. NetFlow
can detect specific activities that are suspicious such as reconnaissance, and again, it's really
important to be able to detect reconnaissance within your network because, once an attacker
has compromised one machine, he's going to scan around inside that network to find the
data he's looking for and other points that he can compromise, and that reconnaissance
activity is something that is valuable to detect in real time so this is useful.
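As a minimal sketch of those two detections, the code below totals outbound bytes per internal host and flags hosts that send tiny flows to many distinct targets. The `Flow` record, its field names, the `10.0.0.0/8` internal range, and the thresholds are all hypothetical choices for illustration; real flow collectors export richer records and use more careful baselines.

```python
from collections import defaultdict
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Flow:
    """Hypothetical, simplified flow record (not a real NetFlow schema)."""
    src: str
    dst: str
    dst_port: int
    packets: int
    bytes_sent: int  # bytes from src to dst


INTERNAL = ip_network("10.0.0.0/8")  # assumed internal address range


def is_internal(addr: str) -> bool:
    return ip_address(addr) in INTERNAL


def outbound_bytes(flows):
    """Total bytes each internal host sent to external addresses."""
    totals = defaultdict(int)
    for f in flows:
        if is_internal(f.src) and not is_internal(f.dst):
            totals[f.src] += f.bytes_sent
    return dict(totals)


def likely_scanners(flows, max_packets=2, min_targets=20):
    """Hosts that sent very small flows to many distinct (address, port) targets."""
    targets = defaultdict(set)
    for f in flows:
        if f.packets <= max_packets:
            targets[f.src].add((f.dst, f.dst_port))
    return {src for src, seen in targets.items() if len(seen) >= min_targets}
```

A host whose outbound total is far above its normal level is a candidate for exfiltration review, and a host returned by `likely_scanners` is a candidate for reconnaissance review. Neither result is proof on its own.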
Another thing you can do is you can look for botnets. If you have threat intelligence,
if you know the IP address of a command-and-control server or of a drive-by download site, you
can monitor for that in your environment in real time by looking at the network transactions
that are happening and seeing if they have the same IP address or URL. You can do real-time
threat intelligence monitoring with real-time NetFlow monitoring.
This, I think, is a good gateway into the question of what you can do with the months
of history that you stored, and I think that the first key is to understand that you usually
don't get threat intelligence when it's fresh. When someone tells you about an IP address
of a malicious attacker, by the time that information got to you that attacker has already
been operating for some period of time. If you take that IP address and you put it in
your system, and you start monitoring for attacks from it, the fact is that
you may have already been targeted before that happened; and so it's really valuable
to be able to take threat intelligence data and do a historical analysis of that data
in order to see if you have been targeted by that adversary in the past.
You may remember about six months ago there were a number of organizations, primarily
Mandiant, who released information about a threat actor called APT1, which was an actor that
engaged in sophisticated, targeted attacks across a large number of organizations for
many years. Mandiant released a number of domain names and MD5 hashes of malware, SSL
certificate IDs, and other things related to this attacker.
Other organizations released some IP addresses, and in fact, Lancope released some unique
IP addresses and other indicators associated with this adversary based on some of our analysis
of this adversary and their activity. What do you do with all this threat intelligence?
The fact is that the minute all this stuff came out on the internet, this actor stopped
using all of the systems associated with these indicators. All of this threat
intelligence was burned. The IP addresses that were being used for command and control
were deactivated. The domain names were abandoned. The malware was abandoned.
What value is all this abandoned threat intelligence? If you've been storing history of your network
transactions for several years, it's possible for you to go back and check that data to
see if you ever, in the past, interacted with those hosts. Even though the attacker may
have abandoned that particular host at this time, there are probably other ways that they
are engaged in command and control in your environment; so if you find that you were
communicating with this host in the past, that can be a starting point for an investigation
that can lead you to discover what's happening in your network today.
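A retrospective check of this kind can be sketched in a few lines. The example assumes the stored history can be read as `(timestamp, source, destination)` tuples and that the indicators are plain IP address strings; both are simplifications made for illustration, and the function name is hypothetical.

```python
from datetime import datetime


def retrospective_hits(flow_history, indicators):
    """Return {(internal_host, indicator): (first_seen, last_seen)} for past contact.

    flow_history: iterable of (timestamp, src, dst) tuples (assumed format).
    indicators:   iterable of IP address strings from a threat intelligence feed.
    """
    bad = set(indicators)
    hits = {}
    for when, src, dst in flow_history:
        if dst in bad:
            host, peer = src, dst
        elif src in bad:
            host, peer = dst, src
        else:
            continue
        first, last = hits.get((host, peer), (when, when))
        hits[(host, peer)] = (min(first, when), max(last, when))
    return hits


# Illustrative history using documentation-reserved addresses.
history = [
    (datetime(2013, 1, 4, 9, 30), "10.0.0.5", "203.0.113.9"),
    (datetime(2013, 1, 9, 14, 2), "10.0.0.5", "198.51.100.20"),
    (datetime(2013, 2, 1, 3, 15), "203.0.113.9", "10.0.0.5"),
]
for (host, peer), (first, last) in retrospective_hits(history, ["203.0.113.9"]).items():
    print(f"{host} talked to {peer} between {first} and {last}")
```

Each hit is only a starting point: the first-seen time tells you where in the history to begin looking at what else that host did.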
In fact, we had several customers who, based on the information that was disclosed about
APT1, were able to discover activity that happened in their network in the past by looking
at their NetFlow collections, that they were previously unaware of. This turned out to
be a valuable tool in some cases. One of the things that you can do with this
data in StealthWatch is you can build these charts, like the chart you see here, which
shows you a long period of time ... this is a month's worth of data ... and it's graphing
when your network saw activity to this suspicious IP address. You can quickly see patterns of
interaction between your network and this suspicious IP over a long period of time,
and then it's possible to drill into each of those interactions to see the specific
activity that's happening. Once you have discovered that you did interact
with one of the systems in the past, the next stage is to engage in a deeper investigation
of what kind of activity occurred as a consequence of that infection. Of course, if you've got
all the network history there, particularly from your access layer, you've got a great
resource to do that kind of analysis. This is a scenario that we built here in our
lab related to following an indicator of compromise. In this scenario, we've received a couple
of IP addresses for a website that was engaged in a watering-hole campaign, so that means
that the website was taken over by the attackers, and the attackers had placed an exploit there.
The website was selected because it's a site that people in your organization visit.
We take a look at the IP addresses for that website in our StealthWatch NetFlow records,
and we see that we did have a computer access that site, which is not necessarily surprising
because, in a watering-hole attack scenario, obviously people in our environment are likely
to be accessing that site. We dig in and look at the details of the different
network transactions that that host engaged in around the time that they accessed that
website, and when we look at these details we discover that there are a few suspicious
HTTP connections that occurred right after contact with the infected site. This tells
us that this might be a drive-by download attack, because typically in a drive-by download
scenario, the site that is the initial point of infection redirects the user's browser
to other systems where the actual exploit payload is delivered. You can often pick that
right out of your NetFlow logs. Then, of course, later we see that reverse
SSH shell, so an SSH connection has come out of our network to someplace on the internet,
but most of the bytes were sent by the client. If you think about SSH, you type ls or dir
and press enter, and you're going to get a bunch of data. The user in SSH sends less
data than the server that they're accessing, so if you see an SSH connection coming out
of your environment but you are sending more data than you're receiving, then that looks a lot
like a command-and-control channel where somebody on the internet is remotely controlling a
computer in your network, and it jumps right out at you when you look at NetFlow records.
Clearly we did have a host that was infected by this watering-hole.
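The byte-direction heuristic for spotting a reverse SSH shell can be written down as a small sketch. The record type, the ratio of 2, and the minimum byte count are hypothetical values picked for illustration. Legitimate uploads over SSH (for example, copying a file out with scp) would also match, so treat a match as a lead to investigate rather than a verdict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionFlow:
    """Hypothetical bidirectional flow summary (field names are assumptions)."""
    client: str        # host that initiated the connection
    server: str
    server_port: int
    client_bytes: int  # bytes sent by the initiating client
    server_bytes: int  # bytes sent back by the server


def looks_like_reverse_shell(flow, ratio=2.0, min_bytes=10_000):
    """True if an SSH flow carries far more data from client to server than back.

    In an interactive session the user types little and reads a lot, so the
    client normally sends less than the server. The opposite pattern suggests
    the "client" is actually the machine being controlled.
    """
    if flow.server_port != 22:
        return False
    if flow.client_bytes < min_bytes:
        return False  # too small to judge
    return flow.client_bytes > ratio * max(flow.server_bytes, 1)
```

For example, an outbound SSH flow with 5 MB sent by the internal client and 40 KB returned would be flagged, while one with 20 KB sent and 900 KB returned would not.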
The next thing we need to do is look at other behavior that that host has engaged in, and
in this case the host appears to be scanning the internal network for computers that are
running SMB or MSRPC, so Microsoft systems, and it may be the case that this host knows
a vulnerability in MSRPC and so they're trying to pivot from this initial infection point
they obtained to control other computers in our network. By analyzing these records, we
may see that ... we see these little transactions here with one or two packets being sent, and
that means that's just scanning activity; but if we see a connection get set up between
this host and one of the victims that it's scanning, we now know that that host may have
successfully pivoted to that second location. So now we have another computer in our environment
that we need to investigate. In addition, what we can do is we can take
the command-and-control IP that was running this reverse SSH shell, and we can search
for it to see if we have other computers in our environment that are reaching out to the
same command-and-control system; and in this case we are seeing that activity happen.
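The distinction between scan probes and a real connection to a victim can be sketched as follows. The tuple layout and the two-packet probe threshold are assumptions for illustration; ports 135 (MSRPC endpoint mapper), 139 (NetBIOS session), and 445 (SMB) are the standard Windows service ports.

```python
WINDOWS_PORTS = {135, 139, 445}  # MSRPC endpoint mapper, NetBIOS session, SMB


def split_probes_and_sessions(flows, scanner, probe_packets=2):
    """Separate hosts that were only probed from hosts with a fuller session.

    flows:   iterable of (src, dst, dst_port, packets) tuples (assumed format).
    scanner: address of the host suspected of scanning.
    Returns (probed_only, sessions): two sets of destination addresses.
    """
    probes, sessions = set(), set()
    for src, dst, port, packets in flows:
        if src != scanner or port not in WINDOWS_PORTS:
            continue
        if packets <= probe_packets:
            probes.add(dst)
        else:
            sessions.add(dst)
    return probes - sessions, sessions


# Illustrative records: three hosts probed, one of them followed by a real session.
observed = [
    ("10.0.0.5", "10.0.0.6", 445, 1),
    ("10.0.0.5", "10.0.0.7", 135, 2),
    ("10.0.0.5", "10.0.0.8", 445, 1),
    ("10.0.0.5", "10.0.0.8", 445, 140),
]
probed_only, sessions = split_probes_and_sessions(observed, "10.0.0.5")
print("probed only:", sorted(probed_only))
print("investigate next:", sorted(sessions))
```

Hosts in the second set are the ones where the attacker may have pivoted, so they go on the list of machines to investigate.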
This is where threat intelligence is really valuable. You can see that we started with
one piece of information about a malicious site, and based on that piece of information
we were able to analyze the kill chain a little bit and see the whole process of the attacker's
attack activity, and we discovered new IP addresses that the attacker was using; and
then based on these IP addresses, we were able to discover additional infections in our environment.
Another scenario where this can come into play is where you see zero-day attack activity
and then you see IPS signatures, for example, come out that will detect attacks that targeted
a vulnerability that was being exploited in the wild before it was publicly disclosed.
When you see that happen, your IPS may detect attacks that involve that vulnerability; but
the question is: were you targeted by those attackers in the past before that IPS signature
became available? When you get those IP addresses off of those IPS signature fires,
you can go back and check your NetFlow and see if you previously communicated with those
IPs before that IPS signature was there, and that's a great way to identify successful
attacks in your environment that happened in the past, before you had access to the
threat intelligence in question. Bear with me for one second.
Here's another scenario that is interesting. This is a SQL injection scenario. What
you see here is a console that's been set up in StealthWatch to monitor a number of
web servers and activity happening with those web servers. In this case, you see a great
deal of data leaving this web server and going to the internet, and that spike on these charts
significantly exceeds the day-to-day traffic that this web server is engaged in out to
the internet. That's strange, so we'll investigate it a little bit more deeply.
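One simple way to express "significantly exceeds the day-to-day traffic" is to compare an observation against the mean and standard deviation of recent history. This is a generic statistical sketch, not a description of how StealthWatch computes its baselines; the function name and the multiplier `k` are assumptions.

```python
from statistics import mean, pstdev


def is_spike(baseline, observed, k=3.0):
    """True if `observed` is more than k standard deviations above the baseline mean.

    baseline: recent per-day outbound byte counts for one server (assumed input).
    """
    mu = mean(baseline)
    sigma = pstdev(baseline)
    return observed > mu + k * sigma


# Illustrative daily outbound volumes in megabytes.
normal_days = [100, 110, 90, 105, 95]
print(is_spike(normal_days, 105))    # an ordinary day
print(is_spike(normal_days, 2000))   # a day with a large unexplained transfer
```

A fixed multiplier like this is crude: a server with almost constant traffic will flag on small changes, and one with very bursty traffic may hide a real transfer, so the threshold needs tuning per host.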
We take a look at the actual transaction in question, and we can see a very large amount
of data being downloaded from the server, so we dig in again and we take a look at the
site, and we see that, from a different source address, there has been a bunch of reconnaissance
activity that has been taking place against that host.
Again, we're beginning to work back our kill chain where, in the beginning, we saw this
event that looked like a lot of data being downloaded off the website, and we're able
to see now that there's also a bunch of reconnaissance activity that took place, and this is indicative
of a SQL injection attack where somebody has exploited a vulnerability in our website
to access raw records in our database and dumped the entire database out in the
output. We've got a few IP addresses now
associated with this actor that we could do some more investigation on to see if they
engaged in other activity on our network. These discussions have centered around external
threat actors. I think it's also important to consider insider threat as a subject. I
think that insider threat is something that a lot of organizations don't have a very good
practice around because people have not necessarily understood how to build an effective insider
threat practice in the past. I'm constantly talking up this book ... Carnegie Mellon
obviously runs this group called CERT, which I'm sure most of you are familiar with, and
they've been doing research on insider threat for many, many years. There are incredible
resources up on their website. If you Google for CERT insider threat you'll find their resources
up there. They published a book last year called The CERT Guide to Insider Threats,
which is the best guide on insider threats that I've come across. They have some very
good recommendations in there about how to build an effective program that's based on
evidence that they've collected over ten years of studying real insider threat cases.
One of the key pieces of information that CERT tells you is that the insider threat
is not strictly an IT problem. Whereas many computer security problems are considered an IT issue
exclusively, insider threat is a management issue, it's an HR issue, and it has to do
with the relationship that the company has with employees and employees that have become
disgruntled. Often the way that an incident is detected
in the context of insider threat, it's not because some computer system identified that
something was going wrong. It was because the people that work with the person who had
become disgruntled identified that that person was making threats against the organization,
and it did seem like that person was likely to do something wrong; and then that information
was then taken to IT, and IT was able to analyze monitoring systems and logs that were available
to find evidence of whether or not the suspicions were true, essentially. Typically, in insider
threat cases, if they are successfully identified and prosecuted, it's a consequence of good
log collection. In this scenario, consider that HR may have
come to you and said, "We're concerned about this particular individual. He threatened
to do something destructive to the organization. We'd like IT to take a look at that," or "This
person just quit on bad terms, and we're concerned they may have actually taken some data."
Within StealthWatch, if you've tied in user identity information, it's very easy to put
someone's username in and get information about what network transactions they've engaged
in, even if they moved around to different IP addresses such as when they were in the
office or logged in over the VPN, or moving around to different wireless access points,
you can get a complete picture of their behavior. Another thing, in addition to simply getting
the raw network transactions they engaged in, you can also see a picture of different
security events that fired against that host. In this case, we have the suspect data loss
security event, which detects data being exfiltrated out of the network, and we noticed that that event
fired while Lucy was using the IP address in question; and so maybe Lucy was exfiltrating
data. In this case, we can see that this transaction has occurred, so we've got some information
that Lucy might have moved a lot of data out of the network, and if that's what we suspect
Lucy of having done, then this becomes some evidence that we can use to establish that
that actually took place. Let's look at a slightly more complicated
example. This is ... in the course of looking at Lucy, we also noted that Baron had a suspect
data loss event that fired on his IP address, and so we're concerned that he might have
exfiltrated data as well. We take a look at Baron's exfiltration, and we look at ... in
this case we got some application details, so it looks like a MySQL database dump, which
is interesting, and it's significantly large and it's been sent out to a host in the Ukraine,
which is troubling. We note that this exfiltration occurred at 9:07 a.m.
We continue to investigate this, and first of all we're wondering: where did Baron get
this MySQL dump. Baron is not a person who usually works with database systems, and we
look at the different network transactions associated with this IP address and we notice
that there was a significant download from a particular database server on our network
that occurred to this host. Clearly Baron dumped some data out of one of our critical
databases; then he exfiltrated it to a host in Ukraine.
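The pattern in this scenario, a large internal download followed shortly by an outbound transfer of similar size from the same host, can be sketched as a simple correlation. The record type, the 30-minute window, and the 20% size tolerance are hypothetical choices for illustration, and the addresses are placeholders.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Transfer:
    """Hypothetical summary of one large flow (field names are assumptions)."""
    start: datetime
    end: datetime
    sender: str
    receiver: str
    nbytes: int


def staged_exfil(downloads, uploads, window=timedelta(minutes=30), tolerance=0.2):
    """Pair internal downloads with outbound uploads that look like the same data.

    A pair matches when the upload is sent by the host that received the
    download, starts within `window` after the download ends, and is within
    `tolerance` (as a fraction) of the download's size.
    """
    pairs = []
    for d in downloads:
        for u in uploads:
            same_host = u.sender == d.receiver
            follows = timedelta(0) <= u.start - d.end <= window
            similar = abs(u.nbytes - d.nbytes) <= tolerance * d.nbytes
            if same_host and follows and similar:
                pairs.append((d, u))
    return pairs


# Illustrative records; addresses and sizes are made up.
day = datetime(2013, 6, 3)
db_dump = Transfer(day.replace(hour=8, minute=55), day.replace(hour=9, minute=7),
                   "10.0.9.20", "10.0.0.5", 800_000_000)
upload = Transfer(day.replace(hour=9, minute=7), day.replace(hour=9, minute=40),
                  "10.0.0.5", "203.0.113.9", 790_000_000)
for d, u in staged_exfil([db_dump], [upload]):
    print(f"{d.receiver} pulled {d.nbytes} bytes from {d.sender}, "
          f"then sent {u.nbytes} bytes to {u.receiver}")
```

A match links the two transfers in time and size, which supports the inference that the exfiltrated data came from that database server. It does not tell you who was at the keyboard.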
Now we're very upset with Baron, and we continue to dig in. We notice that, prior to ... in
this case we're still looking, so this is the database transaction, and we look at the
timeframe associated with the database transaction and we can see that this timeframe occurred
between 8:55 a.m. and 9:07 a.m.; 9:07 a.m. is when the exfiltration to the Ukraine began,
so again, this is just more context that this database dump occurred right before the exfiltration
of it happened. We dig into Baron a little deeper, and we
find out that, in fact, there has been another transaction between Baron's client and this
host in the Ukraine that's been going on since 4:00 that morning, and this looks an awful
lot like a command-and-control channel, so it may be the case that Baron is not actually
responsible for stealing this data and that Baron's computer has been infected by an external
actor, and as a consequence, that external actor used Baron's system to steal some data
from our database and send it to the internet. It's important that we know that and that we begin
to investigate deeper and try to understand more about this attacker and why they're compromising
our network. I think that it's really important to keep
these 5 W's in mind as you perform these investigations of your computer network. You're trying to
establish who did this activity. You want to know what they did, and when I say you
want to know what they did, you want to have a complete picture of everything they did.
You want to know what systems they targeted, how they targeted those systems, and what data they
accessed. Every bit of information that you put together about their overall behavior
can lead you to being able to, first of all, ensure that you've completely rooted them
out of your network, and secondly, understand how you might be able to detect attack activity
from them in the future. You want to know where they went. You want
to know what in your network they accessed. You want to know when. You want to build a
timeline for the incident so that you can, again, be sure that you have a complete understanding
of what happened; and ultimately, you want to understand what their objective is. That's
very important in terms of being able to improve your posture going forward. If you know what
your adversary is after, you can take steps to protect that asset more effectively and
to monitor that asset more closely. Hopefully, I've established that forensics
in the context of computer security is not merely about preparing evidence for trial.
I mean, we may never be able to prosecute these APT actors that are hitting our network
from remote sites. It's become more about understanding the attacks that we're subject
to and using that understanding to better protect our networks. It's not just about
analyzing hard drives. It's about getting a complete picture of an incident that has
affected us from a NetFlow standpoint as well as from a packet capture and syslog
standpoint. I hope that it's opened your eyes a little
bit about what's possible in terms of network audit trails.