Saturday, June 20, 2009

What do you seek?

If you work in this field long enough you will come across a situation where you need to justify your methodology. You will be asked to show why you need to look at all of the data points you do. It's par for the course. When I get asked to do this, I respond simply by asking the following question in return.

Do you seek an answer or do you seek the truth?

This question tends to make the doubter pause. When you are staring a potentially damaging case in the face, do you seek an answer or do you seek the truth? More importantly, do the decision makers seek an answer or the truth?

There is a school of thought out there that says if any file containing sensitive data is accessed after the system is compromised, then analysis should stop right there, a line should be drawn, and anything accessed after the compromise date should be notified upon. I talked about it back in December when discussing footprints in the snow. Think on that for a moment. If a system in your organization is compromised and you run an antivirus scan and trample on access times, it means you're done, you're notifying, and you're going to have a lot to answer for when your customers get hold of you. You will not have given the case its due diligence.

In just a second you'll see a graph that I generated. It shows file system activity based on a mactime summary file. Take a few moments to analyze the graph. *I did have to truncate the data set. There were hundreds of thousands of files touched on 5/12*
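
For those who want to build the same view, the Sleuth Kit does the heavy lifting. Here's a minimal sketch assuming a dd image; the file names are placeholders and the exact option syntax varies between TSK versions, so check your man pages.

rem Build a mactime body file from the image (-m prefixes paths with C:)
fls -r -m C: image.dd > body.txt
rem Produce a delimited timeline plus a per-day activity index
mactime -b body.txt -d -i day daily_index.txt > timeline.csv

The per-day counts written to daily_index.txt are what a graph like the one below gets built from.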



Does it tell you anything? Imagine the system were compromised on 5/5/09. There are a few things that should stand out almost immediately, such as the dramatic increase in file system activity beginning on 5/11 and continuing through 5/12. Or, more simply, that there is a story to be told here.

Do you seek an answer or the truth?

A person in search of an answer is going to settle for a response of "ZOMG the attacker stole a lot of data and you're notifying on every single file on the system that contains PII". If you seek an answer you are not interested in the story that needs to be told; you are not interested in any of the details of the case. You simply want to put the matter to rest, get it behind you, and move on to the next case that will be decided by the uninformed.

A truth seeker will ask what happened on 5/11 and 5/12. A truth seeker will interview key individuals, a truth seeker will evaluate the log files present on the system and many other data points to determine what the cause was. A truth seeker will want to hear the story based on your expert opinion, which you reached by examining all sources of data.

A truth seeker will take interest upon hearing that the system administrator not only scanned the hard drive for malware, but he copied hundreds of thousands of files from the drive. A truth seeker will want to see the keystroke log files. A truth seeker will thank you for decrypting the configuration file and output used by the attacker to determine intent and risk. A truth seeker will ask you to look at network logs and a variety of other sources of data to reach a conclusion and render an opinion.

So, the next time someone questions your methodology, ask them if they want an answer or the truth. If all they want is an answer, more power to them; ignorance is bliss, after all. But there is always a story to be told.

Windows Forensic Analysis 2E - a review


In ancient times, when philosophers and scientists gathered to discuss and debate important topics, people would travel for weeks and months just to hear the debates. To listen to the great minds of the time, to learn from them, and on occasion to ask questions. In 2009 that trend continues, though in a different fashion.

In the case of Windows Forensic Analysis we are fortunate enough to have Harlan Carvey. He has a deep well of knowledge to pull from and he continues to pull buckets of information out of the well to keep us all well hydrated. I was honored to read this book, and it's my privilege to write a review. It's the least I could do.

It's a textbook, it's a field manual, it's reference material. This is Windows Forensic Analysis Second Edition and it's the best damn book on the planet for Windows forensics. I thought I liked the first edition, and then I read the second.

It's been updated, to be sure, but it's also been expanded. There's current information throughout the 400-plus pages of content. There are case studies, and there are details you won't find elsewhere.

Want to know how to dump memory and collect volatile data? It's in the book.
Can't recall which tool has certain limitations or what the tool can do? It's in the book.
Want to know how to analyze volatile data? It's in the book.
Want to learn how the registry works? It's in the book.
Want to know how to do Windows Forensic Analysis? Read this book.


I've watched the forums and mailing lists since the first edition of the book was released two years ago. Time after time I read the questions being asked and went to the book. In an overwhelming majority of cases, the answer was there. To those of you that asked these questions, do yourselves a favor. Go to the bookstore or online store and buy the book, read it, highlight it, dog-ear pages for reference. Make use of the knowledge that has been shared; your clients deserve it.

In ancient times, people would travel for weeks or months to listen and learn from the greats...all you have to do is spend a little money and open the book.

Thursday, June 18, 2009

The need for speed

When it comes to incident response we are still struggling to close the gap. It's been mentioned here before, and in other places. When a compromise occurs, how quickly do you get to analyze it? Hours? Days? Sometimes even weeks!

And when you do get your hands on a system, what are you left with? A lot of trampled data points and an incomplete data set. And...what kind of time frame are we talking about to conduct the analysis, just to decide whether or not it passes the "who cares" test? Consider the time it takes to acquire and process an image in FTK or EnCase. This is commonly referred to as machine time, and it's not uncommon to have too many systems locked up in machine time while the analyst waits for the case to process. And then of course you have the report writing part of the process. All told, this takes an average of 20-40 hours per case, often more. This equates to two bottlenecks in the process.

The two primary bottlenecks are:

  1. The time it takes to get to the point where analysis data points are collected, processed and analyzed. There is a lot of time wasted when the analyst is waiting for "machine time" to complete.
  2. The time it takes to generate a report.
This is the traditional model and its inefficiencies. If this doesn't make sense, perhaps the picture below will help.


So now let's look at a model that involves a triage phase.

By including a triage process that collects primary analysis data points while the system is live, we increase our efficiency by multi-tasking. Collection of primary data points can be largely automated, as sketched below. The analyst can then focus on analysis of the collected data points while the traditional acquisition and case processing take place. Essentially, this takes advantage of all available resources - human time and machine time - by changing the model.
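
As a rough illustration, here's a minimal batch sketch of the multi-tasking; every script it calls (acquire_disk.bat, collect_volatile.bat, pii_search.bat, fls_mactime.bat) is a hypothetical stand-in for whatever collection tools you actually use.

@echo off
rem triage.bat <case number> - sketch only, all called scripts are placeholders
rem Kick off the long-running acquisition in its own window (machine time)
start "acquire" cmd /c tools\acquire_disk.bat %1
rem Collect primary data points in parallel while the image is acquired
start "volatile" cmd /c tools\collect_volatile.bat %1
start "pii" cmd /c tools\pii_search.bat %1
start "timeline" cmd /c tools\fls_mactime.bat %1

Each start spawns its own window, so the analyst is free to begin reviewing data points as they land rather than waiting on the acquisition to finish.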

The benefits of this model are:

  1. Not always having to acquire a disk image. If the "who cares" test isn't passed with the data points that are collected early, then you can easily move on.
  2. This is a one-to-many relationship. An analyst can quickly collect data points from several systems and conduct triage analysis instead of waiting for a linear acquisition process to progress.
  3. It uses all available resources at the same time instead of waiting on one component to complete.

Here is another illustration.

Again...it's not like this is built around F-Response or anything. I'm not saying...I'm just saying.

What are your thoughts? I *am* looking for feedback.

Wednesday, June 17, 2009

Active Directory Snapshots

With Vista, Microsoft finally made proper use of Volume Shadow Copy and the Volume Shadow Copy Service. A lot of great work was done to help others use this during analysis. Server 2008 continues this model and applies it to Active Directory. Sounds cool, eh?

First off, let me say that this is well known to sysadmins, but I'm fairly certain it's not well known in this part of the industry. I've not seen it discussed on any list or forum I pay attention to, at least.

For background - Read these pages here and here...I'll wait.

And now, go read this page here....I'll wait for you again.

So, now that you've read about creating Active Directory snapshots and how to mount a VHD file in Windows, let's discuss it.

When performing incident response in an Active Directory environment, you're most likely going to want to look at a domain controller, especially if the domain controller is compromised or there is something funky happening in the directory itself. Any self-respecting sysadmin is going to have a regular system state backup of the domain controller. This is done so restores can occur if objects are inadvertently deleted, and also as a good practice. In Server 2008, this backup is stored as a .VHD file. In a response scenario involving AD, we want to maintain our methodology of not modifying the system any more than we have to, so we don't want to work on a live copy of Active Directory; we want to work from a snapshot of it.
Here's a pseudo-scenario.

A compromise is believed to have occurred in Active Directory.
Logging was disabled by the attacker on the domain controller, or the attacker covered his tracks in the logs.
You have been tasked with figuring out what was changed.
You have a recent system state backup.
You mount the system state backup and recover the AD core files.
You create an Active Directory snapshot.
You load up Sysinternals Active Directory Explorer.
You load the snapshot and the AD core files and diff them in AD Explorer.
You now have a smaller dataset to work with and you have a point-in-time diff of "what changed". A rough sketch of the command sequence follows.
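
For the curious, here's a rough sketch of that sequence on a Server 2008 domain controller; see the pages linked above for the authoritative syntax, and treat the snapshot path, GUID, and port here as placeholders.

rem Create a snapshot of the running directory
ntdsutil snapshot "activate instance ntds" create quit quit
rem Mount it, using the GUID (or index) reported by "list all"
ntdsutil snapshot "list all" "mount {GUID}" quit quit
rem Serve the mounted ntds.dit on an alternate LDAP port
dsamain -dbpath C:\$SNAP_datetime_VOLUMEC$\Windows\NTDS\ntds.dit -ldapport 51389

From there, point AD Explorer at localhost:51389 and at the live directory, save an AD Explorer snapshot of each, and use its Compare feature to generate the diff.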

I'll be putting this together in a more formal manner...but I wanted to throw this out there for anyone that deals in Active Directory compromises, especially with Server 2008 domain controllers.

Memory Acquisition for First Responders

Not long ago I sat down with a group of First Responders to discuss triage of security incidents. I discussed leaving the network connection up so I could remotely access the drive and physical memory. Their response is one that I expect many to have come across.

"If we leave the system up, even if we tell the user to not use the computer, the minute we walk away, the computer will be used."

That was kind of interesting to me considering what's at stake, but I completely understood their point of view. Too many organizations can't trust their users. So then I thought, hmmm...well, memory acquisition has come so far, so fast, that I can simply teach tech staff at any level to collect physical memory. With targeted training and proper documentation it's a fairly straightforward process to follow on contained systems.

Here's a sample from a doc I drafted detailing use of mdd from ManTech.

ManTech DD (MDD)

Limitations:
Less than 4GB of memory
32-bit Windows operating system

Installation
Download mdd from ManTech
You can download the standalone executable (recommended) or a .zip file.
Copy the file(s) to a directory on your USB key
Rename the mdd executable to mdd.exe

Usage
Log in to the compromised system
Insert USB drive
Create a directory for the incident on the USB key or SMB share
Open the trusted command prompt for the operating system
Change directories to where mdd is installed
Execute mdd

Command line
E:\IR\mdd>mdd.exe -o E:\00000\memorydump.img

Where 00000 is the case number you've been given.

Notes
Mdd creates an md5 hash of the output of the memory dump. It’s important to capture this information. You can take a screenshot of the window using Ctrl+Alt+Print Screen or copy/paste from within the command line to a text file. Both forms of output are acceptable. Save this file as memorydump.md5
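
If you'd rather not rely on screenshots or copy/paste, a simple redirect also works, assuming mdd prints the hash to the console as described above:

E:\IR\mdd>mdd.exe -o E:\00000\memorydump.img > E:\00000\memorydump.md5 2>&1

The 2>&1 captures anything written to standard error as well; open memorydump.md5 afterward to confirm the hash was recorded.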


Training first responders to do a memory acquisition is much easier these days.

START methodology

START is a methodology applied to Mass Casualty Incidents or triage centers, and frequently to battlefield medicine. START stands for Simple Triage and Rapid Treatment. I will focus primarily on Mass Casualty Incidents and triage centers. This methodology has a direct tie to The Golden Hour.

It is my humble opinion that START can be applied easily to computer security incidents; those of both the mass casualty and triage center variety. In a Mass Casualty Incident you are typically confronted by several potential issues, ranging from sensitivity of data to criticality of the resource and the threat posed by the compromise. These casualties come from all sides of the organization. The same holds true when you have an influx of dissimilar incidents and you need to prioritize them - think the ER at a major hospital on a warm Friday or Saturday night.

That said I humbly present my adapted START methodology.


Stage 1 Triage
Stage 1 triage is completed on a live system. This stage requires a network connection.

Conduct Rapid Triage
  • Collect Volatile Data
  • PII search of non-system-created directories
  • Limited malware scan
  • FLS & MACtime

Conduct Rapid Assessment
  • Preliminary memory analysis
  • Review PII tool logs
  • Review Antimalware logs
  • Review FLS & MACtime logs
  • Establish Time of Compromise

Influential factors in Stage 1

MAD time
  • MAD time is the Maximum Allowable Downtime - how long an organization can withstand the loss of a resource. Or, more to the point…the time it takes for someone to get pissed off (MAD).
Initial Threat assessment
  • Is it known
  • Identify any knowns
  • Is it attacking other systems
  • Is it spreading
Initial Risk assessment
  • Sensitive Data presence
  • System Profile
Stage 2 Triage

Stage 2 triage is completed on a disk image either after Stage 1 has been completed, or in place of Stage 1 in the case of a physical drive being delivered or acquired from an unpowered system.

Conduct Rapid Triage
  • PII search disk image
  • Data point collection
  1. Network Logs
  2. Prefetch
  3. Registry
  4. Browser History
  5. MACtime data
  6. Malware scan
  7. Event Logs
  8. Application Logs
Log the case and turn it over for analysis. The combination of the above data points is more than enough to get an examiner started.
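
As a rough sketch of the collection, assuming the acquired image has been mounted read-only as M: with an XP-era layout and E:\00000 as the case folder (all placeholders), something like the following grabs the prefetch, registry, event log, and per-user data points:

rem Sketch only - adjust paths for the OS version on the image
robocopy M:\Windows\Prefetch E:\00000\prefetch *.pf
rem Registry hives and, on XP, the .evt event logs live in system32\config
robocopy M:\Windows\system32\config E:\00000\config
rem Per-user hives and browser history files
robocopy "M:\Documents and Settings" E:\00000\users ntuser.dat index.dat /s

Network logs come from the infrastructure rather than the image, and the malware scan and MACtime data can be run against the mounted volume directly.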


This sure looks like a case for F-Response, especially if you combine Stage 1 and Stage 2 triage...I'm not saying I built this around it or anything...I'm just saying.

Responder Pro use case

And then conficker woke up...

It was April 10th and there was a bad moon rising. Three systems (one laptop, two desktops) woke up, updated, and began attacking the network infrastructure at a customer site. Within minutes, hundreds of accounts were locked out. Within hours, all were locked out. After the initial screams for help and DoS complaints, I got onsite, got the interview out of the way, and was off to do the field work.

After collecting memory dumps - using FastDump Pro - I went back to the office of the local system administrator and fired up my Mac and my XP virtual machine with Responder Pro installed.

After running the memory dumps through Responder Pro I did a quick analysis, comparing and contrasting. The questions I had to answer immediately were "HOW the hell did this happen?" and "HOW do we prevent it from happening again?"

I won't bore you with gritty details.

To answer the first question of "HOW the hell did this happen?" let's take a look. Two desktop systems and one laptop. I had my suspicions, but me being me...I had to know. Using DDNA I quickly identified the injected svchost.



Next I quickly viewed the strings of the binary.


Note the telltale signature of the HTTP string http://%s/search?q=%d. That's confirmation of Conficker.
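
If you don't have Responder Pro handy, the same confirmation can be pulled from a dumped module with Sysinternals strings piped to findstr; the file name here is hypothetical.

rem Search the extracted binary for the telltale URL format string
strings extracted_svchost.bin | findstr /i /c:"search?q="

The point is the artifact, not the tool - once the injected module is identified, the string check takes seconds.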

And for the coup de grâce, I looked at the Internet history.


That was easy. HOW the hell did this happen? The laptop user, who had been out of the office for a while, returned, plugged in, nailed two unpatched systems, and instructed them to download a copy of the worm from the laptop on port 4555. All told, it was 15 minutes to Root Cause Analysis. That's what I like to call Rapid Assessment.

HOW the hell do we keep this from happening again? Well, that's an exercise I leave to you...needless to say, NAC has gained traction at this customer site.

Tuesday, June 16, 2009

The human dimension

I've been on a rather large engagement for the past few weeks, and as part of it I was co-opted to provide some education to "the end user". I know what you're saying, and you're probably right. End user training doesn't work. Unfortunately it's part of the process, and no amount of technology can solve the human factor.

Let me begin with a personal opinion on malware infections, especially those that are based on browser hijack, drive by downloads, and all other web based exploitation of end user systems.

"In the majority of browser based, malware related incidents, systems get compromised because the person using the computer, unless it is their job to browse the web outside of corporate resources, is not working at the time of infection".

So then, let's explore the human dimension when it comes to these cases. As Microsoft mentioned recently, the fastest-growing threat right now is rogue antivirus. The question is why? Why is it so successful? How have so many people been duped by it? The answer is not as clear, unfortunately, but I do have some thoughts on the issue.

First, there's the economics of it all. These guys can do this cheaply and they are making a small fortune doing it. Second, there is no deterrent. Third, and more importantly, there is the human dimension of it. Since the first two won't be solved easily, let's evaluate the human side of the equation.


What's in play during a rogue antivirus malware incident?

1) A user is browsing the web.
2) The user is assaulted with popups.
3) The popups take advantage of common flaws in human computer interaction.
  • We are stimulated more by visuals than we are by written words.
  • We click before we think and read the message in front of us.
  • The depth of knowledge of computers has drastically decreased as the technology has become more a part of our life. Therefore what is being presented is not understood.
  • Anything that is perceived to "get in the way" will be ignored, avoided, and subverted.
  • A panic situation is created, and the user reaches an emotional state as they appear to lose control of the system.
  • A familiar setting (My Computer window) is presented in an altered state, with signs of alarm, further contributing to the panic and emotion.
  • The system is exerting authority over the user by claiming in no uncertain terms that something is wrong with the system.
  • What happens on a computer has not been translated to the physical world. The fear factor doesn't exist.
4) An executable is presented as the solution.
5) Then, salvation is presented to the user in the form of rogue antivirus products.
6) After this, the credit card limit is reached quickly.

With that background in hand, why don't we look at the average user, common pitfalls when trying to train them, and why delivering technical training to non-technical people fails more often than it succeeds.

Why it fails more often than not:

1) We're boring the audience.
2) Technical jargon doesn't work.
3) What we're saying lacks relevance.
4) The average person comprehends at a 6th-8th grade level. We tend to assume people are smarter than they actually are, certainly in terms of computer use.
5) People tend to have two types of experiences burned into the brain: humor and trauma. Presentations tend to be dry, dull, and lacking in interaction, as well as in either humor or trauma.
6) There is no relationship made to the real world; analogies aren't as digestible as they need to be to have an impact.


More on this subject later.

Monday, June 15, 2009

Too close to the problem

This is probably the 20th time I've sat down in an attempt to complete a blog post in the past month or two. With that many "draft" posts I finally realized what the problem has been. I've been too close to the problem of forensics and incident response to write. The past few months have been a period of escalating action and reaction. An arms race if ever there was one. I feel comfortable saying that I've seen more cases in the past few months than many people see all year long. I do not say this in a gloating fashion or as an indication of anything other than it's been hell. I have been involved in many cases I never wanted to see but secretly hoped to: cases full of intrigue and shadowy figures that I may never get the opportunity to meet face to face, so I must battle them from afar, armed only with my knowledge of their behavior and their tools; cases involving life and death; and various other types. I have had talks with various government agencies of late, and more run-ins with local and state law enforcement than I can recall having in the past 5 years combined. I have been too deeply entrenched in a backlog of cases to see beyond a tactical level.

I guess you could say this is the fog of war. I am too close to the problem, and my writing and this blog have suffered. During this time I took an old adage into account, and that adage is "You can't learn anything when you're talking". Needless to say, I've learned a lot in two long months.