Tuesday, December 30, 2008

Footprints in the snow


A computer intrusion takes approximately 60 seconds (usually far less) from initial entry to setting up a back door with administrative access. If it is not detected within the golden hour, it tends to be a month or more before someone notices they've been breached. Imagine you've been called to a crime scene involving a theft. The footprints above are representative of those left by the suspect.

By the time we arrive on scene, a lot has happened that affects our ability to accurately investigate. Forces and factors are at play. Time is the constant factor, the one force multiplier that can confound an investigation above all else. You see, time is not a force in and of itself. It is a constant (as far as those of us who are not full-time philosophers are concerned) that never changes. Consider the footprints in the photo above. All things being equal, if the weather did not vary from today's weather and the temperature did not change, the snow would not melt; there would be no rain, wind or other elements to alter the footprints in the snow. However, as we are aware (again, for those of us who are not full-time philosophers), though time is a constant, the weather and other elements are not; they vary. It could be 60 degrees tomorrow, it could rain, or it could snow. Someone could ski over the footprints, someone could shovel them away... you get the idea.

Regarding digital investigations, time is what allows system processes (antivirus scans and scheduled defragmentation, for instance) to impact artifacts left by an intruder. It is what gives the attacker more of an opportunity to find your PII data and cover their tracks. It is what allows the user to modify their files and the system. It is what allows untrained technicians to delete files left by the attacker.

In short, time is what permits other forces to have an effect on the persistence of data. We must be careful in this thinking, so let me restate: it permits other forces to have an effect. It does not guarantee that a change will occur that will impact our ability to investigate the intrusion. So how do we use time in an investigation?

First we must accurately identify the time of intrusion. Once the intrusion is contained, we have what is called temporal proximity: the duration between two separate points in time. Our evaluation of artifacts takes place within this time frame. This is fairly well known; however, I have seen this evaluation of artifacts squandered by those who suggest that the only evaluation that needs to take place is that of time itself. In practice, what I have seen are analysts who simply evaluate MAC times. The evaluation is simple - any file containing PII data with an access time that postdates the time of compromise is considered notifiable. This is a safe play, and I congratulate those who feel morally and socially responsible enough to notify so easily; however, it is a knee-jerk reaction and indicates laziness. Look at this photo here, of the same location taken a day after the initial footprints were made.



What has happened to the footprints in the snow?

Time moving forward allowed the night's snow to cover the footprints. Are the footprints still present, or has the overnight period completely confounded the investigation? Would you be prone to suggesting that simply because there is fresh snow, the footprint is destroyed? Take a closer look. You can still see the faint outlines of footprints. Applying even the slightest amount of investigative elbow grease, what do we see?


Hmmm... an impression of a foot, or footprint, is easily visible. Now, we can easily expose the obscured tracks within a certain amount of time, though after enough time passes, the footprints will be indistinguishable from the surrounding area. As time passes, the ability to accurately explain the source of the original footprints becomes more difficult. It is because of this that speed is of the essence. We must close the gap between time of compromise and time of containment.

As seen below I have exposed the tracks but what else do you see?

That's right, you see additional footprints. How were they made, who made them and when? Were they made by another person, the intruder, me, or some other unknown force? Each footprint must now be analyzed individually. Had we cast the footprints after documenting the scene, and taken the casts for immediate analysis, our investigation would be more complete and accurate. Now we will have a slightly more difficult time, but it can still be done. However, we must be able to explain the changes that occurred in the time that elapsed since the original footprints were made.

You may be starting to see how time can confound the investigation of the original footprints. As time continues forward, the first responder and investigator must be even more careful to preserve the original. This is the reason documentation and a sound approach, especially when dealing with volatile data, are critical.

And if time continues, then what? Will time allow more environmental forces to influence the ability to accurately investigate?


We can still see the rough outline of a footprint here, even though the snow is in the process of melting. Time has once again allowed another force to alter the original. Eventually, our ability to see the footprint disappears. After more time passes, the footprint looks like this.


Wait. We can no longer accurately establish the location of the footprint. Now, in this case I simulated about 4 months of time, and it is around this point that our ability to accurately investigate an actively used system in an intrusion becomes nearly nil.

Okay..enough meatspace.

Remember I said that time is what permits other forces to have an effect on the data. This applies greatly to MAC times. An antivirus scan could take place after the time of compromise that updates MAC times, or a user could have accessed those files containing PII data. Simply put, any force capable of modifying timestamps post-compromise could have updated the timestamp. For this reason, MAC times do not provide us with anything other than a point in time, a measurement if you will.
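To make this concrete, here is a minimal sketch (Python, against a scratch file and an arbitrarily chosen date) of how trivially MAC times can be altered: any process with sufficient privileges can set a file's timestamps to whatever it likes.

```python
import os
import tempfile

def timestomp(path, new_epoch):
    """Set a file's access and modification times to an arbitrary epoch."""
    os.utime(path, (new_epoch, new_epoch))

# Demonstration against a throwaway scratch file.
fd, path = tempfile.mkstemp()
os.close(fd)

timestomp(path, 1104537600)  # 2005-01-01 00:00:00 UTC, chosen arbitrarily
st = os.stat(path)
print(int(st.st_atime), int(st.st_mtime))  # both read back as 1104537600

os.remove(path)
```

This is exactly why a timestamp alone is a measurement, not an explanation: the stored value says nothing about which force set it.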

Secondly, time needs to be evaluated as a point or points when change occurs. Our role is to explain the cause of the change. When discussing digital forensics, systems should be evaluated as a world of events running in a steady mechanism of before and after, of cause and effect. When an intrusion has finally been contained and the analysis is underway we must evaluate the changes that took place during that window. Was a key file accessed? Who accessed it? Can we explain the access? Did the intruder gain administrative access? Did they have access to files containing PII data? Did they have access to other systems? Did they use that access? Did they install a backdoor? Did they enumerate your other systems? Did they attempt to cover their tracks? Is malware present? What are its capabilities? These are just some of the evaluations that must take place during the investigation.

Finally when a change occurs at a specific time, there will be several plausible explanations for the change. This is where we must apply a scientific method of testing the most plausible explanation for the change. We can reduce the noise.

For example:
FACT:
A file's access time was updated during the time an attacker was operating on a computer.

BACKGROUND:
Q: Under what conditions is an access time updated?

A: An access time is updated when a file is opened for reading, or when its attributes are read.

Q: Does a file access time being updated indicate execution?

A: NO. It simply indicates that the file's attributes were accessed.

Example: the 'touch' command would update an access time, as would an A/V scan and many other utilities.

Conclusion: There are many plausible explanations for a file's access times being modified.

Our job: Determine what is most plausible and present your conclusions with supporting documentation.

Assuming a Windows XP system:
In the case of an executable being executed under normal circumstances, what artifacts could we expect to find?
1) A prefetch file would be created
2) Depending upon method of execution we could expect to find artifacts in the registry.
3) Memory analysis would show that it had been executed
4) Other sources yet to be discovered or mentioned.
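As a rough sketch of checking point 1: on a default Windows XP install, each execution leaves a file named EXENAME-PATHHASH.pf under C:\WINDOWS\Prefetch. The helper and the directory listing below are hypothetical, to show the matching logic only.

```python
def prefetch_hits(listing, exe_name):
    """Return prefetch entries indicating exe_name was executed."""
    prefix = exe_name.upper() + "-"
    return [f for f in listing if f.upper().startswith(prefix)
            and f.upper().endswith(".PF")]

# Hypothetical directory listing of C:\WINDOWS\Prefetch:
listing = ["CMD.EXE-087B4001.pf", "FTP.EXE-0FFE1F05.pf", "NOTEPAD.EXE-336351A9.pf"]
print(prefetch_hits(listing, "ftp.exe"))  # ['FTP.EXE-0FFE1F05.pf']
```

Absence of a hit is not proof of non-execution (prefetch can be disabled or aged out), which is why the other artifact sources still matter.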


Variables: (Factors and Forces).
1) Intruder privileges - with full administrative privileges, the attacker effectively has their hand on the clock's dial and can do what they please.
2) System operation - Antivirus scans, backups, other scheduled tasks that have the potential to alter an access time.
3) User activity - a user logged in to the system at the time could have done something to update the access time.
4) Intruder activity - The attacker used ftp.exe or had a tool capable of modifying timestamps on the system.
5) Unknown possibility - something that hasn't been thought of or discovered that has the possibility to modify access times.

So, let's start processing this:

What we know:
- An antivirus scan was being run when the file access time was updated.
- The sysadmin confirmed this.
- There are no artifacts present in the registry suggesting execution.

Given the data presented, what do you believe? More importantly, what would someone else, a lay person (read: decision maker) in particular, be likely to believe?
That an access time had been updated by:
A) a normal system operation (A/V scan)
B) The attacker had executed the file and exfiltrated data.

Are these the only possibilities? No, they are not. However, absent any data refuting A as the most likely answer, what would someone be likely to believe?

For B to be the more likely answer in this case what must be present?
1) Network or other logs during the time of compromise suggesting ftp connections.
2) Artifacts suggesting execution of ftp.exe



Let me summarize:

1) Speed is of the essence. The gap in temporal proximity must be closed. Others have said this (notably AAron Walters and Harlan Carvey)
2) Time is a force multiplier that allows other forces to impact artifacts.
3) Intrusions must be analyzed in terms of changes that take place between t1 and t2.
4) Strict MAC time analysis is lazy and inaccurate, and should be a last resort investigative method.
5) Changes of probative value should be examined in depth and plausible explanations should be presented along with an opinion and documentation.
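Point 3 can be sketched as a simple timeline filter: keep only the changes that fall between time of compromise (t1) and time of containment (t2). The events and window below are hypothetical.

```python
from datetime import datetime

def events_in_window(events, t1, t2):
    """events: iterable of (timestamp, description) tuples.
    Return only those falling within [t1, t2]."""
    return [e for e in events if t1 <= e[0] <= t2]

# Hypothetical timeline entries:
events = [
    (datetime(2008, 11, 28, 9, 0), "pii_report.xls accessed"),
    (datetime(2008, 12, 1, 2, 13), "ftp.exe prefetch created"),
    (datetime(2008, 12, 20, 8, 0), "routine defrag ran"),
]
t1 = datetime(2008, 11, 30)  # time of compromise (hypothetical)
t2 = datetime(2008, 12, 5)   # time of containment (hypothetical)
print(events_in_window(events, t1, t2))  # only the ftp.exe prefetch event
```

Each surviving event then gets the in-depth "explain the change" treatment described above, rather than a blanket notification.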

Monday, December 29, 2008

A quick analysis helper

I commonly analyze systems that run Symantec Antivirus Corporate Edition. A common question we have to answer is regarding the last date a scan was run and the date of the definition files. I did some quick research and came up with the following. May it also help others in the same situation.

The registry keeps track of Symantec definition dates in:
HKLM\Software\Symantec\SharedDefs

DefWatch_10 is the value, and its data contains the path to the definitions along with their date and revision.

EX:
DEFWATCH_10 REG_SZ C:\PROGRA~1\COMMON~1\SYMANT~1\VIRUSD~1\20080902.016

Defdate is: 20080902, rev 16.
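Pulling the definition date and revision out of that value can be scripted. A minimal sketch (Python, using the example data above; the trailing path component is formatted YYYYMMDD.rev):

```python
import re

def parse_defwatch(data):
    """Extract the definition date and revision from DEFWATCH_10 data."""
    m = re.search(r'(\d{8})\.(\d+)\s*$', data)
    if not m:
        return None
    return {"defdate": m.group(1), "revision": int(m.group(2))}

value = r"C:\PROGRA~1\COMMON~1\SYMANT~1\VIRUSD~1\20080902.016"
print(parse_defwatch(value))  # {'defdate': '20080902', 'revision': 16}
```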

Log files are located in C:\Documents and Settings\All Users\Application Data\Symantec\Symantec Antivirus Corporate Edition\7.5\Logs

The key to the logfile is here

Files are date-stamped as follows: mmddyyyy.Log

Pulling out relevant information can be accomplished in many ways.

One simple way is by doing the following:

[root (Logs)]# awk -F, '{print $5" "$6" "$7" "$8" "$35}' 09102008.Log

This returns the following information:

Computer Name, User logged in, Name of the malware identified, File location of the malware, IP Address of the system

A scan starting looks like this:

260A1C0B2618,3,2,9,D98B90D03,Administrator,,,,,,,16777216,"Scan started on selected drives and folders and all extensions.",1227890305,,0,,,,,0,,,,,,,,,,,{C446AF0D-2434-4C32-99F7-B41DC042A2DC},,(IP)-0.0.0.0,,WORKGROUP,00:0C:29:E6:8C:72,10.1.6.6000,,,,,,,,,,,,,,,,0,,,,D98B90D03

The key to interpretation is fields 2 through 4, the three values after the leading hex timestamp. In this case they are 3,2,9, which indicates a realtime scan started. A realtime scan differs from a manual scan in that a realtime scan is initiated by the system, while a manual scan is initiated by the user. A manual scan looks like this:

260B1D142927,3,2,1,D98B90D03,Administrator,,,,,,,16777216,"Scan started on selected drives and folders and all extensions.",1230601302,,0,,,,,0,,,,,,,,,,,{C446AF0D-2434-4C32-99F7-B41DC042A2DC},,(IP)-0.0.0.0,,WORKGROUP,00:0C:29:E6:8C:72,10.1.6.6000,,,,,,,,,,,,,,,,0,,,,D98B90D03

The key again is fields 2 through 4, which in this case are 3,2,1. This is a clear indicator that a manual scan was started by Administrator. When someone says "I didn't run an antivirus scan", you now have a quick way to determine whether or not they are telling the truth.
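For those who prefer Python to awk, here is a minimal sketch of picking fields out of a record like the ones above. Field numbers are 1-based to match awk's $5, $6 and so on; the record is a truncated copy of the manual-scan example, and the reading of field 15 as a Unix epoch timestamp is an assumption based on the sample records.

```python
from datetime import datetime, timezone

def field(record, n):
    """Return 1-based field n of a comma-delimited log record."""
    parts = record.split(",")
    return parts[n - 1] if n <= len(parts) else ""

# Truncated copy of the manual-scan record shown above.
record = ('260B1D142927,3,2,1,D98B90D03,Administrator,,,,,,,16777216,'
          '"Scan started on selected drives and folders and all extensions.",'
          '1230601302,,0')

print(field(record, 5), field(record, 6))  # D98B90D03 Administrator
# Field 15 appears to hold a Unix epoch timestamp for the event:
print(datetime.fromtimestamp(int(field(record, 15)), tz=timezone.utc))
```

Note the quoted message here contains no commas, so a plain split is safe; a production parser should handle quoted fields properly.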

The witness, the perpetrator and the victim Part I


When we investigate a system compromise we are often left with only one portion of the cause->effect equation. It's up to us to take what we are presented with and reconstruct the crime scene in order to determine what happened, and oftentimes whether or not PII data was acquired.

Using your imagination, try picturing the following scenario:
You arrive at a crime scene at a jewelry store and are led to a body lying on the floor in a pool of blood. There is a broken lamp on a table, a large dent in the painted wall at about the 5'6" mark near the victim, and a hole farther down the wall at about the two-foot mark. There's blood spatter on the wall, floor and ceiling. Bloody footprints surround the victim and lead away, out the back door of the store. The jewelry cases are smashed and there's blood on some of the glass. The victim is wearing a blue sweater and grey pants, is female, weighs 120lbs and is 5'6" tall.

So what you have is an apparent homicide with many traditional sources of evidence in play. How would you begin to investigate this scenario?

Now imagine the following. You are led to the scene of a computer intrusion at a local bank. You arrive at the office of a credit card manager and see the following:

A black Dell Optiplex 755 sits under a desk and a 19" monitor resides on the tabletop. An external hard drive is plugged in to the computer and sits on the tabletop, and you note a USB key plugged in to the USB hub on the monitor. A small HP MFC unit is plugged in and rests on a small table next to the desk. Some papers litter the desk along with a tabletop calendar, a rolodex, a phone and a blackberry. The computer is on, and has Microsoft Outlook 2003 open on the desktop along with Excel, Internet Explorer and one of the bank's internal applications for credit card management.

This is pretty typical. So, how do you begin your investigation? What's the major difference between the two scenarios?

In scenario 1, you appear to have no one to interview. You must examine the deceased, review tapes, interview acquaintances and so on.

In scenario 2, you have a person to interview. The credit card manager was obviously using the computer and someone decided to call you for one reason or another. You must be able to determine if they are the witness to the intrusion, the perpetrator or the victim. Or if they are all three!

Suppose in scenario 2 you conduct your interview before you touch the computer (which I always recommend). What questions do you ask? Questioning a person can be seen as a bit of an arcane art form. The goal is to get the interviewee to be forthcoming with responses. Many people get embarrassed easily and get defensive, especially if they know they did something they probably shouldn't have. We want them to be calm and accepting of us and our questions. So, set some ground rules with your own team first. A few helpful rules of interviewing could be:

1) Never accuse.
2) Keep your cool. Emotions play a larger role in system compromises than people believe.
3) Be aware of your body language. You must always be aware that your face, posture and hands play a huge role in gaining the trust of the interviewee.
4) Ask leading questions.
5) Listen. You can't learn anything if you're talking.
6) Be nice.
7) Get them talking and keep them talking until you have enough information to proceed appropriately.

With the information I provided in scenario 2, you have no way of knowing what has happened yet, however, I am willing to bet you have already made some assumptions and perhaps even made some hypotheses. This is a natural occurrence in the brain and it's not a bad thing, unless you fail to view every angle because you develop tunnel vision.

Assuming the credit card manager told you the following how would you proceed?

  • They arrived at 7:45am
  • They opened Outlook to check email, and read some mail
  • They opened an excel attachment containing this month's stats
  • They plugged in their blackberry to sync it
  • They plugged in their usb key to copy files they were working on at home
  • They opened IE and visited yahoo.com and started researching colleges for their teenage daughter who is looking at schools.

Books to buy

I'm ordering the following books this holiday season. What's on your list?

  • Windows Forensic Analysis Second Edition by Harlan Carvey
This book was awesome the first time around and now there's even more of it.
  • SQL Server Forensic Analysis by Kevvie Fowler
If you haven't heard of Kevvie or read his paper you're missing something special.
Kevvie's book is probably my most anticipated book for this year other than WFA.

  • Oracle Forensics using Quisix by David Litchfield
An Oracle forensics book by Litchfield... need I say more?

Thursday, December 4, 2008

What your antivirus isn't telling you part II

If my last post on this subject wasn't clear, here's an illustration:

12/2/08 - The Hallmark/Coke/Mcdonalds postcard/promotions/coupon malware was being sent via email.

12/2/08 - Malware was submitted at 3:30pm

12/2/08 - At 6:15pm the malware was classified as downloader by Symantec.

The definition of downloader?

Downloader connects to the Internet and downloads other Trojan horses or components.

What does that malware actually do?

It spreads in multiple ways:
  • Reads your address books and emails the malware
  • Copies itself to USB media

It also:
  • Is a keylogger
  • Opens a backdoor
  • Phones home over port 80
  • Injects itself in to explorer.exe

12/3/08

For 24 hours this was detected as "downloader" yet it is clearly more than that. In fact it was given its own name of W32.Ackantta@mm.

24 hours is enough time to do a good amount of damage depending on where this thing is installed.

Monday, December 1, 2008

Let the class action suit begin

It's about time. BNY Mellon is now facing a class action lawsuit for "losing" a box of unencrypted backup tapes containing PII data for millions of people. The breachblog has information on this ridiculous incident. I sincerely hope that BNY Mellon gets nailed on this one. Their actions - negligent, fraudulent, reckless, wrongful, and unlawful - are something I just don't get. I'm sure there are a whole host of "reasons" (re: excuses) for this.

The 62 page class action complaint can be found here.

If you received a notice from BNY Mellon, check your credit reports and contact the law office listed at the breachblog link above.

Wednesday, November 26, 2008

Redemption





Though I missed the true beta period I downloaded and installed the pre-release version of FTK 2.1 last night. FTK 2.0 left us all in a state of shock. Many questions and accusations flew around various industry forums and mailing lists. Prices went up, quality went down, and we were wanting what we paid for. A lot of faith was lost in Accessdata and their ability to provide a solid product moving forward.

Don't throw away your dongles quite yet. 2.1 is the product 2.0 was supposed to be.

Compared to 2.0, the installation of 2.1 was a breeze. The only missing link was that I needed to reboot to get KFF installed.

Some remarkable improvements I noticed are:
Speed - moving between tabs is as it should be. Processing is much, much faster.

Resource usage - Obviously with a 64bit install FTK will use as many resources as can be thrown at it. I like this. I have a good machine, with plenty of resources and before I moved to 64bit, I always watched in horror as my resources just didn't get used.

Here's a shot of FTK just beginning to process an image:



Here it is 10 minutes in to processing:


Usability - Wow, when you click on a tab, you open that tab immediately, even while processing a case.



Does it still require a huge amount of resources? Why yes, yes it does. My test rig has the following specs:
8GB ECC 667 RAM
Dual Xeon 2.66GHz Quad Core processors
System drive is a raid-0 on two 146GB SAS drives
Database drive 3*500GB SATA raid-0

All in all I have to hand it to Accessdata. After all the tongue lashing they took when 2.0 was released, they listened to their customers, licked their wounds, and went back to the drawing board and worked to remedy the problems. I won't say just yet that all of the problems have been fixed. I just installed the product last night, and I'm still processing cases, but this is what I wanted to see - a solid product capable of living up to its marketing, and a product that gives me what I paid for.

Addendum: An 80GB disk took about 6 hours to process and index. I imagine if I had more disk available I could get it taken care of in under 4 hours. Compared to FTK 1.7 which took 20 hours to process an image, I'm happy, very happy with the performance. Currently, I'm processing two more images of 100GB and 150GB in the same case.

Sunday, November 23, 2008

What your antivirus isn't telling you

Ever look at your antivirus logs, or the antivirus logs of a compromised computer, and find something like SillyFDC or Trojan.horse? These happen to be generic definitions provided by Symantec, but other vendors have generic detection signatures too. Generic detection is a common method of dealing with malware. While generic detection is generally fantastic, it's a big double-edged sword.

Let me explain about the two types of malware above.

SillyFDC is a generic signature for removable media malware.

Trojan.horse has the following caption: Symantec antivirus programs use Trojan horse as a generic detection when detecting many individual but varied Trojan horse programs for which specific definitions have not been created.

So, using these signatures, we call things we don't have signatures for but that exhibit trojan-like properties a "trojan horse", and something that uses removable media as a spreading mechanism "SillyFDC". Ok, no problem right?

It is in fact a problem.

Antivirus now being the 40% solution against bots, it's likely to miss a recent variant of malware. Additionally, when your clients or users discover a variant of these types of malware, how are they to know what to do? It's been detected generically. Symantec says that the malware is a low risk. Is it really? Again, how is an organization to know? And what about how long it takes for an infection to be detected?

In a real world scenario, I first discovered a variant of removable media malware some 30 days before a definition was made available by Symantec. This malware not only spread by removable media, but was a keystroke logger as well. Once Symantec generated a definition for it, it was labeled as trojan.horse.

Now, let's look at this from a sysadmin perspective. You run a managed antivirus environment and one day, after your server and clients grab the latest set of definitions, you get an alert for malware called trojan.horse. Great! you say to yourself. My antivirus has done its job. You move on about your day as if nothing happened; after all, your AV product detected and removed the threat. You never bother to look at the file, or the timestamps of the file, and you certainly don't bother to investigate. This is an all too common problem and scenario.

What's my point?

When an antivirus product fires an alert for a generic detection, it always bears investigation. It stands to reason that when something is generically detected, it's often much more serious than it appears. Using Trojan.horse as the example, when no specific definition exists, malware gets classified as trojan.horse so it can be detected and removed. That's fine, but you have no idea what that malware is actually capable of. An immediate threat assessment should take place, even if you simply submit the malware to an automated sandboxing web site.

What should you look at:
  • How long has the malware been on the system?
  • What capabilities does it have?
  • Has data been exfiltrated as a result of it?

Generic detection, while a good thing for the vendor, is a bad thing for the rest of us. It's misleading and provides no information whatsoever. Trojan.horse is a low threat level according to Symantec. I can think of no small number of people who would consider a key logger a huge threat, especially one that was present on a system for 30 days before a definition was available.

*note I'm not picking on Symantec. This is an issue with all antivirus products*

Tuesday, November 4, 2008

Double take

This is short. Very short.

Accessdata offered to purchase Guidance Software's remaining stock.

Read about it here

The offer was rejected, but Guidance should now be aware that there are sharks in the water, and they smell blood.

Thursday, October 30, 2008

Beware the key

USB keys are prevalent. They are used heavily by many incident response teams and first responders. They're less fragile than CDs, faster, and offer greater storage.

They are also weapons of destruction and can become fast victims of compromised systems. It's been estimated that 10% of malware has the ability to infect removable media devices. Recall if you will that old is new. When you respond in an incident, you'll want to take some precautions if you use USB devices.

1) Make sure your devices are wiped and formatted after each case. If your device is infected, your device becomes a weapon.

2) Create a directory named Autorun.inf in the root of your devices. This offers some protection against autorun malware.


To protect your Windows workstations if you haven't already, do the following things.

Copy and paste this in to a .reg file and merge it.

REGEDIT4
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\IniFileMapping\Autorun.inf]
@="@SYS:DoesNotExist"


Follow the instructions here:
http://support.microsoft.com/kb/953252

Monday, October 20, 2008

OODA at play

Imagine the following scenario:

You've identified a system communicating with a botnet C&C over IRC. This system happens to be a system that should never be communicating via IRC. It's a webserver. It's running multiple Vhosts and has multiple IP addresses. The IRC connection is active. You check in with the system administrator and inform him of the situation. You discover that the webserver is merely serving public data. It doesn't process or store sensitive information. It's a good case for root cause analysis, eradication and rebuilding.

The system administrator calls you back and says it looks like SSH binaries have been replaced on the system. The administrator happens to be running cfengine and informs you that a large number of systems have had ssh binaries replaced. What was a run of the mill investigation and analysis just blew up and turned into an incident for which there is no playbook. Friends, this is a triage situation.


Triage is less about solving the problem than it is about prioritizing systems and stopping the bleeding to buy time to properly assess the situation and react appropriately. The problem with triage is business continuity. Triage situations would be much easier if we could identify all of the affected systems, contain them based on priority and threat, and move to more thorough response and analysis. Unfortunately we can't do that. The systems that need to be contained, more often than not, can't be contained because they are critical to operations, meaning they can't be shut down.

Returning to the incident at hand. Over 50 systems have had SSH binaries replaced. At this point we need to triage the situation. Were we dealing with human beings, this would be a mass casualty incident, and a methodology called START would be applied. When dealing with human beings in an MCI, priority goes to the most critical patient who can't survive long without immediate treatment. The job of the people performing triage is to assess only. No care is provided except opening airways and tending to patients that are bleeding severely. A good starting point is here. People get classified into the following categories:

Dead
Immediate
Minor
Delayed

There's a lot that can be taken from this type of real triage in a mass casualty situation and applied to Incident Response when dealing with a lot of systems.


What kind of systems do we typically come across? Let's use the incident I mentioned above. Assume 50 systems. Assume the attacker is actively attacking and compromising systems. There are obvious limitations to physically visiting each system. So what can we do? Assess the situation from the network. In a few easy steps we can triage the situation. With 50 systems it's rare that you would find different attackers and different methodologies being used against you. So, we make an assumptive hypothesis based on the following premise: cfengine detected ssh binary replacement on 50 systems, therefore the attack signature will be similar across systems. In addition, we can assume that very few remote systems will be used in such an attack. So what can be done to triage?

We can quickly divide the systems into the following categories:

4) Systems that can't be blocked at the perimeter
3) Systems that can't be taken offline (network or power)
2) Systems that can be blocked at the perimeter (internally critical systems)
1) Systems that can be taken offline (network or power)


Now you might be asking why is priority 1 a system that can be taken offline rather than the system that can't be taken offline? The idea is simple. If I can take it offline, then I should do so by whatever means are necessary. If I can't take the system offline, the task of response is more advanced. Assign the system administrators or other tech staff the role of identifying and containing the systems that can be quickly contained. The idea being that if you are hemorrhaging from 50 holes, and can close 30 of them then you've cut down the tedious work by 60%. Get them under control and off of the immediate concern list.

If I can block a host at the perimeter, then I should do so, quickly. This is a solution that can work to directly cut off the attacker, however with so many systems, there is no way to guarantee the effectiveness of this type of action. An indirect attack is still very possible. Sometimes though, you just have to make a decision, and adapt.

If I can't take a system offline, and I can't block it at the perimeter then I need to respond quickly and carefully. These are the business continuity cases that hamper triage and response. So what can be done to triage them? Remember we're buying time, not solving the problem 100%.
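The decision logic above can be sketched as a small prioritization function. This is a hypothetical illustration that collapses categories 3 and 4 (neither action available) into the hardest tier; the system names are made up.

```python
def triage_bucket(can_take_offline, can_block_at_perimeter):
    """Map a system's containment constraints to a triage priority
    (1 = contain first)."""
    if can_take_offline:
        return 1  # pull network or power, by whatever means necessary
    if can_block_at_perimeter:
        return 2  # internally critical, but reachable at the perimeter
    return 4      # neither option: careful live response required

# Hypothetical systems: (can_take_offline, can_block_at_perimeter)
systems = {
    "public-web-01": (True, True),
    "internal-db-01": (False, True),
    "core-billing-01": (False, False),
}
for name, flags in sorted(systems.items(), key=lambda kv: triage_bucket(*kv[1])):
    print(triage_bucket(*flags), name)
```

Handing the priority-1 and priority-2 buckets to the sysadmins, as described above, frees the responders to work the hard cases.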

If we work based on our assumptive hypotheses, we can enable a perimeter block to stop the remote sites from being accessed by any of the compromised or soon to be compromised systems.

Have you noticed the OODA loops?

As systems are being contained - via network blocks and physical containment - more compromised systems begin actively attacking. Port scanning begins on internal hosts. Initial triage, while containing 60% of systems, left an opening. Once again, division of forces is key to success. With two IR staff, one can work on active containment while the other works to gather more intelligence.

F-response is a fantastic intelligence gathering tool in this case. Using it, a remote system can be analyzed in real time. Connecting to a system and identifying the files created, modified, or accessed during the attack lends itself to a more rapid action cycle. Combined with traffic captures and live access to disk-based data, we can break into the attacker's OODA loop. We can predict what the tools being used will be used for and what files get replaced. We can predict what the attacker will do at each step and develop a rapid active response to stop him before he begins. With the situation unfolding and new information arriving, further containing the systems that couldn't be taken offline or blocked at the perimeter becomes simple. With a tool like cfengine, a few commands can remove the active threat and we can continue working the problem.

As the situation is contained, a signature is developed and active monitoring is implemented to watch for other systems showing signs of intrusion.

Saturday, October 18, 2008

The clock is ticking

When an incident has been detected or an event has been escalated to incident status, a timer starts. The attacker is now inside your OODA loop. Every minute wasted could mean money lost, identities stolen, or disrupted operations. He or she controls, or has access to, something of yours and can disrupt your ability to determine the correct measures. The speed and accuracy of your response will make all the difference.

Regarding the OODA loop, there's one thing to remember. The attacker has the initiative; we have to play catch-up and outmaneuver. In a non-automated attack scenario, an attacker has presumably done a reasonable amount of homework on the target host or network. In the majority of scenarios the attacker has been entrenched in a system for hours, days, weeks or even months before their presence is detected. They are already two or more steps ahead of us. As incident responders we are at an immediate disadvantage and have many forces working against us - not just the intruder, but often the local IT staff as well, albeit unintentionally for the most part.

So, how do we, as incident responders react? What must we do to be effective?

Our response must be fast, accurate, appropriate. What does our OODA loop look like?


Observe:
Confirm Incident - Are we certain we're dealing with an incident?

Threat assessment - What's the threat doing? Is it actively attacking or scouring systems for data? What's the depth of the penetration? What has local staff done already?

Prior Reference - Have we seen this before? What happened then? What's different?

Victim assessment - Is sensitive data present on the system? Where is sensitive data stored, how is it processed?

Business Continuity assessment - Can the system be shut down? How long can the system be down? If the system goes down, what is impacted?

Defense mechanism assessment - What options do we have for containment? How quickly can we enable them?

Orient:

In this portion of the loop we synthesize and analyze our various assessments. We must weigh them against each other, and they feed one another. This is by far the most thought-intensive portion of the process. We take in large amounts of data and must process them quickly, as time is of the essence.

Decide:

Decisions need to be made. In a recent incident this phase was done on a whiteboard with a co-worker. We identified what we knew about the scope and gravity of the situation, and what options were available to us. We then, on the best information available at the time, made a decision to do a specific set of things. An evaluation takes place during decision making, generally along the lines of "If I do X, what will happen?"


Act:
At this point we act upon those decisions that make the most sense. Not all decisions get acted upon, because not all decisions are appropriate. Action feeds back in to Observation and Orientation.


Now, recall that this is a loop. It's not a step-by-step protocol. It's a thinking, living, breathing course of assessment, testing, action, reaction, and adaptation. We tend to do these things naturally. Assuming the intruder is in the system during a response, they will be working through their own OODA loop and attempting to subvert and disrupt yours.

But wait. What advantages do we have, or rather what advantages does your client have?
The good news is that the battlefield is one of our choosing. We know the landscape and have the opportunity to plan ahead. This is a great place to inject an Incident Response Playbook.

What is an IR playbook? It's a set of protocols that educate responders - from the first responder to the tier 1 responder, and it allows the incident handler to make faster decisions, and provides a control structure for handling the incident. It allows everyone involved a chance to orient themselves to the landscape, thereby speeding up the defender OODA loop. In a playbook, many things can be decided ahead of time, and the answers to questions are already present. For instance we can walk in to an incident already knowing:

1) If sensitive data is on the system, and how it gets processed and stored.
2) What containment options are available.
3) Business continuity can be pre-assessed.
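One way to picture those pre-decided answers is as a simple lookup table built before any incident ever fires. The sketch below is purely illustrative - the host name and fields are invented:

```python
# Hypothetical playbook entry: the answers are gathered before any incident.
PLAYBOOK = {
    "hr-fileserver": {
        "sensitive_data": True,                       # PII present on the system
        "containment": ["perimeter_block", "vlan_isolate"],
        "max_downtime_hours": 4,                      # pre-assessed business continuity
    },
}

def lookup(host):
    """Return the pre-decided answers for a host, or None if it isn't covered."""
    return PLAYBOOK.get(host)

entry = lookup("hr-fileserver")
print(entry["containment"])  # ['perimeter_block', 'vlan_isolate']
```

When a host isn't in the table, you fall back to the full Observe/Orient cycle - which is exactly the "throw out the playbook" case.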

With a playbook we can short circuit the initial OODA loop and improve our response accuracy and speed. Of course we can't always rely upon a playbook. There will be times when the playbook must be thrown out because it doesn't apply to the situation at hand.

All the credit for OODA obviously belongs to John Boyd. A fantastic book is here.

Wednesday, October 15, 2008

Remotely examining a disk image

Recently I've been exploring new modes of operation. One such mode is working on disk images remotely. This has become increasingly important as cases roll in constantly. Disk-to-disk or disk-to-image imaging is great for small operations, and it's mandatory when working a criminal case. A new mode of operation I'm exploring is imaging to a SAN and analyzing from the SAN. This is a great way of operating if you can swallow the cost. One of the best, and most cost effective, tools on the market today is F-response. It's now cross platform. As I've mentioned to Matt Shannon - it's mac-tested and mother approved. I've used it several times in such cases, and it sure beats doing target disk mode, let me tell you. Imaging is a pretty straightforward activity, and imaging to a SAN is no different. Where it gets interesting is analysis. How do you do it?

My current tests center around using multiple platforms.

I have a linux box that I have attached to the SAN. It has access to my disk images that are stored there.

I mount the disk image in a particular directory as follows:

mount -o loop,offset=32256,ro,noatime,noexec,nosuid,nodev,show_sys_files,nls=utf8 image.dd /path/to/case
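The offset value is just the partition's starting sector multiplied by the 512-byte sector size; for the classic first partition at sector 63:

```python
# A classic first partition starts at sector 63; sectors are 512 bytes.
SECTOR_SIZE = 512
start_sector = 63
offset = start_sector * SECTOR_SIZE
print(offset)  # 32256 -- the value passed to mount's offset= option
```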

This is a straightforward method of mounting a disk image for analysis in Linux. Now, how do we gain access to it from Windows, where the armory of analysis tools exists?

I've been using sftpdrive. I enter all the requisite login information and folder location and voila - sftpdrive is really the Windows equivalent of the native Linux sshfs. After logging in I have an SSH secure tunnel to a mounted disk image, listed as a drive letter in Windows. I can run any and all analysis tools I need to. One caveat I've run into has been a timeout issue. The fix has been to "always trust" the SSH key within sftpdrive.

Do you image and analyze disk images from a SAN? What methodology do you use if you use a non-traditional method?

Old is new - Tales from the field

If we believe in certain truths such as "past is prologue" or "life is circular" or "security is cyclical" and "what was, is, and what is, will be again", then what we are seeing in the field is easily explained and indicates the cycle is starting all over again. If I've confused you, then my job is done here.

No really, let me explain. In the past two months I've seen compromises that take me back to the days of yore, when we had malware like Stoned floating around on boot sectors of floppy disks. I can recall one bit of malware a friend shared with me at the time that would play a "drip drip drip" sound as it deleted the contents of the system from underneath you. Why did these types of malware work? Well, think about it. Floppies were prevalent, and they were the primary storage medium for users. In order to move data from computer to computer you put it on a floppy disk and carried it to the next system you needed to use, otherwise known as a sneakernet.

Stepping forward, developers figured out how to largely prevent this type of infection from taking root in a system. Soon we had BIOS detection of boot sector viruses and antivirus detection of removable media, and the threat changed. This type of malware was removed from our forward-thinking minds. Self-spreading worms were the threat of the day soon after, and the world suffered. We then moved on to the user being our weakest link. Browser-based exploitation of systems is popular, phishing works, and so on.

Apply brakes here.


In the past 30 days I've seen compromises based on MBR rootkits and removable media. I will not spend a lot of time detailing the intricacies of the malware, because others have done so really well. This was detected as...well, it wasn't detected by antivirus. It couldn't be. Because of the way the malware loads and runs, antivirus could not detect it on a live running system. Oddly enough, antispyware software identified the malware through the presence of registry keys. RegRipper was able to assist in this identification. Harlan, has the world said "thank you" yet? Full detection could only occur by mounting the disk image I had. In testing after the fact I was able to mount the disk remotely and detect the trojan using F-response. This trojan, while using OLD concepts, uses new techniques. Discovery of files in C:\Windows\Temp is what really got the blood boiling. One contained a logfile of keystrokes entered in Internet Explorer. Another was XOR'ed. Thanks to Didier Stevens' fantastic tool XORsearch I was able to determine the key used (a single-byte XOR with 0x11, found by searching for the keyword "bank") and the file was un-XOR'ed. A list of over 900 banks was uncovered. The malware's intent was revealed and the case moved on.
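XORsearch works by brute-forcing simple encodings while scanning for a known plaintext (here, the word "bank"); once the key is recovered, decoding is trivial because XOR is its own inverse. A minimal sketch of single-byte XOR - the 0x11 key and sample text below are illustrative, not the actual file contents:

```python
def xor_decode(data: bytes, key: int) -> bytes:
    """Apply a single-byte XOR; running it twice restores the original."""
    return bytes(b ^ key for b in data)

# Simulate the malware's obfuscation, then undo it with the recovered key.
obfuscated = xor_decode(b"bank list follows", 0x11)
assert b"bank" not in obfuscated
print(xor_decode(obfuscated, 0x11))  # b'bank list follows'
```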


Next....


Let's now discuss removable media malware. This is floppy malware all over again. Except this is 2008 and self-spreading worms are dead, right? Well, not quite. They've simply ducked under the external threat radar and are using the internal threat agent - the user. The class of malware is Worm.Autorun, and the name says it all. It functions by creating autorun.inf files on removable media and fixed drives, and relies upon the user and Windows autorun functionality. Once the malware makes its way onto a host, the real fun begins.

The malware hooks into the registry, replacing the Userinit value under HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon so that it points to a new userinit.exe created on the system by the malware. Why is this important? userinit.exe is responsible for executing scripts and other programs that need to run before explorer.exe does, such as establishing network connections - under normal operation, that is. This replacement userinit.exe is a keystroke logger with a degree of shell integration, since it's loaded before, and injected into, explorer.exe. It grabs window titles, the contents of balloon windows, and all else that exists in explorer.
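A quick triage check, then, is to compare the Userinit value (pulled with RegRipper, or reg query on a live box) against the stock one. A hedged sketch - the XP-era default shown here (note the trailing comma) is an assumption that should be verified against a known-good system:

```python
# XP-era default Userinit value; verify against a known-good system first.
EXPECTED = r"c:\windows\system32\userinit.exe,"

def userinit_suspicious(value: str) -> bool:
    """Flag any Userinit value that isn't the stock one."""
    return value.strip().lower() != EXPECTED

print(userinit_suspicious(r"C:\WINDOWS\system32\userinit.exe,"))                     # False
print(userinit_suspicious(r"C:\WINDOWS\system32\userinit.exe,C:\Temp\userinit.exe")) # True
```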

Attempt to remove or rename the autorun.inf from the infected system, and 2-10 seconds later a new one is created. Upon attempted deletion you receive a dialog box asking if you're sure you want to delete it. Guess what generates that dialog?


Lots of things happening, lots of cyclical security issues are reaching their return time, so it may be time to revisit old and forgotten response models, brush the dust off and update them.

Tuesday, September 16, 2008

Drive Erazer



Historically I've always done disk wiping through DBAN. It's free, and easy to use. I have had a system capable of attaching plenty of drives and drive types for just this purpose.

Recently though, our tech shop purchased a bunch of Wiebetech Drive Erazers because the thought was it's easier and sometimes faster to just plug a drive in and flip a switch. Well, these devices fit that niche perfectly. They ended up tossing one at me and asked me to verify their functionality.

They ordered the "Pro" model which can do ATA-6 secure wiping as well as writing a simple zero pattern to the disk. So I first wanted to test the basic wiping functionality.

I hooked up a 13GB IDE drive and flipped the switch.

About 10 minutes and an 'xxd' check later, I had a disk full of zeroes, verified. Not bad.

Next up an 80GB IDE drive. No problems here either. Approximately 40 minutes later I had a zeroed disk.

As I write this I'm about 50 minutes into a 'secure wipe' of a 250GB SATA disk. This device also detects and removes HPA and DCO on the drive, which I really like. I expect it to take a reasonable amount of time, around two hours or so. Wiebetech states approximately 35MB/s wipe speed, which is respectable for the price tag and functionality. So far this little device is as good as advertised.
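The two-hour guess follows directly from the quoted speed:

```python
# Estimate wipe time from the vendor's quoted throughput.
size_mb = 250 * 1000      # 250GB drive, in decimal megabytes
rate_mb_s = 35            # advertised wipe speed
minutes = size_mb / rate_mb_s / 60
print(round(minutes))  # 119 -- just shy of two hours
```

For what it's worth, the measured one hour twenty-two minutes in the addendum works out to roughly 50MB/s, comfortably better than advertised.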

Procedurally I like devices like this, because a tech can easily attach a drive, flip the switch and go do something else while the device is working. As much as I like DBAN for my own use, I think these little drive erazers are very handy to have around, and they're extremely portable (fit in my palm) which only adds to the usability.

Some dislikes:
Counting blinks to determine error status or time to completion. It's a little like Morse code, but for the price tag it's not a showstopper.

The jumper location in the device is all but unreachable unless you have fingers the size of paper clips. A precision set of needle nose pliers takes care of this though.

The IDE ribbon should be a little bit longer. At current length, you need to bend it too far to keep the hard drive flat.

Conclusion:
I like the drive erazer. For the $149 price tag on the Pro model I'd recommend it to people who don't want a computer dedicated to wiping disks, but a few more tests need to be completed before I "approve" it. I have a few small quibbles about design but none are major.

If you have thoughts about these devices or can recommend similar and similarly priced devices I'm all ears.


Addendum: The 250GB disk finished wiping in one hour and twenty-two minutes.

Tales from the field

Marc Weber Tobias used the phrase "The key does not unlock the lock, it actuates the mechanism which unlocks the lock" in his Medeco presentation at Techno Security. I'm going to stretch this out to the subversion of any system, which is what I think he was actually saying. In other words, don't think in a linear fashion when attempting to solve a problem, or "think outside the box".

Basically, what you think will compromise a system isn't what actually compromises the system. When attackers attempt to break into systems, they have a number of options. They can use brute force, the browser, email, an exploit, etc. Many of these methods are direct, but attackers understand that the front door is not the only way into a system. There are more ways in than one. Once in a great while you come across a particularly interesting method of subversion or attack.

Yesterday was one of those days.

Something landed on my desk that I'd not actually seen in the wild. I'll start at the beginning so this all makes some sense.

At about 10:40am or so I was alerted to an anomaly. Time for a phone call..

ME: "Hi there, there's a possible compromise in your area of operation. Could you check out IP address x.x.x.x? It's connecting to the following IP address y.y.y.y"

SA: "Sure I'll track it down and we'll take a look at it."

I go off and worry about other bits and bytes and check back in a little bit. No change but then the phone rings.

SA: "We've got the system offline, and we're going to rebuild it. The user just returned from Botswana and it was a flash drive causing the infection. They were logging in as administrator. Looks like they won't be doing that any more."

ME: "Great! Go ahead and rebuild the system and I'll close out the case."

About an hour passes and I'm off again looking at more bits and bytes and another alert fires. Hmmm...same destination host, same destination port, different source IP. I call up the same SA...

ME: "Hi, looks like you've got more problems. A new IP is connecting to y.y.y.y"

SA: "Really? What's the IP?"

ME: "z.z.z.z"

SA: "Oh crap. I told him not to plug that system in. The tech took the system back to his bench to rebuild the system, he must have plugged it in."

ME: "Ah, ok. Give me a call once you've confirmed that's the case."

SA: "Will do."

The SA calls back and confirms the situation. The system then disappears.

About 45 minutes later another alert fires. Same SA, different IP address. By this point I'm getting frustrated...Phone call time.

ME: "Looks like yet another IP, showing the same signs of compromise."

SA: "What's the IP?"

ME: "q.q.q.q"

SA: "Ugh that's the tech's system. Hang on."

SA talking in the background...

SA: "He plugged the flash drive in to his virtual machine, so that's what you were seeing."

Now I'm curious.

ME: "Hey can you send me that USB key?"

SA: "Sure I can even drop it off on my way home."

ME: "Great."

To summarize: a user turned on his computer, plugged in a flash drive, and his computer connected out to a remote host suspiciously. The system was contained and taken back to a tech shop for rebuilding. The tech plugged the system in and connected it to the network. Another alert. The tech then took the flash drive and plugged it into his own system, where he had a virtual machine set up. Once the flash drive was connected, his system started connecting to the remote host.

Here's a look at the traffic:




The flash drive showed up as promised by the end of business. Always eager to explore, I plugged the device into my booted Helix instance. After copying it with dcfldd, I did what any reasonable person would do...I mounted it!

root(sda1)]# mmls usb.dd

OS Partition Table
Offset Sector: 0
Units are in 512-byte sectors

Slot Start End Length Description
00: ----- 0000000000 0000000000 0000000001 Primary Table (#0)
01: ----- 0000000001 0000000031 0000000031 Unallocated
02: 00:00 0000000032 0000499711 0000499680 DOS FAT16 (0x06)
03: ----- 0000499712 0000503807 0000004096 Unallocated

root(sda1)]# mkdir /tmp/case
root(sda1)]# mount -t vfat -o loop,offset=16384,ro,noexec,noatime,nosuid,nodev usb.dd /tmp/case


After running fls and mactime against the image I find the following which just looks odd:

Mon Aug 25 2008 00:00:00 4096 .a. d/d--x--x--x 0 0 508 /RECYCLER
4096 .a. d/d--x--x--x 0 0 39049 /RECYCLER/S-1-5-21-1482476501-1644491937-682003330-1013
Mon Aug 25 2008 11:44:52 4096 ..c d/d--x--x--x 0 0 508 /RECYCLER
62 ..c -/-r-xr-xr-x 0 0 39174 /RECYCLER/S-1-5-21-1482476501-1644491937-682003330-1013/Desktop.ini
4096 ..c d/d--x--x--x 0 0 39049 /RECYCLER/S-1-5-21-1482476501-1644491937-682003330-1013
Mon Aug 25 2008 11:44:54 4096 m.. d/d--x--x--x 0 0 39049 /RECYCLER/S-1-5-21-1482476501-1644491937-682003330-1013
4096 m.. d/d--x--x--x 0 0 508 /RECYCLER
278 ..c -/---x--x--x 0 0 509 /autorun.inf
79360 ..c -/---x--x--x 0 0 39175 /RECYCLER/S-1-5-21-1482476501-1644491937-682003330-1013/autorun.exe


Wait a sec...populating a directory called RECYCLER, with an inf and an executable file? Time to check out those files!

root(case)]# cat autorun.inf
[autorun]
open=RECYCLER\S-1-5-21-1482476501-1644491937-682003330-1013\autorun.exe
icon=%SystemRoot%\system32\SHELL32.dll,4
action=Open folder to view files
shell\open=Open
shell\open\command=RECYCLER\S-1-5-21-1482476501-1644491937-682003330-1013\autorun.exe
shell\open\default=1l

Ok, so we've got the usb key using the autorun function. What's this you say!? USB can't autorun, unless it's a U3 drive. Correct you are..sort of. Look again at the autorun.inf contents.

The open line is obvious: it will execute the file listed.
icon=%SystemRoot%\system32\SHELL32.dll,4 - now that's interesting. Just what is the fourth icon in SHELL32.dll? It's a closed folder. Most people wouldn't even notice, when they plug in a device and Windows asks what to do with it, that the top icon listed in the window is a closed folder.

If you don't know what I'm talking about here's a visual:



Now imagine the highlighted icon is a folder that has a caption of "open folder to view files".
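An autorun.inf is plain INI text, so the pieces the malware leans on can be pulled out programmatically. A sketch re-parsing (a subset of) the contents shown above with Python's configparser - interpolation has to be disabled or the %SystemRoot% value trips it up:

```python
import configparser

# A subset of the autorun.inf contents shown above.
AUTORUN = r"""
[autorun]
open=RECYCLER\S-1-5-21-1482476501-1644491937-682003330-1013\autorun.exe
icon=%SystemRoot%\system32\SHELL32.dll,4
action=Open folder to view files
"""

# interpolation=None keeps configparser from choking on %SystemRoot%
cfg = configparser.ConfigParser(interpolation=None)
cfg.read_string(AUTORUN)
print(cfg["autorun"]["open"])    # the binary Windows will run
print(cfg["autorun"]["action"])  # the fake "open folder" caption
```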


What about autorun.exe? First things first, I run strings to check it out.
root(case)]# strings autorun.exe
iams.wearabz.net
svchost.exe
explorer.exe

f424
asd..65637fj
"" "
Mozilla
Success.
Failed.
Tester
Start flooding.
Flooding done.
fstop
SeDebugPrivilege
sendto
WSASocketA
htonl
setsockopt
CloseHandle
GetCurrentProcess
lstrlenA
ExitProcess
lstrcmpiA
GetProcAddress
LoadLibraryA
KERNEL32.dll
AdjustTokenPrivileges
LookupPrivilegeValueA
OpenProcessToken
ADVAPI32.dll

Ok, there are some interesting strings listed. Imported functions aside, there are hints of a possible hostname to look at and a flooding capability, and svchost.exe and explorer.exe are both mentioned.
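Pulling the likely network indicator out of strings output like this can be scripted. A rough sketch - the filter is heuristic and will both miss and over-match on real samples:

```python
import re

# Heuristic sketch: fish probable hostnames out of strings output.
strings_output = ["iams.wearabz.net", "svchost.exe", "Start flooding.", "KERNEL32.dll"]

HOST_RE = re.compile(r"^[a-z0-9-]+(\.[a-z0-9-]+)+$", re.I)
SKIP_SUFFIXES = (".exe", ".dll")  # PE imports and filenames, not network indicators

hosts = [s for s in strings_output
         if HOST_RE.match(s) and not s.lower().endswith(SKIP_SUFFIXES)]
print(hosts)  # ['iams.wearabz.net']
```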

I quickly submitted the binary to CWSandbox (results are here) and VirusTotal, and if you feel so inclined, I've uploaded it to Offensive Computing (md5: 3240c08878c7491b85b79c97db5c9204). The best description of this malware that I've seen is here.


Some lessons learned here:
1) Harlan is right. "Many times, an organization's initial response exposes them to greater risk than the incident itself, simply by making it impossible to answer the necessary questions." In this case we knew the system had nothing of value on it other than personal files. The tech in this case, after containment, turned the system back on in the shop, plugged it into the network, and then took the USB key and plugged it into another system that wasn't properly configured. Any malware investigation virtual machine should be set to "host only" networking.

2) What you fear the greatest is the least likely to kill you. We concern ourselves with the big targets most times, but it's rarely those systems that get compromised, at least directly. Going after soft targets is the best way to get at the hard targets.

3) You can deploy as many detection systems and security products as you want, and do all the staff training you can, but all it takes is one person not paying attention to compromise the system/network.

Thursday, August 28, 2008

City buses are like desktops

Just yesterday I was sitting on the bus when we pulled up to a stop where at least 20 people boarded. I happened to be sitting near the rear exit door when I saw a youngish kid jump through the open door and land in the seat in front of me. I looked around to see if anyone else noticed...apparently they didn't, or didn't care to say anything. Then I looked up at the front of the bus, where all of the passengers were waiting to go through the "security check": you either scan a card, wave an RFID card, use your traditional boarding pass, or pay with cash. The bus driver, though he has a few mirrors to look into, was fully preoccupied with the boarding passengers and had no way of detecting the fare thief.

This, friends, is a compromise of the system, where the system is the bus. All of the good little applications on the system checked in with the bus driver and were deemed acceptable. That one sneaky application jumped in through the exposed hole in the system, and looks just like all of the other applications on the system. I, playing the role of antivirus or other security technologies, inspected it and let it happen. The bus driver, being the administrator, was too busy with other duties to notice.

There's a saying that it's rarely the things we fear most that kill us. That seems to be the case when it comes to security. That which we armor our systems against is rarely what leads to compromise. In this case, The bus has multiple methods of authentication, guarded and monitored by a human. However, the backdoor of the bus gets opened every time the front door does, so anyone can go out the back door, or come in. My point? Every time you open the door to your system to put new things in it, you increase the potential for compromise by opening a back door.

We build great firewalls, and have operating system security products, and plenty of gee whiz tools, but there's also another saying...
"You as a system administrator can screw up only once", that's all it takes to lead to compromise. All it takes is one misconfiguration, one step in the wrong direction, one rear door of the bus staying open 3 seconds longer than it should, and bam, you've got a compromise on your hands.

Recall the Routine Activity Theory if you will. I firmly believe that RAT explains why security incidents occur:
1) A target of opportunity
2) Lack of proper guardianship
3) A motivated offender

Monday, August 25, 2008

Knowing is half the battle

Any G.I. Joe fans out there? This was the catchphrase used at the end of the cartoon, during the public service announcement where some badly behaving kid got taught a lesson. In a recent investigation my network logs showed a MySQL intrusion using a root account and the ubiquitous User Defined Function attack. When I arrived on site to take a look at the system, the manager asked me what happened. I said "based on network logs it looks like a database server got compromised."

If there ever was a deer in the headlights look this was it.
"Database server? What database server, I didn't know there was one".

We grabbed the user of the system...same look..and same response.
"Database server? What database server, I didn't know there was one".

Many software packages come bundled with supporting software that isn't readily apparent during installation. It may say something like "installing database". Microsoft is actually good about this and says "this product requires a database, do you want MSDE or a full SQL installation?" After doing software installs all day long, the tech working on the computer is probably surfing the web or getting coffee while the install is happening, and comes back when it's done, and that's that. Sure, I understand that; who the heck watches software installs? No one I know, unless they're in a hurry or compiling something on a Linux box.

However, the software needs to be vetted before install, because you need to know what to expect. You need to know that you just opened a hole in the security of your organization and someone now needs to deal with it appropriately. When risk is introduced in an organization it needs to be known and addressed, and remember: knowing is half the battle.

Sunday, August 24, 2008

When users attack

This is one of those things that make my head hurt. Just last week an IDS alert fired and ended up in my inbox. This was one of those alerts that require validation so after consulting network logs, a conclusion was reached that this was an incident and not just an event. Phone calls were made, emails were sent, and the local system admin called to let me know they were on their way to the machine. Great news. I instructed the sysadmin not to touch the system, not to let the user touch the system and just unplug the network cable. "Sure thing" said the sysadmin. I was already offsite on another engagement so it took a short while for me to get to the site. Upon arrival I checked in with the receptionist and spoke to the manager. The sysadmin was no longer around. I was directed to the computer user and together, we headed to the office where the computer was located. As instructed the user was not operating on the computer that had been compromised. This seemed promising...

We get to the office and the user says...

"I just got done running an antivirus scan, and it didn't find anything".

I'm literally at a loss for words at this point. "Err, uhm, what?!" I think to myself.

Friendly user offers up lots of other information about their actions and the nature of the system, including that they had no idea the vulnerable piece of software that got exploited was even installed on the system. This is bad(TM). It could be worse, I suppose, but to think that the sysadmin echoed back my request - agreed to pull the network cable, remove the user from the system, and not touch the system themselves - and then the user scanned the system...yikes. But wait! It gets better. The user has the risk history window open in Symantec Antivirus. Well well, looky what we have here: a scan that precedes the one just run by the user...and it's an administrative scan that identified lots of badness. 5 pieces of badness, to be exact. I suppose it's a good thing that antivirus found the malware, but did it find it all? How can we be sure?


When next I speak to my sysadmin friend I think we'll need to talk. Ever feel like Chris Tucker and Jackie Chan in Rush Hour? "Do you understand the words that are coming out of my mouth"? The ever elusive Jun Tao snuck in, did damage and disappeared..all before I could get there. Common isn't it?


My sysadmin stick is being sharpened....

V is for validation

Whether it's a complaint of weird system behavior, an alert from a detection system, a phone call, or some other mechanism, a very important step must occur: validation. Validation is absolutely important, if only so we don't waste effort and charge clients unnecessarily. Not long ago I received an email alert from an organization overseas that alerted our group to a system that may have been compromised. The alert went on to say that the system was likely compromised and a rootkit was probably installed. Like any well intentioned IR team we took the alert seriously and started making some phone calls. A time was arranged to preview the system in question. Two of us visited the datacenter housing the system and, wouldn't you know it, the system that had been identified was not a single system.

It was the head node in a high performance cluster with 64 nodes. With models and simulations being actively run on the system, we naturally couldn't just power it down. So, we validate before escalating to investigation. The head node and subsequent systems were running Linux, and we just happened to have our handy CD containing trusted statically compiled binaries. Some of you might be saying, "Now just wait right there, you can't touch the system, you'll impact forensic integrity." Remember, please, that this is validation; we aren't in an investigation yet, so our goal is to minimize the impact we have, because we cannot avoid having an impact. If we get to a full blown investigation, we put on our "forensic purity" hats. Ok, so back to the validation...

The alert we received was nice enough to include a set of characteristics. *cough* Tool marks, anyone? *cough* The tool marks listed were of the individual nature, and even then, they varied. First things first: we need to capture memory. I liken memory captures to photographs of a crime scene, so we take our pictures before disturbing the system. We grab a copy of /proc/kcore, the kernel, and the symbol table, and shoot them to the response laptop over a netcat connection. Then we attempted to locate the locations and files that were listed in the alert. Nothing, nothing and more nothing. Great news! But wait, just what happened here, we asked ourselves. After all, we're responders and forensic analysts; we want to be able to understand and explain.
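The netcat transfer amounts to streaming a file down a TCP socket to a listener on the response laptop. A minimal Python stand-in for the same idea - host, port, and path below are placeholders:

```python
import socket

def send_file(path, host, port, chunk=65536):
    """Stream a file to a listener (e.g. on the response laptop), netcat-style."""
    with open(path, "rb") as f, socket.create_connection((host, port)) as s:
        while True:
            data = f.read(chunk)
            if not data:
                break
            s.sendall(data)

# On the response laptop: nc -l -p 9999 > kcore
# On the suspect host:   send_file("/proc/kcore", "response-laptop", 9999)
```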

Tracing back through the alert, a username was identified, and that's why we received the alert. The alerters thought: "Aha, we have a user name and an IP address that the user logged in to, hence the computer he logged in to is likely compromised." While we appreciated the alert, it was awfully presumptive. Yes, the user in question had logged in to the system we received the alert about, from a computer in the foreign country where the alert originated - hence the user name and IP address. The system he logged in from was identified as compromised by those who sent us the alert, so they warned us that one of our systems might have been compromised as well. This makes good sense, but it still obviously required us to validate the compromise. Validation cost us a little effort, but we certainly saved a bundle of time by not jumping to conclusions and going right into investigation mode. The owners of the cluster would not have been happy if we had.

Sunday, August 17, 2008

Situation Normal....

When analyzing a disk image or live system we're often confronted with the need to scan the system for malware. We need to know what was on the system, if anything, and what capabilities it has. Many people scan with well-known vendor utilities like Symantec Antivirus or McAfee. Others scan with less popular tools, but all have the same end in mind: find malware on the system by signature. I think it's past time we as examiners were honest with ourselves. Antivirus is not sufficient when attempting to detect the presence of malware on a system. Sure, it functions and will catch what it's aware of, but malware changes too rapidly for antivirus to be effective. You can scan a system or disk image all you want, but if the signature does not exist, you have no hope. Case in point: Asprox botnet related files. Yes, I'm still watching it. Today I grabbed four of the newest binaries available.

The results from virustotal?
1,2,3,4

This is of course brand-new malware on the block, but this is obviously a frequent occurrence.

Guess what? You don't stand a chance. If you're scanning a system for malware in the next few days because you're processing an image for a case, or responding to an incident..FAIL. You cannot, based on an antivirus scan, even pretend to claim that the system is malware free. Your certainty level suffers greatly, and that, friends, is what we call doubt.

Now, just because a binary evades signature detection doesn't mean you can't detect it. We just need to adapt our methodology when we search. As examiners we must accept the fact that antivirus is a failing technology: it consistently falls short, and it is no longer reliable to base your conclusions on the results of a scan alone.

As such, it's time to look at alternative methods for determining the presence of malware. Malware detection in forensics needs to move to a more behavior-based approach. Booting a disk image in VMware and looking at system behavior is a must. Capturing memory and analyzing it is a must. Running a sniffer while the VM is booted is a must. Using multiple antivirus products is no longer optional. I'd suggest that at least three products be used to scan all disk images and systems during response and/or forensics. What am I using? Symantec, Kaspersky, BitDefender. With the samples I listed above, of course, these wouldn't work, but the point is simply this: just as more sources of evidence lead to a more solid case, more sources consulted during a malware analysis lead to a higher degree of reliability in the results. Is it perfect? No, not at all. Is it more reliable? Yes, it's more reliable if you:

1) Look at the filesystem for things that don't fit: new files in system32, new drivers, new services, new batch files, VBS scripts, etc.
2) Scan with multiple AV products.
3) Boot the disk image in VMware, watch the behavior of the system, capture memory, run a network sniffer.
4) Analyze the memory, the behavior, and the sniffer output (put it through Snort and reconstruct streams).

This is far better and more reliable than simply stating "I scanned the disk image with Antivirus Product X, and could not identify any malware on the system. The system is clean."
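Item 1 in the list above can be partly scripted. Here's a minimal sketch (the function name and the 30-day window are my own assumptions, not a tool from the post) that lists recently modified files under a directory, such as a mounted image's system32:

```python
import os
import time

def recent_files(root, days=30):
    """Return paths under `root` modified within the last `days` days --
    a crude first pass at spotting files that don't fit."""
    cutoff = time.time() - days * 86400
    hits = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) >= cutoff:
                    hits.append(path)
            except OSError:
                continue  # file vanished or unreadable; skip it
    return sorted(hits)
```

On a mounted image you'd point this at system32 (and driver/service directories) and compare the output against a known-good install; timestamps can be manipulated, so treat the list as a lead generator, not proof.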

Saturday, August 16, 2008

Windows Forensic Environment

Not much coverage on this yet...and I don't really know why.

The Windows Forensic Environment is based on the Windows OPK or AIK, depending on your affiliation. I'm not an OEM, so I got to use the AIK. I can't share many details on building this environment right now as I don't have my documentation on hand; however, consider the possibilities. We may have something on our hands that can give Windows users a fair chance at reasonable forensics using a bootable CD. Sure, we've had Helix for quite a while now and it's been great, but if you've ever trained people in using Linux when they are completely unfamiliar with it, the odds that you'll get blank stares are high. DOS prompts are more familiar to many people, as are programs like EnCase (which works really well in the environment). X-Ways Forensics works as well, as does F-Response - which provides an interesting opportunity for using this as a known clean environment in a VM and in a live capture scenario. Unfortunately, FTK does not function as a result of the CodeMeter USB key. At least Imager Lite works though. It's been noted that the environment has a strong affinity for modifying the disks in the system, so if you're using this, do some heavy testing. I'll have more information on this later.

Bots, no longer child's play

A few years ago botnets were pretty much child's play. The bot herders would run an IRC server and sloppily infect computers, and detection was pretty simple. You'd find a rogue FTP server and some form of bot capable of DoS'ing, maybe some good movies and weird music, but that's about it.

Over the past few weeks I've been following Asprox and some other botnets. I'll start with Asprox. Sure, it's been documented by some of the biggest names around: Joe Stewart (who does amazing work, if you haven't checked), SANS, and Dancho Danchev (who does amazing work as well). Asprox right now is launching massive SQL injection attacks, and is succeeding in large numbers. It's a simple XSS attack, but wow is it effective. So, once your favorite website has been compromised (yahoo anyone?) and your users visit the site, what happens? If you visit the page with IE, you get sent down one path, and if you visit with Firefox you go down a different path. What I found interesting is that the code used in the attack exploits MS08-041 in addition to simple XMLHTTP GETs and malicious Flash (chosen by browser detection, then Flash version detection), gets the browser to trust the binary, which then modifies the anti-phishing bar for IE - and the botnet comes complete with statistics tracking and updating.

The victim computer becomes pwned in every sense of the word. You end up with a keylogger, a game password stealer, a general information stealer, and a connection to the botnet C&C, which is proxied, with a bit of fast flux thrown in just for fun. In the two weeks I've spent on this, I've seen the malware change 5 times - that's new malware, not just revisions - and the SQL injection attacks are now coming in with variable padding, attempting to bypass any filtering of the attacks. This botnet has been used for spamming, phishing, and now SQL injection attacks to grow the pharm, as it were. And Asprox is small compared to other, more nefarious botnets.


Yet another botnet I'm looking into (not sure if it has a name) is being used for spam. I had a drive brought to me recently and took a look at it after looking at network traffic. Well, it uses methods similar to things like Coreflood in that C&C communication is done over HTTP connections and consists of simple POST and GET requests, though it currently connects over port 18923. Yeah..that's HTTP over port 18923. This particular botnet comes with a rootkit that is not detected by modern signatures in software like Symantec (big surprise there, I know), although antivirus evasion is apparently pretty darn easy.
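That "HTTP over port 18923" behavior is itself a usable signature. Here's a minimal sketch (the function and port list are my own illustration, not from any product) of flagging HTTP-shaped payloads headed to non-standard ports:

```python
# Ports where HTTP request lines are expected; anything else is suspect.
STANDARD_HTTP_PORTS = {80, 8080, 8000}
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ")

def looks_like_offport_http(dst_port, payload):
    """Return True if a TCP payload starts like an HTTP request but is
    destined for a port we don't normally serve HTTP on."""
    if dst_port in STANDARD_HTTP_PORTS:
        return False
    return payload.startswith(HTTP_METHODS)
```

Feed it reassembled payloads from your sniffer of choice; simple POST/GET check-ins like the C&C beacons described above light up immediately, and the port list is easily tuned per site.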

It's been known for quite some time in small circles that botnets are big business, but many people out there still don't get it. They see a system spamming and nuke it from orbit without doing even a simple Root Cause Analysis. An RCA in these cases provides a wealth of information. It can be said that everything has a signature, and malware leaves tool marks - from installation, to activity, and so on. An RCA allows us to create that signature to improve detection and our knowledge of the methods and mechanisms used by these botnets. Next time someone in tech support or someone at your client's site wants to just nuke a system from orbit, ask them if you can image the system. This is no longer just child's play.

Thursday, July 24, 2008

Anomaly detection

One of the major issues we face when dealing with compromises is detection. Detection of a compromise typically happens well after the incident. Often it takes weeks or months. Many organizations rely on IDS, log analysis, netflows or some other form of traditional monitoring. These are all well established and do a pretty good job. One other technology that exists is anomaly detection or Network Behavior Anomaly Detection (NBAD).


Anomaly: something that deviates from what is standard, normal, or expected.

What's interesting about NBAD? Consider this. If you believe that humans are creatures of habit (which we are), and that computers are deterministic (this can be debated), then with rudimentary logic we can claim that computers are extensions of those creatures of habit. A computer will then follow patterns of predictability: when a user comes in in the morning, they fire up their email program, their web browser, and perhaps office applications and various other programs, and the computer will respond accordingly. Ok great, we've got something to baseline behavior against. Another way to look at this is behavioral profiling of users and systems. But wait! We don't just baseline a single user and a single system. We baseline networks, which are made of computers and devices operated by people, who are creatures of habit. So now we have a network that becomes a creature of habit. Or does it? The answer is both yes and no: a network is a nebulous creature of habit. A change in behavior is an anomaly if that behavior is outside the expected, normal, or standard. Let's take a look at this.

Your webserver will exhibit dramatic changes in behavior when your site launches a new product or ad campaign, or you get slashdotted. These are behavioral anomalies, but they are not bad. These events require tuning of the NBAD so you stop getting alerts. So, just what would we be concerned with? Your webserver should be serving web pages, and that's just about it. If your web server starts initiating outbound connections, starts serving FTP or SSH traffic, or starts communicating via IRC when it hasn't previously done so, that's a sign of badness. What if the system has previously served FTP and SSH traffic; what would be the anomaly then? How about who it's communicating with, and how much? If your system never communicates with hosts in, say, China, and it all of a sudden starts to, that's a behavior to investigate. The same can be expanded to the network as a whole. For instance, if there is no IRC traffic for weeks and all of a sudden it appears, that's a behavior to look at. There's a lot more to discuss here and I won't try to cram it all into one post, but let's look at the downsides of NBAD.

1) Baselines are tough. Creating a baseline for users and individual systems is pretty easy. Creating a baseline for a network is tough. Why? Baselines assume the network you are looking at is 100% clean. If you create a baseline with some form of badness on the network, your baseline is invalidated before you create it.

2) Behavior changes. If human emotion conflicts with rational thinking, you often get behavioral changes in people, which may lead to changes in the behavior of the systems they use. If you get a new project, your network behavior changes. If someone gets fired, the behavior of the system and network will change. This leads to constant readjustment of the baseline.

There's much more to discuss about this, including policy-based anomaly detection, thresholding, and how to put all of these things together to effectively use NBAD to assist in incident detection and identification.
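To make the thresholding idea concrete, here's a toy sketch (my own illustration, not any vendor's algorithm): baseline a metric such as a host's hourly outbound connection count, then flag observations several standard deviations out.

```python
import statistics

def is_anomalous(baseline, observation, z_threshold=3.0):
    """Flag `observation` if it deviates from the baseline sample by more
    than `z_threshold` standard deviations (a crude NBAD-style check)."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    if stdev == 0:
        return observation != mean
    return abs(observation - mean) / stdev > z_threshold

# Hypothetical hourly outbound connection counts for a web server:
history = [102, 98, 110, 95, 105, 101, 99]
```

Everything here rides on the quality of the baseline, which is exactly the weakness in point 1 above: poison the history and the outliers stop looking like outliers.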

NBAD, in my opinion, is at this point like a kid who takes a metal detector to the beach: they always expect coins and gold, but usually end up with bottle caps. Time will tell how this type of detection mechanism plays out. If you have thoughts on this, feel free to share.

Tuesday, June 24, 2008

Apparently Gov. Patterson agrees

It would appear that NY governor David Patterson and the NYS senate agree with my sentiments that we can do better to protect the individuals affected by Identity Theft. This article discusses the amendments to the New York State Data Breach Notification Law. The amendment can be found here. Among the changes...

Only the last 4 digits of an SSN can be stored/used by employers when access is 'open'. I'm not quite sure what they mean by this.
Skimmer devices have been outlawed.
Affected individuals can now contact the Consumer Protection Board, which will help people undo the damage caused by ID theft.
Affected individuals are now entitled to restitution for time spent clearing up any damages caused by the theft.

This is precisely what we need. I'd like to see even more of this, and definitely more accountability.

Friday, June 20, 2008

Finding PII data

In case you didn't know, I live in New York. New York has a fantastic law on par with California's SB1386. In case you're not sure whether your state has a similar law, check out this article. Odds are your state has one of these great laws enacted.


Why is this important? In every security breach, the following question MUST be asked: "Was there PII data on the system?" If you're not asking that question or addressing it in a timely fashion, you're not doing your job when dealing with security breaches. If you don't believe me, ask the company identified in this article. Waiting six weeks to notify, even if data was not accessed, is considered NEGLIGENCE, and it'll cost you $60,000 in New York. That's just the negligence...the cost of the investigation usually starts around $500,000 for a small incident involving this type of data.

This law generally applies to every business, including educational institutions that suffer a security breach where Personally Identifiable Information is at risk of unauthorized disclosure. So what is PII data?
In New York it's:
(1) social security number;
(2) driver's license number or non-driver identification card number; or
(3) account number, credit or debit card number, in combination with any required security code, access code, or password that would permit access to an individual's financial account;

"Private information" does not include publicly available information which is lawfully made available to the general public from federal, state, or local government records.

What I find absolutely amazing is that the commercial world doesn't seem to give a damn about this information, losing it on a regular basis, yet the proverbial security industry punching bag (read: higher education) has taken the lead in this arena.

Cornell University has a tool, and a feature list of the latest version is here. The use of this tool has already been mandated in a few places.

Virginia Tech has a tool
Utexas Austin has a tool
Illinois U has a tool as well.
Sippy has released a tool called WHACK as well that will work with web sites, though I've not tried it yet.


Oh yeah..Identity Finder, a commercial outfit, has well uhm.."borrowed" code from some of these applications and is charging for it. I wonder if they've heard of GPL.

You can also use tools like PowerGrep to search for PII data.

You could also use expensive tools like EnCase to do searches, but that will cost you about $3,800, and you can only do one machine at a time. Many of the tools listed above are FREE as in BEER and have many more capabilities. You can also run the tools above over an F-Response connection during a live response, but I definitely prefer to see a proactive scanning methodology. If you have the consultant edition, you could provide a service..hint hint. Figure out a technique that works for you.
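If you want to roll your own first pass, the core of such a search is just pattern matching plus a sanity check. Here's a minimal sketch (my own simplified regexes, not from any of the tools above; expect false positives on phone numbers and the like, and note that only the card candidates get a checksum filter):

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def luhn_ok(candidate):
    """Luhn checksum; weeds out most random digit strings."""
    digits = [int(d) for d in re.sub(r"\D", "", candidate)]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_pii(text):
    """Return (kind, match) tuples for SSN- and card-shaped strings."""
    hits = [("SSN", m.group()) for m in SSN_RE.finditer(text)]
    hits += [("CC", m.group()) for m in CARD_RE.finditer(text)
             if luhn_ok(m.group())]
    return hits
```

Run it over text extracted from mail stores, documents, and database dumps; for scanned PDFs you'd need OCR in front of it, which is exactly why the purpose-built tools above are worth a look first.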

My point in this post is simply to deliver a small wakeup call to whoever is reading. I have conducted numerous searches for PII data in recent years and it's EVERYWHERE. It's in email, databases, spreadsheets, Word docs, scanned PDFs, CVs, and on and on. It's in your organization right now, and I guarantee that the majority of organizations are doing nothing about it. The worst thing about this data is that it's been so heavily overused in the past few decades that it's on computers and people don't even realize it. Search for it. Get rid of it. You don't need it, and consumers do NOT need to provide an SSN for many of the transactions that take place.

I also promise this..if I ever receive a notification letter from your company, you'll be hearing from me.