Wednesday, June 27, 2007

Where is the science?

Well it's happened again. First a case in Pennsylvania, now a case in Georgia

What I'd like to know is: where is the science that would prove someone knowingly stored these images? To quote the coverage of the panel's opinion: "It's not enough, wrote Miller for the panel, to prove a defendant has pornographic images in the inaccessible cache files of his computer."

This is yet another failure of the experts and prosecution to make their case in a clear and concise manner that is backed by something other than "well my super-duper automated toolkit found these images in the temporary internet files directory."

Folks, this has got to stop. It's time we find a solution to the following claims:

...arguing that the state hadn't shown he knowingly possessed the images because he hadn't taken any affirmative action to store the photos on his computer, was unaware the computer had automatically saved the images and had no ability to access the saved images.

We need something scientific that can prove the only way those images could have been stored is knowingly, and that, unless they were deleted, the defendant had the ability to access the saved images without special software. If we can't do this, then we might as well pack it up and find new jobs.

Thoughts welcome.

Saturday, June 23, 2007

Review - Windows Forensic Analysis

I was doing some thinking lately and realized that there's something commonly missing from our field - reviews. (Surprise!)

Not just peer reviews but reviews of tools and books on the subject of forensics. As such I think I'm going to start adding reviews to this blog. You won't see me adding "stars" to the reviews as I don't put much value in this type of rating. Rather I'll be rating the books/tools on how useful they are and have been to me.

Up first is my most recent read: Windows Forensic Analysis by Harlan Carvey. I own both of Harlan's books, and having participated in a number of the same venues as Harlan for a few years - and seen where his research was going - I was really anticipating this new addition to my bookshelf.

The book begins with some of my favorite subjects, such as live response collection and analysis. These chapters pick up where I believe Harlan left off with his first book on the subject of Windows incident response. I probably won't do this type of review very often, since many books contain too many chapters; with seven chapters, however, this book was digestible, and I think a chapter-by-chapter highlight review works.

Chapter 1: Live Response - Data Collection.

This chapter provides some great insight into the mind of the incident responder and the focal points of a live response: what to collect, what not to collect, suggestions on how to collect, common tools with their usage and output, and my personal favorite - an introduction of methodology. This is something that's missing from a lot of other books: useful methods that provide guidance.

Chapter 2: Live Response - Data Analysis.

Of all the great chapters in the book, this was perhaps the one that disappointed me most. It's one of the shortest in the book and doesn't cover much in the way of analysis. I would have loved to see a scenario where multiple disparate volatile data sources are pulled together to reconstruct events.

Chapter 3: Memory Analysis

This chapter was one of the better, more informative chapters of the book. Memory analysis is relatively new (just 2 years old as a practice), and Harlan does a fantastic job of detailing the intricacies of how processes and threads are structured and created, and how to collect this information and present it in a usable format.

Chapter 4: Registry Analysis

Troy Larson was telling the truth. I've already referred to this chapter a few times.

Chapter 5: File Analysis

This chapter, like the book as a whole, fills several important gaps in what's currently "out there". The event log detail and analysis is second to none, and I've also used this as a reference a few times already.

Chapter 6: Executable File Analysis

I enjoyed this chapter because it included PE header analysis. While there are several books and papers on this subject, Harlan details the import and export tables, which you don't see covered elsewhere.

Chapter 7: Rootkit Detection

This is another short chapter but it includes a variety of tools that are used in rootkit detection, some of which I hadn't come across.

Did I mention that Harlan writes some of the best and most useful scripts around? The DVD is full to the brim with scripts to collect and analyze all types of data. I've used several of these scripts already in my day to day operations and in my honeynet.

Final notes:
This book fills a very important gap on the subject of forensics. Harlan manages to cover topics that I've not seen elsewhere, and he's included relevant and accurate information based on a lot of research and practical experience. This book is a strong, genuinely useful reference, and it belongs on your shelf within easy grasp. Even though it's new, I've already used it several times as a reference.
Favorite Chapters: 1,3,4,5,6

Friday, June 22, 2007

Article on Network Forensics

A colleague sent me a link to the following article in Network World. The article, to my dismay, calls Digital Forensics a science, when I think we've established that we're not a science yet. However, the author rightly states that we have a poor (or non-existent) taxonomy and that methodologies vary greatly.

To quote the article:
There’s also an evidence-collection chronology best practice: Focus on network danger first, then collect the data

Sorry, but this is not an evidence-collection chronology; this is called business-need-based response. Proper methodology is to contain the threat, or stop the bleeding. This is often required because of the nature of the system at risk. Note that the Forensic Incident Response groundwork I laid out before suggests network-based collection to gather as much data as possible.

Interestingly enough, NIST says that their tool testing can't handle network forensics because the tool requirements are so strict. Well, NIST... maybe it's time to realize that this strictness is going to rise up and bite you (and the rest of us) in the butt. Again: collection, by its nature, will modify the original.

I'm a bit perturbed by the author's notion that the best way to standardize is through commercial network forensics tools. Maybe someone should inform the author that many commercial tools use the open source libpcap library as their foundation? The tools don't need to be standardized for network forensics; we need a standard that is scientific in nature. This is exactly the same as the other Digital Forensic Science specialties - they're all lacking in a lot of the same ways.

So what's important when conducting network forensics?

A proper methodology
Standard format - I suggest libpcap of course.
Complete data - Full content collection is preferred.
Authenticity - There needs to be a way to authenticate the content - perhaps hashing each packet and the total packet capture.
Collection from multiple locations - due to the rate of dropped packets, it's best to at least have full content from one source and session data from another.
Collecting from the right point on the network.
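On the authenticity point, here's a minimal sketch of what per-packet plus whole-capture hashing might look like. It assumes the packets have already been read out of a libpcap file as raw byte strings; `hash_capture` and the sample frames are hypothetical, not from any real tool:

```python
import hashlib

def hash_capture(packets):
    """Hash each packet individually, plus the capture as a whole,
    so both single packets and the full stream can be authenticated later."""
    per_packet = [hashlib.sha1(p).hexdigest() for p in packets]
    whole = hashlib.sha1(b"".join(packets)).hexdigest()
    return per_packet, whole

# Stand-ins for raw frames pulled from a pcap file.
frames = [b"frame-one", b"frame-two"]
digests, capture_digest = hash_capture(frames)
```

Storing the per-packet digests alongside the capture would let an examiner show a single session's packets are unaltered without re-hashing the entire file.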

On Volatility...and ESI

I've been doing a little reading recently on Criminalistics and Criminal Profiling and for one reason or another I started thinking about volatility. This is also a response to Harlan's post on RAM.

What is volatility?
In short, it's a measure of change over a given amount of time: the more volatile something is, the more dramatic the change over a shorter amount of time. It's also a measure of how prone a substance is to disturbance.

When it comes to Digital Forensics we often hear about and refer to the Order of Volatility. The OOV is pretty straightforward: the more volatile a source of data is, the higher the priority it is for collection. Something to understand is that, given time and disturbance, everything becomes volatile. There is no such thing as non-volatile data; there is only highly volatile and less volatile data. RAM is highly volatile and everything else is considered less volatile. Therefore all data, RAM included, is or should be considered ESI. It's simply a matter of crafting the proper request.
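To make the OOV concrete, here's a minimal sketch of it as a collection queue. The sources and ranks are illustrative examples of "more volatile" versus "less volatile", not a definitive ordering:

```python
# Lower rank = more volatile = higher collection priority.
# These rankings are illustrative, not authoritative.
SOURCES = [
    ("disk", 4),
    ("RAM", 1),
    ("swap space", 3),
    ("network state", 2),
    ("archival media", 5),
]

def collection_order(sources):
    """Return the source names sorted most-volatile-first."""
    return [name for name, rank in sorted(sources, key=lambda s: s[1])]
```

Working the queue top to bottom means the data most prone to disturbance is captured before it changes.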

Given this interpretation of data volatility - that it's all volatile - RAM is and should be considered ESI. However, as Harlan points out, there are no more free tools to collect the contents of memory - I'd add that this is only true for newer operating systems. Does this equate to an undue burden on the producing party? In my opinion, yes. If one has to purchase a special product in order to produce, then that's an undue burden, and the requesting party should be the one to provide the tool(s).

Forensic Incident Response - the groundwork

No this isn't just the name of this blog. It's a concept and a term I proposed about 18 months ago. It would appear that others are using this term as well.

So just what is Forensic Incident Response? I'd like to think it's the application of Digital Forensic [Science] methods and techniques to the art of Incident Response - a blend of the two. For as long as I can remember, Incident Response has been regarded as a black art: the responder arrived on scene, made some people nervous, and worked some magic to help determine the cause of the problem, propose resolutions, and keep the enterprise humming along. The role of the incident responder has changed drastically in recent months and will continue to do so until a legal precedent is set that demands we collect volatile data - and not just collect it, but collect it with a forensic awareness. First the Heckenkamp decision, now the MPAA case regarding RAM; on top of this, we are seeing the DOJ state that volatile data must be collected. We must apply forensics to incident response because what occurs during the response may end up in court, and we need to move away from being just black artists. The times, they are a-changing.

If Incident Response is like being an EMT (and I happen to think it is), then it's time we put on two hats during our response effort and move past being just the EMT. We can no longer be satisfied with the PDACERF and PICERF methodologies; they are both missing what should be a requirement - Investigation. In a Forensic Incident Response we need to respond with the efficiency of an EMT and the precision of a forensic investigator. Whereas an incident responder functions largely on intuition and experience, a forensic investigator needs to apply scientific objectivity and reserve conclusions until all facts are known. As such, I propose the addition of a formal investigation phase to the incident response methodologies that are common today, in an attempt to make the responder more aware of the need for a forensic incident response.

Let's look at the current methods quickly.

Preparation
Detection & Analysis
Containment
Eradication
Recovery
Follow-up


My proposal is:
Preparation
Detection & Analysis
Containment
Investigation
Eradication
Recovery
Follow-up

One thing to note, is that I think investigation needs to be parallel to the ERF phases of response in order to be as effective as possible. Basically we need to branch the response in to continued response(ERF) and Investigation.

If you were to use the EMT analogy here, Detection would be arrival on scene and finding the person in need, Analysis would be the situational assessment where you determine the breadth and depth of the incident or if what you're dealing with is actually an incident. Containment[1] is the act of stabilization of the "victim" and the scene. Investigation is immediately moving to identification and collection of artifacts.

So, let's talk a little about just what is included in the investigation phase of a Forensic Incident Response.
So there is no semantic confusion here:
Data - digital information
Artifact - traces of activity
Evidence - artifacts used to support a claim

1) Identification of data of interest.
Ask yourself where you want to collect data from. Will it be relevant? Can you collect it efficiently? Is it worth it?

2) Collection[2] of data of interest.
Do you have the tools to collect it? Can you collect it in a manner that is documented, explainable and minimally invasive?

3) Preservation of data of interest.
After collection, the data must be preserved. Preservation equates to establishing the chain of custody, validating the collected data through hashing, and properly storing the data of interest.
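As a sketch of what this preservation step might produce, something like the record below - a digest plus chain-of-custody metadata for each item collected. The `preserve` function and its field names are hypothetical, not a standard schema:

```python
import hashlib
import time

def preserve(data, source, examiner):
    """Build a chain-of-custody entry for one item of collected data:
    who collected it, from where, when, and digests to validate it later."""
    return {
        "source": source,
        "examiner": examiner,
        "collected_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
    }

entry = preserve(b"raw volatile data", "host-1 physical memory", "J. Examiner")
```

Re-hashing the stored data at any later point and comparing against the entry validates that the evidence hasn't changed in custody.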

4) Analysis of data of interest.
Once the data has been collected, the responder should begin to analyze the collected data for artifacts.

5) Reconstruction
If artifacts of value are found, then a reconstruction effort should take place. Reconstruction is the act of determining what created the artifact and crafting an explanation. Knowledge of how artifacts are created is a moving target; if you don't know the answer, don't make one up to support your argument. Apply a scientific methodology, and remember evidence dynamics.

6) Reporting
A final report is a given. Report your findings accurately and truthfully. Do not speculate or present fact based on assumption.

[1] On containment... Containment during a forensic incident response is a tricky subject. The act of containing a system can lead to the loss of useful network-based artifacts. As such, before containment I would suggest a few things, if possible. Ensure your routers are collecting netflow data. Implement an emergency network-based collection system; ideally it should be connected to the local switch where the victim system is located, on a SPAN port or via a network tap. It should be capable of matching LAN line speeds and should collect all data in pcap format using something like tcpdump. Just as an image of physical memory is a requirement, so should a network state snapshot be. Network-based collection should take place during the entire response effort.
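Since the footnote calls for storing everything in pcap format, a quick sanity check that a capture file really is classic libpcap can catch a misconfigured collector early. A minimal sketch - the magic values come from the libpcap file format itself, while `looks_like_pcap` is a hypothetical helper:

```python
import struct

# Classic libpcap global-header magic number, as it appears when the
# file was written on a big-endian or little-endian system.
PCAP_MAGICS = {0xA1B2C3D4, 0xD4C3B2A1}

def looks_like_pcap(header):
    """Check the first 4 bytes of a file against the libpcap magic number."""
    if len(header) < 4:
        return False
    (magic,) = struct.unpack(">I", header[:4])
    return magic in PCAP_MAGICS
```

Reading just the first four bytes keeps the check cheap enough to run against every file an emergency collection system writes out.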

[2] On collection... It's a fact that the act of collection will modify the original; it's inevitable and unavoidable. Your collection will not be pristine. The goal is to be minimally invasive, or least intrusive, and as efficient as possible. Your tool should accurately collect as much relevant data as possible while leaving the smallest footprint possible. Collection will leave traces on the system, so you must know your tools ahead of time.

Monday, June 11, 2007

Family addition

My son Asher was born 6/6/07 at 10:58 am. It was a long 30 hour labor but he finally joined the world with a lusty cry. Mom is recovering and we're both experiencing a lack of sleep! We are truly blessed and he's a happy baby so far.

Friday, June 1, 2007

The faces of Digital Forensic Science

Previously I discussed Digital Forensic Science, and after talking to many people and giving a few talks on the subject, I think I'm finally reaching a conclusion about the discipline: Digital Forensic Science exists in concept but not in practice. Don't get me wrong - I think the science is there, but the majority of practitioners don't use it. As such, I'm beginning to create an ontology to conceptualize my thoughts on the field. Here's some early work on the idea, providing a simple outline of DFS as a whole, some of the contributing disciplines, and areas of specialty. You'll see there are no connecting lines between anything but the areas of specialty; that's simply because I don't think I'm anywhere near done yet.

Digital Forensic Science has many faces. Or should I say it has many facets? Either way, not all of them are directly related to the legal system, and very few of them are forensic science at all. However, they each contribute to what is considered Digital Forensic Science.

I touched on Pollitt's keynote from the 2004 DFRWS conference where in one slide he proposes the addition of Roles to the Framework. These roles he says are not defined by "forensics" as a process but rather they are constrained by the role's purpose for using forensics. I find this to be very interesting and true.

Well, why don't we attempt to define some roles and the purposes for using forensics in each? Ask yourself: "Why do I use forensics?"

If you've got suggestions for additions, or think I'm wrong, let me know... after all, I'm only one man, and I'm not involved in a couple of these fields.

Incident Response:
Maintain or restore business continuity
Provide accurate information to security team to defend against further attacks
Determine root cause
Report findings to management or client
Provide evidence that suggests or "proves" that regulated or sensitive data was or was not accessed (insert "stolen" if need be) by unauthorized individuals.

Law Enforcement and Criminal Investigations
This one is simple...
Get the conviction, get the conviction, get the conviction.

Civil Discovery
Determine the truth of the claim of the plaintiff
Corroborate the physical evidence provided by recovering the digital copy

Real time analysis
Keeping systems available and accurately providing information

Incident Responders, like EMTs, provide triage and stabilization of a scene. While preserving data for possible use in a forensic examination is important, it's not always the primary concern; there's a difference between operational responses and forensic responses. There's typically a decision tree, where the team lead decides whether to go into full-blown response and investigation mode, or to conduct RCA and then flatten and restore the system from backup. DFS doesn't really apply to incident response because of the fluid nature of response and the fact that response scenarios aren't really repeatable. What does apply are the best practices utilized in each Digital Forensic specialty.

Law Enforcement and Criminal Investigations are probably the closest we get to real digital forensic science. However, as I've said in a previous entry, I don't believe that true "science" is carried out. I base this on the failures I laid out when facing a Daubert Challenge.

Civil Discovery situations aren't exactly science either. The same failures related to Daubert still apply here. In addition, we don't always conduct a full forensic examination. Sometimes all we do is a keyword search, or grab representative data from a logical drive. It depends on the case at hand.

I'm afraid I can't add commentary to the intelligence community since I'm not involved there and have no experience with that community. If a reader is part of the intel group using forensics..please indulge me.

I tend to believe that we can reach a full blown "science" but we're not there yet. We need development in many areas. One such area is live response/live forensics (or the term I coined a while back Forensic Incident Response). We're going to start seeing more widespread use of these techniques in the court system pretty soon, especially given the direction many tools and investigators are taking.

Live Response Tool Testing

Over on the Windows Forensic Analysis group there was a recent discussion about live response tool invasiveness and some methods that people currently use. I started thinking about how to codify the methods used so that a standard methodology is created that can be used by anyone.

Below is a copy of my message to that list. I hope readers of this blog can respond with suggestions and any comments.

In order to effectively begin to measure the effects our tools have on systems, we need to devise a testing methodology. Harlan recently asked, "What's meant by testing of tools?"

I'd like to take a first whack at this to see what others think, and I'm hoping people add to it. One thing to understand is that live response, by its nature, is not repeatable. However, testing in a controlled environment is, as long as variables are identified and documented. Using a standard methodology will allow us to create a system by which our efforts can be measured, and our actions will be much more defensible as a result.

The goal: to create a standard methodology for measuring the effects of live response tools on Windows operating systems.

Constraints and scope:
1) This methodology will be for live response tools only.

2) The methodology will measure the effects of the tool on physical & virtual memory, the file system, the registry, and network state. Modification will be measured in two categories: the type of data being modified, and a quantifiable amount of each type of data being modified (e.g., a list of file modifications, the amount of memory displaced, etc.).

3) The control system must be isolated and an accurate baseline must be established.

Define an accurate baseline.
Identify tools and current methodologies for using each.
Create a standard baseline image.
Define the process for measuring the effects of the tool.
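The file-system slice of the measurement in item 2 could be prototyped by diffing two hash snapshots - one taken from the baseline, one taken after the tool runs. This is a sketch only (a real harness would also have to measure memory, registry, and network deltas); the function names are mine, not an established tool:

```python
import hashlib
import os

def snapshot(root):
    """Hash every file under `root`: a crude picture of file-system state."""
    state = {}
    for dirpath, _, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                state[path] = hashlib.sha1(f.read()).hexdigest()
    return state

def measure_changes(baseline, after):
    """Quantify the tool's footprint: which files were added, removed,
    or modified between the two snapshots."""
    added = sorted(set(after) - set(baseline))
    removed = sorted(set(baseline) - set(after))
    modified = sorted(p for p in set(baseline) & set(after)
                      if baseline[p] != after[p])
    return {"added": added, "removed": removed, "modified": modified}
```

Running `snapshot` against the isolated control system before and after a tool executes, then feeding both results to `measure_changes`, yields the quantifiable list of file modifications the constraints call for.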