Thursday, June 18, 2009

The need for speed

When it comes to incident response we are still struggling to close the gap. It's been mentioned here before, and in other places. When a compromise occurs, how quickly do you get to analyze the compromise? Hours, days? Sometimes even weeks!

And when you do get your hands on a system, what are you left with? A lot of trampled data points and an incomplete data set. And what kind of time frame are we talking about to conduct the analysis, just to make a cut as to whether or not it passes the "who cares" test? Consider the time it takes to acquire and process an image in FTK or EnCase. This is commonly referred to as machine time, and it's not uncommon to have too many systems locked up in machine time while the analyst waits for the case to process. And then of course you have the report writing part of the process. All told this takes an average of 20-40 hours per case, often more. This equates to two bottlenecks in the process.

The two primary bottlenecks are:

  1. The time it takes to get to the point where analysis data points are collected, processed and analyzed. There is a lot of time wasted when the analyst is waiting for "machine time" to complete.
  2. The time it takes to generate a report.

This is the traditional model and its inefficiencies. If this doesn't make sense, perhaps the picture below will help.


So now let's look at a model that involves a triage phase.

By including a triage process that collects primary analysis data points while the system is live, we increase our efficiency by multi-tasking. Collection of primary data points can be largely automated. The analyst can then focus on analysis of the collected data points while the traditional acquisition and case processing takes place. Essentially, changing the model takes advantage of all available resources, both human time and machine time.

The benefits of this model are:

  1. Not always having to acquire a disk image. If the "who cares" test isn't passed with the data points that are collected early, then you can easily move on.
  2. This is a one-to-many relationship. An analyst can quickly collect data points from several systems and conduct triage analysis instead of waiting for a linear acquisition process to progress.
  3. It uses all available resources at the same time instead of waiting on one component to complete.
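
To make the one-to-many idea concrete, here is a minimal Python sketch of triaging several systems at once and then deciding which ones pass the "who cares" test. Everything in it is an assumption made for illustration: the host names, the canned `SAMPLE_DATA`, the `SUSPICIOUS` indicator list, and the `collect_datapoints` function are hypothetical stand-ins for whatever live collection tooling you actually use. It is not tied to any particular product.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

# Hypothetical canned data standing in for live collection from each host.
SAMPLE_DATA = {
    "web01": {"processes": ["httpd", "nc.exe"], "autoruns": []},
    "db01": {"processes": ["sqlservr"], "autoruns": []},
    "hr-pc": {"processes": ["explorer"], "autoruns": ["svch0st.exe"]},
}

# Hypothetical indicators; a real list would come from the investigation.
SUSPICIOUS = {"nc.exe", "svch0st.exe"}


@dataclass
class TriageResult:
    host: str
    findings: list = field(default_factory=list)

    @property
    def passes_who_cares_test(self) -> bool:
        # A host is worth a full acquisition only if triage found something.
        return bool(self.findings)


def collect_datapoints(host: str) -> dict:
    """Placeholder for automated live collection (assumed, not a real API)."""
    return SAMPLE_DATA[host]


def triage(host: str) -> TriageResult:
    """Collect primary data points from one host and flag known-bad items."""
    data = collect_datapoints(host)
    findings = [
        f"{category}: {item}"
        for category, items in sorted(data.items())
        for item in items
        if item in SUSPICIOUS
    ]
    return TriageResult(host, findings)


def triage_many(hosts: list, max_workers: int = 4) -> list:
    """Triage several hosts concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(triage, hosts))


def hosts_to_image(results: list) -> list:
    """Only hosts that pass the "who cares" test go on to full imaging."""
    return [r.host for r in results if r.passes_who_cares_test]


if __name__ == "__main__":
    results = triage_many(["web01", "db01", "hr-pc"])
    for r in results:
        print(r.host, r.findings)
    print("Acquire images for:", hosts_to_image(results))
```

The point of the sketch is the shape of the workflow rather than the detection logic: collection runs against many systems in parallel, and the slow, linear imaging step is reserved for the systems that triage says matter.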

Here is another illustration.

Again...it's not like this is built around F-Response or anything. I'm not saying...I'm just saying.

What are your thoughts? I *am* looking for feedback.

2 comments:

Keydet89 said...

I think that in a lot of ways, this is definitely something that needs to be addressed. Even in exams that aren't driven by outside forces (PCI, etc) there is a need for answers ASAP.

Triaging can reduce the total number of systems that have to be acquired, and can speed up the overall process. Having an analyst begin pulling data and shipping it off for analysis (per the timeline topics on my blog) will also serve to speed things up...

Chris H said...

I agree also, but on the PCI front, I am seeing as much as a month go by before a client makes a determination as to which firm he or she is going to hire.

Scope is very important; I've seen firms image everything because it was on the same network.