Saturday, January 3, 2009

The golden hour

I briefly mentioned the golden hour in a recent post. Matt beat me to the punch in his post, but I wanted to spend a little time on the golden hour. In my opinion, the golden hour exists in two places for incidents and incident response.

1) The time between the time of compromise and time of containment
2) The time between the time of containment and the point in time where response truly begins

This post will focus on the second frame of time.

The golden hour is a medical term generally reserved for the period of time that is most critical for a patient in need of care. The idea is that if the proper care can be given within this golden hour, the chances that the patient will survive increase dramatically. This isn't necessarily a hard period of 60 minutes; rather, it can be seen as a principle of providing high quality care as rapidly as possible.

I like to think of it as a guiding principle for Incident Response.

You see, I think it applies to Incident Response in a way that I don't think many people pay attention to. Of all the people I have been in contact with, there is one who really gets it (actually he preaches it as well), and I can't count the times Harlan has said something to the effect of "incident responders are like EMTs". EMTs play a huge role in the success of the golden hour. They are responsible for providing immediate care and transport of the patient. The symbol of the EMT is known as the star of life. It's a symbol many have seen, but I wonder how many people understand the meaning behind each point and the symbol in general.

I won't attempt to reiterate the details of the symbol and its humble beginnings, and under no circumstance would I attempt to take anything away from the folks who wear the symbol by making a direct comparison between EMS work and Incident Response. I know several people who wear or have worn that symbol and, simply stated, they save lives. We work with bits and bytes and there is no comparison.

That said, as I have said before, the best way to master our own field is to study the methods used in other fields. So here goes.

The 6 points of the 'star of life' each represent a specific portion of the role EMS plays. They are as follows:
  • Detection
  • Reporting
  • Response
  • On Scene Care
  • Care in Transit
  • Transfer to Definitive Care
This is where our 'golden hour' begins to take shape.

If we can do the following, our OODA loops close more quickly, our response is more effective and, who knows, we might be able to get closer to the truth. Our goals are (or should be):

  • Early Detection
  • Early Reporting
  • Rapid Response
  • Good on scene practices
  • Care in transit
  • Solid Forensic analysis

Early detection I think I covered already.

Early Reporting
The reason this is here is twofold. We want our clients and customers to file reports with us as early as possible. In addition, we need to be able to provide our reports to the proper people in a timely manner and communicate as often and as appropriately as possible. The worst decisions are made when the decision maker is uninformed. It is our duty to inform the decision maker as early as possible, provided we have something substantial to share.

Take the following into consideration regarding reporting:
  • Provide your clients with a standard reporting template. This template should be updated regularly to ensure it allows the client to provide you with enough information to respond properly
  • Provide multiple intake options such as Web, phone, paper
  • Use a standardized report template
  • Notify the proper people within your organization within 24 hours of an incident occurring
  • Get in touch with the client as soon as possible to establish expectations
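To make the reporting points a little more concrete, here is a minimal sketch of what a standardized intake record and the 24-hour notification check might look like. The class name, field names and sample values are all hypothetical; they are my assumptions for illustration, not a template from any particular organization.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# The 24-hour window comes from the list above; everything else is hypothetical.
NOTIFY_WINDOW = timedelta(hours=24)


@dataclass
class IncidentReport:
    reporter: str                 # who filed the report
    intake_channel: str           # e.g. "web", "phone", "paper"
    occurred_at: datetime         # best estimate of when the incident occurred
    reported_at: datetime         # when the report reached the response team
    affected_systems: List[str]   # hostnames or asset tags named by the client
    summary: str                  # free-text description from the client
    notified_at: Optional[datetime] = None  # when the proper people were told

    def notification_deadline(self) -> datetime:
        """Latest time by which the proper people should be notified."""
        return self.occurred_at + NOTIFY_WINDOW

    def notification_overdue(self, now: datetime) -> bool:
        """True if nobody has been notified and the window has passed."""
        return self.notified_at is None and now > self.notification_deadline()


# Example with made-up values.
report = IncidentReport(
    reporter="help desk",
    intake_channel="phone",
    occurred_at=datetime(2009, 1, 3, 9, 0),
    reported_at=datetime(2009, 1, 3, 11, 30),
    affected_systems=["web01"],
    summary="Suspicious outbound traffic",
)
print(report.notification_overdue(datetime(2009, 1, 4, 10, 0)))  # True
```

The same fields can back a Web form, a phone script or a paper form, so every intake option collects the same information.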

Rapid Response
This is where Matt's post filled in a gap for me. Rapid response is essential to effective response, and for this reason a mandatory response time should be established. Every minute that passes after an incident occurs without a response is a minute that you've given to the intruder. Unfortunately, responding to an incident takes time. Typically there's a huge gap during which the intruder can do any number of things to the system.

Getting on scene physically or remotely as quickly as possible is tough. I've had incidents where I had to get to no fewer than 5 physical locations. There are others who have geographical concerns. This is where F-response closes the gap and allows us to respond rapidly. Think about it. What is in the standard toolbox of a technician who might respond to a compromised system? Antivirus programs, file deletion, backup/restore software, HijackThis (or similar), netstat, Task Manager and other similar diagnostic tools. These tools are known to stomp all over the things we tend to care about. By the time we get on scene, a technician will likely have touched the computer. Again, this is where F-response can play a role. I'd love to get F-response into the hands of every technician out there. Why F-response over something like Encase Enterprise, you might ask? Simple.

1) EE is expensive. Think of what else you could do with that money that would beef up your response.
2) F-response is inexpensive. This is the most appropriately priced tool in the industry.
3) It's simple. I can give a client a CD and tell them to simply pop it in a drive, and I can begin to analyze a system. I can teach them how to use the product in about 5 minutes.
4) It's tool agnostic. It provides the ability to use any of your other tools to do analysis. You aren't locked in to Encase or FTK or X-ways, you can use all three if you want to.

Putting a tool this simple in the hands of technicians would allow for an even more granular tier of response, whereby the technician could gain access to a system in a protected manner and then perform basic triage functions. Antivirus scans could be run safely to identify any malware the AV product is capable of detecting. The rebuilding process could begin on another physical computer while you either travel or prepare to respond remotely. Either way, what I think we tend to miss most often is setting the expectation for the technician or first responder. What do you want them to do, and what do you not want them to do? I can't count the times I've heard techs say "Sure you can have the disk for imaging but I need to get some files off first", or "I copied the files I could find and zipped them up, then I deleted the files. I thought I was doing the right thing". From their perspective I can understand that; however, that's not what we want them to do. To create a granular response capability whereby the technician or local sysadmin can assist, what do you need?

1) Training
2) Usable Procedures - think basic flowchart
3) Simple tools that work

In addition to tools like F-response, live response procedures must exist and they must be followed. If your team members are going out in the field, and they are dealing with systems that are still on when they arrive, they need to know how to conduct a live response. Where resources are spread thin, you can provide a rapid response capability by creating tiers of response. Not all incidents require a full incident response team traveling on site. Some incidents are minor or can be triaged by local staff before you can arrive.

Rapid response is founded on early and accurate reporting. Imagine getting on scene and not having the proper equipment. That would be a disaster. We need to be informed before we respond. You should be able to define for someone what you need to know.

When you are on site in a response situation, you should be able to make decisions when you need to. There are times when you should follow the playbook and there are times when you should throw out the playbook because the playbook doesn't cover the situation at hand and you need to make a solid tactical decision.

Some things to consider about rapid response:

  • Geographic separation
  • Resource availability and tiered response
  • Training of response staff
  • Training of first responders
  • Tools that enhance or provide rapid response capability
  • Knowledge of a diverse set of operating systems
  • Established protocols
  • Response team is cleared to make decisions
  • Response team should be properly informed

Good on scene practices

Once we arrive on scene we need to know what we're doing and we need to be able to do it well. Getting on scene and screwing something up is unacceptable. The right person needs to be available for the response and the right people need to be available from the organization you're serving. It does us no good to have the web guy when we need the database guy. So what helps when you're on scene?

  • Establish protocols for common scenarios
  • Ensure your tools are updated and work as expected
  • Be thorough
  • Be flexible
  • Have the right people available

Care in transit
For incident responders, this is relatively simple. Not only must we travel safely, but if we have taken a disk or a disk image we need to transport it safely. Carrying a bare disk around is not something I enjoy seeing. Drives should be packaged carefully, as should systems that are seized. If a drive dies in our possession, we are held responsible. So what must we do?

We need to prevent tampering, static and shock from affecting a drive we've seized. These are the principles of device and drive seizure, yet they are not always taken into account. I use a combination of drive cases and anti-static bags for this purpose.

Take the following into consideration for care in transit:

  • Safety
  • Shock
  • Static
  • Tampering
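For disk images (as opposed to physical drives), one common way to check for tampering or corruption after transport is to compare cryptographic hashes. Nothing above prescribes this; the following is only a minimal sketch, assuming you recorded a SHA-256 hash when the image was acquired. The function names are hypothetical, and note that this detects a change after the fact rather than preventing it.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large images need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_unchanged(path: Path, recorded_hash: str) -> bool:
    """Compare the current hash with the one recorded at acquisition."""
    return sha256_of_file(path) == recorded_hash.lower()
```

Record the output of sha256_of_file when the image is taken, keep it with your notes, and call image_unchanged with that value once the image arrives in the evidence locker.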

Solid Forensic analysis
Finally, we need to provide solid forensic analysis. Once the drive or image has arrived in our various evidence lockers and we are ready to perform an analysis, it must be thorough, complete, accurate and precise. The use of "I believe" or "perhaps" in our conclusions is nothing more than self-reassurance, and these statements tend to falter in the face of scrutiny. We must be careful in our analysis and in our presentation of data. I will not spend too much time on this, as it is a subject worthy of future discussions. Forensic analysis is the final destination for Incident Response.

Our goal is to reach this stage as rapidly and as effectively as possible. Our clients count on it. It is our duty as incident responders to provide proper response in the golden hour. Have any thoughts to add?