Sunday, December 30, 2007

Honeynet Upgrade

I recently got a little funding for my honeynet and bought some new hardware for it. The major additions were some 3ware equipment, a new processor and motherboard, and lots of hard drives. I put 8 500GB SATA II Seagate ES drives in the box with a 3ware 9650SE as the RAID controller. I finally made the move to vmware as my honeynet platform - expanding the honeynet from 5 to 20 machines. I'm currently building up a new website for it and adding some services to be exploited. I also moved away from the honeynet project's roo cdrom - it just wasn't cutting it anymore. Snort 2.3 is way too outdated.

I re-used my old honeywall hardware and loaded fedora core 8 on the box, along with snort, a second snort running in inline mode, argus, tcpdump, swatch, sebek-server and some other goodies. The iptables configuration was created using fwbuilder.

So ultimately the honeynet is now a hybrid with a physical box for the honeywall and vmware based honeypots. I'm somewhat excited and hopeful that people will attack it, but I guess time will tell.

I'd like to add some automation to the activities on the honeypots using autohotkey or autoit and robotask.

I was pretty shocked to find that one of my hosts was almost immediately getting dinged with sasser.b - yeah I said sasser. I'm also getting some potshots taken at the nepenthes collector. Just yesterday I picked up a binary named msnnmaneger.exe (an sdbot variant).

Wednesday, December 19, 2007

Tool Marks

I was reading a paper on tool marks and trace evidence and started thinking of how that translates to digital forensics. The connection is actually obvious.

From the paper: "The use and application of a variety of tools is commonplace during the commission of various criminal acts. By their very nature, the use of tools typically involves the application of relatively large amounts of force to gain a mechanical advantage in performing a work related task."

As we are well aware this is where Locard's name gets thrown in and everyone says "Ahh, transfer evidence".

In the digital realm we experience much the same thing. When an intrusion occurs, a variety of tools are used to exert a rather large amount of force (exploit, brute force, other means of gaining access) in an attempt to gain access to the victim. We commonly recover these tools during an investigation, but in order to reconstruct the incident we need to be able to determine how they were used during the attack, their purpose, and that the attacker did in fact use them. This is where tool marks come in to play.

The execution of a tool will leave a mark on a system, just as using a tool to gain entry will leave striations or other markings. With luck we might have a cross transfer - that is when the tool leaves a mark and parts of the object are transferred to the tool.

Here's an example:

Let's look at a hackerdefender attack that's been in process for a while. The attacker gained access to an exposed system by exploiting a vulnerability. The attacker downloads a toolkit on to the system consisting of smbscan, pwdump, and a few other tools. The attacker proceeds to offload the SAM, crack the admin password, then smbscans the network and installs rootkits all over the place.

As a tool leaves a mark on the surface it touches, the digital tool will hopefully provide us with tool marks and other transfer.

There are 4 major tool mark categories:

Striated - This is when two objects effectively scrape each other and the tool leaves a mark on the opposing object or vice versa. This is typically seen as lines from the marks - think pliers... The harder object will leave the mark on the softer object.

Impressed - The harder tool will leave an impression of itself on the softer object.

Crush or cut marks - When a tool exerts force on both sides of the object and crushes or cuts it. Imagine bolt cutters...

Multi-stroke - A tool used in a repetitive movement will leave multi-stroke marks.

Time to bring this in to the digital realm...

Striated tool markings obviously can't exist, or do they? Just as a tool has a "signature" it leaves on a softer material, a digital tool will leave markings on the victim. Refer back to the HXDEF example. The tool used to gain entry to patient zero will have left tool markings on the system and the network. Polymorphic code aside, the tool as used by the attacker will leave trace & transfer on the network. On the system you can expect to find a wealth of trace & transfer. A digital tool mark isn't about hardness, however; it's about the effect it has on the system. Hacker defender has a definitive stria. It has an ini file, and many customizations can be made by the tool "manufacturer". Consider this tool striation and just as it happens in the real world the striations are unique because of tool usage and "manufacturing". You can see many of these unique striations on the network as well as the host.

Tool impressions - A physical tool will leave an impression if it's pressed in to a softer material. The digital tool will leave impressions of itself on a victim as it's buried in the system. The files in the toolkit that was downloaded by the attacker leave an impression of themselves on the filesystem. These can be collected and analyzed.

Crush or cut marks are a bit tougher to identify in a digital realm.

Multi-stroke tools would be a saw, or any tool that is used in a repetitive motion to gain entry. Let's use the example above. SMBscan would be an example of a multi-stroke tool. It uses a repetitive motion to gain entry to systems. It will leave marks of a definitive pattern.

So what's the big deal? Well if we start to look at tool marks we can begin to classify things a bit better and respond more efficiently. Tool marks are important to reconstruction and investigation. Tool marks can be broken in to class characteristics and individual characteristics. These start to become important, as does the predictive nature of any assumptive hypothesis we can derive. Many of us do this already. When you see a hxdef.ini on a system, and there's an odd port open, you may inherently assume "uh-oh, we've got a hacker defender intrusion". This can be right or wrong. If you've developed a tool mark library you can refer to it during your investigations. Developing a tool mark library consisting of class and individual characteristics can assist any investigation by adding predictive power to your assumptive hypotheses.
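To make the idea of a tool mark library a little more concrete, here is a minimal Python sketch of one possible shape for it. Everything in it is an assumption for illustration: the names ToolMark and match_marks are made up, and the artifact labels are placeholders. Only hxdef.ini and the odd open port come from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class ToolMark:
    """One library entry: what a tool tends to leave behind."""
    tool: str
    # Class characteristics: marks shared by every copy of the tool.
    class_marks: set = field(default_factory=set)
    # Individual characteristics: marks tied to one build or customization.
    individual_marks: set = field(default_factory=set)


def match_marks(library, observed):
    """Rank library entries by how many observed artifacts they explain."""
    results = []
    for entry in library:
        class_hits = entry.class_marks & observed
        individual_hits = entry.individual_marks & observed
        if class_hits or individual_hits:
            results.append((entry.tool, sorted(class_hits), sorted(individual_hits)))
    # Most explained artifacts first; a hit is a lead, not a conclusion.
    results.sort(key=lambda r: len(r[1]) + len(r[2]), reverse=True)
    return results


# Illustrative entries only; the artifact labels are hypothetical placeholders.
LIBRARY = [
    ToolMark(
        "hacker defender",
        {"file:hxdef.ini", "net:unexpected-listening-port"},
        {"ini:custom-service-name"},
    ),
    ToolMark("smbscan", {"net:repeated-smb-logon-attempts"}),
]

if __name__ == "__main__":
    seen = {"file:hxdef.ini", "net:unexpected-listening-port"}
    for tool, class_hits, individual_hits in match_marks(LIBRARY, seen):
        print(tool, class_hits, individual_hits)
```

A match here only says which known tool best explains the marks you found. It still needs corroboration from other sources before it becomes a conclusion.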

EDIT 12/20/07 - The paper was: The Synergistic Nature of Trace Evidence and Tool Mark Examinations by Vincent J. Desiderio and George W. Chin

Thursday, December 13, 2007

Can't take that host down?

How many times has this happened to you? You are called in to respond and you just can't take a system down? We all know about live response at this point. There are plenty of vendors out there that sell software, and there are plenty of open source tools.

What's another major component of live response? How about the network? If you can't take the system down you need as much real time data as possible to come to any form of conclusion.

Enter the Teeny Tap.

Here's mine in the box:

This is probably one of my favorite devices and the most worthwhile addition to my jump kit after the essentials. I've used mine in a number of incidents. So how exactly do you use the teeny tap in an incident?

I'd start by saying it depends. Are you responding to one host or looking at an entire network? Let's look at a single host for starters. You've arrived on scene, conducted your initial interviews, have your initial threat assessment complete and have identified the host.

How to proceed from here:

Follow your standard practices (photograph in-situ, identify peripherals etc.)

Locate the network cable from the host.
Locate an electrical outlet.
Unpack your tap.
Connect power and cables to the tap.
Connect monitoring sensor to the correct cables and power it up.
Log in to your sensor and start your collection software (tcpdump, wireshark, snort, argus etc). I tend to use argus and tcpdump and then I post process with a number of tools.

Now for the important step. Connect the host to the tap and the tap to the other end point (could be the wall jack, a switch, a cable modem).

Now you are monitoring the network connections to/from the host and you can begin your live response.

When tapping in to a network follow the same steps as above, but you should change your insertion point to the perimeter. Depending on the situation, you may want to just tap the perimeter, but be aware that this may not capture internal host-host communications.

When you're done collecting data, stop your collection software, save the file and hash it. Never work from this file. This file is to be treated as your original and you should only work from exact copies of it. If you're transmitting live response data over the network be sure to identify your host and your data streams so as to prevent any claims of contamination.
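The hash-then-copy routine above can be scripted so it happens the same way every time. This is a minimal Python sketch, not a prescribed procedure: SHA-256 is my choice here rather than anything the workflow requires, and the function names and file paths are hypothetical.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large captures do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_working_copy(original, working):
    """Copy the original capture and confirm the copy hashes identically."""
    original_hash = sha256_of(original)
    shutil.copyfile(original, working)
    if sha256_of(working) != original_hash:
        raise ValueError("working copy does not match the original capture")
    return original_hash


if __name__ == "__main__":
    # Hypothetical file names; record the printed hash in your case notes.
    print(make_working_copy("capture.pcap", "capture-working.pcap"))
```

Record the returned hash with your notes, then point your analysis tools only at the working copy.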

A few gotchas:
You'll probably want to bond interfaces on the monitoring sensor. If you don't then you'll need to set your monitoring software to have a few instances (one per NIC) and you'll have to combine the streams after the fact.

Make sure your cabling is correct.
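On the first gotcha: if you end up with one capture per NIC, combining them afterwards is just a merge by timestamp. The Python sketch below shows the idea on already-parsed records. The (timestamp, nic, payload) tuple layout is an assumption for illustration; on real capture files a tool such as mergecap, which ships with Wireshark, does this job.

```python
import heapq


def merge_streams(*streams):
    """Merge per-NIC packet streams into one stream ordered by timestamp.

    Each stream is an iterable of (timestamp, nic, payload) tuples that is
    already sorted by timestamp, which is how a capture is written to disk.
    """
    return heapq.merge(*streams, key=lambda record: record[0])


if __name__ == "__main__":
    # Made-up records standing in for two separate capture instances.
    nic_a = [(1.0, "eth0", b"syn"), (3.0, "eth0", b"ack")]
    nic_b = [(2.0, "eth1", b"syn-ack")]
    for record in merge_streams(nic_a, nic_b):
        print(record)
```

The merge is only as good as the timestamps, so both capture instances need to run on the same sensor clock.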

I'll probably add some more documentation to this at another point...


Friday, December 7, 2007

I've lost my mojo!

Kidding...I'm kidding.

I was reading through some stuff today and started to get inspired by portable virtual computing environments.

One such environment is MojoPac.

I installed Mojopac freedom on my windows xp virtual machine while running Inctrl5. Other than the insane amount of changes to the system, the major change is the creation of HKLM\RINGTHREE\VM1\REGISTRYMACHINE\SOFTWARE. There's simply too much to post here so suffice it to say it looks like it takes a veritable snapshot of your computer's most important components and copies them to a USB key (in my case my Corsair Voyager GT).

During install I was asked to register with mojopac/ringcube, which I did.

When mojopac actually starts up it looks exactly like an embedded XP configuration would - just the basics. The fun begins when you install software. I proceeded to install two programs in to my new mojopac: metasploit and firefox.

I'll obviously need to do some more work on this but so far...

It leaves traces such as this:

Prefetch files for any application executed from within mojopac:

These are particularly interesting because they blatantly indicate that mojopac was used.

Strings output from the firefox prefetch:
It appears that mojopac uses some linking to system binaries and makes some use of side by side assemblies but when it uses an application installed in a mojopac device it runs from this location: \DEVICE|HARDDISKVOLUME1\DP(1)0-0+5\[...]
I haven't figured out what the significance of the DP(1)... is yet other than it being the volume. Anyone know?
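If you don't have strings handy, the same check can be sketched in a few lines of Python. This assumes the paths inside an XP prefetch file are stored as UTF-16LE text, and it uses "\DP(" as the marker to look for, which is my guess at a useful needle based on the path above. The function names are hypothetical.

```python
import re

# Runs of 6 or more printable ASCII characters encoded as UTF-16LE.
UTF16_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){6,}")


def utf16_strings(data):
    """Return printable UTF-16LE strings found in a bytes object."""
    return [match.group().decode("utf-16-le") for match in UTF16_RUN.finditer(data)]


def paths_with_marker(data, marker="\\DP("):
    """Keep only the strings containing the marker (case-insensitive)."""
    needle = marker.upper()
    return [s for s in utf16_strings(data) if needle in s.upper()]
```

Reading a .pf file in binary mode and passing its bytes to paths_with_marker would list any paths that run from the mojopac volume. An empty list only means this particular marker was not found, not that mojopac was never used.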

It also leaves autorun traces in:

The easiest way to identify the use of mojopac is looking for signs of the following:

Prefetch files exist for both.

In addition, you must be an administrator of the machine to use mojopac unless an add-on is installed.

More on this later.

Sunday, December 2, 2007

Putting the Forensics in Anti-Forensics

There's a lot of noise about Anti-forensics still so while working on some material for a presentation and IR course, I started working on some fancy videos of what an attack looks like, versus what your average live response utility will collect. I started creating this to illustrate a few points.

1) Do you trust your tools?
2) The average tool is not capable of providing enough information when facing modern attacks.
3) An investigation does not begin and end on the disk.

Imagine this scenario:

The IT department at a client of yours calls you up one day and says "Our computers have been acting funny the past few days, and we've checked them but can't find anything out of the ordinary. Can you bring your kit and take a deeper look at the network and systems to see if you can find anything?"

You arrive and as part of your incident investigation methodology, you put an emergency NSM sensor at the perimeter of the network doing a full capture. Almost immediately you start seeing odd things occurring, and you hear a shout from down the hall, calling your name. You grab a CD and walk down the hall. Arriving at the secretary's desk you see her and the IT guy talking and they look elated when they see you.

You pop in your trusty Helix CD and run WFT.

AAron Walters was kind enough to host this data for me. Thanks AAron!
Download the data here. The MD5 is here.

You'll find the video of "whodunnit" in the file above. Watch it beforehand if you want to cheat, but I would hope some would want to analyze before looking at the answer.

So, now refer to my points above.

1) Do you trust your tools?

Trust is mainly about reliability and confidence. How confident are you that your tools are showing you accurate information? How reliable is the data output?

2) The average tool is not capable of providing enough information when facing modern attacks.

As attacks get more complex, the dataset grows, and likewise becomes more complex, and obscure in revealing useful information. The average "forensic" tool of today can't compete. You may be required to throw out the protocol book once in a while in favor of getting the job done.

3) An investigation does not begin and end on the disk.

Like I said attacks are getting more complex. Forensics is not simply about data collection from one source. There are many data points in an investigation, and we can't afford to limit ourselves to just the disk, for there is information elsewhere that may not exist on disk.

Enjoy. If you find interesting bits of data, please post them as a reply or on your own blog/site and provide a link.

Collecting physical memory

I was recently looking at a memory collection of about 4GB and started thinking about the order of volatility. A 4GB memory capture to a USB key can take an inordinate amount of time, and during that time the other volatile data on the system is not only changing, it's potentially disappearing. Take for instance a netstat command that shows a connection that's ESTABLISHED. While you are collecting memory and nothing else on the system, that connection would go from ESTABLISHED to CLOSED, to disappearing altogether from your state table. While this information can certainly be collected from memory, it's still collected when you run netstat (or other similar network state table collection software). So what's my point?

Well I suppose it requires more experimentation with the concept but is it necessary to collect network state information externally if you're already collecting ram contents which contains the same state information? My thoughts are no, although it does give you network state during two points in time, which may be a more accurate "smear". Would forking the memory capture process to collect network state at the same time make more sense? It would create a bigger footprint, but the network state would be captured independently of memory, at the same time.
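For what it's worth, here is roughly what I mean by forking the collection, as a minimal Python sketch. It is an illustration of the idea only: both collectors are callables you would have to supply, the function name is made up, and nothing in it actually images RAM or runs netstat.

```python
import threading
import time


def collect_concurrently(capture_memory, capture_netstate, interval=1.0):
    """Run a long memory capture while sampling network state in the background.

    capture_memory: callable that blocks until the memory image is written.
    capture_netstate: callable returning the current state table (for example
    parsed netstat output). Both are supplied by the caller.
    """
    samples = []
    done = threading.Event()

    def sampler():
        # Take one sample immediately, then one per interval until memory is done.
        while True:
            samples.append((time.time(), capture_netstate()))
            if done.wait(interval):
                break
        # Final sample so the end of the capture window is also covered.
        samples.append((time.time(), capture_netstate()))

    worker = threading.Thread(target=sampler)
    worker.start()
    try:
        image = capture_memory()
    finally:
        done.set()
        worker.join()
    return image, samples
```

Each sample is timestamped, so the state table can be lined up against the memory image afterwards. The cost is the bigger footprint mentioned above: every sample is one more process touching the system while memory is being read.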

What are your thoughts?

EDIT: 12/4/07 changed wording about ESTABLISHED connection.

Monday, November 26, 2007

Digital Criminalistics

Once upon a time a forensics investigator would arrive on a scene and seize a computer. The scene would be photographed, evidence would be collected, bagged & tagged and so on. A few floppies would be collected, maybe a CD but not much more.

Fast forward about 10 years....

A forensics investigator arrives on scene to discover a wireless linksys router with a 4 port switch on the back. 3 wires connect to a triplex data drop in the wall. One room with a computer became a house with a computer in each room. A 3.5" floppy is now obsolete, and data is everywhere. Did you remember the cell phone? The PDA? The camera, and the media cards? The DVR or media center computer? It just keeps going...

Even a few years ago as Incident Responders we knew what to look for. Botnets were rampant. They were easily detected and a vast majority of them followed the same standard. These attacks adapted and became more complex. Fast Flux networks are in full swing.

Digital Forensics is quickly becoming as complex as real world criminalistics. In the sea of agile data, we must be as agile. Our methods need to be fluid and adaptable. Simply pulling the plug is no longer the best practice. Digital Forensics doesn't begin and end on the disk. It begins at point A and ends at Point Z. There are many data points in between. We need to know what these points are, and how to collect relevant data from them. We must also understand the forces at work and how they influence the data to be collected.

After a scene has been contained..note I said contained. Let me digress....Containment is: the action of keeping something harmful under control or within limits. While pulling the plug is a method of keeping something harmful under control, we must understand that it is the most extreme method in use today. Pulling the plug is what I consider to be a knee-jerk reaction used due to lack of understanding of data, the forces at play, and the influences those forces have.

Getting back on track..after a scene has been contained, the evidence preserved and collected, the reconstruction begins. This is where things can get difficult, because we as an industry don't have a scientific background for our field. Consider if you will that the source code of Windows is closed. This leads us to seeking empirical truth, or that which we can observe. This is actually converse to what we are seeking to accomplish. Simply because we can observe something, doesn't mean it's the only explanation. We can only ever hope to obtain a level of certainty in our conclusions.

In search of science we begin with induction: "The timestamp was modified". After finding a binary capable of modifying timestamps we move quickly to say "This binary is responsible for modifying timestamps on the system". Is this correct? Maybe. It requires additional work to be a more complete conclusion.

Next we have deductive reasoning: "The timestamp was modified by this binary". After experimenting we deduce that "The timestamps could have been modified by this binary". This is better. We are not absolute in our conclusion.

Neither of these is wrong, and neither is right, however both are incomplete. Why? We're only looking at one source, and we can't make a dogmatic statement based on a sole source of evidence. As digital criminalists, we must be able to say that given all data points, and after careful evaluation the most likely explanation for the modification of the timestamps is this binary. There are many potential methods of timestamp modification, but given the data I am reasonably certain that any other explanation would be less believable and unlikely.

We must understand that our conclusion is not simply the correct one. It is the correct one because other possible explanations have been ruled out as implausible given the dataset. To truly achieve this we must have corroboration. Multiple sources of data must be in support of one another to have an accurate conclusion.

As the complexity increases, we must become more certain of our conclusions. We must truly understand what is the cause, and what is the effect. If we understand causality we can move towards science.

Tuesday, September 18, 2007

Enter the MAC

When apple moved to intel based hardware I was really excited. For years I abhorred apple and macs, commonly referring to them as macintrash, macincrap, or the ubiquitous doorstop. With the move to the intel platform I decided to give them another try. Last week I turned in my IBM T41 and picked up a 15" macbook pro with a 2.4GHz core 2 duo, 2GB Ram (upgraded to 4GB using geil dimms), 160GB hard drive. Big deal right? Well as an incident response/forensics d00d I now have the best of all three worlds in which I commonly live.

My first move was to install Vmware fusion. Installation was simple. I gave myself 30GB, entered a username and password and the serial number, and off went my vista installation. I next went ahead and installed Ubuntu 7.04 Feisty Fawn.

Note if you will, that the windows start button is in the lower left hand corner. That's pretty sweet if I do say so myself.

The other pieces of the system that are nice are the built in firewire 400 and 800 for attaching those pesky write blockers.

I'll be adding XP soon, but so far I'm very hopeful about the new platform for investigations. I just hope the hardware holds up. Anyone else doing this?

Tuesday, September 11, 2007


I've been quiet for a while and mainly that's because I haven't had much to say lately on top of being overwhelmed with work.

However, after reading a recent post on Richard Bejtlich's blog I'm starting to get really annoyed with the notion of "anti-forensics". It's quickly become the buzzword of the year it seems, in no small part due to the blathering journalists at CSO magazine trying hard to keep C level execs in the loop.

Just what is forensics ?
forensics: The application of science to answer legal questions. Or "used or applied in the investigation and establishment of facts or evidence in a court of law".

So, what then is anti-forensics?

According to Stach and Liu (you know..those antiforensics metasploit guys) it's: application of the scientific method to digital media in order to invalidate factual information for judicial review.

Ok, so here we see it's the antithesis of what forensics is. Great, just what we expected! Is this entirely accurate? No - why? Because the application of the scientific method is lacking. So why all of the confusion about what antiforensics actually is? Perhaps because everyone is using an umbrella definition to describe and define what are actually very specific methods and techniques.

Previously I started mapping out the world of forensic science and digital forensic science in an attempt to make sense of the many facets of the industry. Forensic science, while it includes a conglomeration of many fields of study and science relies heavily on human beings and their senses to interpret information and present it as fact. There are 5 major human senses as we all know. These senses translate to the digital world to form the basis of how investigations are conducted and the requisite skills to accurately perform said investigation.

Remember the saying "What the eyes see and the ears hear, the mind believes"? This is not only true of forensic science but of digital forensic science as well. So what is antiforensics really?

Techniques and methods designed and intended to reduce the forensic analyst's ability to accurately reconstruct and present data as fact, the accuracy and trustworthiness of the data, and the tools used to conduct forensic examinations.

Ah, now we're getting somewhere. Antiforensics attacks the analyst, the data, and the tools.

It's been demonstrated time and time again that tools and data can be manipulated to the point of appearing to be useless to an analyst, so what should be the real focus? The human dimension. No tool is perfect, they can all be circumvented in some form, and data shouldn't be trusted until verified. Antiforensics can mislead, deceive, and thoroughly stump an investigator or analyst until a decision point is reached and the investigation is stopped in favor of easier wins, it drags on, or an incorrect conclusion is reached. So what must an investigator do to counter antiforensics? Simple, the analyst needs to be better trained, and have a firm understanding of situational awareness.

Situational awareness information can be found here:

Situational awareness when it comes to forensics and incident response is vital. The investigator needs to know and understand everything that is going on. You need eyes in the back of your head, and an extra set of hands. You must be able to take in new data constantly, process it, compare and contrast to existing data, put it in to perspective to make the right decision. In many cases, you don't have a lot of time either.

When it comes to training...

There is one component of antiforensics that seems to escape many people. The user of antiforensics must understand forensics in order to use the techniques to maximal effect. If you don't know the techniques used by forensic analysts, don't understand their tools, and don't know how they think, then you can't possibly "anti" or "counter" everything. This has been called the "CSI effect" in the real world and now we're seeing it in the digital realm. Sure, a perp will splash bleach on blood stains in hopes of washing them away, but it takes time and until then all they've done is destroy the pigment. On top of this, did they manage to plant evidence that it could have been someone else? Did they hide their footprints, fingerprints, destroy bodily fluids and so on? Odds are no, they didn't. In addition, if you've ever spoken with criminals before, many will tell you they got caught because they got greedy, were nervous, or didn't know what they were doing. Like the construction worker that robbed a store with his hardhat on; his name was written on the hardhat. Or the criminal that went back in to the store one last time to get another load.

The failure to recognize that the people using tools with antiforensics capabilities didn't create them and don't understand what they're actually doing seems to be causing Fear, Uncertainty and Doubt, or FUD, in a lot of practitioners. Buzzwords abound and everyone seems to be throwing antiforensics around like it's some new threat. Remember if you will that digital forensic science and digital forensics are made up of many specialty areas and attackers or criminals aren't generally experts in defeating all of them. Antiforensics raises one point above the rest - Never make a dogmatic statement based on an isolated observation. Your investigation can not hinge on one source of data, and you can never make an accurate statement based on a single source.

So how do you as an investigator overcome antiforensics?

Use your senses.

Sight - Your eyes can and will deceive you so don't trust them. Use multiple tools each time you investigate. There is no one ring to rule them all and there is no one antiforensics tool or technique that defeats every forensic tool.

Smell - Smell out the rat. There is always evidence to suggest an intrusion and crime. The criminal or attacker will slip up somewhere when attempting to hide their tracks. You must be able to smell out the rat that will give away the perpetrator.

Taste - If you notice something weird, try it out yourself to see how it "tastes". If you have an unknown binary, sandbox it and see what it does. Get a demo copy of software that was used and see how it works in depth.

Sound - Listen to the evidence, not the people involved. The evidence will lead you in the right direction.

Touch - Get your hands on as much information and equipment as possible. This is where exposure increases your ability to outsmart the opponent.

Thursday, September 6, 2007


I recently started watching the show 24. A family member let me borrow the first few seasons on dvd. While I've enjoyed the show I've noticed a huge number of interesting topics that just seem out of place. One such topic is interrogation.

If you've ever seen the show, you might find it amusing - I know I did - when the interrogator claims to be "pushing" the suspect pretty hard when the suspect is asked about 3 questions and the interrogator says "ok I believe you".

If you've ever been assigned to handle an incident of any reasonable size and scope you've questioned people, their actions, the reasons behind those actions and had to dig for more information. Some might call this the "interview", I tend to view it as a passive interrogation for a few reasons.

- SA's and NA's commonly feel they have something to hide.

If you've ever worked as a network or sysadmin you probably have some sense of what I'm getting at. NA's and SA's tend to get territorial about their systems and networks and as a responder you are invading their territory. It's kind of like inter-agency "cooperation". Not only is territory an issue, but more importantly people try to hide their mistakes in an effort to cover themselves and most likely protect their jobs.

During an incident, my partner and I had to spend about 5 hours interviewing an admin. Initially we started out actually conducting a standard information gathering interview. We asked common questions related to network topology, system type, system configuration etc. As we began to delve deeper, the admin became more and more closed off and shut down, leading us to take some relatively extreme response methods such as locking down the entire network and relegating the admin to desktop support while we conducted a room to room search.

- You are the outsider

Even if you work for the same company, you are the outsider. We are members of what is viewed as the "hit squad". An alert of some form was sent, we respond and arrive on scene with our jump bags or pelican cases containing lots of gadgets (I typically arrive with 2 1650's and a backpack full of paperwork), we ask questions, we seize systems, conduct investigations and file a report when we're done. We are the outsider, regardless of who we work for. There are some ways to change this perception and it typically involves - at least for me - winning over the administrative assistants. Admin assistants more often than not have the pulse of a department or company, and can get you just about anything you need if you win them over, especially if you're going to be there a while.

- Management fears the outcome

When approaching management with the potential to make them look bad to their bosses you must tread carefully because they can make the investigation a difficult one. If you interview management about policy and policy violations or poor decisions made based on purely financial reasons rather than accurate risk assessments, remember to be politic rather than accusatory. Do not try to intimidate them or second guess their decisions. Their decisions were already made, and it serves no purpose to tell them they were wrong. When it comes time to write your report, make your points in the recommendations section. This is the "bottom line" of an incident report for management because this is where costs commonly get associated.

To that end I want to make a few suggestions to those of you conducting interviews.

- Remind the interviewee that you're not there to get them in trouble. You're just trying to resolve the issue
- Be as thorough as possible
- Ask leading questions, and let them do the talking
- Don't let your frustration show
- Know when to press the issue and when to let it go
- Get what you need to get you started and move to secure the systems. You can always ask new questions later and the more you know, the better formed your questions will be
- Trust no one. The facts will do the talking.

Monday, August 27, 2007

A reversal of fortunes

In my "Where is the science?" entry I questioned the decisions on two cases of child pornography possession and that our ability as examiners to find images is just not enough. In an interesting reversal on the Diodoro case, the Pennsylvania superior court decided that viewing images is in essence exerting control or possession of CP.

To quote the article:
"[Diodoro's] actions of operating the computer mouse, locating the Web sites, opening the sites, displaying the images on his computer screen, and then closing the sites were affirmative steps and corroborated his interest and intent to exercise influence over, and, thereby, control over the child pornography,

He added that while Diodoro was viewing the pornography, he had the ability to download, print, copy or e-mail the images."

Wow, now that is actually an interesting way of looking at things. That you have the image displayed on screen means you have the ability to do something to or with it, and therefore you have control over the image.

Here's how I'm viewing this...

If I am viewing an image, it's true that I can do what I wish with it, except modify the original as displayed on the website. I am in possession of a digital copy of the original, which is as good as the original file as displayed on the website.

The copy that has been automatically downloaded to my computer's temporary internet cache and is being displayed is under my possession and control at that point in time when I am viewing the image. My actions (visiting the website willingly, and possibly expanding a thumbnail image) affirm the fact that I wanted to view the image and therefore I have the ability to exert control over it; I have the ability to manipulate the image as I see fit - which is to say I can save, copy, email, print, crop, etc...

Let's hope that other Courts can use this during prosecution of these types of cases where the law states that anyone who "possesses or controls" these images is guilty. Chalk one up for the good guys.


Tuesday, August 7, 2007

Review - Virtual Honeypots

I got this book approximately 3 days ago and absolutely tore through it. This book was fantastic in every sense of the word.

Niels Provos (of honeyd fame) and Thorsten Holz (from the German honeynet project) teamed up to provide a true wealth of knowledge and information in Virtual Honeypots *note I bought it from Amazon*

As the title suggests, this book is all about creating and utilizing a virtualized environment to host honeypots. From the first chapter on, there is no mincing of words and the technical aspects are covered from set up to configuration to usage. Virtual Honeypots is a logical progression from the initial honeypots and KYE books and focuses more on the honeypot than the honeynet. There's such a wide variety of topics discussed that this book is probably best served as a reference after reading it once or twice. I was in awe when I read chapter 7 and specifically the section on the potemkin honeyfarm which apparently has been used to emulate over 64,000 honeypots!

This book presents itself really well and the authors did a fantastic job covering all of the critical and really interesting projects that are out there in the honey(net|pot) world. If you operate a honeynet or honeypots this book is not an option, it simply provides too much information to ignore. Even if you don't operate a honey(net|pot) this book is well worth the money and it's going right on the shelf next to other quick grab reference books.

Thursday, July 26, 2007

Keeping up with the joneses

I had the pleasure of sitting down the other day with a technology law professor and we began discussing the issues surrounding memory capture. We covered not only the impact that a collection might have on a system and how a lawyer might attack that capture being presented as evidence, but also the fact that the industry lacks standards and follows best practices instead, which presents its own set of issues.

I was pointed to a tort case involving tugboats of all things. The case of the T.J. Hooper is a rather important one it would seem. As this is an oft-cited case, I'll leave it to readers to google for it. In short, the case surrounded a cargo company, a transport company and a barge company. The transportation company assigned a number of tugboats to pull the barges along the Atlantic coastline from Virginia to New Jersey. The cargo was lost in a storm and the cargo company claimed the barge company was at fault. The barge company in turn claimed it was the fault of the towing company. Not for any real fault of the tugboats - they were seaworthy vessels structurally. The judge however decided the tugs were unseaworthy because they did not have radios onboard.

Radios on tugboats in 1928... This was not the standard, or even commonplace, in that era. However, some companies did have them. It was argued that radios were a best practice, and that the tug company should be faulted for not having them. The case ended up being about custom versus negligence. Was it customary for tugs to carry radios, and was the tug company negligent for not carrying them?

The professor I spoke with termed this "keeping up with the joneses" and said it was up to us to set the current best practices and ensure that these best practices are followed by all practitioners, lest they be held liable for negligence because others are performing live response and memory captures.

So, what is current best practice for live response?

Many would say that current best practice is to capture memory before doing anything else on the system, because it is the most volatile source of data.

I support this as a best practice, however there's a major problem with this idea: judgement. The issue with following best practice rather than a standard is that judgement calls must be made. Make the wrong judgement call and you could be held liable for negligence. So, perhaps the industry needs to define cases for the application of best practice. Undoubtedly someone will respond to this and claim that these cases have already been defined. I would argue that sure they've been defined, but only in a cursory manner, and I would ask: where can I find these best practices stated, and which is actually the best practice when opinions differ?

Do we then begin to claim that one person is more credentialed than another and therefore the less credible person's opinions are not to be followed?

Other questions that some would ask:
When should you use a USB key instead of a network capture?
Should you capture memory twice during your response (before and after execution of response tools)?

EDIT 8/2/07: This bears clarification. It's been suggested that memory captures be taken twice during a response effort from two different devices. I've also been asked the above question before.

Is it best practice to capture memory before doing anything else on the system, even though the time it takes to capture memory can be prohibitive?
If you're in a race against time what do you do?
Do you need to run 40 tools in a script, when much of that information can be gathered from memory?
Do you need to run 40 tools in a script when 20 will suffice?

So, when you're performing a live response, remember the T.J. Hooper. Since we all follow best practice for lack of a standard, negligence will be hanging overhead at every turn unless we all start dancing to the same tune.

EDIT: When I say I want cases, I don't literally mean legal cases, although it would be nice to have those as well. I was referring to cases as scenarios for when to and when not to do certain things.

Friday, July 20, 2007

A did you know

Quite some time ago, Tableau stated that they'd be providing the ability to remove HPAs and DCOs using their write blockers. Well, that day has come and gone. The Tableau Disk monitor tool provides this capability on Windows. Cool. Now it seems, according to Tim of sentinel-chicken, that there is a Linux equivalent to the HPA/DCO removal capability. VERY COOL! The tool is called tableau-parm and it's a command line interface. I'll be trying this tool out presently.

Tuesday, July 17, 2007

Modeling a scene

I always wanted to do something fancy for a presentation when it came to describing how I found a scene. Sure, you can use visio, or autocad if you want to get super fancy, or even some real expensive modeling software for crime scenes. I decided to experiment with modeling a simple sample scene today in google sketchup. It took me about 20 minutes, mainly because I'm new to the program in this capacity, however I really like what this program can do. It's easy to learn, has video tutorials, and models are abundant. Anyways, I exported my model to a 2D jpg (with varying degrees of detail) because there's no way to post a 3D model that I know of. The 3D model is much, much better, but oh well.

What do others use for modeling a scene?

Monday, July 16, 2007

musings on bridging the gap and the blue wall

For a few years I've been trying to figure out how to pierce the blue wall and bridge the gap between law enforcement (government, state and local) and people like myself, so that we can actually do some good with our skills. Why? Well, when I was in school, after I completed a project (details at the bottom for those of you who are bored), my CRJU prof said that LE could probably use a person like me. I found it inspiring at the time.

Some things I've learned along the way.

1) PD's don't respond to people

2) Pro-bono offerings don't get you anywhere

3) You must be "inned"

4) It takes a cop to work with them, or a professor

5) They don't want civilians seeing ugly all day long

There's a story here...I visited a state police cybercrime lab and spoke with the folks there. 60% of their cases were CP. I looked at the current pending cases board on the wall inside the exam room, and fully 90% of the cases at that time were CP. When asked what it's like, one examiner said "I see ugly all day long, I see it at work, I see it at home, I never stop seeing ugly, but I do what I can". Dealing with CP is tough work and this above all the other reasons is the one I can sympathize with.

For the most part, I'd be lying if I said I understood many of the reasons behind excluding the other half of the industry. However, I'm no fool, and I no longer kid myself. The truth is, I'll never understand until I'm in the shoes of an LEO. Sure, I know several of them and I've heard stories, but that doesn't mean much, and it's not likely that I'll become an LEO.

As I've not yet been able to pierce the blue wall, I offer an open invitation to anyone reading this blog who's a member of law enforcement of any kind. If there's anything that can be done (application analysis, testing, validation etc.) by a person such as myself, just shoot me a message. I have no hopes of contact, but like you all, I do what I can. If only LEOs would express their needs, I think the industry (commercial and individuals) could help. I've been all over the web, and I can't recall seeing an LEO sharing what it is that would help them the most. To that end I say keep on fighting the good fight, I'll continue to support the PBA, and just remember, there are people like me all over.

Thoughts, comments?

What was that project? Well, it was a mock search for a victim in an open area. We were given a scenario in one of my criminal investigation courses and asked to analyze the scene and determine a search pattern. I got a little crazy and went up in an airplane in January to take some aerials. Damn that was cold but so much fun. Some photos for your enjoyment(Thanks picasa!)..

One of many photos taken:

and my zone overlay.

Saturday, July 14, 2007

The rise of network forensics

I am starting to think that "network forensics" is going to quickly become the next "big thing"(TM) in the digital forensics discipline.

Well, what is network forensics?

By definition: network forensics is the capture, recording, and analysis of network events in order to discover the source of security attacks or other problem incidents.

Unfortunately, that's the only definition I could actually find. Why is that I wonder? Perhaps because there is no such thing as network forensics? Is it just really another name for network security monitoring or intrusion analysis?

NSM is the collection, analysis and escalation of indications and warnings to detect and respond to intrusions. Intrusion analysis is much the same.

Let's look at a few of the tools used for each shall we?

net flows(cisco, argus etc)
Snort or other IDS

These are obviously only a few tools, many open source. Either way, they are multi-purpose tools used for things like protocol analysis, traffic analysis, intrusion analysis, security monitoring and forensics? Let's be honest with ourselves, network "forensics" appears to be just a buzzword for what's been done for years in the analyst field. There's nothing "forensic" about it..or is there?

So, where does the "forensic" component come in to play? Simple. Collection, preservation, and presentation as evidence.

Network forensics has many criteria that must be met in order for it to be useful as evidence.

My initial thoughts...

1) It must be complete

Whether it's session data, full content, alert or other, it must be complete. A capture that's missing packets could be seen as damaging to a case. A capture system must be capable of keeping up with the network it's connected to. Anomaly systems like Qradar that decide what to capture content on, or not, should not be used as evidence because the capture will be incomplete. A partial capture could be used to perhaps get a wiretap, or warrant but presenting it as fact or of evidentiary value would jeopardize a case in my opinion.

2) It must be authentic

Captures must be proven to be authentic. Hashing routines should be run as a capture log rolls over or each packet should be hashed as it's stored, before the analysis begins. In addition, the capture should be done by two independent systems if possible.

3) Time is critical

The collection system must be synched, and any time disparity must be accounted for.

4) Analysis systems need logging capabilities.

The system an analyst uses to examine a network capture that will be used as evidence must log the analyst's actions.

5) Anyone presenting network based evidence must be trained in packet and protocol analysis, not simply the tools used.

In my opinion, being able to read a packet's content in depth and being able to accurately interpret and analyze it is of utmost importance. Being able to explain the whys and hows is critical. It's easy to jump to conclusions that are incorrect with network based evidence. Richard rebuffs here. And here is an incorrect conclusion. It's very easy to do since many scenarios will fit a similar signature.

6) It must be used in conjunction with host based forensics if possible.

Of course, not every scenario is ideal, but remember that the power of digital evidence is in the ability to corroborate. Corporal and Environmental evidence should be used to corroborate each other if an accurate reconstruction is to take place. The value of the evidence will be bolstered if the two sources support each other.

7) Sensor deployment must be appropriate.

It does almost no good to deploy a sensor at the perimeter if the attack is internal. It might make sense to deploy multiple sensors in this type of case. Care must be taken to deploy at the right time and location.
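Point 2 above (authenticity) is the easiest of these criteria to put into practice. Here is a minimal sketch in Python, assuming the capture tool writes rotating files and that something calls a function once each file is closed. The function names and the manifest format are hypothetical and not taken from any particular capture tool.

```python
import hashlib
from pathlib import Path


def hash_capture(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a capture file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_rollover(capture_path, manifest_path):
    """Append 'digest  filename' to a manifest once a capture file is closed."""
    digest = hash_capture(capture_path)
    with open(manifest_path, "a", encoding="utf-8") as manifest:
        manifest.write(f"{digest}  {Path(capture_path).name}\n")
    return digest


def verify_capture(capture_path, expected_digest):
    """Re-hash the file before analysis and compare with the recorded digest."""
    return hash_capture(capture_path) == expected_digest
```

The manifest itself would need to be protected (for example, stored on separate media), otherwise the hashes prove very little. If two independent systems capture the same traffic, each would keep its own manifest.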

What are others' thoughts on the subject of network forensics? Is it snake oil, or a bona fide digital forensics specialty?

Thursday, July 12, 2007

Determining memory consumption

As I've been developing the methodology I talked about previously, one of the problem areas that's arisen is determining system impact. One component of impact is determining memory consumption. There's a lot of work to be done here because of the complexity of memory management, and there's a lot of work being done currently. If we're after precision, we need to know how many pages have been allocated to a process, how many are used, if they are in swap or resident in memory, what was previously in those pages, and what's in them now.

I started working on a primitive method to determine memory consumption. Note I said consumption rather than displacement, since determining displacement requires a lot of wrench time.

One such method to maybe start determining consumption is to use the debugging tools from Microsoft. The two tools I'll use here are windbg and cdb(same tool actually..just one is command line).

The tool I'm checking out is tlist, another tool included in the debugging toolkit, also used in incident response and live response scenarios. Here goes.
Fire up a command prompt and run this:
C:\Program Files\Debugging Tools for Windows>cdb.exe -y -o tlist.exe -t

Microsoft (R) Windows Debugger Version 6.7.0005.1
Copyright (c) Microsoft Corporation. All rights reserved.

CommandLine: tlist -t
Symbol search path is:
Executable search path is:
ModLoad: 01000000 012dd000 tlist.exe
ModLoad: 7c900000 7c9b0000 ntdll.dll
ModLoad: 7c800000 7c8f4000 C:\WINDOWS\system32\kernel32.dll
ModLoad: 77c10000 77c68000 C:\WINDOWS\system32\msvcrt.dll
ModLoad: 03000000 03116000 C:\Program Files\Debugging Tools for Windows\dbghel
ModLoad: 77dd0000 77e6b000 C:\WINDOWS\system32\ADVAPI32.dll
ModLoad: 77e70000 77f01000 C:\WINDOWS\system32\RPCRT4.dll
ModLoad: 77c00000 77c08000 C:\WINDOWS\system32\VERSION.dll
ModLoad: 7e410000 7e4a0000 C:\WINDOWS\system32\USER32.dll
ModLoad: 77f10000 77f57000 C:\WINDOWS\system32\GDI32.dll
ModLoad: 774e0000 7761d000 C:\WINDOWS\system32\ole32.dll
ModLoad: 77120000 771ac000 C:\WINDOWS\system32\OLEAUT32.dll
(7c.f90): Break instruction exception - code 80000003 (first chance)
eax=00181eb4 ebx=7ffd5000 ecx=00000001 edx=00000002 esi=00181f48 edi=00181eb4
eip=7c901230 esp=0006fb20 ebp=0006fc94 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
7c901230 cc int 3

Ok so that's good. Now we're stepping in to the tlist process and we've reached the first breakpoint. Now, fire up windbg and attach to the kernel (CTRL+K), select local and start with !process 0 0 to get a list of processes that are active.

lkd> !process 0 0

PROCESS 82b7c020 SessionId: 0 Cid: 007c Peb: 7ffd5000 ParentCid: 0da8
DirBase: 03660360 ObjectTable: e12f6a48 HandleCount: 7.
Image: tlist.exe

Aha, here's tlist. Now, we can key in on the process (output truncated):
lkd> !process 82b7c020
PROCESS 82b7c020 SessionId: 0 Cid: 007c Peb: 7ffd5000 ParentCid: 0da8
DirBase: 03660360 ObjectTable: e12f6a48 HandleCount: 7.
Image: tlist.exe
VadRoot 829ff7f8 Vads 27 Clone 0 Private 46. Modified 0. Locked 0.
DeviceMap e1b2efd0
Token e1165d48
ElapsedTime 00:05:54.546
UserTime 00:00:00.015
KernelTime 00:00:00.062
QuotaPoolUsage[PagedPool] 27892
QuotaPoolUsage[NonPagedPool] 1080
Working Set Sizes (now,min,max) (236, 50, 345) (944KB, 200KB, 1380KB)
PeakWorkingSetSize 236
VirtualSize 12 Mb
PeakVirtualSize 12 Mb
PageFaultCount 228
MemoryPriority BACKGROUND
BasePriority 8
CommitCharge 829
DebugPort 82e50a60

Look at the "Working Set Sizes" line. We can see the process working set size. At present, the tlist process has been allocated 236 pages of memory at 4K each. Multiply 236 * 4 and you'll get 944K. So, at the initial BP we see that tlist is "using" 944K of memory.
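The working set arithmetic above is easy to script if you save the !process output to a text file. This is a rough sketch, assuming the 4K page size stated above and the exact line layout shown in this post. The function name is my own and not part of the debugging tools.

```python
import re

PAGE_SIZE_KB = 4  # 4K pages, as stated in the text above


def working_set_pages(process_output):
    """Return (now, min, max) working set sizes in pages from !process text."""
    match = re.search(
        r"Working Set Sizes \(now,min,max\)\s+\((\d+),\s*(\d+),\s*(\d+)\)",
        process_output,
    )
    if match is None:
        raise ValueError("no 'Working Set Sizes' line found")
    return tuple(int(value) for value in match.groups())


sample = "Working Set Sizes (now,min,max) (236, 50, 345) (944KB, 200KB, 1380KB)"
now, minimum, maximum = working_set_pages(sample)
print(now * PAGE_SIZE_KB, "KB")  # 236 pages * 4 KB = 944 KB
```

The result agrees with the 944KB figure the debugger prints in the same line, which is a handy sanity check on the page size assumption.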

In the CDB window, if you tell it to 'go' by typing 'g' we'll see what happens to memory usage.

0:000> g
System Process (0)
System (4)
smss.exe (604)
csrss.exe (652)
winlogon.exe (676)
services.exe (720)
svchost.exe (904)
svchost.exe (1004)
svchost.exe (1116)
wuauclt.exe (3272)
svchost.exe (1232)
svchost.exe (1412)
ccSetMgr.exe (1608)
ccEvtMgr.exe (1656)
SPBBCSvc.exe (1772)
spoolsv.exe (428)
DefWatch.exe (332)
Rtvscan.exe (1100) Scan
VMwareService.exe (1384)
alg.exe (1976)
svchost.exe (1916)
wmiapsrv.exe (1444)
lsass.exe (740)
explorer.exe (1528) Program Manager
VMwareTray.exe (632)
VMwareUser.exe (940)
ccApp.exe (944)
VPTray.exe (816) Missing Virus Definitions
PicasaMediaDetector.exe (936) Picasa Media Detector
taskmgr.exe (3252) Windows Task Manager
cmd.exe (3904) Command Prompt - cdb.exe -y
/symbols -o tlist -t
windbg.exe (3500) Windows Driver Kit: Debugging Tools
cdb.exe (3496)
tlist.exe (124)
mmc.exe (3216) Performance
hh.exe (4092) Windows Driver Kit: Debugging Tools
eax=0300b7f8 ebx=00000000 ecx=002643e8 edx=00260608 esi=7c90e88e edi=00000000
eip=7c90eb94 esp=0006fe44 ebp=0006ff40 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
7c90eb94 c3 ret

Groovy, now we get our tlist output and the tool is done running. However, since we haven't freed the process from the debugger, we have a thread in wait state, which means we can figure out how much memory our process actually consumed:

Key in on the same process as before in windbg and you'll see the following:
lkd> !process 82b7c020
PROCESS 82b7c020 SessionId: 0 Cid: 007c Peb: 7ffd5000 ParentCid: 0da8
DirBase: 03660360 ObjectTable: e12f6a48 HandleCount: 22.
Image: tlist.exe
VadRoot 82bc3840 Vads 36 Clone 0 Private 380. Modified 0. Locked 0.
DeviceMap e1b2efd0
Token e1165d48
ElapsedTime 00:12:59.265
UserTime 00:00:00.015
KernelTime 00:00:00.140
QuotaPoolUsage[PagedPool] 38236
QuotaPoolUsage[NonPagedPool] 1440
Working Set Sizes (now,min,max) (771, 50, 345) (3084KB, 200KB, 1380KB)
PeakWorkingSetSize 771
VirtualSize 18 Mb
PeakVirtualSize 19 Mb
PageFaultCount 765
MemoryPriority BACKGROUND
BasePriority 8
CommitCharge 1120
DebugPort 82e50a60

So, we see 3084K has been "used", but can that possibly be accurate? The answer of course is no. There are several factors at play.

1) This was run from a debugger - this adds memory allocated to the process to account for debugging.

2) There is shared memory being used by other DLLs.

3) The working set is not representative of memory actually used by the process. It's representative of the memory (shared, virtual, physical) allocated to the process, but not necessarily that which is consumed (or resident). In addition, Microsoft's documentation on MSDN is inconsistent (surprise, surprise) in describing what's actually included in the working set.

4) The working set is what is recorded by the task manager, which is inaccurate.

So, where does this leave us? Well, in terms of precision and accuracy for determining memory consumption for a process, the working set is not the answer because the allocation hasn't necessarily been consumed. Maybe the private bytes associated with a process are what we need to focus on. But the real question, and this has yet to be answered, is: what's good enough? I'm thinking that maybe the working set size is good enough, but not necessarily precise.
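To compare the candidates side by side, here is a small sketch that diffs the two !process snapshots from this post. The numbers are copied from the output above. Treating Private and CommitCharge as page counts at 4K each is my assumption, and the dictionary keys are just labels I picked.

```python
PAGE_SIZE_KB = 4  # same 4K page size used for the working set above

# Counters copied from the two !process outputs in this post.
before = {"working_set": 236, "private": 46, "commit_charge": 829, "page_faults": 228}
after = {"working_set": 771, "private": 380, "commit_charge": 1120, "page_faults": 765}


def growth(first, second):
    """Return how much each counter grew between two snapshots."""
    return {name: second[name] - first[name] for name in first}


delta = growth(before, after)
for name in ("working_set", "private", "commit_charge"):
    print(f"{name}: +{delta[name]} pages ({delta[name] * PAGE_SIZE_KB} KB)")
print(f"page_faults: +{delta['page_faults']}")
```

Under those assumptions the working set grew by 535 pages while the private page count grew by 334, which illustrates why the choice of counter matters. Both snapshots were taken under the debugger, so the caveats in the list above still apply.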

Next steps? Boot a system in debug mode and attach a kernel debugger to it while it's live.


EDITS: the url is download, not downloads.

Tuesday, July 10, 2007

New ACPO guidelines

The ACPO released their new guidelines recently. I really have to hand it to the ACPO for reviewing their own guidelines on a regular basis and keeping up with new techniques and technology.

Of particular interest in the document is the Network Forensics and Volatile data section.

"By profiling the forensic footprint of
trusted volatile data forensic tools, an investigator will be
in a position to understand the impact of using such tools
and will therefore consider this during the investigation
and when presenting evidence."

It's about time that this statement was made in an official manner. While I'm in the process of actually defining a methodology to run these tests, it's really nice to see this.

It was also important for the ACPO to include the note about the Trojan Defense.

"Considering a potential Trojan defence, investigators
should consider collecting volatile evidence. Very often,
this volatile data can be used to help an investigator
support or refute the presence of an active backdoor."

Great inclusion!

The ACPO seems to hint at the requirement to add network forensics procedures due to trojan defense claims and the apparently large amount of claims that "the trojan did it".

Any of you fine folks across the pond have a metric for the number of claims?
It would also seem as if Network Forensics will be a major focal point in the upcoming months for many investigators.

I have to commend the ACPO for releasing these guidelines, it's a great resource.

A peer review - DOJ methodology

Back in April, after I posted on peer reviews and how no one shared their methodologies I was a little surprised that the responses were few. However, Ovie from Cyberspeak decided to send me something he picked up from the DOJ. Thanks Ovie!

First, here's Ovie's disclaimer.

Again, this is not a cyberspeak product it was produced by the Department of Justice, Computer Crime and Intellectual Property Section and they said it had no copyright so people were free to use it however they want.

So, here is the methodology:

Look at all the purdy colors! I must assume first off that there is a glossary for many of the terms listed in the flowchart because this is a very high level overview of the process.

From the get-go there is one thing missing. Validation of the hardware to be used by the examiner. Similar to calibration of the radar detectors before you go out and try to get someone for speeding. There's a block that says "setup forensic hardware" so maybe it's actually buried there but I don't see it.

There's also no mention of scanning for malware. While this isn't foolproof, it's a must for any analysis procedure in my opinion.

I personally don't care for the use of the word triage in this methodology. It just doesn't fit with the section it's listed under. I'd say "data identification/processing" rather than triage. There's really no triage happening here. If someone wanted to add something reminiscent of triage to this phase, they should add a method of prioritization of forensic data sources to be analyzed. In fact, adding that would fit with the arrow along the bottom where ROI diminishes once enough evidence is obtained for prosecution. Prioritizing would meld nicely here.

Data Analysis: Again, there is no mention of scanning for malware.

What I find really interesting is in the Data Search Lead List. There's mention of attempting to reconstruct an environment or database to mimic the original. Kudos to the DOJ for acknowledging the power of reconstruction!

This document provides a really great overview of the forensics process, though it raises a lot of questions about the guts of the process. Still, I'm really happy that Ovie decided to send this along. This is the kind of stuff we need to start sharing if we're ever to narrow the gap that divides this industry and holds it back. If anyone else wants to send something to me, I'd be happy to take a look and send you my feedback. If you have a step-by-step, I'll even run it through a validation process.

For you Binary geeks....
Have a look at the bottom of the ROI arrow. There's something interesting in there..


When my son was born I decided it would be a good time to organize all of the photos I'd taken to date. I had experience with Picasa as a blogger, since picasaweb is associated with this blog, that is to say all of the images posted here have been indexed on picasaweb.

I went ahead and downloaded picasa. Installation was quick and right after install, picasa wants to scan your entire hard drive for images so it can organize them. Cool! I was amazed at what was found on my computer in that I had forgotten about the presentation material I'd created a few years ago.

Anyways, I figured it would be worth it to run an analysis of picasa and share what I've found so far.

File Locations:
C:\Documents and Settings\USER\Local Settings\Application Data\Google\


Registry Locations:




"lastmypics"="\\serutciP yM\\stnemucoD yM\\ylfgoh\\sgnitteS dna stnemucoD\\:C"
"lastmydocs"="\\stnemucoD yM\\ylfgoh\\sgnitteS dna stnemucoD\\:C"
"lastdesktop"="\\potkseD\\ylfgoh\\sgnitteS dna stnemucoD\\:C"
"Picasa Notifier"="rect(779 426 800 475)"
"mainwinpos"="rect(0 0 800 574)"






"AppPath"="C:\\Program Files\\Picasa2\\Picasa2.exe"



Ok, there's a lot there in the registry, and I'll start with preferences. Most of it's pretty straightforward, except for the paths that are reversed (what in the world is that all about?). Of particular value is the last album selected. This value gets updated when picasa closes. It appears to be an MD5sum - probably of the date/time and name of the album combined. However, there's an easier way to determine what the value is.
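The reversed paths are at least trivial to put back. A quick sketch in Python, assuming the value is copied from a .reg export where backslashes are doubled (as in the listing above). The function name is made up for this example.

```python
def decode_reversed_path(reg_value):
    """Reverse a Picasa registry path and collapse .reg-style doubled backslashes."""
    return reg_value[::-1].replace("\\\\", "\\")


# Raw string, exactly as it appears in the registry export above.
lastmypics = r"\\serutciP yM\\stnemucoD yM\\ylfgoh\\sgnitteS dna stnemucoD\\:C"
print(decode_reversed_path(lastmypics))
```

Reversing the doubled backslashes is harmless because a pair of backslashes reads the same in both directions, so the unescape step can safely run after the reversal.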

In the db2 directory there are a number of files. The important files for this are albumdata_token.pmp, albumdata_uid.pmp and albumdata_name.pmp

Here are the contents of the files:
albumdata_name.pmp - this is the name of the albums in picasa. The first two are defaults and are not included in any of the other files.
Starred Photos
Sample Pictures

albumdata_uid.pmp - This is where the hashes are.


albumdata_token.pmp - Here we see the uid applied to create a token for the albums. Note that "star" and "screensaver" do not have uids.


Now, if we look at the lastalbumselected value in the registry, we can pair it up to the hash since these files are all listed in the same order. If you exclude star and screensaver you can see that the lastalbumselected for me was sam3.

You can even go one step further if you include albumdata_filename.pmp. This file also matches up to the other files, except I forgot to mention one thing. "root" is literally the root of the logical drive that picasa searched(in this case C:), so it is excluded from albumdata_filename.pmp. This file contains the path to where the images are stored.

Other files to pay attention to:

These all follow the good old thumbs.db structure and contain thumbnails of all of the images at various resolutions, since picasa can send files directly to photo processing businesses.

One other thing is of pretty vital importance in terms of proving that someone created an album and that the program didn't just index something.

In the Picasa2Albums directory you'll see a file for each of the album(s) created by the user under the folder using the DBID as its name. Below are the contents of the album I created stored in a file named c332f1814ff6d4f21dbb41b41149544d.pal.
* I had to remove the leading bracket as blogger wanted to parse it *

'property name="uid" type="string" value="c332f1814ff6d4f21dbb41b41149544d">
'property name="category" type="num" value="0">
'property name="date" type="real64" value="39272.630035">
'property name="token" type="string" value="]album:c332f1814ff6d4f21dbb41b41149544d">
'property name="name" type="string" value="Sammy">

It's pretty self explanatory. You can see the DBID (happens to match the updateid, so that's good), and you see the albumid which, as I've already shown, ties to Sammy. One oddity is the Real64 date value. Real64 is a float data type using 64 bits, but I've never run across it for a date stamp before. If anyone wants to take a shot at converting that to a date, go ahead and let me know how you did it.
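One candidate interpretation, offered as an assumption and not confirmed by anything in the file itself, is that the value is an OLE Automation style date: a count of days since 30 December 1899, with the fraction representing the time of day. A short sketch to test that idea:

```python
from datetime import datetime, timedelta

# Assumption: the Real64 value counts days since the OLE Automation epoch.
OLE_EPOCH = datetime(1899, 12, 30)


def real64_to_datetime(value):
    """Interpret a float as whole days plus a day fraction since 1899-12-30."""
    return OLE_EPOCH + timedelta(days=value)


album_date = real64_to_datetime(39272.630035)
print(album_date.strftime("%Y-%m-%d %H:%M:%S"))  # 2007-07-09 15:07:15
```

Under that assumption the value decodes to the afternoon of 9 July 2007, which is at least plausible for an album created around the time of this post. Whether the time of day is local time or UTC is not something this sketch can tell you. That would need a test album created at a known time.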

Other files of interest:
thumbindex.tid - This is an index of all images and paths to them.
wordhash.dat - Picasa indexes as much metadata as it can, so that you can search based off of it. This file contains all parsed text from the images and their metadata.
repository.dat - This is a description of the database format used by picasa.

Friday, July 6, 2007

Sansa MP3 player - C250

I recently collected a sansa mp3 player - model c250 from a friend and figured I'd share what I've found and some of the tests I ran on it.

First, I grabbed an image of the device by attaching it to my Tableau T8 Forensic USB bridge. These are great if you don't already have one.

Once connected, I fired up FTK imager to grab an image of the device. In 650 MB chunks, that's 3 files, not a big deal. I tend to like collecting in cd size chunks as a best practice, because if there are bad sectors or other forms of corrupted data, I've only hopefully lost at most 650MB. However, for analysis I like to work off of a monolithic image. On windows, combining files can be done a few ways but I like to use UnxUtils, and cat (as if I were on my linux box).

It's just a simple matter to cat the files together.
cat sansamp3player.00* >> sansa_mp3player.dd

A monolithic file tends to work better for me when using on with the show.

Immediately I notice 2 partitions: one is FAT16 and is 1.9GB in size, and the other is ~20MB and is unrecognized by Winhex, but its identifier is 84, which according to documentation is for "suspend to disk" or hibernation partitions. This by itself has spawned more thoughts than I'll put in this entry but suffice it to say that "suspend to disk" and "suspend to ram" deserve someone spending some time to look in to them.

One question you might ask is why they would use partition type 84 for an MP3 player. Well, my best guess is that due to the constant power state changes of the device, it needs the ability to remember how it was configured before you turned the device off and it needs to turn on quickly - both of which are provided by suspending to disk.

The ~20MB partition appears to contain the primary_bootloader for the device and the sansa firmware. After a little searching I came across this site which provides a lot of information on the sansa mp3 players. The firmware for these devices according to Daniel's site is stored as an encrypted .mi4 file in the hibernation partition. So, how do we get access? I used sleuthkit and dd..

hogfly@loki:~$ /usr/local/sleuthkit/bin/mmls sansa_mp3player.dd
DOS Partition Table
Offset Sector: 0
Units are in 512-byte sectors

Slot Start End Length Description
00: ----- 0000000000 0000000000 0000000001 Primary Table (#0)
01: ----- 0000000001 0000000514 0000000514 Unallocated
02: 00:00 0000000515 0003885055 0003884541 DOS FAT16 (0x06)
03: 00:01 0003885056 0003926015 0000040960 Hibernation (0x84)

Ok, so we now see where the hibernation partition begins (sector 3885056) and can extract this partition using dd.

hogfly@loki:~$ dd if=sansa_mp3player.dd bs=512 skip=3885056 of=hibernat.dd
40960+0 records in
40960+0 records out
20971520 bytes (21 MB) copied, 0.721501 seconds, 29.1 MB/s
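
The dd step is just offset arithmetic: seek to start sector times 512 bytes and copy length times 512 bytes. For the hibernation partition the length is 40960 x 512 = 20971520 bytes, which is the byte count dd reports. A rough Python equivalent, with a hypothetical function name and an explicit sector count (the dd command above simply reads to the end of the image):

```python
SECTOR_SIZE = 512


def extract_sectors(image_path, output_path, start_sector, sector_count,
                    chunk_sectors=2048):
    """Copy sector_count sectors starting at start_sector; return bytes written."""
    remaining = sector_count * SECTOR_SIZE
    written = 0
    with open(image_path, "rb") as src, open(output_path, "wb") as dst:
        src.seek(start_sector * SECTOR_SIZE)   # byte offset of the partition
        while remaining > 0:
            chunk = src.read(min(remaining, chunk_sectors * SECTOR_SIZE))
            if not chunk:                      # image ended early
                break
            dst.write(chunk)
            written += len(chunk)
            remaining -= len(chunk)
    return written
```

For this image the call would be `extract_sectors("sansa_mp3player.dd", "hibernat.dd", 3885056, 40960)`. A return value smaller than the expected byte count means the image is truncated.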

Now, I've got the hibernation file system extracted so I can experiment with Daniel's code.

I downloaded cutit.c and mi4code.c and compiled like so:
gcc -o mi4code mi4code.c -lgcrypt
gcc -o cutit cutit.c

Now it's a matter of execution..

hogfly@loki:~$ ./cutit hibernat.dd firm.mi4
seek done
firmware size: 3221504 bytes
Wrote 16384 bytes
[repeated many more times]
operation complete

firm.mi4 is a copy of the firmware used by the sansa c250, now extracted from the image. At this point I could go ahead with the mi4code binary and begin an in-depth examination of the firmware, but I'm holding off on that because, for this, it serves no purpose other than perhaps validation.

On to the data partition...
The file structure as I found the device was as follows (this is really long by the way) and was generated by tree:

| version.sdk
| \---0
| +---0
| | +---0
| | +---1
| | +---2
| | +---3
| | +---4
| | | 00000044.DAT
| | | 0000004A.DAT
| | | 0000004F.DAT
| | |
| | +---5
| | | 00000054.DAT
| | | 00000059.DAT
| | | 0000005F.DAT
| | |
| | +---6
| | | 00000060.DAT
| | | 00000061.DAT
| | | 00000062.DAT
| | | 00000063.DAT
| | | 00000064.DAT
| | | 00000065.DAT
| | | 00000066.DAT
| | | 00000067.DAT
| | | 00000068.DAT
| | | 00000069.DAT
| | |
| | +---7
| | | 00000070.DAT
| | | 00000071.DAT
| | |
| | +---8
| | +---9
| | +---A
| | +---B
| | +---C
| | +---D
| | +---E
| | \---F
| +---1
| | +---0
| | +---1
| | +---2
| | +---3
| | +---4
| | +---5
| | +---6
| | +---7
| | +---8
| | +---9
| | +---A
| | +---B
| | +---C
| | | 000001CE.DAT
| | | 000001CF.DAT
| | |
| | +---D
| | | 000001D0.DAT
| | | 000001D1.DAT
| | | 000001D2.DAT
| | | 000001D3.DAT
| | | 000001D4.DAT
| | | 000001D5.DAT
| | | 000001D6.DAT
| | | 000001D7.DAT
| | | 000001D8.DAT
| | | 000001D9.DAT
| | | 000001DA.DAT
| | | 000001DB.DAT
| | | 000001DC.DAT
| | | 000001DD.DAT
| | | 000001DE.DAT
| | | 000001DF.DAT
| | |
| | +---E
| | | 000001E0.DAT
| | | 000001E1.DAT
| | | 000001E2.DAT
| | | 000001E3.DAT
| | | 000001E4.DAT
| | | 000001E5.DAT
| | | 000001E6.DAT
| | | 000001E7.DAT
| | | 000001E8.DAT
| | | 000001E9.DAT
| | | 000001EA.DAT
| | | 000001EB.DAT
| | | 000001EC.DAT
| | | 000001ED.DAT
| | | 000001EE.DAT
| | | 000001EF.DAT
| | |
| | \---F
| | 000001F0.DAT
| | 000001F1.DAT
| | 000001F2.DAT
| | 000001F3.DAT
| | 000001F4.DAT
| | 000001F5.DAT
| | 000001F6.DAT
| | 000001F7.DAT
| | 000001F8.DAT
| | 000001F9.DAT
| | 000001FA.DAT
| | 000001FB.DAT
| | 000001FC.DAT
| | 000001FD.DAT
| | 000001FE.DAT
| | 000001FF.DAT
| |
| +---2
| | +---0
| | | 00000200.DAT
| | | 00000201.DAT
| | | 00000202.DAT
| | | 00000203.DAT
| | | 00000204.DAT
| | | 00000205.DAT
| | | 00000206.DAT
| | | 00000207.DAT
| | | 00000208.DAT
| | | 00000209.DAT
| | | 0000020A.DAT
| | | 0000020B.DAT
| | | 0000020C.DAT
| | | 0000020D.DAT
| | | 0000020E.DAT
| | | 0000020F.DAT
| | |
| | +---1
| | | 00000210.DAT
| | | 00000211.DAT
| | | 00000212.DAT
| | | 00000213.DAT
| | | 00000214.DAT
| | | 00000215.DAT
| | | 00000216.DAT
| | | 00000217.DAT
| | | 00000218.DAT
| | | 00000219.DAT
| | | 0000021A.DAT
| | | 0000021B.DAT
| | | 0000021C.DAT
| | | 0000021D.DAT
| | | 0000021E.DAT
| | | 0000021F.DAT
| | |
| | +---2
| | | 00000220.DAT
| | | 00000221.DAT
| | | 00000222.DAT
| | | 00000223.DAT
| | | 00000224.DAT
| | | 00000225.DAT
| | | 00000226.DAT
| | | 00000227.DAT
| | | 00000228.DAT
| | | 00000229.DAT
| | | 0000022A.DAT
| | | 0000022B.DAT
| | | 0000022C.DAT
| | | 0000022D.DAT
| | | 0000022E.DAT
| | | 0000022F.DAT
| | |
| | +---3
| | | 00000230.DAT
| | | 00000231.DAT
| | | 00000232.DAT
| | | 00000233.DAT
| | | 00000234.DAT
| | | 00000235.DAT
| | | 00000236.DAT
| | | 00000237.DAT
| | | 00000238.DAT
| | | 00000239.DAT
| | | 0000023A.DAT
| | | 0000023B.DAT
| | | 0000023C.DAT
| | | 0000023D.DAT
| | | 0000023E.DAT
| | | 0000023F.DAT
| | |
| | +---4
| | | 00000240.DAT
| | | 00000241.DAT
| | | 00000242.DAT
| | | 00000243.DAT
| | | 00000244.DAT
| | | 00000245.DAT
| | | 00000246.DAT
| | | 00000247.DAT
| | | 00000248.DAT
| | | 00000249.DAT
| | | 0000024A.DAT
| | | 0000024B.DAT
| | | 0000024C.DAT
| | | 0000024D.DAT
| | | 0000024E.DAT
| | | 0000024F.DAT
| | |
| | +---5
| | | 00000250.DAT
| | | 00000251.DAT
| | | 00000252.DAT
| | | 00000253.DAT
| | | 00000254.DAT
| | | 00000255.DAT
| | | 00000256.DAT
| | | 00000257.DAT
| | | 00000258.DAT
| | | 00000259.DAT
| | | 0000025A.DAT
| | | 0000025B.DAT
| | | 0000025C.DAT
| | | 0000025D.DAT
| | | 0000025E.DAT
| | | 0000025F.DAT
| | |
| | +---6
| | | 00000260.DAT
| | | 00000261.DAT
| | | 00000262.DAT
| | | 00000263.DAT
| | | 00000264.DAT
| | | 00000265.DAT
| | | 00000266.DAT
| | | 00000267.DAT
| | | 00000268.DAT
| | | 00000269.DAT
| | | 0000026A.DAT
| | | 0000026B.DAT
| | | 0000026C.DAT
| | | 0000026D.DAT
| | | 0000026E.DAT
| | | 0000026F.DAT
| | |
| | +---7
| | | 00000270.DAT
| | | 00000271.DAT
| | | 00000272.DAT
| | | 00000273.DAT
| | | 00000274.DAT
| | | 00000275.DAT
| | | 00000276.DAT
| | | 00000277.DAT
| | | 00000278.DAT
| | | 00000279.DAT
| | | 0000027A.DAT
| | | 0000027B.DAT
| | | 0000027C.DAT
| | | 0000027D.DAT
| | | 0000027E.DAT
| | | 0000027F.DAT
| | |
| | +---8
| | | 00000280.DAT
| | | 00000281.DAT
| | | 00000282.DAT
| | | 00000283.DAT
| | | 00000284.DAT
| | | 00000285.DAT
| | | 00000286.DAT
| | | 00000287.DAT
| | | 00000288.DAT
| | | 00000289.DAT
| | | 0000028A.DAT
| | | 0000028B.DAT
| | | 0000028C.DAT
| | | 0000028D.DAT
| | | 0000028E.DAT
| | | 0000028F.DAT
| | |
| | +---9
| | | 00000290.DAT
| | | 00000291.DAT
| | | 00000292.DAT
| | | 00000293.DAT
| | | 00000294.DAT
| | | 00000297.DAT
| | | 00000298.DAT
| | | 00000299.DAT
| | | 0000029A.DAT
| | | 0000029B.DAT
| | | 0000029C.DAT
| | | 0000029D.DAT
| | | 0000029E.DAT
| | | 0000029F.DAT
| | |
| | +---A
| | | 000002A0.DAT
| | | 000002A1.DAT
| | | 000002A2.DAT
| | | 000002A3.DAT
| | | 000002A4.DAT
| | |
| | +---B
| | | 000002B2.DAT
| | | 000002B3.DAT
| | | 000002B4.DAT
| | | 000002B5.DAT
| | | 000002B6.DAT
| | | 000002B7.DAT
| | | 000002B8.DAT
| | | 000002B9.DAT
| | | 000002BB.DAT
| | | 000002BC.DAT
| | | 000002BD.DAT
| | | 000002BE.DAT
| | | 000002BF.DAT
| | |
| | +---C
| | | 000002C0.DAT
| | | 000002C1.DAT
| | | 000002C2.DAT
| | | 000002C3.DAT
| | | 000002C4.DAT
| | | 000002C5.DAT
| | | 000002C6.DAT
| | | 000002C7.DAT
| | | 000002C8.DAT
| | | 000002C9.DAT
| | | 000002CA.DAT
| | | 000002CB.DAT
| | | 000002CC.DAT
| | | 000002CD.DAT
| | | 000002CE.DAT
| | | 000002CF.DAT
| | |
| | +---D
| | | 000002D0.DAT
| | | 000002D1.DAT
| | | 000002D2.DAT
| | | 000002D3.DAT
| | | 000002D4.DAT
| | | 000002D5.DAT
| | | 000002D6.DAT
| | | 000002D7.DAT
| | | 000002D8.DAT
| | | 000002D9.DAT
| | | 000002DA.DAT
| | | 000002DB.DAT
| | | 000002DC.DAT
| | | 000002DD.DAT
| | | 000002DF.DAT
| | |
| | +---E
| | | 000002E0.DAT
| | | 000002E1.DAT
| | | 000002E2.DAT
| | | 000002E3.DAT
| | | 000002E4.DAT
| | | 000002E5.DAT
| | | 000002E6.DAT
| | | 000002E7.DAT
| | | 000002E8.DAT
| | | 000002E9.DAT
| | | 000002EA.DAT
| | | 000002EB.DAT
| | |
| | \---F
| | 000002FB.DAT
| | 000002FC.DAT
| | 000002FD.DAT
| | 000002FE.DAT
| | 000002FF.DAT
| |
| \---3
| +---0
| | 00000300.DAT
| | 00000301.DAT
| | 00000302.DAT
| | 00000303.DAT
| | 00000304.DAT
| | 00000305.DAT
| | 00000306.DAT
| | 00000308.DAT
| | 00000309.DAT
| | 0000030A.DAT
| | 0000030B.DAT
| | 0000030C.DAT
| | 0000030D.DAT
| | 0000030E.DAT
| | 0000030F.DAT
| |
| +---1
| | 00000310.DAT
| | 00000311.DAT
| | 00000312.DAT
| | 00000313.DAT
| | 00000314.DAT
| | 00000315.DAT
| | 00000318.DAT
| | 0000031B.DAT
| | 0000031C.DAT
| | 0000031D.DAT
| | 0000031F.DAT
| |
| +---2
| | 00000320.DAT
| | 00000321.DAT
| | 00000322.DAT
| | 00000323.DAT
| | 00000324.DAT
| | 00000325.DAT
| | 00000326.DAT
| | 00000327.DAT
| | 00000328.DAT
| | 00000329.DAT
| | 0000032A.DAT
| | 0000032B.DAT
| | 0000032C.DAT
| | 0000032D.DAT
| | 0000032E.DAT
| | 0000032F.DAT
| |
| +---3
| | 00000330.DAT
| | 00000331.DAT
| | 00000332.DAT
| | 00000333.DAT
| | 00000334.DAT
| | 00000335.DAT
| | 00000336.DAT
| | 00000337.DAT
| | 00000338.DAT
| | 00000339.DAT
| | 0000033A.DAT
| | 0000033B.DAT
| | 0000033C.DAT
| | 0000033D.DAT
| | 0000033E.DAT
| | 0000033F.DAT
| |
| +---4
| | 00000340.DAT
| | 00000342.DAT
| | 00000343.DAT
| | 00000344.DAT
| | 00000345.DAT
| | 00000346.DAT
| | 00000347.DAT
| | 00000348.DAT
| | 00000349.DAT
| | 0000034A.DAT
| | 0000034C.DAT
| | 0000034D.DAT
| | 0000034E.DAT
| | 0000034F.DAT
| |
| +---5
| | 00000350.DAT
| | 00000351.DAT
| | 00000352.DAT
| | 00000353.DAT
| | 00000354.DAT
| | 00000355.DAT
| | 00000356.DAT
| | 00000357.DAT
| | 00000358.DAT
| | 0000035A.DAT
| | 0000035B.DAT
| | 0000035C.DAT
| | 0000035D.DAT
| | 0000035E.DAT
| | 0000035F.DAT
| |
| +---6
| | 00000360.DAT
| | 00000361.DAT
| | 00000362.DAT
| | 00000363.DAT
| | 00000364.DAT
| | 00000365.DAT
| | 00000366.DAT
| | 00000367.DAT
| | 00000368.DAT
| | 00000369.DAT
| | 0000036A.DAT
| | 0000036B.DAT
| | 0000036C.DAT
| | 0000036D.DAT
| | 0000036E.DAT
| | 0000036F.DAT
| |
| +---7
| | 00000370.DAT
| | 00000371.DAT
| | 00000372.DAT
| | 00000373.DAT
| | 00000374.DAT
| | 00000375.DAT
| | 00000376.DAT
| | 00000377.DAT
| | 00000378.DAT
| | 0000037A.DAT
| | 0000037B.DAT
| | 0000037C.DAT
| | 0000037D.DAT
| | 0000037E.DAT
| | 0000037F.DAT
| |
| +---8
| | 00000380.DAT
| | 00000381.DAT
| | 00000382.DAT
| | 00000383.DAT
| | 00000384.DAT
| | 00000385.DAT
| | 00000386.DAT
| | 00000387.DAT
| | 00000388.DAT
| |
| +---9
| | 00000390.DAT
| | 00000392.DAT
| | 00000393.DAT
| | 00000394.DAT
| | 00000395.DAT
| | 00000396.DAT
| | 00000397.DAT
| | 00000398.DAT
| | 00000399.DAT
| | 0000039A.DAT
| | 0000039B.DAT
| | 0000039C.DAT
| | 0000039D.DAT
| | 0000039E.DAT
| | 0000039F.DAT
| |
| +---A
| | 000003A0.DAT
| | 000003A1.DAT
| | 000003A2.DAT
| | 000003A3.DAT
| | 000003A4.DAT
| | 000003A5.DAT
| | 000003A6.DAT
| | 000003A7.DAT
| | 000003A8.DAT
| | 000003A9.DAT
| | 000003AA.DAT
| | 000003AB.DAT
| | 000003AC.DAT
| | 000003AD.DAT
| | 000003AE.DAT
| | 000003AF.DAT
| |
| +---B
| | 000003B0.DAT
| | 000003B1.DAT
| | 000003B2.DAT
| | 000003B3.DAT
| | 000003B4.DAT
| | 000003B5.DAT
| | 000003B6.DAT
| | 000003B7.DAT
| | 000003B8.DAT
| | 000003B9.DAT
| | 000003BA.DAT
| | 000003BB.DAT
| | 000003BC.DAT
| | 000003BD.DAT
| | 000003BE.DAT
| | 000003BF.DAT
| |
| \---C
| 000003C0.DAT
| 000003C1.DAT
| 000003C2.DAT
| 000003C3.DAT
| 000003C4.DAT
| 000003C5.DAT
| 000003C6.DAT
| 000003C7.DAT

Well, as you can see it holds a lot of stuff for only 2GB!
First, what's important?

Everything is listed as a .DAT file, even though they really aren't. There are .jpg, .mp3, .wav and .wma files all listed there, so you want to run them through a file type identifier. This behavior appears to be tied to media players and playlists, although I've not been able to replicate it yet. Sandisk also uses a media converter program to get photos to the device.
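
If no file type identifier is at hand, a few well-known signatures cover the formats mentioned here. The sketch below is not a replacement for a real identifier: the function names are made up, only the first bytes are checked, and the bare MPEG frame-sync test in particular is a weak heuristic that can misfire.

```python
# ASF header object GUID, found at the start of .wma/.asf files
ASF_GUID = bytes.fromhex("3026B2758E66CF11A6D900AA0062CE6C")


def sniff_type(header: bytes) -> str:
    """Guess a media type from the first bytes of a file."""
    if header.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if header.startswith(b"ID3"):
        return "mp3"                          # MP3 with an ID3v2 tag
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header.startswith(ASF_GUID):
        return "asf/wma"
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"                          # MPEG audio frame sync, no tag
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 16 bytes of a file and guess its type."""
    with open(path, "rb") as f:
        return sniff_type(f.read(16))
```

Walking the numbered directories with `os.walk` and calling `sniff_file` on each .DAT gives a quick breakdown of what is really on the device.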

The Data directory contains .pdb files, each of which corresponds to a menu in the device for music identification. Mainly they are used for organization and song recognition. I've tried a number of Palm database dumping utilities to no avail on all but one .pdb file.
Object.PDB seems to be where file information gets stored for every file on the device. I downloaded palmdump and ran it against the Object.PDB file.

C:\temp\palmdump>palmdump.exe j:\SYSTEM\DATA\OBJECT.PDB > dump.txt
Database name: 
Flags: 0x400
Version: 0x0
Creation time: PC/Unix time: Wed Dec 31 19:00:00 1969
Modification time: PC/Unix time: Mon Aug 28 17:41:20 1972
Backup time: Never
Modification number: 0
Application Info offset: 0
Sort Info offset: 0
Unique ID: 0
Next record ID: 0
Number of records: 1536

Looks like a bunch of meaningless data, and the times are incorrect. I have to wonder if sandisk has done something a little different with their databases and palmdump and other tools just can't decode it properly. I'm not a palm programmer by any means but if anyone wants to shed some light on this, please do.
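
For comparison, a standard Palm database starts with a 78-byte big-endian header, and reading it by hand is a quick way to check what a dumper is reporting. The sketch below assumes OBJECT.PDB follows that standard layout, which is not confirmed for this device. The function names are hypothetical, and the 1904 epoch in palm_time is the usual Palm convention rather than something verified against this file.

```python
import struct
from datetime import datetime, timedelta

# Standard PDB header: name, attributes, version, three timestamps,
# modification number, appInfo/sortInfo offsets, type, creator,
# unique ID seed, next record list ID, number of records.
PDB_HEADER = struct.Struct(">32sHHIIIIII4s4sIIH")   # 78 bytes
PALM_EPOCH = datetime(1904, 1, 1)


def palm_time(seconds: int) -> datetime:
    """Interpret a timestamp as seconds since 1904-01-01 (assumed convention)."""
    return PALM_EPOCH + timedelta(seconds=seconds)


def parse_pdb_header(data: bytes) -> dict:
    """Unpack the fixed header fields from the start of a .pdb file."""
    (name, attributes, version, created, modified, backed_up,
     mod_number, app_info, sort_info, db_type, creator,
     uid_seed, next_list, num_records) = PDB_HEADER.unpack_from(data)
    return {
        "name": name.split(b"\x00", 1)[0].decode("latin-1"),
        "attributes": attributes,
        "version": version,
        "created_raw": created,
        "modified_raw": modified,
        "backed_up_raw": backed_up,
        "modification_number": mod_number,
        "app_info_offset": app_info,
        "sort_info_offset": sort_info,
        "type": db_type,
        "creator": creator,
        "num_records": num_records,
    }
```

Keeping the timestamps raw makes it easy to try both the 1904 and the Unix epoch and see whether either gives a plausible date.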

Anyways, the number of records is what was most interesting. The only record to actually contain something was record 1535. Running strings against this resulted in a complete listing of all files on the device.
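
The strings pass is easy to reproduce without the tool. A minimal sketch, assuming the names are stored as plain ASCII or as UTF-16LE (the `wide` option); which encoding the device actually uses is not established here, and the function name is made up:

```python
import re


def extract_strings(data: bytes, min_length: int = 4, wide: bool = False):
    """Return runs of printable characters at least min_length long."""
    # One "character" is a printable ASCII byte, optionally followed by 0x00
    unit = rb"(?:[\x20-\x7e]\x00)" if wide else rb"[\x20-\x7e]"
    pattern = re.compile(unit + b"{%d,}" % min_length)
    encoding = "utf-16-le" if wide else "ascii"
    return [m.group().decode(encoding) for m in pattern.finditer(data)]
```

Running it twice over the record, once with `wide=False` and once with `wide=True`, shows quickly which encoding the file names are in.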

MTPcontent - The device operates in two modes when connected, MTP and MSC (USB mass storage). These are modes of communication so the device can be seen by the host computer. This directory seems to be reserved for copy operations directly from media players on host computers.

WMDRMPD - This is, you guessed it, related to DRM. It's Windows Media DRM for Portable Devices. I don't have Media Player 10 installed, but it's said that if one were to copy files from Media Player directly to the device, they would end up here. Store.HDS is the DRM license.

One other interesting factoid... As I found the device, it contained a bunch of .DS_Store files and a ._.Trashes file. These files are a dead giveaway if you examine one of these devices: they mean that the device was at one point connected to a Mac. These are artifacts left on the device during a file copy and delete operation. Here's the header from ._.Trashes.

Offset 0 1 2 3 4 5 6 7 8 9 A B C D E F

00D9FA00 00 05 16 07 00 02 00 00 4D 61 63 20 4F 53 20 58 Mac OS X
00D9FA10 20 20 20 20 20 20 20 20 00 02 00 00 00 09 00 00
00D9FA20 00 32 00 00 0E B0 00 00 00 02 00 00 0E E2 00 00 2 ° â
00D9FA30 01 1E 00 00 00 00 00 00 00 00 40 00 00 00 00 00 @
00D9FA40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00D9FA50 00 00 00 00 41 54 54 52 3B 9A C9 FF 00 00 0E E2 ATTR;šÉÿ â
00D9FA60 00 00 00 78 00 00 00 00 00 00 00 00 00 00 00 00 x
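
The first four bytes, 00 05 16 07, are the AppleDouble magic number, and the rest of the header can be read the same way. A minimal sketch, assuming the standard AppleDouble layout (big-endian magic and version, 16 filler bytes, a 2-byte entry count, then 12-byte entries of ID, offset and length); the function name is hypothetical:

```python
import struct

APPLEDOUBLE_MAGIC = 0x00051607
ENTRY_NAMES = {2: "resource fork", 9: "Finder info"}   # the IDs seen above


def parse_appledouble(data: bytes) -> dict:
    """Parse an AppleDouble header into version, filler text and entries."""
    magic, version = struct.unpack_from(">II", data, 0)
    if magic != APPLEDOUBLE_MAGIC:
        raise ValueError("not an AppleDouble header")
    filler = data[8:24].decode("ascii", "replace").rstrip()
    (count,) = struct.unpack_from(">H", data, 24)
    # Each entry descriptor is (entry ID, offset, length), 12 bytes
    entries = [struct.unpack_from(">III", data, 26 + i * 12) for i in range(count)]
    return {"version": version, "filler": filler, "entries": entries}
```

Fed the bytes in the dump above, this reports two entries: ID 9 (Finder info) at offset 0x32 and ID 2 (resource fork) at offset 0xEE2. The "Mac OS X" filler string is a handy keyword to search for across unallocated space.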

The device has a built-in format capability which, as we all know, will "wipe" data from the device so you can re-use it. Naturally, the now unallocated data is recoverable.

After formatting, I decided to try a copy operation to the device. First I ripped my Metallica CD "And Justice For All" to MP3s using FreeRIP and copied them to the MUSIC directory. The object.pdb file was updated (the device goes through a "refresh database" cycle after you unplug it from your computer). There was nothing in the MTPcontent directories, so this further supports the notion that MTPcontent is reserved for copying music to the device through a media player of some form. **Hint: if you examine one of these devices and find files in this directory, search media player configurations on the host computer.**

There's more to play with on this device, but that's a good start I think.