Monday, July 20, 2009

The FTK 2 dilemma

So you're using FTK 2.x... Does the separate database server buy you anything? This is the question I've been asking myself for about a week now. After having good success with FTK 2 on a standalone system, I moved to a split box configuration. Per the recommendations here, I put my more powerful system in place as the Oracle database server.

In addition, I threw a quad core processor with 8GB of RAM and a handful of new SATA 2 drives into a second system. It's not a brand new system, but it meets the specs for an FTK 2 worker system.

As it turns out, and in my humble opinion, the documentation appears to be misguided for a two box configuration. Here are a few thoughts.

FTK 2 worker:

  1. The worker is truly the worker. Splitting the configuration puts the majority of the load on the worker machine. The database system simply shuffles records across the network and handles queries.
  2. The worker requires a lot of resources, especially while processing a case. During processing, the CPU/memory/disk combination kept the worker box pegged while the database server was sleeping. Pictures coming soon.
  3. The worker system not only does the heavy lifting, it also needs to manage the GUI. Try processing evidence and moving around the GUI and you'll see what I mean.

Database Server:

  1. The database server is mostly idle until you load it up with data and need to 'work' the case. Even then, it doesn't require a lot of resources. It needs to fulfill queries, and this isn't a transaction level Oracle server. It does a lot of reading at one time and a lot of writing at another.
  2. The database server only needs to meet the specs of the worker machine. It does not need to be more powerful, since the worker machine is doing the heavy lifting.
  3. The database server requires disk and memory. CPU is nice to have but it doesn't need to be dual quad cores when you only have one worker.

If doing a two box configuration, here are my worker recommendations:

CPU - Quad core or dual quad core CPU. The 9400+ series for Core 2 Quad, or if you've got the money for a new system, go with the i7. If you've really got some cash, go quad core Xeon.
RAM - At least 2GB memory for each CPU Core; 4GB/Core if you can afford it. Trust me, don't skimp on the RAM.
DISK - This is broken down into categories.
  • OS: A raid 1 works nicely here.
  • Index drives: At least a 4 drive Raid-0. This is where your indexes will be stored. These drives need to be the fastest available. 300GB WD VelociRaptors should do the trick. However, remember your storage requirements. Expect indexing to use roughly a fifth to a quarter of the total evidence set, e.g. 1TB evidence = approx. 200-250GB indexing space.
  • Image drives: When you load a case you want to put your images on a locally attached storage media. I'd go with at least a 2 drive Raid-0.
Controller - Is it me, or is Adaptec the only raid controller manufacturer that seems to be making good controllers anymore?
The Adaptec 5805 (internal) or the 5085 (external) seems to be the best controller out there for the price.
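
To put numbers on the RAM and index drive rules of thumb above, here is a minimal sketch in Python. The function names and the default index fraction are my own illustration, not anything that ships with FTK; adjust the fraction if your cases index heavier or lighter.

```python
def worker_ram_gb(cores, gb_per_core=2):
    """RAM target for the worker: 2GB per core minimum, 4GB per core if affordable."""
    return cores * gb_per_core


def index_space_gb(evidence_gb, index_fraction=0.20):
    """Estimated index size as a fraction of the evidence set.

    0.20 is the 1/5th rule of thumb; this is an assumption to tune,
    not a figure measured by the tool.
    """
    return evidence_gb * index_fraction


if __name__ == "__main__":
    # Quad core worker at the minimum and the "if you can afford it" level
    print(worker_ram_gb(4), worker_ram_gb(4, gb_per_core=4))
    # Index space for 1TB (1000GB) of evidence
    print(index_space_gb(1000))
```

Compare the index estimate against the usable size of your Raid-0 before you buy drives, and leave some headroom.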

And here are my Oracle server recommendations:

CPU - A single quad core CPU.
RAM - 2GB/core.
DISK - Two categories here.
  • OS: Use a Raid 1 here.
  • Database drives: This is a database server. The question you need to ask yourself is: Do I want redundancy? If yes, go with at least a 4 drive raid 10, if not more. If no, go with a 4+ drive raid 0. Remember your space requirements. Expect to use 10% of the size of the evidence for database storage in each case, in addition to the minimum 6GB.
Controller - Adaptec 5405 or better.
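
The database storage rule above (10% of the evidence per case, plus the 6GB minimum) and the raid 10 versus raid 0 choice can be checked with a quick sketch like the one below. The function names are hypothetical, and the capacity math is the standard textbook figure for each raid level, ignoring controller and filesystem overhead.

```python
def database_space_gb(case_evidence_gb, per_case_fraction=0.10, base_gb=6):
    """Database storage: base install plus 10% of the evidence for each case."""
    return base_gb + sum(per_case_fraction * gb for gb in case_evidence_gb)


def usable_gb(level, drives, drive_gb):
    """Usable capacity of an array of identical drives (overhead ignored)."""
    if level == "raid0":
        return drives * drive_gb        # striping only, no redundancy
    if level == "raid1":
        return drive_gb                 # every drive holds the same data
    if level == "raid10":
        if drives < 4 or drives % 2:
            raise ValueError("raid 10 needs an even number of drives, 4 or more")
        return drives * drive_gb // 2   # half the drives are mirrors
    raise ValueError("unknown raid level: " + level)


if __name__ == "__main__":
    needed = database_space_gb([1000, 500])   # two cases: 1TB and 500GB
    print(needed)
    print(usable_gb("raid10", 4, 300) >= needed)
```

The same four drives give twice the space in raid 0, so the redundancy question is really a question of how many spindles you are willing to buy.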

And if you combine the two systems into one, here are my recommendations:

CPU - Dual Quad Core Xeon
RAM - 4GB/Core
DISK - Face it, you don't have enough space internally. Even the Cosmos can be tight on space (best case ever). Get an external disk array. Addonics has some very interesting cage configuration options here. Others have done the homework to spec out their own arrays, saving $$. You'll want multilane eSATA or SAS drives. As with any I/O intensive application, you need spindles to spread the load.
  • OS: Raid 1 still works here.
  • Indexing
  • Database
  • Images
Controller - the Adaptec 5085 gives the required connections to do an external SAS or SATA array. Get the Battery Backup.

And don't forget the backups. Backup servers/devices don't need to be high powered, they need to be reliable. Get a raid 5 NAS or a bunch of disks in older hardware.

Now that hardware is out of the way let's look at the real dilemma.

Does a separate database server provide any utility when you have two computers? My response is no, it doesn't. An average case these days will be fully processed (indexed, hashed, KFF, duplicates etc.) in about 24 hours. I haven't seen any benefit in moving to two boxes; it still takes 24 hours or more. There are major drawbacks to a two box configuration as well.

  1. FTK 2 is already so heavily over-engineered that the last thing you need is added network and agent complexity. The worker loses connectivity even on a dedicated link. How does it recover from this? Does it recover everything completely?
  2. Backups on a two box system require their own whitepaper. If a GUI product requires a separate paper for backing something up when the single system backup is straightforward, there are too many variables.
  3. You now have to maintain two operating systems, two sets of hardware, twice the expense and two times as many failure modes.

My conclusion: A two box configuration makes no sense in FTK 2 as long as you only have one worker. If multiple workers were processing the case, or if the database server contributed to the processing, it might make sense, but as of FTK 2.2.1 it makes no sense whatsoever and provides no benefit.

As I was writing this, AccessData appears to have released a paper on performance statistics. Interesting how they don't even mention a two box configuration in it, and some of the nuances are left out of the document.

FTK 1.x is memory intensive. FTK 2 is resource intensive. To run FTK 2 optimally, you need to have about $10,000 (hardware + licensing). Don't have the money? Don't bother. I'll be moving the GUI and database server when I get my other system back into production.

Addendum Pictures:

The Worker while processing a case
And the database server at the same time


Anonymous said...

It would be really interesting to reverse the worker and database workstations and take new screenshots...