CaseSoft, Case Analysis Made Easy!
The Bell Curve Document
Indexing - Imaging
Introduction
Remember the "bell curve" from statistics class? The bell curve, so named because of
its shape, illustrates the frequency distribution of many phenomena, for
example, height. Measure a thousand people. For every person over 7',
you'll have a mob between 5'6" and 5'10".
Let's apply the bell curve to the document collections produced during
discovery. Out of every thousand cases, how many involve 1,000,000+
documents? 100,000+? 10,000? What does this distribution suggest
regarding strategies for imaging and searching documents?
Giant Cases – Special Tools Required
We're all familiar with cases in which millions of documents are
produced during discovery. But we've also seen individuals over 7' tall.
Both are outliers that occur infrequently. Out of every
thousand cases, only a handful have 1,000,000 or more documents.
Cases with document collections of over 100,000 are also relatively
rare. Do even a hundred cases out of every thousand involve this many
documents? Widespread use of email has dramatically increased the volume
of documents present in many cases, but it hasn't turned every case into
a document monster.
Dealing with 1,000,000+ documents or even 100,000+ justifies a
substantial investment in scanning and coding. This type of case also
demands sophisticated software tools such as Concordance, iCONECT, IPRO,
Litigator's Notebook, or Summation to assist with document indexing,
image handling, and more.
So that's the story for the giant cases lurking out in one tail of the
bell curve. But what about the cases that populate the rest of the
curve? How many documents do these cases involve? What's an appropriate
image handling and text searching solution for them?
Normal Cases – Perfect For Adobe Acrobat
Cases with very small document collections fall at the other end of the
curve. For every 1,000,000-document case, there's a case that involves a
single Redweld of documents. These cases with only a single folder or
box of documents are probably as rare as the ones with massive
quantities of documents.
Which brings us to the approximately 70% of all cases that fall into the
center area of the bell curve. My experience suggests these cases have
between 1,000 and 50,000 documents. That's a small number relative to a
gargantuan million-document case, but it's still a heap of paper: more
documents than any trial team can memorize in detail. It's certainly
a document collection that should be imaged and available in a
searchable form.
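Where does a figure like "approximately 70%" come from? On a textbook bell curve, roughly 68% of all values fall within one standard deviation of the average. The short Python sketch below simply computes that share. It assumes an idealized normal distribution; real-world document counts may well be skewed, and the 1,000 to 50,000 range above is an estimate from experience, not an output of this calculation.

```python
from statistics import NormalDist

# Idealized bell curve: mean 0, standard deviation 1 (an assumption for illustration).
bell = NormalDist(mu=0.0, sigma=1.0)


def share_within(k: float) -> float:
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return bell.cdf(k) - bell.cdf(-k)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

The center band (one standard deviation either side of the average) holds about two-thirds of the curve, which is why the "middle of the bell" is where most cases live.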
If your firm has one of the excellent products mentioned above, it can
definitely be put to work on smaller matters as well. However, another
wonderful option to consider on cases with small or mid-sized document
collections is having documents scanned as PDF and using Adobe Acrobat.
There are numerous reasons Acrobat makes a great choice for a case with
a normal-size document population. The fact that the PDF format has
become ubiquitous is a benefit in and of itself. You may already own and
be comfortable with Acrobat, perhaps in connection with court-filing
requirements. It's very likely expert witnesses, other law firms, and
even your clients are familiar with PDF files and have either a full
Acrobat license or the free Adobe Reader, making it easy to share case
documents.
Why has the PDF format become the de facto standard for electronic
versions of paper documents? The primary reason is that a single PDF
file can contain the images of all pages of the paper document as well
as the associated document text, typically captured by optical character
recognition (OCR) software.
If you're new to document imaging, you may be surprised to learn that,
prior to the introduction of the PDF format, the standard way to create
electronic versions of paper documents was to generate a series of
single-page TIFF images and a separate OCR text file. Thus, scanning a
15-page document would yield a total of 16 separate electronic files:
15 TIFFs and a text file.
When scanning first became available, the Many Electronic Files = 1
Paper Document approach was as good as it got and certainly beat nothing
at all. However, with the advent of PDF, which meant that 1 Electronic
File = 1 Paper Document, it wasn't long before PDF ruled the roost.
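To see how quickly the file counts diverge, here is a small Python sketch of the arithmetic. The function names and the three-document sample collection are made up for illustration; the only rules taken from the discussion above are "one TIFF per page plus one text file" versus "one PDF per document."

```python
from typing import Iterable, Tuple


def tiff_workflow_files(pages: int) -> int:
    """Files produced for one document: a TIFF per page plus one OCR text file."""
    return pages + 1


def pdf_workflow_files(pages: int) -> int:
    """Files produced for one document: a single PDF, whatever the page count."""
    return 1


def collection_totals(page_counts: Iterable[int]) -> Tuple[int, int]:
    """Return (total TIFF-workflow files, total PDF-workflow files) for a collection."""
    counts = list(page_counts)
    tiff_total = sum(tiff_workflow_files(p) for p in counts)
    pdf_total = sum(pdf_workflow_files(p) for p in counts)
    return tiff_total, pdf_total


if __name__ == "__main__":
    # Hypothetical collection: three documents of 15, 3, and 42 pages.
    print(collection_totals([15, 3, 42]))
```

For the 15-page document alone, the result is 16 files versus 1. Multiply that across a few thousand documents and the appeal of one file per document is obvious.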
The argument for PDF has become even stronger following Adobe's release
of Acrobat 6. This important new version of Acrobat offers numerous
enhancements, including cross-PDF text searching and improved document
mark-up functionality. For example, you can search a folder containing
any number of PDF files and instantly locate those containing any term
or phrase.
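For readers who like to see the idea spelled out, the following Python sketch shows what "search a folder of PDFs for a phrase" boils down to. It is not how Acrobat does it; it is a bare-bones illustration. The function names are invented for this example, and the default text extractor assumes the third-party pypdf package is installed. Any other function that turns a PDF path into text can be passed in instead.

```python
from pathlib import Path
from typing import Callable, List


def pdf_text_with_pypdf(path: Path) -> str:
    """Assumed extractor: pulls the text layer out of a PDF using pypdf."""
    from pypdf import PdfReader  # imported lazily so the rest runs without pypdf

    reader = PdfReader(str(path))
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def find_pdfs_containing(
    folder: Path,
    phrase: str,
    extract_text: Callable[[Path], str] = pdf_text_with_pypdf,
) -> List[Path]:
    """Return the PDFs directly inside `folder` whose text contains `phrase`.

    Matching ignores upper/lower case. PDFs without a text layer yield
    empty text and therefore never match.
    """
    needle = phrase.casefold()
    return [
        pdf
        for pdf in sorted(folder.glob("*.pdf"))
        if needle in extract_text(pdf).casefold()
    ]
```

A call such as `find_pdfs_containing(Path("case_docs"), "purchase agreement")` would list every matching file in a (hypothetical) `case_docs` folder. Note that this sketch rereads every file on every search, so it suits a modest collection rather than a giant one.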
Here's a final tip for any reader who's yet to experiment with document
imaging: Using Acrobat is a great way to get comfortable using
electronic documents without jumping into the deep end of the pool.
Don't scan every case document until you're sure it's worth the effort.
Instead, identify the 100 or so most critical documents and have them
scanned as PDFs and put in a folder on your network from which they can
be searched. You'll be able to evaluate the benefits of using electronic
versions of case documents with a minimal investment of time and
expense.
When you have documents produced during discovery imaged, be sure to let
the scanning vendor or the in-house support staff who do the scanning
know you want the resulting PDFs to contain both images and text. If
you're not clear about this requirement, you may get back PDFs that
contain only images and not the associated text of the documents. PDFs
that contain only images cannot be searched.
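If you'd like to spot-check a delivery from a scanning vendor, a rough test is to try extracting text from each PDF and flag the files that come back empty. The Python sketch below does that. The function names are invented for this example, and the default extractor assumes the third-party pypdf package. An empty result strongly suggests an image-only PDF, but treat it as a prompt to open the file and look, not as proof.

```python
from pathlib import Path
from typing import Callable, List


def pypdf_text(path: Path) -> str:
    """Assumed extractor: returns whatever text layer pypdf can read from the PDF."""
    from pypdf import PdfReader  # imported lazily so the rest runs without pypdf

    return "".join(page.extract_text() or "" for page in PdfReader(str(path)).pages)


def image_only_pdfs(
    folder: Path,
    extract_text: Callable[[Path], str] = pypdf_text,
) -> List[Path]:
    """Return the PDFs directly inside `folder` from which no text could be extracted."""
    return [
        pdf
        for pdf in sorted(folder.glob("*.pdf"))
        if not extract_text(pdf).strip()
    ]
```

Any file this turns up is a candidate to send back for OCR before you rely on searching it.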
Conclusion
If you only handle cases with a gazillion documents, Adobe Acrobat isn't
the right answer for image-handling and text searching. However, for the
vast majority of us, Acrobat is a fantastic solution for some or all
cases. If you haven't put Acrobat to the test, you owe it to yourself to
try it on an upcoming matter.
Copyright 2004 Greg Krehel. All rights reserved.
About The Author
Greg Krehel is CEO of CaseSoft. CaseSoft is the developer of the
popular software tools CaseMap, TimeMap, DepPrep, and NoteMap. CaseMap
makes it easy to organize and explore the facts, the cast of characters,
and the issues in any case. TimeMap makes it a cinch to create
chronology visuals for use during hearings and trials, client meetings
and brainstorming sessions. DepPrep helps prepare clients for
depositions. NoteMap makes it easy to create, edit, and use outlines. In
addition to his background in software development, Mr. Krehel has over
15 years of trial consulting experience. You can reach him via e-mail (gkrehel@casesoft.com)
or telephone (904-273-5000).
5000 Sawgrass Village Circle • Ponte Vedra Beach, FL 32082
Tel: 904.273.5000 • Fax: 904.273.5001 • www.casesoft.com