Once
these ratios are generated, you can begin to extrapolate how much data
you’re up against. More importantly, the page estimate will
help you determine the amount and type of data you’ll need to
initially review.
Sin #3: Tale of the Tape
One
of the misconceptions about electronic discovery is that it requires
looking at backup tapes of data and recovering files clandestinely
deleted. While there are situations that require data forensic efforts
similar to those seen on “CSI,” these efforts may
be overkill for most digital discovery matters.
To
understand why, it is helpful to categorize electronic data into five
main types. The following “Hierarchy of Data”
(which is visually represented in Figure 1) can help teams prioritize
what data to process and review:
- Active
Data: This refers to email and electronic files that a person or
business can access at present. It encompasses data on laptops,
personal computers, networks, email servers, and even PDAs.
- Metadata:
As mentioned previously, metadata are the hidden attributes and
characteristics of each file. A single file can carry hundreds of
metadata fields, many of which are not useful in a legal
context. However, some helpful examples are the file name, email
header, bcc recipients, and creation date.
- Replicant
Data: These are files that are automatically made by the
user’s systems, often without the user’s knowledge or direction.
Replicant data includes auto-backup files generated by operating
systems and word processing applications, as well as system/audit logs,
Internet visit data (such as cookies), and recovered fragments from
system crashes.
- Backup
Data: Usually stored on media such as DAT or DLT tapes, this data
consists of information that the company regularly backs up. These
backups contain an enterprise-level view of all the company’s
data. Depending on the electronic data retention policy (and tape
backup recycle system) in place, there can be hundreds of these backup
snapshots available.
- Residual
Data: Consists of files and data fragments that have been deleted.
In most instances, the most useful data (which is also the easiest to
collect and process cost-effectively) can be found at the top of this
pyramid. Conversely, as you move down the pyramid, the information
becomes not only less helpful, but also far more voluminous and more
expensive to process. This is especially true of residual data.
Properly recovering and interpreting deleted files usually requires
specialized tools and data forensic experts, whose hourly rates often
rival those of big law firm partners.
Likewise, backup data can be quite expensive to restore. That is
because backup tapes are made by IT personnel for disaster recovery
purposes, and are designed to restore all data across an enterprise,
not just one particular person’s files and emails. The tapes hold not
only data for every user that was backed up, but also every application
and system file. To recover a particular user’s files and emails from
backups, the entire set of data must be restored to a comparable set of
hardware. This is especially true for emails, as the entire mail server
must be restored. Only after this is done can individual emails and
electronic files be accessed. Depending on the size of the data on
the backup tape, the process can take days. Even worse, this procedure
needs to be repeated for every backup tape set, and each set often
duplicates a great deal of data from the previous one.
There
are certainly situations where restoration and recovery of residual and
backup information is warranted, but it can be an expensive and
unnecessary first step. To avoid breaking the bank, electronic
discovery demands new collection and review approaches.
Sin #4: Review Every Byte
In
digital discovery, there is a world of difference between preserving
and reviewing data. All parties must take steps to preserve every
electronic data type that may contain information related to the
matter. A first step is to suspend any enterprise-wide electronic data
retention policies that could cause the destruction of responsive data.
Another step may be to take a data snapshot of key
custodians’ personal computers and laptops. Using data
mirror-imaging applications like Symantec’s Norton Ghost or
Acronis’ True Image will preserve not only active data, but
also all residual data that may exist. In general, preservation orders
should be as comprehensive as possible.
Unfortunately,
it is not practical to review the entire universe of documents
encompassed by a preservation order. Even when attorneys focus only on
active data, the volume is still significantly greater than that seen in
similar paper-based collections. These higher volumes have forced
attorneys to take a more phased and measured approach to electronic
discovery. Rather than examine every single byte, attorneys now
are using common sense and technology to make the volumes more
manageable.
Some
methods are similar to tried-and-true approaches used for decades when
dealing with paper. For example, attorneys don’t blindly
start copying every scrap of paper in someone’s office or the
company warehouse. Instead, they interview persons who have knowledge
about how the universe of documents is organized and maintained.
This
practice is even more critical for electronic discovery, and there are
now statutory rules that require parties to disclose information about
responsive data in their possession. Federal Rules of Civil Procedure
(“FRCP”) Rule 26 provides for the disclosure of all
data compilations (e.g., electronic files, e-mails, databases, etc.)
that the parties may use to support their case. FRCP 26 also requires
parties to search available electronic systems for relevant information
even prior to a discovery request, as well as provide a volume
estimate. Another common method for gathering information is to depose
the person most knowledgeable about the information, as provided by
FRCP 30(b)(6). Both of these rules – along with comparable
state statutes and a growing body of supporting case law –
underscore the need to look before leaping into the digital discovery
fray.
Common-sense paper culling techniques can also help narrow electronic
collections. Just as they do with paper folders and red welds, many
individuals organize electronic documents and emails in a folder
structure on their hard drive or network server. Often, a scan of these
top-level folders can quickly focus collection and review efforts.
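As a rough illustration of such a scan, the following Python sketch counts the files beneath each top-level folder of a custodian's directory. The function name summarize_top_level and the idea of grouping loose files under "." are assumptions made for this example; they do not come from any particular litigation support tool.

```python
from collections import Counter
from pathlib import Path


def summarize_top_level(root):
    """Count files beneath each top-level folder of root (hypothetical helper)."""
    root = Path(root)
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            rel = path.relative_to(root)
            # Files sitting directly in root are grouped under "."
            top = rel.parts[0] if len(rel.parts) > 1 else "."
            counts[top] += 1
    return dict(counts)
```

A summary like this only shows where the volume sits. Deciding which folders matter still depends on interviewing the people who maintain them.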
Technology can also play a key role in culling down the amount of
electronic data, provided that the content and metadata of each
electronic file and email can be searched. If searchable,
electronic data can often be effectively narrowed using the
following search criteria:
- Date
Range: If there is a relevant date range for the matter, then emails
and files can be filtered to ensure that they fall within the
applicable time period.
- Duplicates:
Using a variety of methods, duplicates can be identified and ultimately
removed from the electronic data collection.
- File
Types: Electronic data that isn’t user-generated (e.g.,
system files, applications), or that yields poor results when
printed/converted (e.g., databases, multimedia files), can be logged
and not processed.
- Keywords:
Although this requires negotiation with opposing counsel and the
courts, key terms and names can effectively separate the wheat from the
chaff. Another common use of keyword searching is to search for
in-house and outside counsel names and emails to identify potentially
privileged documents.
- Concept-Based
Searching: A few service vendors (e.g., Cataphora (www.cataphora.com),
FIOS (www.fiosinc.com) and Dolphin Search (www.dolphinsearch.com)) now
offer search engines that purport to be able to identify emails and
files by topic. If you are on the receiving end of electronic data,
concept-based searching may be a terrific method to identify hot
documents. On the other hand, it is unclear whether parties will be
able to limit production of electronic documents based on this type of
technology.
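To make the first four criteria concrete, here is a minimal Python sketch of a culling pass. Everything in it is an assumption for illustration: the Item record, the SKIPPED_EXTENSIONS list, the cull function, and the use of a SHA-256 hash of the text as the duplicate test (one of the "variety of methods" available). Concept-based searching is vendor-specific and is not modeled here.

```python
import hashlib
import os
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    """Hypothetical record for one collected email or file."""
    name: str
    modified: date
    text: str


# Assumed examples of file types that are logged rather than processed.
SKIPPED_EXTENSIONS = {".exe", ".dll", ".sys", ".mdb", ".wav"}


def cull(items, start, end, keywords):
    """Return (kept, logged): items passing every filter, and names of skipped file types."""
    kept, logged, seen = [], [], set()
    lowered = [k.lower() for k in keywords]
    for item in items:
        # File types: log system, application, and poorly converting files.
        ext = os.path.splitext(item.name)[1].lower()
        if ext in SKIPPED_EXTENSIONS:
            logged.append(item.name)
            continue
        # Date range: drop anything outside the relevant period.
        if not (start <= item.modified <= end):
            continue
        # Duplicates: skip content whose hash has already been seen.
        digest = hashlib.sha256(item.text.encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        # Keywords: keep only items mentioning at least one agreed term.
        body = item.text.lower()
        if any(k in body for k in lowered):
            kept.append(item)
    return kept, logged
```

Running the same function with a list of in-house and outside counsel names as the keywords would be one simple way to flag potentially privileged documents, as the keyword item describes.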
The
ability to search electronic data using the above criteria can help
parties manage digital discovery volumes, but only if they take
advantage of technology. Fortunately, the cost to employ these new
litigation support technologies can be cheaper than traditional
paper-based review methods.
Sin #5: Paper Review is Cheaper
Every
discovery review employs a system. Paper-based review systems can range
from simply tagging documents that are responsive, to completing a
detailed coding sheet that reflects an attorney’s thoughts on
the document. With years of practice, most attorneys are quite familiar
and comfortable with this primarily manual process. Plus, most
reviewers understandably prefer staring at paper rather than pixels. So
it’s no wonder there is a strong aversion to anything but a
paper-based process.