Once these ratios are generated, you can begin to extrapolate how much data you’re up against. More importantly, the page estimate will help you determine the amount and type of data you’ll need to initially review.

Sin #3: Tale of the Tape
 

One of the misconceptions about electronic discovery is that it requires looking at backup tapes of data and recovering files clandestinely deleted. While there are situations that require data forensic efforts similar to those seen on “CSI,” these efforts may be overkill for most digital discovery matters.

To understand why, it is helpful to categorize electronic data into five main types. The following “Hierarchy of Data” (which is visually represented in Figure 1) can help teams prioritize what data to process and review:

  • Active Data: This refers to email and electronic files that a person or business can access at present. It encompasses data on laptops, personal computers, networks, email servers, and even PDAs.     
  • Metadata: As mentioned previously, metadata are the hidden attributes and characteristics of each file. A single file can carry hundreds of metadata fields, many of which are not useful in a legal context. However, some helpful examples are the name of the file, email header, bcc recipients, and creation date.
  • Replicant Data: These are files that are automatically made by the user’s systems, often without the user’s knowledge or direction. Replicant data includes auto-backup files generated by operating systems and word processing applications, as well as system/audit logs, Internet visit data (such as cookies), and recovered fragments from system crashes.
  • Backup Data: Usually stored on media such as DAT or DLT tapes, this data consists of information that the company regularly backs up. These backups contain an enterprise-level view of all the company’s data. Depending on the electronic data retention policy (and tape backup recycle system) in place, there can be hundreds of these backup snapshots available.
  • Residual Data: This consists of files and data fragments that have been deleted.

    In most instances, the most useful data (and the easiest to collect and process cost-effectively) can be found at the top of this pyramid. Conversely, as you slide down the pyramid, the information not only becomes less helpful, but also grows far larger in volume and more expensive to process. This is especially true of residual data. Properly recovering and interpreting deleted files usually requires specialized tools and data forensic experts, whose hourly rates often rival those of big law firm partners.
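
    For teams that track collection sources in a spreadsheet or script, the hierarchy can be expressed as a simple ordering. The following Python sketch is only an illustration: the `DataTier` and `DataSource` names and the sample sources are hypothetical and do not come from any particular litigation support product.

```python
from dataclasses import dataclass
from enum import IntEnum


class DataTier(IntEnum):
    """The five data types, numbered from the top of the pyramid down."""
    ACTIVE = 1
    METADATA = 2
    REPLICANT = 3
    BACKUP = 4
    RESIDUAL = 5


@dataclass
class DataSource:
    name: str        # hypothetical label for a collection source
    tier: DataTier   # where the source sits in the hierarchy


def review_order(sources):
    """Return sources sorted so that top-of-pyramid data comes first."""
    return sorted(sources, key=lambda source: source.tier)


# Hypothetical sources, listed in no particular order.
sources = [
    DataSource("deleted-file fragments", DataTier.RESIDUAL),
    DataSource("custodian mailbox", DataTier.ACTIVE),
    DataSource("DLT backup set", DataTier.BACKUP),
    DataSource("word processor auto-backups", DataTier.REPLICANT),
]

for source in review_order(sources):
    print(source.tier.name, "-", source.name)
```

    Sorting by tier puts the custodian mailbox first and the deleted-file fragments last, which mirrors the advice to start at the top of the pyramid and move down only when the matter warrants it.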

    Likewise, backup data can be quite expensive to restore. IT personnel create backup tapes for disaster recovery, so the tapes are designed to restore all of an enterprise’s data, not just one particular person’s files and emails. A backup includes not only the data of every user who was backed up, but also every application and system file. To recover a particular user’s files and emails from backups, the entire set of data must be restored to a comparable set of hardware. This is especially true for email, because the entire mail server must be restored. Only then can individual emails and electronic files be accessed. Depending on the amount of data on the backup tape, the process can take days. Even worse, the procedure must be repeated for every backup tape set, and each set often duplicates a great deal of data from the previous one.

    There are certainly situations where restoration and recovery of residual and backup information is warranted, but it can be an expensive and unnecessary first step. To avoid breaking the bank, electronic discovery demands new collection and review approaches.

    Sin #4: Review Every Byte
     

    In digital discovery, there is a world of difference between preserving and reviewing data. All parties must take steps to preserve every electronic data type that may contain information related to the matter. A first step is to suspend any enterprise-wide electronic data retention policies that could cause the destruction of responsive data. Another step may be to take a data snapshot of key custodians’ personal computers and laptops. Using data mirror-imaging applications like Symantec’s Norton Ghost or Acronis’ True Image will preserve not only active data, but also any residual data that may exist. In general, preservation orders should be as comprehensive as possible.

    Unfortunately, it is not practical to review the entire universe of documents encompassed by a preservation order. Even when attorneys focus only on active data, the volume is still significantly greater than that seen in similar paper-based collections. These higher volumes have forced attorneys to take a more phased and measured approach to electronic discovery. Rather than examine every single byte, attorneys are now using common sense and technology to make the volumes more manageable.

    Some methods are similar to tried and true approaches used for decades when dealing with paper. For example, attorneys don’t blindly start copying every scrap of paper in someone’s office or the company warehouse. Instead, they interview persons who have knowledge about how the universe of documents is organized and maintained.

    This practice is even more critical for electronic discovery, and there are now procedural rules that require parties to disclose information about responsive data in their possession. Rule 26 of the Federal Rules of Civil Procedure (“FRCP”) provides for the disclosure of all data compilations (e.g., electronic files, emails, databases, etc.) that the parties may use to support their case. FRCP 26 also requires parties to search available electronic systems for relevant information even prior to a discovery request, as well as provide a volume estimate. Another common method for gathering information is to depose the person most knowledgeable about the information, as provided by FRCP 30(b)(6). Both of these rules – along with comparable state statutes and a growing body of supporting case law – underscore the need to look before leaping into the digital discovery fray.

    Common-sense paper culling techniques can also help narrow electronic collections. Just as they use paper folders and red welds, many individuals organize electronic documents and emails in a similar folder structure on their hard drive or network server. Often, a scan of these top-level folders can quickly focus collection and review efforts.
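
    A scan like this can be scripted. The sketch below is a minimal illustration in Python, assuming the custodian’s files have already been copied to a working location; the function name `summarize_top_level` is hypothetical. It reports how many files, and how many bytes, sit under each top-level folder so the team can see where the bulk of the data lives.

```python
from pathlib import Path


def summarize_top_level(root):
    """Map each top-level folder under root to (file count, total bytes)."""
    summary = {}
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue  # loose files at the top level are ignored in this sketch
        # Walk everything beneath the folder, counting regular files only.
        files = [path for path in folder.rglob("*") if path.is_file()]
        total_bytes = sum(path.stat().st_size for path in files)
        summary[folder.name] = (len(files), total_bytes)
    return summary
```

    Calling `summarize_top_level` on a custodian’s copied share returns one entry per folder, which can then be compared against what the custodian said in the interview about how the documents are organized.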

    Technology can also play a key role in effectively culling down the amount of electronic data, provided that the content and metadata contained in the electronic files and emails can be searched. If searchable, electronic data can often be effectively narrowed using the following search criteria:

  • Date Range: If there is a relevant date range for the matter, then emails and files can be filtered to ensure that they fall within the applicable time period.     
  • Duplicates: Using a variety of methods, duplicates can be identified and ultimately removed from the electronic data collection.     
  • File Types: Electronic data that isn’t user-generated (e.g., system files, applications), or that yields poor results when printed or converted (e.g., databases, multimedia files), can be logged and not processed.
  • Keywords: Although this requires negotiation with opposing counsel and the courts, key terms and names can effectively separate the wheat from the chaff. Another common use of keyword searching is to search for in-house and outside counsel names and emails to identify potentially privileged documents.     
  • Concept-Based Searching: A few service vendors (e.g., Cataphora (www.cataphora.com), FIOS (www.fiosinc.com) and Dolphin Search (www.dolphinsearch.com)) now offer search engines that purport to be able to identify emails and files by topic. If you are on the receiving end of electronic data, concept-based searching may be a terrific method to identify hot documents. On the other hand, it is unclear whether parties will be able to limit production of electronic documents based on this type of technology.
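
    The first four criteria lend themselves to a simple filtering pass. The Python sketch below is an illustration only, not a description of any vendor’s product: the `Item` record, the date range, the skipped extensions, and the keywords are all hypothetical values chosen for the example. It treats two items as duplicates when their contents have the same SHA-256 hash, which is one of several possible methods, and it matches keywords as plain substrings.

```python
import hashlib
import os
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    name: str        # file name, used here only for its extension
    modified: date   # date taken from the file's metadata
    content: bytes   # raw contents of the file or email


# Hypothetical criteria; real values would come from the matter itself.
DATE_RANGE = (date(2005, 1, 1), date(2006, 12, 31))
SKIPPED_EXTENSIONS = {".exe", ".dll", ".mdb", ".avi"}
KEYWORDS = {"merger", "forecast"}


def cull(items):
    """Apply date, file-type, duplicate, and keyword filters in turn."""
    kept, logged, seen_hashes = [], [], set()
    for item in items:
        # Date range: drop anything outside the relevant period.
        if not (DATE_RANGE[0] <= item.modified <= DATE_RANGE[1]):
            continue
        # File types: log, but do not process, non-user or hard-to-convert files.
        extension = os.path.splitext(item.name)[1].lower()
        if extension in SKIPPED_EXTENSIONS:
            logged.append(item.name)
            continue
        # Duplicates: identical content yields an identical hash.
        digest = hashlib.sha256(item.content).hexdigest()
        if digest in seen_hashes:
            continue
        seen_hashes.add(digest)
        # Keywords: naive, case-insensitive substring match.
        text = item.content.decode("utf-8", errors="ignore").lower()
        if any(keyword in text for keyword in KEYWORDS):
            kept.append(item)
    return kept, logged


# Hypothetical sample collection.
items = [
    Item("memo.doc", date(2005, 3, 1), b"Merger timeline draft"),
    Item("memo-copy.doc", date(2005, 3, 2), b"Merger timeline draft"),
    Item("setup.exe", date(2005, 5, 5), b"\x00\x01"),
    Item("old.doc", date(2001, 1, 1), b"merger notes"),
    Item("lunch.txt", date(2006, 1, 1), b"Lunch on Friday?"),
]

kept, logged = cull(items)
print("kept:", [item.name for item in kept])
print("logged only:", logged)
```

    The same keyword step could be run a second time with the names and email addresses of in-house and outside counsel to flag potentially privileged documents, as the keyword bullet describes.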

    The ability to search electronic data using the above criteria can help parties manage digital discovery volumes, but only if they take advantage of technology. Fortunately, employing these new litigation support technologies can cost less than traditional paper-based review methods.

    Sin #5: Paper Review is Cheaper
     

    Every discovery review employs a system. Paper-based review systems can range from simply tagging documents that are responsive to completing a detailed coding sheet that reflects an attorney’s thoughts on the document. With years of practice, most attorneys are quite familiar and comfortable with this primarily manual process. Plus, most reviewers understandably prefer staring at paper rather than pixels. So it’s no wonder there is a strong aversion to anything but a paper-based process.


     

© 2008 Strategic Discovery, Inc.