eDiscovery Production: Forms of Production

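Under Rule 34(b) of the Federal Rules of Civil Procedure, unless the parties agree or the court orders otherwise, a party responding to a request that does not specify a form for producing electronically stored information must produce the information in a form or forms in which it is ordinarily maintained or in a reasonably usable form or forms.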

There are four basic forms of production for electronically stored information:

  • Paper;
  • Quasi-paper, which is essentially paper in electronic form, such as .tif files and .pdf files, often with associated metadata and full text;
  • Quasi-native form, such as an IBM AS400 database produced as an ASCII, comma-delimited file with associated file and field structural information; and
  • Native, where the electronic information is produced as it is maintained and used.

 

Paper

A paper production is just what it sounds like: electronically stored information is printed to paper and the paper is produced to the other side.

Although this form of production has traditionally been favored because of familiarity and simplicity, it is becoming less common due to the volume and costs associated with electronic documents.

Paper production should be considered in cases that involve small numbers of documents and in which the requesting party does not require metadata. When the documents are electronic, however, consider negotiating against paper production, since paper is not the form in which the documents were ordinarily maintained.

Documents that originated in paper can be incorporated into your electronic document database, which requires that they be imaged, coded, and OCRed. With regard to form of production, parties can negotiate to exchange some or all of that information. In matters with multiple defendants, the parties frequently agree to retain a single vendor to process the documents from all defendants and to share the data and images, as well as the costs.

Quasi-Paper

A quasi-paper production is one where electronically stored information is converted to image files, typically either TIFF or PDF, and produced in that format. The producing party might also provide some or all of the metadata associated with the underlying files as well as text extracted from the files.

Imaging is the process of converting a native electronic document, or scanning a paper document, into a non-editable digital file. Group IV TIFF is the most common format for production of images of electronic documents, although a non-editable PDF is another image format that can be used for production. The advantage of producing images is that the receiving party cannot edit the documents, while the producing party can endorse them with confidentiality designations, Bates numbers, and redactions as necessary. Some file types, such as spreadsheets and drawings, do not lend themselves to an image format due to their size and possible formatting issues. Because an image is essentially a "picture" of the document, it cannot be searched on its own. For this type of production, a searchable full-text file and/or extracted text and metadata fields can be produced alongside the images, but which fields and information are produced is typically negotiated among the parties.

Quasi-Native

In a quasi-native production, electronically stored information is produced in a reasonably usable electronic form other than the form in which it was maintained and used. For example, an IBM AS400 database might be produced as an ASCII, comma-delimited file with associated file and field structural information.

A quasi-native approach can be particularly useful when dealing with large databases and with systems built around proprietary software or hardware.
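As a rough illustration of a quasi-native export, the sketch below writes one database table to a comma-delimited file and writes the field layout to a companion file. It uses SQLite from the Python standard library purely as a stand-in for the source system; the function name export_table, the ".layout.csv" naming, and the layout columns are assumptions made for this example, not an established convention.

```python
import csv
import sqlite3
from pathlib import Path


def export_table(conn: sqlite3.Connection, table: str, out_dir: Path) -> tuple[Path, Path]:
    """Write one table as a comma-delimited file plus a field-layout file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    data_path = out_dir / f"{table}.csv"
    layout_path = out_dir / f"{table}.layout.csv"  # hypothetical naming

    # Table names cannot be bound as SQL parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'

    # PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk)
    # for each column of the table.
    columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
    if not columns:
        raise ValueError(f"no such table: {table}")

    # Field structural information: position, name, and declared type.
    with layout_path.open("w", newline="", encoding="ascii") as fh:
        writer = csv.writer(fh)
        writer.writerow(["position", "field_name", "declared_type"])
        for cid, name, declared_type, *_ in columns:
            writer.writerow([cid + 1, name, declared_type])

    # The data itself; non-ASCII characters are written as escape sequences.
    with data_path.open("w", newline="", encoding="ascii",
                        errors="backslashreplace") as fh:
        writer = csv.writer(fh)
        writer.writerow([col[1] for col in columns])
        writer.writerows(conn.execute(f"SELECT * FROM {quoted}"))

    return data_path, layout_path
```

The csv module quotes any value that contains a comma, so a field such as "Smith, J." survives the round trip. A real export would also need to settle date formats, character encoding, and how relationships between tables are documented.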

Native

In a native production, data is produced as it was maintained or used. For example, an Excel spreadsheet file would be provided to the other side as an .xls file.

Despite much discussion of "native file production," there is no defined standard or formal rule requiring native file production for litigation or government inquiry. The Federal Rules of Civil Procedure do not mandate native file production. However, parties are expected to discuss the form of production early and, where possible, to agree on how the documents will be produced.

Special Considerations for Native Productions

Producing data in its native format has certain limitations and risks that should be considered. These include the inability to individually number or endorse "pages" for document control, inability to redact (leading to privilege problems), and issues with reviewing the production.

Where Native Production May Be Necessary

For some file types, the native format may be the only way to adequately produce the documents. For instance, Microsoft Excel spreadsheets do not lend themselves to being converted to images, because the worksheets often do not conform to a standard 8 1/2 by 11 inch page. Even if the number of rows and columns does conform to a standard paper size, the spreadsheets often contain formulae and other information essential to the matter at hand, which requires that they be produced in native format. Databases are another good example of data that may best be produced in native format. Databases can comprise massive amounts of completely undifferentiated tables of data. In order to understand a database, the producing party may need to provide metadata and software that is designed to present the data in a logical manner.

Alteration of Files

When producing in native format, it is important to take precautions to protect the documents from alteration. Annotating with Bates numbers, adding confidentiality designations, and redacting are not possible without altering the native document. Therefore, work from a copy of the file to ensure that the original remains unaltered.

Metadata

Metadata can take the form of "change tracking" history, as found in current versions of Microsoft Office files (e.g., Word, Excel, PowerPoint). These files can retain a history of the changes made to the document, and it may not be desirable to produce this information.
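One way to check whether a Word file carries tracked changes before it is produced is sketched below. It assumes the file is in the .docx format, which is a ZIP archive whose main body is stored as word/document.xml, with tracked insertions and deletions recorded as "ins" and "del" elements in the WordprocessingML namespace. The function name count_tracked_changes is hypothetical, and the sketch ignores other revision types (moves, formatting changes), comments, headers, and footers, so a zero count is not proof that a file is clean.

```python
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML main namespace used by .docx files.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_tracked_changes(docx_path) -> dict[str, int]:
    """Count tracked insertion and deletion elements in a .docx body."""
    # A .docx file is a ZIP archive; the main text lives in word/document.xml.
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    return {
        "insertions": sum(1 for _ in root.iter(f"{{{W_NS}}}ins")),
        "deletions": sum(1 for _ in root.iter(f"{{{W_NS}}}del")),
    }
```

A nonzero count is a signal to review the file by hand and to raise the handling of that metadata with the other side, rather than a reason to strip it automatically.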

Suggested Approaches to Native Productions

Some suggestions to consider when producing native file documents include:

Hashing

Hash the documents prior to production so that their authenticity can be verified if it is later called into question. Commonly used hashing algorithms include MD5 and SHA-1. Consider including the hash value for each document as a field in the load file.
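A minimal sketch of hashing a file with Python's standard hashlib module follows. The function name hash_file and its parameters are chosen for this example; the file is read in chunks so that large native files do not have to fit in memory.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path, algorithm: str = "md5", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)  # e.g. "md5" or "sha1"
    with path.open("rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling hash_file(path) returns the MD5 value as a hexadecimal string, and hash_file(path, "sha1") returns the SHA-1 value; either string can be stored as a load file field.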

Tracking Produced Materials

Reach an agreement with other parties about how native documents will be managed throughout the discovery process (e.g., how will they be referred to in depositions?). One approach is to create a unique identifying number for each electronic document. If the documents will only be used by experts for analysis, agree on how the experts will refer to the documents in their reports.

Need for Metadata

Assess whether metadata needs to be produced. If any of the metadata may be privileged, obtain agreement from the opposing party before scrubbing it.

Production Media

There is no clearly defined format for producing native files, but here is one highly secure and verifiable method for producing native files on a CD, DVD, or hard drive:

  • Files are organized into CD-sized volumes
  • Files are renamed with the beginning Bates number as a suffix (e.g., "Document.CDN000080.doc")
  • Files are set to Read Only
  • Index is delivered with each volume containing the following information for each file:
    • Volume
    • Beginning Bates Number
    • File Name
    • Modified file name, consisting of the original file name, the beginning Bates number, and the original extension (e.g., "Document.CDN000080.doc")
    • MD5 hash value
  • An MD5 hash tool is provided so that the recipient can check each file against the MD5 hash value on the index.
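
The steps in this list could be scripted roughly as follows. This is a sketch under several assumptions: each native file receives one Bates number, numbers increase by one per file, the index is written as a CSV file named index.csv, and the function name build_volume, its parameters, and the column headings are all hypothetical. Splitting files into CD-sized volumes is left out.

```python
import csv
import hashlib
import os
import shutil
import stat
from pathlib import Path


def build_volume(sources, volume_dir: Path, volume: str,
                 prefix: str, start: int, digits: int = 6) -> Path:
    """Copy files into a volume, rename with Bates numbers, write an index."""
    volume_dir.mkdir(parents=True, exist_ok=True)
    index_path = volume_dir / "index.csv"  # hypothetical index name

    with index_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["Volume", "Beginning Bates Number", "File Name",
                         "Modified File Name", "MD5"])
        for offset, source in enumerate(sources):
            source = Path(source)
            bates = f"{prefix}{start + offset:0{digits}d}"

            # "Document.doc" becomes "Document.CDN000080.doc".
            produced = volume_dir / f"{source.stem}.{bates}{source.suffix}"

            # Work on a copy so the original file is never touched.
            shutil.copy2(source, produced)

            # Whole-file read is fine for a sketch; hash large files in chunks.
            md5 = hashlib.md5(produced.read_bytes()).hexdigest()

            # Mark the produced copy read-only for owner, group, and others.
            os.chmod(produced, stat.S_IREAD | stat.S_IRGRP | stat.S_IROTH)

            writer.writerow([volume, bates, source.name, produced.name, md5])

    return index_path
```

For example, build_volume(files, Path("VOL001"), "VOL001", "CDN", 80) would name the first file with the Bates number CDN000080 and record its MD5 value on the index.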

 

By following this procedure, the receiving party is able to relate each native file to its Bates number, because the MD5 hash value on the index identifies the exact native file that was produced. The hash tool can then be used to confirm that a native file has not been inadvertently changed during review: if the file has been altered, its hash value will no longer match the value on the index.
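
The verification step might look like the sketch below. It assumes the index is a CSV file named index.csv with columns headed "Beginning Bates Number", "Modified File Name", and "MD5"; those names, like the function name verify_volume, are hypothetical and would need to match whatever index format the parties actually use.

```python
import csv
import hashlib
from pathlib import Path


def verify_volume(volume_dir: Path, index_name: str = "index.csv") -> list[str]:
    """Return Bates numbers whose file is missing or no longer matches the index."""
    problems = []
    with (volume_dir / index_name).open(newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            bates = row["Beginning Bates Number"]
            produced = volume_dir / row["Modified File Name"]

            if not produced.is_file():
                problems.append(bates)  # file listed on the index is missing
                continue

            # Recompute the hash and compare it with the indexed value.
            actual = hashlib.md5(produced.read_bytes()).hexdigest()
            if actual != row["MD5"].strip().lower():
                problems.append(bates)  # contents changed since production

    return problems
```

An empty list means every file listed on the index is present and unchanged; any Bates number returned identifies a file that should be recopied from the production media.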

Source: EDRM (edrm.net)