The cost of processing documents for electronic discovery can be tremendous. The question of which party should shoulder this cost has driven several court decisions in which the cost of producing the electronic evidence was shifted from the producing party to the demanding party. Data that resides in antiquated systems, and that was created and stored in multiple media formats and file structures, is costly to recover and to protect for discovery. The issues considered in these situations have been driven by the sheer volume of this data and the technology required to recover it. It is therefore important to understand the elements that drive the cost of processing both this legacy data and any current electronic data.
The volume and composition of the dataset are driving elements in the cost of electronic discovery. As complex as identifying and collecting the data may be, processing electronic data imposes its own unique requirements on how the data must be handled. Converting and indexing the data into a common, searchable, usable format can be a labor-intensive, specialized activity.
Supporting the conversion are the software tools and service organizations used to convert the data from numerous disparate formats to a common output that can be used for review. These tools run the full range from simple TIFF printers to enterprise-wide integrated products capable of processing vast collections of source files in short periods of time. The cost of these tools and services depends on the volume and rate of conversions to be performed and on the deadline by which the data must be delivered.
Software tools and services also require infrastructure. The infrastructure may be as simple as a desktop PC, or it might include networked racks of servers and roomfuls of storage arrays. Support software, such as SQL databases and native software applications, also contributes to the cost of infrastructure. The capital investment in software tools, or in the infrastructure on which services are deployed, therefore drives cost.
This is clearly a situation where one size does not fit all, so selecting the correct tools, services, and support environment to match the size of the discovery becomes critical. Service providers may be able to minimize costs by spreading the cost of the above elements across multiple cases.
Specialized tools and environments require qualified, trained staff. The nature of electronic discovery requires a higher skill level than does paper document capture. Scan operators need to be replaced by IT personnel who know how to operate the tools and support the environments mandated by the technology involved.
Because service providers vary widely in capability, there is no consistent method of pricing these services. Some vendors still charge by the page, some by the file count, and others by the gigabyte. For budgeting purposes, volume pricing is the most reliable, since you typically know how much data you are dealing with in your case. However, you should be aware that in some instances per-page or per-file pricing can be cheaper.
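To make the comparison between pricing methods concrete, the following is a minimal sketch in Python that estimates the cost of one dataset under per-page, per-file, and per-gigabyte pricing. Every number in it (dataset size, file and page counts, and all three rates) is a hypothetical assumption chosen for illustration, not a market figure, and the names Dataset and estimate_costs are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Size of a collection, expressed in the three units vendors bill by."""
    gigabytes: float
    file_count: int
    page_count: int


def estimate_costs(ds, per_page, per_file, per_gb):
    """Return the estimated cost of the same dataset under each pricing method."""
    return {
        "per_page": ds.page_count * per_page,
        "per_file": ds.file_count * per_file,
        "per_gb": ds.gigabytes * per_gb,
    }


# Hypothetical dataset and hypothetical vendor rates (assumptions, not quotes).
dataset = Dataset(gigabytes=10.0, file_count=50_000, page_count=400_000)
costs = estimate_costs(dataset, per_page=0.05, per_file=0.25, per_gb=1500.0)

# Pick the pricing method with the lowest estimated cost for this dataset.
cheapest = min(costs, key=costs.get)

for method, cost in costs.items():
    print(f"{method}: {cost:,.2f}")
print("cheapest:", cheapest)
```

With these assumed figures, per-file pricing comes out lowest even though per-gigabyte pricing is the easiest to budget, because the gigabyte count is usually known up front while page counts often are not known until the data has been processed. Different rates or a different mix of file types would change the ranking.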
It is important to make sure the team understands how decisions made in the preservation, identification, and collection phases of the electronic discovery lifecycle affect the downstream services required, such as processing and reviewing the documents. Since data volume drives the cost, it is a good strategy to cull and filter your data, whenever possible, before it reaches the extraction and conversion stage. Make sure that the way the data is collected and filtered is negotiated and agreed upon if it could affect the integrity of the documents. Some vendors charge a media processing fee and then charge by the GB to keyword search the data.
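The effect of culling on cost can be sketched with simple arithmetic. The Python example below assumes a vendor that charges a flat media processing fee, a per-GB fee to keyword search everything collected, and a per-GB fee to extract and convert only the data that survives the search. That three-part fee structure, the function name estimate_with_culling, and all of the rates are assumptions made for illustration; actual vendor terms vary.

```python
def estimate_with_culling(collected_gb, surviving_fraction, media_fee,
                          search_rate_per_gb, processing_rate_per_gb):
    """Estimate total cost when data is keyword-culled before conversion.

    surviving_fraction is the share of collected data (0.0 to 1.0) that
    remains after culling and must still be extracted and converted.
    """
    if not 0.0 <= surviving_fraction <= 1.0:
        raise ValueError("surviving_fraction must be between 0.0 and 1.0")
    # Keyword search is billed on everything that was collected.
    search_cost = collected_gb * search_rate_per_gb
    # Extraction and conversion are billed only on what survives culling.
    processing_cost = collected_gb * surviving_fraction * processing_rate_per_gb
    return media_fee + search_cost + processing_cost


# Hypothetical case: 100 GB collected, 25% survives culling (assumed rates).
culled_total = estimate_with_culling(
    collected_gb=100.0,
    surviving_fraction=0.25,
    media_fee=500.0,
    search_rate_per_gb=50.0,
    processing_rate_per_gb=1000.0,
)

# Baseline for comparison: convert all 100 GB with no culling step at all.
unculled_total = 100.0 * 1000.0

print(f"with culling:    {culled_total:,.2f}")
print(f"without culling: {unculled_total:,.2f}")
```

Under these assumed rates, the added media and search fees are small compared with the conversion cost avoided. The saving depends on how much data the agreed filters actually remove, which is one reason the filtering method should be negotiated before processing begins.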