Analysis: Focusing Collection
Initial case assessment and/or review frequently identify the need to collect additional materials, for example materials from newly identified custodians, materials covering new time frames, or materials matching new keywords (if keywords are used as selection criteria in collection). Analysis techniques can be used to focus these additional collections and thereby reduce the amount of irrelevant material that might otherwise need to be reviewed.
Most collection activities acquire all documents and emails from a specified set of custodians within the relevant date range of the case. Initial case assessment, especially people analysis, can identify additional custodians much earlier than a review team would. Identifying these custodians earlier can lead to a more refined initial case assessment and a more productive review process, because more of the materials are available up front to organize and prioritize.
It may be possible through analysis to narrow the date range for materials collected from additional custodians. For example, a people analysis may show that the new custodians only interacted with the central figures of the case on any topics of relevance (or any topics at all) during a specific narrower time period. This knowledge can allow you to restrict the collection of materials from the new custodians, yielding a smaller overall collection to analyze and review.
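The date-range narrowing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the message records, their field names (`sender`, `recipients`, `date`), the example addresses, and the function `interaction_window` are all hypothetical, and a real matter would draw this information from whatever analysis or review platform is in use.

```python
from datetime import date


def interaction_window(messages, new_custodian, central_figures):
    """Return (earliest, latest) date on which new_custodian exchanged a
    message with at least one central figure, or None if there is none.

    `messages` is a hypothetical list of dicts with the keys
    'sender', 'recipients' and 'date'.
    """
    target = new_custodian.lower()
    # Exclude the new custodian from the central set so that a message
    # only counts when someone *else* central is on it.
    central = {addr.lower() for addr in central_figures} - {target}

    dates = []
    for msg in messages:
        participants = {msg["sender"].lower()}
        participants.update(r.lower() for r in msg["recipients"])
        if target in participants and participants & central:
            dates.append(msg["date"])

    if not dates:
        return None
    return min(dates), max(dates)


# Hypothetical sample data from the custodians already collected.
messages = [
    {"sender": "alice@example.com", "recipients": ["dana@example.com"],
     "date": date(2019, 3, 4)},
    {"sender": "dana@example.com",
     "recipients": ["bob@example.com", "carol@example.com"],
     "date": date(2019, 6, 18)},
    # Involves the new custodian but no central figure: ignored.
    {"sender": "dana@example.com", "recipients": ["erin@example.com"],
     "date": date(2018, 1, 9)},
    # Does not involve the new custodian at all: ignored.
    {"sender": "alice@example.com", "recipients": ["bob@example.com"],
     "date": date(2020, 2, 1)},
]

window = interaction_window(
    messages, "dana@example.com", ["alice@example.com", "bob@example.com"]
)
no_window = interaction_window(
    messages, "frank@example.com", ["alice@example.com"]
)
```

The resulting window is a starting point, not a final collection boundary. In practice you would likely pad it on both sides and confirm it against the restrictions of your case before using it to scope the collection.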
If you can use keywords as part of your selection criteria (both technically and within the restrictions of your case), then this process can be improved further. Formulate search queries that identify all items collected from the existing custodians that also involve the new custodian. Explore those result sets to determine the topics and specific vocabulary used in these materials that are relevant to the case. Use that information to select keywords so that you only collect materials likely to be relevant.
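As a rough illustration of the vocabulary exploration described above, the sketch below counts how many already-collected items involving the new custodian contain each term. Everything in it is an assumption made for the example: the item structure (`participants`, `text`), the small stopword list, the sample data, and the function `candidate_term_counts`. Document frequency is used here only as a crude stand-in for the judgment a reviewer applies when exploring result sets.

```python
import re
from collections import Counter

# A deliberately tiny, hypothetical stopword list; a real one would be longer.
STOPWORDS = {"the", "and", "for", "from", "before", "please", "needs", "off"}


def candidate_term_counts(items, new_custodian):
    """Count, for each term, the number of items involving new_custodian
    in which that term appears at least once.

    `items` is a hypothetical list of dicts with the keys
    'participants' and 'text'.
    """
    target = new_custodian.lower()
    counts = Counter()
    for item in items:
        participants = {p.lower() for p in item["participants"]}
        if target not in participants:
            continue  # only look at items that involve the new custodian
        # Use a set so each term is counted once per item.
        terms = set(re.findall(r"[a-z]{3,}", item["text"].lower()))
        counts.update(terms - STOPWORDS)
    return counts


# Hypothetical items collected from the existing custodians.
items = [
    {"participants": ["alice@example.com", "dana@example.com"],
     "text": "Please review the Falcon pricing memo before the meeting."},
    {"participants": ["dana@example.com", "bob@example.com"],
     "text": "Falcon pricing needs sign-off from the rebate team."},
    {"participants": ["alice@example.com", "bob@example.com"],
     "text": "Lunch on Friday?"},
]

counts = candidate_term_counts(items, "dana@example.com")
# Terms appearing in more than one of the new custodian's items.
recurring = sorted(term for term, n in counts.items() if n > 1)
```

Terms that recur across the new custodian's items are only candidates. Someone familiar with the case still needs to decide which of them reflect relevant topics before they are used as collection keywords.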
In some cases a decision is made to simply collect all data, without any up-front filtering. This brute-force approach is the equivalent of hitting a nail with a sledgehammer. The scope of analysis is left at or near "everything," sometimes because the IT environment has been poorly managed or haphazardly organized. The brute-force approach is typically used when (i) budget or other constraints limit or exclude the use of analytical tools, or (ii) the relevant corpus is deemed to be extremely large relative to the entire available dataset. Compared with working from a subset of the data, the brute-force approach typically requires substantially more hardware to store, index, search and process the data, requires more review staff to analyze it, and is associated with excessive cost, time and effort, and with more mistakes.