Lawyers search for documents for many different reasons. TAR 1.0 systems were primarily used to reduce review costs in outbound productions. As most know, modern TAR 2.0 protocols, which are based on continuous active learning (CAL), can support a wide range of review needs. In our last post, for example, we talked about how TAR 2.0 systems can be used effectively to support investigations.

That isn’t the end of the discussion. There are many ways to use a CAL predictive ranking algorithm to take on other types of document review projects. Here we explore techniques for implementing a TAR 2.0 review for tasks beyond investigations, including opposing party reviews, depo prep and issue analysis, and privilege QC.

Opposing Party Productions

Opposing party reviews are essentially knowledge generation tasks. The objective is to sift through a collection to find particularly relevant documents. Recall (i.e., finding all of the relevant documents) is not as critical as precision (seeing more relevant documents than irrelevant ones) and surfacing more hot documents in the process.

CAL is particularly suited for this task. First, CAL is efficient in the review of sparse collections. And, despite the general responsiveness of opposing party productions, the truly important documents are few and far between. Second, as discussed earlier, CAL is also a superior way to surface hot documents along the way.

There are a few different ways to initiate a CAL review of an opposing party production. With the caveat that the language used by opposing parties will typically differ, client documents may provide a reasonable starting point. Relevant opposing party documents provide even better seeds to initiate a CAL ranking. Often a handful of such documents is available through past communications, or can be found through a modest analytics assessment of the production, and a handful of positive documents is enough to start a CAL review. Otherwise, a CAL review can be initiated with a single synthetic seed detailing precisely what is being sought from the opposing party production.

Once the CAL review begins, there is no special workflow needed to effectively review an opposing party production. CAL will elevate relevant documents, including hot documents, and further minimize the number of irrelevant documents that need to be reviewed along the way. And contextual diversity will ensure coverage across the collection.
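To make the loop concrete, here is a minimal sketch of a CAL-style review started from a single synthetic seed. It is a toy, not how Predict works: the term-weight scorer stands in for a real classifier, contextual diversity is left out, and every name in it (`cal_review`, `judge`, the sample documents) is hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; a real system would use richer features."""
    return text.lower().split()


def train(labeled):
    """Toy model: each term gains 1 per relevant document and loses 1 per irrelevant one."""
    weights = Counter()
    for text, relevant in labeled:
        for term in set(tokenize(text)):
            weights[term] += 1 if relevant else -1
    return weights


def score(weights, text):
    """Sum the weights of the distinct terms in a document."""
    return sum(weights[term] for term in set(tokenize(text)))


def cal_review(collection, seed_text, judge, batch_size=2):
    """Rank, review the top batch, retrain on the new decisions, and repeat."""
    labeled = [(seed_text, True)]      # the synthetic seed is treated as relevant
    unreviewed = dict(collection)
    review_order = []
    while unreviewed:
        weights = train(labeled)       # retrain on every decision made so far
        ranked = sorted(
            unreviewed,
            key=lambda doc_id: (-score(weights, unreviewed[doc_id]), doc_id),
        )
        for doc_id in ranked[:batch_size]:
            text = unreviewed.pop(doc_id)
            labeled.append((text, judge(doc_id)))  # reviewer decision feeds the model
            review_order.append(doc_id)
    return review_order


# Hypothetical opposing party production and reviewer judgments.
collection = {
    "d1": "pricing agreement with competitor on widgets",
    "d2": "lunch menu for friday",
    "d3": "competitor call about widget pricing floor",
    "d4": "parking garage closed friday",
    "d5": "fantasy football league update",
    "d6": "quarterly widget pricing memo to sales",
}
relevant = {"d1", "d3"}
seed = "emails discussing pricing agreement with competitor"

order = cal_review(collection, seed, judge=lambda doc_id: doc_id in relevant)
print(order)
```

In this toy run the two relevant documents are reviewed first, which is the behavior that makes CAL attractive for sparse collections: reviewers can stop once the ranking stops surfacing anything useful.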

If, at any point, there is a desire to switch gears and truly focus on finding hot documents, it’s easy with Predict. Just spin up a new Predict project ranking on the HotDoc field. Every decision to that point will be used to train the CAL algorithm, and Predict will begin to surface hot documents preferentially over even generally responsive documents.

Depo Prep and Issue Analysis

Preparing for multiple witness depositions and researching multiple issues are both knowledge generation tasks, and they follow a similar workflow. And both tasks often suffer from low richness within the larger responsive collection, which makes CAL particularly useful.

In both cases, the setup follows the same approach. The typical coding approach is to structure the witness list or issue list as a multi-value field to allow reviewers to select more than one value (witness or issue) for each document. To get even more granular in the coding schema, and even further improve the effectiveness of CAL, each witness or issue can be set up as a separate binary (yes-no) field.

Using either structure, creating a separate Predict project for each issue or witness will ensure multiple simultaneous, independent rankings. That way, ranking and review for every witness and issue will be focused. Each review can then be conducted simultaneously by multiple reviewers, or sequentially by a single reviewer.

This approach will not prevent reviewers from coding the full spectrum of witnesses and issues pertinent to a particular document. Rather, while every reviewer will see documents ranked independently given their specific objective, they will be able to code documents for other issues and witnesses when appropriate. Doing so will correspondingly improve those other rankings.
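One way to picture why a single coding decision improves several rankings at once is to expand the multi-value field into one yes-no training set per witness or issue. The sketch below is illustrative only; `labels_by_issue` and the sample coding are hypothetical, and it assumes that a reviewed document with a value left unselected counts as a "no" for that value.

```python
def labels_by_issue(coding, issues):
    """Expand a multi-value coding field into one binary label set per issue.

    coding: doc_id -> set of issues or witnesses the reviewer selected.
    Returns: issue -> {doc_id: True/False}, one training set per ranking project.
    Assumption: an unselected value on a reviewed document is a negative label.
    """
    return {
        issue: {doc_id: issue in selected for doc_id, selected in coding.items()}
        for issue in issues
    }


# Hypothetical reviewer decisions on three documents.
coding = {
    "d1": {"pricing", "witness_smith"},
    "d2": set(),
    "d3": {"witness_smith"},
}

labels = labels_by_issue(coding, ["pricing", "witness_smith"])
for issue, decisions in labels.items():
    print(issue, decisions)
```

A reviewer working the pricing ranking who also tags `d1` for the Smith deposition has, in effect, added a positive example to the Smith project without ever opening it.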

Privilege and Privilege QC

Privilege assessment is a protection task, regardless of whether it is an initial privilege review to locate privileged documents among a group of unreviewed documents slated for production, or a quality control measure to ensure that documents coded as not being privileged are indeed not privileged. In both cases, CAL can be an effective tool in preventing inappropriate production and disclosure.

CAL has its primary utility as an initial privilege review technique when documents are being produced without an eyes-on review, in situations such as second requests and subpoenas. In that case, the goal is to effectively locate and withhold all of the privileged documents, without reviewing the bulk of the collection.

Certainly, in most instances, analytics will be used to isolate obviously privileged communications exchanged with counsel. Any privileged documents discovered in this analytics phase can then be used as seed documents to initiate the CAL ranking for further privilege review.

The effectiveness of a CAL tool during this initial privilege review will then depend largely on the features that are used to inform the CAL algorithm. If email header information (To, From, domains, etc.) is included in the feature set, the CAL algorithm may have the ability to discern the identity of individuals making and breaking privilege, and rank documents for review accordingly. Otherwise, the algorithm will be constrained to ranking documents based purely on content text. Text-based ranking is nevertheless critical to an effective privilege review, because privileged communications may be subsequently distributed internally, without any reference to counsel. Assuming the general content of the text has been coded as privileged (presumably in the original communication with counsel), a CAL tool will then elevate similar documents as potentially privileged.
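The difference between the two feature sets can be sketched as follows. This is a simplified illustration, not the feature extraction any particular tool uses; the `features` function, the field names, and the sample emails are all hypothetical.

```python
def features(doc, use_headers=True):
    """Return a set of features for one email.

    Body tokens are always included. With use_headers=True, prefixed
    sender, recipient, and domain features are added as well.
    """
    feats = set(doc["body"].lower().split())
    if use_headers:
        for field in ("from", "to"):
            for address in doc.get(field, []):
                address = address.lower()
                feats.add(f"{field}:{address}")
                feats.add(f"domain:{address.split('@')[-1]}")
    return feats


# Hypothetical original advice from counsel, then an internal forward.
original = {
    "from": ["gc@lawfirm.example"],
    "to": ["ceo@client.example"],
    "body": "Our advice on the merger indemnity clause",
}
forward = {
    "from": ["ceo@client.example"],
    "to": ["cfo@client.example"],
    "body": "FYI advice on the merger indemnity clause",
}

shared_text = features(original, use_headers=False) & features(forward, use_headers=False)
print(sorted(features(original)))
print(sorted(shared_text))
```

The header features tie the original message to counsel, while the internal forward carries no trace of the law firm at all. Only the overlapping body text connects the two, which is why text-based ranking still matters for privilege.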

Beyond this, the QC algorithms incorporated into a Predict review provide one final defense to privilege disclosure, particularly in a traditional production review. In that situation, every document being produced has been reviewed and coded, inter alia, for privilege. Spinning up a Predict project on privilege, then, will rank the entire collection by the likelihood of each document being privileged. Further, algorithmic QC will rank every document coded as “not privileged” by the likelihood that it is, in fact, privileged. So, the top-ranked documents actually look like they are privileged even though they are coded as “not privileged.” Reviewing the top-ranked documents in this ranking will provide a final measure of assurance that privileged documents are not being produced.
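The QC ranking described above amounts to filtering on the human coding and sorting on the model's score. The sketch below assumes each document already carries a privilege score from some ranking model; the function name, field names, and scores are hypothetical and do not reflect Predict's internals.

```python
def privilege_qc_queue(docs, top_n=None):
    """Return documents coded "not privileged", most privileged-looking first.

    docs: list of dicts with a human decision ("coded_privileged") and a
    model score ("priv_score", higher means more likely privileged).
    top_n: optional cap on the queue length; None returns every candidate.
    """
    candidates = [doc for doc in docs if not doc["coded_privileged"]]
    candidates.sort(key=lambda doc: doc["priv_score"], reverse=True)
    return candidates[:top_n]


# Hypothetical reviewed production set with model scores attached.
docs = [
    {"id": "d1", "coded_privileged": True, "priv_score": 0.97},
    {"id": "d2", "coded_privileged": False, "priv_score": 0.91},
    {"id": "d3", "coded_privileged": False, "priv_score": 0.08},
    {"id": "d4", "coded_privileged": False, "priv_score": 0.64},
]

queue = privilege_qc_queue(docs, top_n=2)
print([doc["id"] for doc in queue])
```

Here `d2` heads the queue: it was coded "not privileged" but scores almost as high as a document that was withheld, so it is the first one worth a second look before production.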

Other Uses

We have no doubt that people will come up with other use cases for CAL-based predictive ranking. We have written about two-tailed reviews, where teams focus on both ends of the ranked spectrum. We also believe CAL-like systems will prove useful for other kinds of searches, including government records inquiries and patent research, with the focus being on using good documents rather than assumed keywords to build out better searches against all kinds of documents.

To learn more about TAR 2.0, TAR for Smart People, Third Edition, is available in print and in a downloadable PDF format.