“Big data” is a term that gets tossed around a lot these days, with different meanings in different contexts. But for in-house counsel managing corporate e-discovery, big data’s implications are clear. As litigation-related data volumes soar, so do the time and effort required for a company’s legal team to review it all and comply with discovery requirements.
As in-house counsel strive to control litigation costs, they are managing their caseloads more actively than ever before. One increasingly common way they are doing that is by encouraging their outside counsel to use technology-assisted review, or TAR, in appropriate cases.
Big data in e-discovery
Think of TAR as the antidote to big data in e-discovery. Data volumes in litigation have exploded far beyond the capacity of legal teams to review manually. TAR has become essential for its ability to prioritize documents for review and even eliminate many documents from review. When a corporation’s outside firms use TAR, they can reduce review costs by half or more in many cases.
Use of TAR — also sometimes called predictive coding, predictive ranking or computer-assisted review — has grown significantly in the last two years. That is due, in large part, to U.S. Magistrate Judge Andrew J. Peck. On Feb. 24, 2012, Judge Peck issued the first judicial opinion anywhere to endorse the use of TAR in e-discovery.
The impact was significant. Just two months earlier, Judge Peck himself had written in a legal trade magazine: “While anecdotally it appears that some lawyers are using predictive coding technology, it also appears that many lawyers (and their clients) are waiting for a judicial decision approving of computer-assisted review.”
Judge Peck gave them the green light they were waiting for. In the two years since his decision, TAR has gained wide acceptance as a discovery tool, particularly in large and complex cases.
Reducing time, cost of review
For in-house counsel, the appeal of TAR is its ability to deliver substantial savings on review costs. Studies have shown that the use of TAR can reduce document populations dramatically, often by more than 50 percent. For a larger corporation, that can save millions on its annual legal spend.
To understand the significance of that, consider how big data has changed the nature of discovery.
For as long as courts have mandated discovery, lawyers have had to review documents to find those that are relevant, those that are not, and those that are privileged or confidential.
It was traditionally a manual process. Attorneys eyeballed each document and decided whether it should be produced. But as paper was replaced by data, and as accumulations of data grew into big data, it was no longer feasible to review each document manually.
Today, it is common for major corporate lawsuits to involve gigabytes or even terabytes of electronic information. To help sift through it all, attorneys rely on technology. Their objective, as Judge Peck observed in his decision, is “to identify as many relevant documents as possible while reviewing as few non-relevant documents as possible.”
The technology attorneys most commonly use is keyword searching. But many attorneys fail to realize that keyword searches are far less precise and comprehensive than they might think. As Judge Peck put it, the way lawyers choose keywords “is the equivalent of the child’s game of ‘Go Fish.’”
Judge Peck’s decision cited a 1985 study by scholars David Blair and M.E. Maron. Experienced searchers were instructed to use keywords to retrieve at least 75 percent of relevant documents from a collection of 40,000. Although the searchers believed they had succeeded, their actual recall was just 20 percent.
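The gap between perceived and actual recall is easier to see with numbers. The following is a minimal sketch of how recall is calculated. The document counts are hypothetical, chosen only to mirror the 75 percent and 20 percent figures above; they are not taken from the study itself.

```python
def recall(relevant_retrieved: int, total_relevant: int) -> float:
    """Share of all relevant documents that a search actually found."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    return relevant_retrieved / total_relevant


# Hypothetical counts (assumptions for illustration only).
total_relevant = 1_000      # relevant documents that exist in the collection
found_by_keywords = 200     # relevant documents the keyword search returned

believed_recall = 0.75      # what the searchers thought they had achieved
actual_recall = recall(found_by_keywords, total_relevant)
missed = total_relevant - found_by_keywords

print(f"Believed recall: {believed_recall:.0%}")
print(f"Actual recall:   {actual_recall:.0%}")
print(f"Relevant documents never seen: {missed}")
```

In this illustration, four out of every five relevant documents never reach a reviewer, even though the search appears to have worked.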
“Computer-assisted review appears to be better than the available alternatives, and thus should be used in appropriate cases,” Judge Peck concluded.
While computer-assisted review is not perfect, he added, court rules do not require perfection. The overarching goal is to “secure the just, speedy, and inexpensive determination” of lawsuits.
Next generation of TAR
Even as the use of TAR has continued to expand, the technology that underlies it has continued to evolve and improve.
Without question, even first-generation TAR systems were a major advance over manual or keyword searching. However, early systems had a number of shortcomings that limited their usefulness in many real-world contexts.
For one, first-generation TAR systems required that a senior attorney be involved in training the system. The senior attorney would have to review and code hundreds or thousands of random documents until the system stabilized. Not only did that drive up the cost of using TAR, but it frequently delayed the start of the review itself.
Another shortcoming in the early systems was that they required legal teams to have all their documents at the start. If a subsequent batch of documents arrived, the training would have to begin all over again.
The most significant advance in TAR technology has been the introduction of a new protocol called Continuous Active Learning. The key difference between CAL and older forms of TAR is that it continues to “learn” from decisions made throughout the document review.
In July, two of the leading authorities on e-discovery, Maura Grossman and Gordon Cormack, published a peer-reviewed study comparing CAL against other TAR methods. They concluded that CAL yielded “generally superior results” to other TAR systems and required “substantially and significantly less human review effort.”
What does this mean for your cases? For one, it eliminates the need for senior attorneys to spend time training the TAR system, as was traditionally required. Instead, with CAL, the review team can simply begin reviewing documents. As they make coding calls, the system will continuously learn and improve its results.
Also, because the system is continually learning and refreshing its rankings, new documents can be added at any time. This conforms to the way litigation occurs in the real world, where discovery documents typically arrive on a rolling basis.
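For readers who want to see the idea in miniature, the loop described above can be sketched in a few lines of code. This is a toy illustration, not the method used in the Grossman and Cormack study or in any commercial product. The class name, the simple word-weight scoring, the sample documents and the `truth` table standing in for human reviewers are all assumptions made for the example.

```python
class ContinuousActiveLearner:
    """Toy CAL loop: rank, review a batch, learn from the coding calls, repeat."""

    def __init__(self):
        self.weights = {}     # word -> learned weight
        self.unreviewed = {}  # doc_id -> set of words
        self.coded = {}       # doc_id -> True (relevant) or False

    def add_documents(self, docs):
        """Documents may arrive at any time; they are ranked in the next round."""
        for doc_id, text in docs.items():
            self.unreviewed[doc_id] = set(text.lower().split())

    def score(self, doc_id):
        return sum(self.weights.get(w, 0.0) for w in self.unreviewed[doc_id])

    def next_batch(self, size):
        """Highest-scoring unreviewed documents first; ties broken by id."""
        ranked = sorted(self.unreviewed, key=lambda d: (-self.score(d), d))
        return ranked[:size]

    def record(self, doc_id, is_relevant):
        """Store a reviewer's coding call and update the word weights."""
        words = self.unreviewed.pop(doc_id)
        self.coded[doc_id] = is_relevant
        delta = 1.0 if is_relevant else -1.0
        for w in words:
            self.weights[w] = self.weights.get(w, 0.0) + delta


# Hypothetical documents and reviewer decisions (assumptions for illustration).
first_wave = {
    "d1": "merger pricing memo",
    "d2": "lunch menu friday",
    "d3": "pricing agreement draft",
    "d4": "holiday party invite",
}
second_wave = {
    "d5": "merger agreement pricing terms",
    "d6": "parking garage notice",
}
truth = {"d1": True, "d2": False, "d3": True,
         "d4": False, "d5": True, "d6": False}

learner = ContinuousActiveLearner()
learner.add_documents(first_wave)
review_order = []

# Round 1: nothing has been learned yet, so the team simply starts reviewing.
for doc_id in learner.next_batch(2):
    learner.record(doc_id, truth[doc_id])
    review_order.append(doc_id)

# A later production arrives mid-review; no retraining step is needed.
learner.add_documents(second_wave)

# Round 2: the ranking now reflects the coding calls made in round 1.
for doc_id in learner.next_batch(2):
    learner.record(doc_id, truth[doc_id])
    review_order.append(doc_id)

print(review_order)
print(sorted(learner.unreviewed))
```

In this small example, the second round surfaces the two remaining relevant documents, including one from the late-arriving batch, and leaves only non-relevant documents unreviewed. Real systems use far more sophisticated ranking models, but the rank, review and relearn cycle is the same in outline.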
Benefits for in-house counsel
Corporate litigation costs are soaring. One recent survey found that, in 2013 alone, litigation spending increased substantially, with 71 percent of all U.S. companies spending more than $1 million and 43 percent of larger companies spending more than $10 million.
Half or more of a corporation’s litigation costs are attributable to discovery, and the most expensive phase of the discovery process is review, estimated to be as much as 73 percent of the overall discovery cost.
TAR reduces the number of documents your legal team must review. It ranks documents so that those most likely to be relevant are reviewed first, while non-relevant documents go to the bottom of the pile. Instead of examining every document, as manual review requires, the team needs to read only a significantly smaller percentage of the collection.
That, in turn, reduces both the time it takes to complete the review and its overall cost. The larger the document collection involved in the case, the greater the savings to the corporation.
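A back-of-the-envelope calculation shows how these percentages combine. The sketch below uses the shares cited in this article (discovery at half of litigation spend, review at 73 percent of discovery, a 50 percent reduction from TAR). The $10 million annual spend is a hypothetical figure, and the calculation assumes that review cost falls in proportion to the number of documents reviewed, which will not hold exactly in practice.

```python
def estimate_review_savings(litigation_spend: int,
                            discovery_pct: int,
                            review_pct: int,
                            reduction_pct: int) -> dict:
    """Rough estimate in whole dollars; percentages are integers (50 = 50%)."""
    discovery_cost = litigation_spend * discovery_pct // 100
    review_cost = discovery_cost * review_pct // 100
    savings = review_cost * reduction_pct // 100
    return {
        "discovery_cost": discovery_cost,
        "review_cost": review_cost,
        "savings": savings,
    }


# Hypothetical annual litigation spend (assumption for illustration).
estimate = estimate_review_savings(
    litigation_spend=10_000_000,
    discovery_pct=50,   # "half or more" of litigation costs
    review_pct=73,      # "as much as 73 percent" of discovery cost
    reduction_pct=50,   # review costs cut "by half or more in many cases"
)

for label, amount in estimate.items():
    print(f"{label}: ${amount:,}")
```

Under these assumptions, a company spending $10 million a year on litigation would spend about $3.65 million on review and could save roughly $1.8 million of it.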
Another advantage of TAR for legal departments is that it facilitates early case assessment. Using TAR to find key documents quickly, corporate counsel get a better understanding of the case and of its strengths and weaknesses. That enables better planning, both in terms of budget and litigation strategy.
There is no turning back from big data. Exploding data volumes have forever changed the litigation equation for corporations. As in-house counsel struggle to control discovery costs, TAR is an essential tool for helping them do so.
John Tredennick is the founder and CEO of Catalyst Repository Systems, which designs, hosts and supports document repositories for large-scale discovery and regulatory compliance. A former trial lawyer and litigation partner with a large national law firm, he has written or edited five books and numerous articles on litigation and technology issues. He served as chairman of the American Bar Association Law Practice Management Section and editor-in-chief of its flagship magazine.