For years, technology experts and attorneys have been predicting the rise of computer-assisted coding and review for electronic discovery.
U.S. Magistrate Judge Andrew J. Peck of the Southern District of New York recently became the first to issue a reported opinion in support of the technology, also referred to as “predictive coding” or “intelligent review,” calling it “an acceptable way to search for relevant [electronically stored information] in appropriate cases.”
“Computer-assisted review appears to be better than the available alternatives, and thus should be used in appropriate cases. While this court recognizes that computer-assisted review is not perfect, the Federal Rules of Civil Procedure do not require perfection,” Peck wrote in Da Silva Moore v. Publicis Groupe.
“What the Bar should take away from this opinion is that computer-assisted review is an available tool and should be seriously considered for use in large-data-volume cases where it may save the producing party (or both parties) significant amounts of legal fees in document review,” he said.
Michelle Lange, a staff attorney at Kroll OnTrack, a computer forensics company that specializes in electronic evidence, said attorneys and other e-discovery experts have been eager for official approval of the technique.
“There has been a lot of reluctance to be the guinea pig and depart from the gold standard of having a lawyer’s eyes on every document in a document review,” Lange said. “People have been waiting for the bench to officially bless the use of this technology, and the wait is now over.”
Dispute over protocol
The issue arose in a suit brought by five female plaintiffs alleging that public relations company MSL Group and its parent company, Publicis Groupe, engaged in systemic, company-wide gender discrimination in violation of Title VII and New York state law.
Initial discussions about discovery revealed an estimated 3 million electronic documents from the agreed-upon custodians that needed to be reviewed.
Both parties expressed interest in computer-assisted review. But issues remained to be hashed out as part of the protocol, including the sources of the electronically stored information — or ESI — and the sampling protocol.
MSL proposed drawing a random sample of 2,339 documents from the entire email collection and coding each as relevant or not relevant, producing a “seed set” to train the software.
Working from the seed set, the software identifies properties of the documents that it uses to code other documents, and through a series of sample rounds overseen by a senior reviewer, who is typically a senior partner, the computer learns to predict the reviewer’s coding.
The system is deemed ready to take on the entire document set when the software’s predictions and the reviewer’s coding sufficiently coincide (in this case, a 95 percent confidence level was established).
MSL proposed using seven rounds to test the software, reviewing at least 500 documents per round. MSL also agreed to show the plaintiffs all the documents, even those deemed not relevant, throughout the process.
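The round-by-round process described above can be sketched in a few lines of Python. This is only an illustration, not the software used in the case: the term-weight "model," the function names and the use of a simple agreement rate as the stopping test are all assumptions made for the example, and the article's "95 percent confidence level" is treated here as an agreement threshold purely for simplicity.

```python
import random
from collections import Counter

def train(coded_docs):
    """Toy model (an assumption): +1 per relevant doc containing a term, -1 otherwise."""
    weights = Counter()
    for terms, relevant in coded_docs:
        for term in terms:
            weights[term] += 1 if relevant else -1
    return weights

def predict(weights, terms):
    """Predict 'relevant' when the summed term weights are positive."""
    return sum(weights[t] for t in terms) > 0

def run_rounds(collection, reviewer, seed_size, round_size,
               max_rounds, target_agreement, rng):
    """Train on a random seed set, then test in sample rounds.

    Stops early once the model's predictions agree with the reviewer's
    coding on at least `target_agreement` of a round's documents.
    Returns the agreement rate observed in each round.
    """
    seed = rng.sample(collection, seed_size)
    coded = [(doc, reviewer(doc)) for doc in seed]
    history = []
    for _ in range(max_rounds):
        weights = train(coded)
        batch = rng.sample(collection, round_size)
        labels = [reviewer(doc) for doc in batch]
        agree = sum(predict(weights, d) == y for d, y in zip(batch, labels))
        history.append(agree / round_size)
        if history[-1] >= target_agreement:
            break
        # The reviewer's coding of this round feeds the next round of training.
        coded.extend(zip(batch, labels))
    return history
```

With the figures reported here, a call might look like `run_rounds(collection, reviewer, 2339, 500, 7, 0.95, random.Random())`, where `collection` is a list of documents (each a tuple of terms) and `reviewer` is a function standing in for the senior reviewer's relevance call.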
While the plaintiffs expressed concern about the reliability of MSL’s methodology, the court said such objections were premature.
“[P]laintiffs will see how MSL has coded every e-mail used in the seed set (both relevant and not relevant), and the court is available to quickly resolve any issues,” Peck wrote.
Further, the judge said, the parties needed more information: how many relevant documents would be produced and at what cost to MSL, whether the case would remain limited to the named plaintiffs or be broadened to a class action, and whether any “hot” or “smoking gun” documents would be found that required the software to be retrained. All of those issues could be examined only as the protocol unfolded.
“These types of questions are better decided ‘down the road,’ when real information is available to the parties and the court,” he said.
The parties also disagreed about MSL’s plan to review and produce only the top 40,000 documents. Peck rejected that proposal as an arbitrary cutoff point.
Where the “line will be drawn [as to review and production] is going to depend on what the statistics show for the results,” he said. “And if stopping at 40,000 is going to leave a tremendous number of likely highly responsive documents unproduced, [MSL’s proposed cutoff] doesn’t work.”
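One way to picture what "the statistics" might show is to sample the documents that fall below a proposed cutoff and estimate how many relevant ones would go unproduced. The sketch below is a hypothetical illustration, not a method described in the opinion: the function name, the normal-approximation interval and any sample figures plugged into it are assumptions.

```python
import math

def estimate_missed(sample_relevant, sample_size, docs_below_cutoff, z=1.96):
    """Estimate relevant documents left below a cutoff from a random sample.

    sample_relevant   -- sampled below-cutoff documents coded relevant
    sample_size       -- total documents sampled from below the cutoff
    docs_below_cutoff -- number of documents that would not be reviewed
    z                 -- z-score for the interval (1.96 is roughly 95 percent)

    Returns (point_estimate, low, high) as document counts, using a
    normal approximation to the sampled proportion.
    """
    p = sample_relevant / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    low = max(0.0, p - margin) * docs_below_cutoff
    high = min(1.0, p + margin) * docs_below_cutoff
    return p * docs_below_cutoff, low, high
```

For example, with roughly 3 million documents and a 40,000-document cutoff, about 2,960,000 documents would sit below the line; calling `estimate_missed(50, 500, 2_960_000)` with a made-up sample of 500 documents, 50 of them relevant, would show whether the cutoff leaves a large number of responsive documents behind.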
Parties working together
Andrew Cosgrove, an attorney at Redgrave LLP in Minneapolis, noted that the opinion focuses more on the process of meeting and conferring, agreeing to a protocol and exchanging information, than it does on the technology itself.
“The most important takeaway is how the court paid quite a bit of attention to the parties working together, building a protocol they both can live with, and doing the vast majority of the work outside the context of hearings and the courtroom,” said Cosgrove, who has an information law practice focusing on e-discovery, information management, privacy and data protection.
The message to litigants is to seriously consider computer-assisted e-discovery.
“The real strong undercurrent of this case is the obligation between the parties not to dismiss these types of options but to discuss, collaborate and be as transparent as you can be,” Cosgrove said.
But Robert Brownstone, technology and e-discovery counsel and co-chair of the electronic information management practice group at Fenwick & West in Silicon Valley, Calif., warned that some parties are concerned about aspects of the cooperation involved in sharing the seed set.
“A lot of in-house lawyers and litigants feel very strongly that those kinds of tidbits of information — particularly the non-relevant search terms — are covered by the attorney work product privilege,” he said.
And while the decision has been hailed as an embrace of new technology, “this isn’t a one-size-fits-all endorsement of this technology being used in every case,” Lange said.
Application to smaller cases
Lange predicted it will be three to five years before computer-assisted review trickles down to smaller cases.
However, she said, “it is only a matter of time before solos see this in their own practice.”
Cosgrove agreed, adding that the growth of information shared by society generally will result in an increase in electronically stored information in all cases.
“Today we talk about millions of documents in cases, but in the future we will be looking at tens of millions or hundreds of millions of documents,” he said. “That inevitability has the courts looking to potential solutions for countering this information growth in the litigation context. And traditional methods of brute force — with hundreds of review attorneys reading every page — are proving to be absurd.” That, he said, is forcing both the bar and the bench to turn increasingly to technology for support.
Brownstone noted that over the last few years, state bar ethics committees have issued opinions stating that lawyers must use reasonable care and due competence to keep abreast of current technology.
While the directives are typically focused on information security or storage of client data, “it’s fair to say that with this bell-ringer of an opinion by Judge Peck, litigators need to make sure to stay abreast of current [ESI] search technology,” Brownstone cautioned.
Even a sole practitioner who is not ready to use computer-assisted review will need to know how to respond when the other side proposes such a protocol or asks the attorney to take part in one, he said.