Federal Court In Washington Sanctions Attorney For Citing “Badly Out Of Date” Case Law

Defense counsel was sanctioned by a federal court in Washington for bringing a motion to compel in bad faith, with the court finding that defense counsel’s citation of case law analyzing a prior version of the Federal Rules of Civil Procedure was “inexcusable.”

In Fulton v. Livingston Fin. LLC, (W.D. Wash.), Defendants cited case law analyzing the version of Federal Rule of Civil Procedure 26(b)(1) that existed before the widely publicized amendments to that Rule took effect near the end of 2015. Rule 26 governs the type of information that should be produced in discovery, and includes a list of factors for attorneys to weigh when determining whether a discovery request is proportional to the needs of the case.

In Fulton, Defendants had moved either to compel discovery or exclude certain medical evidence Plaintiff wished to present at trial. Initially, the court found that Defendants’ argument for seeking this information was unreasonable and constituted a misrepresentation to the court. The court noted that the amendments to Federal Rule of Civil Procedure 26(b)(1) “dramatically changed” what information is discoverable, and therefore defense counsel’s citations to outdated case law were a misrepresentation that warranted sanctions.

Although defense counsel argued that the amended version of Rule 26 may not be applicable, the court found that the amendments apply to all proceedings pending as of the effective date of the new Rule insofar as it is “just and practicable.”  Therefore, the court concluded that defense counsel had misrepresented the scope of discoverable information when he failed to note the proportionality standard, but instead used the outdated law.

The court then ruled that it had an inherent power to sanction if it specifically found bad faith or conduct tantamount to bad faith, regardless of whether a specific court order was violated.  Because the court found that defense counsel recklessly misrepresented the law and the facts to the court in an effort to limit Plaintiff’s evidence at trial, sanctions were appropriate.

The sanctions were significant as they came in two forms.  First, the court imposed a cost and fees award, and also directed the sanctioned attorney to provide the offending briefing to his senior firm counsel along with an explanation that he was being sanctioned for citing “badly out of date” law.  Second, the attorney was subject to a potential future disclosure requirement designed to deter future transgressions.  These punishments, “strikes one and two”, were imposed to punish counsel as well as sufficiently inform future courts to properly sanction any further bad faith conduct.

This case vividly demonstrates how important it is for attorneys to stay abreast of changes in the Federal Rules of Civil Procedure as well as substantive law, or else face the real possibility that courts will punish those attorneys who fail to do so.

Federal Court In Washington Denies Motion To Compel Restoration Of Backup Tapes

A federal court in Washington recently denied a motion to compel the production of archived emails stored on backup tapes, rejecting the plaintiff’s argument that the defendant’s culpability in failing to preserve the emails in a more accessible format outweighed the burden and cost to the defendant of restoration.

In Elkharwily v. Franciscan Health System (W.D. Wash.), the defendant did not maintain an email archive on its servers; rather, it saved its employees’ emails to physical backup tapes on a monthly basis. Without restoring the backup tapes, the only emails accessible to the defendant were those that remained in the relevant custodians’ live email accounts; in other words, the emails that the users had not deleted from their inboxes.

Under Federal Rule of Civil Procedure 26(b)(2)(B), “a party need not provide discovery … [when] not reasonably accessible because of undue burden or cost.” The producing party has the burden of showing undue burden and cost, but, upon making that showing, Rule 26(b)(2)(B) places the burden on the requesting party to show that good cause exists to produce the ESI in spite of the burden and cost.

The defendant in Elkharwily did not dispute that its backup tapes would contain at least some emails that were discoverable under Rule 26(b)(1), but it argued that obtaining those emails would impose undue burden and cost. Specifically, the defendant estimated that retrieving, restoring, and reviewing the emails archived on the backup tapes would cost $157,500.

The plaintiff did not contest the defendant’s cost estimate or otherwise dispute the defendant’s burden and cost arguments. Rather, the plaintiff argued that the defendant was at fault for the high cost of restoring the backup tapes, because the defendant should have preserved the emails in a more readily accessible format, starting from the first time that the plaintiff contended he had put the defendant on notice of the possibility of litigation.

The court found the plaintiff’s blame-shifting to be unpersuasive, particularly in light of a dispute over when the defendant was on notice of the potential for litigation, for which the court found the defendant’s explanation to be more credible. What’s more, the plaintiff could not say with certainty what the emails on the backup tapes would show, or that any would even be responsive to the plaintiff’s discovery requests. Accordingly, the court found that the plaintiff did not show good cause under Rule 26(b)(2)(B), and the court denied the motion to compel production of emails on the backup tapes. The court did allow the plaintiff to continue to pursue discovery of those emails, but only at his own expense.

This case provides important lessons for parties requesting ESI and for parties producing it. Parties requesting ESI should carefully examine cost estimates when faced with cost and burden objections, because the burden shifting that occurs following a showing of undue burden and cost makes a motion to compel much less likely to succeed. For parties producing ESI, this case provides yet another reminder that objections to discovery that would require restoration of physical backup tapes are often successful when they are supported by objective and credible cost estimates.


Unsupported And Exaggerated Assertions Regarding The Burden Of Production Will Not Persuade The Court

By now we’re all familiar with the language recently implemented in the Federal Rules of Civil Procedure, providing employers with some protection against unreasonable demands related to ESI: “A party need not provide discovery of electronically stored information from sources that the party identifies as not reasonably accessible because of undue burden or cost.”  Fed. R. Civ. P. 26(b)(2)(B).  Rule 26(b)(2)(B) further provides that the objecting party must show that the requested production is unduly costly or otherwise burdensome.  What is minimally required to establish that cost or burden likely varies by court but one recent case provides beneficial guidance on what is not sufficient.

In Mitchell v. Reliable Security, LLC, 2016 U.S. Dist. LEXIS 76128 (N.D. Ga. May 24, 2016), the plaintiff in a pregnancy discrimination case asked for the employer’s ESI production of relevant e-mails and spreadsheets, to be produced in native file format. The employer objected, claiming that it would be more expensive – by $3,000 – to produce the requested documents in their native format than to convert and produce the documents as PDFs. This contention itself is a head-scratcher. Why would it be more expensive to produce documents in their original format than to convert them? Not surprisingly, the employer did not provide any substantive explanation for this purported cost, a fact the court noted the plaintiff was quick to point out:

Defendant’s statement regarding the estimated additional costs to produce native files rather than PDF files is insufficient because Defendant did not explain how it arrived at the estimated cost it provided, did not provide an actual estimate from an ESI expert or vendor, and did not explain its contention that production of emails and spreadsheets in native format would require more paralegal time to manage the production of native emails; because defense counsel’s own marketing communications suggest that it employs discovery management software commonly used to streamline ESI production; because there are other free or low-cost means of production of the native files; and because Plaintiff’s counsel has offered to assist in downloading emails in electronic format to minimize costs and avoid the retention of an expert or vendor to do the same.

The court found the plaintiff’s position persuasive, noting that it was “at a loss to understand why the production of native documents is more costly than production of PDF files” and ordering the employer to produce the files in native format as requested.

While this case underscores the importance of detailing how and why producing requested ESI would be too costly or otherwise burdensome, it also demonstrates that, typically, it’s not worth fighting over the form of production. If the opposing party wants the documents in native format, produce them that way. If they would rather have PDF documents, give them that. While on occasion there may be a valid reason for objecting to production in native format (for example, the need for redaction – not an issue in the instant case – which cannot be accomplished in a native document), most of the time it will not be worth the time and expense (and, as happened here, the disgruntlement of the court) to fight over format.

Federal Court In Virginia Rejects Defendant’s Proportionality Argument

A federal court in Virginia recently granted a plaintiff’s motion to compel the defendant to search its computer systems for electronically stored information, rejecting the defendant’s argument that the requested ESI was “inaccessible” due to burden and cost and that the requested discovery was not proportional to the needs of the case. In Wagoner v. Lewis Gale Med. Ctr., LLC (W.D. Va.), the plaintiff sued defendant Lewis Gale Medical Center alleging that he was discriminated against on the basis of his disability in violation of the Americans with Disabilities Act. The plaintiff had requested that the defendant search for the ESI maintained by two custodians over a four-month period and proposed a list of 14 search terms. The defendant contended that it would cost over $20,000 just to collect the requested ESI, based on a cost estimate obtained from a third-party vendor. The defendant further estimated an additional $24,000 in costs to review the ESI prior to production, for a total estimated cost of roughly $45,000. Based on these cost estimates, the defendant argued that the discovery should not be permitted because it was not proportional to the needs of the case in light of the plaintiff’s “limited potential recovery.”

In rejecting the defendant’s proportionality argument and granting plaintiff’s motion to compel, the court noted that the defendant “chose to use a system that did not automatically preserve emails for more than three days, and did not preserve emails in a readily searchable format, making it costly to produce relevant emails when faced with a lawsuit.”  The court also noted that proportionality is not solely a question of whether the particular discovery method is expensive.  The court pointed out that defendant failed to offer any alternative proposals to obtain the requested information other than to have the witnesses search their own computers for potentially responsive information, which the court found was insufficient.

The court also held that the defendant failed to carry its burden of demonstrating that the ESI was “inaccessible,” noting that the mere fact that defendant could not conduct the ESI searches in house and would be forced to hire a vendor did not render the data inaccessible. The court also noted that the estimate provided by the defendant’s proposed ESI vendor “seems exceedingly high” since the requested ESI was limited to two custodians over a four-month timeframe.

Lastly, the court also rejected the defendant’s request for cost shifting in light of the fact that “the ESI sought is reasonably accessible without undue burden or expense.”

This case serves as an important reminder that broad-brush assertions of undue burden will not be accepted by the court. While it certainly is important to provide the court with objective data and metrics to support an undue burden and/or proportionality argument, it is equally important that those calculations not be overstated.

May A Company “Preserve In Place” To Satisfy Its Preservation Obligations?

A question that often arises is whether to physically collect/copy a person’s e-mail account once that person is placed on a litigation hold. Rather than copy the e-mail account, many companies will simply turn off the “auto-delete” function and issue the employee a preservation notice. By doing so, the company essentially is preserving in place. Although some may question this method of preservation, it is no different than if an employee had a box of potentially relevant documents in his office and, rather than make a copy of the box of documents, the employee was instructed to continue to hold the box in his office until further notice. In most situations, this should be sufficient to satisfy a company’s preservation obligations.

One caveat, however, relates to a custodian who may have an incentive not to preserve (e.g., the alleged harasser).  In this type of situation, i.e., where you have the proverbial fox guarding the hen house, several courts have questioned the wisdom of such a protocol and have sanctioned parties when e-mails were lost.  (See below.)  Therefore, for those central figures in a litigation who may have an incentive not to preserve, a copy should be made of their e-mail accounts.

The following authorities are illustrative of the principles discussed above:

  • Federal Rule of Civil Procedure 37(e): This newly amended rule now provides that “[i]f electronically stored information that should have been preserved in the anticipation or conduct of litigation is lost because a party failed to take reasonable steps to preserve it, and it cannot be restored or replaced through additional discovery, the court [may issue the following sanctions…]” One of the key points of this rule is that the possibility of sanctions hinges on whether the party took “reasonable steps” to preserve information. Therefore, the rule contemplates that information can still be lost or deleted, but if reasonable – but not perfect – steps were taken to preserve, then sanctions cannot be levied for the loss of that information.
  • The Sedona Principles For Electronic Document Production, Comment 6a: This comment provides that “[r]esponding parties are best situated to evaluate the procedures, methodologies and technologies appropriate for preserving and producing their own electronic data and documents.” The comment cites to Zubulake v. UBS Warburg LLC, 220 F.R.D. 212, 217 (S.D.N.Y. 2003), noting there are various ways to manage electronic documents, and thus, many ways in which a party may comply with its obligations.
  • Green v. Blitz U.S.A., Inc., 2011 U.S. Dist. LEXIS 20353 (E.D. Tex. Mar. 1, 2011):  The court sanctioned a party who relied on custodians to self-collect relevant e-mails for preservation.  Based on the insufficient and lackluster efforts taken to attempt to locate relevant documents, the court found that the party did not take reasonable steps to preserve.
  • Northington v. H&M Intl., 2011 WL 663055 (N.D. Ill. Jan. 12, 2011):  The court held “[I]t is unreasonable to allow a party’s interested employees to make the decision about the relevance of such documents, especially when those same employees have the ability to permanently delete unfavorable email from a party’s system….  Most non-lawyer employees, whether marketing consultants or high school deans, do not have enough knowledge of the applicable law to correctly recognize which documents are relevant to a lawsuit and which are not.  Furthermore, employees are often reluctant to reveal their mistakes and misdeeds.”
  • Jones v. Bremen High School District 228, 2010 WL 2106640 (N.D. Ill. May 25, 2010):  The court chastised the school district for allowing those alleged to have discriminated against the plaintiff to conduct their own review for evidence relating to her claims.  Moreover, even though most of the gaps in the e-mail production were filled by finding missing e-mails in other locations, the court still sanctioned the school district, determining that it was reckless and grossly negligent in its handling of the litigation hold.

At the end of the day, if the process works and no relevant e-mails are lost, a company should not be at risk for sanctions.  However, if e-mails are lost, the main question will be whether the company took reasonable steps to preserve the information.

If the lost e-mails were from a marginal employee with no meaningful stake in the litigation, then the mere fact that a litigation hold was issued to the employee should be enough to protect the company from sanctions.  (In this situation, the company will want to be able to show that the employee received the hold notice, understood what it meant, etc.  To this end, it will be very helpful if companies periodically issue reminder litigation hold notices.)  Alternatively, if the lost emails were in the account of a key figure in the litigation, then the court will look more closely into whether the company’s efforts were reasonable in that specific situation.

Explanation of the Legal Profession’s Remarkably Slow Adoption of Predictive Coding

Well-known predictive coding expert attorney, Maura Grossman, and her husband, noted information scientist, Gordon Cormack, recently began an article in Practical Law magazine with the assertion:

Adoption of TAR has been remarkably slow, considering the amount of attention these offerings have received since the publication of the first federal opinion approving TAR use (see Da Silva Moore v. Publicis Groupe, 287 F.R.D. 182 (S.D.N.Y. 2012)).

Grossman & Cormack, Continuous Active Learning for TAR (Practical Law, April/May 2016).

TAR, which stands for Technology Assisted Review, is their favorite term for what the legal profession commonly calls predictive coding. I remember when our firm attained the landmark ruling in our Da Silva Moore case. I thought it would open the floodgates to new cases. It did not. But it did start a flood of judicial rulings approving predictive coding all around the country, and lately, around the world. See, e.g., Pyrrho Investments v MWB Property, EWHC 256 (Ch) (2/26/16). Judge Andrew Peck’s more recent ruling on the topic contains a good summary of the law. Rio Tinto PLC v. Vale S.A., 306 F.R.D. 125 (S.D.N.Y. 2015). The bottom line is that at this point in time, late May 2016, the Bench is waiting for the Bar to catch up.

Although I am known for my exuberant endorsement of predictive coding, this enthusiasm for new technology to find electronic evidence is still rare in the legal profession. Losey, R., Why I Love Predictive Coding: Making document review fun with Mr. EDR and Predictive Coding 3.0. (2/14/16). So why do I love this technology so much, and most other lawyers, not so much? It may have to do with the fact that I have been using computers since 1978 and am very used to pushing the technology edge. But that just explains why I was one of the first to knock on the door of predictive coding, not why I like the room. I have been an early adopter of many technologies that proved disappointing. (Anybody want to buy a slightly used iWatch?) No, I like it because it really works.

This in turn raises the question of why not all attorneys have had this same reaction. If it really works for me, it should really work for everyone, right? And so everyone should be loving predictive coding, right? No. It is not working for everyone. Many have had unpleasant experiences with predictive coding. They left the room bored and frustrated. I am reminded of the old commercial, Where’s the beef? They went back to their old familiar keyword searches. Pity.

It took me a while to figure this out, that others were having failures and not talking about it. (Who can blame them?) In retrospect I should have seen this earlier. Still, it took Grossman and Cormack until 2015 to figure this out too. Interestingly, we have come to the same conclusion on causation. Bad software is not the main reason, although varying software quality among vendors is part of the explanation. Some software on the market is not that good, or does not even have bona fide predictive coding features using active machine learning. But these software differences only explain some of the dissatisfaction. The real reason for the failures is that attorneys have not been using the predictive coding features properly. They have been doing it wrong. That is why it did not work well for them. That is why many attorneys tried it out and never returned.

Grossman and Cormack explain this and provide their best-practice methods in the new article, Continuous Active Learning for TAR, and many other articles they have written since 2015. I read and recommend them all. I have shared my own best practices in a lengthy article on my personal blog, Predictive Coding 3.0, part one and part two. Part one describes the history and part two describes the method. Our best practices are not exactly the same, but they are close and compatible. I have now written a total of 59 articles on the subject, all of which are currently online and freely available. I call the method Hybrid Multimodal and its basic steps are shown in the figure below.


Feel free to drop me an email if you are an in-house counsel and want to know more about predictive coding best practices. Training on this topic is one of the services that we offer our clients. So too is the search for responsive evidence in litigated matters, or internal investigations, using our proven successful Hybrid Multimodal method of Predictive Coding document review.



The Exploitation of America’s Cybersecurity Vulnerabilities by China and Other Foreign Governments

The Chinese People’s Liberation Army attacks American companies every day to try to steal trade secrets and gain commercial advantage for state controlled businesses.


Gu Chunhui

Criminal hackers can cause tremendous damage, whether trained in China or not. If a high level expert, such as any member of China’s elite Unit 61398, aka Comment Crew, gets into your system, they can seize root control, and own it. They can then plant virtually undetectable back doors into your systems. This allows them to later come and go as they please.

A member of the Comment Crew could be in your computer system right now and you would not know it. For instance, Gu Chunhui, who often goes under the online alias, Kandy Goo, and is a high ranking military officer of Unit 61398, could be looking at your computer screen now. Captain Goo could be running programs in the background without your knowledge. Or he could be reading your email. He would be looking for some information of value to his country, or of value to any of the thousands of businesses controlled by the Chinese government. Captain Goo may have a cute Internet name, and look more like a movie star in a martial arts film than an army man, but do not be fooled. Do not underestimate his considerable computer skills and strong patriotic intent. Yes, breaking into your computer systems and stealing data is a matter of patriotic duty for him and other hackers trained by the government of communist China.

Unit 61398 of the Third Department of the Chinese People’s Liberation Army is reported to be the best of the best in China. Gu Chunhui is a determined military officer. Although DOJ documents show that Gu, like everybody else in Shanghai where he is stationed, takes a two hour break every day for lunch, he still works hard the rest of the day to break into your computer system and steal your data (and your clients’). He and others in Unit 61398 are armed and dangerous. They have both viruses and guns. They should not be taken for granted. All of the Unit 61398 Comment Crew, including Captain Goo, are very good at what they do. I am worried. You should be too.

Do not get me wrong, the Chinese government does not have a monopoly on black hat hacking. The whole idea was born in the United States. It could also just as easily be a criminal hacker from Russia, Ukraine, Poland, Iran, or Syria, who has taken control of your system. It could be a teenager down the street. They could be from anywhere, although if they are after trade secrets, not money, it is probably one of the thousands of hackers who work for the Chinese government. It could even be one of the five officers in Unit 61398 in Shanghai who have been indicted by the DOJ.


DOJ’s 31 Count Criminal Indictment Against Five Military Officers
of Unit 61398 of the Third Department of the Chinese People’s Liberation Army

Five military officers of Unit 61398, including Gu Chunhui, were indicted in 2014 by the Department of Justice for theft of commercial trade secrets from several large U.S. corporations and a union. No, they have not been arrested, nor is it likely they ever will be. This was more of a symbolic gesture than anything else, a wake-up call for American business. Still, at least one person in the U.S., a Chinese businessman, has been arrested and convicted of helping the Chinese government steal trade secrets. Businessman admits helping Chinese military hackers target U.S. contractors (Washington Post, 3/23/16).

The DOJ has also recently unsealed charges made against the Syrian Electronic Army — a hacking group that supports embattled Syrian President Bashar al-Assad. In addition, on March 24, 2016, the Manhattan U.S. Attorney announced charges against seven Iranians for conducting a coordinated campaign of cyber attacks against the U.S. financial sector on behalf of the Islamic Revolutionary Guard. A copy of the indictment of the Iranians is published here by the DOJ. It is a very dangerous world right now and very challenging to protect trade secrets.

The indictment against the Chinese Military officers is especially notable to the legal profession in that some of the secrets allegedly stolen include attorney-client communications. See the 31 count indictment against five Chinese military officers for details. The chart below provides a high level overview. Every count is against all five officers.

Count(s) | Charge | Statute | Maximum Penalty
1 | Conspiring to commit computer fraud and abuse | 18 U.S.C. § 1030(b) | 10 years
2-9 | Accessing (or attempting to access) a protected computer without authorization to obtain information for the purpose of commercial advantage and private financial gain | 18 U.S.C. §§ 1030(a)(2)(C), 1030(c)(2)(B)(i)-(iii), and 2 | 5 years (each count)
10-23 | Transmitting a program, information, code, or command with the intent to cause damage to protected computers | 18 U.S.C. §§ 1030(a)(5)(A), 1030(c)(4)(B), and 2 | 10 years (each count)
24-29 | Aggravated identity theft | 18 U.S.C. §§ 1028A(a)(1), (b), (c)(4), and 2 | 2 years (mandatory consecutive)
30 | Economic espionage | 18 U.S.C. §§ 1831(a)(2), (a)(4), and 2 | 15 years
31 | Trade secret theft | 18 U.S.C. §§ 1832(a)(2), (a)(4), and 2 | 10 years

The possibility, indeed probability, of hacker attacks on law firms is one reason we outsource the holding of all large stores of our clients’ electronic data in e-discovery. We put the ESI in the hands of a global vendor with one of the most secure facilities in the world. Feel free to ask me about it. Protection of client data is an important ethical duty of every attorney. We take it very seriously and conduct all of our work accordingly.

Conclusion to 14 Part Series on Document Culling

This is the fourteenth and final blog in a series on two-filter document culling. (Yes, we went for and obtained a world record on longest law blog series!) Document culling is very important to successful, economical document review. Please read parts one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, and thirteen before this one.


There is much more to efficient, effective review than just using software with predictive coding features. The methodology of how you do the review is critical. The two filter method described here has been used for years to cull away irrelevant documents before manual review, but it has typically just been used with keywords. I have shown in this lengthy series of blogs how this method can be employed in a multimodal manner that includes predictive coding in the Second Filter.

Keywords can be an effective method to both cull out presumptively irrelevant files, and cull in presumptively relevant, but keywords are only one method, among many. In most projects it is not even the most effective method. AI-enhanced review with predictive coding is usually a much more powerful method to cull out the irrelevant and cull in the relevant and highly relevant.

If you are using a one-filter method, where you just do a rough cut and filter out by keywords, date, and custodians, and then manually review the rest, you are reviewing too much. It is especially ineffective when you collect based on keywords. As shown in Biomet, that can doom you to low recall, no matter how good your later predictive coding may be.

If you are using a two-filter method, but are not using predictive coding in the second filter, you are still reviewing too much. The two-filter method is far more effective when you use relevance probability ranking to cull out documents from final manual review.
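For readers who think in code, here is a minimal Python sketch of the two-filter idea. It is an illustration under stated assumptions, not the actual workflow or any vendor's software: the Doc fields, the function names first_filter and second_filter, the 0.5 cutoff, and the sample scores are all hypothetical, and the relevance probabilities are simply assumed to have been produced already by whatever predictive coding tool is in use.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    doc_id: str
    custodian: str
    sent: date
    relevance_prob: float  # hypothetical score from a predictive coding tool


def first_filter(docs, custodians, start, end):
    """Rough cut on metadata only: keep the chosen custodians and date range."""
    return [d for d in docs if d.custodian in custodians and start <= d.sent <= end]


def second_filter(docs, cutoff):
    """Rank by predicted relevance (highest first) and cull everything below the cutoff."""
    ranked = sorted(docs, key=lambda d: d.relevance_prob, reverse=True)
    return [d for d in ranked if d.relevance_prob >= cutoff]


# Made-up collection for illustration only.
collection = [
    Doc("A", "smith", date(2015, 3, 1), 0.92),
    Doc("B", "smith", date(2015, 4, 9), 0.08),
    Doc("C", "jones", date(2015, 5, 2), 0.55),
    Doc("D", "lee", date(2015, 3, 15), 0.97),   # custodian not in scope
    Doc("E", "jones", date(2013, 1, 1), 0.80),  # outside the date range
]

after_first = first_filter(
    collection, {"smith", "jones"}, date(2015, 1, 1), date(2015, 12, 31)
)
for_review = second_filter(after_first, cutoff=0.5)
print([d.doc_id for d in for_review])
```

In this toy run the first filter passes three documents and the second filter sends only two of them on to manual review. Where to set the cutoff is a judgment call that depends on the recall you need and the review budget you have; the sketch does not try to answer that.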


Employers Have An Obligation To Provide Meaningful Direction To Employees In Email Searches, But Employers Can’t Be Compelled To Recover Company Emails Stored On Personal Accounts Of Employees

This blog post is written by Douglas Johnston in our San Francisco office.

A recent case from the Northern District of California highlights the importance of actively engaging with employees to coordinate the search for documents and electronically stored information to comply with the employer’s discovery obligations. At the same time, the Court ruled that an employer cannot be compelled to produce business-related emails from the personal email accounts of its employees.

In Matthew Enterprise, Inc. v. Chrysler Group, LLC, the plaintiff, Stevens Creek – a car dealership – sued Chrysler for price discrimination in violation of the Robinson-Patman Act.  During discovery, Chrysler sought emails from Stevens Creek’s employees’ corporate Gmail accounts as well as emails from the employees’ personal email accounts which, at times, were used for business purposes.

As to the emails from employees’ corporate accounts, Chrysler argued that Stevens Creek used inadequate search parameters, failed to provide employees with a copy of the discovery requests, did not provide any meaningful direction to the employees on how to identify requested ESI and did not ask all relevant custodians to search for documents. In opposition, Stevens Creek argued it had undertaken reasonable efforts in good faith to comply with the requests for production.

With regard to emails from employees’ personal accounts, Stevens Creek argued that the emails were outside its “possession, custody, or control,” and, therefore, beyond the scope of discovery from Stevens Creek. Chrysler responded that Stevens Creek has control over company information regardless of whether it is stored on personal email accounts and pointed to plaintiff’s employee handbooks instructing employees to keep “internal information” in the “sole possession” of Stevens Creek.

Magistrate Judge Paul S. Grewal, applying the recent amendments to the Federal Rules of Civil Procedure, found Stevens Creek’s ESI search efforts to be lacking, citing as specific examples Stevens Creek’s suggestion to its employees that they merely pull any email with the word “Chrysler” in it and its limitation of the relevant custodians to sales employees. Accordingly, Judge Grewal ordered Stevens Creek to ask both salespeople and all other employees who may have relevant documents to cooperate with the search, and to coordinate the search by telling those employees exactly what Chrysler had asked for and suggesting broad sets of search terms.

However, Judge Grewal found that Chrysler had failed to show that any contract existed between Stevens Creek and its employees requiring its employees to provide information stored in their personal accounts despite language in Stevens Creek’s handbook instructing employees to keep “internal information” in the “sole possession” of Stevens Creek. The court noted that the handbook language did not create a legal right and there was no “authority under which Stevens Creek could force employees to turn them over.”

Judge Grewal’s ruling has two important implications for employers. First, when responding to requests for electronically stored information, employers must take an active role in assisting employee-custodians in their search for responsive documents. Second, Judge Grewal’s ruling indicates that employers should have strong agreements in place with employees who may be storing company information in personal email accounts, such as Gmail; otherwise, employers may be prevented from recovering that information when needed. Instead, these employees may be subject to direct, third-party discovery of relevant information in their custody and control under Rule 45. This can complicate the employer’s defense and increase the overall cost of electronic discovery.

Case Example of Quick Peek Type of Production Without Full Manual Review

This is part thirteen of the continuing series on two-filter document culling. (Yes, we are going for a world record on longest law blog series.) Document culling is very important to successful, economical document review. Please read parts one, two, three, four, five, six, seven, eight, nine, ten, eleven and twelve before this one.

Limiting Final Manual Review

In some cases you can, with client permission (often insistence), dispense with attorney review of all or nearly all of the documents in the upper half. You might, for instance, stop after the manual review has attained a well-defined and stable ranking structure. For example, you may have reviewed only 10% of the probable relevant documents (top half of the diagram), but decide to produce the other 90% of the probable relevant documents without attorney eyes ever looking at them. There are, of course, obvious problems with privilege and confidentiality in such a strategy. Still, in some cases, where appropriate clawback and other confidentiality orders are in place, the client may want to risk disclosure of secrets to save the costs of final manual review. This should, however, only be done with full disclosure and understanding of the considerable risks involved. We do not recommend this bypass, but on some rare occasions it makes sense.

In such productions there are also dangers of imprecision, where a significant percentage of irrelevant documents are included. This in turn raises concerns that an adversarial view of the other documents could engender other suits, even if there is some agreement for return of irrelevant documents. Once the bell has been rung, privileged or hot, it cannot be un-rung.

Case Example of Production With No Final Manual Review

In spite of the dangers of the unringable bell, the allure of extreme cost savings can be strong to some clients in some cases. For instance, I did one experiment using multimodal CAL with no final review at all, where I still attained fairly high recall, and the cost per document was only seven cents. I did all of the review myself acting as the sole SME. The visualization of this project would look like the below figure.


Note that if the SME review pool were drawn to scale according to number of documents read, then, in most cases, it would be much smaller than shown. In the review where I brought the cost down to $0.07 per document I started with a document pool of about 1.7 Million, and ended with a production of about 400,000. The SME review pool in the middle was only 3,400 documents.

As legal search projects go, this one had an unusually high prevalence, and thus the production of 400,000 documents was very large. Four hundred thousand was the number of documents ranked with a 50% or higher probable relevance when I stopped the training. I personally reviewed only about 3,400 documents during the SME review. I then went on to review another 1,745 documents after I decided to stop training, but did so only for quality assurance purposes and using a random sample. To be clear, I worked alone, and no one other than me reviewed any documents. This was an Army of One type project.

Although I personally reviewed only 3,400 documents for training, I actually instructed the machine to train on many more documents than that. I just selected them for training without actually reviewing them first. I did so on the basis of ranking and judgmental sampling of the ranked categories. It was somewhat risky, but it did speed up the process considerably, and in the end worked out very well. I later found out that information scientists often use this technique as well. See, e.g., Evaluation of Machine-Learning Protocols for Technology-Assisted Review in Electronic Discovery, SIGIR’14, July 6–11, 2014, at pg. 9.

My goal in this project was recall, not precision, nor even F1, and I was careful not to over-train on irrelevance. The requesting party was much more concerned with recall than precision, especially since the relevancy standard here was so loose. (Precision was still important, and was attained too. Indeed, there were no complaints about that.) In situations like that the slight over-inclusion of relevant training documents is not terribly risky, especially if you check out your decisions with careful judgmental sampling, and quasi-random sampling.
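For readers less familiar with the metrics named above, here is a minimal sketch of how recall, precision and F1 relate to one another. The counts are hypothetical, chosen only to make the arithmetic easy to follow; they are not figures from this project.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Share of documents marked relevant that really are relevant."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos: int, false_neg: int) -> float:
    """Share of all truly relevant documents that were found."""
    return true_pos / (true_pos + false_neg)


def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Hypothetical counts (assumption, not project data).
true_pos, false_pos, false_neg = 90, 30, 10

p = precision(true_pos, false_pos)
r = recall(true_pos, false_neg)
print(f"precision={p:.2f} recall={r:.2f} F1={f1(p, r):.2f}")
```

A review tuned for recall accepts more false positives (lower precision) in exchange for fewer missed relevant documents, which is the trade-off described above.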

I accomplished this review in two weeks, spending 65 hours on the project. Interestingly, my time broke down into 46 hours of actual document review time, plus another 19 hours of analysis. Yes, about one hour of thinking and measuring for every two and a half hours of review. If you want the secret of my success, that is it.

I stopped after 65 hours, and two weeks of calendar time, primarily because I ran out of time. I had a deadline to meet and I met it. I am not sure how much longer I would have had to continue the training before the training fully stabilized in the traditional sense. I doubt it would have been more than another two or three rounds; four or five more rounds at most.

Typically I have the luxury to keep training in a large project like this until I no longer find any significant new relevant document types, and do not see any significant changes in document rankings. I did not think at the time that my culling out of irrelevant documents had been ideal, but I was confident it was good, and certainly reasonable. (I had not yet uncovered my ideal upside down champagne glass shape visualization.) I saw a slow down in probability shifts, and thought I was close to the end.

I had completed a total of sixteen rounds of training by that time. I think I could have improved the recall somewhat had I done a few more rounds of training, and spent more time looking at the mid-ranked documents (40%-60% probable relevant). The precision would have improved somewhat too, but I did not have the time. I am also sure I could have improved the identification of privileged documents, as I had only trained for that in the last three rounds. (It would have been a partial waste of time to do that training from the beginning.)

The sampling I did after the decision to stop suggested that I had exceeded my recall goals, but still, the project was much more rushed than I would have liked. I was also comforted by the fact that the elusion sample test at the end passed my accept on zero error quality assurance test. I did not find any hot documents. For those reasons (plus great weariness with the whole project), I decided not to pull some all-nighters to run a few more rounds of training. Instead, I went ahead and completed my report, added graphics and more analysis, and made my production with a few hours to spare.
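To give a rough sense of what a zero-error sample can and cannot show, the sketch below computes the standard exact binomial upper bound on a rate when none of the sampled documents is a hit. The sample size used is an assumption for illustration: the 1,745 figure is the post-training quality assurance sample mentioned earlier, and the article does not say how large the elusion sample itself was.

```python
def zero_error_upper_bound(sample_size: int, confidence: float = 0.95) -> float:
    """Upper bound on the true rate when 0 of `sample_size` random draws are hits.

    Solves (1 - rate) ** sample_size = 1 - confidence for rate.
    """
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / sample_size)


n = 1745  # assumed sample size, for illustration only
bound = zero_error_upper_bound(n)
print(f"With 0 hits in {n:,} documents, the rate is at most {bound:.3%} (95% confidence)")
```

The familiar "rule of three" (3 divided by the sample size) gives nearly the same answer and is a quick way to check the result by hand.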

A scientist hired after the production did some post-hoc testing that confirmed, at an approximate 95% confidence level, a recall of between 83% and 94%. My work also withstood all subsequent challenges. I am not at liberty to disclose further details.

In post hoc analysis I found that the probability distribution was close to the ideal shape that I now know to look for. The below diagram represents an approximate depiction of the ranking distribution of the 1.7 Million documents at the end of the project. The 400,000 documents produced (obviously I am rounding off all these numbers) were 50% plus, and 1,300,000 not produced were less than 50%. Of the 1,300,000 Negatives, 480,000 documents were ranked with only 1% or less probable relevance. On the other end, the high side, 245,000 documents had a probable relevance ranking of 99% or more. There were another 155,000 documents with a ranking between 99% and 50% probable relevant. Finally, there were 820,000 documents ranked between 49% and 1% probable relevant.
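The rounded figures above can be tallied in a few lines. This is only a quick arithmetic check of the numbers as stated in the text; the band labels and variable names are my own.

```python
# (band label, document count, produced?) using the rounded figures from the text.
bands = [
    ("99% or more", 245_000, True),
    ("50% to 99%", 155_000, True),
    ("1% to 49%", 820_000, False),
    ("1% or less", 480_000, False),
]

produced = sum(count for _, count, is_produced in bands if is_produced)
withheld = sum(count for _, count, is_produced in bands if not is_produced)
total = produced + withheld

for label, count, _ in bands:
    print(f"{label:<12} {count:>9,} ({count / total:.1%} of the collection)")
print(f"Produced: {produced:,}  Not produced: {withheld:,}  Total: {total:,}")
```

The two bands at the extremes (99% or more, and 1% or less) together hold 725,000 documents, a bit over 40% of the collection, which is what gives the distribution its pronounced top and bottom.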


The file review speed realized here, about 35,000 files per hour, and the extremely low cost of about $0.07 per document would not have been possible without the client’s agreement to forgo full document review of the 400,000 documents produced. A group of contract lawyers could have been brought in for second pass review, but that would have greatly increased the cost, even assuming a billing rate for them of only $50 per hour, which was 1/10th my rate at the time (it is now much higher).
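As a rough back-of-the-envelope illustration, the sketch below works through the throughput and the possible cost of a second pass review. The document counts, hours and $50 rate come from the text; the contract reviewer speed of 50 documents per hour is purely a hypothetical assumption. The article does not say how the 35,000 files per hour figure was derived, so the two throughput numbers here are only brackets around it.

```python
TOTAL_DOCS = 1_700_000     # starting document pool (from the text)
PRODUCED_DOCS = 400_000    # documents produced (from the text)
REVIEW_HOURS = 46          # hours of actual document review (from the text)
TOTAL_HOURS = 65           # review plus analysis hours (from the text)
CONTRACT_RATE = 50         # dollars per hour (from the text)
ASSUMED_DOCS_PER_HOUR = 50 # hypothetical second pass review speed

# Throughput over the whole pool, using two possible denominators.
per_review_hour = TOTAL_DOCS / REVIEW_HOURS
per_total_hour = TOTAL_DOCS / TOTAL_HOURS

# Cost of eyes-on second pass review under the assumed speed.
second_pass_hours = PRODUCED_DOCS / ASSUMED_DOCS_PER_HOUR
second_pass_cost = second_pass_hours * CONTRACT_RATE

print(f"Files per review hour: {per_review_hour:,.0f}")
print(f"Files per total hour:  {per_total_hour:,.0f}")
print(f"Second pass: {second_pass_hours:,.0f} hours, ${second_pass_cost:,.0f}")
print(f"Added cost per produced document: ${second_pass_cost / PRODUCED_DOCS:.2f}")
```

Under that assumed speed, second pass review alone would add about a dollar per produced document, many times the $0.07 figure, which is the point being made above.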

The client here was comfortable with reliance on confidentiality agreements for reasons that I cannot disclose. In most cases litigants are not, and insist on eyes-on review of every document produced. I well understand this, and in today’s harsh world of hardball litigation it is usually prudent to do so, clawback or no.

Another reason the review was so cheap and fast in this project is that there was very little in the way of opposing counsel transactional costs, and everyone was hands off. I just did my thing, on my own, and with no interference. I did not have to talk to anybody; just read a few guidance memorandums. My task was to find the relevant documents, make the production, and prepare a detailed report – 41 pages, including diagrams – that described my review. Someone else prepared a privilege log for the 2,500 documents withheld on the basis of privilege.

I am proud of what I was able to accomplish with the two-filter multimodal methods, especially as it was subject to the mentioned post-review analysis and recall validation. But, as mentioned, I would not want to do it again. Working alone like that was very challenging and demanding. Further, it was only possible at all because I happened to be a subject matter expert of the type of legal dispute involved. There are only a few fields where I am competent to act alone as an SME. Moreover, virtually no legal SMEs are also experienced ESI searchers and software power users. In fact, most legal SMEs are technophobes. I have even had to print out key documents to paper to work with some of them.

Even if I have adequate SME abilities on a legal dispute, I now prefer to do a small team approach, rather than a solo approach. I now prefer to have one or two attorneys assisting me on the document reading, and a couple more assisting me as SMEs. In fact, I can act as the conductor of a predictive coding project where I have very little or no subject matter expertise at all. That is not uncommon. I just work as the software and methodology expert; the Experienced Searcher.

Recently I worked on a project where I did not even speak the language used in most of the documents. I could not read most of them, even if I tried. I just worked on procedure and numbers alone. Others on the team got their hands in the digital mud and reported to me and the SMEs. This works fine if you have good bilingual SMEs and contract reviewers doing most of the hands-on work.


To be continued …. (final installment comes next!)