Getting the Data from the People who Store It - Module 3 of 5

There are several formal ways to request discovery from other parties in litigation, including subpoenas, requests for admission, requests for production of documents and depositions. All forms of discovery are ways of moving documents and other data from the responding party to the requesting party. What modern technology has changed about the discovery process is the form in which that data is held and the means necessary to search for and obtain relevant, non-privileged data.

From the late 1980s through the beginning of this century, the legal system tried to come to grips with the changing technology, but it became clear that simple requests for discovery were being frustrated by the vagaries of the search and storage technologies being adopted by the business and legal communities. So, a new set of evidentiary, procedural and ethical rules was promulgated and adjusted to accommodate the technological needs of the requesting and responding parties and the courts. This module will focus on these rules.

E-Discovery Legal and Ethical Considerations

First, we’ll look at ethical rules binding attorney behavior. Ethical rules are not “laws” and can vary from state to state. They are developed and put into effect by state bars in conjunction with state supreme courts. Violation of these ethical standards can form the basis of bar disciplinary actions, court sanctions and malpractice lawsuits, among other bad outcomes.

Most state ethical rules are developed within the context of the American Bar Association’s Model Rules of Professional Conduct. As a set of model rules, these ethical considerations are not binding in and of themselves but serve as a basis for states to write their own rules. E-discovery, like law office technology in general, brings up several ethical considerations for lawyers.

Keeping up with technology

It’s easy for legal professionals to make mistakes with technology. These can include exposing private client information, making social media posts that bring their professionalism into question, failing to properly vet third-party storage companies and using the wrong e-discovery search technology. Although the number keeps increasing, the ethical rules of only about 30 states expressly require lawyers to keep up with technology to the extent that it applies to the law office on issues like security, privacy and the ability to conduct e-discovery.[1]

Responding fully to discovery requests

Lawyers are duty-bound under rules of legal ethics to respond to discovery requests “fully and completely” by not “hiding” any potential discovery or allowing their clients to do so, producing all requested documents when possible and not overestimating the costs of e-discovery in their proportionality arguments.[2] This is particularly applicable to “technology assisted review,” which is covered later in this module.

Clients’ documents privacy, confidentiality, and security

Another ethical issue for attorneys concerns storing e-discovery results in the cloud. Lawyers are responsible for the security of all documents related to their clients,[3] and storing those in the cloud implicates both security and privacy issues. There are legal technology writers who will tell you that there is no system that cannot be hacked in some way by somebody, and the quickest way into a corporation’s innermost secrets can be through its lawyers’ computer systems.

Another set of ethical issues involves law firms hiring third-party e-discovery providers. The rules are clear that the attorneys are responsible for any problems caused by those providers.[4]

Legal liabilities of attorneys’ mishandling of these issues

Beyond these ethical considerations lies the fact that computer security breaches can generate lawsuits against the lawyers and cloud storage companies. Even so, attorneys are continually behind in keeping up with the latest in hacking prevention, social media, cloud security and many other aspects of modern technology that directly impact their offices and their clients’ interests, and many of them get sued because of it.

E-Discovery Pretrial Rules

Under the rules of civil procedure, parties meet shortly after a lawsuit is filed to determine what discovery will be allowed and how long it will take. This can be a complex procedure involving lengthy conversations among the parties and the presiding judge.

Several of the Federal Rules of Civil Procedure govern how the parties proceed with e-discovery before trial. They try to speed up the trial and make cooperation amongst the parties a priority, but things can fall apart in a hurry, and the judge may have to step in and make decisions about the scope of discovery. 

The e-discovery pretrial rules include Rules 16(b) and 26(f) on pretrial meetings and conferences; Rule 26(a)(1) on initial disclosures; Rule 26(b)(2) on the scope of discovery; Rule 26(b)(5) on privilege claims; Rule 34 on forms of production; and Rule 37 on failure to preserve electronically stored information. We will look at each of these in turn.

FRCP Rule 26(f), 26(a)(1), 16(b): Pretrial planning

Taken together, these rules alert counsel and the court that they should consider e-discovery issues as early and comprehensively as possible. The parties must confer at least 21 days before the scheduling conference with the court to work out agreements on the preservation of electronically stored information (ESI), the forms of ESI production, the methods that will be employed to filter out irrelevant information and protection for privileged information. Once the scheduling conference has occurred, the court will issue a scheduling order that will govern the pace of the litigation.
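The 21-day lead time can be checked with simple date arithmetic. The sketch below is illustrative only: the function name and the conference date are hypothetical, and it counts plain calendar days without applying any court-specific rules for computing time.

```python
from datetime import date, timedelta

# "At least 21 days before the scheduling conference," per the text above.
CONFER_LEAD_DAYS = 21


def latest_confer_date(scheduling_conference: date) -> date:
    """Return the last calendar day on which the parties can still confer.

    Assumption: plain calendar-day subtraction, with no adjustment for
    weekends, holidays or local counting rules.
    """
    return scheduling_conference - timedelta(days=CONFER_LEAD_DAYS)


# Hypothetical scheduling conference date, for illustration only.
conference = date(2024, 6, 28)
print(latest_confer_date(conference).isoformat())
```

A docketing calendar would normally layer court holidays and local rules on top of a calculation like this one.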

Federal Rule 16 governs pretrial conferences in general. Rule 26(f) covers e-discovery pretrial meetings in particular. Rule 26(f) is the overarching rule governing how e-discovery will be handled during the case. The rule covers the pretrial e-discovery decisions that the parties and the court need to make before the discovery process can begin.  

Rule 26(f) goes over several topics to be settled by the parties, starting with potential settlement, ESI preservation and the creation of a discovery plan. A discovery plan is usually a written agreement between the parties about how e-discovery will go forward, but it can sometimes be imposed by a judge if the parties can’t agree. The court may also order e-discovery progress reports to be submitted along the way. The rule also gives a judge discretion with deadlines. This conference will also be the place to discuss privilege.

The discovery plan includes discussion of the scope of discovery and which electronic data discovery tools may be used by the parties. This discussion may continue through the pretrial process as discoverable data may require different discovery techniques as it is revealed. The discovery plan also includes the format in which the discovery will be transmitted and will cover metadata, file formatting, etc.

The conference also includes case scheduling and initial disclosure of information under Rule 26(a)(1), which covers witnesses, documents and other evidence, etc. that must be disclosed to the other side before trial.

Within 14 days after the parties confer, they must submit a written report outlining the e-discovery plan to the court. The judge then holds the scheduling conference and sets the case schedule under Federal Rule 16(b).

Rule 34: Producing ESI

FRCP Rule 34 covers how the party responding to an ESI request must produce the requested data. For ESI, unless otherwise agreed by the parties or ordered by the court, the data must be produced as stored “in the ordinary course of business” or in another form that the receiving party can reasonably use, although the same ESI does not have to be produced in multiple formats. The rule itself applies to parties; non-parties, such as witnesses, can be required to produce ESI by subpoena under Rule 45.

Rule 37: Failure to Cooperate in Discovery

Failure to preserve or produce ESI in a timely fashion and in a usable form can bring sanctions under FRCP Rule 37(e). That rule speaks to both inadvertent and deliberate destruction of data. There are times when data is destroyed in the regular course of business (such as when memory is regularly wiped every 90 days), and those will not result in sanctions. If the data is unavailable for reasons beyond the control of the transmitting party, then the court simply proceeds without it. But if the data has been deliberately destroyed or spoliated after a litigation hold was properly served, the rule allows the court or jury to presume that the data was unfavorable to the party that destroyed it, or even to enter a default judgment.
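The interaction between routine deletion and a litigation hold can be sketched in a few lines. This is a minimal illustration, not a description of any real records-management product: the Record fields, the purge_candidates function and the sample data are all hypothetical, and the 90-day window simply mirrors the example in the paragraph above.

```python
from dataclasses import dataclass
from datetime import date, timedelta

RETENTION_DAYS = 90  # routine wipe interval used as the example above


@dataclass
class Record:
    record_id: str
    created: date
    custodian: str


def purge_candidates(records, today, custodians_on_hold):
    """Return IDs eligible for routine deletion.

    Records belonging to a custodian under litigation hold are never
    returned, no matter how old they are.
    """
    cutoff = today - timedelta(days=RETENTION_DAYS)
    return [
        r.record_id
        for r in records
        if r.created < cutoff and r.custodian not in custodians_on_hold
    ]


# Hypothetical records, for illustration only.
records = [
    Record("a", date(2024, 1, 1), "sales"),
    Record("b", date(2024, 1, 1), "finance"),
    Record("c", date(2024, 5, 1), "sales"),
]
today = date(2024, 6, 1)

print(purge_candidates(records, today, custodians_on_hold=set()))
print(purge_candidates(records, today, custodians_on_hold={"finance"}))
```

The design point is that the hold is checked at deletion time, so a hold issued after a record was created still protects it.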

Inadvertent Disclosure, Clawbacks and Waiver

In discovering and producing documents that can run into the millions, sometimes documents are transmitted which the other side is not entitled to for reasons of privilege, client confidential information, privacy laws or other circumstances. It also happens when documents are not redacted in the way that they are supposed to be. A model ethical rule covers this topic, admonishing lawyers not to inadvertently disclose client data.[5]

This can happen for several reasons, usually human error, but the response to inadvertent disclosure must be to attempt to negate its effect, if possible. On the other hand, if the data’s privileges are waived, then the data stays with the receiving party.

One 2017 New York case [6] illustrates several ways in which this data can be inadvertently produced. In Mill Lake v. Wells Fargo, an attorney used a third-party company to find relevant emails. Unfortunately for the attorney, she never really understood how the process worked and ended up disclosing personally identifiable information of hundreds of her client’s customers to the other side because she thought she had reviewed all the emails when, in fact, she had not. A further problem was that the documents were incompletely redacted. This kind of human error is preventable, but it happens all the time.

The solution in this case was to “claw back” the data, that is, to have it returned to the sender. That is the usual solution to these problems, although it may not be a perfect answer, of course. Clawbacks for inadvertently disclosed data are covered by FRCP Rule 26(b)(5)(B) and Federal Rule of Evidence 502(b) (for attorney-client privileged information and attorney work product). It may be impossible or even silly to order a party to “forget” information, but information subject to a clawback order or agreement cannot be used at trial and a court may order a party to delete all copies of the clawed-back information.[7]

There are times when the court will allow inadvertently disclosed data to be clawed back, but there are also times when that data disclosure legally constitutes a waiver that allows the data to be used by the other side. These cases are decided based on the nature of the disclosure, but the recommended path is always to have an agreement between the parties in the discovery plan covering waiver and clawbacks.
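The two failures in the case described above, unreviewed documents and incomplete redaction, are the kind of thing a simple pre-production check can surface. The sketch below is a hypothetical illustration: the OutgoingDoc fields, the production_blockers function and the identifier pattern are assumptions, and a pattern match only flags a document for a human to look at.

```python
import re
from dataclasses import dataclass

# Assumed pattern: text shaped like a U.S. Social Security number.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class OutgoingDoc:
    doc_id: str
    text: str
    reviewed: bool


def production_blockers(docs):
    """Return {doc_id: [reasons]} for documents that should not go out yet."""
    blockers = {}
    for d in docs:
        reasons = []
        if not d.reviewed:
            reasons.append("not reviewed")
        if SSN_LIKE.search(d.text):
            reasons.append("possible unredacted identifier")
        if reasons:
            blockers[d.doc_id] = reasons
    return blockers


# Hypothetical production set, for illustration only.
outgoing = [
    OutgoingDoc("p1", "Account opened for customer [REDACTED].", True),
    OutgoingDoc("p2", "SSN 123-45-6789 on file.", True),
    OutgoingDoc("p3", "Meeting notes.", False),
]

for doc_id, reasons in production_blockers(outgoing).items():
    print(doc_id, reasons)
```

A check like this does not replace attorney review; it only makes it harder to send a set that nobody finished reviewing.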

Work Product Rule under Fed. R. Evid. 502(d)

The work-product doctrine protects documents prepared by or for an attorney in anticipation of litigation.[8] The doctrine also covers data assembled within a database for attorneys.[9] However, documents that are responsive and were not prepared in anticipation of litigation must be produced, even if they are stored in an attorney’s database.[10]

Searching for and Retrieving Data

Only a few years ago, a document was something written on a piece of paper. A document review consisted of lawyers or paralegals reading stacks or rooms full of documents. A “large” number of documents in a huge lawsuit might add up to a hundred thousand of them.

Now, documents are data and are reviewed by computer programs. Complex litigation may involve tens of millions of electronic documents and other materials that need to be reviewed, checked for privilege, narrowed down to potential evidence, submitted to the other party, court or jury and so on.

So, who reads all these documents? Nobody, actually. Computer programs now dive into the data, looking for key word and phrase “hits” on relevant material. Instead of looking at each document, these programs look for specific words and phrases that the parties have decided are relevant to the case.

The term in general use in the legal world for the computer tools that find the data subject to e-discovery is electronic data discovery. There are numerous electronic data discovery tools covering various kinds of data storage and access. There are two basic technological ways of running a search. The first is using specific search terms, and the second is relying on a machine learning program.

Using search terms is straightforward. The search program scans every word in the data set, looking for a specified word or group of words. This is expensive and time-consuming, and it generates numerous false positives (think of yourself running Google searches while researching a complex topic), but it is ultimately accurate.
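A search-term pass of this kind can be sketched as follows. The documents, the search terms and the keyword_hits function are hypothetical, and commercial e-discovery tools add indexing, stemming and proximity operators that this sketch leaves out.

```python
import re


def keyword_hits(documents, terms):
    """Map each document ID to the set of search terms found in its text.

    Matching is case-insensitive and on whole words or phrases only.
    """
    patterns = {
        t: re.compile(r"\b" + re.escape(t) + r"\b", re.IGNORECASE)
        for t in terms
    }
    hits = {}
    for doc_id, text in documents.items():
        found = {t for t, p in patterns.items() if p.search(text)}
        if found:
            hits[doc_id] = found
    return hits


# Hypothetical document set and agreed search terms.
documents = {
    "d1": "The merger agreement was signed on Friday.",
    "d2": "Lunch on Friday? The new cafe has a great merger of flavors.",
    "d3": "Quarterly parking reminders.",
    "d4": "Please keep the side letter out of the data room.",
}
terms = ["merger", "side letter"]

for doc_id, found in keyword_hits(documents, terms).items():
    print(doc_id, sorted(found))
```

Document d2 is a false positive: it contains the agreed term but has nothing to do with the transaction, which is why term hits still need human review.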

Machine learning programs do not look at every document but make algorithmic predictions about which documents fit the search parameters. The process of using machine learning to find relevant data is called “technology assisted review.” Those programs use “predictive coding” to find relevant materials in the sea of data. This process is also called predictive intelligence or computer-assisted review and is based in the broad field of machine learning, or artificial intelligence.

Predictive coding looks at huge fields of documents according to set parameters given to the program by a human and determines which of those documents are relevant to those parameters. This is done by creating a small “seed” set of documents, which is used to find relevant terms; those findings are then used to create the relevancy parameters for the large search through all the documents. Predictive coding can be used for issues beyond relevancy, such as responsiveness to certain issues or privilege.
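The seed-set idea can be illustrated with a toy scorer. This is a minimal sketch under stated assumptions, not the algorithm used by any actual review platform: it learns smoothed log-odds word weights from a handful of human-coded documents (a naive-Bayes-style approach chosen here for simplicity) and uses them to rank an unreviewed collection. All names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def train(seed_set):
    """Learn a log-odds weight per word from (text, is_relevant) pairs."""
    relevant, other = Counter(), Counter()
    for text, is_relevant in seed_set:
        # Count each word once per document (document frequency).
        (relevant if is_relevant else other).update(set(tokenize(text)))
    n_rel = sum(1 for _, is_relevant in seed_set if is_relevant)
    n_other = len(seed_set) - n_rel
    vocab = set(relevant) | set(other)
    # Add-one smoothing keeps unseen words from producing log(0).
    return {
        w: math.log((relevant[w] + 1) / (n_rel + 2))
        - math.log((other[w] + 1) / (n_other + 2))
        for w in vocab
    }


def score(text, weights):
    """Sum the weights of the distinct known words in a document."""
    return sum(weights.get(w, 0.0) for w in set(tokenize(text)))


def rank(documents, weights):
    """Return document IDs ordered from most to least likely relevant."""
    return sorted(documents, key=lambda d: score(documents[d], weights),
                  reverse=True)


# Hypothetical seed set coded by a human reviewer.
seed_set = [
    ("Wire the rebate payment to the distributor before the audit", True),
    ("Keep the rebate terms off the invoice", True),
    ("Team lunch is moved to Thursday", False),
    ("Parking garage closed for the holiday", False),
]
# Hypothetical unreviewed collection.
collection = {
    "e1": "Did the distributor get the rebate payment yet?",
    "e2": "Holiday party is Thursday at noon",
    "e3": "Updated invoice attached",
}

weights = train(seed_set)
ranking = rank(collection, weights)
print(ranking)
```

In practice the ranking would be checked by reviewers, and their corrections would be fed back into the seed set over several rounds.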

Humans then analyze the relevant documents that the predictive coding found. In the best case, predictive coding takes millions of documents and presents a small percentage of those to be analyzed by more straightforward methods like word search or having a human read them. Predictive coding, if it works the way that it is supposed to, can save countless hours. In fact, studies published in the 2000s found that predictive coding was far more accurate in finding relevant documents than human document review.[11]

Predictive coding as applied to litigation has been used since about the beginning of the century, but has only been allowed in courts since about 2012,[12] and is now the standard in litigation data searching. 

Predictive coding isn’t perfect, of course. In fact, human review often finds that most of the documents flagged by predictive coding are not relevant. Predictive coding is just a starting point for narrowing down the number of documents that undergo manual human review. Nevertheless, courts accept these algorithms as sufficiently accurate that the technique is now almost universal in federal court.
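How a review tool can flag mostly irrelevant documents and still be useful is easier to see with numbers. The sketch below uses two standard information-retrieval measures, precision and recall, which the text above does not name; the document counts are hypothetical and chosen only to make the arithmetic visible.

```python
def precision_recall(flagged, relevant):
    """Return (precision, recall) for a set of flagged document IDs.

    precision: share of flagged documents that are actually relevant.
    recall: share of relevant documents that were flagged.
    """
    flagged, relevant = set(flagged), set(relevant)
    true_hits = flagged & relevant
    precision = len(true_hits) / len(flagged) if flagged else 0.0
    recall = len(true_hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection of 1,000 documents numbered 0 through 999.
relevant_ids = set(range(50))                         # 50 truly relevant
flagged_ids = set(range(45)) | set(range(100, 255))   # 200 flagged by the tool

precision, recall = precision_recall(flagged_ids, relevant_ids)
print(f"flagged: {len(flagged_ids)}")
print(f"precision: {precision:.3f}")
print(f"recall: {recall:.3f}")
```

In this made-up example, fewer than a quarter of the flagged documents are relevant, yet reviewers read 200 documents instead of 1,000 and still see 45 of the 50 relevant ones.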

In our next module, we will focus on the rules of proportionality; in other words, when discovery requests are considered too burdensome to be reasonable and thus enforceable. We will also look at the admissibility of the products of e-discovery in court.


[1] ABA Model Rule 1.1, Comment 8 has been adopted by around 30 states so far. It states: “To maintain the requisite knowledge and skill, a lawyer should keep abreast of changes in the law and its practice, including the benefits and risks associated with relevant technology, engage in continuing study and education, and comply with all continuing legal education requirements to which the lawyer is subject.” See also Sarah Andropoulos, Most States Now Require Tech Competence for Lawyers. What Does That Mean for You?, Justia, (Feb. 9, 2017), https://onward.justia.com/2017/02/09/states-now-require-tech-competence-lawyers-mean/.

[2] Fed. R. Civ. P. 26(g) (certifications, discussed at length in another section of this course).

[3] ABA Model Rule 1.6(c) requires lawyers to make “reasonable efforts to prevent the inadvertent or unauthorized disclosure of, or unauthorized access to, information relating to the representation of a client.” A 2017 ABA Ethics Committee formal opinion specifically includes third-party cloud storage security.

[5] Model Rule of Professional Conduct 1.6(c).

[8] Leibovic v. United Shore Fin. Servs., LLC, No. 15-12639, 2017 U.S. Dist. LEXIS 137643, at *3-4 (E.D. Mich. Aug. 28, 2017).

[9] In re Columbia/HCA Healthcare Corp. Billing Practices Litig., 293 F.3d 289, 304 (6th Cir. 2002).

[10] Cason-Merenda v. VHS of Michigan, Inc., 118 F. Supp. 3d 965, 969 (E.D. Mich. 2015).

[11] For example, a 2005 study entitled Automated Document Review Proves Its Reliability found that a human review of a set of documents was 51% accurate, while a computer review of the same documents was 95% accurate. See Anna Kershaw, Automated Document Review Proves Its Reliability, 5 Digital Discovery & e-Evidence 1, 3 (2005).

[12] The first judicial decision to endorse the use of TAR was Moore v. Publicis, 287 F.R.D. 182, 193 (S.D.N.Y. 2012). This was by agreement of the parties, but a state court approved TAR over the objection of one of the parties months later: Global Aerospace, Inc. v. Landow Aviation, L.P., No. CL 61040 (Va. Cir. Ct. Apr. 23, 2012); see also Virginia State Court Judge Allows Defendants To Use Predictive Coding, K&L Gates, (Apr. 25, 2012), https://www.ediscoverylaw.com/2012/04/virginia-state-court-judge-allows-defendants-to-use-predictive-coding/. In the 2012 Moore decision, Judge Andrew J. Peck wrote: “Computer-assisted review appears to be better than the available alternatives, and thus should be used in appropriate cases.” Just a few years later, Judge Peck wrote that TAR is now “black letter law.”