Where ESI is Stored and How it is Retrieved - Module 2 of 5
The person in charge of data storage is termed the “custodian” of that data. That could be the party who created the document and keeps it on a local computer or it could be a cell phone company or cloud storage host.
Electronically stored information can be stored and accessed in many ways and places. Storing it requires some form of physical, usually magnetic, “media.” These media can include permanent media like personal computer hard drives, company network drives, phone storage drives, and so on. They can also include temporary or removable media like flash drives.
The process of locating and charting all this data and attendant corporate policies regarding that data is called “data mapping.” There are numerous data mapping platforms.
The media can be physically located anywhere and can be accessed from anywhere. This includes hard drives “in the cloud,” which means data stored offsite, often by third-party providers, and accessed by the entity who is subject to discovery.
In this module, we will discuss where ESI is stored and how to access it. One factor in accessing ESI is cost, and because the federal rules allow discovery to be limited by cost, it is an important consideration in discovery requests.
The “cloud” accomplishes offsite data storage, which means that the servers housing the data are not located at the business or residence of the entity that owns the data. “Onsite” means that the servers are physically located in the same place as the person who generated the information. For example, an iPhone stores text messages (onsite), but those texts are backed up in iCloud (offsite). Documents could be stored on a PC (onsite), but the files are backed up in the Google cloud (offsite).
The term “cloud” is often used as shorthand for services delivered under models such as “Software as a Service,” or SaaS. Cloud storage companies perform the data storage and access functions that could be accomplished onsite, but which are usually cheaper to transfer to a cloud storage service. The term refers to both the physical servers housing the data and the software that makes those servers perform storage functions.
One of the major changes to Rule 26(b) of the Federal Rules of Civil Procedure in 2006 (and again as amended in December of 2015) was the introduction of the idea of “proportionality.” Proportionality is a cost-benefit analysis that can be applied to a discovery request for ESI.
Prior to that time, courts rarely considered the costs of discovery. But it became evident over time that the costs of drilling down into millions of documents looking for certain key words or an email chain were expanding exponentially. There were some cases where the costs of discovery were greater than the amount in question in the case. So, courts were given discretion to limit ESI production on a case-by-case basis based on cost.
Types of Data Storage
One way to categorize these storage methods for proportionality analysis breaks them down into five types:
Type 1: Active, online data. This is the data that the holder of the ESI actively uses and consists of all active storage on hard drives, network servers or in the cloud. This is the most easily accessible information and is the least costly to maintain and access. It is “first tier” data.
Type 2: Near-online data. This is information that has been stored on removable media like flash drives, external disk drives, SD cards, etc. It also includes data stored and accessed via remote devices like magnetic tape and optical drives. It’s also considered first tier because the data can be accessed quickly and easily.
Type 3: Offline storage and archives. This is information that has been sent out for storage—generally it is no longer in active use but can be accessed if needed. This is data housed outside of the normal business and needs to be physically accessed. This data is on magnetic tape, optical drives, etc. It is reasonably accessible because it is still a part of normal business operations. It’s still considered first tier data since it can be accessed relatively easily.
Type 4: Disaster recovery files. These are offsite and usually compressed or encrypted. This data could be physically held by a third party but may be more complex to recover. This kind of data will be subject to a proportionality analysis because retrieval can be very costly. It is second tier data.
Type 5: Erased, fragmented or corrupted data. This data may be difficult or impossible to access but it also may contain evidence of tampering with the data. Trying to reconstruct this data can be very expensive and may not produce results. The proportionality arguments here will often be speculative—as in “we know they erased this data, but we need to access the hard drives to prove it.” Courts vary greatly in how they handle discovery requests for this data.
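The five types above can be sketched as a simple lookup table. This is only an illustration: the class and function names are hypothetical, the tier for Type 5 is left empty because the text does not assign one, and the rule "anything outside the first tier gets a closer cost-benefit look" is an assumption made for the example, not a statement of how any court applies Rule 26.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StorageType:
    number: int
    label: str
    tier: Optional[int]  # None where the text above does not assign a tier


# Labels and tiers are taken from the list above.
STORAGE_TYPES = [
    StorageType(1, "Active, online data", 1),
    StorageType(2, "Near-online data", 1),
    StorageType(3, "Offline storage and archives", 1),
    StorageType(4, "Disaster recovery files", 2),
    StorageType(5, "Erased, fragmented or corrupted data", None),
]


def needs_proportionality_review(storage_type: StorageType) -> bool:
    """Assumed rule for this sketch: flag anything that is not first tier."""
    return storage_type.tier != 1


flagged = [t.number for t in STORAGE_TYPES if needs_proportionality_review(t)]
print(flagged)
```

In practice, the cost of retrieval would be estimated case by case; the table only records where the discussion above places each type.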
People store their own data personally on their computers, phones, tablets and other devices. Businesses store data on their in-house servers. All this data is subject to e-discovery under FRCP Rule 26, state discovery rules, and state and federal rules of evidence. All this private data is also subject to the preservation rules outlined in the first module.
Third Party Storage
Because cloud companies have physical control of data, they are also subject to subpoenas for that data. In addition, attorneys who use cloud storage services are subject to ethical considerations regarding client security and privacy. Both of those ideas will be expanded on in coming modules.
Principles of accessing cloud storage also apply to companies that transmit and store data as a regular part of their customer business models. This includes phone companies like Sprint and T-Mobile, and social media companies like Facebook and Reddit. There are also companies that offer encrypted storage.
If any of these companies store data that was transmitted or posted by a customer, then they are third party data storage companies for purposes of responding to discovery requests or subpoenas. Who the actual owner of the data is might be unclear, but all companies that store other people’s data for any reason, whether or not the customer even knows the data is being stored, are subject to e-discovery.
Like any potential evidence, all ESI documents are subject to a chain-of-custody analysis to determine their authenticity. This is done by looking at the metadata of the documents and the storage devices to track the document’s creation and history.
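One common building block of a chain-of-custody record is a cryptographic fingerprint of the file together with its file-system metadata. The following is a minimal sketch, assuming a hypothetical helper named fingerprint and a throwaway sample file; real forensic tools capture far more (application metadata, device identifiers, access logs).

```python
import hashlib
import os
import tempfile
from datetime import datetime, timezone


def fingerprint(path):
    """Return a SHA-256 digest plus basic file-system metadata for one file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    info = os.stat(path)
    return {
        "sha256": digest.hexdigest(),
        "size_bytes": info.st_size,
        "modified_utc": datetime.fromtimestamp(
            info.st_mtime, tz=timezone.utc
        ).isoformat(),
    }


# Demonstration on a throwaway file with hypothetical content.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"draft contract v1")
    sample_path = tmp.name

before = fingerprint(sample_path)

# Simulate a later edit to the same document.
with open(sample_path, "ab") as handle:
    handle.write(b" (edited)")

after = fingerprint(sample_path)
os.remove(sample_path)

print(before["sha256"] != after["sha256"])
```

Because any change to the contents changes the digest, comparing a fingerprint taken at collection with one taken at production is one way to show that a document was not altered in between.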
Forms of Electronically Stored Information
Here is a look at some primary forms of ESI and how each may present difficult e-discovery problems.
Email is the predominant form of ESI, comprising, by one estimate, 90% of all e-discovery. [That is because not everybody texts, not everybody uses social media and not everybody stores data in the cloud—but everybody uses email.] Email is archived so that it can be accessed, but email systems often don’t have the native search capabilities to adequately respond to e-discovery requests. So, there are numerous third-party email-centric search platforms available in the market.
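At its simplest, an email search is a keyword match over parsed messages. The sketch below uses Python's standard email module on two made-up messages; the addresses, subjects, and the search_messages helper are all hypothetical, and commercial platforms add indexing, attachment handling, and de-duplication on top of this idea.

```python
from email import message_from_string

# Two made-up messages in standard header/body form.
RAW_MESSAGES = [
    "From: alice@example.com\nTo: bob@example.com\n"
    "Subject: Q3 invoice\n\nPlease see the attached invoice.",
    "From: bob@example.com\nTo: alice@example.com\n"
    "Subject: Lunch\n\nAre you free on Friday?",
]


def search_messages(raw_messages, keyword):
    """Return (sender, subject) for each message mentioning the keyword."""
    keyword = keyword.lower()
    hits = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        body = msg.get_payload()  # plain string for these single-part messages
        haystack = f"{msg['Subject']}\n{body}".lower()
        if keyword in haystack:
            hits.append((msg["From"], msg["Subject"]))
    return hits


results = search_messages(RAW_MESSAGES, "invoice")
print(results)
```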
Databases also present preservation and search challenges to the discovery process. Databases can hold hundreds of millions of documents, and even the most sophisticated search platforms can take a substantial amount of time and money to search properly.
In determining whether discovery requests sent to holders of a database are enforceable, the threshold questions revolve around proportionality and will include questions about the standard and enhanced reporting capabilities of the database, exporting capabilities, database structure, query language capabilities, user experience and interface and more. All of that can also be expressed in general terms and in case-specific terms. Once proportionality is established, the parties should work out amongst themselves what data is accessible and how it will be accessed and delivered. What information is privileged needs to be addressed as well.
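To make "query language capabilities" and "exporting capabilities" concrete, here is a minimal sketch using an in-memory SQLite database. The table layout, column names, sample rows, and the export_matches helper are assumptions invented for the illustration; a real business database would have its own structure and reporting tools.

```python
import csv
import io
import sqlite3

# Hypothetical document table, built in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "id INTEGER PRIMARY KEY, custodian TEXT, created TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO documents (custodian, created, body) VALUES (?, ?, ?)",
    [
        ("A. Smith", "2017-03-01", "Pricing memo for the merger"),
        ("B. Jones", "2017-06-15", "Holiday schedule"),
        ("A. Smith", "2018-01-10", "Merger closing checklist"),
    ],
)


def export_matches(connection, keyword, start, end):
    """Query by keyword and date range, then export the hits as CSV text."""
    rows = connection.execute(
        "SELECT id, custodian, created FROM documents "
        "WHERE body LIKE ? AND created BETWEEN ? AND ? ORDER BY id",
        (f"%{keyword}%", start, end),
    ).fetchall()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "custodian", "created"])
    writer.writerows(rows)
    return buffer.getvalue()


report = export_matches(conn, "merger", "2017-01-01", "2017-12-31")
print(report)
```

Narrowing a request by keyword, custodian, and date range in this way is one practical means of keeping a database request proportional.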
Word Processing and Spreadsheets
Here we have data that is often changed and passed around to multiple users in the ordinary course of business. The metadata changes every time the document or spreadsheet changes and numerous versions can be stored on numerous devices. Chasing down all the versions of these documents and spreadsheets around multiple users’ devices is a form of detective work—and the time involved may impact proportionality.
Challenges also exist, of course, for all other forms of ESI. Faxes are long-distance photocopies but are also stored in various places. Audio and video files, photographs and multimedia often are geo-stamped and have their own metadata but are also subject to editing. Other documents are subject to audit trails that are discoverable. And, of course, the Internet itself is a source of a tremendous amount of information.
All in all, while there is a lot of data that is discoverable, it may not be so easy to get to it. And there is, of course, an entire e-discovery software industry dedicated to producing products that find, transmit and analyze all this data.
Trying to Keep Documents Private: Encryption and Passwording
Keeping data from prying eyes (hackers or other unauthorized users) requires some effort on the part of custodians of that data. The most common techniques to achieve privacy include password protecting, encryption and physically separating the data storage unit from the rest of the system. That last technique is called “air gapping.”
Password protecting and encrypting files, emails, texts and other data is a common practice for users to help ensure data privacy and security. When faced with a subpoena or discovery request, encrypted files may cause some problems, some of them fatal to the discovery request.
Encryption is the coding of information: scrambling it in a way that only a party with the encryption “key” can translate. It is as old as human communications, traceable back at least as far as the time of Julius Caesar. When we talk about data encryption in the modern context, we are referring to a process that makes data inaccessible to a reader unless that reader has a key. Basic password protection works on a similar principle, although a password by itself controls access rather than scrambling the data.
Encryption keys are lengthy strings of binary digits, or “bits,” usually written out as letters and numbers. Keys can be of varying length, but standard keys today are commonly 256 bits long, which is 32 bytes, or 64 characters when written in hexadecimal. The 256-bit encryption is often referred to as “military grade.”
There are two types of keys: single key and public key. A single key (also called a symmetric key) is shared by the sender and the recipient. Private messaging platforms like Signal and WhatsApp use “end-to-end” encryption, in which only the sender and the recipient hold the keys needed to read a message. Public keys are used in e-commerce, where one party, say an online store, holds a key available to the public, and then each user creates a personal password to access the website or purchase a product. There are numerous encryption standards. You may see initials like RSA, PGP, SSL, SET, DES or others, but for legal purposes, the standard does not matter. What is important is access to the data that is hiding behind the encryption key.
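The cipher traditionally attributed to Julius Caesar shows the basic idea of a key in a few lines: every letter is shifted a fixed number of places, and the shift amount is the key. The sketch below is a teaching example only (the function name and sample message are made up), and a Caesar cipher offers no real security today.

```python
def caesar_shift(text, key):
    """Shift each ASCII letter by `key` places, leaving other characters alone."""
    shifted = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            shifted.append(chr((ord(ch) - base + key) % 26 + base))
        else:
            shifted.append(ch)
    return "".join(shifted)


secret = caesar_shift("Attack at dawn", 3)   # encrypt with key 3
plain = caesar_shift(secret, -3)             # decrypt by reversing the shift
print(secret)
print(plain)
```

Anyone who knows the key (here, 3) can reverse the scrambling; anyone who does not must guess it. Modern encryption follows the same pattern with keys that are far too large to guess.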
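To put the key length in concrete terms, a 256-bit key is 256 binary digits, which is 32 bytes, and is usually displayed as 64 hexadecimal characters. A short sketch using Python's standard secrets module (the variable names are illustrative) generates one at random:

```python
import secrets

key = secrets.token_bytes(32)      # 32 bytes * 8 bits per byte = 256 bits
hex_form = key.hex()               # the same key written as hexadecimal text

print(len(key) * 8)                # number of bits in the key
print(len(hex_form))               # number of hexadecimal characters
print(hex_form)                    # differs on every run
```

The number of possible 256-bit keys is 2 raised to the 256th power, which is why guessing such a key by trial and error is not considered feasible.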
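The public key idea can be illustrated with a toy version of RSA, one of the standards named above. In this sketch the numbers are deliberately tiny so the arithmetic is visible; the primes, exponent, and message value are textbook-style choices made for the example, and keys this small provide no security at all. It requires Python 3.8 or later for the modular inverse.

```python
# Toy RSA with tiny numbers: for illustration only, never for real use.
p, q = 61, 53                 # two small secret primes (assumed for the example)
n = p * q                     # 3233, part of both the public and private key
phi = (p - 1) * (q - 1)       # 3120
e = 17                        # public exponent, published with n
d = pow(e, -1, phi)           # private exponent, kept secret by the key owner

message = 65                  # a message encoded as a number smaller than n

cipher = pow(message, e, n)   # anyone can encrypt with the public key (e, n)
recovered = pow(cipher, d, n) # only the holder of d can decrypt

print(d)
print(cipher)
print(recovered)
```

The point for discovery purposes is the asymmetry: publishing (e, n) lets anyone lock a message, but only the party holding the private value d can unlock it, so access to the data depends on access to that private key.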
The three primary uses of data encryption within the legal context are in data storage, data communication, and online transactions.
For e-discovery purposes in civil cases, courts will generally require the respondent to a discovery request or subpoena to decrypt the data under the principles of open discovery discussed in Module 1, so much of the time encryption is not much of an issue.
The primary problems that may arise on the civil side in attempting to access encrypted data are ones where the data key is not available, often because the one person who held the key died or can’t be located.
There is some precedent in cases where the key can’t be located to simply declare the data unavailable as “not reasonably accessible because of undue burden or cost.” If recovering that data is reasonable, the court could order that recovery.
Criminal cases are a different matter because they run into the Fifth Amendment right to remain silent, which may include the right to avoid making one’s data available. Federal courts have taken both sides of this issue, called “compelled decryption.” Some courts have held that the Fifth Amendment protects the data and some have ruled that it doesn’t. The recent trend seems to be away from the Fifth Amendment protecting the data. However, a military court recently ruled that forcing a member of the military to reveal his smartphone password violates his Fifth Amendment rights. On the other side, a Philadelphia law enforcement officer was recently found in contempt of court for stating that he could not remember his passcode.
Third party storage services are also subject to subpoena on both the civil and criminal sides. The government and Apple, for example, are often at loggerheads over this—particularly with the automatic encryption that comes standard on an iPhone. Newer iPhones contain the ability to block physical access to the phone by law enforcement by shutting down the phone’s port.
Blind Subpoenas and Digital Forensics
Just to make things a bit more complicated, there are independent encryption providers that sell services to encrypt data and hold the keys specifically in response to a “blind” subpoena, which is a subpoena served on the data storage service without the knowledge of the owner of the data. For example, law enforcement may subpoena iCloud for a user’s data but not notify the user.
Now the data is separately encrypted by this “fourth” party, who holds the keys. The fourth party then is in on the conversation of what data gets unencrypted and what data is subject to privilege or objection.
There are times that extraordinary measures may be called for to retrieve data that is impossible or nearly impossible to read in its current state. These occur when data storage devices (like hard drives or smartphones) have been corrupted or destroyed or the data has been erased, or when the data is behind an unavailable password or is encrypted and the key is unavailable.
In these cases, computer forensics experts come to the rescue. On the civil side, these knights of the data wars can be independent companies with a team of techies who have tools to reconstruct the seemingly lost data. On the criminal side, they may be agents of state and federal law enforcement trained in the retrieval of seemingly irretrievable data.
These experts are very good at what they do. For instance, “erased” data can usually be found fairly easily, because full erasure of a hard drive is at least a two-step process, and most people don’t engage the second step. Still, there are times when data is simply lost forever. A destroyed hard drive or smartphone may yield no useful data, although there are cases where the data was backed up without the knowledge of the device’s user and thus can be found.
Reconstructing “lost” data can be very costly, both in terms of time and technology.
On the civil side, the costs of reconstructing this data will be subject to the usual proportionality test of FRCP Rule 26, and the court may order either party to bear the cost of retrieval or order that it is too costly to do so. On the criminal side, the state will bear the cost.
In our next module, we’ll turn to retrieving data through discovery channels provided for under the rules of civil procedure.
 Data Custodian, EDRM, https://www.edrm.net/glossary/data-custodian/
 See attachment 1 to this module for a semi-comprehensive list.
 Data Mapping, Techopedia, https://www.techopedia.com/definition/6750/data-mapping
 Eric Griffith, What is Cloud Computing?, PC Magazine, (May 3, 2016)
 For a full definition of the cloud, see Peter Mell & Tim Grance, NIST Special Publication 800-145, The NIST Definition of Cloud Computing, Computer Security Resource Center (September 2011)
 Fed. R. Civ. P. 26(b).
 Kristen M. Bush, Proportionality in Discovery, Trial Bar News, (June 1, 2016), http://www.schwartzsemerdjian.com/trial-bar-news/proportionality-in-discovery.
 Michele C.S. Lange & Kristin M. Nimsger, Electronic Evidence and Discovery: What Every Lawyer Should Know 1, 43 (American Bar Association 2004).
 Meredith White, Discovery From Data Storage Providers: Building a Silver Lining Into Your Cloud Storage Contract, Barnes & Thornburg (September 2013), http://www.btlaw.com/commercial-litigation-update-discovery-from-data-storage-providers-september-2013/.
 The US Supreme Court equates encryption and other privacy measures to any other kind of storage, electronic or not. See Riley v. California, 134 S. Ct. 2473, 2493 (2014).
 Christina Mercer, What is Encryption?, Tech World, (May 15, 2018), https://www.techworld.com/security/what-is-encryption-3659671/.
 See Cochran v. Caldera Medical, Inc., Civil Action No. 12-5109, 2014 U.S. Dist. LEXIS 55447 at *6-7 (E.D. Pa. Apr. 22, 2014).
 In re Boucher, Vermont District Court No. 2:06-mj-91, 2009 WL 424718 (Feb. 19, 2009); United States v. Fricosu, 841 F. Supp. 2d 1232 (D. Colo. 2012).
 See In the Matter of the Search of a Residence in Aptos, California, Case No. 17-mj-70656-JSC-1, 2018 U.S. Dist. LEXIS 45827 (N.D. Cal. March 20, 2018).
 U.S. v. Mitchell, Case No. 17-0153, 76 M.J. 413 (Ct. App. Armed Forces Aug. 30, 2017).