Where ESI is Stored and How it is Retrieved - Module 2 of 5
ESI Storage
The person in charge of
data storage is termed the “custodian” of that data. That could be the party
who created the document and keeps it on a local computer or it could be a cell
phone company or cloud storage host.[1]
Electronically stored
information can be stored and accessed in many ways and places.[2] This
means physical storage of data, which requires some form of electronic, usually
magnetic, “media.” This media can include permanent media like personal
computer hard drives, company network drives, phone storage drives, and so on.
It can also include temporary or removable media like flash drives.
The process of locating
and charting all this data and attendant corporate policies regarding that data
is called “data mapping.”[3] There are numerous data mapping platforms.
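A data map can be pictured as a simple inventory of where data lives, who holds
it, and which policy governs it. The sketch below is only illustrative. The
field names and the sample entries are assumptions made for this example, not a
standard schema or the format of any particular data mapping platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """One row in a hypothetical data map."""
    name: str              # what the data is, e.g. "Sales team email"
    custodian: str         # who is in charge of storing it
    location: str          # "onsite" or "offsite" (cloud / third party)
    media: str             # e.g. "network drive", "flash drive"
    retention_policy: str  # the corporate policy that applies


# Sample entries are invented for illustration only.
DATA_MAP = [
    DataSource("Sales team email", "IT department", "onsite",
               "network drive", "retain 7 years"),
    DataSource("Phone text backups", "Cloud storage host", "offsite",
               "cloud storage", "retain until deleted by user"),
    DataSource("Contract drafts", "Document author", "onsite",
               "personal computer hard drive", "no written policy"),
]


def sources_by_location(data_map, location):
    """Return the names of all sources stored at the given location."""
    return [s.name for s in data_map if s.location == location]


print(sources_by_location(DATA_MAP, "offsite"))
```

A listing like this makes it easier to see at a glance which sources are held by
third parties and which have no written retention policy.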
The media can be
physically located anywhere and can be accessed from anywhere. This includes
hard drives “in the cloud,” which means data stored offsite, often by
third-party providers, and accessed by the entity who is subject to discovery.
To begin in this module,
we will discuss where ESI is stored and how to access it. One factor in
accessing ESI is the cost, and as the federal rules allow discovery to be
limited by cost, that is an important consideration in discovery requests.
The Cloud
The “cloud” accomplishes
offsite data storage, which means that the servers housing the data are not
located at the same place as the business or residence of the entity that owns
the data. “Onsite” means that the servers are physically located in the same place
as the person who generated the information. For example, an iPhone stores text
messages (onsite), but those texts are backed up in iCloud (offsite). Documents
could be stored on a PC (onsite), but the files are backed up in the Google
cloud (offsite).[4]
The term “cloud” is
often used as shorthand for service models such as “Software as a Service,” or
SaaS[5]. Cloud storage companies perform the data storage and access functions
that could be accomplished onsite, but which are usually cheaper to transfer to
a cloud storage service. The term refers to both the physical servers housing
the data and the software that makes those servers perform storage functions.
“Proportionality”
One of the major changes
to Rule 26(b) of the Federal Rules of Civil Procedure in 2006 (and again as
amended in December of 2015) was the introduction of the idea of
“proportionality.”[6] Proportionality is a cost-benefit analysis that can
be applied to a discovery request for ESI.[7]
Prior to that time,
courts rarely considered the costs of discovery. But it became evident over
time that the costs of drilling down into millions of documents looking for
certain key words or an email chain were expanding exponentially. There were
some cases where the costs of discovery were greater than the amount in
question in the case. So, courts were given discretion to limit ESI production
on a case-by-case basis based on cost.
Types of Data Storage
One way to categorize
these storage methods for proportionality analysis breaks them down into five
types:
Type 1: Active, online data. This is the data that the
holder of the ESI actively uses and consists of all active storage on hard
drives, network servers or in the cloud. This is the most easily accessible
information and is the least costly to maintain and access. It is “first tier”
data.
Type 2: Near-online data. This is information that has
been stored on removable media like flash drives, external disk drives, SD
cards, etc. It also includes data stored and accessed via remote devices like
magnetic tape and optical drives. It’s also considered first tier because the
data can be accessed quickly and easily.
Type 3: Offline storage and archives. This is
information that has been sent out for storage—generally it is no longer in
active use but can be accessed if needed. This is data housed outside of the
normal business and needs to be physically accessed. This data is on magnetic
tape, optical drives, etc. It is reasonably accessible because it is still a
part of normal business operations. It’s still considered first tier data since
it can be accessed relatively easily.
Type 4: Disaster recovery files. These are offsite and
usually compressed or encrypted. This data could be physically held by a third
party but may be more complex to recover. This kind of data will be subject to
a proportionality analysis because retrieval can be very costly. It is second
tier data.
Type 5: Erased, fragmented or corrupted data. This
data may be difficult or impossible to access but it also may contain evidence
of tampering with the data. Trying to reconstruct this data can be very expensive
and may not produce results. The proportionality arguments here will often be
speculative—as in “we know they erased this data, but we need to access the
hard drives to prove it.” Courts vary greatly in how they handle discovery
requests for this data.[8]
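The five types above can be written down as a small lookup table. This is a
minimal sketch, not an established classification tool. The tier values come
from the descriptions above, Type 5 is left without a tier because the text does
not assign one, and the `likely_contested` rule is an assumption made for this
example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StorageType:
    number: int
    label: str
    tier: Optional[int]  # None where the text does not assign a tier


STORAGE_TYPES = [
    StorageType(1, "Active, online data", 1),
    StorageType(2, "Near-online data", 1),
    StorageType(3, "Offline storage and archives", 1),
    StorageType(4, "Disaster recovery files", 2),
    StorageType(5, "Erased, fragmented or corrupted data", None),
]


def likely_contested(storage_type):
    """Assumption for this sketch: anything outside the first tier is
    where cost-based (proportionality) objections are most likely."""
    return storage_type.tier != 1


for st in STORAGE_TYPES:
    flag = "cost likely contested" if likely_contested(st) else "readily accessible"
    print(f"Type {st.number}: {st.label} -> {flag}")
```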
People store their own
data personally on their computers, phones, tablets and other devices.
Businesses store data on their in-house servers. All this data is subject to
e-discovery under FRCP Rule 26, state discovery rules, and state and federal
rules of evidence. All this private data is also subject to the preservation
rules outlined in the first module.
Third Party Storage
Because cloud companies
have physical control of data, they are also subject to subpoenas for that
data. In addition, attorneys who use cloud storage services are subject to
ethical considerations regarding client security and privacy. Both of those
ideas will be expanded on in coming modules.
Principles of accessing
cloud storage also apply to companies that transmit and store data as a regular
part of their customer business models. This includes phone companies like
Sprint and T-Mobile, and social media companies like Facebook and Reddit. There
are also companies that offer encrypted storage.
If any of these
companies store data that was transmitted or posted by a customer, then they
are third party data storage companies for purposes of responding to discovery
requests or subpoenas. Who actually owns the data might be unclear,
but all companies that store other people’s data for any reason, whether or not
the customer even knows the data is being stored, are subject to e-discovery.[9]
Like any potential
evidence, all ESI documents are subject to a chain-of-custody analysis to
determine their authenticity. This is done by looking at the metadata of the
documents and the storage devices to track the document’s creation and history.[10]
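As a rough illustration of what looking at metadata can involve, the sketch
below reads the basic file system details for a file (size and last-modified
time). It also computes a SHA-256 hash as a content fingerprint. The hash is an
assumption added for this example, since the text itself refers only to
metadata, and the function name `describe_file` is hypothetical.

```python
import hashlib
import os
import tempfile
from datetime import datetime, timezone


def describe_file(path):
    """Collect basic file system metadata plus a SHA-256 digest."""
    info = os.stat(path)
    with open(path, "rb") as handle:
        digest = hashlib.sha256(handle.read()).hexdigest()
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "path": path,
        "size_bytes": info.st_size,
        "modified_utc": modified.isoformat(),
        "sha256": digest,
    }


# Demonstration on a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"Draft agreement, version 1")
    sample_path = tmp.name

record = describe_file(sample_path)
os.remove(sample_path)
print(record)
```

If two copies of a document produce the same hash, their contents are identical
even when the file names or dates differ. A changed hash shows that the content
was altered at some point.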
Forms of
Electronically Stored Information
Here is a look at some primary forms of ESI and how each may present difficult e-discovery problems.
Email
Email is the predominant
form of ESI, comprising, by one estimate, 90% of all e-discovery. [That is
because not everybody texts, not everybody uses social media and not everybody
stores data in the cloud—but everybody uses email.] Email is archived so that
it can be accessed, but email systems often don’t have the native search
capabilities to adequately respond to e-discovery requests. So, there are
numerous third-party email-centric search platforms available in the market.
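To give a sense of what a keyword search over email involves at its simplest,
here is a minimal sketch using Python's standard `email` module. The two
messages and the function name `search_messages` are invented for this example.
Commercial search platforms do far more, such as handling attachments, threading
and deduplication.

```python
from email import message_from_string

# Two invented messages in standard email text form.
RAW_MESSAGES = [
    "From: alice@example.com\nTo: bob@example.com\n"
    "Subject: Q3 pricing\n\nPlease keep the discount schedule confidential.\n",
    "From: bob@example.com\nTo: carol@example.com\n"
    "Subject: Lunch\n\nAre you free on Friday?\n",
]


def search_messages(raw_messages, keyword):
    """Return (sender, subject) for each message that mentions keyword
    in its subject or body, ignoring case."""
    needle = keyword.lower()
    hits = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        body = msg.get_payload()  # plain string for a simple text message
        haystack = f"{msg['Subject']} {body}".lower()
        if needle in haystack:
            hits.append((msg["From"], msg["Subject"]))
    return hits


print(search_messages(RAW_MESSAGES, "discount"))
```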
Databases
Databases also present
preservation and search challenges to the discovery process. Databases can hold
hundreds of millions of documents, and even the most sophisticated search
platforms can take a substantial amount of time and money to search properly.
In determining whether
discovery requests sent to holders of a database are enforceable, the threshold
questions revolve around proportionality and will include questions about the
standard and enhanced reporting capabilities of the database, exporting
capabilities, database structure, query language capabilities, user experience
and interface and more. All of that can also be expressed in general terms and
in case-specific terms. Once proportionality is established, the parties should
work out amongst themselves what data is accessible and how it will be accessed
and delivered.[11] What information is privileged needs to be addressed as
well.[12]
Word Processing and
Spreadsheets
Here we have data that
is often changed and passed around to multiple users in the ordinary course of
business. The metadata changes every time the document or spreadsheet changes
and numerous versions can be stored on numerous devices. Chasing down all the
versions of these documents and spreadsheets around multiple users’ devices is
a form of detective work—and the time involved may impact proportionality.
Challenges also exist,
of course, for all other forms of ESI. Faxes are long-distance photocopies but
are also stored in various places. Audio and video files, photographs and
multimedia often are geo-stamped and have their own metadata but are also
subject to editing. Other documents are subject to audit trails that are
discoverable. And, of course, the Internet itself is a source of a tremendous
amount of information.
All in all, while there
is a lot of data that is discoverable, it may not be so easy to get to it. And
there is, of course, an entire e-discovery software industry dedicated to
producing products that find, transmit and analyze all this data.
Trying to Keep
Documents Private: Encryption and Passwording
Keeping data from prying
eyes—hackers or other unauthorized users—requires some effort on the part of
custodians of that data. The most common techniques to achieve privacy include
password protection, encryption and physically separating the data storage unit
from the rest of the system. That last technique is called “air gapping.”
Password protecting and
encrypting files, emails, texts and other data is a common practice for users
to help ensure data privacy and security.[13] When a subpoena or discovery
request arrives, encrypted files may cause some problems, some of them fatal
to the discovery request.
Encryption is the coding
of information: scrambling it in a way that only a party with the encryption
“key” can translate. It is as old as human communications, traceable back at
least as far as the time of Julius Caesar. When we talk about data encryption in
the modern context, we are referring to a process that makes data unreadable to
a reader unless that reader has a key. Basic password protection is a related,
simpler way of restricting access.[14]
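The cipher traditionally associated with Julius Caesar shows the idea of a key
in miniature: every letter is shifted a fixed number of places, and that number
is the key. The sketch below is a teaching toy, not a secure method, and the
function name `caesar` is chosen for this example.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar(text, key):
    """Shift each letter by `key` positions; other characters pass through."""
    shift = key % 26
    shifted = ALPHABET[shift:] + ALPHABET[:shift]
    table = str.maketrans(ALPHABET, shifted)
    return text.upper().translate(table)


secret = caesar("MEET AT NOON", 3)
print(secret)               # PHHW DW QRRQ
print(caesar(secret, -3))   # MEET AT NOON
```

Anyone who knows the key (here, 3) can reverse the scrambling. Modern encryption
works on the same principle but with keys far too large to guess.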
Encryption keys are
lengthy strings of binary digits, or “bits,” usually written out as letters and
numbers. Keys can be of varying length, but the standard keys today are usually
256 bits long, which is 32 bytes or 64 hexadecimal characters. The 256-bit
encryption is often referred to as “military grade.”
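To make the size of a 256-bit key concrete, the short sketch below generates one
with Python's standard `secrets` module and measures it. A bit is a single
binary digit, so a 256-bit key occupies 32 bytes. The variable name `key` is
simply illustrative.

```python
import secrets

key = secrets.token_bytes(32)   # 32 bytes x 8 bits per byte = 256 bits

print(len(key) * 8)             # number of bits in the key
print(len(key.hex()))           # length when written as hexadecimal characters
print(2 ** 256)                 # how many different 256-bit keys are possible
```

The last figure is a 78-digit number, which is why guessing such a key by trial
and error is not considered practical.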
There are two types of
keys: single key and public key. A single key, also called a symmetric key, is
shared by the sender and the recipient and is used both to encrypt and to
decrypt. With public keys, also called asymmetric keys, each party has a pair:
a public key that anyone may use to encrypt a message and a private key, kept
secret, that decrypts it. Public keys are used in e-commerce, where one party,
say an online store, publishes a key available to the public so that customers
can send it information securely. “End-to-end” encryption, found in private
texting platforms like Signal and WhatsApp, means that only the sender and the
recipient hold the keys needed to read a message, so the service provider cannot
read it. There are numerous encryption standards. You may see
initials like RSA, PGP, SSL, SET, DES or others, but for legal purposes, the
standard does not matter. What is important is access to the data that is
hiding behind the encryption key.
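The difference between the two kinds of keys is easiest to see with numbers. In
a public key system, one key scrambles the message and a different, private key
unscrambles it. The sketch below uses the arithmetic behind RSA with
deliberately tiny primes. It is insecure and for illustration only, and the
function names `encrypt` and `decrypt` are chosen for this example rather than
taken from any library.

```python
# Toy RSA-style key pair built from two tiny primes.
# INSECURE: real keys use primes hundreds of digits long.
p, q = 61, 53
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (needs Python 3.8+)

public_key = (e, n)            # may be given to anyone
private_key = (d, n)           # must be kept secret


def encrypt(message_number, key):
    exponent, modulus = key
    return pow(message_number, exponent, modulus)


def decrypt(cipher_number, key):
    exponent, modulus = key
    return pow(cipher_number, exponent, modulus)


message = 65                   # the message, as a number smaller than n
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, private_key)
print(ciphertext, recovered)
```

Whoever holds only the public key can scramble a message but cannot read what
others have scrambled. For discovery purposes, this is why access to the private
key, and not the choice of standard, determines whether the data can be read.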
The three primary uses
of data encryption within the legal context are in data storage, data
communication, and online transactions.
For e-discovery purposes
in civil cases, courts will generally require data to be un-encrypted by the
respondent to a discovery request or subpoena under the principles of open
discovery discussed in Module 1, so much of the time encryption really is not
much of an issue.
The primary problems
that may arise on the civil side in attempting to access encrypted data are
ones where the data key is not available—often because the one person who held
the key died or can’t be located.
There is some precedent
in cases where the key can’t be located to simply declare the data unavailable
as “not reasonably accessible because of undue burden or costs.”[15] If
recovering that data is reasonable, the court could order that recovery.
Criminal cases are a
different matter because they run into the Fifth Amendment privilege against
self-incrimination, which may include the right to avoid making one’s data
available.
Federal courts have taken both sides of this issue, called “compelled
decryption.” Some courts have held that the Fifth Amendment protects the data[16] and
some have ruled that it doesn’t.[17] The recent trend seems to be away
from the Fifth Amendment protecting the data.[18] However, a military
court recently ruled that forcing a member of the military to reveal his
smartphone password violates his Fifth Amendment rights.[19] On the other
side, a Philadelphia law enforcement officer was recently found in contempt of
court for stating that he could not remember his passcode.[20]
Third party storage
services are also subject to subpoena on both the civil and criminal sides. The
government and Apple, for example, are often at loggerheads over
this—particularly with the automatic encryption that comes standard on an
iPhone. Newer iPhones contain the ability to block physical access to the phone
by law enforcement by shutting down the phone’s port.
Blind Subpoenas and
Digital Forensics
Just to make things a
bit more complicated, there are independent encryption providers that sell
services to encrypt data and hold the keys specifically to guard against a
“blind” subpoena, which is a subpoena served on the data storage service
without the knowledge of the owner of the data. For example, law enforcement
may subpoena iCloud for a user’s data but not notify the user.
Now the data is
separately encrypted by this “fourth” party, who holds the keys. The fourth
party then is in on the conversation of what data gets unencrypted and what
data is subject to privilege or objection.
Digital Forensics
There are times that
extraordinary measures may be called for to retrieve data that is impossible or
nearly impossible to read in its current state. These occur when data storage
devices (like hard drives or smartphones) have been corrupted or destroyed or
the data has been erased, or when the data is behind an unavailable password or
is encrypted and the key is unavailable.
In these cases, computer
forensics experts come to the rescue. On the civil side, these knights of the
data wars can be independent companies with a team of techies who have tools to
reconstruct the seemingly lost data. On the criminal side, they may be agents
of state and federal law enforcement trained in the retrieval of seemingly
irretrievable data.
These experts are very
good at what they do. For instance, “erased” data can usually be found pretty
easily, because full erasure of a hard drive is at least a two-step process, and
most people don’t engage the second step. Even so, there are times when data is
simply lost forever. While a destroyed hard drive or smartphone may yield no
useful data, there are cases where the data was backed up without the knowledge
of the device’s user and thus can be found.
Reconstructing “lost”
data can be very costly, both in terms of time and technology.
On the civil side, the
costs of reconstructing this data will be subject to the usual proportionality
test of FRCP Rule 26, and the court may order either party to bear the cost of
retrieval or find that retrieval is too costly to require. On the criminal
side, the state will bear the cost.
In our next module,
we’ll turn to retrieving data through discovery channels provided for under the
rules of civil procedure.
[1] Data Custodian, EDRM, https://www.edrm.net/glossary/data-custodian/
[5] For a full definition of the cloud, see Peter Mell & Tim Grance, NIST Special Publication 800-145, The NIST Definition of Cloud Computing, Computer Security Resource Center (September 2011).
[7] Kristen M. Bush, Proportionality in Discovery, Trial Bar News, (June 1, 2016), http://www.schwartzsemerdjian.com/trial-bar-news/proportionality-in-discovery.
[8] Michele C.S. Lange & Kristin M. Nimsger, Electronic Evidence and Discovery: What Every Lawyer Should Know 1, 43 (American Bar Association 2004).
[9] Meredith White, Discovery From Data Storage Providers: Building a Silver Lining Into Your Cloud Storage Contract, Barnes & Thornburg (September 2013), http://www.btlaw.com/commercial-litigation-update-discovery-from-data-storage-providers-september-2013/.
[13] The US Supreme Court equates encryption and other privacy measures to any other kind of storage, electronic or not. See Riley v. California, 134 S. Ct. 2473, 2493 (2014).
[14] Christina Mercer, What is Encryption?, Tech World (May 15, 2018), https://www.techworld.com/security/what-is-encryption-3659671/.
[15] See Cochran v. Caldera Medical, Inc., Civil Action No. 12-5109, 2014 U.S. Dist. LEXIS 55447 at *6-7 (E.D. Pa. Apr. 22, 2014).
[17] In re Boucher, Vermont District Court No. 2:06-mj-91, 2009 WL 424718 (Feb. 19, 2009); United States v. Fricosu, 841 F. Supp. 2d 1232 (D. Colo. 2012).
[18] See In the Matter of the Search of a Residence in Aptos, California, Case No. 17-mj-70656-JSC-1, 2018 U.S. Dist. LEXIS 45827 (N.D. Cal. March 20, 2018).
[20] US v. Apple Mac Pro Computer, et al., Case No. 15-3537, 851 F.3d 238 (3d Cir. March 20, 2017).