4 Discovery Strategies for Cases Involving Source Code

A litigator and a Ph.D. in computer science explain source code and ways to make source-code discovery more cost-effective in litigation.

October 25, 2013

The advancement of computer-related technologies and their introduction to the consumer market have accelerated dramatically in the past decade. Product life cycles, particularly for consumer electronics, have shortened as the pace of innovation increases. These factors have complicated intellectual property litigation, including cases relating to patents, trade secrets, and copyrights, by expanding the scope of cases in which the preprogrammed computer instructions, or source code, are relevant to the proof or defense of a claim.

The increased focus on source code also leads to increased cost for asserting or defending claims. For example, reviewing the other side’s source code during discovery to determine how the subject software operates can be expensive and time-consuming. Controlling source code-related costs can be a challenge for case and budget management.

But the good news is that the cost of source code discovery is controllable. In this article, we first provide a brief background about source code, and then suggest discovery strategies to make source-code discovery more cost-effective.

A Brief Introduction to Source Code

Source code is the programmed instructions that specify how a software feature of a computing product is logically organized and processed. Each software feature—for example, pressing a button on a smartphone screen—has associated source code. It tells the computing product what to do. It may also contain, in the background, programmer comments that describe the function.

A computing device uses source code in a converted, machine-readable form. The conversion process translates the code from a programming language into a binary form known as object code. The resulting binary files run the program on a computing device.
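
As a rough illustration of the conversion step, the Python sketch below compiles a few lines of source text into a machine-oriented form. Python produces bytecode for its own interpreter rather than the native object code described above, so this is only an analogy. The button_pressed function and its comment are hypothetical.

```python
# Hypothetical source code for a single feature, including a programmer comment.
SOURCE = '''
def button_pressed(count):
    # Programmer comment: increment the press counter.
    return count + 1
'''

# Convert the human-readable text into a compiled code object.
# (Python bytecode, used here only as an analogy for object code.)
compiled = compile(SOURCE, "<feature>", "exec")

# Running the compiled form defines the function, much as a device runs a binary.
namespace = {}
exec(compiled, namespace)
function_code = namespace["button_pressed"].__code__

# The compiled instructions are raw bytes, not readable text.
raw_bytes = function_code.co_code
comment_survives = b"Programmer comment" in raw_bytes
result = namespace["button_pressed"](1)
```

In this sketch the comment is visible in the source text but absent from the compiled instruction bytes, which suggests why the human-readable source, with its comments, is easier to review than the converted form.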

Containing Discovery Costs

The source code must account for each different permutation of operation that a device may perform. As a result, it may comprise millions or even billions of lines of code. In a complex case, the volume of code may increase if there are multiple versions of a product or multiple products at issue. This presents unique challenges for those investigating this large amount of information.

The first challenge is finding and isolating the few lines of relevant code among the millions of lines produced. The second is performing the search in a cost-effective manner. One should approach these challenges with a defined strategy.

1. Consider narrowing discovery requests to target specific functionalities

In the case of source code discovery, reliance on broad requests for “all” source code related to a product may result in an unnecessarily large burden. Receiving “all” the source code for a product (and potentially multiple versions of code) means the receiving party must first work to eliminate irrelevant files. This is a time-consuming task because it still requires some review to determine what is irrelevant. Moreover, a broad request invites a producing party to dump all the source code for a product. The requesting party’s own request for “all” source code leaves little room to complain later about the overbroad production.

A targeted request for source code, however, may keep production to a manageable size. First, perform a detailed investigation of the scope of the source code issue. Identify the specific feature at issue, and craft a written definition to use as the basis for a request for production. A request based on the relevant product feature potentially avoids the issue of requesting additional code if the first request turns out to be deficient. The request should seek production of all code related to performing the feature identified, including at least the program files, library files, and configuration files.

In patent cases where claim language is relevant to the features at issue, a drafter should consider avoiding definitions of the accused product in words associated with the asserted claims. Using claim language to define the feature invites an objection and delay.

2. Define the scope of the request to include more than just the program files

A full understanding of how an executable program works is obtained from more than just the source code program files. There are three significant types of files relevant to understanding a program’s operation:

  1. Program files
  2. Library files
  3. Compiling files

The program files show the algorithm steps. The library files define the data structures into which data is stored and on which the algorithms operate. The compiling files define what files and/or portions of files are actually compiled into the executable object code. Without the compiling files, it is entirely possible to review code that is not used in the relevant product.
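
The risk of reviewing code that never reaches the product can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the file names, the extension-to-category mapping, and the idea that the list of compiled sources has already been read out of the produced compiling files. Real projects organize these files in many different ways.

```python
from pathlib import PurePosixPath

# Hypothetical list of files produced for inspection.
produced_files = [
    "src/player/playlist.c",
    "src/player/legacy_playlist.c",
    "include/playlist.h",
    "build/player.mk",
]

# Hypothetical list of sources named in the compiling (build) file.
# In practice this would be read from the produced build scripts.
compiled_sources = {"src/player/playlist.c"}

def classify(path):
    """Roughly sort a file into the three categories by extension (assumed mapping)."""
    suffix = PurePosixPath(path).suffix
    if suffix in {".c", ".cpp"}:
        return "program"
    if suffix in {".h", ".hpp"}:
        return "library"
    if suffix in {".mk", ".cmake"}:
        return "compiling"
    return "other"

def not_in_build(files, built):
    """Return program files that the build file never compiles."""
    return [f for f in files if classify(f) == "program" and f not in built]

unused = not_in_build(produced_files, compiled_sources)
```

Here the hypothetical legacy_playlist.c is flagged because no compiling file names it, so a reviewer could set it aside before spending time on it.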

“Read-me”-type files are also useful for guiding review. These may contain a narrative describing characteristics of the program files themselves, or a history about them. Examples of information that could be found here include: an identification of the program files that should be in a directory folder; authors of the files; event dates related to the files; change and modification; and a corrections (or “bugs”) list. This information provides context to the source code and clues to its operation.

To avoid ambiguity, consider making a separate request for production of these files. In some cases a party may not consider library and compiling files to be source code. In other cases a party may be unaware that a request for source code files may not be read to include library and compiling files, although those files relate to the source code. Removing this ambiguity with a separate document request should further simplify the discovery process and avoid disputes.

3. Shorten review time by using interrogatories

A requesting party can also seek assistance from the producing party on where to look in the code for the desired feature. An interrogatory can be used to obtain an identification and process flow of the source code for a described feature. A substantive response can at least provide a starting point at which to begin an inspection. A responding party may attempt to rely on Rule 33(d) of the Federal Rules of Civil Procedure, which allows a party to produce relevant documents in lieu of answering the interrogatory and simply reference the source code. Recent district court decisions suggest that a Rule 33(d) answer alone is insufficient, because a producing party should know how its own product, and its source code, operates. Personal Audio, LLC v. Apple, Inc., 2010 U.S. Dist. LEXIS 14421 (E.D. Tex. June 1, 2010); Laserdynamics, Inc. v. Asus Computer Int’l, 2009 U.S. Dist. LEXIS 3878 (E.D. Tex. Jan. 21, 2009).

In addition, an early understanding of the correspondence between a product’s commercial name and any internal names is also useful for streamlining review. Companies often use separate development names for products. Companies may also use more than one designation, including a development name, a product/marketing name, a SKU, and perhaps some other internal reference number for financial reporting. Reporting functions within a company may not cross-reference across departments. A bill of materials for a product may identify the source code component, but the reference to the component may not match the source code as produced.

These ambiguities can be resolved with an interrogatory requesting identification of all internal and external product designations that a party may use. A complete response is difficult to avoid and entirely relevant to discovery in the case.
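
Once such a response is in hand, the designations can be kept in a simple cross-reference for use during review. The Python sketch below is one possible way to do that. The product names, development names, and SKUs are invented for illustration and are not drawn from any real product or case.

```python
# Hypothetical designations, as might be listed in an interrogatory response.
designations = [
    {"marketing": "SoundPod 2", "development": "falcon", "sku": "SP2-001"},
    {"marketing": "SoundPod 3", "development": "osprey", "sku": "SP3-001"},
]

def build_index(rows):
    """Map every known designation (case-insensitive) to its full record."""
    index = {}
    for row in rows:
        for name in row.values():
            index[name.lower()] = row
    return index

def resolve(name, index):
    """Return the record for any designation, or None if it is unknown."""
    return index.get(name.lower())

index = build_index(designations)
record = resolve("FALCON", index)
```

With this index, a development name found in a source code directory or a SKU found in a bill of materials resolves to the same record as the marketing name, and an unknown name returns nothing, which flags a gap to raise with the producing party.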

4. Consider taking an early deposition

Uncontroversial information regarding the folder directory on a produced source code computer, the arrangement of the types of files produced, and the correspondence of files produced to relevant products should be available without dispute. Where a producing party is uncooperative, however, a deposition regarding the source code may quickly resolve these questions. To account for a potentially difficult party, try to build flexibility into the discovery plan with either a specific extra deposition on the source-code production or a deposition limit measured in hours instead of persons.

The Upshot

Source code discovery presents a significant risk of resource waste because the size and complexity of the data make it difficult to review. Invariably, there are large source code sets involved that do not lend themselves to easy summary. However, with initial planning and a clear understanding of the end point, significant resources can be saved.

In our second article, we will address avoidable pitfalls that may otherwise increase review expense.

David A. Prange is a senior associate with Robins, Kaplan, Miller & Ciresi and focuses his practice on patent and trade secret litigation. In recent years, he has represented plaintiffs and defendants in patent cases involving source code issues, including the successful assertion of source code-based patent claims at trial. Esam A. Sharafuddin Ph.D. (computer science) is an in-house scientist with the firm and experienced in software development and system architecture.

Reprinted with permission from the October 2013 issue of Corporate Counsel. Copyright 2013 ALM Media Properties LLC. Further duplication without permission is prohibited. All rights reserved.

The articles on our website include some of the publications and papers authored by our attorneys, both before and after they joined our firm. The content of these articles should not be taken as legal advice. The views and opinions expressed in this article are those of the author(s) and do not necessarily reflect the views or official position of Robins Kaplan LLP.


Esam Sharafuddin, Ph.D.

Science Advisor
