Case 4:22-cv-06823-JST Document 95 Filed 05/11/23 Page 1 of 25
`
`UNITED STATES DISTRICT COURT
`
`NORTHERN DISTRICT OF CALIFORNIA
`
`J. DOE 1, et al.,
`
`Plaintiffs,
`
`v.
`
`GITHUB, INC., et al.,
`
`Defendants.
`
`
`
`
`
`Case No. 22-cv-06823-JST
`
`
`ORDER GRANTING IN PART AND
`DENYING IN PART MOTIONS TO
`DISMISS
`
`Re: ECF Nos. 50, 53
`
`
`
`Before the Court are motions to dismiss filed by Defendants GitHub, Inc. and Microsoft
`
`Corporation, ECF No. 50; and Defendants OpenAI, Inc., OpenAI, L.P., OpenAI GP, L.L.C.,
`
`OpenAI Startup Fund GP I, L.L.C., OpenAI Startup Fund I, L.P., and OpenAI Startup Fund
`
`Management, LLC (collectively “OpenAI Defendants”), ECF No. 53. The Court will grant the
`
`motions in part and deny them in part.
`
`I.
`
`BACKGROUND
`
`Plaintiffs are software developers who challenge Defendants’ development and operation
`
`of Copilot and Codex, two artificial intelligence-based coding tools.1 For the purposes of the
`
`present motions, the Court accepts as true the following facts in the operative complaint.2
`
`GitHub, which was acquired by Microsoft in 2018, is the largest internet hosting service
`
`for software projects stored in Git, a widely used open-source version control system for managing
`
`
`1 Plaintiffs define “artificial intelligence” (“AI”) as “a computer program that algorithmically
`simulates human reasoning or inference, often using statistical methods.” Compl. ¶ 2.
`
`2 By stipulation and order, ECF No. 47, the operative complaint in the consolidated case is that
`filed in Doe 3 et al. v. GitHub, Inc. et al., No. 22-cv-7074-JST, ECF No. 1. All references to the
`complaint in this order refer to the operative complaint in the consolidated case.
`
`
`
`software source code. Using GitHub permits software developers or programmers to collaborate
`
`on projects stored in repositories. Repositories may be private or public; anyone can view and
`
`access code stored in public repositories.
`
`All code uploaded to GitHub is subject to the GitHub Terms of Service, which provide that
`
`users retain ownership of any content they upload to GitHub, but grant GitHub the “right to store,
`
`archive, parse, and display [the content], and make incidental copies, as necessary to provide the
`
`Service, including improving the Service over time.” No. 22-cv-7074-JST, ECF No. 1-2 at 27.
`
`This “includes the right to do things like copy [the code] to our database and make backups; show
`
`it to you and other users; parse it into a search index or otherwise analyze it on our servers; [and]
`
`share it with other users.” Id. at 27-28. Further, the Terms of Service provide that users who set
`
`their repositories to be viewed publicly “grant each User of GitHub a nonexclusive, worldwide
`
`license to use, display, and perform [the content] through the GitHub Service and to reproduce
`
`[the content] solely on GitHub as permitted through GitHub’s functionality.” Id. at 28.
`
`Without AI-based assistance, programmers generally write code “both by originating code
`
`from the writer’s own knowledge of how to write code as well as by finding pre-written portions
`
`of code that—under the terms of the applicable license—may be incorporated into the coding
`
`project.” Compl. ¶ 78. Plaintiffs have each published licensed materials in which they hold a
`
`copyright interest to public repositories on GitHub. When creating a new repository, a GitHub
`
`user may “select[] one of thirteen licenses from a dropdown menu to apply to the contents of that
`
`repository.” Id. ¶ 34 n.4. Two of the suggested licenses waive copyrights and related rights. The
`
`remaining eleven suggested licenses3 require that any derivative work or copy of the licensed work
`
`include attribution to the owner, inclusion of a copyright notice, and inclusion of the license terms.
`
`Each Plaintiff published code to a public repository on GitHub under one of the eleven suggested
`
`
`3 These eleven licenses are (1) Apache License 2.0; (2) GNU General Public License version 3
`(“GPL-3.0”); (3) MIT License; (4) The 2-Clause BSD License (“BSD 2”); (5) The 3-Clause BSD
`License (“BSD 3”); (6) Boost Software License; (7) Eclipse Public License 2.0; (8) GNU Affero
`General Public License version 3 (“AGPL-3.0”); (9) GNU General Public License version 2 (“GPL
`2”); (10) GNU Lesser General Public License version 2.1 (“LGPL-2.1”); and (11) Mozilla Public
`License 2.0. Compl. ¶ 34 n.4.
`
`
`licenses that include these three requirements.
`
`In June 2021, GitHub and OpenAI released Copilot, an AI-based program that can “assist
`
`software coders by providing or filling in blocks of code using AI.” Id. ¶ 8. In August 2021,
`
`OpenAI released Codex, an AI-based program “which converts natural language into code and is
`
`integrated into Copilot.” Id. ¶ 9. Codex is integrated into Copilot: “GitHub Copilot uses the
`
`OpenAI Codex to suggest code and entire functions in real-time, right from your editor.” Id. ¶ 47
`
`(quoting GitHub website). GitHub users pay $10 per month or $100 per year for access to
`
`Copilot. Id. ¶ 8.
`
`Codex and Copilot employ machine learning, “a subset of AI in which the behavior of the
`
`program is derived from studying a corpus of material called training data.” Id. ¶ 2. Using this
`
`data, “through a complex probabilistic process, [these programs] predict what the most likely
`
`solution to a given prompt a user would input is.” Id. ¶ 79. Codex and Copilot were trained on
`
`“billions of lines” of publicly available code, including code from public GitHub repositories. Id.
`
`¶¶ 82-83.
`
`Despite the fact that much of the code in public GitHub repositories is subject to open-
`
`source licenses which restrict its use, id. ¶ 20, Codex and Copilot “were not programmed to treat
`
`attribution, copyright notices, and license terms as legally essential,” id. ¶ 80. Copilot reproduces
`
`licensed code used in training data as output with missing or incorrect attribution, copyright
`
`notices, and license terms. Id. ¶¶ 56, 71, 74, 87-89. This violates the open-source licenses of
`
`“tens of thousands—possibly millions—of software developers.” Id. ¶ 140. Plaintiffs additionally
`
`allege that Defendants improperly used Plaintiffs’ “sensitive personal data” by incorporating the
`
`data into Copilot and therefore selling and exposing it to third parties. Id. ¶¶ 225-39.
`
`Plaintiffs filed multiple cases against Defendants, which were subsequently consolidated.
`
`ECF No. 47. Plaintiffs, on behalf of themselves and two putative classes,4 plead twelve counts
`
`
`4 The “Injunctive Relief Class” and “Damages Class” are each defined as: “All persons or entities
`domiciled in the United States that[] (1) owned an interest in at least one [U.S.] copyright in any
`work; (2) offered that work under one of GitHub’s Suggested Licenses; and (3) stored Licensed
`Materials in any public GitHub repositories at any time between January 1, 2015 and the present
`(the ‘Class Period’).” Compl. ¶ 34 (footnote omitted).
`
`
`against Defendants: (1) violation of the Digital Millennium Copyright Act (“DMCA”), 17 U.S.C.
`
`§§ 1201-05; (2) common law breach of open-source licenses; (3) common law tortious
`
`interference in a contractual relationship; (4) common law fraud; (5) false designation of origin in
`
`violation of the Lanham Act, 15 U.S.C. § 1125; (6) unjust enrichment in violation of Cal. Bus. &
`
`Prof. Code §§ 17200, et seq., and the common law; (7) unfair competition in violation of the
`
`Lanham Act, 15 U.S.C. § 1125; Cal. Bus. & Prof. Code §§ 17200, et seq., and the common law;
`
`(8) breach of contract for violation of the GitHub Privacy Policy and Terms of Service; (9)
`
`violation of the California Consumer Privacy Act (“CCPA”); (10) common law negligence; (11)
`
`common law civil conspiracy; and (12) declaratory relief under 28 U.S.C. § 2201(a) and Cal. Code
`
`Civ. Proc. § 1060.5 Defendants now move to dismiss the complaint. ECF Nos. 50, 53.
`
`II.
`
`JURISDICTION
`
`The Court has jurisdiction over Plaintiffs’ federal claims under 28 U.S.C. § 1331 and
`
`supplemental jurisdiction over Plaintiffs’ state law claims under 28 U.S.C. § 1367.
`
`III. LEGAL STANDARD
`
`A.
`
`Rule 12(b)(1)
`
`“Article III of the Constitution confines the federal judicial power to the resolution of
`
`‘Cases’ and ‘Controversies.’” TransUnion LLC v. Ramirez, 141 S. Ct. 2190, 2203 (2021). “For
`
`there to be a case or controversy under Article III, the plaintiff must have a ‘personal stake’ in the
`
`case—in other words, standing.” Id. (quoting Raines v. Byrd, 521 U.S. 811, 819 (1997)). A
`
`defendant may attack a plaintiff’s assertion of jurisdiction by moving to dismiss under Rule
`
`12(b)(1) of the Federal Rules of Civil Procedure. Cetacean Cmty. v. Bush, 386 F.3d 1169, 1174
`
`
`5 While Plaintiffs plead several common law tort claims, they do not identify the state law which
`applies to each claim. “[D]ue to variances among state laws, failure to allege which state law
`governs a common law claim is grounds for dismissal.” In re Nexus 6P Prods. Liab. Litig., 293 F.
`Supp. 3d 888, 933 (N.D. Cal. 2018) (quoting Romero v. Flowers Bakeries, LLC, No. 14-cv-05189-
`BLF, 2016 WL 469370, at *12 (N.D. Cal. Feb. 8, 2016)). With respect to the state law claims in
`any future amended complaint, Plaintiffs shall identify the state under whose law the claim is
`brought. When claims which share a legal theory are brought under multiple laws (for example, a
`federal statute, a state statute, and common law, such as Plaintiffs’ claim for unfair competition),
`Plaintiffs shall state each claim as a separate count. For the purposes of deciding the present
`motion to dismiss, the Court interprets the complaint as asserting the common law claims of
`Counts 2, 3, 4, 6, 7, 10, and 11 under California law.
`
`
`(9th Cir. 2004); see also Maya v. Centex Corp., 658 F.3d 1060, 1067 (9th Cir. 2011) (“[L]ack of
`
`Article III standing requires dismissal for lack of subject matter jurisdiction under Federal Rule of
`
`Civil Procedure 12(b)(1).”).
`
` “A Rule 12(b)(1) jurisdictional attack may be facial or factual.” Safe Air for Everyone v.
`
`Meyer, 373 F.3d 1035, 1039 (9th Cir. 2004). “In a facial attack, the challenger asserts that the
`
`allegations contained in a complaint are insufficient on their face to invoke federal jurisdiction.”
`
`Id. Where, as here, a defendant makes a facial attack, the court assumes that the complaint’s
`
`allegations are true and draws all reasonable inferences in the plaintiff’s favor. Wolfe v.
`
`Strankman, 392 F.3d 358, 362 (9th Cir. 2004).
`
`B.
`
`Rule 12(b)(6)
`
`“Dismissal under [Federal Rule of Civil Procedure] 12(b)(6) is appropriate only where the
`
`complaint lacks a cognizable legal theory or sufficient facts to support a cognizable legal theory.”
`
`Mendiondo v. Centinela Hosp. Med. Ctr., 521 F.3d 1097, 1104 (9th Cir. 2008). A complaint must
`
`contain “a short and plain statement of the claim showing that the pleader is entitled to relief.”
`
`Fed. R. Civ. P. 8(a)(2). Facts pleaded by a plaintiff “must be enough to raise a right to relief
`
`above the speculative level.” Bell Atl. Corp. v. Twombly, 550 U.S. 544, 555 (2007).
`
`“To survive a motion to dismiss, a complaint must contain sufficient factual matter,
`
`accepted as true, to ‘state a claim to relief that is plausible on its face.’” Ashcroft v. Iqbal, 556 U.S.
`
`662, 678 (2009) (quoting Twombly, 550 U.S. at 570). “A claim has facial plausibility when the
`
`plaintiff pleads factual content that allows the court to draw the reasonable inference that the
`
`defendant is liable for the misconduct alleged.” Id. In determining whether a plaintiff has met this
`
`plausibility standard, the Court must “accept all factual allegations in the complaint as true and
`
`construe the pleadings in the light most favorable to the nonmoving party.” Knievel v. ESPN, 393
`
`F.3d 1068, 1072 (9th Cir. 2005).
`
`C.
`
`Leave to Amend
`
`Leave to amend a complaint “shall be freely given when justice so requires.” Fed. R. Civ.
`
`P. 15(a)(2). The decision of whether to grant leave to amend is “within the discretion of the
`
`district court, which may deny leave due to ‘undue delay, bad faith or dilatory motive on the part
`
`of the movant, repeated failure to cure deficiencies by amendments previously allowed, undue
`
`prejudice to the opposing party by virtue of allowance of the amendment, and futility of
`
`amendment.’” Leadsinger, Inc. v. BMG Music Pub., 512 F.3d 522, 532 (9th Cir. 2008) (quoting
`
`Foman v. Davis, 371 U.S. 178, 182 (1962)).
`
`IV. DISCUSSION
`
`A.
`
`Article III Standing
`
`Defendants argue that Plaintiffs lack Article III standing to assert their claims and that the
`
`Court therefore lacks subject-matter jurisdiction over this action.6
`
`“[T]o establish standing, a plaintiff must show (i) that he suffered an injury in fact that is
`
`concrete, particularized, and actual or imminent; (ii) that the injury was likely caused by the
`
`defendant, and (iii) that the injury would likely be redressed by judicial relief.” TransUnion, 141
`
`S. Ct. at 2203. “The party invoking federal jurisdiction bears the burden of establishing these
`
`elements.” Lujan v. Defenders of Wildlife, 504 U.S. 555, 561 (1992). “Where, as here, a case is at
`
`the pleading stage, the plaintiff must ‘clearly . . . allege facts demonstrating’ each element.”
`
`Spokeo, Inc. v. Robins, 578 U.S. 330, 338 (2016) (quoting Warth v. Seldin, 422 U.S. 490, 518
`
`(1975)).
`
`“A plaintiff must demonstrate constitutional standing separately for each form of relief
`
`requested.” Davidson v. Kimberly-Clark, 889 F.3d 956, 969 (9th Cir. 2018). In a putative class
`
`action, named plaintiffs “must allege and show that they personally have been injured, not that
`
`injury has been suffered by other, unidentified members of the class to which they belong and
`
`which they purport to represent.” Warth, 422 U.S. at 502.
`
`Plaintiffs advance two main theories of harm. One theory is that Defendants have shared,
`
`
`6 Both parties impermissibly incorporate substantive material by reference across their briefs.
`“The incorporation of substantive material by reference is not sanctioned by the federal rules.”
`Swanson v. U.S. Forest Serv., 87 F.3d 339, 345 (9th Cir. 1996). In the future, all arguments the
`parties seek to make must be raised in the briefing on a particular motion. See Woolfson v. Conn
`Appliances, Inc., No. 21-cv-07833-MMC, 2022 WL 3139522, at *6 (N.D. Cal. Aug. 5, 2022)
`(declining to consider substantive arguments purportedly incorporated by reference); Williams v.
`Cnty. of Alameda, 26 F. Supp. 3d 925, 947 (N.D. Cal. 2014) (“[T]he Court will not consider the
`arguments that Plaintiff improperly seeks to incorporate by reference. This Court only considers
`arguments that are specifically and distinctively raised by the parties in their briefs.”).
`
`
`sold, and exposed and will continue to share, sell, and expose Plaintiffs’ personal information,
`
`harming Plaintiffs’ rights to privacy and the protection of their personal data. The second is that
`
`Defendants’ use of licensed code as training data for Codex and Copilot harms Plaintiffs’ property
`
`interests in the licensed code because Copilot already has or likely will reproduce Plaintiffs’ code
`
`in violation of those licenses.
`
`1.
`
`Injury to Privacy Rights
`
`Defendants argue that Plaintiffs fail to plausibly allege any disclosure of personal
`
`information, and therefore fail to allege an actual or imminent injury sufficient to confer standing.
`
`In the complaint, Plaintiffs do not identify the specific sensitive or private information at issue.
`
`Plaintiffs therefore do not allege facts sufficient for the Court to infer that Plaintiffs’ privacy
`
`interests are implicated by the alleged misuse of such information. See Ji v. Naver Corp., No. 21-
`
`cv-05143-HSG, 2022 WL 4624898, at *7 (N.D. Cal. Sept. 30, 2022) (dismissing privacy-based
`
`claim for lack of standing where plaintiffs “have not alleged enough facts to show that the
`
`[personal data at issue] are the type of information that could give rise to a privacy injury”); I.C. v.
`
`Zynga, Inc., 600 F. Supp. 3d 1034, 1049 (N.D. Cal. Apr. 29, 2022) (evaluating whether privacy
`
`interests were implicated by allegedly disclosed personal information to determine whether
`
`plaintiffs had standing).
`
`Plaintiffs have not met their burden to allege facts demonstrating an injury-in-fact
`
`sufficient to confer standing for their privacy-based claims. Plaintiffs’ claims for breach of the
`
`GitHub Privacy Policy and Terms of Service, violation of the CCPA, and negligence are
`
`dismissed with leave to amend.
`
`2.
`
`Injury to Property Rights
`
`Defendants argue that Plaintiffs do not identify any instance of Copilot reproducing
`
`Plaintiffs’ licensed code and therefore fail to plead a particularized injury sufficient to confer
`
`standing.
`
`“For an injury to be ‘particularized,’ it ‘must affect the plaintiff in a personal and
`
`individual way.’” Spokeo, 578 U.S. at 339 (quoting Lujan, 504 U.S. at 560 n.1). “[A]t an
`
`irreducible minimum, Art. III requires the party who invokes the court’s authority to ‘show that he
`
`personally has suffered some actual or threatened injury.’” Valley Forge Christian Coll. v. Ams.
`
`United for Separation of Church and State, Inc., 454 U.S. 464, 472 (1982) (quoting Gladstone
`
`Realtors v. Village of Bellwood, 441 U.S. 91, 99 (1979)).
`
`Plaintiffs argue that they “have alleged the requisite particularized injury” because they
`
`“allege that Defendants have violated provisions of [the] open-source licenses” under which
`
`Plaintiffs published their code. ECF No. 67 at 14. In the complaint, Plaintiffs allege that
`
`Defendants violate those licenses because Copilot reproduces Plaintiffs’ code as output with
`
`missing or incorrect attribution, copyright notices, and license terms.7 However, while Plaintiffs
`
`identify several instances in which Copilot’s output matched licensed code written by a GitHub
`
`user, Compl. ¶¶ 56, 71, 74, 87-89, none of these instances involve licensed code published to
`
`GitHub by Plaintiffs. Because Plaintiffs do not allege that they themselves have suffered the
`
`injury they describe, they do not have standing to seek retrospective relief for that injury.8
`
`Plaintiffs further argue that they “also allege an imminent and significant harm—now that
`
`their software code is part of the training data . . . , [it] can be output at any time without the
`
`legally necessary notices and attribution, leading to increased and imminent risk of
`
`misappropriation.” ECF No. 67 at 15. To the extent that Plaintiffs allege an injury based on the
`
`increased risk of misappropriation of their own licensed code, that injury is sufficiently
`
`particularized. However, an increased risk of future harm alone is not sufficiently concrete to
`
`confer standing for damages. TransUnion, 141 S. Ct. at 2210-11 (finding “persuasive” argument
`
`
`7 In opposition, Plaintiffs suggest they were also injured by Defendants’ use of Plaintiffs’ licensed
`code as training data for Codex and Copilot. ECF No. 66 at 12, 21. But Plaintiffs’ complaint does
`not describe such an injury. In the complaint, Plaintiffs allege that “Defendants had access to but
`were not licensed by Plaintiffs . . . to train any . . . functional prediction engine using the Licensed
`Materials.” Compl. ¶ 149; see also id. ¶ 37 (describing one question common to the putative class
`as “[w]hether Defendants violated the Licenses governing use of the Licensed Materials by using
`them to train Copilot”). However, Plaintiffs do not actually allege that Defendants’ use of
`licensed code to train Codex and Copilot constituted a breach of the open-source licenses at issue.
`See id. ¶¶ 172-87 (alleging that Defendants breached the licenses by failing to provide attribution
`in output, failing to provide copyright notices in output, and failing to identify applicable licenses
`and the text of those licenses in output). Because Plaintiffs do not allege they were injured by
`Defendants’ use of licensed code as training data, the Court does not address whether such an
`injury is sufficient to confer standing.
`
`8 This problem is related to redressability: Plaintiffs allege no injury redressable by the monetary
`damages they seek.
`
`that, “in a suit for damages, the mere risk of future harm . . . cannot qualify as a concrete harm . . .
`
`unless the exposure to the risk of future harm itself causes a separate concrete harm”) (emphasis
`
`in original); id. at 2213 (holding that “the risk of future harm on its own does not support Article
`
`III standing for the plaintiffs’ damages claims”). Plaintiffs do not allege any additional, concrete
`
`harm associated with this increased risk of misappropriation. Thus, an increased risk of
`
`misappropriation cannot provide standing for Plaintiffs’ damages claims.
`
`Finally, Plaintiffs argue that, “[g]iven the number of times users may use Copilot, it is a
`
`virtual certainty [that] any particular plaintiff’s code will be displayed either with copyright
`
`notices removed or in violation of Plaintiffs’ open-source licenses for profit.” ECF No. 67 at 15.
`
`Though this does not support standing for retrospective damages – for which, as discussed above,
`
`Plaintiffs must allege that they have suffered a qualifying injury-in-fact that has actually occurred
`
`to them – it may support standing for injunctive relief.
`
`“[A] person exposed to a risk of future harm may pursue forward-looking, injunctive relief
`
`to prevent the harm from occurring, at least so long as the risk of harm is sufficiently imminent
`
`and substantial.” TransUnion, 141 S. Ct. at 2210. “An allegation of future injury may suffice if
`
`the threatened injury is ‘certainly impending,’ or there is a ‘substantial risk that the harm will
`
`occur.’” Susan B. Anthony List v. Driehaus, 573 U.S. 149, 158 (2014) (quoting Clapper v.
`
`Amnesty Int’l USA, 568 U.S. 398, 414 n.5 (2013)). A substantial risk means there is a “realistic danger
`
`of sustaining a direct injury.” Pennell v. City of San Jose, 485 U.S. 1, 8 (1988) (quoting Babbitt v.
`
`United Farm Workers Nat’l Union, 442 U.S. 289, 298 (1979)).
`
`Plaintiffs plausibly allege that there is at least a substantial risk that Defendants’ programs
`
`will reproduce Plaintiffs’ licensed code as output. Plaintiffs allege that both Codex and Copilot
`
`were trained on data that included all public GitHub repositories. Plaintiffs further allege that the
`
`programs reproduce well-known code in response to related prompts and offer several concrete
`
`examples of such reproduction. Finally, Plaintiffs allege that GitHub’s own internal research
`
`shows that Copilot reproduces code from training data “about 1% of the time.” Compl. ¶¶ 56, 71,
`
`74, 87-90. Plaintiffs thus plausibly allege that, absent injunctive relief, there is a realistic danger
`
`that Codex or Copilot will reproduce Plaintiffs’ licensed code as output. Plaintiffs further allege
`
`that Defendants have modified Copilot to ensure that it no longer reproduces license text,
`
`attribution, and copyright notices. Taking the facts of the complaint as true and construing all
`
`inferences in the Plaintiffs’ favor, the Court can reasonably infer that, should Plaintiffs’ code be
`
`reproduced as output, it will be reproduced in a manner that violates the open-source licenses
`
`under which Plaintiffs published their code.9
`
`While Plaintiffs have failed to establish an injury-in-fact sufficient to confer standing for
`
`their claims for damages based on injury to property rights, they have standing to pursue
`
`injunctive relief on such claims.
`
`B.
`
`Proceeding Pseudonymously
`
`Defendants argue that the complaint should be dismissed because Plaintiffs are not entitled
`
`to proceed under fictitious names.10
`
`“The normal presumption in litigation is that parties must use their real names.” Doe v.
`
`Kamehameha Schs./Bernice Pauahi Bishop Est., 596 F.3d 1036, 1042 (9th Cir. 2010). Parties
`
`may proceed pseudonymously only “in the ‘unusual case’ when nondisclosure of the party’s
`
`identity ‘is necessary . . . to protect a person from harassment, injury, ridicule or personal
`
`embarrassment.’” Does I Thru XXIII v. Advanced Textile Corp., 214 F.3d 1058, 1067-68 (9th Cir.
`
`2000) (alteration in original) (quoting United States v. Doe, 655 F.2d 920, 922 n.1 (9th Cir.
`
`1981)).
`
`Threats of physical harm “present[] the paradigmatic case for allowing anonymity.”
`
`Kamehameha, 596 F.3d at 1043. Where a party seeks to proceed pseudonymously on the basis of
`
`
`9 Defendants argue that Plaintiffs must also allege facts regarding the type of code they published,
`what problem that code solves, how frequently that code might appear on GitHub, and how likely
`it is that a user would enter a prompt that would generate a match to that code. ECF No. 50 at 16.
`Defendants suggest that, absent such facts, Plaintiffs’ alleged injury is too “conjectural or
`hypothetical” to confer standing. Id. (quoting Lujan, 504 U.S. at 560). Defendants ask too much
`of Plaintiffs at the pleading stage. Such facts might help Defendants—the only parties with
`knowledge of how Copilot and Codex were designed and operate—understand whether Plaintiffs’
`alleged injury has already occurred or how soon it is likely to occur. However, taking the facts in
`the complaint as true, and construing all inferences in Plaintiffs’ favor, their alleged future injury
`is neither conjectural nor hypothetical: their licensed code was used to train these programs and
`there is presently a realistic danger that the code will be reproduced as output.
`
`10 Because the caselaw uses “pseudonymous” and “anonymous” interchangeably, this Court does
`so as well.
`
`retaliatory harm, “a district court must balance five factors: ‘(1) the severity of the threatened
`
`harm, (2) the reasonableness of the anonymous party’s fears, . . . (3) the anonymous party’s
`
`vulnerability to such retaliation,’ (4) the prejudice to the opposing party, and (5) the public
`
`interest.” Id. at 1042 (quoting Advanced Textile, 214 F.3d at 1068).
`
`Plaintiffs seek to proceed pseudonymously because, “through their counsel, [they] have
`
`received legitimate and credible threats of physical violence.” ECF No. 67 at 17. Plaintiffs
`
`highlight three threatening emails sent to counsel prior to the filing of the operative complaint. All
`
`three emails contain veiled threats wishing death upon Plaintiffs’ counsel on the basis of his
`
`involvement in this lawsuit. ECF No. 68-1 at 2 (“[I]magine shooting against [AI] . . . . [G]o kys. .
`
`. . [K]ill urself.”); ECF No. 68-2 at 2 (“I hope you f*cking die you piece of sh*t. It’s people like
`
`you why this world sucks so f*cking bad and we can’t have nice things. I literally hope someone
`
`murder [sic] you. Go f*cking die.”); ECF No. 68-3 at 2 (“[G]o f*cking cry about [G]it[H]ub you
`
`f*cking piece of sh*t n****r, [I] hope your throat gets cut open and every single family member
`
`of you [sic] is burnt to death.”). Plaintiffs’ counsel have received “many” such messages. ECF
`
`No. 67 at 17. Plaintiffs argue they should be permitted to proceed pseudonymously because they
`
`reasonably fear they will be subject to threats of retaliation and violence if their identities are
`
`disclosed.
`
`The first two factors of the balancing test for retaliatory harm – severity of the threatened
`
`harm and the reasonableness of the fear of such harm – “are intricately related and should be
`
`addressed together.” Kamehameha, 596 F.3d at 1040. Plaintiffs need not prove the speakers
`
`“intend to carry out the threatened retaliation,” but rather “that a reasonable person would believe
`
`that the threat might actually be carried out.” Advanced Textile, 214 F.3d at 1071. “It is in the
`
`particular purview of the district court to view alleged threats in context and determine what the
`
`‘reasonable’ person in the plaintiffs’ situation would fear.” Kamehameha, 596 F.3d at 1044. “In
`
`context, a plaintiff might reasonably fear a veiled threat of violence.” Id.11
`
`The threatened harm in this case – death – is plainly severe. Id. at 1043 (describing threats
`
`of physical retaliation, including death, as “undoubtedly severe”); Doe v. Steagall, 653 F.2d 180,
`
`186 (5th Cir. 1981) (permitting plaintiffs to proceed pseudonymously where they faced “threats of
`
`violence,” including veiled death threats, “generated by this case”); Doe v. Univ. Acct. Serv., LLC,
`
`No. 09-CV-01563-BAS-JLB, 2022 WL 623913, at *4 (S.D. Cal. Mar. 3, 2022) (permitting
`
`plaintiff to proceed pseudonymously where he received death threats). While the threatening
`
`emails were not sent to Plaintiffs directly, the emails wish death upon Plaintiffs’ counsel on the
`
`basis of their involvement in this lawsuit. It is reasonable for Plaintiffs to fear that such threats
`
`might be carried out against them if their identities were to become public.
`
`GitHub and Microsoft suggest that Plaintiffs’ fear is unreasonable because “the types of
`
`nasty messages at issue here are a fact of modern life in the era of internet ‘trolls.’” ECF No. 72 at
`
`19. GitHub and Microsoft do not explain why the rise of internet trolls renders Plaintiffs’ fears of
`
`harm unreasonable. Cf. United States v. Bagdasarian, 652 F.3d 1113, 1126-27 (9th Cir. 2011)
`
`(Wardlaw, J., concurring in part) (describing the connection between anonymous internet posts
`
`and subsequent real-world violence). Sending direct messages containing veiled death threats
`
`would seem to constitute behavior beyond trolling.
