`UNITED STATES DISTRICT COURT
`
`NORTHERN DISTRICT OF CALIFORNIA
`
`J. DOE 1, et al.,
`
`Plaintiffs,
`
`v.
`
`GITHUB, INC., et al.,
`
`Defendants.
`
`
`
`
`
`Case No. 22-cv-06823-JST
`
`
`ORDER GRANTING IN PART AND
`DENYING IN PART MOTIONS TO
`DISMISS
`
`Re: ECF Nos. 50, 53
`
`
`
`Before the Court are motions to dismiss filed by Defendants GitHub, Inc. and Microsoft
`
`Corporation, ECF No. 50; and Defendants OpenAI, Inc., OpenAI, L.P., OpenAI GP, L.L.C.,
`
`OpenAI Startup Fund GP I, L.L.C., OpenAI Startup Fund I, L.P., and OpenAI Startup Fund
`
`Management, LLC (collectively “OpenAI Defendants”), ECF No. 53. The Court will grant the
`
`motions in part and deny them in part.
`
`I. BACKGROUND
`
`Plaintiffs are software developers who challenge Defendants’ development and operation
`
`of Copilot and Codex, two artificial intelligence-based coding tools.1 For the purposes of the
`
`present motions, the Court accepts as true the following facts in the operative complaint.2
`
`GitHub, which was acquired by Microsoft in 2018, is the largest internet hosting service
`
`for software projects stored in Git, a widely used open-source version control system for managing
`
`
`1 Plaintiffs define “artificial intelligence” (“AI”) as “a computer program that algorithmically
`simulates human reasoning or inference, often using statistical methods.” Compl. ¶ 2.
`
`2 By stipulation and order, ECF No. 47, the operative complaint in the consolidated case is that
`filed in Doe 3 et al. v. GitHub, Inc. et al., No. 22-cv-7074-JST, ECF No. 1. All references to the
`complaint in this order refer to the operative complaint in the consolidated case.
`
`
`
`Case 4:22-cv-06823-JST Document 95 Filed 05/11/23 Page 2 of 25
`
`
`
`
`software source code. Using GitHub permits software developers or programmers to collaborate
`
`on projects stored in repositories. Repositories may be private or public; anyone can view and
`
`access code stored in public repositories.
`
`All code uploaded to GitHub is subject to the GitHub Terms of Service, which provide that
`
`users retain ownership of any content they upload to GitHub, but grant GitHub the “right to store,
`
`archive, parse, and display [the content], and make incidental copies, as necessary to provide the
`
`Service, including improving the Service over time.” No. 22-cv-7074-JST, ECF No. 1-2 at 27.
`
`This “includes the right to do things like copy [the code] to our database and make backups; show
`
`it to you and other users; parse it into a search index or otherwise analyze it on our servers; [and]
`
`share it with other users.” Id. at 27-28. Further, the Terms of Service provide that users who set
`
`their repositories to be viewed publicly “grant each User of GitHub a nonexclusive, worldwide
`
`license to use, display, and perform [the content] through the GitHub Service and to reproduce
`
`[the content] solely on GitHub as permitted through GitHub’s functionality.” Id. at 28.
`
`Without AI-based assistance, programmers generally write code “both by originating code
`
`from the writer’s own knowledge of how to write code as well as by finding pre-written portions
`
`of code that—under the terms of the applicable license—may be incorporated into the coding
`
`project.” Compl. ¶ 78. Plaintiffs have each published licensed materials in which they hold a
`
`copyright interest to public repositories on GitHub. When creating a new repository, a GitHub
`
`user may “select[] one of thirteen licenses from a dropdown menu to apply to the contents of that
`
`repository.” Id. ¶ 34 n.4. Two of the suggested licenses waive copyrights and related rights. The
`
`remaining eleven suggested licenses3 require that any derivative work or copy of the licensed work
`
`include attribution to the owner, inclusion of a copyright notice, and inclusion of the license terms.
`
`Each Plaintiff published code to a public repository on GitHub under one of the eleven suggested
`
`
`3 These eleven licenses are (1) Apache License 2.0; (2) GNU General Public License version 3
`(“GPL-3.0”); (3) MIT License; (4) The 2-Clause BSD License (“BSD 2”); (5) The 3-Clause BSD
`License (“BSD 3”); (6) Boost Software License; (7) Eclipse Public License 2.0; (8) GNU Affero
`General Public License version 3 (“AGPL-3.0”); (9) GNU General Public License version 2 (“GPL
`2”); (10) GNU Lesser General Public License version 2.1 (“LGPL-2.1”); and (11) Mozilla Public
`License 2.0. Compl. ¶ 34 n.4.
`
`
`
`licenses that include these three requirements.
`
`In June 2021, GitHub and OpenAI released Copilot, an AI-based program that can “assist
`
`software coders by providing or filling in blocks of code using AI.” Id. ¶ 8. In August 2021,
`
`OpenAI released Codex, an AI-based program “which converts natural language into code and is
`
`integrated into Copilot.” Id. ¶ 9. Codex is integrated into Copilot: “GitHub Copilot uses the
`
`OpenAI Codex to suggest code and entire functions in real-time, right from your editor.” Id. ¶ 47
`
`(quoting GitHub website). GitHub users pay $10 per month or $100 per year for access to
`
`Copilot. Id. ¶ 8.
`
`Codex and Copilot employ machine learning, “a subset of AI in which the behavior of the
`
`program is derived from studying a corpus of material called training data.” Id. ¶ 2. Using this
`
`data, “through a complex probabilistic process, [these programs] predict what the most likely
`
`solution to a given prompt a user would input is.” Id. ¶ 79. Codex and Copilot were trained on
`
`“billions of lines” of publicly available code, including code from public GitHub repositories. Id.
`
`¶¶ 82-83.
`
`Despite the fact that much of the code in public GitHub repositories is subject to open-
`
`source licenses which restrict its use, id. ¶ 20, Codex and Copilot “were not programmed to treat
`
`attribution, copyright notices, and license terms as legally essential,” id. ¶ 80. Copilot reproduces
`
`licensed code used in training data as output with missing or incorrect attribution, copyright
`
`notices, and license terms. Id. ¶¶ 56, 71, 74, 87-89. This violates the open-source licenses of
`
`“tens of thousands—possibly millions—of software developers.” Id. ¶ 140. Plaintiffs additionally
`
`allege that Defendants improperly used Plaintiffs’ “sensitive personal data” by incorporating the
`
`data into Copilot and therefore selling and exposing it to third parties. Id. ¶¶ 225-39.
`
`Plaintiffs filed multiple cases against Defendants, which were subsequently consolidated.
`
`ECF No. 47. Plaintiffs, on behalf of themselves and two putative classes,4 plead twelve counts
`
`
`4 The “Injunctive Relief Class” and “Damages Class” are each defined as: “All persons or entities
`domiciled in the United States that[] (1) owned an interest in at least one [U.S.] copyright in any
`work; (2) offered that work under one of GitHub’s Suggested Licenses; and (3) stored Licensed
`Materials in any public GitHub repositories at any time between January 1, 2015 and the present
`(the ‘Class Period’).” Compl. ¶ 34 (footnote omitted).
`
`
`
`
`against Defendants: (1) violation of the Digital Millennium Copyright Act (“DMCA”), 17 U.S.C.
`
`§§ 1201-05; (2) common law breach of open-source licenses; (3) common law tortious
`
`interference in a contractual relationship; (4) common law fraud; (5) false designation of origin in
`
`violation of the Lanham Act, 15 U.S.C. § 1125; (6) unjust enrichment in violation of Cal. Bus. &
`
`Prof. Code §§ 17200, et seq., and the common law; (7) unfair competition in violation of the
`
`Lanham Act, 15 U.S.C. § 1125; Cal. Bus. & Prof. Code §§ 17200, et seq., and the common law;
`
`(8) breach of contract for violation of the GitHub Privacy Policy and Terms of Service; (9)
`
`violation of the California Consumer Privacy Act (“CCPA”); (10) common law negligence; (11)
`
`common law civil conspiracy; and (12) declaratory relief under 28 U.S.C. § 2201(a) and Cal. Code
`
`Civ. Proc. § 1060.5 Defendants now move to dismiss the complaint. ECF Nos. 50, 53.
`
`II. JURISDICTION
`
`The Court has jurisdiction over Plaintiffs’ federal claims under 28 U.S.C. § 1331 and
`
`supplemental jurisdiction over Plaintiffs’ state law claims under 28 U.S.C. § 1367.
`
`III. LEGAL STANDARD
`
`A. Rule 12(b)(1)
`
`“Article III of the Constitution confines the federal judicial power to the resolution of
`
`‘Cases’ and ‘Controversies.’” TransUnion LLC v. Ramirez, 141 S. Ct. 2190, 2203 (2021). “For
`
`there to be a case or controversy under Article III, the plaintiff must have a ‘personal stake’ in the
`
`case—in other words, standing.” Id. (quoting Raines v. Byrd, 521 U.S. 811, 819 (1997)). A
`
`defendant may attack a plaintiff’s assertion of jurisdiction by moving to dismiss under Rule
`
`12(b)(1) of the Federal Rules of Civil Procedure. Cetacean Cmty. v. Bush, 386 F.3d 1169, 1174
`
`
`5 While Plaintiffs plead several common law tort claims, they do not identify the state law which
`applies to each claim. “[D]ue to variances among state laws, failure to allege which state law
`governs a common law claim is grounds for dismissal.” In re Nexus 6P Prods. Liab. Litig., 293 F.
`Supp. 3d 888, 933 (N.D. Cal. 2018) (quoting Romero v. Flowers Bakeries, LLC, No. 14-cv-05189-
`BLF, 2016 WL 469370, at *12 (N.D. Cal. Feb. 8, 2016)). With respect to the state law claims in
`any future amended complaint, Plaintiffs shall identify the state under whose law the claim is
`brought. When claims which share a legal theory are brought under multiple laws (for example, a
`federal statute, a state statute, and common law, such as Plaintiffs’ claim for unfair competition),
`Plaintiffs shall state each claim as a separate count. For the purposes of deciding the present
`motion to dismiss, the Court interprets the complaint as asserting the common law claims of
`Counts 2, 3, 4, 6, 7, 10, and 11 under California law.
`
`
`
`
`(9th Cir. 2004); see also Maya v. Centex Corp., 658 F.3d 1060, 1067 (9th Cir. 2011) (“[L]ack of
`
`Article III standing requires dismissal for lack of subject matter jurisdiction under Federal Rule of
`
`Civil Procedure 12(b)(1).”).
`
` “A Rule 12(b)(1) jurisdictional attack may be facial or factual.” Safe Air for Everyone v.
`
`Meyer, 373 F.3d 1035, 1039 (9th Cir. 2004). “In a facial attack, the challenger asserts that the
`
`allegations contained in a complaint are insufficient on their face to invoke federal jurisdiction.”
`
`Id. Where, as here, a defendant makes a facial attack, the court assumes that the complaint’s
`
`allegations are true and draws all reasonable inferences in the plaintiff’s favor. Wolfe v.
`
`Strankman, 392 F.3d 358, 362 (9th Cir. 2004).
`
`B. Rule 12(b)(6)
`
`“Dismissal under [Federal Rule of Civil Procedure] 12(b)(6) is appropriate only where the
`
`complaint lacks a cognizable legal theory or sufficient facts to support a cognizable legal theory.”
`
`Mendiondo v. Centinela Hosp. Med. Ctr., 521 F.3d 1097, 1104 (9th Cir. 2008). A complaint must
`
`contain “a short and plain statement of the claim showing that the pleader is entitled to relief.”
`
`Fed. R. Civ. P. 8(a)(2). Facts pleaded by a plaintiff “must be enough to raise a right to relief
`
`above the speculative level.” Bell Atl. Corp. v. Twombly, 550 U.S. 544, 555 (2007).
`
`“To survive a motion to dismiss, a complaint must contain sufficient factual matter,
`
`accepted as true, to ‘state a claim to relief that is plausible on its face.’” Ashcroft v. Iqbal, 556 U.S.
`
`662, 678 (2009) (quoting Twombly, 550 U.S. at 570). “A claim has facial plausibility when the
`
`plaintiff pleads factual content that allows the court to draw the reasonable inference that the
`
`defendant is liable for the misconduct alleged.” Id. In determining whether a plaintiff has met this
`
`plausibility standard, the Court must “accept all factual allegations in the complaint as true and
`
`construe the pleadings in the light most favorable to the nonmoving party.” Knievel v. ESPN, 393
`
`F.3d 1068, 1072 (9th Cir. 2005).
`
`C. Leave to Amend
`
`Leave to amend a complaint “shall be freely given when justice so requires.” Fed. R. Civ.
`
`P. 15(a)(2). The decision of whether to grant leave to amend is “within the discretion of the
`
`district court, which may deny leave due to ‘undue delay, bad faith or dilatory motive on the part
`
`
`
`of the movant, repeated failure to cure deficiencies by amendments previously allowed, undue
`
`prejudice to the opposing party by virtue of allowance of the amendment, and futility of
`
`amendment.’” Leadsinger, Inc. v. BMG Music Pub., 512 F.3d 522, 532 (9th Cir. 2008) (quoting
`
`Foman v. Davis, 371 U.S. 178, 182 (1962)).
`
`IV. DISCUSSION
`
`A. Article III Standing
`
`Defendants argue that Plaintiffs lack Article III standing to assert their claims and that the
`
`Court therefore lacks subject-matter jurisdiction over this action.6
`
`“[T]o establish standing, a plaintiff must show (i) that he suffered an injury in fact that is
`
`concrete, particularized, and actual or imminent; (ii) that the injury was likely caused by the
`
`defendant; and (iii) that the injury would likely be redressed by judicial relief.” TransUnion, 141
`
`S. Ct. at 2203. “The party invoking federal jurisdiction bears the burden of establishing these
`
`elements.” Lujan v. Defenders of Wildlife, 504 U.S. 555, 561 (1992). “Where, as here, a case is at
`
`the pleading stage, the plaintiff must ‘clearly . . . allege facts demonstrating’ each element.”
`
`Spokeo, Inc. v. Robins, 578 U.S. 330, 338 (2016) (quoting Warth v. Seldin, 422 U.S. 490, 518
`
`(1975)).
`
`“A plaintiff must demonstrate constitutional standing separately for each form of relief
`
`requested.” Davidson v. Kimberly-Clark, 889 F.3d 956, 969 (9th Cir. 2018). In a putative class
`
`action, named plaintiffs “must allege and show that they personally have been injured, not that
`
`injury has been suffered by other, unidentified members of the class to which they belong and
`
`which they purport to represent.” Warth, 422 U.S. at 502.
`
`Plaintiffs advance two main theories of harm. One theory is that Defendants have shared,
`
`
`6 Both parties impermissibly incorporate substantive material by reference across their briefs.
`“The incorporation of substantive material by reference is not sanctioned by the federal rules.”
`Swanson v. U.S. Forest Serv., 87 F.3d 339, 345 (9th Cir. 1996). In the future, all arguments the
`parties seek to make must be raised in the briefing on a particular motion. See Woolfson v. Conn
`Appliances, Inc., No. 21-cv-07833-MMC, 2022 WL 3139522, at *6 (N.D. Cal. Aug. 5, 2022)
`(declining to consider substantive arguments purportedly incorporated by reference); Williams v.
`Cnty. of Alameda, 26 F. Supp. 3d 925, 947 (N.D. Cal. 2014) (“[T]he Court will not consider the
`arguments that Plaintiff improperly seeks to incorporate by reference. This Court only considers
`arguments that are specifically and distinctively raised by the parties in their briefs.”).
`
`
`
`sold, and exposed and will continue to share, sell, and expose Plaintiffs’ personal information,
`
`harming Plaintiffs’ rights to privacy and the protection of their personal data. The second is that
`
`Defendants’ use of licensed code as training data for Codex and Copilot harms Plaintiffs’ property
`
`interests in the licensed code because Copilot already has or likely will reproduce Plaintiffs’ code
`
`in violation of those licenses.
`
`1. Injury to Privacy Rights
`
`Defendants argue that Plaintiffs fail to plausibly allege any disclosure of personal
`
`information, and therefore fail to allege an actual or imminent injury sufficient to confer standing.
`
`In the complaint, Plaintiffs do not identify the specific sensitive or private information at issue.
`
`Plaintiffs therefore do not allege facts sufficient for the Court to infer that Plaintiffs’ privacy
`
`interests are implicated by the alleged misuse of such information. See Ji v. Naver Corp., No. 21-
`
`cv-05143-HSG, 2022 WL 4624898, at *7 (N.D. Cal. Sept. 30, 2022) (dismissing privacy-based
`
`claim for lack of standing where plaintiffs “have not alleged enough facts to show that the
`
`[personal data at issue] are the type of information that could give rise to a privacy injury”); I.C. v.
`
`Zynga, Inc., 600 F. Supp. 3d 1034, 1049 (N.D. Cal. Apr. 29, 2022) (evaluating whether privacy
`
`interests were implicated by allegedly disclosed personal information to determine whether
`
`plaintiffs had standing).
`
`Plaintiffs have not met their burden to allege facts demonstrating an injury-in-fact
`
`sufficient to confer standing for their privacy-based claims. Plaintiffs’ claims for breach of the
`
`GitHub Privacy Policy and Terms of Service, violation of the CCPA, and negligence are
`
`dismissed with leave to amend.
`
`2. Injury to Property Rights
`
`Defendants argue that Plaintiffs do not identify any instance of Copilot reproducing
`
`Plaintiffs’ licensed code and therefore fail to plead a particularized injury sufficient to confer
`
`standing.
`
`“For an injury to be ‘particularized,’ it ‘must affect the plaintiff in a personal and
`
`individual way.’” Spokeo, 578 U.S. at 339 (quoting Lujan, 504 U.S. at 560 n.1). “[A]t an
`
`irreducible minimum, Art. III requires the party who invokes the court’s authority to ‘show that he
`
`
`
`personally has suffered some actual or threatened injury.’” Valley Forge Christian Coll. v. Ams.
`
`United for Separation of Church and State, Inc., 454 U.S. 464, 472 (1982) (quoting Gladstone
`
`Realtors v. Village of Bellwood, 441 U.S. 91, 99 (1979)).
`
`Plaintiffs argue that they “have alleged the requisite particularized injury” because they
`
`“allege that Defendants have violated provisions of [the] open-source licenses” under which
`
`Plaintiffs published their code. ECF No. 67 at 14. In the complaint, Plaintiffs allege that
`
`Defendants violate those licenses because Copilot reproduces Plaintiffs’ code as output with
`
`missing or incorrect attribution, copyright notices, and license terms.7 However, while Plaintiffs
`
`identify several instances in which Copilot’s output matched licensed code written by a GitHub
`
`user, Compl. ¶¶ 56, 71, 74, 87-89, none of these instances involve licensed code published to
`
`GitHub by Plaintiffs. Because Plaintiffs do not allege that they themselves have suffered the
`
`injury they describe, they do not have standing to seek retrospective relief for that injury.8
`
`Plaintiffs further argue that they “also allege an imminent and significant harm—now that
`
`their software code is part of the training data . . . , [it] can be output at any time without the
`
`legally necessary notices and attribution, leading to increased and imminent risk of
`
`misappropriation.” ECF No. 67 at 15. To the extent that Plaintiffs allege an injury based on the
`
`increased risk of misappropriation of their own licensed code, that injury is sufficiently
`
`particularized. However, an increased risk of future harm alone is not sufficiently concrete to
`
`confer standing for damages. TransUnion, 141 S. Ct. at 2210-11 (finding “persuasive” argument
`
`
`7 In opposition, Plaintiffs suggest they were also injured by Defendants’ use of Plaintiffs’ licensed
`code as training data for Codex and Copilot. ECF No. 66 at 12, 21. But Plaintiffs’ complaint does
`not describe such an injury. In the complaint, Plaintiffs allege that “Defendants had access to but
`were not licensed by Plaintiffs . . . to train any . . . functional prediction engine using the Licensed
`Materials.” Compl. ¶ 149; see also id. ¶ 37 (describing one question common to the putative class
`as “[w]hether Defendants violated the Licenses governing use of the Licensed Materials by using
`them to train Copilot”). However, Plaintiffs do not actually allege that Defendants’ use of
`licensed code to train Codex and Copilot constituted a breach of the open-source licenses at issue.
`See id. ¶¶ 172-87 (alleging that Defendants breached the licenses by failing to provide attribution
`in output, failing to provide copyright notices in output, and failing to identify applicable licenses
`and the text of those licenses in output). Because Plaintiffs do not allege they were injured by
`Defendants’ use of licensed code as training data, the Court does not address whether such an
`injury is sufficient to confer standing.
`
`8 This problem is related to redressability: Plaintiffs allege no injury redressable by the monetary
`damages they seek.
`
`
`that, “in a suit for damages, the mere risk of future harm . . . cannot qualify as a concrete harm . . .
`
`unless the exposure to the risk of future harm itself causes a separate concrete harm”) (emphasis
`
`in original); id. at 2213 (holding that “the risk of future harm on its own does not support Article
`
`III standing for the plaintiffs’ damages claims”). Plaintiffs do not allege any additional, concrete
`
`harm associated with this increased risk of misappropriation. Thus, an increased risk of
`
`misappropriation cannot provide standing for Plaintiffs’ damages claims.
`
`Finally, Plaintiffs argue that, “[g]iven the number of times users may use Copilot, it is a
`
`virtual certainty [that] any particular plaintiff’s code will be displayed either with copyright
`
`notices removed or in violation of Plaintiffs’ open-source licenses for profit.” ECF No. 67 at 15.
`
`Though this does not support standing for retrospective damages – for which, as discussed above,
`
`Plaintiffs must allege that they have suffered a qualifying injury-in-fact that has actually occurred
`
`to them – it may support standing for injunctive relief.
`
`“[A] person exposed to a risk of future harm may pursue forward-looking, injunctive relief
`
`to prevent the harm from occurring, at least so long as the risk of harm is sufficiently imminent
`
`and substantial.” TransUnion, 141 S. Ct. at 2210. “An allegation of future injury may suffice if
`
`the threatened injury is ‘certainly impending,’ or there is a ‘substantial risk that the harm will
`
`occur.’” Susan B. Anthony List v. Driehaus, 573 U.S. 149, 158 (2014) (quoting Clapper v.
`
`Amnesty Int’l USA, 568 U.S. 398, 414 n.5 (2013)). A substantial risk means there is a “realistic danger
`
`of sustaining a direct injury.” Pennell v. City of San Jose, 485 U.S. 1, 8 (1988) (quoting Babbitt v.
`
`United Farm Workers Nat’l Union, 442 U.S. 289, 298 (1979)).
`
`Plaintiffs plausibly allege that there is at least a substantial risk that Defendants’ programs
`
`will reproduce Plaintiffs’ licensed code as output. Plaintiffs allege that both Codex and Copilot
`
`were trained on data that included all public GitHub repositories. Plaintiffs further allege that the
`
`programs reproduce well-known code in response to related prompts and offer several concrete
`
`examples of such reproduction. Finally, Plaintiffs allege that GitHub’s own internal research
`
`shows that Copilot reproduces code from training data “about 1% of the time.” Compl. ¶¶ 56, 71,
`
`74, 87-90. Plaintiffs thus plausibly allege that, absent injunctive relief, there is a realistic danger
`
`that Codex or Copilot will reproduce Plaintiffs’ licensed code as output. Plaintiffs further allege
`
`
`that Defendants have modified Copilot to ensure that it no longer reproduces license text,
`
`attribution, and copyright notices. Taking the facts of the complaint as true and construing all
`
`inferences in the Plaintiffs’ favor, the Court can reasonably infer that, should Plaintiffs’ code be
`
`reproduced as output, it will be reproduced in a manner that violates the open-source licenses
`
`under which Plaintiffs published their code.9
`
`While Plaintiffs have failed to establish an injury-in-fact sufficient to confer standing for
`
`their claims for damages based on injury to property rights, they have standing to pursue
`
`injunctive relief on such claims.
`
`B. Proceeding Pseudonymously
`
`Defendants argue that the complaint should be dismissed because Plaintiffs are not entitled
`
`to proceed under fictitious names.10
`
`“The normal presumption in litigation is that parties must use their real names.” Doe v.
`
`Kamehameha Schs./Bernice Pauahi Bishop Est., 596 F.3d 1036, 1042 (9th Cir. 2010). Parties
`
`may proceed pseudonymously only “in the ‘unusual case’ when nondisclosure of the party’s
`
`identity ‘is necessary . . . to protect a person from harassment, injury, ridicule or personal
`
`embarrassment.’” Does I Thru XXIII v. Advanced Textile Corp., 214 F.3d 1058, 1067-68 (9th Cir.
`
`2000) (alteration in original) (quoting United States v. Doe, 655 F.2d 920, 922 n.1 (9th Cir.
`
`1981)).
`
`Threats of physical harm “present[] the paradigmatic case for allowing anonymity.”
`
`Kamehameha, 596 F.3d at 1043. Where a party seeks to proceed pseudonymously on the basis of
`
`
`9 Defendants argue that Plaintiffs must also allege facts regarding the type of code they published,
`what problem that code solves, how frequently that code might appear on GitHub, and how likely
`it is that a user would enter a prompt that would generate a match to that code. ECF No. 50 at 16.
`Defendants suggest that, absent such facts, Plaintiffs’ alleged injury is too “conjectural or
`hypothetical” to confer standing. Id. (quoting Lujan, 504 U.S. at 560). Defendants ask too much
`of Plaintiffs at the pleading stage. Such facts might help Defendants—the only parties with
`knowledge of how Copilot and Codex were designed and operate—understand whether Plaintiffs’
`alleged injury has already occurred or how soon it is likely to occur. However, taking the facts in
`the complaint as true, and construing all inferences in Plaintiffs’ favor, their alleged future injury
`is neither conjectural nor hypothetical: their licensed code was used to train these programs and
`there is presently a realistic danger that the code will be reproduced as output.
`
`10 Because the caselaw uses “pseudonymous” and “anonymous” interchangeably, this Court does
`so as well.
`
`
`
`retaliatory harm, “a district court must balance five factors: ‘(1) the severity of the threatened
`
`harm, (2) the reasonableness of the anonymous party’s fears, . . . (3) the anonymous party’s
`
`vulnerability to such retaliation,’ (4) the prejudice to the opposing party, and (5) the public
`
`interest.” Id. at 1042 (quoting Advanced Textile, 214 F.3d at 1068).
`
`Plaintiffs seek to proceed pseudonymously because, “through their counsel, [they] have
`
`received legitimate and credible threats of physical violence.” ECF No. 67 at 17. Plaintiffs
`
`highlight three threatening emails sent to counsel prior to the filing of the operative complaint. All
`
`three emails contain veiled threats wishing death upon Plaintiffs’ counsel on the basis of his
`
`involvement in this lawsuit. ECF No. 68-1 at 2 (“[I]magine shooting against [AI] . . . . [G]o kys. .
`
`. . [K]ill urself.”); ECF No. 68-2 at 2 (“I hope you f*cking die you piece of sh*t. It’s people like
`
`you why this world sucks so f*cking bad and we can’t have nice things. I literally hope someone
`
`murder [sic] you. Go f*cking die.”); ECF No. 68-3 at 2 (“[G]o f*cking cry about [G]it[H]ub you
`
`f*cking piece of sh*t n****r, [I] hope your throat gets cut open and every single family member
`
`of you [sic] is burnt to death.”). Plaintiffs’ counsel have received “many” such messages. ECF
`
`No. 67 at 17. Plaintiffs argue they should be permitted to proceed pseudonymously because they
`
`reasonably fear they will be subject to threats of retaliation and violence if their identities are
`
`disclosed.
`
`The first two factors of the balancing test for retaliatory harm – severity of the threatened
`
`harm and the reasonableness of the fear of such harm – “are intricately related and should be
`
`addressed together.” Kamehameha, 596 F.3d at 1040. Plaintiffs need not prove the speakers
`
`“intend to carry out the threatened retaliation,” but rather “that a reasonable person would believe
`
`that the threat might actually be carried out.” Advanced Textile, 214 F.3d at 1071. “It is in the
`
`particular purview of the district court to view alleged threats in context and determine what the
`
`‘reasonable’ person in the plaintiffs’ situation would fear.” Kamehameha, 596 F.3d at 1044. “In
`
`
`
`context, a plaintiff might reasonably fear a veiled threat of violence.” Id.11
`
`The threatened harm in this case – death – is plainly severe. Id. at 1043 (describing threats
`
`of physical retaliation, including death, as “undoubtedly severe”); Doe v. Steagall, 653 F.2d 180,
`
`186 (5th Cir. 1981) (permitting plaintiffs to proceed pseudonymously where they faced “threats of
`
`violence,” including veiled death threats, “generated by this case”); Doe v. Univ. Acct. Serv., LLC,
`
`No. 09-CV-01563-BAS-JLB, 2022 WL 623913, at *4 (S.D. Cal. Mar. 3, 2022) (permitting
`
`plaintiff to proceed pseudonymously where he received death threats). While the threatening
`
`emails were not sent to Plaintiffs directly, the emails wish death upon Plaintiffs’ counsel on the
`
`basis of their involvement in this lawsuit. It is reasonable for Plaintiffs to fear that such threats
`
`might be carried out against them if their identities were to become public.
`
`GitHub and Microsoft suggest that Plaintiffs’ fear is unreasonable because “the types of
`
`nasty messages at issue here are a fact of modern life in the era of internet ‘trolls.’” ECF No. 72 at
`
`19. GitHub and Microsoft do not explain why the rise of internet trolls renders Plaintiffs’ fears of
`
`harm unreasonable. Cf. United States v. Bagdasarian, 652 F.3d 1113, 1126-27 (9th Cir. 2011)
`
`(Wardlaw, J., concurring in part) (describing the connection between anonymous internet posts
`
`and subsequent real-world violence). Sending direct messages containing veiled death threats
`
`would seem to constitute behavior beyond trolling.