The Network Law Review is pleased to present a symposium entitled “Dynamics of Generative AI,” where lawyers, economists, computer scientists, and social scientists gather their knowledge around a central question: what will define the future of AI ecosystems? To bring all this expertise together, a conference co-hosted by the Weizenbaum Institute and the Amsterdam Law & Technology Institute will be held on March 22, 2024. Be sure to register in order to receive the recording.
This contribution is signed by Christopher S. Yoo, John H. Chestnut Professor of Law, Communication, and Computer & Information Science; Founding Director, Center for Technology, Innovation & Competition. The entire symposium is edited by Thibault Schrepel (Vrije Universiteit Amsterdam) and Volker Stocker (Weizenbaum Institute).
Generative artificial intelligence (GenAI) has the potential to become one of the most transformative developments in the history of technology. In particular, the recent release of OpenAI’s GPT-4 has catalyzed myriad discussions about its potential impact on a wide range of productive and scholarly pursuits.
In addition, the advent of GenAI could disrupt the basic structure of the tech industry. For example, as media reports have recognized, “machine learning and AI have remained at the heart” of the U.S. antitrust case against Google’s search practices, raising the question whether OpenAI’s innovations now offer a new dimension along which search providers can compete.1Kendra Barnett, What we’ve learned so far in Google’s landmark antitrust trial, Drum (Oct. 3, 2023), … Regardless of which side of the argument ultimately prevails when the trial court issues its decision, most likely in the summer of 2024, this dispute provides an apt illustration of the impact that GenAI can have on the ways that tech industry companies compete with one another.
This article briefly examines the arguments raised in the Google case to explore how GenAI could reconfigure the fundamental nature of competition among technology firms. It then discusses three areas of law (copyright, privacy, and security) that could place limits on the ability of GenAI to perform this transformative role.
2. The Changing Role of Historical Search Data
One of the leading theories of competitive harm in the U.S. antitrust case against Google’s search practices asserts that search markets are characterized by significant economies of scale.2Amended Complaint at 14 ¶¶ 35-36, 31 ¶ 95, United States v. Google LLC, Case No. 1:20-cv-03010-APM (D.D.C. Jan. 15, 2021), available at https://www.justice.gov/media/1163846/dl. On this theory, the possession of large amounts of search data allows a search engine to generate better results.3Id. Moreover, to the extent that analyzing consumers’ responses to search queries allows search engines to improve their results even more, scale can give rise to a feedback effect that can reinforce the advantages enjoyed by search engines with the strongest market positions.4Id. at 31 ¶ 95; FTC Bur. of Competition Staff, Memorandum to the Comm’n on Google Inc. 14 (Aug. 8, 2012), available at http://graphics.wsj.com/google-ftc-report; Maurice Stucke & Allen P. …
As an initial matter, the relative importance of scale economies in generating high-quality search results remains a matter of some dispute. Prior to the beginning of the Google litigation, Google Chief Economist Hal Varian reported that the company typically conducts its analyses on random samples of only 0.1% of the data available to it.5Hal R. Varian, Big Data: New Tricks for Econometrics, J. Econ. Persp., Spr. 2014, at 3, 4, available at https://pubs.aeaweb.org/doi/pdfplus/10.1257/jep.28.2.3. The significance of scale economies in data has represented one of the central themes of the Google search trial.6Chris May, US DOJ attacks Google expert’s data scale experiment in search monopolization case, MLex (Oct. 31, 2023, 19:52), … For example, Google presented an expert witness who conducted an experiment that found significant diminishing returns of user data on search result quality, attributing only 3% of the difference in search quality between Google and Bing to Google’s scale,7Tad Dickens, Modern Google antitrust case relies on century-old law, Virginia Tech professor says, Cardinal News (Sept. 25, 2023), … findings that the government contested on cross examination.8Koenig, supra note 7; May, supra note 6. Sridhar Ramaswamy, a former Google employee and co-founder of Google rival Neeva, conceded on cross examination that Neeva could profitably deliver high-quality search results with only a 2.5% market share in search.9Jan Wolfe, Miles Kruppa & Erin Mulvaney, U.S. Wraps Up Its Google Antitrust Case, Wall St. J., Oct. 18, 2023, at 14, available at …
Regardless of how the court ultimately resolves that issue, GenAI has the potential to fundamentally alter the role that proprietary, historical search data plays in the nature of rivalry in the search industry. One prominent development is Microsoft’s announcement that it was reconfiguring Bing to run on OpenAI’s GPT-4.10Yusuf Mehdi, Reinventing search with a new AI-powered Microsoft Bing and Edge, your copilot for the web, Off. Microsoft Blog (Feb. 7, 2023), … The most significant implication of this shift is the extent to which GPT is trained primarily on data available in principle to anyone.11The first three generations of GPT were trained on BookCorpus (GPT-1), Reddit and WebText (GPT-2), and on the prior datasets as well as Common Crawl and Wikipedia (GPT-3), most of which are widely …
Whether GenAI could displace proprietary data as a way to improve search quality became a matter of dispute during the Google search trial. On the one hand, Google attorneys relied on Microsoft’s announcements lauding GenAI’s ability to improve search quality without relying on user data to downplay the competitive significance of Google’s current market position.12Paul Wiseman, Apple leverages idea of switching to Bing to pry more money out of Google, Microsoft exec says, AP (Sept. 27, 2023, 6:07 PM EST), … On the other hand, Microsoft’s witnesses countered that GenAI technology had not yet developed enough to provide the sole basis for a search engine and that access to user data continued to play an important role.13Id.
Such arguments require both companies to walk a tightrope. On the one hand, Microsoft’s statements to Wall Street about GenAI’s ability to make it a more effective rival in search rest in uneasy tension with its testimony during the Google search trial claiming that GenAI alone cannot yet deliver the search quality needed to compete.14Davey Alba & Leah Nylen, Google Walks a Tightrope on AI in Search Antitrust Trial, Yahoo! Fin. (Nov. 2, 2023), https://finance.yahoo.com/news/google-walks-tightrope-ai-search-100000986.html. For its part, Google faces claims by the U.S. Department of Justice that it deliberately chose not to release a GenAI-driven version of its search engine in order to protect its existing monopoly.15Id. Google counters that GenAI-based search still posed dangers,16Id. a concern corroborated by OpenAI’s reported attempt to urge Microsoft to move more slowly in integrating GPT-4 into Bing.17Tom Warren, OpenAI reportedly warned Microsoft about Bing’s bizarre AI responses, Verge (June 13, 2023, 11:42 EDT), … Google has now launched its own GenAI platform called Gemini that it eventually plans to integrate into its search engine.18David Pierce, Google launches Gemini, the AI model it hopes will take down GPT-4, Verge (Dec. 6, 2023, 10:00 AM EST), https://www.theverge.com/2023/12/6/23990466/google-gemini-llm-ai-model. Some reports speculate that the ability to combine publicly trained GenAI models with proprietary data could continue to give Google a competitive edge.19Proprietary Data gives Google’s Gemini an edge over ChatGPT, Praxis Tech Sch., https://praxistech.school/proprietary-data-gives-googles-gemini-an-edge-over-chatgpt/ (last visited Dec. 25, 2023); …
Whether GenAI has changed the competitive landscape for search remains unclear. Many industry experts predicted that the incorporation of GenAI would enable Bing to compete more effectively with Google,20See Davey Alba, ChatGPT Reignites the Search Wars Between Google and Microsoft, Bloomberg (Feb. 8, 2023, 7:00 AM EST, updated Feb. 8, 2023, 2:58 PM EST), … and initial studies suggested that incorporating GPT-4 may have initially allowed Bing’s traffic growth to surpass Google’s.21Akash Sriram & Chavi Mehta, OpenAI tech gives Microsoft’s Bing a boost in search battle with Google, Reuters (Mar. 22, 2023, 2:34 PM EDT), … Subsequent reports, however, suggest that search market shares remained relatively stable through the second half of 2023.22Darran Allan, Bing AI may be getting crushed in the battle against Google search – but Microsoft might not care, TechRadar (Nov. 13, 2023), …
This controversy provides important insights regardless of which side ultimately prevails. The object lesson is that antitrust courts must allow for the possibility that GenAI could create new business models that diverge from those pursued in the past. Courts interested in promoting consumer welfare must take into account the dynamic nature of the industry. Rather than look exclusively to the past, courts will have to make predictive judgments about how competitive dynamics are likely to play out in the future. When properly conducted, such forecasts look past the quest for “hot documents” showing that a company wanted to triumph over its rivals and instead focus on the structural features necessary to make particular theories of anticompetitive harm feasible. Any other approach risks falling into the classic pitfall of protecting competitors instead of consumers.
3. Potential Legal Obstacles to Access to Training Data
GenAI’s ability to render the market for search more competitive depends on competitors’ ability to obtain access to the large quantities of publicly available data needed to train the models. In addition to competitive considerations, GenAI systems must also take care to comply with at least three legal regimes: copyright, privacy, and security.
Perhaps the area of GenAI that has generated the most interest among legal scholars is its relationship to copyright. Although much of the literature has focused on whether AI-generated works are copyrightable, two other issues are more important from the standpoint of competition law: (1) the use of copyrighted works as inputs to train AI and (2) the possibility that AI could generate copyright-infringing outputs.23Artificial Intelligence and Intellectual Property: Part I — Interoperability of AI and Copyright Law: Hearing Before the Subcomm. on Cts., Intell. Prop., and the Internet of the H. Judiciary Comm., …
With respect to the use of copyrighted material as inputs to train GenAI models, scholars generally argue that such uses should constitute fair use and thus should be legal, while recognizing that the issue remains unresolved and acknowledging the existence of substantial arguments to the contrary.24Scholars generally favor treating the use of copyrighted works to train GenAI as fair use but recognize that the issue remains unresolved and acknowledge the existence of substantial arguments to the … A determination that the use of copyrighted works in training data does not constitute fair use would increase the cost of obtaining access to the quantity of data generally needed to train GenAI systems. Both sides of this argument are being presented in the recent lawsuit brought by the New York Times against Microsoft and OpenAI.25Alexandra Brunell, New York Times Sues Microsoft and OpenAI, Alleging Copyright Infringement, N.Y. Times (Dec. 27, 2023, 8:24 AM ET), … The ultimate outcome will necessarily remain uncertain until the courts resolve these cases. It bears noting that cases have found fair use when copying seeks to create new products26Google LLC v. Oracle Am., Inc., 141 S. Ct. 1183, 1203 (2021). or serves a different function.27Authors Guild v. Google, Inc., 804 F.3d 202, 217-18 (2d Cir. 2015); Authors Guild, Inc. v. Hathitrust, 755 F.3d 87, 96-97 (2d Cir. 2014); A.V. ex rel. Vanderhye; K.W. v. iParadigms, LLC, 562 F.3d 630, …
Regarding the possibility that GenAI may generate outputs that infringe copyright, early judicial decisions have dismissed a number of such claims.28Kadrey v. Meta Platforms Inc., No. 3:23-cv-03417-CV, 2023 WL 8039640 (N.D. Cal. Nov. 20, 2023); Andersen v. Stability AI Ltd., No. 3:23-cv-00201-WSHO, 2023 WL 7132064 (N.D. Cal. Oct. 30, 2023). Data scientists are exploring the ability of GenAI to memorize copyrighted works contained in its training dataset and to reproduce them in response to prompts, and they are developing measures to counteract these effects.29See Stella Biderman et al., Emergent and Predictable Memorization in Large Language Models, 36 Advances in Neural Info. Processing Sys. (forthcoming 2023) (preprint available at … The possibility that the outputs of GenAI instances could generate substantial copyright liability would again serve as a drag on disruptive business models based on the new technology.
Another area of law that places constraints on GenAI is privacy law. Most notably, the European Union’s General Data Protection Regulation (GDPR) imposes myriad obligations on anyone using personal data of people located in the European Union.30Josephine Wolff, William Lehr & Christopher S. Yoo, Lessons from GDPR for AI Policymaking, 27 Va. J.L. & Tech. no. 4 (2024), … For example, unless one of the other legal bases applies, GDPR requires that anyone processing personal data about a data subject obtain from that data subject affirmative, individualized consent for each form of processing conducted as well as provide them with the ability to withdraw their consent.31Regulation 2016/679, arts. 6(1)(a), 7(3), 2016 O.J. (L 119) 1, 36-37 (EU). GDPR also applies heightened protections for special categories of data, including “data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health or data concerning a natural person’s sex life or sexual orientation.”32Id. art. 9. Controllers must also disclose a wide range of information to data subjects and provide them with rights of access, rectification, erasure, objection, and data portability, as well as the right to restrict uses under certain circumstances.33Id. arts. 13-20. Of particular note for AI is “the right not to be subject to a decision based solely on automated processing . . . which produces legal effects concerning him or her or similarly significantly affects him or her.”34Id. art. 22. Concerns about privacy led the Italian data protection authority to ban ChatGPT for several weeks in April 2023.35Kelvin Chan, OpenAI: ChatGPT back in Italy after meeting watchdog demands, AP (Apr. 28, 2023, 2:46 PM EST), … Italian authorities identified additional issues that OpenAI still needs to address, and data protection authorities in other countries have initiated inquiries of their own.36Id.
In addition, GenAI deployments in the European Union may be subject to the impending EU Artificial Intelligence Act. Under the current compromise reached by the trilogue between the EU Commission, Council, and Parliament in December 2023, the AI Act would subject all General Purpose AI (GPAI) systems to transparency obligations and would subject all GPAI with “systemic risk” (defined as models trained with computing power above 10^25 floating point operations (FLOPs)) to additional obligations, including model evaluations, assessment and mitigation of systemic risks, adversarial testing, reports to the Commission of serious incidents, cybersecurity protection, and reports on energy efficiency.37European Parliament Press Release PR 15699, Artificial Intelligence Act: deal on comprehensive rules for trustworthy AI (Dec. 9, 2023), …
In addition, GenAI systems may contain vulnerabilities that may lead them to reveal personal information. For example, scholars are exploring whether certain attacks can cause GenAI systems to leak details of their training datasets in ways that can violate privacy law.38Carlini et al, Extracting Training Data, supra note 26, at 2635. Failure to address these privacy concerns could constitute a significant obstacle to the effective deployment of GenAI.
In addition to copyright and privacy law, GenAI systems must also comply with the laws governing online security. For example, the U.S. Computer Fraud and Abuse Act (CFAA) subjects anyone who exceeds their authorized access to a computer to criminal and civil liability.3918 U.S.C. § 1030(a)(1). One concern is that some websites make their content available to the public subject to conditions in their terms of service prohibiting wholesale scraping of their data. A recent U.S. Supreme Court decision failed to resolve the issue, offering language suggesting that such paper barriers to access would not support CFAA liability while dropping a footnote that explicitly reserved the issue.40Van Buren v. United States, 141 S. Ct. 1648, 1659 n.8, 1660-62 (2021). The result is overhanging ambiguity regarding the legality of the collection of public data that violates these terms of service,41Nat’l Acads. of Sci., Eng’g, & Med., Social Media and Adolescent Health 205-06 (Sandro Galeo, Gillian J. Buckley & Alexis Wojtowicz eds., 2023), … which potentially places a cloud over any GenAI system trained on data collected in this manner.
From the standpoint of enhancing competition, GenAI offers both potential upsides and downsides. Opening the door to new business models can offer rivals new dimensions along which they can compete, although it remains theoretically possible that these changes could also reinforce advantages enjoyed by incumbents. Proper resolution of this ambiguity requires the type of careful, evidence-based assessment of impact on consumers that has long characterized traditional antitrust law. In addition, GenAI systems must comply with the relevant requirements of copyright, privacy, and security law if they are to diversify competition in the manner that many hope, and the burden of complying with those requirements may in turn reduce the level of innovation and competition in AI.
Citation: Christopher S. Yoo, Generative AI’s Potential Impact on Online Competition, Dynamics of Generative AI (ed. Thibault Schrepel & Volker Stocker), Network Law Review, Winter 2023.