The Rise of Large-Scale AI Systems and the Arrival of the GenAI Revolution
On March 29, 2023, the Future of Life Institute (FLI) published an open letter calling on “all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4” (FLI, 2023). Signed by eminent academics, CEOs, and other tech luminaries, the letter lamented the lack of “planning and management” that characterized the hapless behavior of “AI labs locked in an out-of-control race to develop and deploy ever more powerful digital minds that no one – not even their creators – can understand, predict, or reliably control.”
The commercial explosion of generative AI (GenAI) technologies in the months following the launch of ChatGPT at the end of 2022 sent shockwaves across the digital world. Within weeks, hundreds of GenAI applications had stormed onto the scene, seemingly penetrating all areas of everyday life. Large tech companies like Microsoft and Google simultaneously integrated these technologies into their flagship digital services, impacting billions of users worldwide.
This rapid industrialization of AI was not necessarily unexpected. For several years, stakeholders from industry, academia, government, and civil society had made concerted efforts to develop standards, policies, and governance mechanisms to ensure the ethical, responsible, and equitable production and use of AI systems. Regional treaties, national laws, and voluntary initiatives had already established a robust conceptual basis for confronting the risks of expanding digitalization and datafication.
However, despite this ostensible readiness, the unfolding of the GenAI revolution triggered a crisis in the international AI policy and governance ecosystem. A disconnect emerged between mounting public concern about the hazards posed by the hasty industrial scaling of GenAI and the absence of effective regulatory mechanisms and policy interventions to address those hazards.
This international AI governance crisis was marked both by the absence of vital aspects of AI policy and governance capability and execution and by the presence of new factors that significantly contributed to future shock. The former included enforcement gaps in existing digital and data-related laws, a lack of regulatory AI capacity, democratic deficits in the production of standards for trustworthy AI, and widespread evasive tactics of ethics washing and state-enabled deregulation. The latter centered on the dynamics of unprecedented scaling and centralization that emerged as both drivers and by-products of the Large-Scale Era of AI and ML, which ushered in the GenAI revolution.
Scaling Dynamics and the Rise of New Risks
The drastic scaling of compute capacity, data ingestion, and model complexity led to qualitative leaps in the capabilities of industry-produced foundation models (FMs) and GenAI systems. These large-scale AI systems gained new capacities for zero-shot, transfer, and in-context learning, allowing them to generalize behavior across divergent application contexts. Scaling also enabled significant aggregate performance gains according to “scaling laws.”
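To make concrete what these “scaling laws” describe, a stylized rendering (illustrative only, and not drawn from the works cited in this Policy Forum) is the empirical power-law relationship reported in LLM pretraining studies, in which test loss falls predictably as parameter count N, training tokens D, and compute C grow:

\[
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C},
\]

where N_c, D_c, and C_c are empirically fitted constants and the exponents α_N, α_D, and α_C are small positive values (roughly in the 0.05–0.1 range in early reported fits), implying smooth but diminishing returns to each axis of scale.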
This new era of large-scale AI systems saw a small class of FMs/large language models (LLMs) being converted into diverse task-specific applications and adapted to carry out a wide range of downstream functions. The effective conversion of these models into usable applications also required the introduction of novel computational techniques to transform them into “agentic systems” with functionalities like memory, planning, and reflection.
The multipurpose and increasingly multimodal character of this new class of large-scale AI systems, their growing agentic capacities, and their emergent capability for transfer learning and linguistic functionality across application domains heralded a step change in their utility, commercializability, and uptake. The launch of ChatGPT in late 2022 catalyzed the transformation of the era of large-scale AI systems into the era of AI’s industrial revolution, with large firms hastening to integrate GenAI applications into their core products, environments, and services.
This rapid industrialization of FMs and GenAI systems introduced two primary drivers of a new order and scale of AI-related risks: model scaling and industrial scaling.
The scaling of training data sets has been a precondition of the accelerating evolution of multipurpose FMs. The embrace of “the more data the better” and “scale is all you need” led to the collection of massive and uncurated web-scraped data sets. This has given rise to risks of data poisoning, memorization, and leakage, as well as potential violations of data protection rights and intellectual property infringements.
Furthermore, the unfathomability of FM/GenAI training data sets has led to risks of serious psychological, allocational, and identity-based harms that derive from discriminatory and toxic content embedded in web-scraped data and from demographically skewed data sets that lead to disparate model performance.
Alongside the risks stemming from data scaling, the scaling of model size and complexity has occasioned a range of unprecedented risks and governance challenges related to model opacity. The seeming impenetrability of these ultra-high-dimensional AI systems has rendered conventional AI explainability techniques largely unsuitable and ineffectual, yielding an urgent interpretability predicament.
While emerging techniques like representation engineering, mechanistic interpretability, and prompt-based self-explanation hold some promise, they are still in their early stages and face significant challenges. The unaddressed interpretability predicament has been a significant enabling factor of future shock, as the hasty commercial roll-out of black-box GenAI applications has taken place without sufficient safeguards.
In addition to the risks associated with model scaling, the rapid industrialization of FMs and GenAI systems has led to a new scale of systemic-, societal-, and biospheric-level risks and harms. The brute-force commercialization of GenAI ushered in a new age of widespread exposure, leaving ever larger numbers of people and communities susceptible to the risks and harms issuing from model scaling and to new possibilities for misuse, abuse, and cascading system-level effects.
These system-level risks span several domains. At the economic level, they include labor displacement, rising inequality, and scaled fraud-based harms. At the level of the information ecosystem, they include downstream data pollution, model collapse, and large-scale mis- and disinformation. At the population level, they threaten individual safety, security, and well-being through scaled cyberattacks and malware production, threats of bio-, chemical, and nuclear terrorism, and poorly designed, out-of-control systems. At the societal level, they affect individual agency, interpersonal relations, and political life through mass deskilling, cognitive atrophy, anthropomorphic and sociomorphic deception, overdependence, social polarization, and the deterioration of social cohesion and public trust in democratic processes. At the geopolitical level, they include the dual use, weaponization, and militarization of AI. And at the biospheric level, they include environmental degradation, resource and biodiversity drain, and climate-related involuntary displacement.
The International AI Governance Crisis
The combination of the absence of vital aspects of AI policy and governance capability and the presence of new scaling-induced risks contributed to the emergence of an international AI governance crisis. Large tech firms capitalized on regulatory inaction and ineptitude, exploiting knowledge, information, and resource asymmetries to shape the pace and scope of potential statutory and governance interventions. This was exacerbated by the dynamics of unprecedented political-economic and geopolitical power centralization that shaped GenAI’s eruptive rise.
At the heart of the crisis was the disconnect between mounting public criticism and the ecosystem-level chasm engendered by the absence of needed regulatory mechanisms and policy interventions. While the first wave of international policy and governance initiatives in mid-2023 was seen by some as signaling progress, critics have emphasized that much of this activity has been ineffective and diversionary, subserving the deregulatory interests of big tech firms, failing to deliver binding governance mechanisms, and further entrenching legacies of Global North political, economic, and sociocultural hegemony.
The narrow focus of these initiatives on technical “AI safety” concerns, rather than directly confronting the immediate threats to civil, social, political, and legal rights and environmental sustainability, allowed private sector “experts” to shape the governance discussion and maintain the legitimacy of corporate self-regulation. This perpetuated the absence of ex ante governance measures to secure the rights and interests of impacted people in advance of potentially harmful consequences.
Crucially, the international AI policy and governance conversation that steered these outcomes centered the views, positions, and interests of a handful of prominent geopolitical and private sector actors from the high-income countries of the West and the Global North, while broadly neglecting the contexts, voices, and concerns of those impacted communities whose members were from the Global Majority, especially those from lower-income countries.
This uneven pitching of the international discussion had significant agenda-determining consequences, whereby major issues affected by GenAI policy were largely absent from or deprioritized in global debates. These included the exploitation of labor, widening digital divides, growing global inequality, infringements on data sovereignty, inequities in international research environments, the worsening of institutional instability, epistemic injustices, data extractivism, and the disproportionate impact of AI-prompted environmental harm on those in lower-income and small island countries.
Toward Transversal and Equitable AI Policy and Governance
To address the international AI governance crisis, a rebalancing of AI policy and governance discussions is needed to include the voices of those from low- and middle-income and small island countries who have thus far been sidelined. This requires concerted efforts to understand and address the contexts of coloniality, global inequality, and systemic discrimination that have created the situation in the first place.
A critical first step is to create robust transversal interactions between a multiplicity of voices, backgrounds, and experiences. The idea of transversality involves disrupting the assumption of a fixed core-periphery relationship anchored in the Global North, instead creating a multitude of peripheries without a core. This enables inclusive and meaningful policy dialogues equipped to interrogate, tackle, and repair the full range of risks and harms emerging from GenAI, while also confronting the longer-term socio-historical patterns of inequity.
Recent efforts by UNESCO to co-organize global summits and launch participatory initiatives like the Global AI Ethics and Governance Observatory signal a gathering momentum of transversality. More of this kind of global co-convening and network-building work is needed to amplify the voices of those who have been peripheralized in international AI ethics and governance discussions and to decenter these discussions themselves.
The 13 position papers collected in this Policy Forum offer diverse perspectives on how to address the international AI governance crisis, exploring concerted international efforts, strategies for effective governance, and ways to confront the societal and biospheric risks enabled by industrial scaling. By prioritizing multi-sector, cross-disciplinary, and geographically diverse expertise, this collection aims to initiate and advance a meaningful, informed, and far-ranging conversation on the future of AI policy and governance.
References
Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big?. In FAccT ’21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (pp. 610-623). Association for Computing Machinery. https://doi.org/10.1145/3442188.3445922
Birhane, A., Kasirzadeh, A., Leslie, D., & Wachter, S. (2023). Science in the age of large language models. Nature Reviews Physics, 5, 277-280. https://doi.org/10.1038/s42254-023-00581-4
Bommasani, R., Creel, K. A., Kumar, A., Jurafsky, D., & Liang, P. S. (2022). Picking on the same person: Does algorithmic monoculture lead to outcome homogenization?. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, & A. Oh (Eds.), Advances in Neural Information Processing Systems (Vol. 35, pp. 3663-3678). Curran Associates. https://proceedings.neurips.cc/paper_files/paper/2022/hash/17a234c91f746d9625a75cf8a8731ee2-Abstract-Conference.html
Bommasani, R., Hudson, D. A., Adeli, E., Altman, R., Arora, S., von Arx, S., Bernstein, M. S., Bohg, J., Bosselut, A., Brunskill, E., Brynjolfsson, E., Buch, S., Card, D., Castellon, R., Chatterji, N., Chen, A., Creel, K., Davis, J. Q., Demszky, D., … Liang, P. (2021). On the opportunities and risks of foundation models. ArXiv. https://doi.org/10.48550/arXiv.2108.07258
Gebru, T., Bender, E. M., McMillan-Major, A., & Shmitchell, S. (2023). Interrogating the rhetoric around AI safety: A critical view from the ‘margins.’ ArXiv. https://doi.org/10.48550/arXiv.2301.02778
Hanna, A., & Bender, E. M. (2023). Incremental progress is not enough: We need to transform the field of AI. Nature Machine Intelligence, 5, 9-10. https://doi.org/10.1038/s42256-022-00544-z
Levinstein, B. A., & Herrmann, J. W. (2024). Probing lie-detection in large language models. Harvard Data Science Review, (Special Issue 5). https://doi.org/10.1162/99608f92.5a8f63ac
Leslie, D., Ashurst, E., Perini, A. M., Katell, M., Lin, Y., & Burr, C. (2024). ‘Frontier AI,’ power, and the public interest: Who benefits, who decides?. Harvard Data Science Review, (Special Issue 5). https://doi.org/10.1162/99608f92.eb98b5b9
Metcalf, J., & Singh, R. (2024). Scaling up mischief: Red-teaming AI and distributing governance. Harvard Data Science Review, (Special Issue 5). https://doi.org/10.1162/99608f92.f8c3ebd0
Moltzau, A., & Prabhu, R. (2024). Castles in the sand?: How the public sector and academia can partner in regulatory sandboxes to help leverage generative AI for public good. Harvard Data Science Review, (Special Issue 5). https://doi.org/10.1162/99608f92.1b1b8ab1
Weidinger, L., Mellor, J., Rauh, M., Griffin, C., Uesato, J., Huang, P.-S., Cheng, M., Glaese, M., Balle, B., Kasirzadeh, A., … Gabriel, I. (2021). Ethical and social risks of harm from language models. ArXiv. https://doi.org/10.48550/arXiv.2112.04359