By: Ayah Bdeir, Imo Udom and Nik Marda

TL;DR: Mozilla is excited about today’s new definition of open source AI, and we endorse it as an important step forward.

Open Source AI

This past year has seen a growing recognition of the societal benefits of open source AI. In October, a large coalition signed onto our statement emphasizing the importance of openness and transparency for safety and security in AI. In February, Mozilla and the Columbia Institute of Global Politics convened AI experts who highlighted how openness could help achieve key societal goals. Policymakers are also embracing open source AI, as demonstrated by the U.S. National Telecommunications and Information Administration (NTIA) issuing a seminal report advocating for openness in AI. Major companies like Google, Microsoft, Apple, and Meta are beginning to open certain aspects of their AI systems.

With the increasing focus on open source AI, it is crucial to establish a shared understanding of what it entails. A clear definition should specify what must be shared and under what terms. Without this clarity, we risk a fragmented approach where companies mislabel their products as “open source,” civil society lacks access to necessary AI components for testing and accountability, and policymakers create ineffective regulations.

The Open Source Initiative (OSI) has released a new draft definition of open source AI, marking a pivotal moment in the evolution of the internet. This follows two years of discussions and debates within the technical and open source communities. It’s not just about redefining “open source” in the context of AI; it’s about shaping the future of technology and its societal impact.

The original Open Source Definition from 1998 was a manifesto for a new way of building software, laying the groundwork for open systems that are now essential to the modern internet. Open source projects have driven innovation and collaboration, making technology more accessible and fostering a culture of transparency while enhancing software security.

This new definition brings clarity to the open source AI discussion. It is binary, like the existing Open Source Definition: a system either qualifies as open source AI or it does not. It establishes that open source AI rests on the ability to freely use, study, modify, and share an AI system, and it emphasizes access to the key components needed to recreate a substantially equivalent AI system, such as information about the training data, the source code, and the AI model itself.

Moreover, this definition attempts to address the complex issue of sharing training data for AI models. It recognizes the challenges of sharing full datasets and avoids disqualifying significant open source AI developments from being labeled as “open source.” Mozilla and EleutherAI have recently gathered experts to outline best practices for open datasets to support AI training, with plans to publish a paper promoting norms for wider availability of training data.

While some may disagree with aspects of OSI’s definition, particularly regarding training data, we believe that the community-driven process has established a crucial reference point for discussions on open source AI. This definition will help combat the growing trend of “openwashing,” where non-open models are falsely promoted as leading “open source” options. Researchers have shown that the consequences of openwashing are significant, affecting innovation and public understanding of AI.

At its core, this effort exemplifies the open source community’s best practices: engaging in open discussions, addressing differences, and refining this definition collaboratively to build something better. It incorporates key aspects of openness that the community has been grappling with, such as considering broader model components and licensing approaches. In contrast, the closed source ecosystem operates in secrecy, limiting access and fostering behind-the-scenes deals among tech companies. We prefer our imperfect but transparent approach to that.

We, along with many others, are eager to continue collaborating with OSI and the broader open source community to clarify the open source AI discussion and unlock its potential for societal benefit.