The Rise of Decentralized Data Ownership
In February 2024, Reddit struck a $60 million deal with Google to allow the tech giant to utilize data from its platform to train AI models. This arrangement, however, did not include the voices of Reddit users whose data was effectively sold without consent. The deal emphasizes a harsh reality—the vast majority of our online personal data is controlled and monetized by big tech companies.
In response to this trend, Vana, a decentralized platform originating from an MIT class project, aims to shift power back to the users. Vana allows individuals to upload their data onto a fully user-owned network, enabling them to dictate how it’s utilized. Once they agree to contribute their data for AI training, users gain proportional ownership of the resulting AI models.
Empowering Users and Advancing AI
The core mission of Vana is to provide everyone with a stake in the AI systems shaping our society today. The platform also serves the purpose of unlocking new data sources critical for advancing AI technologies. Anna Kazlauskas, Vana’s co-founder, remarks, ‘This data is needed to create better AI systems. We’ve created a decentralized system to obtain better data—currently held by big tech—while ensuring users retain ultimate ownership.’
Kazlauskas’s journey into the tech world began during her time at MIT. Originally pursuing an economics degree, her interests shifted toward blockchain and cryptocurrency after joining the MIT Bitcoin club. She began mining Ethereum from her dorm room and became inspired to explore how technology can redistribute power to individuals.
Innovative Approaches to AI Training
In its current form, Vana leverages a little-known law allowing users on major tech platforms to export their data directly. This data can be uploaded into encrypted digital wallets within Vana, where it can be shared for AI training as users see fit. The platform enables AI engineers to suggest ideas for new models and allows users to collaborate, pooling their data for broader applications.
With Vana, user data is kept private—no identifiable information is exposed. Users maintain ownership of the models after creation and are rewarded based on their respective contributions to the training dataset. Kazlauskas adds, ‘This means we can build hyper-personalized health applications that can analyze personal data like sleep patterns and dietary habits.’
Breaking New Ground in Data Collaboration
Last year, Vana successfully engaged over 140,000 users who contributed Reddit data to train a model that can generate posts. These users maintained ownership of the model and set the terms for its utilization. Similar initiatives arise with user data from platforms such as X and Oura rings.
Vana boasts over one million users and more than 20 live data DAOs (decentralized autonomous organizations). The platform has seen over 300 data pools proposed, signaling a promising shift towards collaborative, user-owned AI development. Kazlauskas notes, ‘Vana allows for the creation of powerful models by bringing together disparate data sources which big tech firms often prevent due to regulations.’
Looking Ahead: The Future of AI and User Empowerment
Despite the current dominance of big tech companies, Vana embodies the potential for user ownership in the evolving AI landscape. Kazlauskas concludes, ‘Vana allows collective data action, facilitating better technology that benefits everyone, avoiding the monopoly of all-powerful AI models.’ The platform encapsulates hope that users can reclaim their online data while contributing to the future advancements in AI.
- 0 Comments
- Decentralized Data
- User Rights