
Social networks are bolstering their terms of service against the scrapers and bots that crawl their sites to gather training data for AI models. Days after Elon Musk-owned X updated its terms to explicitly prohibit AI model training, decentralized social network Mastodon today updated its own rules to bar any kind of model training as well.
“We explicitly prohibit the scraping of user data for unauthorized purposes, e.g. archival or large language model (LLM) training. We want to make it clear that training LLMs on the data of Mastodon users on our instances is not permitted,” Mastodon said in an email sent to users.
The new terms, which take effect on July 1, include legal language that prohibits data extraction and the development of automated systems on the platform.
“Use, launch, develop, or distribute any automated system, including without limitation, any spider, robot, cheat utility, scraper, offline reader, or any data mining or similar data gathering extraction tools to access the Instance, except in each case as may be the result of standard search engine or Internet browser and local caching or for human review and interaction with Content on the Instance,” the terms note.
It’s important to note that these terms apply only to the mastodon.social server, which is just one of many instances in the fediverse, a distributed network. This means scrapers could still extract data from other servers, and use it to train AI models, if those servers don’t explicitly bar the practice in their own terms of service.
Other companies, including OpenAI, Reddit, and The Browser Company, have added similar clauses to their terms to prevent others from using their data to train AI models.
Apart from this change, Mastodon is also enforcing a new minimum age of 16 for users. The social network previously required U.S. users to be at least 13; the new age limit applies globally.