Twitter opens up data for researchers to study COVID-19 tweets

May 1, 2020 - 1:57 PM
A 3D-printed logo for Twitter is seen in this picture illustration made in Zenica, Bosnia and Herzegovina on January 26, 2016. (Reuters/Dado Ruvic/File Photo)

Twitter Inc will grant researchers and software developers access to a real-time data stream of tens of millions of daily public tweets about COVID-19, which they can use to study the spread of the disease or track misinformation, the company said in a blog post on Wednesday.

Twitter said that this access could also be used by approved applicants working on crisis management, emergency response or communication within communities, as well as those developing machine learning and data tools to help the scientific community understand COVID-19.

Social media platforms have both introduced new policies to curb COVID-19 misinformation and warned that there may be errors due to their reliance on more automated moderation systems during the pandemic. Researchers studying the platforms have argued that the companies must collect data about this period.

European Commissioner Vera Jourova called Twitter’s move “a good step in the right direction.”

“Our cooperation and regular contacts with online platforms to fight disinformation are bearing fruit. I have constantly underlined the importance for researchers to have better access to useful non-personal data and tools,” she said.

Last week, 75 groups and individuals, including digital rights and free speech organizations, wrote an open letter to social media platforms asking them to preserve and publish their content moderation data.

However, Twitter said that once tweets were taken down, they would have to be removed from this COVID-19 data set.

Currently, Twitter’s free, public API access offers a small percentage of its full stream of tweets, although the company does provide greater levels of access to paying clients.

Twitter said it had never before offered a full stream on a particular topic, and that the access represented tens of thousands of dollars per month in data.

Researchers will be able to access tweets created from the moment they connect to the data set, but it will not provide historical data.

In the blog post, Twitter said that any developer or researcher with an approved Twitter developer account can apply for access to the COVID-19 stream endpoint, but applicants must meet a set of requirements, including that the use case supports “the public good.”

Applicants will also have to explain how they will protect the privacy and safety of people represented in the data.

Twitter said that researchers would be bound by its usual rules for projects analyzing health-related topics: for example, not deriving or inferring information about a Twitter user’s health and not storing any personal data about such sensitive information.

Reporting by Elizabeth Culliford; Editing by David Gregorio