Summary information

Study title

TweetsKB: A Public and Large-Scale RDF Corpus of Annotated Tweets (Part 11, Jan 2022 - Aug 2022)

Creator

Baran, Erdal (GESIS - Leibniz-Institut für Sozialwissenschaften)
Bensmann, Felix (GESIS - Leibniz-Institut für Sozialwissenschaften)
Dietze, Stefan (GESIS - Leibniz-Institut für Sozialwissenschaften & Heinrich-Heine-University Düsseldorf, Germany & L3S Research Center, Hannover, Germany)

Study number / PID

10.7802/2473 (GESIS)

10.7802/2473 (DOI)

Data access

Information not available

Series

Not available

Abstract

TweetsKB is a public RDF corpus of anonymized data for a large collection of annotated tweets. The dataset currently contains data for nearly 3.0 billion tweets, spanning more than 9 years (February 2013 - August 2022). Tweet metadata, together with extracted entities, sentiments, hashtags, user mentions and URLs, is exposed in RDF using established RDF/S vocabularies. To protect privacy, user IDs are anonymized and the text of the tweets is not provided. For a list of the previous dataset parts, example queries and more information, see the TweetsKB home page: https://data.gesis.org/tweetskb/.

Topics

Not available

Methodology

Data collection period

01/01/2022 - 01/08/2022

Country

Time dimension

Not available

Analysis unit

Not available

Universe

A 1% sample of all tweets from January 2022 to August 2022

Sampling procedure

Not available

Kind of data

Not available

Data collection mode

Web Scraping

Access

Publisher

GESIS Data Archive for the Social Sciences

Publication year

2022

Terms of data access

Free access (without registration) - The research data can be downloaded directly by anyone without further limitations. The data may be used only for non-commercial research.

Related publications

Not available