2016 EU Referendum campaign online news and information URLs
The data set represents processed data from individual web browsing histories collected during the EU Referendum campaign as part of ICM Unlimited Reflected Life's panel. Each line of data represents the number of times an individual user visited a news & information domain during the data collection period. The advent of Web 2.0 - the second generation of the World Wide Web,...
Predictors of likelihood of sharing disinformation on social media 2019-2020
This dataset was collected as part of a project evaluating the effect of a number of predictors on the likelihood of individuals sharing disinformation onward on social media platforms. Four online experimental studies were performed, with characteristics of the messages being manipulated and characteristics of the individuals being measured. The psychometric measures...
Accent bias and fair access in Britain 2017-2020
The accent bias project examines attitudes to major accents in England, changing attitudes across age groups, attitudes to new urban dialects, and explores how accent interferes with assessments of professional ability. Using a combination of insights from sociolinguistics, social psychology, and labour market economics, we explore accent bias in relation to a number of...
The national corpus of contemporary Welsh, 2016-2020
The CorCenCC corpus contains over 11 million words (circa 14.4m tokens). CorCenCC is the first corpus of the Welsh language that covers all three aspects of contemporary Welsh: spoken, written and electronically mediated (e-language). It offers a snapshot of the Welsh language across a range of contexts of use, e.g. private conversations, group socialising, business and...
Multilingualism and Multiliteracy: Raising Learning Outcomes in Challenging Contexts in Primary Schools Across India, 2016-2020
The Multilingualism and Multiliteracy (MultiLila) project was a four-year research study (2016–2020). It aimed to examine whether a match or mismatch between the child’s home language(s) and the school language affects learning outcomes while at the same time taking into account other factors that can affect a child’s performance on basic school skills and more advanced,...
Consumer perceptions of energy and clothes shopping: a high street survey
SPSS data file and questionnaires from a High Street survey conducted as part of the TRANSFER project. A total of 138 members of the general public (mix of genders and ages) were sampled in Sheffield (n = 111), Stockport (n = 19) and Manchester (n = 8), United Kingdom, in June/July 2014.
Respondents completed either a questionnaire about purchasing 'green' energy tariffs...
Assuming identities online: Experimental chatlogs
Research taking a computational approach to the analysis of online communications has thus far focused overwhelmingly on the structural elements of Computer Mediated Discourse (CMD), such as typography, orthography and other low level features, with little to no attention being paid to the socially situated discourses in which these features are embedded. The Centre for...
DCAL Research Data Archive 2006-2016
DCAL is the largest deafness and sign language research centre in Europe, bringing together leading Deaf and hearing researchers in the fields of sign linguistics, psychology, and neuroscience. Based at University College London (UCL) and funded by the Economic and Social Research Council (ESRC), DCAL places sign languages and Deaf people in the centre of the general...
Interviews with disaster affected people, humanitarian officers, local government representatives and other stakeholders involved in the Typhoon Haiyan Recovery
The dataset contains semi-structured interviews with 80 people affected by Typhoon Haiyan, which hit the Central Philippines in November 2013 and remains the strongest typhoon ever to make landfall. The interviews explored how participants experienced the disaster recovery and in particular whether - and if so how - they used social and mobile media in that process.
Understanding speech in the presence of other speech: Perceptual mechanisms for auditory scene analysis in human listeners
The datasets comprise transcriptions of stimuli (simplified analogues of spoken utterances) and associated keyword scores (proportion correct). The transcriptions are those entered by the participant using a keyboard.
It is unusual to hear the speech of a particular talker in isolation; speech is typically heard in the presence of interfering sounds, such as the voices of...
Television framing of the 2014 Scottish independence referendum - Part 2: Coding of frames in television programmes
This dataset contains the coding of the frames that appeared in all news and current affairs items about the 2014 Scottish independence referendum which were produced for a Scottish audience (i.e. excluding the UK-wide coverage of the referendum) and were broadcast on BBC Scotland and STV between 18 August and 18 September 2014. The file records the date, duration, channel,...
Television framing of the 2014 Scottish independence referendum - Part 3: Coding of news sources
This dataset contains the coding of the sources that appeared or were openly referenced in all news items about the 2014 Scottish independence referendum which were broadcast on BBC Reporting Scotland between 18 August and 18 September 2014. The file records the name of each source, the duration of their appearance or quotation, their gender, the side they supported in the...
Study of dynamic communities on networks, diabetes tweets 2013-2014
This data collection consists of tweets in English that contain the term 'diabetes' posted between March 2013 and January 2014.
Abstract from the paper:
Social media are being increasingly used for health promotion. Yet the landscape of users and messages in such public fora is not well understood. So far, studies have typically focused either on people suffering from a...
How people perceive likelihood and risk of inferring sensitive information from social media data: survey data, 2016
This data collection consists of anonymised survey data collected as part of a study into how people perceive likelihood and risk of inferring sensitive information from social media data when injecting conflicts and uncertainty. Electronic files include an XLS spreadsheet of collected survey responses and PDF versions of the online survey instrument. There is now a broad...
A sociology of values and value
The collection consists of transcripts from interviews with project participants, data from their Facebook usage, related advertising and browser tracking as well as survey answers.
There has been a great deal of interest in how capital has intervened in almost every area of life, leading some to propose new forms of capital eg ‘emotional capitalism’, and others to suggest...
Irish marriage referendum tweets 2015
This dataset contains the IDs of 499,642 tweets containing the hashtags #marref and #marriageref posted on Twitter between May 8 and May 23 2015.
We examine the relationship between social structure and sentiment through the analysis of a large collection of tweets about the Irish Marriage Referendum of 2015. We obtain the sentiment of every tweet with the hashtags #marref...
Interactions in duo improvisations
This data release relates to the topic of interactions in music performance. The data consists of annotations, movement and audio descriptors, and computational predictions of interactions in jazz duos. The jazz duos are represented by 15 pulsed (standard jazz) improvisations and 15 non-pulsed (free jazz) improvisations, which have been documented previously (Moran...
Enemy addiction: Archival documents from 13 United States presidential libraries, 1919-2008
The project 'Enemy Addiction' has created approximately 30,000 photos and photocopies of archival document pages.
These consist primarily of security speech drafts, such as the security sections of State of the Union and Inaugural addresses as well as other key security speeches. In addition, the collection contains related communication such as memoranda from the President...
elderLUCID: London UCL Older adults' clear speech in interaction database
This collection contains the quantitative data resulting from the analysis of the elderLUCID audio corpus – a set of speech recordings collected for 83 adults aged 19 to 84 years inclusive. Recordings were made while participants carried out two types of collaborative tasks with a conversational partner who was a young adult of the same sex: (1) a ‘spot the difference’...
Human trafficking media texts
The project partners collected and analysed media material produced after 2000. Gregoriou and Ras examined a 61.5 million word corpus of UK news texts published during 2000-2016. Muždeka compared a sub-corpus from Gregoriou and Ras’ UK study to a similarly compiled corpus of Serbian human trafficking news texts. Beyer analysed British and Scandinavian crime fiction novels,...
Historic droughts inventory of references from British nineteenth-century newspapers 1800-1900
Occurrences of the search term 'drought' in articles published by nine British regional and national newspapers between 1800 and 1900, with surrounding context of 10 words on each side of the search term. The following newspapers were considered: The Era; Glasgow Herald; Hampshire and Portsmouth Telegraph; Ipswich Journal; Northern Echo; Pall Mall Gazette; Reynold’s Journal;...
Historic droughts inventory of references from British twentieth-century newspapers 1900-1999
Occurrences of the search term 'drought' in articles published in editions of The Times between 1900 and 1999, with surrounding context of 10 words on each side of the search term. The inventory provides information regarding publication date and instances of place-names within the UK that co-occur with the search term. Historic Droughts was a four-year (2014–2018), £1.5m...
Word order judgement database
The associated Ph.D thesis investigates the relevance of grammatical structure when using dependency parsing to evaluate multiple aspects of quality in machine-translated sentences. To this end, two tools were produced.
In order to evaluate the performance of these and other tools, a body of native English speakers were presented with a series of sentences and asked to rate...
Linguistic development in L2 Spanish: Creation and analysis of a learner corpus
This project had two aims: to establish a small-scale, high-quality database of spoken learner Spanish, and to undertake a short programme of substantive research into L2 (second language) Spanish. The data was collected from classroom learners of Spanish (with English as their first language), from beginners to advanced level, using specially designed elicitation tasks. For...
Rhythmic timing and dyslexia: A causal connection?
How well children will learn to read is determined in part by their phonological awareness. Phonological awareness refers to the child’s awareness of the sound structure of their spoken language. It develops at different linguistic levels. The first level is that of the syllable (win-dow, pop-si-cle). The second level is that of the onset/rime (w-in, p-op, sw-eet, spr-ing)....
Cross-language differences in pitch range
British speakers are thought to vary their pitch range more (their voice goes more up and down) than German speakers do, but this has never been systematically compared.
The main aim of this project is to develop the methodology that would allow us to investigate the nature of variability in pitch range across speakers of different languages.
In this project the use of pitch...
New word learning in Down syndrome
Individuals with Down syndrome have particular difficulties in language acquisition, and also show relatively poor verbal short-term memory skills. These deficits may be related, as the ability to hold new word sounds in short-term memory is thought to be a crucial aspect of vocabulary learning. To test this possibility, this research will investigate the new word learning...
Effects of processing load on speech segmentation
The goal of this research project is to improve our understanding of the perceptual and cognitive factors contributing to the segmentation of fluent speech. Speech-segmentation research investigates how listeners identify word boundaries in the ongoing stream of sounds. There is ample evidence that the mechanisms supporting segmentation can be categorised...
Talking cleanliness in health and agriculture
The threat to human and animal health posed by a rise in infectious diseases, an increase in antimicrobial resistance and the risk of zoonoses (diseases transmitted from one species to another), such as avian flu, has rarely been higher on the government agenda. It is vital to know how to respond efficiently and effectively to such threats, be it on the farmyard...
Modelling eye-movements made in the course of reading syntactically ambiguous sentences
A central goal for work on human language processing is to spell out the characteristics of the transitory mental operations that enable people to make sense of sentences and passages of prose. In this field, much of the core empirical evidence has come from detailed measurements of the eye-movements that individual readers make as they work through carefully crafted samples...