Summary information

Study title

Drupal Planet links archive, 29-05-2013 - 23-11-2016

Creator

Rozas, D, University of Surrey

Study number / PID

852904 (UKDA)

10.5255/UKDA-SN-852904 (DOI)

Data access

Open

Series

Not available

Abstract

Database of links to posts published under Drupal Planet, a popular RSS feed within the Drupal community, whose contents are curated by Drupalistas according to certain guidelines. The database excludes press releases, job announcements and technical posts with little content relevant to Drupal. This archive was designed for research purposes for the PhD thesis "Drupal as a Commons-Based Peer Production community: an ethnographic perspective". Since posts at Drupal Planet are only retained for 16 weeks, a set of software scripts was developed to collect and archive links to posts automatically from 29 May 2013 to 23 November 2016. This yielded an archive of 8,613 documents for documentary analysis as part of the PhD study. Commons-Based Peer Production (CBPP) is a new model of socio-economic production in which groups of individuals cooperate with each other without a traditional hierarchical organisation to produce common and public goods, such as Wikipedia or GNU/Linux. There is a need to understand how these communities govern and organise themselves as they grow in size and complexity. Following an ethnographic approach, this thesis explores the emergence of and changes in the organisational structures and processes of Drupal: a large and global CBPP community which, over the past fifteen years, has coordinated the work of hundreds of thousands of participants to develop a technology which currently powers more than 2% of websites worldwide. Firstly, this thesis questions and studies the notion of contribution in CBPP communities, arguing that contribution should be understood as a set of meanings which are under constant negotiation between the participants according to their own internal logics of value. Following a constructivist approach, it shows the relevance of less visible contribution activities such as the organisation of events. Secondly, this thesis explores the emergence and inner workings of the socio-technical...

Methodology

Data collection period

Not available

Country

World Wide

Time dimension

Not available

Analysis unit

Other

Universe

Not available

Sampling procedure

Not available

Kind of data

Text

Data collection mode

Two data collection strategies were employed: (1) A PHP script which periodically adds new posts from Drupal Planet to the database. This script was first run on 30/12/2014. From that date (including the previous 30 posts), the list should be exhaustive, provided that no errors caused the server to go down. (2) A Python script to recover the blog posts fetched via the RSS reader of Thunderbird. The source was a set of .eml files, which were parsed and added to the database. These came from several machines and were merged. However, some of the blog posts might not have been gathered (e.g. if the e-mail client was not run for a while). Therefore, the list for the period before the PHP script was first run might be less exhaustive, and some posts might have been lost. The online version of the archive (see Related resources) continues to add further links, currently numbering 11,724 post links. The source code of the scripts is available on GitHub under a GPLv3 license. CSS adapted from captain Anonymous.
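As a rough illustration of strategy (2), the sketch below parses a set of .eml messages and merges them into a de-duplicated list of post links. It is not the archive's actual script (that is the code published on GitHub). The function names `parse_eml` and `merge_posts` are hypothetical, and the sketch assumes that each feed message carries the post URL in a `Content-Base` header and the post title in `Subject`; the header name is a parameter in case the files use a different one.

```python
from email import message_from_string
from email.policy import default


def parse_eml(text, link_header="Content-Base"):
    """Extract title and link from one .eml message (hypothetical helper).

    Assumption: the post URL is stored in the header named by link_header.
    Returns None when the message has no such header.
    """
    msg = message_from_string(text, policy=default)
    link = msg.get(link_header)
    if link is None:
        return None
    title = str(msg.get("Subject", "")).strip()
    return {"title": title, "link": str(link).strip()}


def merge_posts(eml_texts):
    """Merge messages from several machines, keeping one entry per link."""
    seen = {}
    for text in eml_texts:
        post = parse_eml(text)
        # Skip messages without a link and links already collected.
        if post is not None and post["link"] not in seen:
            seen[post["link"]] = post
    return list(seen.values())


if __name__ == "__main__":
    sample = "Subject: Example post\nContent-Base: https://example.org/post\n\nbody"
    # The same message exported from two machines is counted once.
    print(merge_posts([sample, sample]))
```

Keying the merge on the link rather than on the file means that the same post exported from two machines is stored once, which matches the merging step described above. Posts that were never fetched by the e-mail client cannot be recovered this way.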

Funding information

Grant number

FP7-ICT-2013-10 610961

Access

Publisher

UK Data Service

Publication year

2018

Terms of data access

The Data Collection is available to any user without the requirement for registration for download/access.

Related publications

Not available