Digging into Early Colonial Mexico: DECM Machine Ready Corpus, 1577-1585
Creator
Murrieta-Flores, P, Lancaster University
Jimenez-Badillo, D, Instituto Nacional de Antropología e Historia, Mexico
Favila-Vazquez, M, Centro de Investigaciones y Estudios Superiores en Antropología Social, Mexico
Liceras-Garrido, R, Universidad Autónoma de Madrid, Spain
Bellamy, K, Lancaster University
Study number / PID
855935 (UKDA)
10.5255/UKDA-SN-855935 (DOI)
Data access
Open
Series
Not available
Abstract
This digital version of the Relaciones Geográficas (RGs) corpus contains only the historical information produced in the 16th century. All comments and footnotes by René Acuña and Mercedes de la Garza have been removed to provide a clean version of the transcribed documents. This version of the corpus is ready to be used for Text Mining, Machine Learning, Natural Language Processing, Corpus Linguistics, and any other computational methodologies available for the study and exploration of historical textual sources.
The Data Collection is available from an external repository. Access is available via Related Resources.
The 'Colonisation of America' is a fundamental process in the history of the modern world. Along with archaeological remains, the historical writings related to the establishment of the so-called Virreinatos constitute primary sources of information for the understanding of this period. An extended compilation of information ordered by the Spanish crown in the 16th century, called Relaciones Geográficas, served to gather vast amounts of information about the New World through multiple records and descriptions, both in Spanish and indigenous style. Traditional research on these documents has relied on the close reading of a handful of these texts, which can take the scholar a lifetime to examine. Using a Big-Data approach, this project will, for the first time, apply ground-breaking computational methodologies to study one of the most important sources for the colonial history of America, and it will identify, extract, cross-link, and analyse information of vital importance to historical enquiry. Our highly interdisciplinary team will combine techniques from different disciplines, including Corpus Linguistics, Text Mining, Natural Language Processing, Machine Learning, and Geographic Information Systems, to address questions related to the recording of information about indigenous cultures, the Spanish exploration of indigenous social and religious concepts, the...
Terminology used is generally based on DDI controlled vocabularies: Time Method, Analysis Unit, Sampling Procedure and Mode of Collection, available at CESSDA Vocabulary Service.
Methodology
Data collection period
31/12/2017 - 30/12/2020
Country
Mexico, United Kingdom
Time dimension
Not available
Analysis unit
Other
Universe
Not available
Sampling procedure
Not available
Kind of data
Text
Data collection mode
Historical research; Optical Character Recognition; Transcription
Funding information
Grant number
ES/R003890/1
Access
Publisher
UK Data Service
Publication year
2022
Terms of data access
The Data Collection is available from an external repository. Access is available via Related Resources.