<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type='text/xsl' href='/oai/static/oai2.xsl' ?><OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2026-05-31T12:17:24Z</responseDate>
  <request identifier="825ee2a87d445b566dc35227b00648ed0edb6f97696c625e6d686d690eb5cf78" metadataPrefix="oai_ddi25" verb="GetRecord">https://datacatalogue.cessda.eu/oai-pmh/v0/oai</request>
  <GetRecord>
    <record>
    <header>
      <identifier>825ee2a87d445b566dc35227b00648ed0edb6f97696c625e6d686d690eb5cf78</identifier>
      <datestamp>2025-06-17T03:17:53Z</datestamp>
      <setSpec>language:en</setSpec><setSpec>openaire_data</setSpec>
    </header>
      <metadata>
        <codeBook xmlns="ddi:codebook:2_5" version="2.5" xsi:schemaLocation="ddi:codebook:2_5 http://www.ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd">
    <docDscr>
      <citation>
        <titlStmt>
          <titl xml:lang="en">The Superdiversity Index</titl>
        </titlStmt>
        <prodStmt>
        </prodStmt>
        <holdings xml:lang="en" URI="https://doi.org/10.17903/FK2/AVI6AH"/>
      </citation>
    </docDscr>
  <stdyDscr>
    <citation>
      <titlStmt>
        <titl xml:lang="en">The Superdiversity Index</titl>
        <IDNo xml:lang="en" agency="DOI">doi:10.17903/FK2/AVI6AH</IDNo>
      </titlStmt>
      <rspStmt>
        <AuthEnty affiliation="University of Pisa" xml:lang="en">Pollacci, Laura</AuthEnty>
        <AuthEnty affiliation="University of Pisa" xml:lang="en">Sirbu, Alina</AuthEnty>
      </rspStmt>
      <prodStmt>
        <prodDate xml:lang="en"/>
        <grantNo agency="Horizon 2020" xml:lang="en">GA 870661</grantNo><grantNo agency="Horizon 2020" xml:lang="en">GA 654024</grantNo><grantNo agency="Horizon 2020" xml:lang="en">GA 871042</grantNo>
      </prodStmt>
      <distStmt>
        <distrbtr xml:lang="en">SoDaNet Data Catalogue</distrbtr>
        <distDate xml:lang="en" date="2024-04-30">2024-04-30</distDate>
      </distStmt>
      <verStmt>
      </verStmt>
      <holdings xml:lang="en" URI="https://doi.org/10.17903/FK2/AVI6AH"/>
    </citation>
    <stdyInfo>
      <subject>
        <keyword xml:lang="en" vocab="ELSST">MIGRANTS</keyword><keyword xml:lang="en">CULTURAL INDICATORS</keyword><keyword xml:lang="en">SUPERDIVERSITY</keyword>
        <topcClas xml:lang="en" vocab="CESSDA Topic Classification">Language and linguistics</topcClas><topcClas xml:lang="en" vocab="CESSDA Topic Classification">Cultural and national identity</topcClas>
      </subject>
      <abstract xml:lang="en">The Superdiversity dataset includes the &lt;em&gt;Superdiversity Index&lt;/em&gt; (SI), calculated from the diversity of the emotional content expressed in texts of different communities. The emotional valences of the words used by a community are extracted from Twitter data produced by that community. The SI is built on &lt;b&gt;Twitter data&lt;/b&gt; and &lt;b&gt;lexicon-based Sentiment Analysis&lt;/b&gt;. In addition, the dataset comprises other possible diversity measures calculated from the same data as the SI, such as the number of tweets in the community language, the Type-Token Ratio, and the number of languages in a community. The SI ranges over [0, 1]: &lt;ul&gt; &lt;li&gt;a value of 0 means that the computed valences are very close to those of a standard emotional lexicon.&lt;/li&gt; &lt;li&gt;a value of 0.5 indicates no correlation between the emotional content of words used by the community on Twitter and the standard emotional content.&lt;/li&gt; &lt;li&gt;a value of 1 would correspond to the use of terms with the opposite emotional content compared to the standard.&lt;/li&gt; &lt;/ul&gt; Data is computed at three geographical scales based on the &lt;a href="https://ec.europa.eu/eurostat/statistics-explained/index.php?title=Glossary:Nomenclature_of_territorial_units_for_statistics_(NUTS)"&gt;Classification of Territorial Units for Statistics (NUTS)&lt;/a&gt;, i.e., NUTS1, NUTS2, and NUTS3, for two nations, Italy and the United Kingdom. The untagged Twitter dataset is composed of just under 73,175,500 geolocalised tweets gathered over 3 months, from 1 August to 31 October 2015.</abstract>
      <sumDscr>
        <nation xml:lang="en">United Kingdom</nation>
        <anlyUnit xml:lang="en">Media unit: Text<concept/></anlyUnit>
        <dataKind xml:lang="en">Textual data</dataKind>
      </sumDscr>
    </stdyInfo>
    <method>
      <dataColl>
        <timeMeth xml:lang="en">Cross-section<concept/></timeMeth>
        <collMode xml:lang="en">Content coding<concept/></collMode>
        <resInstru xml:lang="en">Programming script<concept/></resInstru>
      </dataColl>
    </method>
    <dataAccs>
      <useStmt>
      </useStmt>
    </dataAccs>
    <othrStdyMat>
    </othrStdyMat>
  </stdyDscr>
  <fileDscr>
  </fileDscr>
</codeBook>
      </metadata>
      <about>
        <provenance xmlns="http://www.openarchives.org/OAI/2.0/provenance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/provenance http://www.openarchives.org/OAI/2.0/provenance.xsd">
    <originDescription harvestDate="2025-06-17T03:17:52Z" altered="true">
      <baseURL>https://datacatalogue.sodanet.gr/oai</baseURL>
      <identifier>doi:10.17903/FK2/AVI6AH</identifier>
      <datestamp>2024-04-30T15:50:19Z</datestamp>
      <metadataNamespace>ddi:codebook:2_5</metadataNamespace>
    </originDescription>
</provenance>
      </about>
    </record>
  </GetRecord>
</OAI-PMH>