
Values ML: A new multilingual dataset for values detection in news and political manifestos

Research output: Working paper/preprint › Preprint › Academic

Abstract

Values are important building blocks of ideologies and are regularly referenced in political debates all over the world. They are also abstract, requiring some process of translation to connect them with behaviours or preferences over policies. However, it is still unclear how this translation takes form or how it differs across contexts, such as different countries and cultures. With this paper, we introduce a large-scale, high-quality dataset of news and political manifestos in multiple languages that was annotated and curated by values scholars according to the theory of human values of Schwartz (1992) and Schwartz et al. (2012). Moreover, each expression of values annotated in the texts includes a second layer of identification expressing the degree of fulfilment of the value in the text, specifically whether the value was (partially) attained or (partially) constrained. The final dataset comprises 2,648 annotated texts in nine languages, totalling 74,231 sentences. The dataset can be used to investigate the expression of values in the annotated news articles and political manifestos, and as training material for the development of automated values detection methods using natural language processing algorithms.
Original language: English
Pages: 1
Number of pages: 61
Publication status: Submitted - 28 May 2025