Values That Are Explicitly Present in Fairy Tales: Comparing Samples from German, Italian and Portuguese Traditions

Looking at how social values are represented in fairy tales can give insights about the variations in communication of values across cultures. We study how values are communicated in fairy tales from Portugal, Italy and Germany using a technique called word embedding with a compass to quantify vocabulary differences and commonalities. We study how these three national traditions differ in their explicit references to values. To do this, we specify a list of value-charged tokens, consider their word stems and analyse the distance between these in a bespoke pre-trained Word2Vec model. We triangulate and critically discuss the validity of the resulting hypotheses emerging from this quantitative model. Our claim is that this is a reusable and reproducible method for the study of the values explicitly referenced in historical corpora. Finally, our preliminary findings hint at a shared cultural understanding and the expression of values such as Benevolence, Conformity, and Universalism across the studied cultures, suggesting the potential existence of a pan-European cultural memory.


I INTRODUCTION
Culture is defined "as a common heritage of a set of beliefs, norms, and values" [US DHHS 2001], that influences an individual's cognition and behaviour [Wong, 2013]. Social values are understood as standards or criteria of the desirable, thus they guide the selection or evaluation of behaviours, policies, people, and events [Schwartz et al., 2020]. Building on this understanding of values as a cornerstone of culture, we turn to literature as a mirror reflecting these values across different cultural contexts in the past. Developments in natural language processing (NLP), in particular word embeddings, have allowed for the quantitative analysis of historical corpora [Miaschi and Dell'Orletta, 2020, Rodriguez and Spirling, 2022].
With this work we want to test the limits of an approach for studying the social values present in fairy tales, one of the most widely spread forms of popular narratives. Fairy tales are a privileged genre for the identification of patterns of cultural exchange, as they have historically migrated across different cultures and periods, creating a rich tapestry of storytelling traditions.
In particular, we study the aggregated explicit tokens mapped on the values proposed by the Theory of Basic Human Values [Schwartz, 1992, 2012] across fairy tale corpora from the traditions of three European countries, namely Portugal, Italy and Germany, in order to compare their quantitative representations and analyse the emerging patterns. We do this by first finding the stemmed matches of these tokens and enriching the text with the corresponding annotation. After that we employ a technique called word embedding with a compass [Di Carlo et al., 2019] and clique percolation [Palla et al., 2005] to highlight the semantic variation between the three national corpora.
A critical investigation of the results of our method finds that these correspond to findings of previous research. We also find indications that, despite the differences in the expression of values in the three compared countries, it seems that the values of Benevolence (quality of interpersonal relationships), Conformity (respect for social norms and expectations) and Universalism (protection of the welfare of people and nature) have remained consistent in fairy tales across the three national traditions, which we also view as confirmation of the validity of our approach for the study of values embedded in historical, literary corpora. This paper continues with a background section, introducing key concepts from psychology, literary studies and NLP, and discussing relevant literature from these fields. In Section III, we present our approach and the software tools we used. We provide an overview of our results in Section IV. In the final section we conduct a discussion of the approach and results, and we reflect on possible directions for future work.

II BACKGROUND
The study of explicit references of values in fairy tales is related to the accumulated social attitudes up to the historical period of codification of the tales. To our knowledge, no systematic research of this wide topic exists. As such, we view it as being at the crossroads between the socio-historical, literary study of fairy tales, and the psychological study of social values which is shaped by contemporary research. On the other hand, such a study at scale and in a reproducible way would not be possible without the instruments and methods of computational humanities and word embeddings in particular.

Unpacking Fairy-Tale Studies from the Brothers Grimm to Digital Humanities
The late 18th century witnessed the rise of folklore studies as part of a quest for national and cultural identity, particularly in Europe [Schacker, 2003]. Jakob and Wilhelm Grimm, riding the tide of renewed interest in popular culture among the upper-class intelligentsia, became pivotal figures in this domain. They first published their fairy-tale collection Children's and Household Tales in 1812, striving to present a pure German narrative tradition, untouched by foreign influence, particularly the French [Teverson, 2013]. This publication sparked what would become the 19th century's golden age of fairy tales across Europe. This was a time of growing urbanisation, industrialization, and literacy. Scholars and nationalists, fearful of losing invaluable oral traditions due to these rapid societal changes, began the collection and preservation of folklore [Ostry, 2013]. Among these custodians were collectors and writers such as Italy's Giuseppe Pitré and Portugal's Consiglieri Pedroso, whose texts feature prominently here alongside the Grimms'. Their work, heavily inspired by the Grimms, was driven by a desire to distil and dialectically construct their nations' cultural legacy.
Despite the nationalistic intentions of the Brothers Grimm and others who embarked on preserving what they thought to be distinct national narratives, the study of fairy tales reveals as much about the interconnectedness of cultures as it does about their uniqueness. Fairy tales, at their core, are a blend of narratives that "migrate on soft feet" [Warner and Warner, 2016], indicating that they traverse and interweave across generic, geographical and temporal boundaries, sometimes in untraceable ways. Thus, while the Grimms and others sought to capture and enshrine a uniquely national heritage, their work also serves to underscore the similarities between narrative traditions.
Unpicking these similarities and differences, however, can prove to be quite a complex task. As scholars are frequently dependent on translations, the risk for misinterpretation or loss of nuanced meanings during this process is high. Translations, like the ones by Margaret Hunt, Thomas Crane and Henriqueta Monteiro used here, are enormously valuable artefacts, but must be recognised as acts of literary adaptation that might differ from the originals [Haase, 2016]. These translations may introduce variations in the representation and interpretation of values, underscoring the need for careful consideration of linguistic nuances in cross-cultural analysis. Further complicating matters, the comparative analysis of several national traditions involves processing vast quantities of text to identify patterns. This challenge extends beyond the study of fairy tales and into the comparative study of literature as a whole.
In response to these challenges, digital humanities and computer-assisted literary studies offer innovative methodologies. Computational methods, in particular, aid in identifying and assessing literary patterns across scales, from individual texts to entire fields and systems of cultural production [Wilkens, 2015]. These new approaches, to which our work is a contribution, help produce new types of evidence that enrich and expand humanities research. Indeed, computational approaches to fairy tales have already successfully been deployed in studies such as "Computational analysis of the body in European fairy tales" [Weingart and Jorgensen, 2013]. In that study, the authors used digital humanities research methods to analyse the representations of gendered bodies in European fairy tales. They created a manually curated database listing every reference to a body or body part in a selection of 233 fairy tales, and its analysis revealed that the gender and age of fairy-tale protagonists correlate in ways that indicate societal biases, particularly against the ageing female body. A further exploration of gender bias in fairy tales is presented in "Are Fairy Tales Fair?" [Isaza et al., 2023]. This study employs computational analysis to dissect the sequence of events in fairy tales, revealing that one in four event types exhibit gender bias when not considering temporal order, and that female characters are more likely to experience gender-biased events at the start of their narrative arcs. These studies underscore the potential of distant reading, data analysis and visualisation as powerful tools in the comparative study of fairy tales, particularly when used alongside subject expert close reading [Moretti, 2022]. Nevertheless, perceptions and attitudes towards gender represent just a fraction of the broader societal values spectrum.

The expression of values across cultures in European Fairy Tales
Values are regarded as a shared societal understanding of what constitutes good, wrong, fair, unfair, just, right or ethical behaviour [Haidt, 2013, Kesebir and Haidt, 2010, Turiel, 2005]. Values are cognitive representations of an individual's biological needs, an individual's requirements in interpersonal coordination, and the institutional demands focused on group welfare and survival [Schwartz and Bilsky, 1987]. Nonetheless, it is crucial to acknowledge the significance of cultural and individual influence in the development and expression of values. Cultural Psychology postulates that human behaviours result from the reciprocal interaction between cultural and individual psyche [Shweder, 1991, Cohen, 2011, Schwartz et al., 2020]. However, the manifestation of behaviours and values is contingent upon context and situation, implying that similar cultural processes might serve or facilitate different purposes based on cultural context [Rogoff, 2003, Schwartz et al., 2020]. Therefore, one could examine variations in the expression of values across different regions and periods, and this could be done through the analysis of historical corpora. This stems from the expectation that literature can be used as a vehicle for the expression of cultural norms and values, thereby reflecting the distinct ideological attributes of the writers and the regions from which it emerges [Albrecht, 1956]. Several theories have been proposed to summarise values across different cultures (for a review of theories see Ellemers et al. [2019]). In this paper we focus on the Theory of Basic Human Values [Schwartz, 2012], since its validity has been supported across several cultures [Spini, 2003, Schwartz et al., 2001, 2014], and it has been applied in the study of European values (e.g., European Social Survey [Davidov et al., 2008]). A version of the Theory of Basic Human Values [Schwartz, 2012], simpler than its sequel, comprises 10 human values that are fuelled by four motivations arranged in two opposing pairs: Openness to Change vs. Conservation, and Self-Transcendence vs. Self-Enhancement, as observed in Figure 1.
Openness to Change relates to an individual's need for independence of thought, action, and feelings, and readiness for change, therefore comprises the values of Self-Direction, Stimulation, and partly Hedonism. On the other hand, Conservation relates to the values of Security, Conformity and Tradition, as it emphasises the individual's needs for order, preservation of the past, and resistance to change. Self-Enhancement considers the individual's needs to pursue their own interests, success, and dominance over others, therefore comprises the values of Power, Achievement, and partly Hedonism. On the other hand, Self-Transcendence considers the values of Universalism and Benevolence, to focus on the welfare and better interests of others. For a definition of specific values, see Table 1.
Europeans can be regarded as having a common identity [Castano, 2004] that is expressed through their way of life, values and culture, and that has been building since ancient times [Pagden, 2002, Pinheiro et al., 2012]
Table 1: The definition of each of the ten motivational types of values [Schwartz, 2012].
Universalism: Understanding, appreciation, tolerance, and protection of the welfare of all people and of nature.
studies and policy making guidelines, these values correspond to Schwartz's values of Universalism, Self-Direction, and Benevolence (for more information see Scharfbillig et al. [2021], Murteira [2024]). If these values are presumed to have been shared to some degree across the European territory since antiquity, it stands to reason that they could have been variously conveyed through fairy tales across the three regions under analysis.
Socio-psychological constructs such as values can be assessed by either explicit or implicit measures. A construct is implicitly assessed when the individual "is unaware that a psychological measurement is taking place, this type of measure is often used to assess values, attitudes, stereotypes, and emotions in social cognition research" [APA, 2023]. On the other hand, a psychological construct is explicitly assessed when the "individual is aware that a psychological measurement is taking place" [APA, 2023]. Put simply, values can be measured explicitly when individuals are directly asked about values, and implicitly when the individuals are not aware of the measurement, because values are assessed using indirect questioning methods.
Bearing in mind that art is a behavioural expression of culture that serves several purposes, including the form of order, which is the need for psychological and mental organisation of experiences [Dissanayake, 1980], we can hold the reasonable expectation that the historical corpora under analysis will reflect, to a degree, the explicit and implicit cultural ways and behaviours of societies in which these fairy tales were written. We assessed the presence of these values in our corpora by using a word embedding to quantify the textual representation of the tokens that communicate values in fairy tales.
One particular type of explicit reference to values is the negative one, most trivially exemplified in our corpus of study by "not loving" or "step mother". However, this notion of opposites to values expands into value dichotomies. These are pairs of values that are mutually opposed, such as "deceptiveness vs. honesty" or "trust vs. distrust". Generally, the alternatives in a duality do not necessarily imply a positive vs. negative interpretation. To illustrate, none of the options is unequivocally preferable in the dichotomies "tradition vs. innovation", "individualism vs. collectivism", "lawfulness vs. autonomy" [Hardy, 2022, Giouvanopoulou et al., 2023]. Yet, in the cases when it is a matter of an unambiguously positive value and its negation, such as "love vs. hate" or "honesty vs. dishonesty", we argue that the negation of a value is a form of indirect, albeit still explicit, reference to the value itself. Moreover, in some cases the two sides of the dichotomy share the same morphological origin. Thus, we argue that an attempt to capture explicit references to values also needs to capture negative ones, as is the case when working with vocabulary occurrences.

Using Word Embeddings to Quantify Vocabulary Differences
Word embeddings have emerged as an important instrument for the quantitative analysis of textual corpora. These are mappings of vocabulary onto a multidimensional numerical space, based on their occurrences [Mikolov et al., 2013, Rodriguez and Spirling, 2022]. Different techniques for creating word embeddings exist, but their common general principle is "a word is characterised by the company it keeps". It is useful to distinguish between two categories of word embeddings: i) static (also called type-based), those that feature a single numerical representation vector per word token, and ii) contextual (also called token-based), those that allow for multiple representations for a word token in order to capture potential nuances in meanings, according to the surrounding context [Miaschi and Dell'Orletta, 2020, Lenci et al., 2022]. Whereas contextual word embeddings better capture the richness of vocabulary, static word embeddings perform better on smaller corpora which do not have the volume that would allow for the semantic richness necessary to represent potential multiple meanings [Ehrmanntraut et al., 2021]. Arguably, this is due to the fact that in a small thematic corpus, typically meanings are restricted by the context of its compilation.
A widespread approach to overcoming the challenge of small corpora and their lack of richness is the combination of pre-training with a huge generic corpus and subsequent fine-tuning with the corpus of interest. For example, the most popular contextual language model, BERT, is trained on a corpus that includes the entire contents of Wikipedia, which comprises 2.5 billion word tokens [Devlin et al., 2019]; others use training sets that are many orders of magnitude larger [Dodge et al., 2021]. However, corpora of these huge dimensions are inevitably contemporarily written, and due to cultural and linguistic change over time inevitably introduce unwanted biases [de Vassimon Manela et al., 2021, Ahn and Oh, 2021, Mozafari et al., 2020, Cuscito et al., 2024]. In confirmation of this consideration, particularly for the context of Historical English, Manjavacas and Fonteyn [2022] observed that training from the ground up is more effective than fine-tuning of preexisting models, and this has been independently confirmed by Cuscito et al. [2024].
When it comes to comparing the word embeddings representing different corpora, a widespread approach is the so-called semantic change detection [Tahmasebi et al., 2021]. Since for intercultural comparison "change" might wrongly suggest a (diachronic) transition from one culture to the other, a more appropriate wording when comparing contexts that are not sequential is (synchronic) "semantic variation" [Tahmasebi et al., 2021, Schlechtweg et al., 2019]. Still, whenever techniques for semantic change detection do not rely on any particular diachronic properties of the underlying corpora, we claim they could also be reused for synchronic linguistic analysis. More specifically, we claim that an approach called temporal word embedding with a compass [Di Carlo et al., 2019] is applicable to culture-specific rather than time-specific distinctive corpora. This approach consists of first creating an embedding on a cumulative corpus containing all texts from the different cultures to be considered. Then, from this baseline (compass) word embedding, further fine-tuning is performed on each of the corpora to be compared, so as to create culture-specific word embeddings. The result for each corpus is a different vector representation of each particular word token, which allows for quantitative comparisons between them, as done previously [Ferrara et al., 2022, Di Carlo et al., 2019].
Figure 2: The outline of the process we followed.

III METHOD
To describe our method, we focus first on the followed process, and then on the bespoke tool that was developed to facilitate this process.

Process
Our study of the explicit references to values in fairy tales follows the process illustrated in Figure 2. To provide an outline, it starts with the identification of tokens that represent values of interest. We arrange these tokens into groups that we consider to be synonyms in the studied context. Then, we automatically annotate all occurrences in the text of the stems representing the considered tokens. Once this is done, we manually analyse the produced annotations to identify ambiguities and mistakes in this token identification process. The purpose of this analysis is to better understand the semantics behind their occurrences, in order to refine the selection of tokens and identify potential ambiguities arising from a single syntactical token potentially representing multiple values. Finally, we apply a static word embedding with a compass and perform critical analysis on the differences and similarities from the resulting vector spaces.

Fairy Tales Corpora.
The corpus selection had several stages. First, we focused on the Grimms' Children's and Household Tales, using Margaret Hunt's 1884 English translation. We manually selected 30 tales that span well-known and beloved stories and lesser-known ones, so as to provide a comprehensive representation of the entire collection. Then we selected 30 Portuguese and 30 Italian tales taken from two important collections contemporary to the Grimms': Portuguese Folktales by Consiglieri Pedroso, translated to English in 1882 by Henriqueta Monteiro; and Italian Popular Tales, collected and translated to English in 1885 by Thomas Frederick Crane. These collections were chosen due to their cultural significance and their temporal proximity to the Grimms' collection, aiming to offer a comparative perspective on 19th century fairy tales across different European cultures.
Selection of Tokens. Assuming that the historical corpora are themselves mirrors of social behaviours and ways of living in societies in which the fairy tales were written, we are interested in the explicit expressions of values in the texts. Starting from Schwartz's model and the European core values, we initially compose a list of tokens that represent these values, based on three empirical studies regarding value-specific tokens. This list of tokens contains words that were selected from two dictionary studies about values, where each word is associated with a specific value from the 10 identified by Schwartz [Schwartz, 1992, Lindeman and Verkasalo, 2005, Murteira, 2024]. For instance, the token "Peace" is associated with the value of Universalism, and the token "Cooperation" is associated with the value of Benevolence (see Table 4 in Appendix). Then we perform automatic identification of explicit references of these tokens and relate them to the corresponding values. We do this using stemming [Jabbar et al., 2020] on both the token lists and the fairy tale texts. This is because, in contrast e.g. to lemmatisation, stemming reduces different word forms to the same originating token. We use the Snowball stemmer algorithm [Porter, 2001] to identify all occurrences of the stemmed tokens in the corpora and tag (i.e. annotate) them with a label corresponding to the group of synonym tokens.
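The matching step described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the function `crude_stem` is a toy suffix stripper standing in for the Snowball stemmer, and the synonym groups and the sample sentence are hypothetical. What the sketch shows is the principle that both the token list and the text are reduced to stems, and that each match is tagged with the first token of its synonym group.

```python
import re

def crude_stem(word):
    """Toy suffix stripper; an assumed stand-in for the Snowball stemmer."""
    word = word.lower()
    for suffix in ("fully", "ful", "ing", "ed", "es", "s", "e"):
        # Strip at most one suffix and keep at least three characters.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def build_stem_index(groups):
    """Map every stemmed token to its group label (the group's first token)."""
    index = {}
    for group in groups:
        label = group[0]
        for token in group:
            index[crude_stem(token)] = label
    return index

def annotate(text, index):
    """Return (word, label) pairs for words whose stem matches a value token."""
    matches = []
    for word in re.findall(r"[A-Za-z]+", text):
        label = index.get(crude_stem(word))
        if label is not None:
            matches.append((word, label))
    return matches

# Hypothetical synonym groups and sample sentence, for illustration only.
groups = [["love", "affection"], ["peace"], ["help", "assist"]]
index = build_stem_index(groups)
sample = "The king loved peace and helped his loving mother."
print(annotate(sample, index))
```

Because "loved" and "loving" share a stem with "love", both are tagged with the same label, which is the behaviour that motivates stemming over exact matching here.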
Critical Review. We then critically analyse and refine by adapting tokens according to the desired annotation. This was done using a graphical interface that was specifically developed for the purpose and allows for a review of the texts in the corpora with the results of the automatic annotation highlighted in different colours. The tool is discussed in more detail in Section 3.2. The outcome of this was a series of decisions to adjust the token selection as a way to refine it and guide subsequent iterations of this annotation process. Correspondingly, following this approach inspired by grounded theory [Rieger, 2019], the ultimately proposed list of tokens in this study emerges from exploration of the corpus and is not a result of deductive hypothesis research. We provide a statistical overview of the results of this annotation process in Table 2 and a Venn diagram of the occurrences of groups of synonym tokens across the three corpora in Figure 3. Furthermore, in Appendix we provide the complete final version of our tokens.
Figure 3: A Venn diagram showing the occurrences of stemmed tokens across the national corpora. Stems appearing in the figure: compass, confid, curios, curious, equal, fair, father, god, good, help, husband, innoc, jewel, just, kind, king, know, liberti, love, marri, marriag, mother, pay, peac, piti, pray, punish, queen, reason, reward, right, sister, togeth, treasur, truth, wed, wife.
Word Embedding with a Compass. Due to the historical nature of the studied corpora and in order to avoid contaminating them with external biases from pre-training, we organise our analysis following the word embedding with a compass approach [Di Carlo et al., 2019]. To do this, we create one generic culture-agnostic shared embedding from scratch containing all three corpora. Then, starting from this compass, we independently create three parallel fine-tunings, one for each of the cultures. For the creation of the compass, to avoid the possible introduction of biases, we chose not to include any further possible texts, neither from any of our three contexts, nor from others. Our approach to syntactic identification of references of values is not contextual, i.e. we treat a reference to a value-related stemmed token as the same for all its identified uses. This is why, in our critical review step, we examined the validity of this generalisation. To represent the annotations in the word embedding algorithm, before and after each identified occurrence of a token we insert an indication of the corresponding group of synonym tokens (i.e. the first token in that group).
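To make the last step concrete, the following minimal sketch shows one way the group label could be inserted before and after each matched occurrence, so that the label shares the context window of every word in its group when the embedding is trained. The `VALUE_` prefix, the lookup table and the sample sentence are assumptions made for illustration; the exact form of the inserted indication is not specified here.

```python
def insert_labels(tokens, label_of, prefix="VALUE_"):
    """Surround each matched token with a marker naming its synonym group.

    `label_of` maps a token to its group label, or returns None when the
    token does not refer to a value. `prefix` is a hypothetical convention
    that keeps markers distinct from ordinary words.
    """
    annotated = []
    for token in tokens:
        label = label_of(token)
        if label is None:
            annotated.append(token)
        else:
            marker = prefix + label
            annotated.extend([marker, token, marker])
    return annotated

# Hypothetical lookup standing in for the stem-based matcher.
lookup = {"loved": "love", "sister": "sister"}
sentence = ["the", "queen", "loved", "her", "sister"]
print(insert_labels(sentence, lookup.get))
```

The resulting token sequences can then be passed to any static embedding trainer, with the markers treated as ordinary vocabulary items.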
Comparison of Semantic Variation.
The word embedding allows measuring contextual similarity between words, hence our speaking of "change" and "variation". Once we have the three word embeddings for the cultural corpora, for each of them we consider only the distances between groups of tokens (represented by the annotation label, i.e. the first token in each synonym group). We experimentally define a similarity threshold above which we consider a pair of tokens to have a relating edge between them (in a graph representation of tokens), in order to use clique percolation clustering with k=2 [Palla et al., 2005]. In other words, for all similarities above that threshold we consider the corresponding tokens to be related in that embedding, and similarities below the threshold mean the corresponding tokens are not. This results in a clustering that might assign one token to multiple clusters. It might also bring two tokens into the same cluster even if the similarity between them is below the threshold, as long as there is a "bridge" of other tokens in between to connect them.
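The thresholding and clustering step can be sketched as follows. Under the usual definition of clique percolation, a 2-clique is a single edge and two edges are adjacent when they share a node, so for k=2 the communities coincide with the connected components of the thresholded graph, leaving out isolated tokens. The sketch therefore uses a simple union-find structure; overlapping memberships, which clique percolation permits for larger k, do not arise in this simplified version. The two-dimensional vectors and the threshold of 0.85 are invented for illustration and are not values from this study.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms

def percolate_k2(vectors, threshold):
    """Group labels linked by a chain of above-threshold similarities."""
    parent = {label: label for label in vectors}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    linked = set()  # labels that take part in at least one edge
    for a, b in combinations(vectors, 2):
        if cosine(vectors[a], vectors[b]) > threshold:
            parent[find(a)] = find(b)
            linked.update((a, b))

    clusters = {}
    for label in linked:
        clusters.setdefault(find(label), set()).add(label)
    return sorted(sorted(cluster) for cluster in clusters.values())

# Toy label vectors (hypothetical, not taken from the trained embeddings).
toy = {
    "mother": (1.0, 0.0),
    "brother": (0.9, 0.3),
    "know": (0.6, 0.6),
    "law": (0.0, 1.0),
    "justice": (-1.0, 0.1),
}
print(percolate_k2(toy, 0.85))
```

In this toy example "mother" and "know" fall below the threshold when compared directly, yet they end up in the same cluster because "brother" is sufficiently similar to both, which is the bridging effect described above.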
Historical and Social Critical Analysis. At the end of our method, we analyse the quantitative results using critical analysis from the perspectives of both literary studies and psychological research. This allows us to cross-validate (e.g. through triangulation [Noble and Heale, 2019]) our results with the established body of research and thus get an indication of their theoretical validity.

Automated Annotation Tool
To facilitate the critical analysis of the annotations, we developed a bespoke tool, named MOREEVER, that automatically identifies the explicit references to values, highlights them for critical human review of the tales and annotations, and provides some simple visualisation techniques to ease the comparative analysis. The main view of the annotation tool is provided in Figure 4.
Both text titles on the left and tokens on the right are clickable, which allows easy browsing per corpus to explore individual fairy tales, as well as per value token. Through a dropdown box visible in the upper left corner of Figure 4, the tool features a list of vocabulary generalisation techniques, intended as techniques to identify a broader range of tokens as matching. The choice of the technique in use can be changed in real time to allow users to examine in context throughout the corpora texts which vocabulary generalisation best approximates the expression of values they are aiming for. Among these generalisation techniques are lemmatisation, as well as Porter [Porter, 2006], Snowball [Porter, 2001] and Lancaster [Paice, 1990] stemmers. The tool further supports no reduction (i.e. identification only of the exact matching words) and repeated application of Snowball stemmer for experimental purposes. This feature allowed our research team to make an informed choice for the use of Porter's Snowball stemmer.
On top of these features, MOREEVER provides basic functionalities for interactive exploratory visualisations in the form of heatmaps and Venn diagrams. Heatmaps, as shown in Figure 5, provide a bird's-eye view of the occurrences of tokens in tales. The featured 3-set Venn diagrams provide a cross-section of the occurrences of tokens across the three national corpora, as seen in Figure 3. Both visualisations are dependent on the choice of vocabulary generalisation.
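The data behind a 3-set Venn diagram of this kind can be derived with plain set operations. The sketch below is an illustration under stated assumptions rather than the tool's code: the function name `venn_regions` is hypothetical, and the stems assigned to each corpus are invented and do not reproduce the results of this study.

```python
def venn_regions(sets_by_name):
    """Partition items by exactly which of the named sets contain them."""
    regions = {}
    for item in set().union(*sets_by_name.values()):
        members = frozenset(
            name for name, items in sets_by_name.items() if item in items
        )
        regions.setdefault(members, set()).add(item)
    return regions

# Invented stem occurrences per national corpus, for illustration only.
stems = {
    "Germany": {"love", "mother", "faith"},
    "Italy": {"love", "mother", "piti"},
    "Portugal": {"love", "piti"},
}
regions = venn_regions(stems)
for members, items in sorted(regions.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(members), sorted(items))
```

Each key names one region of the diagram, so the centre of the Venn diagram corresponds to the key containing all three corpora.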

IV RESULTS
An important part of the results of our approach is the reflective inspection of the produced automated annotation and possible corrections for these. An overall conclusion of this process is that, as expected, the most impactful tokens capture well the values they were intended to match. The most important token that did not correspond to our initial interpretation was "faith". We originally ascribed the label "faith" to the value of "piety", indicating religious devotion. However, a careful examination of our corpus revealed an intriguing trend. The term "faith", contrary to our initial classification, exclusively expressed affiliation with "loyalty", mainly as per the usage patterns in various Grimm tales, particularly in "Faithful Johannes" (a German tale). As a consequence, we ascribe the token "faith" to the value associated with "loyalty".
Another token that provides an interesting example is "father", due to its potential multiple associations. On the one hand, it could represent "caring", similar to "mother", but on the other, it could be a symbol of authority [Hopp et al., 2021]. When exploring the corpora, we found that "father" was predominantly associated with "caring", with a remarkable exception in "The Maiden and the Fish" (Portugal), where one out of four instances appeared associated with authoritative power.
A third, less impactful token we considered was "patient", which was initially intended as associated with "patience" and "kindness". However, an analysis of the corpus found that its usage related exclusively to an individual receiving medical treatment, and we consequently excluded it from our analysis.
Figure 6 shows the references to values by countries, according to the ascribed tokens. A more detailed mapping of occurrences of tokens in particular texts is provided in Figure 5 in the Appendix. From the resulting comparison of clusters across corpora, noteworthy is the one defined around tokens related to "mother". As the Venn diagram in Figure 7 shows, while in our German and Portuguese corpora this token of reference appears together with "brother", in the Italian and Portuguese corpora it also appears in relation to "know". Only in Germany does it relate to "generous". Notably, despite our previous comment regarding "father", this token does not appear in the cluster.

Historical Analysis
Dolores Buttry elucidates the usage of "faith" in Grimm tales to exclusively mean "loyalty", and not "piety". She writes that the related values of faithfulness and loyalty (which are "Treu" and "Treue" in German) have been foundational virtues in Germany since ancient times [Buttry, 2011]. Stories such as "Faithful Johannes", but also "The Frog King", exemplify extreme loyalty towards superiors, illustrating the importance of fidelity and respect for authority in their various manifestations. Buttry characterises the tale of the loyal servant as an enduring archetype, highlighting the recurring appearance of the words "Treu" (faithful) and "Treue" (loyalty, fidelity) in German tales [Buttry, 2011]. She further suggests that, while respect for authority and the sanctity of oaths were nearly universal concepts before these stories were collected, they seem to have retained their vitality and cultural significance particularly in German-speaking traditions. This idea finds further support in one of the few non-German occurrences of "faith" in our corpus, as the label appears in "The Story Of Catherine and Her Fate," a Sicilian tale first collected by Swiss-German folklorist Laura Gonzenbach.
It is also interesting to examine how values manifest in tales from different cultural contexts.In our results, we found that values of "piety" and "empathy" appeared clustered together in Italian and Portuguese tales, but not in German ones.This may be explained by the different religious traditions in all three countries, since both Italy and Portugal were predominantly Catholic regions at the time the tales were collected, while there was a strong Protestant presence in the German territory.Indeed, Jack Zipes [2002] writes that the Grimms' tales portrayed the main values of Protestant ethics and the bourgeois enlightenment.The heroes in their tales are predominantly concerned with self-preservation and the acquisition of wealth, and they assist others, including animals, only when they perceive a potential gain for themselves, demonstrat-ing a calculated approach to empathy and compassion.This model of behaviour, Zipes argues, exemplifies the general Protestant ethic of the time, and so empathy, although occasionally appearing in the Grimms' tales, is not a dominant theme [Zipes, 2002].We may advance the possibility that the differing religious ethos of Italy and Portugal would place more emphasis on empathy as it relates to Catholic piety.

Social Analysis
Frequency analysis shows that tokens such as "mother", "law", "brother", and "love" have a strong presence (more than 100 appearances, see Figure 6) across the three countries under analysis. Based on the elaborated correspondence between tokens and the Theory of Basic Values (see Appendix), the words "mother", "brother" and "love" are connected to Benevolence, and "law" is connected to Conformity. In Germany, the token "justice" also has a strong presence and is connected with the value of Universalism, which stands for the protection and welfare of all people and nature. Considering that the value Benevolence stands for the good quality of social connections between people, and Conformity stands for the preservation of socio-cultural expectations and norms, we could infer that these tales describe several social dynamics. The tales' plots are representative of dynamics among fictional characters that may resemble society, in order to describe the quality of human relationships and the socio-cultural norms in place.
Interestingly, some differences across countries are expressed by the token frequencies related to Benevolence, Conformity and Universalism. For instance, in Germany, "mother" seems to be a stronger reference for the communication of Benevolence than "brother" when compared to Portugal and Italy. Also, "love" seems to be a stronger reference for the communication of Benevolence in Italy than in Germany and Portugal. However, in Germany, we may note that tokens such as "generous" and "cooperation" reinforce the communication and expression of Benevolence in those tales. Turning to the need for rules and social welfare, it seems that in Germany and Italy the token "law" is used more frequently than in Portugal to express the value of Conformity. Finally, the German corpus shows a strong presence of the token "justice" in its tales, which highlights the importance of Universalism in this context and the need to convey respect for human rights and dignity. In sum, while Portugal, Italy and Germany communicate strongly the values of Benevolence and Conformity, it seems that Germany also communicates the value of Universalism. Despite these nuances, it seems that European Values of Benevolence and Universalism are being communicated by the tales across all three countries.
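The kind of frequency comparison discussed in this section can be sketched in a few lines of Python. This is a simplified illustration and not the annotation pipeline used in the study: the stem list, the helper names `count_value_stems` and `frequent_stems`, and the sample sentence are hypothetical, and plain prefix matching stands in for a proper stemmer.

```python
import re
from collections import Counter

# Hypothetical stems for illustration; not the study's token list.
VALUE_STEMS = ("mother", "brother", "love", "law", "justic")


def count_value_stems(text, stems=VALUE_STEMS):
    """Count words that begin with one of the given stems.

    Prefix matching is a crude stand-in for stemming: it also matches
    unrelated words such as "lawn", which is one source of noise.
    """
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        for stem in stems:
            if word.startswith(stem):
                counts[stem] += 1
                break  # attribute each word to at most one stem
    return counts


def frequent_stems(counts, threshold):
    """Return, in alphabetical order, the stems occurring more than `threshold` times."""
    return sorted(stem for stem, n in counts.items() if n > threshold)


# Toy sentence (assumption), standing in for a national corpus.
sample = "The mother loved her brother. The law is the law."
sample_counts = count_value_stems(sample)
print(sample_counts)
print(frequent_stems(sample_counts, threshold=1))
```

Applied to each national corpus in turn, with a threshold of 100, the second function would return the tokens described above as having a strong presence.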

V DISCUSSION AND CONCLUSION
While the provided analysis is open-ended, and the emerging results would require more thorough examination, our early analysis provides some concrete evidence that European Values have been a long-standing element in European cultural communication through fairy tales. The corpus analysis across different cultures revealed a significant variety in the representation of values. For example, the affiliation of the token "faith" with "loyalty" rather than "piety," particularly in German culture, illustrates the role of cultural and historical contexts in shaping value representations. Similarly, the differential clustering of "piety" and "empathy" in Italian and Portuguese tales compared to German tales further underscores the influence of religious and socio-cultural contexts on value representation. Interestingly, despite these differences, the analysis revealed a strong commonality across all three cultures, pointing at the communication of European Values through tales. Tokens associated with Benevolence, Conformity, and Universalism manifested frequently across the fairy tales of all three countries. This finding is particularly noteworthy because it suggests a strong shared cultural understanding and expression of these values across European literary production and, possibly and by extension, across European societies, thus hinting at the existence of a pan-European cultural memory.
We have identified clear limitations in our approach. Working at the syntactic level, both in terms of stemming and static word embeddings, limits the possibility of capturing nuances, and this introduces some noise into the analysis. However, contrary to our expectations, our detailed analysis by means of in-depth close reading revealed that ambiguities are a noteworthy exception rather than the norm. This holds to the extent that in none of these cases did a token's semantic ambiguity amount to a dichotomy rather than an outlier, which could have undermined the general results.
The focus on explicit references, unsurprisingly, resulted in an inability to annotate tokens such as "democracy" in the tales, as they were only implicitly referenced. Contextual language models are also able to capture indirect relatedness from the context [Montanelli and Periti, 2023]. This has also been attempted in the context of values, notably in the ValueEval competition [Kiesel et al., 2023, Ferrara et al., 2023, Papadopoulos et al., 2023]. However, such approaches are undermined by the variance of value perceptions among humans. Efforts to annotate values for a ground truth chronically suffer from poor agreement rates. In particular, when employing an even number of annotators, Hoover and colleagues report ties (disagreement) above 60% across more than 6000 tweets [Hoover et al., 2020]. When annotating arguments for values, Kiesel and colleagues report agreement with a Krippendorff's α of 0.49 [Kiesel et al., 2022], which is well below the 0.667 that Krippendorff calls "the lowest conceivable limit" [Krippendorff, 2004]. Furthermore, when using contextual word embeddings, the required corpus sizes make an approach that combines pre-training and fine-tuning necessary. Considering this, we believe special attention should be paid to the possibility that the pre-trained embeddings may introduce biases unrelated to the corpus under study.
This work provides a foundational understanding of how European Values are represented in literary texts and highlights the potential of computational linguistics in cultural studies. This study encourages further interdisciplinary research in the fields of literary studies, cultural analytics, and computational linguistics to expand our understanding of cultural values and their historical evolution.
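To make the agreement figures above more tangible, the following sketch computes Krippendorff's α for nominal labels from a coincidence matrix, using α = 1 - D_o / D_e, where D_o is the observed and D_e the expected disagreement. The function name `krippendorff_alpha_nominal` and the toy annotations are assumptions for illustration; this is not the procedure or the data of the cited studies.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list of items; each item is the list of labels that the
    annotators assigned to it (missing annotations are simply omitted).
    """
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # a single label cannot be paired with another
        # Every ordered pair of labels from different annotators counts 1/(m-1).
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b] for a in totals for b in totals if a != b)
    if expected == 0:
        raise ValueError("alpha is undefined when fewer than two labels are paired")
    return 1.0 - (n - 1) * observed / expected


# Toy annotations (assumption): two annotators, three items, one disagreement.
toy_units = [["a", "a"], ["a", "b"], ["b", "b"]]
alpha = krippendorff_alpha_nominal(toy_units)
print(round(alpha, 3))
```

In this toy case a single disagreement among three items already yields a value of 4/9, below the 0.667 limit mentioned above, which illustrates how demanding the measure is for small label sets.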

Figure 1: Theoretical model of relations among ten motivational types of values [Schwartz, 2012].

Figure 4: Screenshot of a browsing page from MOREEVER, the bespoke web instrument used to review the produced annotations. Another view shows a clickable heatmap (Figure 5 in the Appendix), which allows for a distant reading view.

Figure 5: Counts of identified occurrences of stemmed tokens across the texts of the three corpora. An interactive version of this heatmap is available in MOREEVER. In it, clicking on a number takes you to the corresponding text for easier review.

Figure 6: Frequencies of identified occurrences of stemmed tokens across the three corpora. A more detailed heatmap between texts and labels is available in Figure 5 in the Appendix.

Figure 7: An illustration of the degree of overlap across the three national corpora for the token "mother". Note that the visible tokens are stemmed.
, and stability of society, of relationships, and of self.
Tradition: Respect, commitment, and acceptance of the customs and ideas that traditional culture or religion provide the self. Maintaining and preserving cultural, family or religious traditions.
Conformity: Restraint of actions, inclinations, and impulses likely to upset or harm others and violate social expectations or norms.
Self-Direction: Independent thought and action: choosing, creating, exploring.
leading to the establishment of a broad set of European Values. Values such as human dignity, freedom, democracy, equality, rule of law, and human rights have been declared as the values of the European Union, to form "a society in which inclusion, tolerance, justice, solidarity and non-discrimination prevail" [EU, 2020]. Based on several empirical

Table 2: Descriptive statistics of the corpora. When we refer to tokens, we mean the ones that were identified by our automated annotation process. The complete list of included texts is available in Table 3 in the Appendix.