Version Variation Visualization (VVV): Case Studies on the Hebrew Haggadah in English

The following case study illustrates how ‘Version Variation Visualisation’ (VVV) software tools were successfully used to highlight previously unnoticed linguistic features in a corpus of 50 differing English-language (re)translations of the Hebrew Passover Haggadah. VVV's visualizations facilitate overviews, close reading, and navigation among versions. Its ‘Eddy and Viv’ algorithms enabled the authors to identify outliers (unusual versions) within the corpus, and pointed them to specific segments in the source text that generated the most variation among the translations. Such patterns could then be explained using close reading techniques with the help of cultural-historical background information. VVV is shown to be a useful tool in analysing multi-translation corpora.


INTRODUCTION
The 'Version Variation Visualization' project has developed online tools to support comparative, algorithm-assisted investigations of a corpus of multiple versions of a text, e.g. variants, translations, adaptations (Cheesman, 2015, 2016; Cheesman et al., 2012, 2012-13, 2016; Thiel, 2014; links: www.tinyurl.com/vvvex). A segmenting and aligning tool allows users to 1) define arbitrary segment types, 2) define arbitrary text chunks as segments, and 3) align segments between a 'base text' (a stabilised iteration of the translated source text), and translated versions of the text. (The 'source text' of multiply translated texts is typically unstable, with historical variations in differing manuscripts and printed editions; our system requires the initial establishment of a 'base text' to serve as a stable basis for comparisons among translated versions.) The alignment tool can automatically align recurrent defined segment types in sequence. Several visual interfaces in the prototype installation enable exploratory access to parallel versions, to comparative visual representations of versions' alignment with the base text, and to the base text visually annotated by an algorithmic analysis of variation among versions of segments. Data can be filtered, viewed and exported in diverse ways. Many more modes of access and analysis can be envisaged. The tool is language neutral. Experiments so far mostly use modern texts: German Shakespeare translations. Roos is working on a collection of approx. 100 distinct English-language translations of a Hebrew text with ancient Hebrew and Aramaic passages: the Haggadah (Roos, 2015).

I THE HAGGADAH
On the evening before Passover (Pesach), Jews gather at home to celebrate a festive ceremony and meal with family and friends, to commemorate the biblical Exodus of the Jewish people out of Egypt. They eat the traditional matza, drink the prescribed four glasses of wine, and read from the Haggadah. This is a Hebrew text with instructions for a 15-phase ceremony: what to say or sing, which acts to perform, in what order, when to eat or drink what, etc. For example, in phase 1 the blessing over the wine is made, phase 11 is the meal, and in phase 12 a specially broken piece of matza, called the Afikoman, is consumed. All participants hold a printed copy of the Haggadah. Typically, many different versions (plural: Haggadot) are present in the room.
The Hebrew Haggadah text is a compilation of Bible quotes, excerpts from traditional rabbinical teachings (Mishnah, Midrash), Exodus narrative, explanations of the festival's history, and Passover 'laws'. The text probably dates back to 200-300 CE. The oldest complete manuscript dates to the 10th century CE. Thousands of variant Hebrew-language versions are extant, in manuscript and print. There are translations in over 40 different languages [Yudlov, 1997]. The first English-language version appeared in London in 1770. Countless more have since appeared. [Yudlov, 1997] catalogues 823 English-language editions to 1960. The rate of production of new ones has since been accelerating exponentially. Most of these are retranslations, variously dependent on precursors.
Roos is compiling a digital corpus of English-language Haggadah translations, and using digital tools to compare them and visualize the differences. He aims to explain the differences in terms of their cultural-historical contexts, and so shed light on translators' minds and motives.

II VERSION VARIATION VISUALIZATION (VVV): Eddy and Viv Algorithms
VVV compares multiple retranslation documents at segment level, and visualizes the similarities and differences, in order to facilitate overviews, close reading, and navigation among versions. An algorithm called 'Eddy' ('∑D') quantifies variation among versions of a base text segment, in order to distinguish more and less predictable or distinctive versions. An algorithm called 'Viv' ('variation in variation') aggregates Eddy metrics, and projects the result onto the base text segment, in order to distinguish more and less variously translated segments. The algorithms can be applied to the aligned corpus or any selected sub-corpus.

Eddy
The Eddy value assigned to a particular version or section indicates its "strangeness" as compared to other versions. Essentially, Eddy assigns lower metrics to wordings which are closer to the notional average, and higher metrics to more distant ones. So, Eddy ranks versions on a cline from low to high distinctiveness, or originality, or unpredictability. It sorts common-or-garden translations from interestingly different ones.
Eddy can be implemented in various ways. Our standard approach is: Each word in the corpus word list [where corpus means the corpus of aligned segment versions] is considered as representing an axis in N-dimensional space, where N is the length of the corpus word list. For each version, a point is plotted within this space whose co-ordinates are given by the word frequencies in the version word list for that version. (Words not used in that version have a frequency of zero.) The position of a notional 'average' translation is established by finding the centroid of that set of points. An initial 'Eddy' variation value for each version is calculated by measuring the Euclidean distance between the point for that version and the centroid. [Flanagan in: Cheesman, Flanagan, and Thiel, 2012-13] No stop words are excluded; no stemming, lemmatisation or parsing is performed. Users can also select a more primitive arithmetical formula, and one using Dice's co-efficient.
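The standard approach described above can be sketched in a few lines of Python. This is a minimal illustration, not the VVV implementation: the function names `tokenize` and `eddy_values`, the lowercasing, and the whitespace tokenization are assumptions made for the example, and only the 'initial' Eddy value (Euclidean distance to the centroid) is computed.

```python
import math
from collections import Counter


def tokenize(text):
    """Assumed tokenizer: lowercase and split on whitespace.

    In line with the description, no stop words are removed and no
    stemming or lemmatisation is applied.
    """
    return text.lower().split()


def eddy_values(versions):
    """Return the initial Eddy value for each version of one segment."""
    counts = [Counter(tokenize(v)) for v in versions]
    # The corpus word list: one axis per distinct word.
    vocabulary = sorted(set().union(*counts))
    # One point per version; unused words have a frequency of zero.
    points = [[c[w] for w in vocabulary] for c in counts]
    # The notional 'average' translation is the centroid of the points.
    n = len(points)
    centroid = [sum(column) / n for column in zip(*points)]
    # Eddy: Euclidean distance between each point and the centroid.
    return [math.dist(p, centroid) for p in points]


versions = [
    "the wise son",
    "the wise son",
    "the intelligent son",
]
values = eddy_values(versions)
```

In this toy corpus the third version receives the highest value, because 'intelligent' is used by no other version while 'wise' is shared by the majority.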
In the VVV 'Eddy and Viv' view, when a base text segment is selected, the segment-versions are displayed in a scrollable list in Eddy order, with associated metrics, and with a visual representation of relative Eddy value. The list can be re-ordered to display by date, translator name, or segment length in characters. Eddy values can also be displayed, explored, and exported in the form of charts and tables.
Examples of Eddy use will be provided in section 4.1.

Viv
Viv aggregates the Eddy values for a segment. In our standard approach, Viv is the average of Eddy values of version-segments. Users can also select Viv as the standard deviation of Eddy values. Viv indicates where translators differed most or least, in relation to the base text. (This function is comparable with the amount of layering associated with a word or string of words in the TRAViz visualization: [Jänicke et al., 2015].) In the VVV 'Eddy and Viv' view, Viv is represented on the base text by a tonal underlay, varying with the relative value of each segment. Metrics can be viewed by brushing a segment. Floor and ceiling values can be altered to facilitate surveying the base text.
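As a rough illustration of the two Viv options and of the floor and ceiling settings, the following Python sketch may help. The names `viv` and `underlay_shade` are hypothetical, the choice of population (rather than sample) standard deviation is an assumption, and the linear mapping from Viv to a shade between 0 and 1 is only one plausible way to drive a tonal underlay.

```python
from statistics import mean, pstdev


def viv(eddy_values, mode="mean"):
    """Aggregate the Eddy values of all versions of one base text segment."""
    if mode == "mean":
        return mean(eddy_values)
    if mode == "stdev":
        # Assumption: population standard deviation.
        return pstdev(eddy_values)
    raise ValueError("mode must be 'mean' or 'stdev'")


def underlay_shade(value, floor, ceiling):
    """Map a Viv value to a shade in [0, 1]; 1 is the darkest tone.

    Values below the floor or above the ceiling are clamped.
    """
    if ceiling <= floor:
        raise ValueError("ceiling must be greater than floor")
    clamped = min(max(value, floor), ceiling)
    return (clamped - floor) / (ceiling - floor)


# Hypothetical Eddy values for two base text segments.
eddy_by_segment = {
    "segment_a": [0.5, 0.5, 1.0],
    "segment_b": [2.0, 3.0, 4.0],
}
viv_by_segment = {name: viv(vals) for name, vals in eddy_by_segment.items()}
shades = {
    name: underlay_shade(v, floor=0.0, ceiling=4.0)
    for name, v in viv_by_segment.items()
}
```

Raising the floor or lowering the ceiling stretches the remaining range across the full tonal scale, which is one way such settings could make moderate differences between segments easier to see.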
Segments can be filtered in various ways (text search, Eddy/Viv ranges, segment lengths, etc.), in the base text and in the version corpus or subcorpora, and texts can be exported in CSV tables with associated Eddy and Viv metrics.
Examples of Viv use will be provided in section 4.3.
As one reviewer commented, Eddy/Viv is not the only possible approach to comparing differing translations/versions. Measuring the overlap of words (or lemmas) among segments would achieve the same effect. Such a method would also need to calculate a centroid and distances from it.
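A word-overlap measure of the kind the reviewer had in mind could look like the sketch below. It is an assumption-laden illustration and not part of VVV: it uses Dice's coefficient on word sets and, instead of an explicit centroid, takes each version's mean dissimilarity to all other versions as a stand-in for 'distance from the average'. The names `dice` and `overlap_distinctiveness` are hypothetical.

```python
def dice(a, b):
    """Dice's coefficient between two word sets: 2|A & B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # two empty segments are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


def overlap_distinctiveness(versions):
    """Mean Dice dissimilarity (1 - Dice) of each version to all others."""
    word_sets = [set(v.lower().split()) for v in versions]
    scores = []
    for i, words in enumerate(word_sets):
        others = [
            1 - dice(words, other)
            for j, other in enumerate(word_sets)
            if j != i
        ]
        scores.append(sum(others) / len(others) if others else 0.0)
    return scores


versions = [
    "the wise son",
    "the wise son",
    "the intelligent son",
]
scores = overlap_distinctiveness(versions)
```

On this toy input the ranking agrees with the Euclidean sketch of Eddy: the version with the unshared word scores highest. Unlike word-frequency vectors, word sets ignore how often a word is repeated within a version.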
VVV is specifically created to compare numerous retranslations of the same source text, making it ideal for research into Haggadah version variation.It can help a researcher identify variations, and present them to an audience.

III RELATED WORK
There has been some digital work on larger retranslation corpora, involving works of wide intrinsic interest, but none designed to facilitate access to multiple translations, and the translated work, together with algorithmic analyses. [Jänicke et al., 2015] take an in some ways similar approach, but their 'TRAViz' interface offers a very different mode of text visualization, is monolingual (shows no translated text), and works best with more limited variation and shorter texts. Similarly, Juxta-Commons, CollateX and Stemmaweb are monolingual, do not rate the "strangeness" of variants in comparison to all others (Eddy value) and do not create a heat map in the source text (Viv value) revealing which ST instances generated most target text variants, two of the most powerful VVV features. It is especially these comparisons between the ST and the TT that this research focuses on. Whereas the above software highlights differences in the versions, VVV highlights how these differences are connected to the ST, thus helping us to explain the reasons for the variants.
[Lapshinova-Koltunski, 2013] describes a parallel multi-translation corpus designed to support computational linguistic analyses of differences between professional translations, student translations, MT outputs and edited MT outputs. [Shei and Pain, 2002] proposed a similar parallel corpus, with an interface designed for translator training. These projects only offer access to filtered segments of the text corpus, and do not envisage exploring variation among retranslations. [Altintas, Can, and Patton, 2007] used two time-separated (c. 1950, c. 2000) collections of published translations of the same seven English, French or Russian literary classics into Turkish, in order to quantify aspects of language change. This raises the question whether such translations 'represent' their language. Corpus-based Translation Studies [Baker, 1993; Kruger et al., 2011] has established that translated language differs from untranslated language. We also know from decades of work in Descriptive Translation Studies [Morini, 2014; Toury, 2012] that retranslations vary for complex genre-, market-, subculture-specific and institutional factors, and individual psychosocial factors, involving the translators and others with a hand in the work (commissioners, editors), and their uses of resources including source versions and prior (re)translations.

IV USING VVV WITH HAGGADAH SAMPLES
One section of the Haggadah concerns four sons who represent four different attitudes to Judaism. They each ask a question which characterizes their attitude towards the Passover festivities, and the text then suggests how to respond to these questions. This section of the Hebrew source text has 126 words and is divided into six parts: (1) introduction; (2) characterization of each of the four sons; (3)-(6) one paragraph for each son, with his question and the response. The source text contains twelve manually defined segments: units of meaning.
60 different translations of this section were uploaded to VVV, segmented and manually aligned with the Hebrew base text. Each translation contains between five and twelve of the source text segments, because translations sometimes disregard segments of the ST or merge two ST segments into one TT segment.

Exploring with Eddy
In part (2), characterizing the four sons, most translators use straightforward terms: 'wise', 'wicked', 'simple', and 'one who does not know how to ask'. Some are more creative. Eddy highlights certain translations as 'strange'. VVV automatically rates each version, thus ranking all 60 versions from most common to most exotic. The corpus includes C18 and C19 versions, but none appear in Table 1. Almost all high Eddy versions date from the 1940s and after. The general retranslation trend is towards greater variation, probably at least in part because of copyright issues and a need to differentiate in order to stand out in the ever-growing crowd.
As a distant reading tool, VVV's Eddy values reveal that, in comparison to other versions of the same period, the 1906 translation (WILROS) is an early outlier, and therefore worth further investigation. Close reading reveals that the language use in this particular translation is indeed quite extraordinary, with phrases such as "Israel's exode from Egypt", "and took cognizance of them", "of which Jerusalem is emblematic", "cut the sea in twain", etc. Historically and culturally, the translator William Rosenau was a radical leader of Reform Judaism, and he would eventually edit the revised edition of the Reform Union Haggadah, with a thoroughly rewritten source text. The version in our corpus still adheres to the traditional ST, but Rosenau's radicalism clearly already shines through in his translation and is picked up by VVV.
It is also intriguing that no version is consistently in the highest 5 for all four sons (see Table 2). A translation's relative Eddy varies, as we will see in the next section.

Eddy Variation Chart
The poet Abraham Regelson published several Haggadot. Roos's collection includes one from 1944 (REGFORST1) and another from 1952 (REGFORST2). VVV's Variation Chart ('Eddy Overview') helps us compare these two translations (see Figure 1). This chart plots each version's Eddy values on the y-axis, for segments in sequence on the x-axis. The user can select which versions' graphs to display or hide, select an area to zoom in on, and hover over a node to display base text and version (as is shown in Figure 1). In Figure 1 we see Regelson using higher Eddy-value language (more distinctive language in relation to the corpus) in 1952 than in 1944. One exception is highlighted, in part (2) of the passage (discussed in Section 4.3). Table 3 compares the two translations.

Table 3 (excerpt), Regelson 1944 (REGFORST1):
What says the Simple Son? - "What is all this about?" Therefore, say to him: "With might of hand,

Table 3 (excerpt), Regelson 1952 (REGFORST2):
Praised be God, praised be He. Blessed be He who gave the Torah to His people, Israel. Blessed be He. On the subject of the Passover service the Torah speaks of - FOUR SONS
One is intelligent, one is ill-mannered, one is indifferent, and one is not even able to ask a question.
1. The INTELLIGENT son asks: "What is the meaning of all the Passover customs and ceremonies, the rules and rites which God has commanded?" You will explain to him all the traditions of Pesach down to the last detail of the Afikoman.
2. The ILL-MANNERED one asks: "What's the sense of all this business of yours?" Yours, he says; and none of his. By refusing to identify himself with his people he denies a basic principle of our religion. You may fling this in his teeth: "I do this because of what the Lord did for me when He rescued me from Egypt." Me, not him. Let him know that had he been there, by denying his brothers he could not have been saved.

Viv: Variation in Variation
In VVV's 'Eddy and Viv' interface, source text segments are highlighted according to their Viv value: the higher the value, the darker the underlay tone. We can visually identify which source text segments produced the most variant translations, or 'read the original by the light of the translations' [Cheesman, 2015]. The segment with the highest Viv value is the one beginning: 'We may not eat an afikoman…' (underlined above). This is a quote from the Mishnah (oral laws compiled about 200 CE by Rabbi Judah HaNasi, the basis for the later Talmud). Already in the Talmud (c. 500 CE) the correct meaning of the term afikoman had become obscure and was disputed. In Talmudic traditions, afikoman (Hebrew אפיקומן) is said to derive from Greek epikomen or epikomion (επί Κομός), 'that which comes after', variously interpreted as (A) 'dessert', or (B) 'after-dinner entertainment/revelry', and additionally as (C) a metaphor. The translation from the Sefaria site cited above clearly favors the option that it is a form of dessert.
But our corpus contains five different interpretations in the context of the answer to the wise son: (A1) a proscribed dessert; (A2) the prescribed dessert eaten at the 12th phase of the Passover ceremony (the piece of matza called the Afikoman); (B1) proscribed excessive subsequent entertainment (distinguishing Passover from pagan celebrations); (C1) prescribed teaching of all of the (Passover) law: because the afikoman is the last law in the section on Pesach, "We don't leave anything until after the afikoman" could mean "we don't stop studying until we have learned everything"; (C2) prescribed sacrifice of a Passover lamb.
There is also a sixth option for translation: leaving afikoman to stand in the target text, uninterpreted.
This range of options explains the segment's high Viv value. Some of the variant English versions, low in the Eddy value list, are shown in the VVV 'Eddy and Viv' view in Figure 3.

Viv and Eddy values are calculated according to manually set segments (meaningful units). These can be single words, phrases, sentences, paragraphs or even the whole text. By creating a whole-text segment, one could easily aggregate data for a quantitative comparison of whole texts. This way, although no one version consistently deviates from the norm for all four sons (as shown above), it is possible to ascertain which translation as a whole lies furthest from the corpus average. It should be noted, however, that by doing this, the Viv value becomes irrelevant: with a single segment there is only one Viv value, and nothing to compare it with. The version that scored the highest overall Eddy value was the 1974 Polychrome Haggadah by Jacob Freedman, whose translation is extremely verbose and elaborate.

[...] 1) expands to an extraordinary degree. This is the verbose and elaborate translation by Jacob Freedman mentioned at the end of section 4.3. The 1955 and 1967 versions are also expansive. They did not appear in Table 1. Figure 4 now explains why: both omit the segments corresponding to the ST's part (2), the four segments naming the four sons, shown in Table 1.

Conclusion
Using VVV can yield valuable insights when comparing multiple variants, and is also useful for presenting findings visually. Manually comparing different versions becomes difficult with larger corpora. When Viv is highlighted in base text segments, even researchers with no knowledge of a language (in this case Hebrew) can identify the parts that warrant closer inspection.
VVV offers a useful range of visualization modes, but many more can be developed. Future research planned on the Haggadah includes comparing the language use of translators when translating Hebrew and Aramaic text passages, comparing the translations of biblical Hebrew

Figure 1. Comparing REGFORST1 and REGFORST2 in the Eddy Variation Chart

Figure 2. Partial Screenshot of Eddy and Viv view of the Four Sons Passage

Figure 3. Eddy and Viv view of the Four Sons Passage

4.4 Parallel View Visualization: Alignment Maps
Parallel view visualizations include a distant overview of segment alignments between base text and versions: an 'alignment map'. Successive segments of the base text are represented as a vertical 'barcode': the thickness of a bar represents segment length in words. Segments of a version are represented in the same format. Alignments are represented by lines connecting base text and version. This enables rapid identification of translators' editing decisions: omission, addition, reduction, expansion, and transposition.
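The editing decisions that an alignment map makes visible can also be derived directly from the alignment data. The Python sketch below is a hypothetical illustration, not VVV code: the function name `describe_alignment`, the data layout (segment lengths in words plus a list of base-to-version index pairs), and the expansion/reduction threshold `ratio` are all assumptions. Because Hebrew and English word counts are not directly comparable, any such threshold would need calibrating for the language pair.

```python
def describe_alignment(base_lengths, version_lengths, links, ratio=1.5):
    """Classify editing decisions for one version.

    base_lengths / version_lengths: segment lengths in words.
    links: list of (base_index, version_index) alignment pairs.
    ratio: assumed length ratio beyond which a link counts as an
           expansion or a reduction.
    """
    linked_base = {b for b, _ in links}
    linked_version = {v for _, v in links}
    report = {
        # Base segments with no counterpart in the version.
        "omission": [b for b in range(len(base_lengths)) if b not in linked_base],
        # Version segments with no counterpart in the base text.
        "addition": [
            v for v in range(len(version_lengths)) if v not in linked_version
        ],
        "expansion": [],
        "reduction": [],
        "transposition": [],
    }
    for b, v in links:
        if version_lengths[v] >= ratio * base_lengths[b]:
            report["expansion"].append((b, v))
        elif version_lengths[v] * ratio <= base_lengths[b]:
            report["reduction"].append((b, v))
    # Two links cross on the map when base order and version order disagree.
    ordered = sorted(links)
    for i, (b1, v1) in enumerate(ordered):
        for b2, v2 in ordered[i + 1:]:
            if b1 < b2 and v1 > v2:
                report["transposition"].append(((b1, v1), (b2, v2)))
    return report


# Hypothetical version: base segment 2 is omitted, segment 1 is expanded.
report = describe_alignment(
    base_lengths=[4, 10, 6, 8],
    version_lengths=[5, 30, 7],
    links=[(0, 0), (1, 1), (3, 2)],
)
# Hypothetical version in which two segments swap places.
swapped = describe_alignment([5, 5, 5], [5, 5, 5], [(0, 0), (1, 2), (2, 1)])
```

In the first example the report lists base segment 2 as an omission and the link (1, 1) as an expansion; in the second, the crossing pair of links is reported as a transposition.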

Figure 4 (created from screenshots) shows ten examples of the 'Four Sons' passage. The unchanging base text is on the left, the version on the right of each 'map'. The afikoman segment is highlighted.

Table 1. Names of the Four Sons: Translations with Lowest and Highest Eddy. The left column gives the original Hebrew and the commonest translation (lowest Eddy value). The second column gives the five translations with highest Eddy values (rounded Eddy figures given in column 3): outliers in comparison to all other versions in the corpus, worth further exploration.

Table 3. Comparing Regelson 1944 and Regelson 1952.

Table 3 (excerpt), Regelson 1944 (REGFORST1):
the Lord hath taken us out of Egypt, from the house of slavery." But the One Who Wits Not To Ask - it is for thee to open talk with him, as it is said: "And thou shalt tell thy son on that day, saying: 'This is on account of what the Lord did for me when I went forth from Egypt.'"

Table 3 (excerpt), Regelson 1952 (REGFORST2):
[3. The] INDIFFERENT one merely asks: "What is this?" Tell him: "With a strong hand God took us out of Egypt where we were slaves."
4. The INCOMPETENT one - get him started by quoting the words from the Bible: "In that day you shall tell your son saying: (Point to the ceremonial dishes.) "All of this is because of what the Lord did for me when I came out of Egypt."

It makes sense to assume that, having in 1944 already translated the Haggadah in a quite straightforward manner, Regelson decided for the 1952 translation to try out different translation techniques, more off the beaten path. That would explain why his 1952 translation scores higher Eddy values. One example is the alliterative naming of the four sons (intelligent, ill-mannered, indifferent, and incompetent), something not found in any of the other versions. On the other hand, having used an etymology-based final comment for the wise son in 1944, Regelson opts for a closing based on very specific Jewish sources for his 1952 intelligent son. We will comment on this further in the next section.