Intertextual Pointers in the Text Alignment Network

The Text Alignment Network (TAN) is a suite of XML encoding formats intended to serve anyone who wishes to encode, exchange, and study multiple versions of texts (e.g., translations, paraphrases), and annotations on those texts (e.g., quotations, word-for-word correspondences). This article focuses on TAN’s innovative intertextual pointers, which, I argue, provide an unprecedented level of readability, interoperability, and semantic context. Because TAN is a new, experimental format, this article provides a brief introduction to the format and concludes with comments on progress and future prospects.

Interoperability of multiple versions within a single project is relatively easy.Many TEI projects have done just that.But TEI offers no mechanism to try to check or enforce cross-project alignment.Four different projects that each encode a version of Aristotle's Categories are likely to come up with as many ways of structuring the transcriptions in TEI.Anyone who wishes to synthesize those four versions, say into an HTML page that aligns the versions in parallel, must spend considerable effort reconciling differences.And every time a new TEI version of the Categories is found, the exercise must be repeated.
Or consider the humble cross-reference.TEI allows many ways to represent a claim that work A passage x is a quotation of work B passage y.But any representation in TEI is almost guaranteed to be meaningless outside a given TEI file or project (Kalvesmaki 2015).TEI code requires both syntactic and semantic interpretation, and is not predictably applicable to other versions of the same work.
I've singled out TEI, but the problem is true for other markup schemes.HTML, TMX, and XLIFF allow users to align multiple versions of the same work, but no effort has been made to try to make data interoperable across projects using the same markup method.1 II A New Approach to Intertexts: The Text Alignment Network I have often wondered what would have happened if the designers of TEI had taken a different approach.Could TEI be constrained to maximize the likelihood that the texts I produce, and the things I want to say about them, will be interoperable with other versions and annotations you produce?Can I write intertextual pointers that have value not just for my TEI transcription, but to the ones other people have made for the same work?
Over the last several years I have worked to answer "yes" to both questions by means of the Text Alignment Network (TAN), a suite of XML encoding formats intended to serve anyone who wishes to encode, exchange, and study translations, paraphrases, adaptations, quotations, and other varieties of text reuse.The TAN syntax is meant to be maximally readable and editable by both humans and machines.The RELAX-NG and Schematron schemas that drive the validation routines not only check the syntactic and semantic interoperability of a text, but provide feedback and help (via Schematron Quick Fixes) in the context of an XML editor such as oXygen. 2AN is a format, not a tool.It does not make intertextual pointers for you.It does not try to identify quotations or to reconcile differences in alignment.But it lays a foundation for such efforts.The extensive library of XSLT functions upon which TAN validation is built definitively interpret the format and provide a starting point for further use (some examples are discussed at the end of this article).
TAN is not intended to replace other formats such as TEI.In fact, TAN is built partly on a customization of TEI All.Rather, TAN is meant to complement other formats, because it has been designed around a set of distinctive principles:  Annotations should always be separated from what is annotated (stand-off markup): Anything that is not a transcription but comments on one-e.g., remarks on quotations, the analysis of the morphological parts of a given word-should stand apart from the transcription itself.Such stand-off annotation brings many benefits not available with inline annotation.Multiple scholars can edit and remark on the same transcription independently and collaboratively.Any two annotations may point to overlapping ranges-one of the Great Impermissibles in a single XML file-without compromising XML rules. 3Further, stand-off files can be combined and transformed into any number of derivative inline files.The reverse is not always true.A heavily marked up TEI file is like a baked cake.Each type of tag set is like a set of ingredients or form of preparation.You might get a TEI file shaped baked into the shape of a star, sprinkled with walnuts (I'm allergic to them).But to make it the cake you want, you have to unbake it, remove the ingredients you don't like, add the ones you do, then bake it again.In adopting stand-off annotation I argue that we should not be baking cakes.Rather we should be providing simple ingredients, so others can bake cakes without having to first unbake them.
 One file per task : As a corollary to the previous principle, a given file should be focused upon a single task, and kept simple.Each version of a work should be in its own transcription file, and restricted more or less to the plain text.An index of quotations should be in a separate file, and simply point to the transcription files rather than trying to mark them up.Most important is the requirement that every TAN/TEI transcription file must be restricted to a single version of a single conceptual work found on a single text-bearing object, segmented and labeled according to a single reference system (canonical or not  Content and assertions should be deeply validated : It is fine, of course, that our files' structure is checked for validity, but the content should be checked as well, when possible.If a word is marked as having particular grammatical features, the code we use should comport with the rules we have adopted for the language (e.g., that there should be no such thing as a first-person nominative conjunction).Text should be checked to make sure it is normalized (according to Unicode NFC).In a given element with both @from and @to, the date in the latter should not precede the one in the former.
In TAN, more than one hundred such rules are checked, at different depths.Users are permitted to specify how deeply these checks should be made, to save time during editing.The underlying XSLT functions have been designed to provide contextual help to editors through Schematron Quick Fixes (SQFs).If an SQF-aware XML application (e.g., oXygen) flags text as not normalized, the software will give the user the option to automatically replace the text with a normalized version, saving perhaps hours of hunting and retyping. We should be able not just to collaborate but to communicate across projects : As textual scholars, we should be able to work not just independently but collaboratively.Good collaboration should include some form of communication.Suppose you have published on your website a TAN transcription that I find important and I want to refer to it in my own TAN file.To do so, I must specify when I last consulted your file (via @accessed-when).Now, suppose you find errors in your file.You make the correction, and explain and date this change in your file.The next time I validate my dependent TAN file, I will be informed or warned that your source file has since changed.Suppose your alteration is serious.You can specify how serious it is, and describe the correction.My validation process will not only mark my file at the level of error you've specified but convey your message.I can ignore that message if I want, or I can take into account the changes you have made.If I update @accessed-when, I can squelch those older messages.
Much more could be said about the design principles, but such an effort would wind up repeating the comprehensive documentation already available at the project's website, http://textalign.net.All material is released under a Creative Commons Attribution 4.0 license, in the hopes that the project will outlive and outgrow me.

III Pointers 1 Current Pointing Systems
I wish here to focus on TAN pointers, since they are at the heart of intertextuality, the theme of this issue of the journal.TAN pointers are most similar to those defined by Canonical Text Services.But, or so I argue, they provide a new level of readability and refinement.TAN pointers are provide an unprecedented level of integration with the semantics of the source and the target.And they show that the syntax we already use in the notes of our articles and books to quote primary sources is adequate for digital texts.We don't need long, cumbersome URNs ; we can use shorthand forms that are already familiar.
First, each of these pointers have limited readability to any human who needs either to write or to read them.Different people respond differently to these kinds of pointers, based on familiarity.The URLs (HTML and XPointer) and IDRefs (HTML, XPointer, TEI) might be recognizable, even if long.The pointers dependent upon XPath (TEI) will probably not be clear to those who do not work in XSLT or XQuery.All of them, though, are pretty long and cumbersome.Few of us type them out ; we normally rely upon cutting and pasting.
Second, each pointer has coinage in only one file.The HTML, XPointer, and TEI examples point to a single node within a single document.Even if they could be tried on other files (like the XSLT example, which is meant to be applied to an unspecified number of files), the intention built into the pointer will be respected only if those other files have been encoded using the same syntactical markup.But the syntax is necessarily arbitrary, and people working independently will come up with their own naming systems.So any success will be a happy but rare coincidence.
Third, the underlying semantics are ignored.Even if we humans intuit their meaning, the terms « chapter1 » and « chapter » are not computationally defined.The syntax may be computable, but the computer has no means to fetch the meaning (semantics) behind that syntax.That is, a computer does not « understand » (we seem forced to anthropomorphize the algorithms we write) that « chapter » refers to a such-and-such a type of textual division.
The examples above illustrate simple, one way pointers.Now, consider the humble cross-reference, which requires at minimum a double pointer, set in the context of a sentence.Suppose we want to encode the idea « Matthew 4:15-16 is a quotation of Isaiah 8:23-9:1.» (Example A.) And suppose we want to use one of the traditional pointers above.The first issue, readability, still holds, although its greatest sin is its nuisance.The second issue, interoperability, presents more serious obstacles, since the sentence we want to requires us to use a syntax that transcends any individual version of the work.But all the pointing techniques above are restricted to individual files.And the third issue still holds : none of the pointing mechanisms are able to preserve the semantics behind the references, e.g., that the grammatical subject refers to a book of the New Testament.
Suppose we wanted to be more specific in our assertion.After all, there are parts of Isaiah 8 :23 that are not part of the quotation, which doesn't begin until around the 15th word, according to the Rahlfs edition of the Septuagint.Furthermore, a number of words or phrases differ between the quoting and quoted source.Revising Example A to be more specific could get pretty complicated.For the sake of illustration, we'll stay simple : « All of Matthew 4 :15-16 is a quotation of Isaiah 8 :23 word 15 through the end of Isaiah 9 :1.» (Example B.) Strangely, the traditional pointing methods above are better suited to Example B than they are to Example A. That is because the latter was meant to be true independent of any version of the texts.We were flying about in a sort of zero gravity.But now that we are seeking to point to individual words, we must enter the gravitational field of a specific version, at least for the second text (and perhaps the first).We cannot expect a statement true of the Rahlfs edition of the Septuagint to be true for the Brenton one (where the quotation doesn't begin until the sixth word of Isaiah 9 :1 !).Which means that our second concern above, about interoperability, is not as relevant, since we must restrict our assertion to specific versions, which is what the pointing languages above were designed for.(But all the other concerns still hold, and even the issue of interoperability of multiple files handling the same version.) I have framed the problem in this way because it seems to me that all textual pointers fall into one of two categories.A pointer either transcends any particular version or it does not.
Pointers from the first category rely upon a reference system that must necessarily be version-agnostic, even if that reference system originates from features endemic to a specific version.For example, the page numbers, column letters, and line numbers of Bekker's edition have been used as a standard reference scheme for Aristotle's works, applied to later print editions.But the typesetting in those later editions results in lineation that differs considerably from Bekker's.Typesetters have not tried to match exactly Bekker's line beginnings and ends, so a Bekker number is oftentimes only a good approximation.The Bekker reference system has transcended Bekker's edition.
In the second category are pointers that delve into a particlar version.The pointer may begin with a version-agnostic component (e.g., Isaiah 8 :23…) but they must terminate in something that does not (…word fifteen).Or it the pointer may dispense with the versionagnostic reference altogether, and rely completely upon a reference system that is native to only a single version (e.g., quire 43, fol.4r, col.4, line 43, character 11, which is where Isaiah 8 :23 begins in Codex Sinaiticus).

Version-agnostic pointers
Let us start with version-agnostic pointers such as those in Example A. Elsewhere in this special issue we have read about Canonical Text Services URNs,5 which has been to my knowledge the first attempt to provide a digital language for pointers that transcend any specific version, and so strive to admit syntactic interoperability.So for Example A, we would begin with CTS URNs such as this (adopting the somewhat problematic catalog numbering of the Thesaurus Linguae Graecae [TLG]): 6  urn:cts:greekLit:tlg0031.tlg001:4.15-4.16 urn:cts:greekLit:tlg0527.tlg048:8.23-9.1These URNs reliably point to Matthew 4:15-16 and Isaiah 8:23-9:1 independent of any mechanism, tool, or server. 7They point to whatever version you can find of the cited works.But they are still just as cumbersome, opaque, and unreadable as the traditional pointers discussed above.And although they are syntactically interoperable, they are not semantically so (Kalvesmaki 2015).Nevertheless, it is admirable that we can pack so much into a single URN.A CTS URN is a good start to expressing Example A. To finish the job, we might adopt the data model of the Resource Description Framework (RDF), which defines three-part statements, made of subject, predicate, and object.Here is one way of using CTS URNs to create an RDF statement for Example A (using the Turtle syntax, and adopting a URL that has been minted to define the concept « quotation ») : urn:cts:greekLit:tlg0031.tlg001:4.15-4.16http://purl.org/spar/fabio/Quotationurn:cts:greekLit:tlg0527.tlg048:8.23-9.1 .
The TAN approach also uses URNs (IRIs), but in the interest of readability, those IRIs have been moved into the background.They are retrieved either by looking up the value of an id value, or by calculating the value of a node from its context.Splitting the work this way allows editors to spend most of their time working with a syntax that close to ordinary conventions.Here is how Example A could be rendered in a TAN annotation file: <claim verb="quotes"> <subject work="nt-grc" ref="Mt 4:15-16"/> <object work="lxx" ref="Isa 8:23-9:1"/> </claim> <claim> indicates that a claim or assertion is being made. 8<subject>, @verb, and <object> correspond to RDF's subject, predicate, and object.The arbitrary value of @verb is defined in the <head> by one or more IRIs.Each @work's value is defined by the corresponding source file, and can also be converted to IRI values.The value of @ref points to a specific part of one or more source files that are part of the same work (as defined by the work IRI).The meaning and implications of @ref are calculated via the context of the relevant portions of the source TAN files.
Any TAN/TEI file or fragment may be algorithmically converted to a CTS URN.Unfortunately, the converse is not true.9Because the underlying TAN/TEI transcriptions are devoted to a single version of a single work in a single, standard reference scheme, the values in @ref are unique and are therefore just as reliable as any other method of pointing.Further, because every division in a TAN/TEI transcription must be defined as a particular kind of division, a computer can draw semantic inferences, e.g., that Mt and Isa both identify books, and that the numbers correspond to chapters and verses.
To refer to spans of text, the TAN syntax is more than are CTS URNs.For example, @ref may point to a discontinuous ranges of text, and it may also adopt numeration systems other than Arabic numerals :10 <claim verb="quotes"> <subject work="src1" ref="4 b 5, 5c"/> <object work="src2" ref="12 b-d, f, 15 c"/> </claim> Ranges may be abbreviated economically, which results in a clean syntax that avoids repetition.That is, Mt 4:15-16 is sufficient.You don't need to write Mt 4:15-Mt 4:16.It is assumed that the first part of a range bestows upon any abbreviated second part the contextual hierarchy by which it can be resolved. 11here are many other features that could be added to this example (e.g., certainty, or adverbial modifiers), but explaining them would transfer us from the topic of pointers to that of RDF.We save that much-needed discussion for another day.
The TAN approach to Example B builds upon that of Example A. We assume that we want to point to a portion of a text node that is within a leaf element (any element that contains text and no other elements).I have adopted the principle that TAN syntax should whenever possible mimic real-life practice.I have observed that in ordinary conversation when we want to point to a specific part of text, we prefer to talk about the nth (e.g., first, seventh, last) paragraph, sentence, line, word, or character.When we get down to the word level, we might identify the word by position (e.g., "the fifth to last word") or by name (e.g., "the last 'from'").
TAN adopts space normalization as a default, which means that text can be indented as freely as one wishes.So a text node's new lines are no guide to identifying what we normally mean by "line" or "paragraph."Therefore, those terms cannot be reliably used to identify a portion of a TAN text node.The meaning of "sentence" is contested, and difficult to define computationally: the most common sentence separator (at least in modern Western languages), the period, is used for too many other purposes.
So we are left with words and characters.TAN integrates both into its pointer syntax.But each of those concepts are ambiguous, and need some discussion before we look at real examples.
The term "word" is notoriously difficult to define.In different contexts, "New York" and "didn't," for example, can each be justifiably defined as one or two words.Furthermore, some scholars consider punctuation to be words in their own right (e.g., linguists' study of modern English), whereas others ignore them as being anachronistic or arbitrary (e.g., classicists' study of ancient Greek and Latin).TAN adopts the adjunct term "token": a string of characters that are defined by a regular expression meant to approximate what we usually mean by "word."And when I ask you to look at the fifth word, you normally ignore any intervening punctuation.Therefore, TAN's default definition of a token is [\w&#xad;&#x200b;&#x200d;]+one or more consecutive word characters, the soft hyphen, the zero-width space, or the zero-width joiner.Although that is the default, it is not The Law.In any TAN file, an editor is welcome to write their own regular expression that defines what a token is.
"Character" also has some issues, even if we agree, as I think we should, that it should refer to a notional Unicode codepoint, whether expressed in UTF-8, UTF-16, UTF-32, or an HTML or XML entity (e.g., &#xA;).The main problem is that Unicode is complicated, and has had to differentiate three levels of support for regular expressions. 13Only level-one conformance is guaranteed in any XML system.And combining characters fall in level two.Suppose we have a string of three characters: b (i.e., a, a combining underdot, and a b).If I ask you via a regular expression to fetch me the second character, will you return the dot or the b?That decision, at present, depends upon the implementation you are using, and the results are not predictable.Therefore, TAN defines a character as any non-modifying character along with any ensuing modifying characters (regular expression: \P{M}\p{M}*).
With that background, we can now look at how Example B could be rendered in TAN: <claim verb="quotes"> <subject work="nt-grc" ref="Mt 4:15-16"/> <object src="lxx"> <tok ref="Isa 8:23" pos="15-last"/> <div-ref ref="Isa 9:1"/> </object> </claim> The principal change is that we are pointing to a range in Isaiah 8:23 that extends from the fifteenth word to the last.(During validation TAN functions silently convert the term "last" to a digit.)Note that @work has become @src, since we are now making a version-specific reference.
Suppose we have another situation where we wish to refer to a specific word or specific set of characters.Some arbitrary examples: <tok ref="Isa 8:23" val="χώρα"/> (the first instance of the word) <tok ref="Isa 8:23" val="καὶ" pos="2"/> (the second instance of the word) <tok ref="Isa 8:23" pos="last" chars="1-2, last-2"/> (returns three characters from the last word, i.e., Ιοί from Ιουδαίας) Note that "last-2" means the second-to-last character-it is oftentimes so much easier to count from the end than from the beginning.
Because the values are resolved by means of regular expressions, @val may take a regular expression: <tok ref="Gen 4:15" val="θε[όὸ]ς" pos="2"/> This example points to the second instance of θεός regardless of whether the accent is grave or acute. 14 course, if the value of the attributes point to something that does not exist, the TAN validation routines will mark the element as invalid, and via Schematron Quick Fixes offer contextual suggestions on how to fix the error.In fact, it is possible to ask for help by placing three question marks in an attribute then running a validation check.Either the error report itself or an SQF will provide editing help.So a version-specific pointer in TAN is not only readable, it is a kind of heuristic, at least when used during validation.

Pointing Problems
Such ambitious pointing has problems.But the problems I have found most difficult pertain to the concept of textual pointing in general.We might fuss about them in our digital projects, but they were there ever before there was a computer.
One of the the general problems is that a reference system might not match across different versions of the same work.The Psalms, for example, have a significantly different numeration system in the Greek and Hebrew versions (Kalvesmaki 2014), which means that any reference you find to a Psalm, whether you read that reference on the screen or in print, could confuse you.A computer fares no better.What this means is that even within the same notional, version-agnostic reference system you are not guaranteed the interoperability we strove for when we invented those systems.In practice many of them fall short, to one degree or another.That is why I have used terms above like « try », « maximize », and « maximally likely », because I recognize that nothing can guarantee that texts produced by different people (or even the same person) will be in sync.TAN, however, creates a framework that makes such interoperability more likely, and provides diagnostics to help make our texts more interoperable.
TAN validation can be set to any of three phases-terse, normal, and verbose-to allow for variations in speed and depth. 15In the verbose phase of a TAN alignment file, you will be told where there are references that are defective (references found in some but not all versions of the same work).Sometimes you can and should do something about defective references ; other times you shouldn't because a source is fragmentary or an editor has sequestered some passages.Verbose validation will give you the opportunity to review the differences.In that same TAN alignment file you can write statements that synchronize versions.For example, one source may have used « Matt » and another « Mt » for the same book.Or, in the case of Isaiah 8 :29, the versification of the Brenton and Rahlfs versions are out of sync.You can reconcile such differences through specially designated elements.
Another pointing problem is the case where a work has multiple, overlapping reference systems.For example, citations from the works of Aristotle rely upon both bookchapter references and Bekker page-column-line numbers.Citations from the Church Fathers oftentimes need both book-chapter references and volume-column-subcolumn references from Migne's Patrology.
TEI handles this problem by allowing users to supplement a transcription's primary hierarchy with <milestone>s, which usually mark page or column breaks.The <milestone> is very useful, but it lacks the requisite semantics for representing a textual hierarchy, as required under TAN rules. 16 have approached the problem differently.First, only one reference system may be applied to a TAN file.For each reference system desired, users should create a separate file with an identical transcription that is structured according to that reference system.These files should point to each other, which then makes reconciliation relatively straightforward.The verbose level of TAN validation allows one to check to make sure that the transcriptions are identical.If they are not, the specific differences are marked, and SQFs can be used to semiautomate the correction of the text.
Under this approach, a text can take as many flavors of hierarchies as you wish.And a cluster of differently segmented transcriptions can help TAN-based applications migrate texts from one reference system to another.In fact, I have been able to write an application that takes as input a transcription pair (a single version of a work in two files, each with its own reference system) and a third file that contains another version of the same work, perhaps in a completely different language, but styled according to one of the reference systems.The output is the third file restructured in the alternative reference system, with a best guess as to where the divisions should fall. 17here are of course other problems about pointers that could be discussed.Some of them are treated in the official TAN guidelines and examples.Others not covered by the guidelines will be incorporated as they are encountered.

IV. Progress and Prospects
At present, TAN is an experimental format, under development.It has been used successfully in service to three different projects : the Guide to Evagrius Ponticus (GEP), the Chrysostomus Latinus in Iohannem Online project (CLIO), and a private collaborative effort to translate a fourth-century Greek work and its ancient Syriac translations (PRV). 18he GEP relies upon TAN files to generate and publish HTML transcriptions of select primary sources.It also takes a TAN file that contains RDF-like claims, along with a bibliography database (in Zotero RDF), and automatically generates an extensively documented master checklist of writings from the fourth century monk, creating links between the TAN files and the relevant bibliographic entries.
CLIO, which is focused on the three Latin translations of eighty-eight homilies by John Chrysostom on the Gospel of John, has developed a small library of master TAN-TEI files that are used to populate a website correlates the translations (and textual variants) with manuscript images.The TAN files include both straightforward transcriptions and indications of textual variants.It is hoped that in the future other projects treating the same work but in other languages (e.g., Greek, Armenian, Syriac, Arabic) will be able to create comparable TAN files that synchronize with the CLIO files, to demonstrate genuine cross-project interoperability.
For its work, PRV has depended upon a library of TAN files to prepare a book that is under contract with Oxford University Press.To reach that goal, PRV has developed a number of TAN-based XSLT tools that create various forms of output that have been useful not only for the published project but for conducting initial work.That output includes : (1) Master HTML pages that collate every version (Greek, Syriac, the working English translations, etc.) not just with each other but with juxtaposed commentary and quotations from Aristotle and the Bible (in different versions).The number and sequence of the versions are easily configured, without touching the underlying TAN files.
(2) Master indexes of textual quotations from the Bible and Aristotle, sortable in the order of the quoted or quoting work, with contextual snippets.(3) Master subject indexes.(4) Comparative statistical profiles of each version of each work (e.g., word counts, hapax legomena, most frequent words).(5) Reports that suggest how words in one particular language-version correspond to words in the others.(6) Word-by-word interlinear editions.
(7) Quotation detection. 19ery one of these applications would merit its own article-length discussion.As time and opportunity permits I will do so, but in the meantime, do explore the examples that have already been posted. 20uch progress is promising, but tentative.At the present, TAN needs people from other established projects in the digital humanities to use it and to actively contribute to the development of the schemas, examples, and documentation.By expanding the network, to see what rules work, what rules don't, and what can't be reduced to a rule, the TAN format will become more servicable, and provide insight into what kinds of rules can and cannot be reduced to an algorithm.
Trying it out incurs little risk, since TAN is not an exclusive format.A TAN file may sit happily side by side with TEI, HTML, plain text, or other formats.But it is primed to do things those other formats cannot do, namely, participate in a cross-project network, an ecosystem of peer-to-peer sharing of our sources and our annotations on those sources.Developing TAN has made me dream about someone to come along and create a Napster for digital texts. 21TAN would be perfectly suited to a Text Napster.But whether or not such dream software materializes, I aim to continue developing TAN over the coming years, since I believe that even on its own it heralds a new Internet of primary sources.

The bare scholarly minimum should be expected and enforced in metadata : The
).If we are working with a book that features multiple versions of a work, each version of interest needs to be broken out into its own file.The purpose of a TAN/TEI file is to focus on the work, not on the physical exemplar, which means that a TAN transcription file is likely to cover only a portion of what appears in a book.We don't need a full catalogue entry.Our data is not about the book, it's about the work.Just point to one or more records where we can learn more, if we so choose.Metadata should be restricted to the data in the <body>.
 Metadata should be focused on the task at hand : But because a given file is devoted to one task, the metadata should as well.Yes, we want to know who was responsible for editing or proofreading a text.But we don't need to know the proofreaders' age, nationality, or background.A unique identifier for those people can point us to other places more authoritative where we can learn about them.So too with a book that is the basis for a transcription.