Today will feature an introduction to the history of scholarly editing, an overview of digital workflow strategies, and an introduction to Markdown and XML.


  • General grasp of the history of scholarly editing.

  • Facility with transcribing documents in Markdown, HTML, and XML.

Schedule: Day 1 (Monday, 2 July)

Time  | Topic | Type
12.30 | Registration |
13.00 | Senate House Library Talk | Presentation
14.00 | Seminar 1: Brief History of Scholarly Editing | Presentation, Discussion
16.00 | Seminar 2: Digital Editing Workflow, Transcription with Markdown, Brief Introduction to XML | Digital lab

Seminar 1: Brief History of Scholarly Editing


  1. Greetham, “A History of Textual Scholarship” (from the Cambridge Companion to Textual Scholarship).

  2. A. E. Housman, “The Application of Thought to Textual Criticism” (Art and Error).

  3. G. Thomas Tanselle, “The Varieties of Scholarly Editing” (Greetham, 1995).

Lecture notes

A brief outline of textual scholarship
  • Peisistratus (560–527 BCE) orders the 'official' text of Homer. The primary challenge was to build a coherent text from myriad versions spoken by the rhapsodes. This could be a viable beginning of textual criticism, i.e., being aware of variance and attending to authenticity and authority (whatever those terms mean). (Discuss!)

  • Lycurgus (c. 390–324 BCE) arranges for single texts of Aeschylus, Sophocles, and Euripides to be deposited into Athenian archives.

  • The history of textual editing is a history of arguments about the meaning of terms such as authenticity and authority. It is also a record of humans grappling with the contingencies of cultural imagination, tradition, and artifacts.

  • What is the textus receptus? Consider what happens when mistakes in a received (published) edition prevail, e.g., Falstaff "babbl'd o' green fields" (Shakespeare, Henry V); "soiled fish of the sea" (Melville, White-Jacket).

  • Library of Alexandria: manuscript copying was a common practice, since all incoming ships had to declare any manuscripts in their possession. Any manuscripts declared would then be copied and deposited in libraries. Their copies were only labeled differently if they had differences. Sometimes the copies were returned and the originals kept in Alexandria. What's wrong with this story?

  • The birth of collation as an editorial practice; and dealing with analogy versus anomaly: the Alexandrians sought to emend texts that had, in their judgment, corruptions. Their practice is idealistic: the best text is not based on any actual document but rather a new document that seeks to bring out the best readings from all the extant texts.

  • Pergamum, Alexandria's civic rival, switched to using parchment (animal skin) after Alexandria banned papyrus exports during a trade conflict. Generally, the Pergamene scholars accepted the necessity of corruption and sought to identify the "best text" based on a careful examination of all surviving witnesses. The "best text" would be based on an actual historical document, unlike the Alexandrian text, which was a reconstructed text. Texts from neither of these epochs survive, but citations of them exist in medieval scholia.

  • Descriptive Bibliography. Callimachus (c. 305–240 BCE) created the first record of Greek manuscripts, Pinakes (Tablets).

  • Late classical era: the birth of textual commentaries (Servius Honoratus on Virgil, for example). Why is this important? The textual commentaries include quotations from important works, and other cultural and historical information, that have otherwise been lost. Hugh Cayless offers a good primer on Servius, as well as some thoughts on digital editing, on his blog.

  • Biblical scholarship: problems of vocalisation, accentuation, and word-division in consonantal Hebrew. Masoretic text (Hebrew and Aramaic copies, c. 7th–9th centuries CE) versus Greek Septuagint translation versus the Dead Sea Scrolls. The transmission of the Old Testament is far less complicated (textually speaking) than that of the Greek New Testament. Jerome's Vulgate, commissioned by Pope Damasus I in the late 4th century CE, was the first Latin Bible that was based on surviving witnesses (~8000 manuscripts!).

  • The medieval period was one of conservation: scribes copied mostly religious works and tried to reconcile them, as much as possible, with classical (pagan) works. The Carolingian reforms led to a standardised script that made various European national scripts consistent; a significant portion of surviving manuscripts of classical literature is the result of copies made in monasteries with Carolingian script. Meanwhile, Constantinople's holdings of Greek manuscripts were crucial to Italian humanists' serious return to Greek study in the late fourteenth and early fifteenth centuries.

  • Copying work transferred from the hands of monks to those of professional scribes, often in universities. The great poet Petrarch's partial reconstruction of Livy's histories was a rigorous editorial project based on manuscript fragments in many medieval repositories. Poggio Bracciolini (1380–1459), acting as papal secretary, found manuscripts of prominent classical thinkers all over Europe. Bracciolini even invented a new humanist script that was far clearer and more readable than the prevailing textura (i.e., gothic) script of the day. This is a good moment to reflect on the desire of humanists over time to invent inscription technologies that are consistent, readable, and shareable: a set of values very important to so-called "digital humanities" today.

  • Another figure worth noting: Lorenzo Valla (1407–57), the great debunker of forgeries (e.g., the Donation of Constantine and the letters of Seneca and St. Paul). He also sought to emend Jerome's Vulgate. His edition, based on Greek and patristic texts, was published by Erasmus in 1505. Similarly, Politian (1454–94) searched for the earliest recoverable version of a manuscript; this foreshadowed the genealogical method of plotting a linear path of textual transmission. Politian derived the method of eliminatio codicum descriptorum, the removal of "descriptive" or derived copies as witnesses to an authentic version. This led to the method (very much in use to this day) of stemma codicum, the "family tree" of textual versions.

  • Stemmatics: building a family tree by examining scribal errors in multiple manuscript copies. Aldine editions. Example of the Erasmus New Testament.
  • Philology (OED):

    1. Love of learning and literature; the branch of knowledge that deals with the historical, linguistic, interpretative, and critical aspects of literature; literary or classical scholarship. Now chiefly U.S.

    3. The branch of knowledge that deals with the structure, historical development, and relationships of languages or language families; the historical study of the phonology and morphology of languages; historical linguistics. See also comparative philology at comparative adj. 1b.

  • Lachmannian method: identification and evaluation of bibliographic sources with a critical awareness. This comes out of the work of Karl Lachmann (1793–1851), whose 1850 edition of Lucretius claimed that the three extant manuscripts descended from a single archetype. Later witnesses have more errors. Interestingly, Lachmann's Nibelungenlied edition involved more speculation.

  • Johann Gottfried Eichhorn (1752–1827) made the monumental claim that it was impossible to find or reconstruct the original or best text of biblical texts, because of all the layers of copying and linguistic shifts (Einleitung in das Alte Testament, 1780–83).

  • Friedrich August Wolf (1759–1824) similarly argued in his Prolegomena ad Homerum (1795) that it would be impossible to recover Homeric texts.

Housman's thought
  • Where do science and art meet? "Textual criticism is a science, and, since it comprises recension and emendation, it is also an art."

  • A matter of reason and common sense, but also not "an exact science at all ... fluid and variable ... neither mystery nor mathematics"... It deals with human frailties: errors.

  • Editorial problems should be treated as individuals: "must be regarded as possibly unique."

  • Learning principles from instances: "[P]ublic opinion is now aware that textual criticism, however repulsive, is nevertheless indispensable, and editors find that some pretence of dealing with the subject is obligatory; and in these circumstances they apply, not thought, but words, to textual criticism. They get rules by rote without grasping the realities of which those rules are merely emblems, and recite them on inappropriate occasions instead of seriously thinking out each problem as it arises."

  • This is to suggest that editors should "look all facts in the face" and avoid sectarianism of thought: "This I cite as a specimen of the things which people may say if they do not think about the meaning of what they are saying, and especially as an example of the danger of dealing in generalisations. The best way to treat such pretentious inanities is to transfer them from the sphere of textual criticism, where the difference between truth and falsehood or between sense and nonsense is little regarded and seldom even perceived, into some sphere where men are obliged to use concrete and sensuous terms, which force them, however reluctantly, to think."

  • What does he mean by sincerity of a manuscript? "When you call a MS. sincere you instantly engage on its behalf the moral sympathy of the thoughtless ... Our concern is not with the eternal destiny of the scribe, but with the temporal utility of the MS.; and a MS. is useful or the reverse in proportion to the amount of truth which it discloses or conceals, no matter what may be the causes of the disclosure or concealment."

  • Sincerity and recension; the importance of building. "[E]ven the traditional rules must of course be tested by comparison with the witness of the MSS... if we build structures on our trust we are no critics."

  • A paradox: "The MSS. are the material upon which we base our rule, and then, when we have got our rule, we turn round upon the MSS. and say that the rule, based upon them, convicts them of error. We are thus working in a circle, that is a fact which there is no denying; but, as Lachmann says, the task of the critic is just this, to tread that circle deftly and warily."

  • "To be a textual critic requires aptitude for thinking and willingness to think; and though it also requires other things, those things are supplements and cannot be substitutes. Knowledge is good, method is good, but one thing beyond all others is necessary; and that is to have a head, not a pumpkin, on your shoulders and brains, not pudding, in your head."

Editing and history
  • An act of historical scholarship which requires an answer to this question: "What role do judgment and evaluation play in reconstructing the past?" (Tanselle, 10).

  • Texts of documents vs. texts of works.

Seminar 2: Digital Editing Workflow


  1. David Birnbaum, “An even gentler introduction to XML”.

Lecture notes

Basic components of a digital edition
  • Source file(s) of transcribed text and metadata encoded in XML. The best encoding practice is to follow the Text Encoding Initiative (TEI) Guidelines, but it's not strictly necessary.

  • Files that parse (i.e., read) and transform the encoded documents for viewing. Typically these will be XSLT or XQuery or (less commonly) Python files.

  • The edition itself, in HTML, as produced by the transformation files.

  • Files for styling the edition's HTML interface (CSS, JavaScript).
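To see how these components relate, here is a minimal sketch of the parse-and-transform step, written in Python using only the standard library. It is an illustration under stated assumptions, not a recommended production pipeline: the sample XML is a heavily simplified, made-up TEI fragment (a real source file would carry a full header with metadata), and the stylesheet name `edition.css` is hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# The TEI namespace; element lookups below must be qualified with it.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, heavily simplified source file (component 1).
SOURCE_XML = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <head>Sample Title</head>
      <lg>
        <l>First transcribed line</l>
        <l>Second transcribed line</l>
      </lg>
    </body>
  </text>
</TEI>"""


def transform(xml_source, stylesheet="edition.css"):
    """Parse the encoded source and return an HTML page (components 2 and 3)."""
    root = ET.fromstring(xml_source)
    title = root.findtext(".//tei:body/tei:head", default="", namespaces=TEI_NS)
    lines = [line.text or "" for line in root.iterfind(".//tei:lg/tei:l", TEI_NS)]
    # Each verse line becomes a line of text followed by an HTML line break.
    verse = "<br>\n".join(escape(line) for line in lines)
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        # Component 4: the styling file is only referenced here, not created.
        f'<link rel="stylesheet" href="{escape(stylesheet)}">\n'
        f"<title>{escape(title)}</title>\n</head>\n<body>\n"
        f"<h1>{escape(title)}</h1>\n<p>{verse}</p>\n"
        "</body>\n</html>"
    )


if __name__ == "__main__":
    print(transform(SOURCE_XML))
```

The point of the sketch is the separation of concerns: the XML stays the file of record, and the HTML can be regenerated from it at any time.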

Digital Editing Workflow

If you are interested in creating a digital edition, there are two questions that you must ponder at length before proceeding:

1. What is my text model, why am I making it, and what will it be used for?

2. What is my workflow?

The answer to (1) will vary quite a bit, depending on your documents and on what kind of edition you would like to produce. We will continue to investigate options for (1) as we move through the course this week.

The answer to (2) is a little more straightforward. Since we are concerned with "digital" editing, we need to think in terms of an appropriate computational pipeline.

Transcription Options

The beginning of the pipeline is the flexible text editor. By flexible I mean an editor that is amenable to Web publishing and uses non-proprietary, open-source formats. Many editors have used proprietary word processors to transcribe their editorial material. While that has many virtues (control of typesetting features, to name one), it presents a lot of problems if you are trying to optimize your workflow. E.g., if you transcribe an edition in Microsoft Word, you would have to transform that document (and all of its attendant proprietary code) into XML or HTML in order to make it work as a digital edition on the Web. Similarly, data scientists and digital text analysts warn against using Microsoft Excel files for analysis because that program introduces unnecessary code that can hinder output.

Which text editor? We will be using the Atom text editor. It features a very attractive Markdown previewer (with additional feature packages), and it is well-integrated with GitHub (upon which this course web site is built). Other good options are Sublime Text, BBEdit, and Notepad++ (for Windows).

For us, the common understanding is that XML files should be our edition files of record. Ideally, all documents would be transcribed in XML from the beginning, but for a variety of reasons that is not always practicable.

First we will look at the most basic form of transcription: Markdown. This is lightweight web authoring at its best.

Markdown exercise

First we will go through a slideshow: Access the Markdown slides here.

  • Download “The Child on the Cliffs,” by Edward Thomas.

  • Open the file in Atom text editor. Save it as

  • Make sure your Atom text editor is updated. Click on Help and check to see if you need to update.

  • Go to File > Settings > Install and type “markdown-preview-enhanced” in the search box. This enhanced preview package gives you additional features such as footnotes and a table of contents.

  • Press Ctrl + Shift + M to show an HTML preview in Atom.

  • Using Markdown syntax, mark up the following (click here for the Markdown cheatsheet):

    • a main header (for the title), and italicise it;
    • a secondary header (for the author);
    • a hyperlink from Edward Thomas’s name to a web page (say, Poetry Foundation) with his biography;
    • a contextual footnote for one of the lines (possibly the “Source”?).
  • Once your Markdown document is complete, right-click on the preview window and select HTML > HTML (offline). In a green box you will see the HTML URL for your file. You can also right-click on the Markdown preview and select “Open in Browser”. Your document is now available as a web-ready HTML file. You can navigate to the file yourself and open it in your browser.
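If you would like to check your markup against a model, the sketch below assembles a Markdown document containing the four features the exercise asks for. It is only one possible answer. Several things in it are assumptions: the function name `build_markdown` is made up for this illustration, the poem lines and the biography URL are placeholders, and the `[^1]` footnote form is an extension rather than core Markdown (it is assumed here to be the form the enhanced preview package accepts).

```python
def build_markdown(title, author, author_url, lines, noted_line, note):
    """Return a Markdown transcription with a title, author link, and footnote.

    `noted_line` is the 1-based number of the line that carries the footnote.
    """
    body = []
    for number, line in enumerate(lines, start=1):
        marker = "[^1]" if number == noted_line else ""
        # Two trailing spaces force a line break inside a Markdown paragraph,
        # which keeps each verse line on its own line.
        body.append(f"{line}{marker}  ")
    parts = [
        f"# *{title}*",                   # main header, italicised
        "",
        f"## [{author}]({author_url})",   # secondary header with a hyperlink
        "",
        *body,
        "",
        f"[^1]: {note}",                  # footnote definition
    ]
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    print(
        build_markdown(
            title="The Child on the Cliffs",
            author="Edward Thomas",
            author_url="https://example.org/edward-thomas",  # placeholder URL
            lines=["(first line of the poem)", "(second line of the poem)"],
            noted_line=2,
            note="(your contextual note goes here)",
        )
    )
```

Paste the printed output into a new file in your editor and open the preview to see how each piece of syntax renders.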

How do you get from Markdown to XML? Two good options are Pandoc and OxGarage. I prefer using Pandoc for my transformations (my favourite probably being the Markdown > PDF transformation). OxGarage is also good, and a little bit simpler to use: it can convert several types of documents into TEI-XML.
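For the Pandoc route, a conversion can be scripted so that it is repeatable. The following is a minimal sketch, assuming Pandoc is installed and on your PATH; the helper names `pandoc_command` and `convert` and the file names are hypothetical. It uses Pandoc's `tei` output format together with the `-f`, `-t`, `-s` (standalone document), and `-o` options.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(source, target, to_format="tei"):
    """Build the Pandoc command line for a Markdown-to-TEI conversion."""
    return [
        "pandoc",
        "-f", "markdown",   # input format
        "-t", to_format,    # output format
        "-s",               # standalone: emit a complete document, not a fragment
        "-o", str(target),  # output file
        str(source),
    ]


def convert(source, target, to_format="tei"):
    """Run Pandoc, failing loudly if it is missing or the conversion errors."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not installed or not on the PATH")
    subprocess.run(pandoc_command(source, target, to_format), check=True)


if __name__ == "__main__":
    # Hypothetical file names; substitute the name you saved your file under.
    convert(Path("poem.md"), Path("poem.xml"))
```

The same command can of course be typed directly into a terminal; wrapping it in a script simply makes the step easy to rerun whenever the transcription changes.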

The other option is to open a new TEI-XML document in oXygen or your preferred text editor and simply copy-and-paste the body of the HTML file into the <body> element of the XML file.
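To show where the pasted content lands, here is a rough sketch that builds a bare-bones TEI document around a title and some paragraphs. Treat it as an assumption-laden illustration: the function name `tei_document` and the header wording are made up, and the header is the bare minimum rather than good metadata practice. Note also that HTML element names do not all carry over to TEI (a heading such as `<h1>` has no TEI element of that name and is represented here by `<head>`), so pasted markup usually needs some renaming.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
# Serialise the TEI namespace as the default namespace (no prefix).
ET.register_namespace("", TEI)


def tei_document(title, paragraphs):
    """Return a minimal TEI document whose <body> holds the given paragraphs."""

    def el(parent, tag, text=None):
        # Create a namespaced child element with optional text content.
        child = ET.SubElement(parent, f"{{{TEI}}}{tag}")
        child.text = text
        return child

    root = ET.Element(f"{{{TEI}}}TEI")
    file_desc = el(el(root, "teiHeader"), "fileDesc")
    el(el(file_desc, "titleStmt"), "title", title)
    el(el(file_desc, "publicationStmt"), "p", "Unpublished classroom exercise.")
    el(el(file_desc, "sourceDesc"), "p", "Transcribed from a Markdown file.")

    # This is the element the copied content goes into.
    body = el(el(root, "text"), "body")
    el(body, "head", title)
    for paragraph in paragraphs:
        el(body, "p", paragraph)

    ET.indent(root)  # pretty-print; requires Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(tei_document("Sample Title", ["First paragraph.", "Second paragraph."]))
```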

Brief Introduction to XML

For the introduction: Access the XML slides here.
