Footnote Markup Language version 1.0
Author: Raffaele Arecchi
Creation date: 23/04/2017
Last update: 10/09/2017
This document describes the Footnote Markup Language version 1.0.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
The Footnote Markup Language, throughout this document abbreviated FootnoteML, is a markup language designed after the following principles:
- The text should be readable, and the markup as terse and unobtrusive as possible
- The text should be unstructured and the tags free to place
- It should be possible to add multiple semantics to the text, allowing the tags to overlap
- A clear distinction between semantic metadata (tags) and semantic visualization (browser)
The ability to add tags while leaving a document easily readable and editable is an important feature of a markup language, and the readability of FootnoteML competes with that of other markup languages.
FootnoteML imposes no requirement on the structure of the text and leaves the author full freedom to express any thought without the constraint of a data structure in mind. Many other markup languages typically require the document to be structured as a tree.
This structural freedom, together with the allowance for overlapping tags, enables FootnoteML to add multiple semantics to a text. The value of this feature can be shown with a small example. Consider the following verses from John Keats' poem Endymion:
A thing of beauty is a joy for ever:
Its loveliness increases; it will never
Pass into nothingness; but still will keep
A bower quiet for us, and a sleep
An intellectual reviewer would be happy to add rich semantics to this: she would state that each line is a verse, but she would also give an explanation for "it will never Pass into nothingness", which spans two verses. Many markup languages fail to represent and distinguish the metric structure and the paraphrasal structure at the same time. XML, for example, could in theory be extended to allow multiple DTDs in the same text, but overlapping tags that belong to the same DTD make the text invalid XML, unless it is extended further by introducing IDs [SJDeRose].
The last principle aims to enforce the separation of a document's visualization from its semantics. This is essential for overlapping structures, as they require different semantic visualizations. In this respect, many markup languages suffer from the intrusiveness of visualization tag elements.
The main idea in FootnoteML comes from the most primordial kind of metadata: the footnote. Here is an example:
Be still my heart; thou¹ hast known worse than this²³.
¹ you (archaic)
² you had worse than this to bear
³ see Homer, Odyssey - Book XX
The nice thing about footnotes is that they have no rules, but everyone is (usually) able to understand their scope and meaning.
The first note refers to a single word, thou, and its link to the text could be interpreted as "thou is a synonym for you". The second note has a wider scope and represents a modern translation. The third note refers to the entire phrase and is a reference.
The metadata scope and meaning can be made clearer by superscript delimitation and verbs in the notes:
³Be still my heart; ²¹thou¹ hast known worse than this²³.
¹ is synonym for → you (archaic)
² means → you had worse than this to bear
³ is a reference from → Homer, Odyssey - Book XX
FootnoteML adopts the same idea by:
- Delimiting scope with numbered tags,
- Expressing the note in the form subject-predicate-object, the subject being the source text.
A good feature of such a markup language is that we do not need to structure the text at all, and we can place the metadata on separate lines to improve readability and editability.
Happily, this markup admits tag overlapping: in the last example the tags 1 and 2 are unrelated, and the same interpretation is given if we swap the two adjacent opening tags before thou (²¹thou and ¹²thou).
While the complete disconnection of numbered tags is what makes overlapping possible, the drawback is that tag classification becomes cumbersome. The HTML
<p>Today is a nice day.</p> <p>Hope tomorrow will be the same.</p>
would be equivalent to
¹Today is a nice day.¹
²Hope tomorrow will be the same.²
¹ is a → paragraph
² is a → paragraph
and many paragraphs will generate a lot of duplicated information. This is easy to solve by reusing the same footnote:
¹Today is a nice day.¹
¹Hope tomorrow will be the same.¹
¹ is a → paragraph
However, when two spans with the same number would overlap, two different numbers are still required, otherwise the tag boundaries cannot be interpreted. Again, this can be improved if footnotes are allowed to inherit content from other footnotes: just rewrite
Peter Piper ¹picked ²a peck¹ of pickled² peppers.
¹ is an → alliteration
² is an → alliteration
as
Peter Piper ¹picked ²a peck¹ of pickled² peppers.
¹ inherits from ³
² inherits from ³
³ is an → alliteration
The syntax gets more compact if we introduce an inheritance notation (:) and if a footnote can be written once and then referenced later in the text:
³ is an → alliteration
Peter Piper ¹picked ²a peck¹ of pickled² peppers.
¹:³
²:³
More generally, footnote inheritance can be useful to maintain a low number of organized notes.
As a last observation, we can allow the content inside the metadata to be, in turn, tagged with a footnote within the same text file, a feature that transcends common markup practice:
³Be still my heart; ²¹thou¹ hast known worse than this²³.
¹ is ⁴synonym⁴ for → you (archaic)
² means → ⁵you had worse than this to bear⁵
³ is a reference from → ⁶Homer⁶, Odyssey - Book XX
⁴ more on → http://en.wikipedia.org/wiki/Synonym
⁵ paraphrase by → Adso from Melk, 1382
⁶ is → an ancient Greek poet whose existence is disputed
To keep this meta-meta tagging as general as possible, we will consider infinitely recursive references meaningless but valid:
Hope one day the GNU ¹Hurd¹ microkernel will become stable.
¹ is acronym for → ²Hird² of Unix-Replacing Daemons
² is acronym for → ¹Hurd¹ of Interfaces Representing Depth
A FootnoteML file is any kind of text file with Unicode characters. No specific encoding and no file extensions are strictly required.
A tag is represented by an integer number surrounded by tildes ~. An opening tag with a missing closing tag, or vice versa, should be considered invalid. A literal tilde is escaped by doubling it.
Here is an example of valid tags:
~3~Be still ~4~my heart; ~2~~1~thou~1~ hast known~4~ worse than this~2~~3~.
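The tag rules above can be sketched as a small scanner. This is only an illustration, not a reference implementation: the function name parse_tags is hypothetical, and the sketch assumes that a tag is recognised before an escape (so that two adjacent tags such as ~2~~1~ are not read as an escaped tilde) and that the first occurrence of a number opens a span while the next occurrence closes it.

```python
import re

# A tag (~12~) or an escaped tilde (~~). Trying the tag alternative first
# is an assumption of this sketch: it keeps adjacent tags such as ~2~~1~
# from being read as an escape.
TOKEN = re.compile(r"~(\d+)~|~~")

def parse_tags(text):
    """Return (plain_text, spans); each span is (number, start, end) in plain_text."""
    plain = []      # pieces of the text with the markup removed
    length = 0      # length of the plain text built so far
    open_at = {}    # tag number -> offset at which it was opened
    spans = []
    pos = 0
    for match in TOKEN.finditer(text):
        chunk = text[pos:match.start()]
        plain.append(chunk)
        length += len(chunk)
        pos = match.end()
        if match.group(1) is None:      # "~~" stands for one literal tilde
            plain.append("~")
            length += 1
            continue
        number = int(match.group(1))
        if number in open_at:           # second occurrence closes the span
            spans.append((number, open_at.pop(number), length))
        else:                           # first occurrence opens it
            open_at[number] = length
    plain.append(text[pos:])
    if open_at:                         # an opening tag without its closing tag
        raise ValueError(f"unclosed tags: {sorted(open_at)}")
    return "".join(plain), spans

plain, spans = parse_tags(
    "~3~Be still ~4~my heart; ~2~~1~thou~1~ hast known~4~ worse than this~2~~3~."
)
print(plain)  # Be still my heart; thou hast known worse than this.
for number, start, end in spans:
    print(number, repr(plain[start:end]))
```

Because each number is tracked independently, overlapping spans such as 4 and 2 in the example need no special handling.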
A line whose first character, excluding blanks and tabs, is the sharp # represents a note. The sharp must be followed immediately by the number the note refers to, then by the optional inheritance content, then by the metadata content of the note.
Notes can be placed anywhere in the text.
The inheritance content of the note is optional; if present, it must start with a colon : followed by a sequence of numbers separated by commas ,. A note with inheritance content is equivalent to the same note whose metadata content also includes all the metadata of the numbered notes that the inheritance content refers to, recursively.
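The recursive equivalence described above can be sketched as follows. The function name resolve and the in-memory form of the notes (a dictionary from note number to a pair of inherited numbers and metadata entries) are hypothetical choices made for this illustration. The sketch also assumes that a note reached twice, for example through an inheritance cycle, contributes its metadata only once; the text above does not prescribe this.

```python
def resolve(number, notes, _seen=None):
    """Collect the metadata of a note plus everything it inherits, recursively.

    `notes` maps a note number to (inherited_numbers, metadata); this shape
    is an assumption of the sketch, not part of the language.
    """
    seen = set() if _seen is None else _seen
    if number in seen or number not in notes:
        return []                       # already visited, or never defined
    seen.add(number)
    inherited, metadata = notes[number]
    result = list(metadata)             # the note's own metadata comes first
    for parent in inherited:
        result.extend(resolve(parent, notes, seen))
    return result

# The alliteration example: notes 1 and 2 both inherit from note 3.
notes = {
    3: ([], [("is an", "alliteration")]),
    1: ([3], []),
    2: ([3], []),
}
print(resolve(1, notes))  # [('is an', 'alliteration')]
```

Tracking visited numbers keeps the function from looping forever when notes inherit from each other.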
The metadata content of the note is composed of an unlimited sequence of predicate-object pairs enclosed in square brackets. Inside the brackets, the predicate and the object are separated by a pipe |, while whitespace outside the brackets is not relevant. Predicate and object content may be tagged just like any other text and may span multiple lines, but may not contain subnotes.
Escaping inside a note is performed with a backslash \, and \\ represents the backslash itself.
Here is an example of valid notes:
#7 [is ~44~synonym~44~ for|you] [is historically|archaic]
#9:34,1,2 [means|you had worse than this to bear]
#5 [is a reference from|~13~Homer~13~, ~72~Odyssey~72~ - Book XX]
#13 [is|an ancient Greek poet whose existence is disputed]
#44 [more on|http://en.wikipedia.org/wiki/Synonym]
#72 [more on| http://en.wikipedia.org/wiki/Odyssey]
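As an illustration of the note syntax, here is a minimal sketch of a note parser. The name parse_note is hypothetical. The sketch makes three assumptions that the description above leaves open: whitespace around the predicate and the object is trimmed, only the first unescaped pipe in a bracket separates predicate from object, and a bracket without a pipe is stored with the predicate None.

```python
import re

# "#", the note number, then an optional ":" followed by comma-separated numbers.
HEADER = re.compile(r"\s*#(\d+)(?::(\d+(?:,\d+)*))?")

def parse_note(note):
    """Split a note into (number, inherited_numbers, [(predicate, object), ...])."""
    header = HEADER.match(note)
    if header is None:
        raise ValueError("a note must start with # followed by a number")
    number = int(header.group(1))
    inherited = [int(n) for n in header.group(2).split(",")] if header.group(2) else []

    pairs = []
    fields = None                       # None outside brackets, else the fields read so far
    i = header.end()
    while i < len(note):
        char = note[i]
        if fields is None:
            if char == "[":
                fields = [""]           # an opening bracket starts a new pair
        elif char == "\\" and i + 1 < len(note):
            i += 1
            fields[-1] += note[i]       # the escaped character is taken literally
        elif char == "|" and len(fields) == 1:
            fields.append("")           # the first pipe separates predicate and object
        elif char == "]":
            cleaned = [field.strip() for field in fields]
            pairs.append(tuple(cleaned) if len(cleaned) == 2 else (None, cleaned[0]))
            fields = None
        else:
            fields[-1] += char
        i += 1
    if fields is not None:
        raise ValueError("unclosed bracket")
    return number, inherited, pairs

print(parse_note("#9:34,1,2 [means|you had worse than this to bear]"))
# (9, [34, 1, 2], [('means', 'you had worse than this to bear')])
```

Tags inside the predicate or the object are left untouched here, so they can be handed to a separate tag scanner.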
A note number can be redefined, but this must be interpreted as if the number were a newly generated number that does not appear anywhere in the text. All tags and notes that are located after the note redefinition and refer to this number (by reference or inheritance) must be interpreted as if they referred to the newly generated number.
The predicate "is" can be omitted.
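One way to picture the redefinition rule is a pass that rewrites the text so that every note number becomes unique. The sketch below is an illustration under several assumptions: the name renumber is hypothetical, each note fits on one line, a fresh number is chosen above every number that appears anywhere in the text, and an inheritance list written on the redefining line itself still refers to the earlier definition.

```python
import re

NOTE = re.compile(r"(\s*#)(\d+)((?::\d+(?:,\d+)*)?)")   # note header with optional inheritance
TOKEN = re.compile(r"~(\d+)~|~~")                       # a tag, or an escaped tilde

def renumber(text):
    """Rewrite redefined note numbers so that every note number is unique."""
    # Start above every number in the text, so a fresh number cannot collide.
    fresh = max((int(n) for n in re.findall(r"\d+", text)), default=0) + 1
    alias = {}        # number as written -> number currently in force
    defined = set()   # numbers that already have a note

    def current(number):
        return alias.get(number, number)

    def retag(match):
        if match.group(1) is None:
            return match.group(0)       # escaped tilde, left untouched
        return f"~{current(int(match.group(1)))}~"

    out = []
    for line in text.splitlines():
        note = NOTE.match(line)
        if note:
            number = int(note.group(2))
            # Inherited numbers are mapped before the alias is updated.
            inherited = re.sub(r"\d+", lambda m: str(current(int(m.group()))), note.group(3))
            if number in defined:       # redefinition: switch to a fresh number
                alias[number] = fresh
                fresh += 1
            defined.add(number)
            line = note.group(1) + str(current(number)) + inherited + line[note.end():]
        out.append(TOKEN.sub(retag, line))
    return "\n".join(out)

print(renumber("#1 [a]\n~1~x~1~\n#1 [b]\n#2:1 [c]\n~1~y~1~ ~2~z~2~"))
```

In this usage example the second definition of note 1 becomes note 3, and the inheritance and tags that follow it are rewritten to 3, while the first span keeps number 1.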
FootnoteML is similar to HTML, but more expressive, since its tags may overlap. A nice feature of FootnoteML is that the text layout remains stable and familiar from paper books.
To improve readability further, all the notes can be placed after a tab character.
To ease editing, an editor program could simply dim the tags with a lighter color, or shrink them, and highlight matching opening/closing tags with darker colors. Advanced editing could include note number refactoring and metadata tooltips for inherited notes.
The Endymion excerpt could be shown as:
#1 [is metrically a|verse]
A thing of beauty is a joy for ever:
~2~Its loveliness increases; ~4~it will never~2~
~3~Pass into nothingness~4~; but still will keep~3~
A bower quiet for us, and a sleep
#2:1 [rhymes with|ver]
#3:1 [rhymes with|eep]
#4 [means|~5~it will live forever~5~]
#5 [paraphrase by|Adso from Melk, 1382]
If the notes are encoded to a standard, a browser could then let the intellectual user select the view modality: metric or logic.
In a different context, the following example shows how FootnoteML is suitable for tagging work notes in a file:
#1 [malaria] [cause] [italy] [02/04/2016]
~1~
Today report from Liam Brown showed that Malaria in Italy is mainly an import disease due to the increase in migratory flows and trips to endemic tourism or VFR (visiting friends and relatives). It is the main cause of fever when returning from tropical countries and should be considered the first suspect diagnosis in case of febrile subjects coming from endemic areas. Approximately 600 cases per year (30% Italian citizens, 70% foreign citizens) a 5-10% share develops serious malaria, a medical emergency characterized by multiorgan involvement with 40% mortality in the event of no diagnosis or absence of treatment.
~1~
#1 [ABC] [anomaly] [dam]
#2 [ABC] [test] [log]
#3 [command] [download]
~1~
The ABC application is in charge of detecting anomalies in seepage paths in the earth dam.
~2~
Log in test environment are located at http://acme-intranet.es/test/abc/log
~2~
~3~
Example of download from the command line:
wget -e use_proxy=no http://acme-intranet.es/test/abc/log/tracing.log
~3~
~1~