Easy, readable CSS interlinear glosses

css
linguistics
(surely there is a better way to do this now?)
Published

February 28, 2009

CSS should make the creation and sharing of standard interlinear glosses/translations easy; so far it does not. In general, either the text entry or the output (or both!) is absolutely unacceptable. For example:

<h1>John 3:16</h1>
<div class="unit"><p class="gk">οὕτως</p>
<p class="en">such</p></div>
<div class="unit"><p class="gk">γὰρ</p>
<p class="en">for</p></div>
<div class="unit"><p class="gk">ἠγάπησεν</p>
<p class="en">loved</p></div>

 

Notice that any sense of these translated words having some pairwise (let alone phrase, clause, or sentence!) structure is completely shrouded in the cumbersome and obtrusive markup, which would also be a huge pain to type. The output is admittedly really nice and allows one to do some cool things with JavaScript, but the text is essentially unusable in this form. If I wanted 4,000,000 tags per document I’d use XML/XSLT (p.s. yuck).

The goal, it seems, is something more like the entry one uses for gb4e.sty glosses in a LaTeX document:

\begin{exe} 
\ex 
\gll Wenn jemand in die W\"uste zieht ... \\ 
If someone in the desert draws and lives ... \\ 
\trans ‘if one retreats to the desert and ... ’ 
\end{exe} 

 

gb4e/LaTeX takes care of lining up each word with its gloss in the display. In short, the goal is separation of content from display and that’s the entire bloody point of CSS, right?
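
To make the "lining up" step concrete, here is a minimal JavaScript sketch of word-by-word alignment on plain text. It only illustrates the idea and is not how gb4e works internally; the function name alignGloss is made up for this example, and it assumes one gloss word per source word.

```javascript
// Minimal sketch: pad each source/gloss pair to a common column width.
// Assumes exactly one gloss token per source token (no multi-word glosses).
// Widths use String.length (UTF-16 code units), so combining marks or wide
// characters could still misalign in a real display.
function alignGloss(sourceLine, glossLine) {
  const source = sourceLine.trim().split(/\s+/);
  const gloss = glossLine.trim().split(/\s+/);
  if (source.length !== gloss.length) {
    throw new Error(
      `cannot align ${source.length} source words with ${gloss.length} gloss words`
    );
  }
  // Each column is as wide as the longer of the two tokens in it.
  const widths = source.map((word, i) => Math.max(word.length, gloss[i].length));
  const pad = (words) =>
    words.map((word, i) => word.padEnd(widths[i])).join(" ").trimEnd();
  return pad(source) + "\n" + pad(gloss);
}

// A shortened version of the German example above.
console.log(alignGloss("Wenn jemand in die", "If someone in the"));
```

The point of the sketch is that the author types two ordinary lines and the alignment is computed, which is the division of labour the CSS version is missing.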

What I’d really like is text entry like this:

<div class="interlinear">
<p class="source">T&aacute; ceol agam. </p>
<p class="gloss">to be-PRS music-M.PL at-1SG</p>
<p class="target">I am musical</p>
</div>

 

In addition to the problem that CSS is basically incapable of doing any kind of interesting text alignment, this example introduces the complication of aligning multiple words to a single word (here, “to be-PRS” glosses the single word “Tá”). What I have so far is text entry using the CSS display value inline-table, like this:

<div class="interlinear">
<div class="gloss">
    <div class="gll">T&aacute;<br />to be-PRS</div>
    <div class="gll">ceol<br />music-M.PL</div>
    <div class="gll">agam<br />at-1SG</div>
</div>
<p class="translation">I am musical</p>
</div>

 

which, I think, still totally sucks. There’s less markup than the first version so it’s slightly better, but it’s still not very good. Two good things, though, are that the CSS is obvious, simple and standard (interlinear.css) and the output is nice:

  1. Irish

    Tá         ceol        agam
    to be-PRS  music-M.PL  at-1SG

    I am musical

  2. Latin

    In  nomine  Patris  et   Filii  et   Spiritus  Sancti
    in  name    Father  and  Son    and  Spirit    Holy

    In the name of the Father, the Son and the Holy Spirit.

  3. Classical Japanese (Ariwara no Narihira courtesy E. Alpert)

    tsuki  ya  aranu   haru    ya  mukashi   no   haru    naranu
    moon   Q   is-NEG  spring  Q   long.ago  GEN  spring  COP-NEG

    “Isn’t this the moon? And isn’t spring the way it used to be?”

 

I have a little JavaScript hack that will assemble these inline-table divs at run time from the simpler markup I’d really like to type (the source/gloss/target version above). This allows me to write markup in the style I want, but it’s gross and still doesn’t facilitate searching. I’ll update this page when I come up with something better (or you could e-mail me if you have a better idea).
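
A rough sketch of what such a run-time assembly step might look like follows. This is not the actual script: the function names tokenize, escapeHtml, and glossUnits are made up for illustration, and the brace convention for multi-word glosses ({to be-PRS}) is an assumption borrowed loosely from TeX grouping, not something the markup above uses. It works on plain strings so it can be tried outside a browser.

```javascript
// Split a line into alignment units. Whitespace separates units, except that
// a {braced group} counts as a single unit. The brace convention is a
// hypothetical one for this sketch; braces must wrap the whole unit.
function tokenize(line) {
  const matches = line.match(/\{[^}]*\}|\S+/g) || [];
  return matches.map((unit) => unit.replace(/^\{([^}]*)\}$/, "$1").trim());
}

// Escape the characters that would otherwise be read as markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Pair each source unit with its gloss and emit one "gll" div per pair,
// in the same shape as the hand-written inline-table markup above.
function glossUnits(sourceLine, glossLine) {
  const source = tokenize(sourceLine);
  const gloss = tokenize(glossLine);
  if (source.length !== gloss.length) {
    throw new Error(
      `cannot align ${source.length} source units with ${gloss.length} gloss units`
    );
  }
  return source.map(
    (word, i) =>
      `<div class="gll">${escapeHtml(word)}<br />${escapeHtml(gloss[i])}</div>`
  );
}

// The Irish example, with the two-word gloss grouped by braces.
console.log(glossUnits("Tá ceol agam", "{to be-PRS} music-M.PL at-1SG").join("\n"));
```

In a browser, the returned strings could be joined and assigned to the innerHTML of the gloss container, with each gll div styled as an inline-table. The readable source/gloss lines stay in the page source, but the searching problem remains, since the assembled text still interleaves words and glosses.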