My documents and hacks about bidi

"Bidi" is a shorthand term for bidirectional (mixed left-to-right with right-to-left) text support in computers. I'm strongly interested in advancing bidi support in free software. My main interests are:

Hierarchically Implicit Bidi

Currently each document format that supports bidirectional text has special constructs to explicitly assign RTL/LTR direction to elements of the document tree. For the programmer, this means that bidi support must be specified separately for each format, and that every tool that processes or converts these formats must have special support for bidi. This inevitably results in a user experience where bidi information is fragile and easily lost or mangled in conversion.

Here I present a scheme which relies on the carrying format to provide structure, while inferring the direction from the text. It can be applied to any format that structures a document as a tree of elements (the specific kinds of elements are not important for it). A direction is inferred for every element implicitly (requiring at most LRM/RLM to override it). With some luck, this scheme should allow unmodified conversion/generation tools to preserve bidi information.
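To make the idea concrete, here is a minimal sketch in Python. It is not the scheme itself. It assumes a first-strong-character rule, assumes that an element with no strong characters inherits its parent's direction, and uses a hypothetical Element class to stand in for whatever tree the carrying format provides. The real scheme may well infer directions differently (numbers in particular are not handled here).

```python
import unicodedata
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """Hypothetical tree node: an element's own text plus child elements."""
    text: str = ""
    children: List["Element"] = field(default_factory=list)
    direction: Optional[str] = None  # filled in by assign_directions()


def first_strong(text: str) -> Optional[str]:
    """Return "LTR"/"RTL" from the first strong character, or None if none."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "LTR"
        if bidi_class in ("R", "AL"):
            return "RTL"
    return None


def assign_directions(elem: Element, inherited: str = "LTR") -> None:
    """Infer a direction for every element; nothing is stored in the format.

    Assumption: an element without strong characters inherits from its parent.
    """
    elem.direction = first_strong(elem.text) or inherited
    for child in elem.children:
        assign_directions(child, elem.direction)


# LRM (U+200E) is itself a strong left-to-right character, so prefixing it
# is enough to override the direction that would otherwise be inferred.
doc = Element("שלום", [
    Element("hello"),
    Element("123"),
    Element("\u200eשלום"),
])
assign_directions(doc)
directions = [doc.direction] + [c.direction for c in doc.children]
```

Because the direction is recomputed from the text, a converter that copies the text and the tree structure unchanged would carry the bidi information along without knowing anything about it.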

Status: A bit rough (unresolved issues with numbers, some text duplication). I hope to start an implementation soon.

Explicit BiDi Codes Considered Awkward

[This is a largely unfinished document that led to Hierarchically Implicit Bidi. It sheds a bit more light on the problems I see with the explicit bidi codes but doesn't describe the solution.]

I argue that the explicit embedding codes specified in UAX 9 (LRE, RLE, LRO, RLO, PDF) are awkward, both for the programmer, who needs to know too much in order to emit them, and for the user, who gets a stream contaminated with too much semi-visual information, which reduces its usefulness as logical-order text. I also argue that the flat document model targeted by UAX 9 is insufficient for defining the bidirectional behavior of real-life documents.

I propose an alternative scheme for implicit bidi in (possibly) hierarchical documents and an alternative set of codes with better properties, which together with RLM/LRM (against which I have nothing) allow representing embedded text in purely logical order and more implicitly.

Status: Largely unfinished.

Mock-up of a bidi editing style

Here I show some ideas on how a good (IMHO) bidi editor would work. The central idea is providing visual feedback by putting some kind of border around an embedding and allowing control of embedding boundaries by positioning the cursor inside/outside the embedding.

Status: Conveys the original idea, but I've had many sub-ideas about it since...