Here I show some ideas on how a good (IMHO) bidi editor would work.
Since bidi display is ambiguous and it is easy to get into inconsistent states while editing, it's important to make the embedding structure evident to the user. The editor should highlight the embedding structure in some way: not only the explicit embeddings but also the implicit embeddings detected by the bidi algorithm. I.e. it should behave as if virtual embedding characters were present at the boundaries of implicit embeddings. The difference between explicit and implicit embedding characters can be hidden completely, as will be discussed below.
One approach is putting borders around the embeddings, as in TeXmacs (which has no bidi support but is a good structured editor):
english start HEBREW START 123 HEBREW END english end
(all examples use the CapRTL convention: capital letters are Hebrew).
This is too heavy for normal use, of course. A lighter approach would be overlines only:
english start HEBREW START 123 HEBREW END english end
Or some kind of brackets (CSS2 needed to see them):
english start HEBREW START 123 HEBREW END english end
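To make the bracket idea concrete, here is a minimal sketch that assigns a level to every character and prints brackets wherever the level changes. It is a heavily simplified stand-in for the Unicode bidi algorithm, not an implementation of it: it assumes the CapRTL convention, a left-to-right paragraph, and only four character classes. The names `char_class`, `implicit_levels` and `show_brackets` are made up for this illustration.

```python
def char_class(ch):
    """CapRTL classes: capitals are RTL, lowercase is LTR, digits are numbers."""
    if ch.isupper():
        return "R"
    if ch.islower():
        return "L"
    if ch.isdigit():
        return "EN"
    return "N"  # everything else is treated as neutral


def implicit_levels(text):
    """Simplified implicit levels for a left-to-right paragraph (base level 0)."""
    classes = [char_class(ch) for ch in text]

    # A number following LTR text (or at the start) behaves like LTR text.
    last_strong = "L"
    for i, cls in enumerate(classes):
        if cls in ("L", "R"):
            last_strong = cls
        elif cls == "EN" and last_strong == "L":
            classes[i] = "L"

    def side(cls):
        # For resolving neutrals, numbers count as RTL.
        return "R" if cls in ("R", "EN") else "L"

    # A run of neutrals takes the direction of its neighbours if they agree,
    # otherwise the base (LTR) direction.
    i, n = 0, len(classes)
    while i < n:
        if classes[i] != "N":
            i += 1
            continue
        j = i
        while j < n and classes[j] == "N":
            j += 1
        before = side(classes[i - 1]) if i > 0 else "L"
        after = side(classes[j]) if j < n else "L"
        resolved = before if before == after else "L"
        classes[i:j] = [resolved] * (j - i)
        i = j

    return [{"L": 0, "R": 1, "EN": 2}[cls] for cls in classes]


def show_brackets(text, levels):
    """Insert one bracket per level change; the outermost level gets none."""
    out, prev = [], 0
    for ch, lvl in zip(text, levels):
        if lvl > prev:
            out.append("[" * (lvl - prev))
        elif lvl < prev:
            out.append("]" * (prev - lvl))
        out.append(ch)
        prev = lvl
    out.append("]" * prev)
    return "".join(out)


text = "english start HEBREW START 123 HEBREW END english end"
print(show_brackets(text, implicit_levels(text)))
# english start [HEBREW START [123] HEBREW END] english end
```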
Another direction for making it less intrusive is to show only some of the boundaries. First, the outermost boundary should never be shown:
english start HEBREW START 123 HEBREW END english end
Also, we don't have to show the boxes all over the document. Show only the boxes inside which the cursor is, or boxes directly near it. This is the approach taken by TeXmacs. As a further extension of this, we can show only the innermost of these boxes. Guaranteeing that no more than one box is highlighted at a time is nice because it avoids the need to waste vertical space (unless the brackets style is used).
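Finding the single box to highlight is cheap if the structure is stored as one level per character. The sketch below assumes exactly that, plus a cursor described by a gap index and the level of the box it sits in; both are assumptions of this illustration, and `innermost_box` is a hypothetical name.

```python
def innermost_box(levels, index, level):
    """Return (start, end) of the box holding the cursor, or None.

    levels: one embedding level per character, base level 0.
    index:  the gap the cursor is in (0 = before the first character).
    level:  the level of the box the cursor is in.
    """
    if level == 0:
        return None  # the outermost boundary is never shown
    start = index
    while start > 0 and levels[start - 1] >= level:
        start -= 1
    end = index
    while end < len(levels) and levels[end] >= level:
        end += 1
    return start, end


# Levels for "a BC 12 DE f" in the CapRTL convention.
levels = [0, 0, 1, 1, 1, 2, 2, 1, 1, 1, 0, 0]
print(innermost_box(levels, 6, 2))  # (5, 7): just the number
print(innermost_box(levels, 3, 1))  # (2, 10): the whole Hebrew run
print(innermost_box(levels, 1, 0))  # None
```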
I won't address the question of visual vs. logical cursor movement in depth here; IMHO both should be provided, e.g. visual on the arrow keys and logical on C-f / C-b.
The more important thing about cursor movement is its behaviour at embedding boundaries. The wonderful thing about displaying boundaries is that you can behave as if there were a virtual character at the boundary, so the cursor can be positioned either inside the embedding or outside, and you need to press a key to cross it. This solves all issues about neutral characters on boundaries.
In the following examples, the cursor position is shown by the red paren ")". It moves one position forward each time. The examples show logical movement, but the same could be done with visual movement; it would just enter the embeddings from the other side.
english start) HEBREW START 123 HEBREW END english end
english start )HEBREW START 123 HEBREW END english end
english start )HEBREW START 123 HEBREW END english end
english start H)EBREW START 123 HEBREW END english end
And some more:
english start HEBREW START) 123 HEBREW END english end
english start HEBREW START )123 HEBREW END english end
english start HEBREW START )123 HEBREW END english end
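The movement shown in these examples can be sketched as follows, assuming one level per character and a cursor that is a pair of gap index and level. Crossing a boundary changes only the cursor's level; passing a character changes only its index. This is one possible model, and `move_forward` is a hypothetical name.

```python
def move_forward(levels, index, level):
    """One logical step forward; returns the new (index, level).

    levels: one embedding level per character, base level 0.
    A boundary behaves like a virtual character: crossing it costs one step
    and changes only the cursor's level.
    """
    # Level of the character after the cursor (base level at end of text).
    nxt = levels[index] if index < len(levels) else 0
    if level < nxt:
        return index, level + 1   # step into an embedding
    if level > nxt:
        return index, level - 1   # step out of an embedding
    if index < len(levels):
        return index + 1, level   # pass an ordinary character
    return index, level           # already at the very end


# "a BC": the cursor starts after "a", outside any embedding.
levels = [0, 0, 1, 1]
cursor = (1, 0)
for _ in range(5):
    cursor = move_forward(levels, *cursor)
    print(cursor)
# (2, 0) then (2, 1) then (3, 1) then (4, 1) then (4, 0)
```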
Here is how this would support simply typing text in logical order. Recall that the embeddings we are talking about can be implicit: the moment you type a character of another directionality, a new embedding is created. As you continue typing the embedded text, the cursor stays at the end of the embedding. Should it be just before the end or outside it? Consider typing a punctuation character after the embedding. According to the Unicode bidi algorithm, it should go outside the embedding (which is usually right). So the cursor when typing should stay outside:
english, )
english, H)
english, HEB)
english, HEBREW)
english, HEBREW,)
english, HEBREW, eng)
english, HEBREW, english again)
This is also simpler to understand: the cursor doesn't enter/exit an embedding unless you explicitly move it. As long as you typed Hebrew, the letters were actually added outside the embedding but were immediately joined with it.
Note that if you wanted to put the closing comma inside the embedding, all you would have to do is press the arrow key once before typing it, moving the cursor inside the embedding.
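Here is a minimal sketch of this typing behaviour, under the same assumptions as the examples: CapRTL letters, one level per character, and a cursor made of a gap index and a level. Digits are treated as neutrals purely to keep the sketch short, which is not what a real editor should do. `type_char` is a hypothetical name.

```python
def type_char(text, levels, index, level, ch):
    """Insert ch at the cursor; returns (text, levels, index, level).

    A strong character whose direction does not fit the cursor's level goes
    one level deeper (a new embedding, or the adjacent one if it exists).
    The cursor itself never changes level while typing.
    """
    if ch.isupper():
        want_odd = True      # RTL letters need an odd level
    elif ch.islower():
        want_odd = False     # LTR letters need an even level
    else:
        want_odd = None      # neutral: takes the cursor's level
    if want_odd is None or (level % 2 == 1) == want_odd:
        ch_level = level
    else:
        ch_level = level + 1
    text = text[:index] + ch + text[index:]
    levels = levels[:index] + [ch_level] + levels[index:]
    return text, levels, index + 1, level


state = ("english, ", [0] * 9, 9, 0)
for ch in "HEBREW, eng":
    state = type_char(*state, ch)
text, levels, index, level = state
print(text)    # english, HEBREW, eng
print(levels)  # nine 0s, six 1s for HEBREW, then five 0s
```

Because adjacent characters at the same level simply form one run, the "immediately joined" step needs no extra code in this representation.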
Now let's see in more detail how embeddings are managed.
The basic idea is that the user should not have to understand RLM/LRM vs. RLE/LRE/PDF (no interface for LRO/RLO is presented here but it's not hard to imagine). Instead, all he sees are these embedding boxes. Upon loading, bidi control codes are parsed; upon saving, they are generated (the minimal needed set: probably RLM/LRM in innermost embeddings, LRE/RLE/PDF as needed in outer embeddings).
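As a rough sketch of the load/save step, the code below converts between LRE/RLE/PDF codes and per-character levels. It handles explicit embeddings only: implicit levels, LRM/RLM, the overrides and the depth limit of the Unicode algorithm are all left out, and the generator emits an embedding code for every level change rather than the minimal set mentioned above. The code points are the real Unicode ones; the function names are made up.

```python
LRE, RLE, PDF = "\u202a", "\u202b", "\u202c"


def parse_codes(raw):
    """Strip LRE/RLE/PDF from raw text; return (text, explicit levels)."""
    text, levels, stack, level = [], [], [], 0
    for ch in raw:
        if ch == RLE:
            stack.append(level)
            level = (level + 1) | 1      # next odd level
        elif ch == LRE:
            stack.append(level)
            level = (level + 2) & ~1     # next even level
        elif ch == PDF:
            if stack:                    # an unmatched PDF is ignored
                level = stack.pop()
        else:
            text.append(ch)
            levels.append(level)
    return "".join(text), levels


def generate_codes(text, levels):
    """Emit one embedding code per level change (not the minimal set)."""
    out, level = [], 0
    for ch, target in zip(text, levels):
        while level < target:
            level += 1
            out.append(RLE if level % 2 else LRE)
        while level > target:
            level -= 1
            out.append(PDF)
        out.append(ch)
    out.append(PDF * level)              # close whatever is still open
    return "".join(out)


text = "a BC 12 d"
levels = [0, 0, 1, 1, 1, 2, 2, 0, 0]
raw = generate_codes(text, levels)
print(parse_codes(raw) == (text, levels))  # True
```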
When typing, embeddings can implicitly appear or disappear. Let's define the precise circumstances. First some invariant constraints:
Embeddings always alternate: when an embedding is a direct child of another embedding, they always have opposite directionality.
No unnecessary nesting: an embedding's content can't be just another embedding. If such a situation arises, both are eliminated and the inner embedding's content is inlined into the parent:
english )Hinner
if we now delete the H, the RTL embedding becomes pointless and is eliminated, giving:
english ) inner
No adjacent embeddings: when nothing separates two embeddings, they become one:
english start HEBREW)xANOTHER HEBREW english end
If we remove the x, the two embeddings are merged:
english start HEBREW )ANOTHER HEBREW english end
No empty embeddings.
These constraints, I believe, are equivalent to using an embedding level attached to every character to describe the structure (except for the "no unnecessary nesting" constraint: it's not inherent in using levels; it means that you use the minimal levels that would achieve the same order).
Typing/removing neutral characters can't change embedding boundaries. Neutrals go into the embedding where the cursor currently is.
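To illustrate the equivalence, here is a small sketch that rebuilds the nested boxes from per-character levels. Nested Python lists stand in for boxes, and `to_boxes` is a hypothetical name. Empty and adjacent embeddings cannot be expressed at all in the level form, while a box containing only another box can, which matches the exception noted above.

```python
def to_boxes(text, levels):
    """Rebuild nested boxes (lists) from one level per character (base 0)."""
    root = []
    stack = [root]                       # stack[d] is the open box at depth d
    for ch, lvl in zip(text, levels):
        while len(stack) - 1 > lvl:      # close boxes that are too deep
            stack.pop()
        while len(stack) - 1 < lvl:      # open boxes until deep enough
            box = []
            stack[-1].append(box)
            stack.append(box)
        current = stack[-1]
        if current and isinstance(current[-1], str):
            current[-1] += ch            # extend the current run of text
        else:
            current.append(ch)
    return root


# Two Hebrew runs separated by an x, then the same text with the x deleted.
print(to_boxes("ABxCD", [1, 1, 0, 1, 1]))  # [['AB'], 'x', ['CD']]
print(to_boxes("ABCD", [1, 1, 1, 1]))      # [['ABCD']]

# Levels alone do not rule out a box that holds nothing but another box.
print(to_boxes("a12b", [0, 2, 2, 0]))      # ['a', [['12']], 'b']
```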
Typing strongly directional characters can re-shape embeddings, since they can't live in an embedding of the wrong direction.
So I think it should always create a new embedding; besides, it's more consistent, which means less effort from the user to understand.
@@@ I should describe here behavior when deleting things.
@@@ Behavior for numbers should be defined here. Basically numbers create implicit embeddings too.
When cutting-and-pasting, the embedding info is taken with the text. To handle cut-and-paste between applications, the embedding info should be converted to bidi codes when copying and parsed when pasting.
@@@ How should complex operations (like search-and-replace, etc.) be handled?
What happens when a user creates the wrong embedding structure and wants to change it? He needs a way to modify the structure. The behaviors described above, together with cut-and-paste, allow building any new structure, but that's too cumbersome.
The ideal device for this is probably the one provided in LyX (with adaptations to inline editing; LyX only has it at paragraph level): the user can select text and then (with toolbar buttons, hotkeys or whatever) increase or decrease its nesting level.
It also makes sense to be able to increase or decrease the level of the cursor itself when there is no text selected. Then if you type new text it takes this level; if you move out, it's lost. This means you can have a (transient) embedding containing only the cursor. This can be explained and implemented by treating the cursor itself as a character. This model would also allow the cursor to be on either side of an embedding boundary if the internal representation is based on levels, so boundaries take no space in the buffer (then the implementation of movement is trickier).
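With a level-based buffer, the selection command is almost a one-liner. The sketch below assumes one level per character and a half-open selection range; `change_level` is a hypothetical name. It does not re-check the constraints afterwards (for example whether the new levels still suit the direction of the characters), which a real editor would have to do.

```python
def change_level(levels, start, end, delta):
    """Raise or lower the level of characters start..end-1, never below 0."""
    changed = [max(0, lvl + delta) for lvl in levels[start:end]]
    return levels[:start] + changed + levels[end:]


levels = [0, 0, 0, 0]
deeper = change_level(levels, 1, 3, +1)
print(deeper)                           # [0, 1, 1, 0]
print(change_level(deeper, 1, 3, -1))   # [0, 0, 0, 0]
```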