@neauoire that would be step 1. Then you would need to support both LTR and RTL at the same time. Then you need to figure out how to render the actual Arabic script :D
@neauoire nope. It doesn't even have individual characters when it comes to drawing them, as far as I know.
@cancel ah yeah that's true, the diacritics and stuff are added above the characters. It's really not something I'd feel comfortable tackling haha.
@cancel @neauoire i know @nasser has some relevant thoughts on this, e.g. https://merveilles.town/@nasser/105352373639070650
@chirrolafupa @cancel @neauoire lol I wrote and deleted a response to your original post about the text editor with this exact thought. getting non-Latin scripts right in the general case is in the same category of difficulty as getting cryptography right, effectively impossible for an individual. you need to use a library like harfbuzz, and then there is considerable work on top of that. the end result will not be small or simple, unfortunately, because human writing is not small or simple...
@chirrolafupa @cancel @neauoire that's all true within the current paradigm of digital text rendering, which is ultimately derived from movable type theory, and that alone disadvantages cursive languages (ask me some time about the history of the printing press in the Middle East). there's nothing preventing a radically different approach to text rendering and fonts, however! things might be different if taken from a different perspective.
@akkartik @chirrolafupa @cancel @neauoire so much to say! the first printed arabic text was a quran made in venice in the 15th century, the idea being that they would sell it to muslim people as a product. they pulled it off, but the result was so catastrophically ugly that no one wanted it lol. attached images are the venetian quran and an arabic quran from the same time period. I don't know how obvious the differences are to someone who doesn't read arabic, but trust me it sucks.
@akkartik @chirrolafupa @cancel @neauoire it wasn't until the 18th and 19th century that you started to get competently printed Arabic text, mostly in Cairo and Beirut. they did this by carving many many variations for each letter to accommodate the different ways it might bend and shape itself in a word, in addition to carving whole words into blocks. the result is that you have hundreds upon hundreds of blocks, as opposed to a single block for each letter.
@akkartik @chirrolafupa @cancel @neauoire this is a hack of course and like all hacks it has its limitations. once the typewriter comes around, you kind of do have to pick a single image for each letter. in movable type it's inconvenient to have hundreds and hundreds of blocks, but in a typewriter it's just impossible. as I understand it, this is when arabic as a written language gets most of its nuance and flourish flattened out of it to accommodate western tech
@akkartik @chirrolafupa @cancel @neauoire so the combination of movable type (letters have a finite number of visual presentations, words are constructed by arranging rectilinear images of letters in two-dimensional space) and typewriters (every letter is connected to a button) is fundamentally incompatible with arabic as a writing system, and arabic has had to change to accommodate them, and is still not very well supported as a result. things only get worse with computers 😬 </history>
@neauoire @nasser @akkartik @chirrolafupa @cancel one question I meant to ask during FennelConf when I presented my structural editor was that while I know the love2d UI would be unable to support Arabic, it seems like terminals exist that do an OK job of it. is it easier for terminal/curses-based programs to support Arabic text as long as they're running inside a harfbuzz-enabled terminal, or are there still a lot of minefields there?
@nasser @neauoire @akkartik @chirrolafupa @cancel for instance, here's the line-based UI for my editor running inside mlterm showing some code from the قلب readme. if the program did not mix scripts, would it have a decent chance at working correctly without any application-level changes to support Arabic?
@akkartik @chirrolafupa @cancel @neauoire not for Arabic in general, because the letter forms vary pretty significantly from calligraphic style to calligraphic style. certain styles are better documented online than others, and for the more ornate ones you just need to study with an expert to be able to write them (and they are quite hard to read as well). I learned the handwriting style I know (ruqaa) from a book and square kufic from this calligrapher http://www.sakkal.com/instrctn/sq_kufi_alphabet.html
@nasser Does it make sense to try to separate calligraphic style from the picture? The analogy I'm thinking of is that I might hand-write the (arabic!) numeral 7 with a stroke in the middle, but I don't think about whether my computer does the same. Not sure if that's quite what you mean by 'calligraphic style'. Please tell me if I'm misunderstanding. And I don't mean to exclude all considerations of shape.
@akkartik @chirrolafupa @cancel @neauoire yeah calligraphic style roughly (very roughly) maps to "font" if that's helpful, so it's kind of like presence/absence of the dash in the 7. digital text *has* basically disregarded all but the bare minimal letter forms to produce legible arabic, so what you're describing is the status quo.
@rra @akkartik @chirrolafupa @cancel @neauoire (if anyone wants to be left off of these replies please lmk) the approach of piles of ligatures is basically what modern systems do. opentype supports ligatures and arabic fonts go to town on them, even for full words in the cases of really nice fonts. re: native text rendering, i've never been able to find anything that was not a hack on top of existing systems, usually western or japanese.
@rra @akkartik @chirrolafupa @cancel @neauoire the earliest evidence of arabic text i can find are the Kuwaiti Sakhr computers from ~1985, but they are custom firmware for MSX chips, and use hacks on top of the existing MSX font tech as i understand it, so not quite native. (i managed to get one of these on ebay and i have a cartridge for an arabic logo programming language!)
I don't speak Arabic but I made some progress in learning how to read and write it so that I could navigate Arabic music recordings (for example on YouTube). I've studied languages before but I've never encountered a writing system like this. I couldn't have even imagined what I didn't know!
@nasser Right. I want to improve the status quo -- but also not boil the ocean. The problem coming into focus for me: render text more fluidly and with shape, but supporting just a single style/font, so that it looks non-sucky to all readers of Arabic.
Still a hard problem. Does this seem worth attempting?
Perhaps print-text() should always accept and return a bounding box. And sometimes error: "you can't print x followed by y, you must print xy together."
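To make that idea concrete, here's a rough Python sketch of what such a call might look like. Everything in it is an assumption made up for illustration: `print_text` is only a Python stand-in for the `print-text()` mentioned above, `Box` and `MustPrintTogether` are hypothetical, the `JOINS_FORWARD` set is a tiny hand-picked sample rather than real Unicode joining data, and the "one cell per letter" metric is a toy. Nothing is actually drawn and right-to-left layout is ignored.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x: int
    y: int
    width: int
    height: int


class MustPrintTogether(ValueError):
    """Raised when two fragments would need to be shaped as one unit."""


# Assumption: a small hand-picked sample of Arabic letters that connect to the
# letter after them. A real system would use the Unicode joining data instead.
JOINS_FORWARD = set("بتثسشلمني")


def is_arabic_letter(ch: str) -> bool:
    return unicodedata.category(ch) == "Lo" and "ARABIC" in unicodedata.name(ch, "")


def print_text(text: str, box: Box, previous: str = "") -> Box:
    """Pretend to draw `text` inside `box` and return the box actually used."""
    if previous and text:
        first = text[0]
        # Last non-combining character of whatever was printed before.
        bases = [ch for ch in previous if not unicodedata.combining(ch)]
        last = bases[-1] if bases else ""
        if unicodedata.combining(first):
            raise MustPrintTogether(
                f"{first!r} is a mark on the previous text; print them together")
        if last in JOINS_FORWARD and is_arabic_letter(first):
            raise MustPrintTogether(
                f"{last!r} joins to {first!r}; print them together")
    # Toy metric: one cell per character, combining marks take no width.
    width = sum(1 for ch in text if not unicodedata.combining(ch))
    if width > box.width:
        raise ValueError("text does not fit in the given box")
    return Box(box.x, box.y, width, 1)
```

The point of returning a `Box` is that the caller never assumes how wide a piece of text ends up; the point of the error is that the caller can't split a word at a place where the letters would have changed shape.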
@waterbear @akkartik @chirrolafupa @cancel @neauoire not quite. arabic is an alphabet (technically an "abjad") with a relatively average number of letters that have a large number of visual variations because the script is cursive. chinese as I understand has a large number of characters to begin with, and they don't have required visual variations and the script is not cursive. they're both poorly served by computers though, just for different reasons 🙃
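A small way to see the "visual variations" point from inside Unicode itself: the legacy Arabic Presentation Forms-B block has separate compatibility codepoints for the isolated, final, initial and medial shape of each letter. The sketch below (the function name `presentation_forms` is made up for this example) finds them for one letter using only Python's `unicodedata`. This shows just the bare-minimum positional forms; modern fonts pick shapes through shaping rules rather than these codepoints, and real calligraphic variation goes well beyond four shapes.

```python
import unicodedata


def presentation_forms(letter: str) -> dict[str, str]:
    """Map position name -> legacy presentation-form character for `letter`."""
    forms = {}
    # Arabic Presentation Forms-B lives in U+FE70..U+FEFF.
    for cp in range(0xFE70, 0xFF00):
        ch = chr(cp)
        # Compatibility normalization folds a positional form back to its letter.
        if unicodedata.normalize("NFKC", ch) == letter:
            name = unicodedata.name(ch, "")
            # e.g. "ARABIC LETTER BEH INITIAL FORM" -> "INITIAL"
            forms[name.split()[-2]] = ch
    return forms


forms = presentation_forms("\u0628")  # the letter beh
for position, ch in sorted(forms.items()):
    print(position, f"U+{ord(ch):04X}", ch)
```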
Standing offer: I would love to collaborate on a computing stack for a non-English language by forking https://github.com/akkartik/mu. It's very barebones and not afraid of radical experiments. It already has an API for rendering arbitrary utf-8 strings and returning arbitrarily-sized bounding boxes for what was rendered. So the work is mostly creating glyphs for combinations of codepoints. And rules for segmenting. And lots of testing.
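As a very rough idea of what "rules for segmenting" could start from, here is a Python sketch (not Mu code, and not Mu's API) that groups each base character with the combining marks that follow it, so that each group could become one key into a glyph table. The `segment` function is hypothetical, and this is a deliberate simplification: proper grapheme cluster segmentation follows the Unicode text segmentation rules, and Arabic would additionally need letters grouped by how they join.

```python
import unicodedata


def segment(text: str) -> list[str]:
    """Split text into clusters: a base character plus its combining marks."""
    clusters: list[str] = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            # Diacritics attach to the cluster that precedes them.
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# beh + fatha, yeh + sukun, teh
word = "\u0628\u064E\u064A\u0652\u062A"
for cluster in segment(word):
    print([f"U+{ord(ch):04X}" for ch in cluster])
```

Each printed list is one candidate "combination of codepoints" that would need its own glyph (or a rule for composing one).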