EmbeddedRelated.com
Forums
The 2026 Embedded Online Conference

Decimal Point vs. Decimal Comma

Started by rickman April 2, 2016
Ron Aaron <rambamist@gmail.com> writes:



>On 03/04/2016 20:24, upsidedown@downunder.com wrote:
>> In fact it would be a great thing, if you could use UTF-8 in
>> programming, e.g. Greek letters Alpha and Beta as program variables ?
>> No need to do any transliteration from any textbook variables.
>
>You can. In 8th you certainly can, and I don't see why a standard Forth
>would disallow a UTF-8 name.

You can have a certain leeway as implementor, because non-ASCII characters are "implementation defined". This makes the use of the IBM box-printing characters a portable program with dependencies.

I've a different strategy in ciforth. A name is a string of bytes, not characters. That is why I detest case insensitivity. The interpretation mechanism just looks up a byte string. So a name can contain sequences that turn the current foreground color green, then red, so as to have names that look like colorforth if printed on an appropriate device (a VT100 would not do).

Instead of manipulating >IN, ciforth has lifted this to an ever so slightly higher abstraction, PP@@. PP@@ returns an incremented parse pointer and the next character (which must fit in 64 bits). By revectoring PP@@ one can redefine how word names are interpreted into characters or escape sequences.

A practical application is com-4e5. Any function keys are just stored in the dictionary. Press a function key while in communication mode and it will just be executed. Press an as yet undefined key and you get the chance to define it.

If you use this, your program is no longer ASCII and loses the epithet "Forth with environmental dependencies".

Regards, Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
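
To make the "a name is a string of bytes" idea concrete, here is a minimal Python sketch. It is not ciforth code and does not model PP@@; the names `dictionary`, `define` and `lookup` are hypothetical and chosen only for illustration. It shows that a lookup keyed on raw bytes is automatically case sensitive and accepts UTF-8 or escape sequences in names without knowing anything about characters.

```python
from typing import Dict, Optional

# Hypothetical word list keyed by the exact bytes of each name.
dictionary: Dict[bytes, str] = {}


def define(name: bytes, meaning: str) -> None:
    """Store a word under its exact byte sequence: no decoding, no case folding."""
    dictionary[name] = meaning


def lookup(name: bytes) -> Optional[str]:
    """Find a word by exact byte-for-byte match, or return None."""
    return dictionary.get(name)


define(b"dup", "duplicate top of stack")
define("αβ".encode("utf-8"), "word with a Greek name")
# An ANSI colour escape sequence is just more bytes inside the name.
define(b"\x1b[32mgo\x1b[31m", "word whose name switches colours")

print(lookup(b"dup"))
print(lookup(b"DUP"))  # None, because the bytes differ
print(lookup("αβ".encode("utf-8")))
```

The design point is that the lookup never has to decide what a "character" is, which is why case folding does not fit into such a scheme.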

On 04/04/2016 12:17, Albert van der Horst wrote:

> You can have a certain leeway as implementor, because non-ASCII characters
> are "implementation defined". This makes the use of the IBM box-printing
> characters a portable program with dependencies.

Right; I assumed that since Forths parse whitespace-delimited words, any bytes which are not whitespace would be ok for names.

> I've a different strategy in ciforth. A name is a string of bytes,
> not characters. That is why I detest case insensitivity.

Me too :o ...

> A practical application is com-4e5 . Any function keys are just
> stored in the dictionary. Press a function key while in communication
> mode and it will just be executed. Press an as yet undefined key
> and you get the chance to define it.

That is very cool, hadn't thought of that idea.

On Mon, 04 Apr 2016 07:13:44 +0200, Robert Wessel <robertwessel2@yahoo.com> wrote:

> On Sun, 03 Apr 2016 20:24:10 +0300, upsidedown@downunder.com wrote:
>> On Sun, 3 Apr 2016 14:18:56 +0200, Hans-Bernhard Bröker
>> <HBBroeker@t-online.de> wrote:
>>> On 02.04.2016 at 23:49, rickman wrote:
>> [...]
>>
>> In fact it would be a great thing, if you could use UTF-8 in
>> programming, e.g. Greek letters Alpha and Beta as program variables ?
>> No need to do any transliteration from any textbook variables.
>
> You've always been able to do that in Java, although it's UCS-2 (old
> versions) or UTF-16 (current), not UTF-8.

javac has the -encoding option, so you can use anything you want.

--
(Remove the obvious prefix to reply privately.)
Made with Opera's e-mail program: http://www.opera.com/mail/

On Mon, 04 Apr 2016 07:11:32 +0000, Anton Ertl wrote:

>>Forth-2012 with the Xchar wordset allows all XChar printable characters
>>as part of a Forth word (usually UTF-8).
>
> Actually Forth-2012 falls slightly short of actually guaranteeing that
> this works, because it does not require that the Forth system supports
> UTF-8.

In the rationale, we recommend implementing support for UTF-8. It's a weak recommendation, not a strong requirement, so it remains to be seen what kind of common practice will emerge.

Most systems that have been tried with UTF-8 sources have no special support for it, e.g. the amForth programs in VF 2015/03-04 are compiled on an amForth without any xchar extension. VFX and SwiftForth also don't come with xchars compiled into the default image, and yet UTF-8-based source code works on both. There are some minor issues in the command line editor when you try editing that stuff, which don't exist in an xchar-aware command line editor like Gforth's, though.

But we have to implement a good locate before complaining about VFX again ;-).

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
net2o ID: kQusJzA;7*?t=uy@X}1GWr!+0qqp_Cn176t4(dQ*
http://bernd-paysan.de/
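
One plausible reason UTF-8 sources work on systems without any xchar support is that every byte of a multi-byte UTF-8 sequence has its high bit set, so such bytes can never be mistaken for ASCII whitespace. The Python sketch below illustrates that property under stated assumptions: `parse_words` is a hypothetical stand-in for a byte-oriented Forth parser, not code from amForth, VFX or SwiftForth.

```python
from typing import List


def parse_words(source: bytes) -> List[bytes]:
    """Split on ASCII whitespace only, as a byte-oriented parser would."""
    return source.split()


# A source line with Greek word names, stored as UTF-8.
source = ": α 1 ; : β 2 ; α β +".encode("utf-8")
words = parse_words(source)

# Every byte of a non-ASCII character is >= 0x80, so none equals
# space, tab, CR or LF, and the names survive tokenizing intact.
non_ascii_bytes = "αβ€".encode("utf-8")
print(all(b >= 0x80 for b in non_ascii_bytes))
print(words)
```

The parser never decodes anything; it only needs the names to come back out as the same byte strings that went in.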
On Sun, 03 Apr 2016 20:24:10 +0300, upsidedown wrote:

> What is the problem of using Latin-1 (ISO 8859-1) in a programming
> language ?
>
> In fact it would be a great thing, if you could use UTF-8 in
> programming, e.g. Greek letters Alpha and Beta as program variables ?
> No need to do any transliteration from any textbook variables.

The problem is that it raises the bar for any system needing to touch the code. It creates problems for printing the code using non-graphical printers, editing the code on a character-based terminal, transmitting it through a 7-bit channel, or embedding it in any data format with a limited character set. Most computer systems only allow a small number of characters to be entered without having to resort to numeric codes or searching through tables for something to be clicked with the mouse. Homoglyphs create confusion. Even regular punctuation is bad enough, as most of it requires using the little fingers or reaching for the top row, possibly along with the use of the shift (or even AltGr) key.
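
The homoglyph point can be illustrated with a short Python sketch using the standard `unicodedata` module. The helper `scripts_in` is hypothetical and uses the first word of each character's Unicode name as a rough stand-in for its script, which is an assumption made for brevity, not a proper script lookup.

```python
import unicodedata
from typing import Set

latin_a = "A"            # U+0041
greek_alpha = "\u0391"   # looks like "A" in most fonts
cyrillic_a = "\u0410"    # looks like "A" in most fonts

for ch in (latin_a, greek_alpha, cyrillic_a):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# The three are distinct characters, so they would be distinct identifiers.
print(latin_a == greek_alpha, latin_a == cyrillic_a)


def scripts_in(identifier: str) -> Set[str]:
    """Rough script check: first word of each letter's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in identifier if ch.isalpha()}


# "max" with a Cyrillic "a" in the middle mixes two scripts.
print(scripts_in("m\u0430x"))
```

A check like this could flag mixed-script names, but it does nothing for a name written entirely in a look-alike script.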
On Tue, 05 Apr 2016 03:32:54 +0100, Nobody <nobody@nowhere.invalid>
wrote:

>On Sun, 03 Apr 2016 20:24:10 +0300, upsidedown wrote:
>
>> What is the problem of using Latin-1 (ISO 8859-1) in a programming
>> language ?
>>
>> In fact it would be a great thing, if you could use UTF-8 in
>> programming, e.g. Greek letters Alpha and Beta as program variables ?
>> No need to do any transliteration from any textbook variables.
>
>The problem is that it raises the bar for any system needing to touch the
>code.

Windows NT has supported Unicode from the beginning in the 1990's (UTF-16) and Linux for a long time (UTF-8). Smart phones also have some kind of support.

> It creates problems for printing the code using non-graphical
>printers,

Can you still buy printers with built in character ROMs ?
>editing the code on a character-based terminal,
Anybody using VT-100 terminals for program development ?

>transmitting it
>through a 7-bit channel,

Latin-x (8859-x) character sets support most European languages and they require 8-bit characters. If you insist on using 7 (or 6) bit channels, some encapsulation method is needed to carry the 8-bit Latin character sets. The same encapsulation method should be able to carry UTF-8.

>or embedding it in any data format with a limited
>character set.
>
>Most computer systems only allow a small number of characters to be
>entered without having to resort to numeric codes or searching through
>tables for something to be clicked with the mouse. Homoglyphs create
>confusion.

I have done (with code pages) bilingual (Latin/Cyrillic and Latin/Arabic) systems. Usually you had a special key for toggling between languages, no big deal. These days, with any touch-screen virtual keyboard, it is even more trivial.

>Even regular punctuation is bad enough, as most of it requires using the
>little fingers or reaching for the top row, possibly along with the use of
>the shift (or even AltGr) key.

Your arguments might have been valid in the beginning of this century, but these days they are non-issues.
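
The claim that one encapsulation method carries both Latin-1 and UTF-8 over a 7-bit channel can be sketched in Python with Base64 from the standard library. Base64 is only one possible encapsulation and is picked here as an assumption; the helper names `to_7bit` and `from_7bit` are hypothetical.

```python
import base64


def to_7bit(payload: bytes) -> bytes:
    """Wrap arbitrary 8-bit data in Base64, which uses only ASCII bytes."""
    return base64.b64encode(payload)


def from_7bit(wire: bytes) -> bytes:
    """Recover the original 8-bit data from its Base64 form."""
    return base64.b64decode(wire)


utf8_payload = "α + β = γ".encode("utf-8")
latin1_payload = "naïve café".encode("latin-1")

for payload in (utf8_payload, latin1_payload):
    wire = to_7bit(payload)
    # Every byte on the wire fits in 7 bits, and the round trip is lossless.
    print(all(b < 0x80 for b in wire), from_7bit(wire) == payload)
```

The wrapper treats the payload as opaque bytes, so it neither knows nor cares which character encoding is inside.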
Nobody <nobody@nowhere.invalid> writes:
>The problem is that it raises the bar for any system needing to touch the
>code. It creates problems for printing the code using non-graphical
>printers

Welcome to the 1990s. Daisy-wheel printers, chain printers, drum printers and the like have long died out.
> editing the code on a character-based terminal,
Those also died out quite some time ago.

> transmitting it
>through a 7-bit channel

The same, but if you really have one, there are various ways of recoding 8-bit stuff into 7-bit bytes.

>or embedding it in any data format with a limited
>character set.

The nice thing about data is that it's just bits, and it does not know how it is interpreted. That's why UTF-8 works so well. The lack of knowledge about the interpretation is actually the real practical problem with UTF-8: If I set up an xterm to interpret characters as Latin-1-encoded, UTF-8 won't come out right, and vice versa. There are ways to deal with that, but they are cumbersome. But that's the transition pain.

>Most computer systems only allow a small number of characters to be
>entered without having to resort to numeric codes or searching through
>tables for something to be clicked with the mouse.

So what? Most users have ways to input the kind of data they need to input. If a programmer wants to write programs with Greek letters, he probably has a way to input that relatively easily, and he will ensure that his collaborators have that, too, or he will see their numbers dwindle, or his beautiful Greek letters searched and replaced with something that his collaborators can type in easily.

I type on a US-layout keyboard. My favourite editor has support for inputting umlauts etc. easily (latin-1-prefix). I found that Windows 8.1 uses the same convention (maybe because it knows where I live, but it's still cool).

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2015: http://www.rigwit.co.uk/EuroForth2015/
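
The Latin-1 versus UTF-8 mismatch described for xterm can be reproduced in a few lines of Python. This is only an illustration of the byte-level effect, not of how any terminal is implemented; the sample word is an arbitrary choice.

```python
original = "für"

# Case 1: UTF-8 bytes shown by something that assumes Latin-1.
# Latin-1 maps every byte to a character, so this never fails,
# it just shows the wrong characters.
utf8_bytes = original.encode("utf-8")
misread_as_latin1 = utf8_bytes.decode("latin-1")
print(misread_as_latin1)  # two junk characters appear in place of "ü"

# Case 2: Latin-1 bytes shown by something that assumes UTF-8.
# The lone byte 0xFC is not valid UTF-8, so strict decoding fails.
latin1_bytes = original.encode("latin-1")
try:
    latin1_bytes.decode("utf-8")
    strict_decode_failed = False
except UnicodeDecodeError:
    strict_decode_failed = True
print(strict_decode_failed)

# With replacement, the bad byte becomes U+FFFD.
print(latin1_bytes.decode("utf-8", errors="replace"))
```

The bytes themselves carry no label saying which interpretation is intended, which is the practical problem the post describes.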
upsidedown@downunder.com writes:

>On Tue, 05 Apr 2016 03:32:54 +0100, Nobody <nobody@nowhere.invalid>
>wrote:
>
>>On Sun, 03 Apr 2016 20:24:10 +0300, upsidedown wrote:
>>
>>> What is the problem of using Latin-1 (ISO 8859-1) in a programming
>>> language ?
>>>
>>> In fact it would be a great thing, if you could use UTF-8 in
>>> programming, e.g. Greek letters Alpha and Beta as program variables ?
>>> No need to do any transliteration from any textbook variables.
>>
>>The problem is that it raises the bar for any system needing to touch the
>>code.
>
>Windows NT has supported Unicode from the beginning in the 1990s
>(UTF-16) and Linux for a long time (UTF-8). Smart phones also have
>some kind of support.
>
>> It creates problems for printing the code using non-graphical
>>printers,
>
>Can you still buy printers with built in character ROMs ?
>
>>editing the code on a character-based terminal,
>
>Anybody using VT-100 terminals for program development ?
>
>>transmitting it
>>through a 7-bit channel,
>
>Latin-x (8859-x) character sets support most European languages and they
>require 8-bit characters. If you insist on using 7 (or 6) bit
>channels, some encapsulation method is needed to carry the 8-bit
>Latin character sets. The same encapsulation method should be able to
>carry UTF-8.
>
>>or embedding it in any data format with a limited
>>character set.
>>
>>Most computer systems only allow a small number of characters to be
>>entered without having to resort to numeric codes or searching through
>>tables for something to be clicked with the mouse. Homoglyphs create
>>confusion.
>
>I have done (with code pages) bilingual (Latin/Cyrillic and
>Latin/Arabic) systems. Usually you had a special key for toggling
>between languages, no big deal. These days, with any touch-screen virtual
>keyboard, it is even more trivial.
>
>>Even regular punctuation is bad enough, as most of it requires using the
>>little fingers or reaching for the top row, possibly along with the use of
>>the shift (or even AltGr) key.
>
>Your arguments might have been valid in the beginning of this century,
>but these days they are non-issues.

All the above arguments are eminently valid. I'm a productive programmer, which implies I'm a ten-finger touch typist. That means that I restrict the language I type in to the keyboard layout I'm used to. Using a touch keyboard on a notepad is already a great slowdown.

Remember, when I'm typing, I'm not typing, I'm thinking. I can't have the distraction of "how did the theta symbol go again" if I'm trying to define a class for computation of continued fractions.

Regards, Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

On Tue, 05 Apr 2016 09:56:20 +0300, upsidedown@downunder.com wrote:

>On Tue, 05 Apr 2016 03:32:54 +0100, Nobody <nobody@nowhere.invalid>
>wrote:
>
>> [Unicode] creates problems for printing the code using non-graphical
>>printers,
>
>Can you still buy printers with built in character ROMs ?

Yes. There are dot matrix printers still available from a number of vendors. Lasers cannot handle multi-part forms.

>>editing the code on a character-based terminal,
>
>Anybody using VT-100 terminals for program development ?

A lot of people develop scripts using terminal editors. Does that count?

George

[I still have two working Okidata dot matrix printers, one original and one with the IBM character ROM.]