EmbeddedRelated.com
Forums
The 2026 Embedded Online Conference

Decimal Point vs. Decimal Comma

Started by rickman April 2, 2016
On 16-04-03 20:24 , upsidedown@downunder.com wrote:
> On Sun, 3 Apr 2016 14:18:56 +0200, Hans-Bernhard Bröker
> <HBBroeker@t-online.de> wrote:
>
> What is the problem of using Latin-1 (ISO 8859-1) in a programming
> language?
Allowed in Ada.
> In fact it would be a great thing if you could use UTF-8 in
> programming, e.g. Greek letters alpha and beta as program variables.
> No need to do any transliteration from any textbook variables.
You can do that in Ada. Whether it makes sense is another question. I usually keep to alphanumerics, but sometimes I use accented letters when I write Ada in a Finnish context, with Finnish names for variables.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .

On 03/04/2016 20:24, upsidedown@downunder.com wrote:

> In fact it would be a great thing if you could use UTF-8 in
> programming, e.g. Greek letters alpha and beta as program variables.
> No need to do any transliteration from any textbook variables.
You can. In 8th you certainly can, and I don't see why a standard Forth would disallow a UTF-8 name.
On Sun, 03 Apr 2016 05:25:10 -0400, George Neuner
<gneuner2@comcast.net> wrote:

>On Sat, 2 Apr 2016 17:49:07 -0400, rickman <gnuarm@gmail.com> wrote:
>
>>I know that roughly half the world uses a period to separate the
>>fractional part of a number from the integer part. Roughly half the
>>world uses a comma for the same purpose. But what about in computer
>>languages? A little research showed that Algol was specified to work
>>with either. Other computer languages seem to work primarily or
>>exclusively with a period (point).
>>
>>Are there any languages that support both formats without special
>>programming?
>
>If I understand correctly that you're asking about numeric literals in
>program source, then I think the answer is COBOL and very possibly
>nothing else.
>
>It's trivial to make a lexer accept either number format, but many
>languages use commas as separators for, e.g., function arguments,
>array and list elements, etc. ... so also using commas as decimal
>marks would cause problems.
>
>Consider: x = f( 1,9 );
>
>Is that one argument or two? Depends on number lexing. Given a
>particular lexing, one of the two possible parses is an error - but
>which depends on the declaration of f(). This could be extremely
>confusing to a programmer. The only way around it is to also change
>the argument separator, or to enforce that truly separate values be
>separated by whitespace in addition to the separator.
>
>Pretty much only the whitespace separated sexpr syntax of Lisp (and
>Scheme) is immune to the parsing issue ... languages like ML and
>Haskell, etc. don't necessarily need commas for function calls, but
>they still do use them for other things.
>
>I'm not aware of any language that specifically addresses this issue
>regarding its source code - all that I am familiar with simply use the
>dot syntax for decimals. Number formats often are addressed as locale
>issues for I/O libraries, but not for program source.
>
>Many compilers now allow Unicode in their source, but programmers
>still are actively discouraged from using non-ASCII characters. And
>non-English speaking programmers are almost forced to program in
>English for portability. English is the lingua franca of programming.
>There have been compilers designed specifically for certain locales,
>but sans government mandate of their use, none has ever been very
>successful.
>
>
>Incidentally, I'm not sure what you saw re: Algol, but neither Algol
>60 nor 68 permitted commas as decimal marks - both used the dot
>syntax. I don't have a reference for Algol 58, but all the Algols
>used commas as argument and array separators, so they all would have
>been susceptible to the parsing issue described above.
Cobol has the same issue, so if you use DECIMAL-POINT IS COMMA, you need to be more careful with whitespace in some places. But subscripting is not one of them, since only integers are allowed there (I think this might have changed in modern Cobol so that you can have full expressions in subscripts, but I'm not sure). Cobol is full of weird parsing rules.
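
The `x = f( 1,9 );` ambiguity quoted above can be reproduced with a toy lexer. What follows is only a minimal Python sketch: the `tokenize` function and its `decimal_marks` parameter are made up for illustration and do not model COBOL or any real compiler. It shows that the same source text yields one argument or two depending on whether the comma is accepted as a decimal mark, and that whitespace after the comma removes the ambiguity.

```python
import re

def tokenize(src, decimal_marks="."):
    """Split src into tokens. decimal_marks (hypothetical parameter) lists the
    characters accepted between the integer and fractional digits of a number.
    It must contain at least one character."""
    marks = re.escape(decimal_marks)
    pattern = re.compile(
        rf"\d+(?:[{marks}]\d+)?"   # number, optionally with a fractional part
        r"|[A-Za-z_]\w*"           # identifier
        r"|\S"                     # any other single non-blank character
    )
    return pattern.findall(src)

call = "x = f( 1,9 );"
print(tokenize(call, "."))                 # comma is a separator: two arguments
print(tokenize(call, ".,"))                # comma is a decimal mark: one argument
print(tokenize("x = f( 1, 9 );", ".,"))    # blank after the comma: two arguments again
```

The third call corresponds to the "be more careful with whitespace" rule: a comma followed by a blank cannot start a fractional part, so it falls back to being a separator.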
On Sun, 03 Apr 2016 06:23:56 +0000, Anton Ertl wrote:

> A very early version of Gforth also worked that way, but I removed this
> feature, because it's more important to be able to get a useful error
> message if you use "2,", but the Forth system does not define it. I
> have not missed this feature, and I come from the part of the world that
> uses decimal comma.
VFX has dp-char and fp-char to store the currently used decimal separator for doubles and floats (actually, a zero-terminated string of a few characters). By default, dp-char is both '.' and ',', so you can enter both forms:

    VFX Forth for Linux IA32
    123,456 d. 123456 ok
    123.456 d. 123456 ok

fp-char is by default just '.'. Gforth's current development versions also have dp-char and fp-char, but only one possible character; the default is ".". This lets you change the character in case you want to read localized numbers.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
net2o ID: kQusJzA;7*?t=uy@X}1GWr!+0qqp_Cn176t4(dQ*
http://bernd-paysan.de/
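
As a rough illustration of the session above, here is a small Python model of a number parser with a configurable set of separator characters. It is a sketch under assumptions, not VFX's or Gforth's implementation: the name `parse_double`, the `dp_chars` parameter, and the rule that exactly one separator must be present are all invented for the example.

```python
def parse_double(token, dp_chars=".,"):
    """Read a token made of ASCII digits plus exactly one character from
    dp_chars as an integer with the separator dropped (assumed rule).
    Returns None when the token does not have that shape."""
    marks = [c for c in token if c in dp_chars]
    digits = "".join(c for c in token if c not in dp_chars)
    if len(marks) != 1 or not (digits.isascii() and digits.isdigit()):
        return None
    return int(digits)

print(parse_double("123,456"))                # both '.' and ',' accepted
print(parse_double("123.456"))
print(parse_double("123,456", dp_chars="."))  # only '.' accepted: not a number
```

With a single-character setting, a token such as `123,456` is rejected instead of being silently read as a number, which is the kind of error reporting the quoted post argues for.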
On Sun, 03 Apr 2016 20:24:10 +0300, upsidedown wrote:

>>Programs need to be able to handle localized formats on input and
>>output, but not in the source itself.
>
> What is the problem of using Latin-1 (ISO 8859-1) in a programming
> language?
>
> In fact it would be a great thing if you could use UTF-8 in
> programming, e.g. Greek letters alpha and beta as program variables.
> No need to do any transliteration from any textbook variables.
Forth-2012 with the Xchar wordset allows all XChar printable characters as part of a Forth word (usually UTF-8). People use that; we recently had an article about a Chinese student team using Forth to win a competition, and one of the two programmers used Chinese for his own definitions. None of the Forth systems I could test here has a problem with this approach; only case insensitivity is limited to ASCII (which the standard recommends; case insensitivity exists so that words standardized in upper case can be written in lower case).

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
net2o ID: kQusJzA;7*?t=uy@X}1GWr!+0qqp_Cn176t4(dQ*
http://bernd-paysan.de/
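
What "case insensitivity limited to ASCII" means for name lookup can be sketched in a few lines of Python. The helper names `ascii_fold` and `same_word` are hypothetical and not taken from any Forth system; the sketch only assumes that A-Z are folded to a-z and every other character is compared as-is.

```python
def ascii_fold(name):
    """Lower-case only the ASCII letters A-Z; leave every other character alone."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in name)

def same_word(a, b):
    """Compare two word names the way an ASCII-only case-insensitive
    dictionary lookup would (assumed behaviour)."""
    return ascii_fold(a) == ascii_fold(b)

print(same_word("DUP", "dup"))                     # ASCII letters fold together
print(same_word("\u00c4rger", "\u00e4rger"))       # A-umlaut is not folded, so these differ
print(same_word("\u6570", "\u6570"))               # a non-ASCII name still matches itself
```

Under this rule standard words can be typed in either case, while names written with non-ASCII letters have to be spelled with the same case every time.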

On 04/04/2016 02:14, Bernd Paysan wrote:

> VFX has dp-char and fp-char to store the currently used decimal separator
> for doubles and floats (actually, a zero-terminated string of a few
> characters), by default, dp-char is both '.' and ',', so you can enter
> both forms
8th doesn't distinguish doubles and floats (it just has 'numbers'). But it does let you set the (output) values of the decimal and thousands separators using .# and ,# respectively. You cannot query the current values; they default to . and , respectively. One weakness is that, for example, outputting numbers for Indian users is more difficult, since 8th has no internal concept of 'locale' and the built-in ,# only inserts 'thousands separators' every three digits, unlike the normal Indian formatting. That would have to be done manually using e.g. "s:strfmt" or other words.
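
For readers unfamiliar with the Indian convention mentioned above: the last three digits form one group and every group to the left of it has two digits. A minimal Python sketch of that rule follows; `group_indian` is a hypothetical helper written for this illustration and is unrelated to 8th's own words.

```python
def group_indian(n, sep=","):
    """Format an integer with Indian-style grouping: the last three digits
    form one group, every group to the left has two digits."""
    sign = "-" if n < 0 else ""
    digits = str(abs(n))
    head, tail = digits[:-3], digits[-3:]
    groups = []
    while len(head) > 2:
        groups.insert(0, head[-2:])   # peel two digits off the right of the head
        head = head[:-2]
    if head:
        groups.insert(0, head)        # leftmost group may have one or two digits
    groups.append(tail)
    return sign + sep.join(groups)

print(f"{1234567:,}")          # grouping every three digits
print(group_indian(1234567))   # Indian grouping of the same value
```

The separator is a parameter so that the same routine can be combined with a different thousands character.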

On 04/04/2016 07:18, Ron Aaron wrote:
>
> One weakness is that for example, outputting numbers for Indian users is
> more difficult...
Here's one way to do it: http://8th-dev.com/forum/index.php/topic,893.0.html
On Sun, 03 Apr 2016 20:24:10 +0300, upsidedown@downunder.com wrote:

>On Sun, 3 Apr 2016 14:18:56 +0200, Hans-Bernhard Bröker
><HBBroeker@t-online.de> wrote:
>
>>On 02.04.2016 at 23:49, rickman wrote:
>>> I know that roughly half the world uses a period to separate the
>>> fractional part of a number from the integer part. Roughly half the
>>> world uses a comma for the same purpose. But what about in computer
>>> languages?
>>
>>In computer languages we have barely enough punctuation letters
>>available as it is, to express all the necessary things without being
>>overly verbose. Wasting one on a luxury item like that would IMHO be
>>unjustifiable.
>
>Some computer languages like COBOL and FORTRAN could be written with 6
>bit codes (like FIELDATA).
>
>Unfortunately languages like C and Pascal used characters from ISO 646
>character sets, including characters reserved for national variants.
>
>
>>Programs need to be able to handle localized formats on input and
>>output, but not in the source itself.
>
>What is the problem of using Latin-1 (ISO 8859-1) in a programming
>language?
>
>In fact it would be a great thing if you could use UTF-8 in
>programming, e.g. Greek letters alpha and beta as program variables.
>No need to do any transliteration from any textbook variables.
You've always been able to do that in Java, although it's UCS-2 (old versions) or UTF-16 (current), not UTF-8.
Bernd Paysan <bernd.paysan@gmx.de> writes:
>On Sun, 03 Apr 2016 20:24:10 +0300, upsidedown wrote:
>> In fact it would be a great thing if you could use UTF-8 in
>> programming, e.g. Greek letters alpha and beta as program variables.
>> No need to do any transliteration from any textbook variables.
>
>Forth-2012 with the Xchar wordset allows all XChar printable characters
>as part of a Forth word (usually UTF-8).
Forth-2012 falls slightly short of actually guaranteeing that this works, because it does not require that the Forth system supports UTF-8. Fortunately, UTF-8 is designed to be compatible with existing code, so if the system is 8-bit clean and does not do funny things for recognizing white space, it will pretty much work with UTF-8. An example is old versions of Gforth (long before anybody thought about the Xchar wordset), where everything but command-line editing works with UTF-8.

An example of UTF-8 variables in Forth (and other languages on the same page) is shown on <http://rosettacode.org/wiki/Unicode_variable_names#Forth>.

- anton
--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2015: http://www.rigwit.co.uk/EuroForth2015/
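
The "8-bit clean" argument can be checked with a few lines of Python. This is a sketch only: `split_words` is a hypothetical stand-in for a parser that knows nothing about UTF-8 and splits its input on ASCII blanks, and the sample source line is invented for the example.

```python
ASCII_BLANKS = b" \t\r\n"

def split_words(raw):
    """Split a byte string into words on ASCII blanks only, the way an
    8-bit-clean parser with no knowledge of UTF-8 would."""
    words, current = [], bytearray()
    for byte in raw:
        if byte in ASCII_BLANKS:
            if current:
                words.append(bytes(current))
                current = bytearray()
        else:
            current.append(byte)
    if current:
        words.append(bytes(current))
    return words

source = "variable \u03b1  variable \u03b2  1 \u03b1 !".encode("utf-8")
print([w.decode("utf-8") for w in split_words(source)])

# Every byte of a multi-byte UTF-8 sequence is >= 0x80, so none of them
# can be mistaken for an ASCII blank.
print(all(b >= 0x80 for b in "\u03b1\u03b2".encode("utf-8")))
```

The "funny things" caveat matters: a parser that also treated byte 0xA0 (the Latin-1 no-break space) as a blank would cut "à" in half, because its UTF-8 encoding is the two bytes C3 A0.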
Niklas Holsti <niklas.holsti@tidorum.invalid> writes:

>On 16-04-03 20:24 , upsidedown@downunder.com wrote:
>> On Sun, 3 Apr 2016 14:18:56 +0200, Hans-Bernhard Bröker
>> <HBBroeker@t-online.de> wrote:
>>
>> What is the problem of using Latin-1 (ISO 8859-1) in a programming
>> language?
>Allowed in Ada.
>> In fact it would be a great thing if you could use UTF-8 in
>> programming, e.g. Greek letters alpha and beta as program variables.
>> No need to do any transliteration from any textbook variables.
>You can do that in Ada. Whether it makes sense is another question.
>I usually keep to alphanumerics, but sometimes I use accented letters >when I write Ada in a Finnish context, with Finnish names for variables.
What Algol68 does makes much more sense. It defines symbols that make sense for the language, like "less or equal" (a kind of underscored <) and bold stropping. That makes Algol68 look very good in publications. Relevant to this discussion: the exponent E that we have is a subscripted e in Algol 68.

Now how to use those in a particular implementation is implementation defined. There is a burden when copying program sources. We have that problem in Forth too, in particular with meta sources.

Maybe it is better to accept for good that a program always uses a character representation that is somewhat arbitrary, and that this representation must be part of defining a standard.
>--
>Niklas Holsti
Greetings, Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst