Forum: d0p3 BBS

C90+ toequiv()

From Paul Edwards@3:633/10 to All on Sat Nov 15 17:33:04 2025

I am not 100% sure, but I believe some people (Greeks?)
have keyboards such that their native character set can be
freely entered, when they're working in their native language.

And if they are required to work in English, or rather, 7-bit
ASCII, they will "switch keyboards", ie using the mouse or
whatever to select a different keyboard, and type the English,
and then return to the Greek etc keyboard.

I'm interested in a slight change to C90. I'm not interested in
UTF-8 either.

I'd like to write a program using pure ASCII, and indeed, pure
English prompts, but not force a Greek user to switch keyboards.
I'm not interested in a complicated translation layer either.

Originally I was thinking I just need to modify my programs and
the Greek locale so that I could do:

if (toupper(c) == 'X') printf("whatever\n");

And make some random Greek character the equivalent of 'X', ie
the Greek user knows that when prompted to type 'x' (or 'X'), he
just needs to press (lambda or whatever Greeks use). The Greek
locale will convert lambda into X when passed to toupper.

However, it was pointed out to me that this would interfere with
storing filenames on traditional FAT, for example. Not everything
should be subject to uppercasing. The Greek, or Katakana, should
be preserved, not converted into ASCII gibberish.

So I was thinking I need some halfway point of equivalency.

I'm happy to change all my programs so that they don't rely on
the user typing in an exact character. ie I am happy to drop case
sensitivity from everything, "now that I know there's an issue".
Actually there are other environments where case sensitivity is
difficult. e.g. some CMS (mainframe) environments.

And making sure I do toupper() is a way to solve the issue for
the environments where case-sensitivity is difficult/impossible.
(assuming they exist).

But I'd like to go one step further and cater for Greeks etc.

And it seems to me that I want to not change toupper() - which
would be expected to uppercase Greek characters (or some
other language), independent of the uppercasing of any English
characters they happened to enter (potentially at "great effort"
of changing keyboards).

And what I'm really after is being able to designate some Greek
characters as the equivalent of English counterparts in circumstances
where that is appropriate, and there is a desire to avoid a keyboard
change. So a new isequiv() function as an extension to C90. (I'm
basically forking C90 to create a C90+ or C90.0.1 - same as we
do with software - bells and whistles go into C99 etc, not C90.0.1)
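
A minimal sketch of what such a function might look like, assuming an 8-bit Greek code page laid out like ISO 8859-7. The name toequiv() follows the subject line. equiv_table, init_equiv_greek() and the specific pairings are hypothetical and not part of C90 or any existing library. The pairings follow where the letters are assumed to sit on a common Greek keyboard layout, but the choice is arbitrary.

```c
#include <limits.h>

/* Hypothetical extension: none of these names exist in C90. */
static unsigned char equiv_table[UCHAR_MAX + 1];

/* Fill the table for an assumed ISO 8859-7 style Greek code page. */
static void init_equiv_greek(void)
{
    int i;

    for (i = 0; i <= UCHAR_MAX; i++)
        equiv_table[i] = (unsigned char)i;   /* default: maps to itself */

    /* Arbitrary pairings (assumed code points, assumed key positions). */
    equiv_table[0xF7] = 'x';    /* small chi       */
    equiv_table[0xD7] = 'X';    /* capital chi     */
    equiv_table[0xED] = 'n';    /* small nu        */
    equiv_table[0xCD] = 'N';    /* capital nu      */
    equiv_table[0xF5] = 'y';    /* small upsilon   */
    equiv_table[0xD5] = 'Y';    /* capital upsilon */
}

/* Return the ASCII equivalent of c, or c itself if none is defined.
   Values outside the unsigned char range, such as EOF, pass through. */
static int toequiv(int c)
{
    if (c < 0 || c > UCHAR_MAX)
        return c;
    return equiv_table[c];
}
```

A prompt check could then be written as if (toupper(toequiv(c)) == 'X'), leaving toupper() itself untouched, so that filenames typed in Greek never pass through the table.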

Any thoughts?

Thanks. Paul..

--- PyGate Linux v1.5
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From David Brown@3:633/10 to All on Mon Nov 17 09:02:05 2025

On 15/11/2025 10:33, Paul Edwards wrote:

I am not 100% sure, but I believe some people (Greeks?)
have keyboards such that their native character set can be
freely entered, when they're working in their native language.

And if they are required to work in English, or rather, 7-bit
ASCII, they will "switch keyboards", ie using the mouse or
whatever to select a different keyboard, and type the English,
and then return to the Greek etc keyboard.

I'm interested in a slight change to C90. I'm not interested in
UTF-8 either.

I'd like to write a program using pure ASCII, and indeed, pure
English prompts, but not force a Greek user to switch keyboards.
I'm not interested in a complicated translation layer either.

Originally I was thinking I just need to modify my programs and
the Greek locale so that I could do:

if (toupper(c) == 'X') printf("whatever\n");

And make some random Greek character the equivalent of 'X', ie
the Greek user knows that when prompted to type 'x' (or 'X'), he
just needs to press (lambda or whatever Greeks use). The Greek
locale will convert lambda into X when passed to toupper.

However, it was pointed out to me that this would interfere with
storing filenames on traditional FAT, for example. Not everything
should be subject to uppercasing. The Greek, or Katakana, should
be preserved, not converted into ASCII gibberish.

So I was thinking I need some halfway point of equivalency.

I'm happy to change all my programs so that they don't rely on
the user typing in an exact character. ie I am happy to drop case
sensitivity from everything, "now that I know there's an issue".
Actually there are other environments where case sensitivity is
difficult. e.g. some CMS (mainframe) environments.

And making sure I do toupper() is a way to solve the issue for
the environments where case-sensitivity is difficult/impossible.
(assuming they exist).

But I'd like to go one step further and cater for Greeks etc.

And it seems to me that I want to not change toupper() - which
would be expected to uppercase Greek characters (or some
other language), independent of the uppercasing of any English
characters they happened to enter (potentially at "great effort"
of changing keyboards).

And what I'm really after is being able to designate some Greek
characters as the equivalent of English counterparts in circumstances
where that is appropriate, and there is a desire to avoid a keyboard
change. So a new isequiv() function as an extension to C90. (I'm
basically forking C90 to create a C90+ or C90.0.1 - same as we
do with software - bells and whistles go into C99 etc, not C90.0.1)

Any thoughts?

Thanks. Paul..

It is not at all clear to me what you are asking about here. Indeed, I
do not think it is at all clear to /you/. Have you tried talking
directly to someone who regularly uses a different alphabet - Greek,
Cyrillic, Hebrew, etc.? Try to explain your idea to them and see what
they think.

At the moment it appears that you want to make "toupper" (or some new
function) somehow take Greek letters (without using UTF-8) and turn some
Greek lower-case letters into some Latin upper-case letters. But it
should only do that sometimes - not when typing filenames for FAT.

So which Greek letter do you think should be "equivalent" to Latin X?
Perhaps xi, since it has similar pronunciation to Latin X (in English)?
Perhaps chi, since its capital looks very much like a Latin X, though it
has a different pronunciation and the lower-case is noticeably
different. And which Greek letter should be "equivalent" to Q, J, or V?
What should the Greek letters omega and psi convert to?

And what are you going to do with Cyrillic alphabets (in all their
varieties), and Hebrew, Arabic, or the many dozens of alphabets used in
India and south-east Asia?


From Janis Papanagnou@3:633/10 to All on Mon Nov 17 10:20:40 2025

On 11/15/25 10:33, Paul Edwards wrote:

I am not 100% sure, but I believe some people (Greeks?)
have keyboards such that their native character set can be
freely entered, when they're working in their native language.

In Greece you will typically get keyboards with the Greek letters.

And if they are required to work in English, or rather, 7-bit
ASCII, they will "switch keyboards", ie using the mouse or
whatever to select a different keyboard, and type the English,
and then return to the Greek etc keyboard.

I had once configured a system to use some control-key combination
(like Ctrl-Alt-Shift) to switch between three different languages
(EN, GR, DE).

I'm interested in a slight change to C90. I'm not interested in
UTF-8 either.

You have to map the keys to characters of some specific "codepage".

It sounds to me as if you want an interactive keyboard-layout change
to also switch the underlying character encoding. - To me
that just sounds wrong! - How would a string like "P�" be then
encoded?

The environment that I set up just used UTF-8, a single encoding
for all (in that case just three) languages. That way you could
type (Greek) '�' or (German) '�' or any other character (as far
as it's supported by the system with fonts, etc.).

I'd like to write a program using pure ASCII, and indeed, pure
English prompts, but not force a Greek user to switch keyboards.

I understand it that the "C"-code is as usual ASCII but embedded
strings may be any other character.

Again: How would a string like "P�" be then encoded?

The '�' (like an '�') could stem from ISO 8859-15 (but then it would
be a special case), or from ISO 8859-7 (the native Greek variant of
Latin), or from UTF-8. - You cannot represent these characters by a
single ASCII-character.

I'm not interested in a complicated translation layer either.

What comes below sounds very fuzzy; I certainly don't understand what
you have in mind there, so I cannot really comment on that.

For me, the solution for multi-language programming environment would
not switch character encodings but use a single standard (UTF-8) for
that.

Originally I was thinking I just need to modify my programs and
the Greek locale so that I could do:

if (toupper(c) == 'X') printf("whatever\n");

And make some random Greek character the equivalent of 'X', ie
the Greek user knows that when prompted to type 'x' (or 'X'), he
just needs to press (lambda or whatever Greeks use). The Greek
locale will convert lambda into X when passed to toupper.

Are you looking for an ASCII representation of that (template?) 'X'?
Something like "&mu;" (Like "&micro;" for 'µ' in HTML)?

However, it was pointed out to me that this would interfere with
storing filenames on traditional FAT, for example. Not everything
should be subject to uppercasing. The Greek, or Katakana, should
be preserved, not converted into ASCII gibberish.

You should be aware that on filename level you typically have (on
Unixes) just anonymous octets that need an interpretation to be
displayed. (It may be UCS2 with Windows filesystems; don't know.)

In my Linux/UTF-8 environment my filenames may contain umlauts or
Greek letters

$ touch "P�"
$ ls "P�"
P�

The filename will be stored in octets (values 0..255), where each
non-ASCII character will occupy more than one octet.
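
To illustrate the octet counts, here is a small C sketch using the Greek letter mu as an example: in UTF-8 it is the two octets 0xCE 0xBC, while in ISO 8859-7 it is the single octet 0xEC. The helper name utf8_length() is made up for this example and assumes well-formed input.

```c
#include <stddef.h>

/* Count characters in a UTF-8 string by skipping continuation
   octets (those of the form 10xxxxxx). Assumes well-formed input. */
static size_t utf8_length(const char *s)
{
    size_t n = 0;

    while (*s != '\0') {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;                /* lead octet or plain ASCII */
        s++;
    }
    return n;
}
```

For the UTF-8 string "P\xCE\xBC", strlen() reports 3 octets while utf8_length() reports 2 characters. The ISO 8859-7 form "P\xEC" is 2 octets for the same 2 characters.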

Such filenames will only be displayed as "ASCII gibberish" if you
somehow "force" it to be interpreted as pure ASCII.

So I was thinking I need some halfway point of equivalency.

I'm happy to change all my programs so that they don't rely on
the user typing in an exact character. ie I am happy to drop case
sensitivity from everything, "now that I know there's an issue".
Actually there are other environments where case sensitivity is
difficult. e.g. some CMS (mainframe) environments.

And making sure I do toupper() is a way to solve the issue for
the environments where case-sensitivity is difficult/impossible.
(assuming they exist).

Usually this should be handled by the locale setting.

$ awk 'BEGIN {print tolower("�")}'
?
$ awk 'BEGIN {print toupper("?")}'
�

The test above is from my environment (using UTF-8); it works even
*without* setting any Greek locale (I'm using "en_US.UTF-8".)

But I'd like to go one step further and cater for Greeks etc.

And it seems to me that I want to not change toupper() - which
would be expected to uppercase Greek characters (or some
other language), independent of the uppercasing of any English
characters they happened to enter (potentially at "great effort"
of changing keyboards).

And what I'm really after is being able to designate some Greek
characters as the equivalent of English counterparts in circumstances
where that is appropriate,

To me this sounds as fuzzy as it is wrong.

and there is a desire to avoid a keyboard
change. So a new isequiv() function as an extension to C90. (I'm
basically forking C90 to create a C90+ or C90.0.1 - same as we
do with software - bells and whistles go into C99 etc, not C90.0.1)

Any thoughts?

I cannot comment on your reluctance to use (or requirement to not use)
an underlying UTF-8 encoding, and the rest should be handled by the
locale - if UTF-8 characters are supported and allowed in string and
character literals of the respective programming language; I haven't
tried for "C" (but I seem to have no problems with using any UTF-8
characters in string literals).

I hope you found some useful information and got some more insights.
If, based on that, you can clarify your intentions and thoughts I'd
be interested to hear what you're actually trying to achieve.

Janis

Thanks. Paul..

From Paul Edwards@3:633/10 to All on Sun Nov 23 08:26:05 2025

"David Brown" <david.brown@hesbynett.no> wrote in message news:10fekpt$mkk8$1@dont-email.me...

It is not at all clear to me what you are asking about here. Indeed, I
do not think it is at all clear to /you/. Have you tried talking
directly to someone who regularly uses a different alphabet - Greek, Cyrillic, Hebrew, etc.? Try to explain your idea to them and see what
they think.

No. This IS my (apparently - next message) attempt to directly
talk to someone.

At the moment it appears that you want to make "toupper" (or some new
function) somehow take Greek letters (without using UTF-8) and turn some
Greek lower-case letters into some Latin upper-case letters. But it
should only do that sometimes - not when typing filenames for FAT.

Yes. The new function would do that.

toupper() would then be free to continue as-is.

So which Greek letter do you think should be "equivalent" to Latin X?

Any character will do. locales are flexible like that.

Perhaps xi, since it has similar pronunciation to Latin X (in English)?

Nope - not required. Completely arbitrary. Just the same as it is
completely arbitrary on the keyboard itself. ie they likely have the
QWERTY keyboard (in small letters) under the Greek. Even if
they don't print the English/Latin alphabet, it doesn't matter. An
English touch-typist (like me) knows where the QWERTY keys
are.

Note that the layout of QWERTY itself is completely arbitrary too.

Perhaps chi, since its capital looks very much like a Latin X, though it
has a different pronunciation and the lower-case is noticeably
different. And which Greek letter should be "equivalent" to Q, J, or V?
What should the Greek letters omega and psi convert to?

Out of scope - arbitrary.

And what are you going to do with Cyrillic alphabets (in all their varieties), and Hebrew, Arabic, or the many dozens of alphabets used in
India and south-east Asia?

Same deal - arbitrary where they go on the keyboard. Arbitrary
what their code point is in any character set. Arbitrary which
English/Latin characters they map to.

I'm not trying to break the tradition of different people inventing
different code pages. That's a different problem to solve (and
being bypassed with UTF-8 currently, at the cost of extra
processing time and code - I don't want either of those things).

BFN. Paul.


From Paul Edwards@3:633/10 to All on Sun Nov 23 08:54:14 2025

"Janis Papanagnou" <janis_papanagnou+ng@hotmail.com> wrote in message news:10fepd8$3p5tk$4@dont-email.me...

On 11/15/25 10:33, Paul Edwards wrote:

I am not 100% sure, but I believe some people (Greeks?)
have keyboards such that their native character set can be
freely entered, when they're working in their native language.

In Greece you will typically get keyboards with the Greek letters.

Exactly what I expect. And I'm not trying to change that.

And if they are required to work in English, or rather, 7-bit
ASCII, they will "switch keyboards", ie using the mouse or
whatever to select a different keyboard, and type the English,
and then return to the Greek etc keyboard.

I had once configured a system to use some control-key combination
(like Ctrl-Alt-Shift) to switch between three different languages
(EN, GR, DE).

And assuming someone only knows Greek, I don't want them
to ever switch keyboards.

(I wouldn't want it if it was the other way around - and ASCII
was in fact all Greek).

I'm interested in a slight change to C90. I'm not interested in
UTF-8 either.

You have to map the keys to characters of some specific "codepage".

Yes, any traditional Greek codepage is fine. Or even a
non-traditional Greek codepage. I expect all the Greek
characters to exist between x'80' and x'FF'.

It sounds to me that you want with an interactive keyboard-layout
change also to switch the underlying character encoding.

No - not correct. There won't even BE a keyboard change.
Unless you have someone who speaks more than just Greek.

- To me
that just sounds wrong! - How would a string like "P??" be then
encoded?

However you did it in 1990 when ISO/IEC 9899:1990 was
published.

The environment that I set up just used UTF-8, a single encoding
for all (in that case just three) languages. That way you could
type (Greek) '?' or (German) '?' or any other character (as far
as it's supported by the system with fonts, etc.).

Yes, I'm not interested in the computing power needed to do
that. Nor the burden placed on the display to display Kanji.
I'm only interested in (half-width I think) Katakana. ie the
first Japanese displays for the PC from the 1980s.

I'd like to write a program using pure ASCII, and indeed, pure
English prompts, but not force a Greek user to switch keyboards.

I understand it that the "C"-code is as usual ASCII but embedded
strings may be any other character.

As an "English" C programmer, I do not wish to put in embedded
Greek strings. Nor provide a translation layer for Greek. Nor
have the speed of my program impacted to support Greek
characters.

I don't expect the Greeks to learn English, but I do expect them
to get used to using the software such that they recognize that
when a particular bit of English gibberish like "Enter your name"
appears on the screen, that they know it is time to enter their
(Greek) name. I knew some Chinese people who operated the
ATM in Australia like that - they didn't bother to learn what
the prompts were - they just memorized the sequence. The
user interface changed one day and they couldn't withdraw
money anymore. The executable I provide won't change
unless you change it, and then you'll need to memorize that
new sequence as part of the upgrade - if you have zero
English.

Again: How would a string like "P??" be then encoded?

As per 1980s. I'm not trying to change what you did in the
1980s. And I'm not trying to change the existing "accents".
ie you type ^ and then a "u" and then you get a u with a
hat. (in some countries).

The '?' (like an '?') could stem from ISO 8859-15 (but then it would
be a special case), or from ISO 8859-7 (the native Greek variant of
Latin), or from UTF-8. - You cannot represent these characters by a
single ASCII-character.

I don't want to use ASCII. Nothing a Greek-only person will
ever type will be in ASCII, it will all be a SINGLE value
between x'80' and x'FF'.

I'm not interested in a complicated translation layer either.

What comes below sounds very fuzzy; I certainly don't understand what
you have in mind there, so I cannot really comment on that.

I'm happy to explain.

For me, the solution for multi-language programming environment would
not switch character encodings but use a single standard (UTF-8) for
that.

It's not multi-language - well - not by preference. The end
user would much rather have the prompts for "what is your
name?" in Greek, but that's not on the table.

Originally I was thinking I just need to modify my programs and
the Greek locale so that I could do:

if (toupper(c) == 'X') printf("whatever\n");

And make some random Greek character the equivalent of 'X', ie
the Greek user knows that when prompted to type 'x' (or 'X'), he
just needs to press (lambda or whatever Greeks use). The Greek
locale will convert lambda into X when passed to toupper.

Are you looking for an ASCII representation of that (template?) 'X'?
Something like "&mu;" (Like "&micro;" for 'µ' in HTML)?

Yes, an ASCII representation of similar-to-uppercase "micro".

However, it was pointed out to me that this would interfere with
storing filenames on traditional FAT, for example. Not everything
should be subject to uppercasing. The Greek, or Katakana, should
be preserved, not converted into ASCII gibberish.

You should be aware that on filename level you typically have (on
Unixes) just anonymous octets that need an interpretation to be
displayed. (It may be UCS2 with Windows filesystems; don't know.)

In my Linux/UTF-8 environment my filenames may contain umlauts or
Greek letters

$ touch "P??"
$ ls "P??"
P??

The filename will be stored in octets (values 0..255), where each
non-ASCII character will occupy more than one octet.

Such filenames will only be displayed as "ASCII gibberish" if you
somehow "force" it to be interpreted as pure ASCII.

It won't be ASCII or UTF-8. It will be what MSDOS in
Greece did in the 1980s. I won't say "when men were men",
but it's a variation of that.

So I was thinking I need some halfway point of equivalency.

I'm happy to change all my programs so that they don't rely on
the user typing in an exact character. ie I am happy to drop case sensitivity from everything, "now that I know there's an issue".
Actually there are other environments where case sensitivity is
difficult. e.g. some CMS (mainframe) environments.

And making sure I do toupper() is a way to solve the issue for
the environments where case-sensitivity is difficult/impossible.
(assuming they exist).

Usually this should be handled by the locale setting.

Sure. My plan is to implement locales.

$ awk 'BEGIN {print tolower("?")}'
?
$ awk 'BEGIN {print toupper("?")}'
?

The test above is from my environment (using UTF-8); it works even
*without* setting any Greek locale (I'm using "en_US.UTF-8".)

Yes, I don't want that solution.

But I'd like to go one step further and cater for Greeks etc.

And it seems to me that I want to not change toupper() - which
would be expected to uppercase Greek characters (or some
other language), independent of the uppercasing of any English
characters they happened to enter (potentially at "great effort"
of changing keyboards).

And what I'm really after is being able to designate some Greek
characters as the equivalent of English counterparts in circumstances
where that is appropriate,

To me this sounds as fuzzy as it is wrong.

Can you touch-type? I assume all SBCS users can touch-type
in their language (if they learn). I expect someone who only
knows Greek, to touch-type at full speed - in Greek - even
when using my English-only software.

and there is a desire to avoid a keyboard
change. So a new isequiv() function as an extension to C90. (I'm
basically forking C90 to create a C90+ or C90.0.1 - same as we
do with software - bells and whistles go into C99 etc, not C90.0.1)

Any thoughts?

I cannot comment on your reluctance to use (or requirement to not use)
an underlying UTF-8 encoding, and the rest should be handled by the
locale - if UTF-8 characters are supported and allowed in string and character literals of the respective programming language; I haven't
tried for "C" (but I seem to have no problems with using any UTF-8
characters in string literals).

None of that - including literals in my own software - is what
I am after.

I hope you found some useful information and got some more insights.
If, based on that, you can clarify your intentions and thoughts I'd
be interested to hear what you're actually trying to achieve.

I'm interested in Greek, Greek, Greek.

At no extra processing or space cost to English.

Exactly as it was when the C90 standard was created.

Which came VERY close to supporting exactly that. The
C89 standard was delayed for a year because of Europeans
intending to change C89 to support locales. Instead, the
Americans changed ANSI. And got it NEARLY right.
You can't expect people to get things 100% right. So
with the benefit of 35 years of hindsight, and a C library
and operating system and supporting tools under my
belt, I'm trying to add some MINOR touches to C90.

BFN. Paul.


From bart@3:633/10 to All on Sun Nov 23 01:28:06 2025

On 15/11/2025 09:33, Paul Edwards wrote:

I am not 100% sure, but I believe some people (Greeks?)
have keyboards such that their native character set can be
freely entered, when they're working in their native language.

And if they are required to work in English, or rather, 7-bit
ASCII, they will "switch keyboards", ie using the mouse or
whatever to select a different keyboard, and type the English,
and then return to the Greek etc keyboard.

I'm interested in a slight change to C90. I'm not interested in
UTF-8 either.

I'd like to write a program using pure ASCII, and indeed, pure
English prompts, but not force a Greek user to switch keyboards.
I'm not interested in a complicated translation layer either.

Originally I was thinking I just need to modify my programs and
the Greek locale so that I could do:

if (toupper(c) == 'X') printf("whatever\n");

And make some random Greek character the equivalent of 'X', ie
the Greek user knows that when prompted to type 'x' (or 'X'), he
just needs to press (lambda or whatever Greeks use). The Greek
locale will convert lambda into X when passed to toupper.

This is quite a big subject, and it spans many areas such as hardware,
drivers, OS, displays, character sets, fonts, as well as programming
languages and applications.

Whatever it is you want to do, I doubt it will be solved with one C
function.

However, it was pointed out to me that this would interfere with
storing filenames on traditional FAT, for example. Not everything
should be subject to uppercasing. The Greek, or Katakana, should
be preserved, not converted into ASCII gibberish.

Are you planning to use 8-bit code-pages?

Those have all sorts of problems which are unlikely to be tolerated
these days.

I suggest you just try and make UTF8 work, even if it's a cut-down
version that only supports the first layer of Unicode, which is
character codes up to 64K.
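
A cut-down decoder of that kind could be sketched as below, handling only one- to three-octet sequences (code points up to U+FFFF). The function name utf8_decode_bmp() and its return convention are assumptions made for this example, not an existing API.

```c
/* Decode one UTF-8 sequence from a NUL-terminated string.
   Stores the code point in *out and returns the number of octets
   consumed (1 to 3), or 0 if the sequence is malformed or encodes
   a code point above U+FFFF. */
static int utf8_decode_bmp(const unsigned char *s, unsigned long *out)
{
    unsigned long cp;

    if (s[0] < 0x80) {                      /* 0xxxxxxx: ASCII */
        *out = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0) {            /* 110xxxxx 10xxxxxx */
        if ((s[1] & 0xC0) != 0x80)
            return 0;
        cp = ((unsigned long)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
        if (cp < 0x80)                      /* reject overlong forms */
            return 0;
        *out = cp;
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0) {            /* 1110xxxx 10xxxxxx 10xxxxxx */
        if ((s[1] & 0xC0) != 0x80 || (s[2] & 0xC0) != 0x80)
            return 0;
        cp = ((unsigned long)(s[0] & 0x0F) << 12)
           | ((unsigned long)(s[1] & 0x3F) << 6)
           | (s[2] & 0x3F);
        if (cp < 0x800 || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;                       /* overlong or surrogate */
        *out = cp;
        return 3;
    }
    return 0;                               /* stray continuation or 4-octet lead */
}
```

Even this reduced form needs a branch and several shifts per character, which is the processing cost a single-octet code page avoids.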

Anything else, even going with 16-bit characters, is going to be
impractical.

So I was thinking I need some halfway point of equivalency.

That doesn't make sense, sorry. Imagine you were working in Greek and
were talking about mapping A-Z to some random Greek letter, but it
didn't matter which!


From Janis Papanagnou@3:633/10 to All on Sun Nov 23 09:37:00 2025

On 11/23/25 01:54, Paul Edwards wrote:

"Janis Papanagnou" <janis_papanagnou+ng@hotmail.com> wrote in message news:10fepd8$3p5tk$4@dont-email.me...

On 11/15/25 10:33, Paul Edwards wrote:

...

...

And assuming someone only knows Greek, I don't want them
to ever switch keyboards.

(I had been speaking just about switching the layout, not the
keyboard. The installation I spoke of used one Greek keyboard
used with three switchable layouts, for EN, GR, DE.)

(In another environment I'm using two keyboards, EN and DE, in
parallel on one system, each one with its own specific layout.)

(I wouldn't want it if it was the other way around - and ASCII
was in fact all Greek).

Not sure what you're trying to say here. ASCII was never Greek,
and ISO 8859-7 contains ASCII as subset and Greek in the upper
range.

...

...

Yes, any traditional Greek codepage is fine. Or even a
non-traditional Greek codepage. I expect all the Greek
characters to exist between x'80' and x'FF'.

I don't know what you mean by "non-/traditional Greek codepage".

(Back in those days there was no 7-bit Greek codepage that I knew of
- but I'm not sure - but there was 8-bit [ASCII based/"extended"]
ISO 8859-7 in the late 1980's.)

...
I don't expect the Greeks to learn English, but I do expect them
to get used to using the software such that they recognize that
when a particular bit of English gibberish like "Enter your name"
appears on the screen, that they know it is time to enter their
(Greek) name.

I see. - So something like: Enter your name: Γιάννης

And the binary/encoded representation shall be, say, ISO 8859-7 ?

I'd have expected that old "C" versions supported 8-bit character
sets without any change; I cannot tell for GR, but I've certainly
used DE with ISO 8859-1 and -15 encoded data in "C" (without any
change of the language) that contained umlauts (for example).

(With gcc -std=c90 and LC_ALL=C I can at least create "�lfr��" in
a contemporary system. But that might not be an exact test case,
as you may have in mind.)

...

...

As per 1980s. I'm not trying to change what you did in the
1980s. And I'm not trying to change the existing "accents".
ie you type ^ and then a "u" and then you get a u with a
hat. (in some countries).

(But that is just a configured way of how diacritical characters
can be entered through keyboard. In my systems I can set it with
the locale; using or not using "dead-characters", or some such.)

...

I don't want to use ASCII. Nothing a Greek-only person will
ever type will be in ASCII, it will all be a SINGLE value
between x'80' and x'FF'.

Yes, with ISO Latin 7 you'll have 8 bit available (as opposed to
7-bit ASCII). So if your system is supporting an 8-bit ISO 8859-7
character set that should probably work already; doesn't it?
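
As a rough illustration of single-octet handling without any locale machinery, here is a hand-rolled uppercasing sketch for the unaccented letters. It assumes the ISO 8859-7 layout: small letters at 0xE1 to 0xF9, capitals 0x20 lower, final sigma at 0xF2. The function name upper_8859_7() is hypothetical; a real implementation would normally sit behind toupper() and setlocale(). Accented letters are deliberately left alone.

```c
#include <ctype.h>

/* Uppercase one character of an assumed ISO 8859-7 code page.
   The caller must pass the value as an unsigned char converted to
   int (as getchar() returns it), so 0x80..0xFF are never negative. */
static int upper_8859_7(int c)
{
    if (c == 0xF2)                  /* final sigma -> capital sigma */
        return 0xD3;
    if (c >= 0xE1 && c <= 0xF9)     /* alpha..omega -> Alpha..Omega */
        return c - 0x20;
    if (c >= 0 && c < 0x80)         /* ASCII: defer to the C library */
        return toupper(c);
    return c;                       /* everything else unchanged */
}
```

With these assumptions, small lambda (0xEB) becomes capital lambda (0xCB), and 'x' still becomes 'X', with no multi-octet decoding involved.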

...

It's not multi-language - well - not by preference. The end
user would much rather have the prompts for "what is your
name?" in Greek, but that's not on the table.

(Using a character encoding that supports two complete languages
is "multi-language", actually.)

Originally I was thinking I just need to modify my programs and
the Greek locale so that I could do:

if (toupper(c) == 'X') printf("whatever\n");

And make some random Greek character the equivalent of 'X', ie
the Greek user knows that when prompted to type 'x' (or 'X'), he
just needs to press (lambda or whatever Greeks use). The Greek
locale will convert lambda into X when passed to toupper.

Are you looking for an ASCII representation of that (template?) 'X'?
Something like "&mu;" (Like "&micro;" for 'µ' in HTML)?

Yes, an ASCII representation of similar-to-uppercase "micro".

It's still unclear to me. - Above I got the impression you'd want
Enter your name: Γιάννης
now it reads as if you want just a (7-bit) ASCII encoding, as in
Enter your name: Giannis
In both cases upper-casing should be possible, though.

(I still don't see where you actually see or have the problem.)

...

...

It won't be ASCII or UTF-8. It will be what MSDOS in
Greece did in the 1980s.

I cannot tell anything about DOS in Greek in the 1980's. As far as
memory serves, I know that (DOS based) Windows 95 could be used in
Greece with Greek keyboards and letters. (It should be possible to
Google what codepage Windows used back in those days. I'm positive
that MS did not do any rocket-science back then in that respect.)

...
Can you touch-type? I assume all SBCS users can touch-type
in their language (if they learn). I expect someone who only
knows Greek, to touch-type at full speed - in Greek - even
when using my English-only software.

Yes. - And...? (Not sure what that has to do with the requirements.)

...

I'm interested in Greek, Greek, Greek.

At no extra processing or space cost to English.

Exactly as it was when the C90 standard was created.

Which came VERY close to supporting exactly that. The
C89 standard was delayed for a year because of Europeans
intending to change C89 to support locales. Instead, the
Americans changed ANSI. And got it NEARLY right.
You can't expect people to get things 100% right. So
with the benefit of 35 years of hindsight, and a C library
and operating system and supporting tools under my
belt, I'm trying to add some MINOR touches to C90.

I cannot test; my expectation is that you can do what's described
above already with a Greek Latin 7 encoding natively in C90 as was
(most probably) done in Greece back in the late 1980's.

One thing *might* be crucial where you said:

Yes, an ASCII representation of similar-to-uppercase "micro".

But I didn't understand what you wanted here, so I'll bite.

Janis

--- PyGate Linux v1.5.1
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From David Brown@3:633/10 to All on Sun Nov 23 13:23:30 2025

On 23/11/2025 01:26, Paul Edwards wrote:

"David Brown" <david.brown@hesbynett.no> wrote in message news:10fekpt$mkk8$1@dont-email.me...

It is not at all clear to me what you are asking about here. Indeed, I
do not think it is at all clear to /you/. Have you tried talking
directly to someone who regularly uses a different alphabet - Greek,
Cyrillic, Hebrew, etc.? Try to explain your idea to them and see what
they think.

No. This IS my (apparently - next message) attempt to directly
talk to someone.

Perhaps you should start by /talking/ to someone who works with multiple
languages and alphabets. By "talk" I mean "talk" - in person - not
Usenet posts. You need to be discussing this with people in the same
room, with hand-waving, white board scribbles, trying out sample
programs (from the user's viewpoint - the code is irrelevant). You need
to talk to people who are fluent in English (so that you understand
them) and Greek or other languages, and who also know, help and interact
with people who /only/ speak Greek (or whatever) and not English. You
can't communicate with the people you are trying to help here, so at
least talk to people who already help those users.

And you have to start by asking them "Is there a problem here? How
could we make things easier for you?".

You seem to have started by imagining some kinds of users, imagining the
kinds of challenges they have, and imagining a solution for those
challenges with the priority firmly on what fits your pet "sort-of-C90"
concept rather than thinking about what these imaginary users might want.

Talking here to Janis is better than sitting by yourself and imagining
everything, but you have to start by thinking about the right questions.
Start at the right end - find out if there is a problem, what that
problem is, and how the user's experience can be improved. Don't bother
asking Janis about his experiences - he doesn't have a problem because
he can switch keyboard layouts and has better English language skills
than most English language natives. (I'm assuming here that Janis is
Greek by upbringing - perhaps he is only Greek by his family roots.)
You have to ask Janis if his great aunty has trouble interacting with
software written only for English speakers, and ask how you could
make things easier for her.

--- PyGate Linux v1.5.1
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Paul Edwards@3:633/10 to All on Mon Nov 24 01:45:37 2025

"Janis Papanagnou" <janis_papanagnou+ng@hotmail.com> wrote in message news:10fuh3c$100up$1@dont-email.me...

On 11/23/25 01:54, Paul Edwards wrote:

"Janis Papanagnou" <janis_papanagnou+ng@hotmail.com> wrote in message news:10fepd8$3p5tk$4@dont-email.me...

On 11/15/25 10:33, Paul Edwards wrote:

...

...

And assuming someone only knows Greek, I don't want them
to ever switch keyboards.

(I had been speaking just about switching the layout, not the
keyboard. The installation I spoke of used one Greek keyboard
used with three switchable layouts, for EN, GR, DE.)

Sorry - I used the wrong word (I'm not fluent in terminology).

I don't want a Greek user to ever switch LAYOUTS, even
when using MY English-only software.

Also note - what I am trying to do is more "icing on the cake"
than "base functionality". As you noted - people can already
enter their name in Greek.

What they can't do is "press X to continue".

(In another environment I'm using two keyboards, EN and DE, in
parallel on one system, each one with its own specific layout.)

(I wouldn't want it if it was the other way around - and ASCII
was in fact all Greek).

Not sure what you're trying to say here. ASCII was never Greek,
and ISO 8859-7 contains ASCII as subset and Greek in the upper
range.

If the Greeks had invented computers (and had a population
of 90% of the globe), there is little doubt that the bottom
128 code points would be Greek characters, and English
characters would be in the top 128.

The prompt "enter your name" would be in Greek.

I don't wish to learn Greek, as I'm in that 10% of the globe,
and I'm not a programmer.

Yes, any traditional Greek codepage is fine. Or even a
non-traditional Greek codepage. I expect all the Greek
characters to exist between x'80' and x'FF'.

I don't know what you mean by "non-/traditional Greek codepage".

A traditional Greek codepage is one from the late 1980s.
You named an ISO one - probably that.

A non-traditional Greek codepage is one that I just made
up right now. I'm going to put the Greek "micro" at code
point x'F3'. For no particular reason.

Or maybe the reason is that I want to store the box-drawing
characters, plus some other things, and ISO doesn't have
something that does exactly what I want.

(Back in those days there was no 7-bit Greek codepage that I knew of
- but I'm not sure - but there was 8-bit [ASCII based/"extended"]
ISO 8859-7 in the late 1980s.)

Yes - what (some) people call "extended ASCII".

I don't expect the Greeks to learn English, but I do expect them
to get used to using the software such that they recognize that
when a particular bit of English gibberish like "Enter your name"
appears on the screen, that they know it is time to enter their
(Greek) name.

I see. - So something like: Enter your name: Γιάννης

And the binary/encoded representation shall be, say, ISO 8859-7 ?

Yes - correct. That all works already.

The C89/C90 people got MOST stuff working already. All
the blatantly obvious stuff works already.

They didn't have the luxury of 35 years of thinking about
the same topic.

I'd have expected that old "C" versions supported 8-bit character
sets without any change; I cannot tell for GR, but I've certainly
used DE with ISO 8859-1 and -15 encoded data in "C" (without any
change of the language) that contained umlauts (for example).

Sure.

(With gcc -std=c90 and LC_ALL=C I can at least create "?lfr??" in
a contemporary system. But that might not be an exact test case,
as you may have in mind.)

Sure - that all works.

As per 1980s. I'm not trying to change what you did in the
1980s. And I'm not trying to change the existing "accents".
ie you type ^ and then a "u" and then you get a u with a
hat. (in some countries).

(But that is just a configured way of how diacritical characters
can be entered through keyboard. In my systems I can set it with
the locale; using or not using "dead-characters", or some such.)

Sure - I'm not interested in how that is done.

I don't want to use ASCII. Nothing a Greek-only person will
ever type will be in ASCII, it will all be a SINGLE value
between x'80' and x'FF'.

Yes, with ISO 8859-7 (Latin/Greek) you'll have 8 bits available (as
opposed to 7-bit ASCII). So if your system supports an 8-bit ISO 8859-7
character set, that should probably work already, shouldn't it?

Yes - almost.

The problem comes about in my English-only programs - that
support Greek - mostly - by C90 accident.

Assuming setlocale(LC_ALL, ""); is in effect, anyway.
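A minimal sketch of the part that already works, assuming a single-byte
Greek locale is selected at startup with setlocale(LC_ALL, ""). The
helper name upper_in_place is made up for illustration. The one detail
that matters for bytes between x'80' and x'FF' is the cast to unsigned
char before calling toupper().

```c
#include <ctype.h>

/* Uppercase a string in place using the current locale's toupper().
   The cast to unsigned char matters for bytes in x'80'..x'FF':
   where plain char is signed, those bytes are negative, and passing
   a negative value (other than EOF) to toupper() is undefined. */
static void upper_in_place(char *s)
{
    for (; *s != '\0'; s++)
    {
        *s = (char)toupper((unsigned char)*s);
    }
}
```

In the default "C" locale this only touches 'a' to 'z'; bytes above
x'7F' pass through unchanged. Whether Greek letters get uppercased
after setlocale(LC_ALL, "") depends on the locale definition in use.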

It's not multi-language - well - not by preference. The end
user would much rather have the prompts for "what is your
name?" in Greek, but that's not on the table.

(Using a character encoding that supports two complete languages
is "multi-language", actually.)

Ok.

Originally I was thinking I just need to modify my programs and
the Greek locale so that I could do:

if (toupper(c) == 'X') printf("whatever\n");

And make some random Greek character the equivalent of 'X', ie
the Greek user knows that when prompted to type 'x' (or 'X'), he
just needs to press (lambda or whatever Greeks use). The Greek
locale will convert lambda into X when passed to toupper.

Are you looking for an ASCII representation of that (template?) 'X'?
Something like "&mu;" (Like "&micro;" for 'µ' in HTML)?

Yes, an ASCII representation of similar-to-uppercase "micro".

It's still unclear to me. - Above I got the impression you'd want
Enter your name: Γιάννης
now it reads as if you want just a (7-bit) ASCII encoding, as in
Enter your name: Giannis
In both cases upper-casing should be possible, though.

No - none of that is correct. I want the 8-bit Greek characters.

(I still don't see where you actually see or have the problem.)

Ok, so here's another one.

As a Greek user, I want to be able to type:

dir /od

in Greek gibberish.

I literally bring an OS to the table - PDOS (see pdos.org).

I have no intention of creating a Greek translation of "dir".

But I would also like the Greek user to have the OPTION
of NOT changing keyboard layouts, and STILL be able to
use PDOS.

What I can do is uppercase the string "dir /od", and then
in my code I can do strcmp() with "DIR".

That covers ME PERSONALLY.

It doesn't cover a Greek person - they would have to
change layouts.

They can't just press the Greek character at location "d"
on the US ASCII keyboard, which will generate a "micro"
or whatever.

This is what I want to avoid.

I'm happy to change all MY software (decades worth), to
ensure that there is no case sensitivity, by making sure that
all strings are uppercased.

But that won't cover the Greeks, who will be forced to
type either lowercase English or uppercase English.

I don't want them to have to do either of those.

And nor do I want to do anything in my own software that
is much more onerous than "toupper".

And now I want to bridge that gap, by replacing toupper
with toequiv.

On MY system - because I use the "C" locale - toupper and
toequiv will be identical.

But for the Greek locale invoked with setlocale "" - the toequiv
will behave differently from toupper. The 7-bit ASCII will be
the same - mapping 'd' to 'D'. But the high characters will have
some Greek characters, e.g. "micro", ALSO mapping to 'D'.
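As a rough sketch of that behaviour (not a definitive implementation):
toequiv() is the proposed function and is not in any C standard. The
table, the load_greek_equiv() helper and the choice of x'F3' for "micro"
are assumptions for illustration only; in a real C library the table
would presumably be filled in by setlocale().

```c
#include <ctype.h>

/* Hypothetical per-locale table: equiv_table[c] is the ASCII
   character that byte c is "equivalent" to, or 0 for "no entry".
   It starts out empty, which models the "C" locale. */
static unsigned char equiv_table[256];

/* Stand-in for what a Greek locale would load. Only one entry is
   filled in, using the made-up code point x'F3' for "micro". */
static void load_greek_equiv(void)
{
    equiv_table[0xF3] = 'D'; /* assumption: "micro" sits on the 'd' key */
}

/* Proposed toequiv(): use the locale's equivalence entry if there
   is one, otherwise behave exactly like toupper(). */
static int toequiv(int c)
{
    if (c >= 0 && c <= 255 && equiv_table[c] != 0)
    {
        return equiv_table[c];
    }
    return toupper(c);
}
```

With an empty table toequiv() and toupper() give the same answers, so
English-only use pays nothing beyond one table lookup per character.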

So as an OPTION you can type "dir" really fast (touch-typing)
WITHOUT changing layouts.

And without ME doing ANY work except the minor "cultural
change" of using toequiv instead of toupper.
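To show how small that change could be on the program side, here is a
sketch of a command matcher that takes the folding function as a
parameter, so the same code can be called with toupper today and with
the proposed function later. The names is_command and demo_toequiv are
hypothetical, and the three high code points are placeholders chosen
for this example, not taken from a specific codepage definition.

```c
#include <ctype.h>

/* Compare the first word of a typed line against an uppercase ASCII
   command name, folding each typed byte through `fold` first.
   Returns 1 on a match, 0 otherwise. */
static int is_command(const char *line, const char *name,
                      int (*fold)(int))
{
    while (*name != '\0')
    {
        if (fold((unsigned char)*line) != (unsigned char)*name)
        {
            return 0;
        }
        line++;
        name++;
    }
    /* the command word must end here: end of line or a blank */
    return *line == '\0' || *line == ' ';
}

/* Stand-in for the proposed toequiv(): three placeholder high bytes
   map to 'D', 'I' and 'R'; everything else falls back to toupper(). */
static int demo_toequiv(int c)
{
    switch (c)
    {
        case 0xF3: return 'D';
        case 0xE9: return 'I';
        case 0xF1: return 'R';
        default:   return toupper(c);
    }
}
```

Calling is_command(line, "DIR", toupper) accepts "dir /od" typed in
ASCII only, while is_command(line, "DIR", demo_toequiv) also accepts
the same keystrokes typed on the Greek layout.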

I'm happy to change my own "culture" *to this extent*.

It won't be ASCII or UTF-8. It will be what MSDOS in
Greece did in the 1980s.

I cannot say anything about DOS in Greece in the 1980s. As far as
memory serves, (DOS-based) Windows 95 could be used in
Greece with Greek keyboards and letters. (It should be possible to
Google what codepage Windows used back in those days. I'm positive
that MS did not do any rocket science back then in that respect.)

Sure. Everything mostly works already, except for the fact
that you are forced to change layout - to enter a language
"you" don't even know (English), to run software you were
"forced" to run, written by me.

Can you touch-type? I assume all SBCS users can touch-type
in their language (if they learn). I expect someone who only
knows Greek, to touch-type at full speed - in Greek - even
when using my English-only software.

Yes. - And...? (Not sure what that has to do with the requirements.)

And so I don't want layout-switches to interfere with the speed.

I don't have to do layout changes myself. I expect a Greek version of
me to be equally as fast as me.

I'm interested in Greek, Greek, Greek.

At no extra processing or space cost to English.

Exactly as it was when the C90 standard was created.

Which came VERY close to supporting exactly that. The
C89 standard was delayed for a year because of Europeans
intending to change C89 to support locales. Instead, the
Americans changed ANSI. And got it NEARLY right.
You can't expect people to get things 100% right. So
with the benefit of 35 years of hindsight, and a C library
and operating system and supporting tools under my
belt, I'm trying to add some MINOR touches to C90.

I cannot test; my expectation is that you can do what's described
above already with an ISO 8859-7 (Latin/Greek) encoding natively in
C90, as was (most probably) done in Greece back in the late 1980s.

All things are indeed possible if you are happy to change
layout constantly between English and Greek.

Even if you don't even know English.

One thing *might* be crucial where you said:

Yes, an ASCII representation of similar-to-uppercase "micro".

But I didn't understand what you wanted here, so I'll bite.

Some arbitrary Greek character is the equivalent of uppercase
'D', and another one is the equivalent of uppercase 'X', etc. for
all the characters I could possibly "force" a Greek user to type
in my pure C90+ program.

I hope that makes sense. If it doesn't, let me know and I'll
have another attempt.

BFN. Paul.

--- PyGate Linux v1.5.1
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
