• C90+ toequiv()

    From Paul Edwards@3:633/10 to All on Sat Nov 15 17:33:04 2025
    I am not 100% sure, but I believe some people (Greeks?)
    have keyboards such that their native character set can be
    freely entered, when they're working in their native language.

    And if they are required to work in English, or rather, 7-bit
    ASCII, they will "switch keyboards", ie using the mouse or
    whatever to select a different keyboard, and type the English,
    and then return to the Greek etc keyboard.

    I'm interested in a slight change to C90. I'm not interested in
    UTF-8 either.

    I'd like to write a program using pure ASCII, and indeed, pure
    English prompts, but not force a Greek user to switch keyboards.
    I'm not interested in a complicated translation layer either.

    Originally I was thinking I just need to modify my programs and
    the Greek locale so that I could do:

    if (toupper(c) == 'X') printf("whatever\n");

    And make some random Greek character the equivalent of 'X', ie
    the Greek user knows that when prompted to type 'x' (or 'X'), he
    just needs to press (lambda or whatever Greeks use). The Greek
    locale will convert lambda into X when passed to toupper.

    However, it was pointed out to me that this would interfere with
    storing filenames on traditional FAT, for example. Not everything
    should be subject to uppercasing. The Greek, or Katakana, should
    be preserved, not converted into ASCII gibberish.

    So I was thinking I need some halfway point of equivalency.

    I'm happy to change all my programs so that they don't rely on
    the user typing in an exact character. ie I am happy to drop case
    sensitivity from everything, "now that I know there's an issue".
    Actually there are other environments where case sensitivity is
    difficult. e.g. some CMS (mainframe) environments.

    And making sure I do toupper() is a way to solve the issue for
    the environments where case-sensitivity is difficult/impossible.
    (assuming they exist).

    But I'd like to go one step further and cater for Greeks etc.

    And it seems to me that I want to not change toupper() - which
    would be expected to uppercase Greek characters (or some
    other language), independent of the uppercasing of any English
    characters they happened to enter (potentially at "great effort"
    of changing keyboards).

    And what I'm really after is being able to designate some Greek
    characters as the equivalent of English counterparts in circumstances
    where that is appropriate, and there is a desire to avoid a keyboard
    change. So a new isequiv() function as an extension to C90. (I'm
    basically forking C90 to create a C90+ or C90.0.1 - same as we
    do with software - bells and whistles go into C99 etc, not C90.0.1)
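
    One way to picture the proposal is a per-locale lookup table with one
    entry per byte value. The following is only a minimal sketch under
    stated assumptions: toequiv() and the setup helper set_equiv() are
    hypothetical names that do not exist in C90, and the byte values used
    in the usage note (0xEB and 0xCB for small and capital lambda) are
    borrowed from ISO 8859-7 purely as an example.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical C90+ extension: one table entry per unsigned char value. */
static unsigned char equiv_table[UCHAR_MAX + 1];
static int equiv_ready = 0;

/* Start with the identity mapping so unmapped characters pass through. */
static void equiv_init(void)
{
    int i;

    for (i = 0; i <= UCHAR_MAX; i++)
        equiv_table[i] = (unsigned char)i;
    equiv_ready = 1;
}

/* Declare that native character 'from' stands in for ASCII character 'to'.
   A real locale would load these pairs from its own definition. */
void set_equiv(int from, int to)
{
    assert(from >= 0 && from <= UCHAR_MAX);
    assert(to >= 0 && to <= UCHAR_MAX);
    if (!equiv_ready)
        equiv_init();
    equiv_table[from] = (unsigned char)to;
}

/* Like toupper(): takes an unsigned char value (or EOF), returns an int. */
int toequiv(int c)
{
    if (!equiv_ready)
        equiv_init();
    if (c < 0 || c > UCHAR_MAX)
        return c;               /* EOF and out-of-range values pass through */
    return equiv_table[c];
}
```

    After set_equiv(0xEB, 'X') and set_equiv(0xCB, 'X'), a test such as
    toupper(toequiv(c)) == 'X' would accept 'x', 'X', and either lambda,
    while toupper() on its own stays untouched. The cost is one array
    lookup per character.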

    Any thoughts?

    Thanks. Paul..



    --- PyGate Linux v1.5
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From David Brown@3:633/10 to All on Mon Nov 17 09:02:05 2025
    It is not at all clear to me what you are asking about here. Indeed, I
    do not think it is at all clear to /you/. Have you tried talking
    directly to someone who regularly uses a different alphabet - Greek,
    Cyrillic, Hebrew, etc.? Try to explain your idea to them and see what
    they think.

    At the moment it appears that you want to make "toupper" (or some new
    function) somehow take Greek letters (without using UTF-8) and turn some
    Greek lower-case letters into some Latin upper-case letters. But it
    should only do that sometimes - not when typing filenames for FAT.

    So which Greek letter do you think should be "equivalent" to Latin X?
    Perhaps xi, since it has similar pronunciation to Latin X (in English)?
    Perhaps chi, since its capital looks very much like a Latin X, though it
    has a different pronunciation and the lower-case is noticeably
    different. And which Greek letter should be "equivalent" to Q, J, or V?
    What should the Greek letters omega and psi convert to?

    And what are you going to do with Cyrillic alphabets (in all their
    varieties), and Hebrew, Arabic, or the many dozens of alphabets used in
    India and south-east Asia?



  • From Janis Papanagnou@3:633/10 to All on Mon Nov 17 10:20:40 2025
    On 11/15/25 10:33, Paul Edwards wrote:
    I am not 100% sure, but I believe some people (Greeks?)
    have keyboards such that their native character set can be
    freely entered, when they're working in their native language.

    In Greece you will typically get keyboards with the Greek letters.

    And if they are required to work in English, or rather, 7-bit
    ASCII, they will "switch keyboards", ie using the mouse or
    whatever to select a different keyboard, and type the English,
    and then return to the Greek etc keyboard.

    I had once configured a system to use some control-key combination
    (like Ctrl-Alt-Shift) to switch between three different languages
    (EN, GR, DE).

    I'm interested in a slight change to C90. I'm not interested in
    UTF-8 either.

    You have to map the keys to characters of some specific "codepage".

    It sounds to me as if you want an interactive keyboard-layout
    change to also switch the underlying character encoding. - To me
    that just sounds wrong! - How would a string like "Pμä" then be
    encoded?

    The environment that I set up just used UTF-8, a single encoding
    for all (in that case just three) languages. That way you could
    type (Greek) 'μ' or (German) 'ä' or any other character (as far
    as it's supported by the system with fonts, etc.).

    I'd like to write a program using pure ASCII, and indeed, pure
    English prompts, but not force a Greek user to switch keyboards.

    I understand it that the "C"-code is as usual ASCII but embedded
    strings may be any other character.

    Again: How would a string like "Pμä" be then encoded?

    The 'μ' (like an 'ä') could stem from ISO 8859-15 (but then it would
    be a special case), or from ISO 8859-7 (the native Greek variant of
    Latin), or from UTF-8. - You cannot represent these characters by a
    single ASCII-character.

    I'm not interested in a complicated translation layer either.

    What comes below sounds very fuzzy; I certainly don't understand what
    you have in mind there, so I cannot really comment on that.

    For me, the solution for multi-language programming environment would
    not switch character encodings but use a single standard (UTF-8) for
    that.


    Originally I was thinking I just need to modify my programs and
    the Greek locale so that I could do:

    if (toupper(c) == 'X') printf("whatever\n");

    And make some random Greek character the equivalent of 'X', ie
    the Greek user knows that when prompted to type 'x' (or 'X'), he
    just needs to press (lambda or whatever Greeks use). The Greek
    locale will convert lambda into X when passed to toupper.

    Are you looking for an ASCII representation of that (template?) 'X'?
    Something like "&mu;" (Like "&micro;" for 'μ' in HTML)?


    However, it was pointed out to me that this would interfere with
    storing filenames on traditional FAT, for example. Not everything
    should be subject to uppercasing. The Greek, or Katakana, should
    be preserved, not converted into ASCII gibberish.

    You should be aware that on filename level you typically have (on
    Unixes) just anonymous octets that need an interpretation to be
    displayed. (It may be UCS2 with Windows filesystems; don't know.)

    In my Linux/UTF-8 environment my filenames may contain umlauts or
    Greek letters

    $ touch "Pμä"
    $ ls "Pμä"
    Pμä

    The filename will be stored in octets (values 0..255), where each
    non-ASCII character will occupy more than one octet.
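
    The octet count can be checked from C without putting any non-ASCII
    byte in the source file. A small sketch, assuming the name above is
    "P", Greek mu (U+03BC) and a-umlaut (U+00E4) encoded as UTF-8; the
    helper names are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* "P", mu, a-umlaut written as UTF-8 octets, so this source stays ASCII. */
static const char utf8_name[] = "P\xCE\xBC\xC3\xA4";

/* Number of octets the name occupies, which is what the file system stores. */
size_t octet_count(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}

/* Number of characters: count every octet that is not a UTF-8
   continuation byte (bit pattern 10xxxxxx). */
size_t utf8_char_count(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (*s != '\0') {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
        s++;
    }
    return n;
}
```

    For this name the two functions give 5 octets and 3 characters: one
    octet for the ASCII letter and two for each of the other characters.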

    Such filenames will only be displayed as "ASCII gibberish" if you
    somehow "force" it to be interpreted as pure ASCII.


    So I was thinking I need some halfway point of equivalency.

    I'm happy to change all my programs so that they don't rely on
    the user typing in an exact character. ie I am happy to drop case
    sensitivity from everything, "now that I know there's an issue".
    Actually there are other environments where case sensitivity is
    difficult. e.g. some CMS (mainframe) environments.

    And making sure I do toupper() is a way to solve the issue for
    the environments where case-sensitivity is difficult/impossible.
    (assuming they exist).

    Usually this should be handled by the locale setting.

    $ awk 'BEGIN {print tolower("Γ")}'
    γ
    $ awk 'BEGIN {print toupper("γ")}'
    Γ

    The test above is from my environment (using UTF-8); it works even
    *without* setting any Greek locale (I'm using "en_US.UTF-8".)


    But I'd like to go one step further and cater for Greeks etc.

    And it seems to me that I want to not change toupper() - which
    would be expected to uppercase Greek characters (or some
    other language), independent of the uppercasing of any English
    characters they happened to enter (potentially at "great effort"
    of changing keyboards).

    And what I'm really after is being able to designate some Greek
    characters as the equivalent of English counterparts in circumstances
    where that is appropriate,

    To me this sounds as fuzzy as wrong.

    and there is a desire to avoid a keyboard
    change. So a new isequiv() function as an extension to C90. (I'm
    basically forking C90 to create a C90+ or C90.0.1 - same as we
    do with software - bells and whistles go into C99 etc, not C90.0.1)

    Any thoughts?

    I cannot comment on your reluctance to use (or requirement to not use)
    an underlying UTF-8 encoding, and the rest should be handled by the
    locale - if UTF-8 characters are supported and allowed in string and
    character literals of the respective programming language; I haven't
    tried for "C" (but I seem to have no problems with using any UTF-8
    characters in string literals).

    I hope you found some useful information and got some more insights.
    If, based on that, you can clarify your intentions and thoughts I'd
    be interested to hear what you're actually trying to achieve.

    Janis






  • From Paul Edwards@3:633/10 to All on Sun Nov 23 08:26:05 2025
    "David Brown" <david.brown@hesbynett.no> wrote in message news:10fekpt$mkk8$1@dont-email.me...

    It is not at all clear to me what you are asking about here. Indeed, I
    do not think it is at all clear to /you/. Have you tried talking
    directly to someone who regularly uses a different alphabet - Greek,
    Cyrillic, Hebrew, etc.? Try to explain your idea to them and see what
    they think.

    No. This IS my (apparently - next message) attempt to directly
    talk to someone.

    At the moment it appears that you want to make "toupper" (or some new
    function) somehow take Greek letters (without using UTF-8) and turn some
    Greek lower-case letters into some Latin upper-case letters. But it
    should only do that sometimes - not when typing filenames for FAT.

    Yes. The new function would do that.

    toupper() would then be free to continue as-is.

    So which Greek letter do you think should be "equivalent" to Latin X?

    Any character will do. locales are flexible like that.

    Perhaps xi, since it has similar pronunciation to Latin X (in English)?

    Nope - not required. Completely arbitrary. Just the same as it is
    completely arbitrary on the keyboard itself. ie they likely have the
    QWERTY keyboard (in small letters) under the Greek. Even if
    they don't print the English/Latin alphabet, it doesn't matter. An
    English touch-typist (like me) knows where the QWERTY keys
    are.

    Note that the layout of QWERTY itself is completely arbitrary too.

    Perhaps chi, since its capital looks very much like a Latin X, though it
    has a different pronunciation and the lower-case is noticeably
    different. And which Greek letter should be "equivalent" to Q, J, or V?
    What should the Greek letters omega and psi convert to?

    Out of scope - arbitrary.

    And what are you going to do with Cyrillic alphabets (in all their
    varieties), and Hebrew, Arabic, or the many dozens of alphabets used in
    India and south-east Asia?

    Same deal - arbitrary where they go on the keyboard. Arbitrary
    what their code point is in any character set. Arbitrary which
    English/Latin characters they map to.

    I'm not trying to break the tradition of different people inventing
    different code pages. That's a different problem to solve (and
    being bypassed with UTF-8 currently, at the cost of extra
    processing time and code - I don't want either of those things).

    BFN. Paul.




  • From Paul Edwards@3:633/10 to All on Sun Nov 23 08:54:14 2025
    "Janis Papanagnou" <janis_papanagnou+ng@hotmail.com> wrote in message news:10fepd8$3p5tk$4@dont-email.me...
    On 11/15/25 10:33, Paul Edwards wrote:
    I am not 100% sure, but I believe some people (Greeks?)
    have keyboards such that their native character set can be
    freely entered, when they're working in their native language.

    In Greece you will typically get keyboards with the Greek letters.

    Exactly what I expect. And I'm not trying to change that.

    And if they are required to work in English, or rather, 7-bit
    ASCII, they will "switch keyboards", ie using the mouse or
    whatever to select a different keyboard, and type the English,
    and then return to the Greek etc keyboard.

    I had once configured a system to use some control-key combination
    (like Ctrl-Alt-Shift) to switch between three different languages
    (EN, GR, DE).

    And assuming someone only knows Greek, I don't want them
    to ever switch keyboards.

    (I wouldn't want it if it was the other way around - and ASCII
    was in fact all Greek).

    I'm interested in a slight change to C90. I'm not interested in
    UTF-8 either.

    You have to map the keys to characters of some specific "codepage".

    Yes, any traditional Greek codepage is fine. Or even a
    non-traditional Greek codepage. I expect all the Greek
    characters to exist between x'80' and x'FF'.

    It sounds to me that you want with an interactive keyboard-layout
    change also to switch the underlying character encoding.

    No - not correct. There won't even BE a keyboard change.
    Unless you have someone who speaks more than just Greek.

    - To me
    that just sounds wrong! - How would a string like "Pμä" be then
    encoded?

    However you did it in 1990 when ISO/IEC 9899:1990 was
    published.

    The environment that I set up just used UTF-8, a single encoding
    for all (in that case just three) languages. That way you could
    type (Greek) 'μ' or (German) 'ä' or any other character (as far
    as it's supported by the system with fonts, etc.).

    Yes, I'm not interested in the computing power needed to do
    that. Nor the burden placed on the display to display Kanji.
    I'm only interested in (half-width I think) Katakana. ie the
    first Japanese displays for the PC from the 1980s.

    I'd like to write a program using pure ASCII, and indeed, pure
    English prompts, but not force a Greek user to switch keyboards.

    I understand it that the "C"-code is as usual ASCII but embedded
    strings may be any other character.

    As an "English" C programmer, I do not wish to put in embedded
    Greek strings. Nor provide a translation layer for Greek. Nor
    have the speed of my program impacted to support Greek
    characters.

    I don't expect the Greeks to learn English, but I do expect them
    to get used to using the software such that they recognize that
    when a particular bit of English gibberish like "Enter your name"
    appears on the screen, that they know it is time to enter their
    (Greek) name. I knew some Chinese people who operated the
    ATM in Australia like that - they didn't bother to learn what
    the prompts were - they just memorized the sequence. The
    user interface changed one day and they couldn't withdraw
    money anymore. The executable I provide won't change
    unless you change it, and then you'll need to memorize that
    new sequence as part of the upgrade - if you have zero
    English.

    Again: How would a string like "Pμä" be then encoded?

    As per 1980s. I'm not trying to change what you did in the
    1980s. And I'm not trying to change the existing "accents".
    ie you type ^ and then a "u" and then you get a u with a
    hat. (in some countries).

    The 'μ' (like an 'ä') could stem from ISO 8859-15 (but then it would
    be a special case), or from ISO 8859-7 (the native Greek variant of
    Latin), or from UTF-8. - You cannot represent these characters by a
    single ASCII-character.

    I don't want to use ASCII. Nothing a Greek-only person will
    ever type will be in ASCII, it will all be a SINGLE value
    between x'80' and x'FF'.
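
    One practical consequence for C code: on implementations where plain
    char is signed, a byte between x'80' and x'FF' read from a char array
    is negative, and the <ctype.h> functions are only defined for
    unsigned char values and EOF. A minimal sketch of the usual cast,
    with helper names invented for illustration:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Return s[i] in the form the <ctype.h> functions expect:
   a value in 0..UCHAR_MAX, never a negative number. */
int byte_at(const char *s, size_t i)
{
    assert(s != NULL);
    return (unsigned char)s[i];
}

/* Upper-case s[i] according to the current locale. */
int upper_at(const char *s, size_t i)
{
    return toupper(byte_at(s, i));
}
```

    In the default "C" locale upper_at() changes only a to z and returns
    every byte from x'80' upward unchanged. What it does with those bytes
    under another locale depends on that locale.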

    I'm not interested in a complicated translation layer either.

    What comes below sounds very fuzzy; I certainly don't understand what
    you have in mind there, so I cannot really comment on that.

    I'm happy to explain.

    For me, the solution for multi-language programming environment would
    not switch character encodings but use a single standard (UTF-8) for
    that.

    It's not multi-language - well - not by preference. The end
    user would much rather have the prompts for "what is your
    name?" in Greek, but that's not on the table.

    Originally I was thinking I just need to modify my programs and
    the Greek locale so that I could do:

    if (toupper(c) == 'X') printf("whatever\n");

    And make some random Greek character the equivalent of 'X', ie
    the Greek user knows that when prompted to type 'x' (or 'X'), he
    just needs to press (lambda or whatever Greeks use). The Greek
    locale will convert lambda into X when passed to toupper.

    Are you looking for an ASCII representation of that (template?) 'X'?
    Something like "&mu;" (Like "&micro;" for 'μ' in HTML)?

    Yes, an ASCII representation of similar-to-uppercase "micro".

    However, it was pointed out to me that this would interfere with
    storing filenames on traditional FAT, for example. Not everything
    should be subject to uppercasing. The Greek, or Katakana, should
    be preserved, not converted into ASCII gibberish.

    You should be aware that on filename level you typically have (on
    Unixes) just anonymous octets that need an interpretation to be
    displayed. (It may be UCS2 with Windows filesystems; don't know.)

    In my Linux/UTF-8 environment my filenames may contain umlauts or
    Greek letters

    $ touch "Pμä"
    $ ls "Pμä"
    Pμä

    The filename will be stored in octets (values 0..255), where each
    non-ASCII character will occupy more than one octet.

    Such filenames will only be displayed as "ASCII gibberish" if you
    somehow "force" it to be interpreted as pure ASCII.

    It won't be ASCII or UTF-8. It will be what MSDOS in
    Greece did in the 1980s. I won't say "when men were men",
    but it's a variation of that.

    So I was thinking I need some halfway point of equivalency.

    I'm happy to change all my programs so that they don't rely on
    the user typing in an exact character. ie I am happy to drop case
    sensitivity from everything, "now that I know there's an issue".
    Actually there are other environments where case sensitivity is
    difficult. e.g. some CMS (mainframe) environments.

    And making sure I do toupper() is a way to solve the issue for
    the environments where case-sensitivity is difficult/impossible.
    (assuming they exist).

    Usually this should be handled by the locale setting.

    Sure. My plan is to implement locales.

    $ awk 'BEGIN {print tolower("Γ")}'
    γ
    $ awk 'BEGIN {print toupper("γ")}'
    Γ

    The test above is from my environment (using UTF-8); it works even
    *without* setting any Greek locale (I'm using "en_US.UTF-8".)

    Yes, I don't want that solution.

    But I'd like to go one step further and cater for Greeks etc.

    And it seems to me that I want to not change toupper() - which
    would be expected to uppercase Greek characters (or some
    other language), independent of the uppercasing of any English
    characters they happened to enter (potentially at "great effort"
    of changing keyboards).

    And what I'm really after is being able to designate some Greek
    characters as the equivalent of English counterparts in circumstances
    where that is appropriate,

    To me this sounds as fuzzy as wrong.

    Can you touch-type? I assume all SBCS users can touch-type
    in their language (if they learn). I expect someone who only
    knows Greek, to touch-type at full speed - in Greek - even
    when using my English-only software.

    and there is a desire to avoid a keyboard
    change. So a new isequiv() function as an extension to C90. (I'm
    basically forking C90 to create a C90+ or C90.0.1 - same as we
    do with software - bells and whistles go into C99 etc, not C90.0.1)

    Any thoughts?

    I cannot comment on your reluctance to use (or requirement to not use)
    an underlying UTF-8 encoding, and the rest should be handled by the
    locale - if UTF-8 characters are supported and allowed in string and
    character literals of the respective programming language; I haven't
    tried for "C" (but I seem to have no problems with using any UTF-8
    characters in string literals).

    None of that - including literals in my own software - is what
    I am after.

    I hope you found some useful information and got some more insights.
    If, based on that, you can clarify your intentions and thoughts I'd
    be interested to hear what you're actually trying to achieve.

    I'm interested in Greek, Greek, Greek.

    At no extra processing or space cost to English.

    Exactly as it was when the C90 standard was created.

    Which came VERY close to supporting exactly that. The
    C89 standard was delayed for a year because of Europeans
    intending to change C89 to support locales. Instead, the
    Americans changed ANSI. And got it NEARLY right.
    You can't expect people to get things 100% right. So
    with the benefit of 35 years of hindsight, and a C library
    and operating system and supporting tools under my
    belt, I'm trying to add some MINOR touches to C90.

    BFN. Paul.



  • From bart@3:633/10 to All on Sun Nov 23 01:28:06 2025
    On 15/11/2025 09:33, Paul Edwards wrote:
    I am not 100% sure, but I believe some people (Greeks?)
    have keyboards such that their native character set can be
    freely entered, when they're working in their native language.

    And if they are required to work in English, or rather, 7-bit
    ASCII, they will "switch keyboards", ie using the mouse or
    whatever to select a different keyboard, and type the English,
    and then return to the Greek etc keyboard.

    I'm interested in a slight change to C90. I'm not interested in
    UTF-8 either.

    I'd like to write a program using pure ASCII, and indeed, pure
    English prompts, but not force a Greek user to switch keyboards.
    I'm not interested in a complicated translation layer either.

    Originally I was thinking I just need to modify my programs and
    the Greek locale so that I could do:

    if (toupper(c) == 'X') printf("whatever\n");

    And make some random Greek character the equivalent of 'X', ie
    the Greek user knows that when prompted to type 'x' (or 'X'), he
    just needs to press (lambda or whatever Greeks use). The Greek
    locale will convert lambda into X when passed to toupper.

    This is quite a big subject, and it spans many areas such as hardware,
    drivers, OS, displays, character sets, fonts, as well as programming
    languages and applications.

    Whatever it is you want to do, I doubt it will be solved with one C
    function.



    However, it was pointed out to me that this would interfere with
    storing filenames on traditional FAT, for example. Not everything
    should be subject to uppercasing. The Greek, or Katakana, should
    be preserved, not converted into ASCII gibberish.

    Are you planning to use 8-bit code-pages?

    Those have all sorts of problems which are unlikely to be tolerated
    these days.

    I suggest you just try and make UTF-8 work, even if it's a cut-down
    version that only supports the first plane of Unicode (the Basic
    Multilingual Plane), which is character codes up to 64K.
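
    To give an idea of the size of such a cut-down version, here is a
    sketch of a decoder that accepts only the 1, 2 and 3 octet forms of
    UTF-8, which cover code points up to U+FFFF. The function name and
    interface are assumptions made for this example, not part of any
    existing library.

```c
#include <assert.h>
#include <stddef.h>

/* Decode one UTF-8 sequence of 1 to 3 octets (code points up to U+FFFF).
   Returns the number of octets consumed, or 0 if the input is malformed,
   truncated, or needs the 4-octet form that this cut-down decoder omits. */
size_t utf8_decode_bmp(const unsigned char *s, size_t len, unsigned long *out)
{
    assert(s != NULL && out != NULL);
    if (len == 0)
        return 0;
    if (s[0] < 0x80) {                      /* 0xxxxxxx: plain ASCII */
        *out = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0) {            /* 110xxxxx 10xxxxxx */
        if (len < 2 || (s[1] & 0xC0) != 0x80)
            return 0;
        *out = ((unsigned long)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
        return (*out >= 0x80) ? 2 : 0;      /* reject overlong forms */
    }
    if ((s[0] & 0xF0) == 0xE0) {            /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (len < 3 || (s[1] & 0xC0) != 0x80 || (s[2] & 0xC0) != 0x80)
            return 0;
        *out = ((unsigned long)(s[0] & 0x0F) << 12)
             | ((unsigned long)(s[1] & 0x3F) << 6)
             | (s[2] & 0x3F);
        if (*out < 0x800)
            return 0;                       /* overlong form */
        if (*out >= 0xD800 && *out <= 0xDFFF)
            return 0;                       /* surrogates are not characters */
        return 3;
    }
    return 0;                               /* 4-octet lead or stray byte */
}
```

    Greek letters take the 2-octet branch and half-width Katakana the
    3-octet branch, so both fit within this subset.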

    Anything else, even going with 16-bit characters, is going to be
    impractical.

    So I was thinking I need some halfway point of equivalency.

    That doesn't make sense, sorry. Imagine you were working in Greek and
    were talking about mapping A-Z to some random Greek letter, but it
    didn't matter which!



  • From Janis Papanagnou@3:633/10 to All on Sun Nov 23 09:37:00 2025
    On 11/23/25 01:54, Paul Edwards wrote:
    "Janis Papanagnou" <janis_papanagnou+ng@hotmail.com> wrote in message news:10fepd8$3p5tk$4@dont-email.me...
    On 11/15/25 10:33, Paul Edwards wrote:
    ...
    ...
    And assuming someone only knows Greek, I don't want them
    to ever switch keyboards.

    (I had been speaking just about switching the layout, not the
    keyboard. The installation I spoke of used one Greek keyboard
    used with three switchable layouts, for EN, GR, DE.)

    (In another environment I'm using two keyboards, EN and DE, in
    parallel on one system, each one with its own specific layout.)


    (I wouldn't want it if it was the other way around - and ASCII
    was in fact all Greek).

    Not sure what you're trying to say here. ASCII was never Greek,
    and ISO 8859-7 contains ASCII as subset and Greek in the upper
    range.

    ...
    ...
    Yes, any traditional Greek codepage is fine. Or even a
    non-traditional Greek codepage. I expect all the Greek
    characters to exist between x'80' and x'FF'.

    I don't know what you mean by "non-/traditional Greek codepage".

    (Back in those days there was no 7-bit Greek codepage that I knew of
    - but I'm not sure - but there was 8-bit [ASCII based/"extended"]
    ISO 8859-7 in the late 1980's.)

    ...
    I don't expect the Greeks to learn English, but I do expect them
    to get used to using the software such that they recognize that
    when a particular bit of English gibberish like "Enter your name"
    appears on the screen, that they know it is time to enter their
    (Greek) name.

    I see. - So something like: Enter your name: Γιάννης

    And the binary/encoded representation shall be, say, ISO 8859-7 ?

    I'd have expected that old "C" versions supported 8-bit character
    sets without any change; I cannot tell for GR, but I've certainly
    used DE with ISO 8859-1 and -15 encoded data in "C" (without any
    change of the language) that contained umlauts (for example).

    (With gcc -std=c90 and LC_ALL=C I can at least create "Ölfräß" in
    a contemporary system. But that might not be an exact test case,
    as you may have in mind.)

    ...
    ...
    As per 1980s. I'm not trying to change what you did in the
    1980s. And I'm not trying to change the existing "accents".
    ie you type ^ and then a "u" and then you get a u with a
    hat. (in some countries).

    (But that is just a configured way of how diacritical characters
    can be entered through keyboard. In my systems I can set it with
    the locale; using or not using "dead-characters", or some such.)

    ...
    I don't want to use ASCII. Nothing a Greek-only person will
    ever type will be in ASCII, it will all be a SINGLE value
    between x'80' and x'FF'.

    Yes, with ISO 8859-7 (Latin/Greek) you'll have 8 bits available (as
    opposed to 7-bit ASCII). So if your system supports an 8-bit ISO 8859-7
    character set, that should probably work already, shouldn't it?
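
    As an illustration of how little code an 8-bit Greek code page needs,
    here is a sketch of upper-casing for ISO 8859-7 bytes. It is not how
    any particular C library implements its Greek locale; the function
    name is made up, only unaccented letters are handled, and the assumed
    layout is capitals at 0xC1..0xD9, small letters at 0xE1..0xF9, and
    final sigma at 0xF2.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Upper-case one ISO 8859-7 byte, unaccented letters only.
   Accepts the same domain as toupper(): an unsigned char value or EOF. */
int greek_toupper(int c)
{
    assert(c == EOF || (c >= 0 && c <= UCHAR_MAX));
    if (c >= 'a' && c <= 'z')
        return c - 'a' + 'A';   /* ASCII half; assumes an ASCII-based code page */
    if (c == 0xF2)
        return 0xD3;            /* final sigma has no capital of its own: use Sigma */
    if (c >= 0xE1 && c <= 0xF9)
        return c - 0x20;        /* alpha..omega sit 0x20 above Alpha..Omega */
    return c;                   /* everything else, accented letters included */
}
```

    A complete locale would also map the accented vowels, which this
    sketch leaves unchanged.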

    ...
    It's not multi-language - well - not by preference. The end
    user would much rather have the prompts for "what is your
    name?" in Greek, but that's not on the table.

    (Using a character encoding that supports two complete languages
    is "multi-language", actually.)


    Originally I was thinking I just need to modify my programs and
    the Greek locale so that I could do:

    if (toupper(c) == 'X') printf("whatever\n");

    And make some random Greek character the equivalent of 'X', ie
    the Greek user knows that when prompted to type 'x' (or 'X'), he
    just needs to press (lambda or whatever Greeks use). The Greek
    locale will convert lambda into X when passed to toupper.

    Are you looking for an ASCII representation of that (template?) 'X'?
    Something like "&mu;" (Like "&micro;" for 'µ' in HTML)?

    Yes, an ASCII representation of similar-to-uppercase "micro".

    It's still unclear to me. - Above I got the impression you'd want
    Enter your name: Γιάννης
    now it reads as if you want just a (7-bit) ASCII encoding, as in
    Enter your name: Giannis
    In both cases upper-casing should be possible, though.

    (I still don't see where you actually see or have the problem.)

    ...
    ...
    It won't be ASCII or UTF-8. It will be what MSDOS in
    Greece did in the 1980s.

    I cannot tell anything about DOS in Greece in the 1980s. As far as
    memory serves, I know that (DOS based) Windows 95 could be used in
    Greece with Greek keyboards and letters. (It should be possible to
    Google what codepage Windows used back in those days. I'm positive
    that MS did not do any rocket science back then in that respect.)

    ...
    Can you touch-type? I assume all SBCS users can touch-type
    in their language (if they learn). I expect someone who only
    knows Greek, to touch-type at full speed - in Greek - even
    when using my English-only software.

    Yes. - And...? (Not sure what that has to do with the requirements.)

    ...

    I'm interested in Greek, Greek, Greek.

    At no extra processing or space cost to English.

    Exactly as it was when the C90 standard was created.

    Which came VERY close to supporting exactly that. The
    C89 standard was delayed for a year because of Europeans
    intending to change C89 to support locales. Instead, the
    Americans changed ANSI. And got it NEARLY right.
    You can't expect people to get things 100% right. So
    with the benefit of 35 years of hindsight, and a C library
    and operating system and supporting tools under my
    belt, I'm trying to add some MINOR touches to C90.

    I cannot test; my expectation is that you can do what's described
    above already with a Greek Latin 7 encoding natively in C90 as was
    (most probably) done in Greece back in the late 1980's.

    One thing *might* be crucial where you said:
    Yes, an ASCII representation of similar-to-uppercase "micro".
    But I didn't understand what you wanted here, so I'll bite.

    Janis


    --- PyGate Linux v1.5.1
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From David Brown@3:633/10 to All on Sun Nov 23 13:23:30 2025
    On 23/11/2025 01:26, Paul Edwards wrote:
    "David Brown" <david.brown@hesbynett.no> wrote in message news:10fekpt$mkk8$1@dont-email.me...

    It is not at all clear to me what you are asking about here. Indeed, I
    do not think it is at all clear to /you/. Have you tried talking
    directly to someone who regularly uses a different alphabet - Greek,
    Cyrillic, Hebrew, etc.? Try to explain your idea to them and see what
    they think.

    No. This IS my (apparently - next message) attempt to directly
    talk to someone.

    Perhaps you should start by /talking/ to someone who works with multiple
    languages and alphabets. By "talk" I mean "talk" - in person - not
    Usenet posts. You need to be discussing this with people in the same
    room, with hand-waving, white board scribbles, trying out sample
    programs (from the user's viewpoint - the code is irrelevant). You need
    to talk to people who are fluent in English (so that you understand
    them) and Greek or other languages, and who also know, help and interact
    with people who /only/ speak Greek (or whatever) and not English. You
    can't communicate with the people you are trying to help here, so at
    least talk to people who already help those users.

    And you have to start by asking them "Is there a problem here? How
    could we make things easier for you?".

    You seem to have started by imagining some kinds of users, imagining the
    kinds of challenges they have, and imagining a solution for those
    challenges with the priority firmly on what fits your pet "sort-of-C90"
    concept rather than thinking about what these imaginary users might want.

    Talking here to Janis is better than sitting by yourself and imagining
    everything, but you have to start by thinking about the right questions.
    Start at the right end - find out if there is a problem, what that
    problem is, and how the user's experience can be improved. Don't bother
    asking Janis about his experiences - he doesn't have a problem because
    he can switch keyboard layouts and has better English language skills
    than most English language natives. (I'm assuming here that Janis is
    Greek by upbringing - perhaps he is only Greek by his family roots.)
    You have to ask Janis if his great aunty has trouble interacting with
    software written only for English speakers, and asking how you could
    make things easier for her.


  • From Paul Edwards@3:633/10 to All on Mon Nov 24 01:45:37 2025
    "Janis Papanagnou" <janis_papanagnou+ng@hotmail.com> wrote in message news:10fuh3c$100up$1@dont-email.me...
    On 11/23/25 01:54, Paul Edwards wrote:
    "Janis Papanagnou" <janis_papanagnou+ng@hotmail.com> wrote in message news:10fepd8$3p5tk$4@dont-email.me...
    On 11/15/25 10:33, Paul Edwards wrote:
    ...
    ...
    And assuming someone only knows Greek, I don't want them
    to ever switch keyboards.

    (I had been speaking just about switching the layout, not the
    keyboard. The installation I spoke of used one Greek keyboard
    used with three switchable layouts, for EN, GR, DE.)

    Sorry - I used the wrong word (I'm not fluent in terminology).

    I don't want a Greek user to ever switch LAYOUTS, even
    when using MY English-only software.

    Also note - what I am trying to do is more "icing on the cake"
    than "base functionality". As you noted - people can already
    enter their name in Greek.

    What they can't do is "press X to continue".

    (In another environment I'm using two keyboards, EN and DE, in
    parallel on one system, each one with its own specific layout.)


    (I wouldn't want it if it was the other way around - and ASCII
    was in fact all Greek).

    Not sure what you're trying to say here. ASCII was never Greek,
    and ISO 8859-7 contains ASCII as subset and Greek in the upper
    range.

    If the Greeks had invented computers (and had a population
    of 90% of the globe), there is little doubt that the bottom
    128 code points would be Greek characters, and English
    characters would be in the top 128.

    The prompt "enter your name" would be in Greek.

    I don't wish to learn Greek, as I'm in that 10% of the globe,
    and I'm not a programmer.

    Yes, any traditional Greek codepage is fine. Or even a
    non-traditional Greek codepage. I expect all the Greek
    characters to exist between x'80' and x'FF'.

    I don't know what you mean by "non-/traditional Greek codepage".

    A traditional Greek codepage is one from the late 1980s.
    You named an ISO one - probably that.

    A non-traditional Greek codepage is one that I just made
    up right now. I'm going to put the Greek "micro" at code
    point x'F3'. For no particular reason.

    Or maybe the reason is that I want to store the box-drawing
    characters, plus some other things, and ISO doesn't have
    something that does exactly what I want.

    (Back in those days there was no 7-bit Greek codepage that I knew of
    - but I'm not sure - but there was 8-bit [ASCII based/"extended"]
    ISO 8859-7 in the late 1980's.)

    Yes - what (some) people call "extended ASCII".

    I don't expect the Greeks to learn English, but I do expect them
    to get used to using the software such that they recognize that
    when a particular bit of English gibberish like "Enter your name"
    appears on the screen, that they know it is time to enter their
    (Greek) name.

    I see. - So something like: Enter your name: Γιάννης

    And the binary/encoded representation shall be, say, ISO 8859-7 ?

    Yes - correct. That all works already.

    The C89/C90 people got MOST stuff working already. All
    the blatantly obvious stuff works already.

    They didn't have the luxury of 35 years of thinking about
    the same topic.

    I'd have expected that old "C" versions supported 8-bit character
    sets without any change; I cannot tell for GR, but I've certainly
    used DE with ISO 8859-1 and -15 encoded data in "C" (without any
    change of the language) that contained umlauts (for example).

    Sure.

    (With gcc -std=c90 and LC_ALL=C I can at least create "Ölfräß" in
    a contemporary system. But that might not be an exact test case,
    as you may have in mind.)

    Sure - that all works.

    As per 1980s. I'm not trying to change what you did in the
    1980s. And I'm not trying to change the existing "accents".
    ie you type ^ and then a "u" and then you get a u with a
    hat. (in some countries).

    (But that is just a configured way of how diacritical characters
    can be entered through keyboard. In my systems I can set it with
    the locale; using or not using "dead-characters", or some such.)

    Sure - I'm not interested in how that is done.

    I don't want to use ASCII. Nothing a Greek-only person will
    ever type will be in ASCII, it will all be a SINGLE value
    between x'80' and x'FF'.

    Yes, with ISO Latin 7 you'll have 8 bit available (as opposed to
    7-bit ASCII). So if your system is supporting an 8-bit ISO 8859-7
    character set that should probably work already; doesn't it?

    Yes - almost.

    The problem comes about in my English-only programs - that
    support Greek - mostly - by C90 accident.

    Assuming setlocale(LC_ALL, ""); is in effect, anyway.
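
    For reference, a minimal sketch of what that assumption looks like
    in a C90 program. The helper names use_env_locale and upper_byte are
    made up for illustration; only setlocale and toupper come from the
    standard library. Whether a byte in x'80'..x'FF' changes under
    toupper depends entirely on the locale the environment selects.

```c
#include <ctype.h>
#include <locale.h>

/* Switch from the default "C" locale to whatever the environment selects.
   Returns 1 on success, 0 if the request could not be honoured. */
static int use_env_locale(void)
{
    return setlocale(LC_ALL, "") != NULL;
}

/* Uppercase one byte.  The cast matters for values in x'80'..x'FF':
   passing a negative char value to toupper() is undefined behaviour. */
static int upper_byte(char c)
{
    return toupper((unsigned char)c);
}
```

    The idea would be to call use_env_locale() once at startup and then
    run every input byte through upper_byte() before comparing it.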

    It's not multi-language - well - not by preference. The end
    user would much rather have the prompts for "what is your
    name?" in Greek, but that's not on the table.

    (Using a character encoding that supports two complete languages
    is "multi-language", actually.)

    Ok.

    Originally I was thinking I just need to modify my programs and
    the Greek locale so that I could do:

    if (toupper(c) == 'X') printf("whatever\n");

    And make some random Greek character the equivalent of 'X', ie
    the Greek user knows that when prompted to type 'x' (or 'X'), he
    just needs to press (lambda or whatever Greeks use). The Greek
    locale will convert lambda into X when passed to toupper.

    Are you looking for an ASCII representation of that (template?) 'X'?
    Something like "&mu;" (Like "&micro;" for 'µ' in HTML)?

    Yes, an ASCII representation of similar-to-uppercase "micro".

    It's still unclear to me. - Above I got the impression you'd want
    Enter your name: Γιάννης
    now it reads as if you want just a (7-bit) ASCII encoding, as in
    Enter your name: Giannis
    In both cases upper-casing should be possible, though.

    No - none of that is correct. I want the 8-bit Greek characters.

    (I still don't see where you actually see or have the problem.)

    Ok, so here's another one.

    As a Greek user, I want to be able to type:

    dir /od

    in Greek gibberish.

    I literally bring an OS to the table - PDOS (see pdos.org).

    I have no intention of creating a Greek translation of "dir".

    But I would also like the Greek user to have the OPTION
    of NOT changing keyboard layouts, and STILL be able to
    use PDOS.

    What I can do is uppercase the string "dir /od", and then
    in my code I can do strcmp() with "DIR".

    That covers ME PERSONALLY.
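
    A sketch of that uppercase-then-compare step, assuming the command
    word is the first blank-delimited token on the line. The names
    upper_string and is_dir_command are hypothetical, not PDOS code.

```c
#include <ctype.h>
#include <string.h>

/* Uppercase a string in place, one byte at a time. */
static void upper_string(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}

/* Return 1 if the first word of line is "dir" in any letter case. */
static int is_dir_command(const char *line)
{
    char word[16];
    size_t n = strcspn(line, " \t");   /* length of the first word */

    if (n >= sizeof word)
        return 0;                      /* too long to be "dir" */
    memcpy(word, line, n);
    word[n] = '\0';
    upper_string(word);
    return strcmp(word, "DIR") == 0;
}
```

    With this, "dir /od", "DIR /od" and "Dir" all match. The same keys
    pressed on a Greek layout produce bytes above x'7F', which toupper
    leaves alone in the "C" locale, so they do not match.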

    It doesn't cover a Greek person - they would have to
    change layouts.

    They can't just press the Greek character at location "d"
    on the US ASCII keyboard, which will generate a "micro"
    or whatever.

    This is what I want to avoid.

    I'm happy to change all MY software (decades worth), to
    ensure that there is no case sensitivity, by making sure that
    all strings are uppercased.

    But that won't cover the Greeks, who will be forced to
    type either lowercase English or uppercase English.

    I don't want them to have to do either of those.

    And nor do I want to do anything in my own software that
    is much more onerous than "toupper".

    And now I want to bridge that gap, by replacing toupper
    with toequiv.

    On MY system - because I use the "C" locale - toupper and
    toequiv will be identical.

    But for the Greek locale, selected with setlocale(LC_ALL, ""), toequiv
    will behave differently from toupper. The 7-bit ASCII range will be
    the same - mapping 'd' to 'D'. But in the high range some Greek
    characters, e.g. "micro", will ALSO map to 'D'.

    So as an OPTION you can type "dir" really fast (touch-typing)
    WITHOUT changing layouts.

    And without ME doing ANY work except the minor "cultural
    change" of using toequiv instead of toupper.

    I'm happy to change my own "culture" *to this extent*.
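
    To make the idea concrete, here is a table-driven sketch. toequiv is
    only a proposal and is not part of any C standard, so everything
    below is hypothetical: the table, the two init functions, and the
    assumption that the Greek locale is ISO 8859-7, where small mu
    ("micro") is x'EC'. Mapping it to 'D' just follows the arbitrary
    example above. In a real C library the table would presumably live
    in the locale data and be swapped by setlocale.

```c
#include <ctype.h>
#include <limits.h>

/* Hypothetical per-locale table: equiv_table[b] is the character
   that byte b is treated as equivalent to. */
static unsigned char equiv_table[UCHAR_MAX + 1];

/* "C" locale behaviour: toequiv is identical to toupper. */
static void equiv_init_c(void)
{
    int b;

    for (b = 0; b <= UCHAR_MAX; b++)
        equiv_table[b] = (unsigned char)toupper(b);
}

/* Assumed Greek behaviour: ASCII as above, plus one high-range byte
   that ALSO maps to 'D' (small mu in ISO 8859-7, chosen arbitrarily). */
static void equiv_init_greek(void)
{
    equiv_init_c();
    equiv_table[0xEC] = 'D';
}

/* Proposed replacement for toupper in comparisons.  Like toupper, it
   takes an unsigned char value or EOF; out-of-range values pass through. */
static int toequiv(int c)
{
    if (c < 0 || c > UCHAR_MAX)
        return c;
    return equiv_table[c];
}
```

    The per-character cost is one array lookup, the same as a typical
    table-based toupper, and nothing is stored differently: only the
    comparison sees the mapped value.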

    It won't be ASCII or UTF-8. It will be what MSDOS in
    Greece did in the 1980s.

    I cannot tell anything about DOS in Greece in the 1980s. As far as
    memory serves, I know that (DOS based) Windows 95 could be used in
    Greece with Greek keyboards and letters. (It should be possible to
    Google what codepage Windows used back in those days. I'm positive
    that MS did not do any rocket science back then in that respect.)

    Sure. Everything mostly works already, except for the fact
    that you are forced to change layout - to enter a language
    "you" don't even know (English), to run software you were
    "forced" to run, written by me.

    Can you touch-type? I assume all SBCS users can touch-type
    in their language (if they learn). I expect someone who only
    knows Greek, to touch-type at full speed - in Greek - even
    when using my English-only software.

    Yes. - And...? (Not sure what that has to do with the requirements.)

    And so I don't want layout-switches to interfere with the speed.

    I don't have to do layout changes myself. I expect a Greek version of
    me to be as fast as me.

    I'm interested in Greek, Greek, Greek.

    At no extra processing or space cost to English.

    Exactly as it was when the C90 standard was created.

    Which came VERY close to supporting exactly that. The
    C89 standard was delayed for a year because of Europeans
    intending to change C89 to support locales. Instead, the
    Americans changed ANSI. And got it NEARLY right.
    You can't expect people to get things 100% right. So
    with the benefit of 35 years of hindsight, and a C library
    and operating system and supporting tools under my
    belt, I'm trying to add some MINOR touches to C90.

    I cannot test; my expectation is that you can do what's described
    above already with a Greek Latin 7 encoding natively in C90 as was
    (most probably) done in Greece back in the late 1980's.

    All things are indeed possible if you are happy to change
    layout constantly between English and Greek.

    Even if you don't even know English.

    One thing *might* be crucial where you said:

    Yes, an ASCII representation of similar-to-uppercase "micro".

    But I didn't understand what you wanted here, so I'll bite.

    Some arbitrary Greek character is the equivalent of uppercase
    'D', another one is the equivalent of uppercase 'X', and so on for
    all the characters I could possibly "force" a Greek user to type
    in my pure C90+ program.
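
    One non-arbitrary way to pick those equivalents is by key position,
    so that pressing the key marked D on a Greek layout counts as 'D'.
    The sketch below assumes ISO 8859-7 (small letters at x'E1'..x'F9',
    capitals x'20' lower) and the usual Greek keyboard layout. Both are
    assumptions, and greek_toequiv and equiv_string are hypothetical
    names.

```c
#include <ctype.h>

#define GREEK_SMALL_FIRST 0xE1  /* small alpha in ISO 8859-7 */
#define GREEK_SMALL_LAST  0xF9  /* small omega in ISO 8859-7 */
#define GREEK_CASE_GAP    0x20  /* capitals sit 0x20 below small letters */

/* Assumed key positions: entry i is the Latin letter printed on the key
   that produces Greek code point GREEK_SMALL_FIRST + i. */
static const char key_for_greek[] = "ABGDEZHUIKLMNJOPRWSTYFXCV";

/* Hypothetical toequiv for the Greek locale.  c must be an unsigned
   char value or EOF, as for toupper(). */
static int greek_toequiv(int c)
{
    if (c >= GREEK_SMALL_FIRST && c <= GREEK_SMALL_LAST)
        return key_for_greek[c - GREEK_SMALL_FIRST];
    if (c >= GREEK_SMALL_FIRST - GREEK_CASE_GAP
        && c <= GREEK_SMALL_LAST - GREEK_CASE_GAP
        && c != 0xD2)   /* x'D2' is unassigned: no capital final sigma */
        return key_for_greek[c - (GREEK_SMALL_FIRST - GREEK_CASE_GAP)];
    return toupper(c);  /* ASCII and everything else: plain toupper */
}

/* Map a whole command line in place before comparing it. */
static void equiv_string(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)greek_toequiv((unsigned char)*s);
}
```

    Under these assumptions, the bytes x'E4' x'E9' x'F1' (delta, iota,
    rho: the D, I and R keys) followed by " /" and x'EF' x'E4' come out
    of equiv_string as "DIR /OD", ready for the same strcmp() as before.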

    I hope that makes sense. If it doesn't, let me know and I'll
    have another attempt.

    BFN. Paul.


