Jonas Lund of https://whizzter.woorlic.org/ mentioned this
trick in a HackerNews comment:
Given:
struct S {
// ...
T A[];
};
Don't do this:
malloc(offsetof(S, A) + n * sizeof (T));
But rather this:
malloc(offsetof(S, A[n]));
It's easy to forget that the second argument of offsetof is a
designator, not simply a member name.
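A minimal sketch of the idea in C, for anyone who wants to see it end to end. The struct name, field names and element type here are invented for illustration, and (as discussed further down the thread) a non-constant subscript inside offsetof is not strictly conforming, though common compilers accept it:

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

struct vec {             /* hypothetical example type */
    size_t len;
    double a[];          /* flexible array member */
};

static struct vec *vec_new(size_t n)
{
    /* offsetof takes a member designator, so a[n] is allowed here;
       this sizes the header plus n elements in one expression. */
    struct vec *v = malloc(offsetof(struct vec, a[n]));
    if (v)
        v->len = n;
    return v;
}

int main(void)
{
    struct vec *v = vec_new(100);
    if (!v)
        return 1;
    for (size_t i = 0; i < v->len; i++)
        v->a[i] = (double)i;
    printf("%g\n", v->a[99]);
    free(v);
    return 0;
}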
On 08.10.2025 at 08:35, Kaz Kylheku wrote:
Jonas Lund of https://whizzter.woorlic.org/ mentioned this
trick in a HackerNews comment:
Given:
  struct S {
    // ...
    T A[];
  };
Don't do this:
  malloc(offsetof(S, A) + n * sizeof (T));
But rather this:
  malloc(offsetof(S, A[n]));
It's easy to forget that the second argument of offsetof is a
designator, not simply a member name.
In a real language:
#include <iostream>
#include <optional>
#include <array>
#include <string>
using namespace std;
template<typename T, typename Derived>
struct flex_base
{
    T &operator []( size_t i )
    {
        return static_cast<Derived &>( *this ).m_arr[i];
    }
    virtual ~flex_base() {};
};
template<typename T, size_t N>
struct flex_array : flex_base<T, flex_array<T, N>>
{
    virtual ~flex_array() {};
private:
    template<typename U, typename D>
    friend struct flex_base;
    std::array<T, N> m_arr;
};
int main()
{
    auto &fb = *new flex_array<string, 100>();
    for( size_t i = 0; i != 100; ++i )
        fb[i] = "hello world";
}
Somewhat more complicated to declare, but much shorter and
more readable usage.
C really sucks.
On 08.10.2025 at 11:09, Bonita Montero wrote:
On 08.10.2025 at 08:35, Kaz Kylheku wrote:
Jonas Lund of https://whizzter.woorlic.org/ mentioned this
trick in a HackerNews comment:
Given:
  struct S {
    // ...
    T A[];
  };
Don't do this:
  malloc(offsetof(S, A) + n * sizeof (T));
But rather this:
  malloc(offsetof(S, A[n]));
It's easy to forget that the second argument of offsetof is a
designator, not simply a member name.
In a real language:
#include <iostream>
#include <optional>
#include <array>
#include <string>
using namespace std;
template<typename T, typename Derived>
struct flex_base
{
    T &operator []( size_t i )
    {
        return static_cast<Derived &>( *this ).m_arr[i];
    }
    virtual ~flex_base() {};
};
template<typename T, size_t N>
struct flex_array : flex_base<T, flex_array<T, N>>
{
    virtual ~flex_array() {};
private:
    template<typename U, typename D>
    friend struct flex_base;
    std::array<T, N> m_arr;
};
int main()
{
    auto &fb = *new flex_array<string, 100>();
    for( size_t i = 0; i != 100; ++i )
        fb[i] = "hello world";
}
Somewhat more complicated to declare, but much shorter and
more readable usage.
C really sucks.
OMG, I was blind:
T *p = new T[N];
You don't understand the meaning of the word 'flexible'.
The whole point of it is that N is unknown at compile time.
Formally speaking, flexible array members are not supported in
inferior tongue ...
However, if I am not mistaken, it works just because implementors are
sane people, rather than because the language itself provides sane guarantees.
struct S {
unsigned int size;
unsigned char mode;
unsigned char array[];
};
And another question. Suppose I need an array of struct S. All elements
have a 7-byte array[] member. How do I allocate this array and access each element?
[...]
C is really dangerous in that sense because you have to flip every bit
yourself. Better to use abstractions you re-use a lot of times. In C
there are almost no complex data structures at all, like a vector or an
unordered map in C++, because it would be a large effort to specialize
them yourself for every data type. Most C projects stick with simple data
structures which are less efficient. The "generic" types in C which
work with callbacks, like qsort(), really suck since their performance is
better but still not optimal.
I think all developers who use C today are either forced to stick
with C through their job or are persons who think mostly on the
detail level and can't think in abstractions.
This is programming like in the beginning of the 90s.
But today's machines are capable of handling more complex requirements,
and these requirements need a more flexible language so that you can
handle them with fewer bugs than in a language where you have to do
every detail by yourself.
On 08/10/2025 08:35, Kaz Kylheku wrote:
Jonas Lund of https://whizzter.woorlic.org/ mentioned this
trick in a HackerNews comment:
Given:
struct S {
// ...
T A[];
};
Don't do this:
malloc(offsetof(S, A) + n * sizeof (T));
But rather this:
malloc(offsetof(S, A[n]));
It's easy to forget that the second argument of offsetof is a
designator, not simply a member name.
struct S {
unsigned int size;
unsigned char mode;
unsigned char array[];
};
On a machine with 32-bit ints, sizeof(struct S) is 8, because there are 3 bytes of padding after mode and array is considered empty.
Now I want to store 9 bytes in array[]. I could use:
malloc(sizeof(struct S) + 9 * sizeof(unsigned char)), i.e. malloc(17)
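To see the difference concretely, a small sketch (assuming a typical ABI with a 4-byte unsigned int and 4-byte struct alignment, so the numbers match the 17 and 14 discussed here; on other ABIs they will differ):

#include <stddef.h>
#include <stdio.h>

struct S {
    unsigned int size;
    unsigned char mode;
    unsigned char array[];
};

int main(void)
{
    /* sizeof includes the 3 padding bytes after mode; offsetof of
       array[9] starts counting right after mode instead. */
    printf("sizeof-based  : %zu\n", sizeof(struct S) + 9);          /* e.g. 17 */
    printf("offsetof-based: %zu\n", offsetof(struct S, array[9]));  /* e.g. 14 */
    return 0;
}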
On 08.10.2025 at 08:35, Kaz Kylheku wrote:
Jonas Lund of https://whizzter.woorlic.org/ mentioned this
trick in a HackerNews comment:
Given:
struct S {
// ...
T A[];
};
Don't do this:
malloc(offsetof(S, A) + n * sizeof (T));
But rather this:
malloc(offsetof(S, A[n]));
It's easy to forget that the second argument of offsetof is a
designator, not simply a member name.
In a real language:
And another question. Suppose I need an array of struct S. All
elements have a 7-byte array[] member. How do I allocate this array and
access each element?
I think I can't use the first malloc (17), nor the second (14).
Neither is a multiple of the alignment.
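One possible answer, as a sketch only: C does not allow arrays of structures with flexible array members, so one block is allocated and walked with an explicit stride, the per-element size rounded up to the structure's alignment. The helper names here are invented, and _Alignof requires C11:

#include <stddef.h>
#include <stdlib.h>

struct S {
    unsigned int size;
    unsigned char mode;
    unsigned char array[];      /* flexible array member */
};

/* Round the per-element size (header + payload) up to the alignment
   of struct S, so every element in the block starts aligned. */
static size_t elem_stride(size_t payload)
{
    size_t raw = offsetof(struct S, array) + payload;
    size_t align = _Alignof(struct S);
    return (raw + align - 1) / align * align;
}

/* Address of element i inside the manually strided block. */
static struct S *elem_at(void *base, size_t stride, size_t i)
{
    return (struct S *)((char *)base + i * stride);
}

int main(void)
{
    size_t n = 10, payload = 7;
    size_t stride = elem_stride(payload);   /* e.g. 12 with 4-byte alignment */
    void *block = calloc(n, stride);
    if (!block)
        return 1;
    for (size_t i = 0; i < n; i++) {
        struct S *e = elem_at(block, stride, i);
        e->size = (unsigned)payload;
        e->mode = 1;
    }
    free(block);
    return 0;
}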
Ah well, you can lead an ass to water ...
On 08.10.2025 12:09, Bonita Montero wrote:
[...]
C is really dangerous in that sense because you have to flip every bit
yourself. Better to use abstractions you re-use a lot of times. In C
there are almost no complex data structures at all, like a vector or an
unordered map in C++, because it would be a large effort to specialize
them yourself for every data type. Most C projects stick with simple data
structures which are less efficient. The "generic" types in C which
work with callbacks, like qsort(), really suck since their performance is
better but still not optimal.
I think all developers who use C today are either forced to stick
with C through their job or are persons who think mostly on the
detail level and can't think in abstractions.
This is programming like in the beginning of the 90s.
I disagree in the historic valuation; abstractions were known and
used (and asked for) already [long] before. (Even your beloved C++
came already a decade earlier, and its designer was influenced by
even older abstraction concepts from the 1960's [Simula].)
But there certainly always have been developers who stuck to older
languages with less expressiveness in abstraction; obviously still
today. About the (strange or also valid) reasons we can speculate.
I would also speculate that many/most developers can not only think
in abstractions but know (and can program in) other languages that
provide abstraction concepts. (Or so I hope.)
Janis
But today's machines are capable of handling more complex requirements,
and these requirements need a more flexible language so that you can
handle them with fewer bugs than in a language where you have to do
every detail by yourself.
I disagree in the historic valuation; abstractions were known and
used (and asked for) already [long] before. ...
On 08.10.2025 at 15:59, Janis Papanagnou wrote:
I disagree in the historic valuation; abstractions were known and
used (and asked for) already [long] before. ...
Compared to C++ and other modern languages, they're almost not possible in C.
On 10/8/2025 8:59 AM, Janis Papanagnou wrote:
While a higher level language might be nice sometimes...
C++ is kind of a trash fire.
[...]
Only real reason to deal with it is that some people around seem to
think C++ is a good idea, and have used it to write software.
Granted, a few of my own language design attempts ended up with a
different mess: [...]
[ attempt for a discussion on features of "own language"
snipped; not my business ]
Full GC is undesirable because basically no one has managed to avoid the issue of timing and performance instabilities. Almost invariably,
performance stutters at regular intervals, or the program periodically
stalls (and at least over the past 30 years, it seems no one has
entirely avoided this issue).
Refcounting is also bad for performance, but typically the timing issues
from refcounting is orders of magnitude smaller (and mostly cases where
some large linked list or similar has its refcount drop to 0 and is
freed, which in turn results in walking and freeing every member of the list).
[...]
Still, not really anything that fully replaces C though.
One merit of C being, that it is C, and so all existing C code works in it.
Both C++ and other extended C dialects can have the advantage of being backwards compatible with existing C code (though for C++, "only
sorta"); but the drawback of also being "crap piled onto C", inherently
more messy and crufty than a "clean" language could have been.
[...]
Like, one can throw out the whole mess that is dealing with Multiple-Inheritance
and all of the tweaky and weird ways that class
instances can be structured (direct inheritance vs virtual inheritance,
friend classes, ... and all of the wacky effects these can have on
object layout).
Comparably, at least with Single-Inheritance and interfaces, [...]
[...]
Also, it simplifies things if class instances are always by reference
and never by value. So, structs retain the by value use-case, with
structs being disallowed from having interfaces or virtual methods or supporting inheritance (which can be the exclusive domain of class
objects).
On Wed, 8 Oct 2025 15:23:09 -0000 (UTC)
Kaz Kylheku <643-408-1753@kylheku.com> wrote:
Ah well, you can lead an ass to water ...
IMHO, your sarcasm is unwarranted. Read the whole post of pozz.
It seems to me that [in the first half of his post] pozz cares about
things that are not worth caring about (a few more or a few fewer bytes requested
from malloc, where in practice malloc rounds the requested size up at least
to a multiple of 8, but more likely of 16), but it is obvious that he
fully understood your earlier suggestion.
On 10/7/2025 11:35 PM, Kaz Kylheku wrote:
Jonas Lund of https://whizzter.woorlic.org/ mentioned this
trick in a HackerNews comment:
Given:
  struct S {
    // ...
    T A[];
  };
Don't do this:
  malloc(offsetof(S, A) + n * sizeof (T));
But rather this:
  malloc(offsetof(S, A[n]));
It's easy to forget that the second argument of offsetof is a
designator, not simply a member name.
For some god damn reason it's raising memories of an older region
allocator I mocked up in C:
Still on pastebin. funny:
https://groups.google.com/g/comp.lang.c/c/H_p2Ki5JhYU/m/rlSzqJsxCQAJ
https://pastebin.com/raw/f37a23918
(no ads, raw text)
On 10/8/2025 1:35 AM, Kaz Kylheku wrote:
Jonas Lund of https://whizzter.woorlic.org/ mentioned this
trick in a HackerNews comment:
Given:
struct S {
// ...
T A[];
};
Don't do this:
malloc(offsetof(S, A) + n * sizeof (T));
But rather this:
malloc(offsetof(S, A[n]));
It's easy to forget that the second argument of offsetof is a
designator, not simply a member name.
This is assuming offsetof can deal with general expressions (vs
just field names). IIRC, it is only required to work with field names
(and with plain structs).
But in addition to that, in Kaz's example, n is not a constant
expression, so `&(t.member-designator)` is not an address constant
and therefore `offsetof(S, A[n])` has undefined behavior.
Every compiler I've tried handles this "correctly", and I tend to
On 2025-10-08, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
But in addition to that, in Kaz's example, n is not a constant
expression, so `&(t.member-designator)` is not an address constant
and therefore `offsetof(S, A[n])` has undefined behavior.
Great; I'd like to hear reasons to avoid it so I don't look foolish
for having overlooked it for many years. :)
Every compiler I've tried handles this "correctly", and I tend to
I'm sure I've seen foo.bar expressions on the right of an offsetof,
but those still yield constants.
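For anyone who wants to stay strictly conforming after Keith's observation, one possible sketch is a macro with a constant member designator plus explicit element arithmetic. The macro name is invented here; it computes the same value as the offsetof(type, member[n]) form:

#include <stddef.h>
#include <stdlib.h>

struct S {
    size_t n;
    double A[];    /* flexible array member; element type is just an example */
};

/* offsetof gets a plain member name (well defined), and the n elements
   are added with sizeof, whose operand is not evaluated. */
#define FLEX_SIZEOF(type, member, n) \
    (offsetof(type, member) + (n) * sizeof(((type *)0)->member[0]))

int main(void)
{
    size_t n = 1000;
    struct S *p = malloc(FLEX_SIZEOF(struct S, A, n));
    if (p)
        p->n = n;
    free(p);
    return 0;
}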
On 08.10.2025 19:29, BGB wrote:
On 10/8/2025 8:59 AM, Janis Papanagnou wrote:
While a higher level language might be nice sometimes...
C++ is kind of a trash fire.
I'm not familiar with the meaning of the term "trash fire".
(If it's important to understand your post please explain.)
I can say a lot concerning C++; both, pros and cons. (But
not here and not now.)
[...]
Only real reason to deal with it is that some people around seem to
think C++ is a good idea, and have used it to write software.
Well, I certainly don't think it's a bad idea; far from bad.
And back then, when I was seeking for a HLL with OO support,
and C++ became available - and even widely accepted - I was
quite grateful to be able to use it professionally.
Granted, a few of my own language design attempts ended up with a
different mess: [...]
A sensibly defined language isn't something easily to create
or obtain! - Personally I'd have appreciated it more if more
designers of "own languages" have oriented their designs on
sensible existing and proven concepts. - There may be a
"market" for all these "own languages", I don't know, but I
also don't care much, given what I've seen or heard of yet.
(This isn't meant to be offensive, just to be clear, only
that I don't care much. As compiler writers don't care much
what I think.)
[ attempt for a discussion on features of "own language"
snipped; not my business ]
Full GC is undesirable because basically no one has managed to avoid the
issue of timing and performance instabilities. Almost invariably,
performance stutters at regular intervals, or the program periodically
stalls (and at least over the past 30 years, it seems no one has
entirely avoided this issue).
Well, some languages have no GC at all. Others even support
a couple of functions to control GC on various levels. It
may be triggered manually (on items, classes, or ranges),
or automatically (on demand, or depending on conditions; it
may depend on memory, time, heuristics, statistical behavior).
Pick your language depending on your projects demands.
Refcounting is also bad for performance, but typically the timing issues
from refcounting is orders of magnitude smaller (and mostly cases where
some large linked list or similar has its refcount drop to 0 and is
freed, which in turn results in walking and freeing every member of the
list).
Tailor your application and language choice on the projects'
requirements.
[...]
Still, not really anything that fully replaces C though.
One merit of C being, that it is C, and so all existing C code works in it.
(On a larger time scale that seems not to match my observation.
But okay, never mind.)
Both C++ and other extended C dialects can have the advantage of being
backwards compatible with existing C code (though for C++, "only
sorta"); but the drawback of also being "crap piled onto C", inherently
more messy and crufty than a "clean" language could have been.
Are you talking here about the relation of "C" with C++?
I certainly agree to what a "clean language" can be.
My opinion on that is, though, that the "C" base of C++ is part of
the problem. Which doesn't let it appear to me "C" to be "better"
than C++, but that the "C" base is part of C++'s problem. (Here
I'm not speaking about "C++"'s own problems that probably entered
about with C++0x/C++11, IMO. - Mileages certainly vary.)
[...]
Like, one can throw out the whole mess that is dealing with
Multiple-Inheritance
Well, when I started with C++ there wasn't multiple-inheritance
available. Personally, I think its omission would have been a mistake;
I missed it back in those days.
I'm not sure what "mess" you have in mind. - Explicit qualification
isn't a hindrance. Weakening the independence of classes in complex multi-level class-topologies is something under control of the
program designer. - So it's fine to have it with all design options
it opens.
and all of the tweaky and weird ways that class
instances can be structured (direct inheritance vs virtual inheritance,
I'm not sure what you are thinking here. - It's a notation to avoid
duplicate inclusions across "converging hierarchies".
friend classes, ... and all of the wacky effects these can have on
object layout).
Well, back then I wasn't a "friend" of the 'friend' feature. But it
also didn't stress me in any way. (The only aspect I was concerned
about a bit here was the uncontrolled access to class details; yet
it's under the programmer's control.)
Comparably, at least with Single-Inheritance and interfaces, [...]
This insight came later. (Was it Java that served as paragon? I only
seem to recall that the GNU compiler suite supported C++ 'interfaces'
at some time; was it the late 1990's ?)
[...]
Also, it simplifies things if class instances are always by reference
and never by value. So, structs retain the by value use-case, with
structs being disallowed from having interfaces or virtual methods or
supporting inheritance (which can be the exclusive domain of class
objects).
Well, I can only say that it was nice to use objects ("instances")
in an orthogonal way like other [primitive, built-in] object entities.
(I knew the concept of "ref-only" [for class objects] from Simula.
But this distinction was something I never considered a nice concept.)
Janis
[...]
On 10/8/2025 2:04 PM, Janis Papanagnou wrote:
On 08.10.2025 19:29, BGB wrote:
Though, similar was often a problem in my other language design
attempts: The most efficient way to do things was often also the C way.
The only real exception I have found to this rule basically being in relation to some features I have borrowed from languages like GLSL and Verilog. But, some of this stuff isn't so much making the language
"higher level" as much as "being easier to map to ISA features and optimize".
Say:
  vd[62:52]=vs[20:10];
Being easier to optimize than, say:
  vd=(vd&(~(2047ULL<<52)))|(((vs>>10)&2047ULL)<<52);
Though, Verilog itself, not so much... Works well in an ASIC or FPGA,
not so much on a CPU.
Though, as can be noted:
  Bit-ranges are required to be constant at compile time;
  When used with normal integer types, both bounds are required.
OTOH, GLSL offers nice and efficient ways to deal with SIMD.
Well, and also having some types for bit-preserving casts.
Or ability to specify endianess and alignment for individual struct
members.
...
Granted, a few of my own language design attempts ended up with a
different mess: [...]
A sensibly defined language isn't something easily to create
or obtain! - Personally I'd have appreciated it more if more
designers of "own languages" have oriented their designs on
sensible existing and proven concepts. - There may be a
"market" for all these "own languages", I don't know, but I
also don't care much, given what I've seen or heard of yet.
(This isn't meant to be offensive, just to be clear, only
that I don't care much. As compiler writers don't care much
what I think.)
Yeah.
They have either tended to not amount to much, or converged towards more conventional languages.
[ attempt for a discussion on features of "own language"
snipped; not my business ]
Some amount of my stuff recently has involved various niche stuff.
  Interfacing with hardware;
  Motor controls;
  Implementing things like an OpenGL back-end or similar;
  Being used for a Boot ROM and OS kernel;
  Sometimes neural nets.
Some features are useful in some contexts but not others:
For example, "__int128" is very helpful when writing FPU-emulation code
for Binary128 handling, but has a lot fewer use-cases much beyond this.
Or, like:
  exp=vala[126:112];   //extract exponent
  fra=(_BitInt(128)) { 0x0001i16, vala[111:0]};   //extract fraction
Unless maybe something can come along that is a better C than C...
Would likely simplify or eliminate some infrequently used features in C.
Possibly:
  Preprocessor, still exists, but its role is reduced.
    Its role can be partly replaced by compiler metadata.
  Trigraphs and digraphs: Gone;
  K&R style declarations, also gone;
  Parser should not depend on previous declarations;
  Non trivial types and declarator syntax: Eliminate;
  ...
Possibly:
Pointers and arrays can be specified on the type rather than declarator
(so, more like C# here)
...
But, as I see it, drastically changing the syntax (like in Go or Rust)
is undesirable. Contrast, say, C# style syntax was more conservative.
Though, the harder problem here isn't necessarily that of designing or implementing it, but more in how to make its use preferable to just
staying with C.
One merit is if code can be copy-pasted, but if one has to change all instances of:
  char *s0, *s1;
To:
  char* s0, s1;
Well, this is likely to get old, unless it still uses, or allows C style declaration syntax in this case.
On 09/10/2025 04:49, BGB wrote:
On 10/8/2025 2:04 PM, Janis Papanagnou wrote:
On 08.10.2025 19:29, BGB wrote:
Though, similar was often a problem in my other language design
attempts: The most efficient way to do things was often also the C way.
The only real exception I have found to this rule basically being in
relation to some features I have borrowed from languages like GLSL and
Verilog. But, some of this stuff isn't so much making the language
"higher level" as much as "being easier to map to ISA features and
optimize".
Say:
  vd[62:52]=vs[20:10];
Being easier to optimize than, say:
  vd=(vd&(~(2047ULL<<52)))|(((vs>>10)&2047ULL)<<52);
Using special bit-features makes it easier to generate decent code for a simple compiler.
But gcc for example has no trouble optimising that masking/shifting version.
(It can do it in four x64 instructions, whereas I need nine working from vd.[62..52] := vs.[20..10]. It could be improved though; I don't need to extract the data to bits 10..0 first for example.)
The main advantage is that it is a LOT easier to write, read and
understand. The C would need macros to make it practical.
(ldb (byte 4 2) 100)  =>  9
(let ((x 100)) (setf (ldb (byte 4 2) x) 15))
On 2025-10-10, bart <bc@freeuk.com> wrote:
On 09/10/2025 04:49, BGB wrote:
On 10/8/2025 2:04 PM, Janis Papanagnou wrote:
On 08.10.2025 19:29, BGB wrote:
On 10/8/2025 8:39 PM, Kaz Kylheku wrote:
On 2025-10-08, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
But in addition to that, in Kaz's example, n is not a constant
expression, so `&(t.member-designator)` is not an address constant
and therefore `offsetof(S, A[n])` has undefined behavior.
Great; I'd like to hear reasons to avoid it so I don't look foolish
for having overlooked it for many years. :)
Every compiler I've tried handles this "correctly", and I tend to
I'm sure I've seen foo.bar expressions on the right of an offsetof,
but those still yield constants.
I think it is a case of, it is not required to work...
But, if the typical implementation is something like, say:
#define offsetof(T, M) ((long)(&(((T *)0)->M)))
It is probably going to work without issue.
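To make that concrete, a small sketch under the assumption of such a simplified (non-standard) definition; real compilers use __builtin_offsetof, as noted elsewhere in the thread, and the non-constant subscript remains formally undefined behavior:

#include <stddef.h>
#include <stdio.h>

struct S { int len; double A[]; };

/* Simplified spelling in the style quoted above, for illustration only. */
#define my_offsetof(T, M) ((size_t)&(((T *)0)->M))

int main(void)
{
    size_t n = 7;   /* runtime value */
    /* Under this definition, my_offsetof(struct S, A[n]) is just pointer
       arithmetic from a null base, which compilers fold to
       offsetof(struct S, A) + n * sizeof(double). */
    printf("%zu %zu\n",
           my_offsetof(struct S, A[n]),
           offsetof(struct S, A) + n * sizeof(double));
    return 0;
}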
On 09/10/2025 04:49, BGB wrote:
On 10/8/2025 2:04 PM, Janis Papanagnou wrote:
On 08.10.2025 19:29, BGB wrote:
Though, similar was often a problem in my other language design
attempts: The most efficient way to do things was often also the C way.
The only real exception I have found to this rule basically being in
relation to some features I have borrowed from languages like GLSL and
Verilog. But, some of this stuff isn't so much making the language
"higher level" as much as "being easier to map to ISA features and
optimize".
Say:
  vd[62:52]=vs[20:10];
Being easier to optimize than, say:
  vd=(vd&(~(2047ULL<<52)))|(((vs>>10)&2047ULL)<<52);
Using special bit-features makes it easier to generate decent code for a simple compiler.
But gcc for example has no trouble optimising that masking/shifting
version.
(It can do it in four x64 instructions, whereas I need nine working from vd.[62..52] := vs.[20..10]. It could be improved though; I don't need to extract the data to bits 10..0 first for example.)
The main advantage is that it is a LOT easier to write, read and
understand. The C would need macros to make it practical.
Though, Verilog itself, not so much... Works well in an ASIC or FPGA,
not so much on a CPU.
Though, as can be noted:
  Bit-ranges are required to be constant at compile time;
  When used with normal integer types, both bounds are required.
I can handle some variable elements, but it gets rapidly complicated. At some point it needs to use library functions to do the work.
OTOH, GLSL offers nice and efficient ways to deal with SIMD.
Well, and also having some types for bit-preserving casts.
Or ability to specify endianess and alignment for individual struct
members.
...
Granted, a few of my own language design attempts ended up with a
different mess: [...]
A sensibly defined language isn't something easily to create
or obtain! - Personally I'd have appreciated it more if more
designers of "own languages" have oriented their designs on
sensible existing and proven concepts. - There may be a
"market" for all these "own languages", I don't know, but I
also don't care much, given what I've seen or heard of yet.
(This isn't meant to be offensive, just to be clear, only
that I don't care much. As compiler writers don't care much
what I think.)
Yeah.
They have either tended to not amount to much, or converged towards
more conventional languages.
[ attempt for a discussion on features of "own language"
snipped; not my business ]
(There are those who can devise and use their own languages, and those
who can't.)
Some amount of my stuff recently has involved various niche stuff.
  Interfacing with hardware;
  Motor controls;
  Implementing things like an OpenGL back-end or similar;
  Being used for a Boot ROM and OS kernel;
  Sometimes neural nets.
Some impressive stuff.
Some features are useful in some contexts but not others:
For example, "__int128" is very helpful when writing FPU-emulation
code for Binary128 handling, but has a lot fewer use-cases much beyond
this.
Or, like:
  exp=vala[126:112];   //extract exponent
  fra=(_BitInt(128)) { 0x0001i16, vala[111:0]};   //extract fraction
I had i128/u128 types at one point (quite a nice implementation too; it
was only missing full 128-bit divide, I had only 128/64.)
But the only place they got used was implementing 128-bit support in the self-hosted compiler and its library! So they were dropped.
Unless maybe something can come along that is a better C than C...
There are lots of new products, mostly too ambitious, too big and too complex. But C is already ensconced everywhere.
Would likely simplify or eliminate some infrequently used features in C.
Possibly:
  Preprocessor, still exists, but its role is reduced.
    Its role can be partly replaced by compiler metadata.
  Trigraphs and digraphs: Gone;
  K&R style declarations, also gone;
  Parser should not depend on previous declarations;
  Non trivial types and declarator syntax: Eliminate;
  ...
Possibly:
Pointers and arrays can be specified on the type rather than
declarator (so, more like C# here)
...
But, as I see it, drastically changing the syntax (like in Go or Rust)
is undesirable. Contrast, say, C# style syntax was more conservative.
Nobody cares about C syntax. Learning all its ins and outs seems to be a
rite of passage.
The trouble is that C-style is so dominant, few people would know what a decent syntax looks like. Or, more likely, they associate a clean, well-designed syntax with toy or scripting languages, and can't take it seriously.
But if it looks as hairy as C++ then it must be the business!
Though, the harder problem here isn't necessarily that of designing or
implementing it, but more in how to make its use preferable to just
staying with C.
One merit is if code can be copy-pasted, but if one has to change all
instances of:
  char *s0, *s1;
To:
  char* s0, s1;
Well, this is likely to get old, unless it still uses, or allows C
style declaration syntax in this case.
That one's been fixed (50 years late): you instead write:
  typeof(char*) s0, s1;
But you will need an extension if it's not part of C23.
On 09/10/2025 04:49, BGB wrote:[...]
Nobody cares about C syntax.
Learning all its ins and outs seems to be a
rite of passage.
The trouble is that C-style is so dominant, few people would know what
a decent syntax looks like. Or, more likely, they associate a clean, well-designed syntax with toy or scripting languages, and can't take
it seriously.
But if it looks as hairy as C++ then it must be the business!
One merit is if code can be copy-pasted, but if one has to change
all instances of:
  char *s0, *s1;
To:
  char* s0, s1;
Well, this is likely to get old, unless it still uses, or allows C
style declaration syntax in this case.
That one's been fixed (50 years late): you instead write:
typeof(char*) s0, s1;
But you will need an extension if it's not part of C23.
Several implementations I've tried (gcc, clang, tcc) implement the
offsetof macro via "__builtin_offsetof". Whatever compiler magic
is used to implement "__builtin_offsetof" typically works correctly
for Kaz's example (which is of course one of the possible results of undefined behavior).
bart <bc@freeuk.com> writes:
On 09/10/2025 04:49, BGB wrote:[...]
Nobody cares about C syntax.
That is so manifestly untrue that I can't imagine what you actually
meant.
Many of us, myself included, don't particularly like some aspects of C syntax, but that's not the same as not caring about it.
Learning all its ins and outs seems to be a
rite of passage.
Perhaps. It's also necessary if you want to work with the language.
The trouble is that C-style is so dominant, few people would know what
a decent syntax looks like. Or, more likely, they associate a clean,
well-designed syntax with toy or scripting languages, and can't take
it seriously.
But if it looks as hairy as C++ then it must be the business!
C syntax has survived and been propagated to other languages because
it's well known, not, I think, because anybody really likes it.
[...]
One merit is if code can be copy-pasted, but if one has to change
all instances of:
  char *s0, *s1;
To:
  char* s0, s1;
Well, this is likely to get old, unless it still uses, or allows C
style declaration syntax in this case.
That one's been fixed (50 years late): you instead write:
typeof(char*) s0, s1;
But you will need an extension if it's not part of C23.
Yes, that will work in C23, but it would never occur to me to
write that. I'd just write `char *s0, *s1;` or, far more likely,
define s0 and s1 on separate lines. Using typeof that way triggers
my WTF filter.
On 10/9/2025 10:59 PM, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
One merit is if code can be copy-pasted, but if one has to change
all instances of:
  char *s0, *s1;
To:
  char* s0, s1;
Well, this is likely to get old, unless it still uses, or allows C
style declaration syntax in this case.
That one's been fixed (50 years late): you instead write:
  typeof(char*) s0, s1;
But you will need an extension if it's not part of C23.
Yes, that will work in C23, but it would never occur to me to
write that. I'd just write `char *s0, *s1;` or, far more likely,
define s0 and s1 on separate lines. Using typeof that way triggers
my WTF filter.
Agreed.
I think it can be contrasted with C# style syntax (with "unsafe") where
one would write:
  char* s0, s1;
On 2025-10-10, bart <bc@freeuk.com> wrote:
On 09/10/2025 04:49, BGB wrote:
On 10/8/2025 2:04 PM, Janis Papanagnou wrote:
On 08.10.2025 19:29, BGB wrote:
Though, similar was often a problem in my other language design
attempts: The most efficient way to do things was often also the C way.
The only real exception I have found to this rule basically being in
relation to some features I have borrowed from languages like GLSL and
Verilog. But, some of this stuff isn't so much making the language
"higher level" as much as "being easier to map to ISA features and
optimize".
Say:
  vd[62:52]=vs[20:10];
Being easier to optimize than, say:
  vd=(vd&(~(2047ULL<<52)))|(((vs>>10)&2047ULL)<<52);
Using special bit-features makes it easier to generate decent code for a
simple compiler.
But gcc for example has no trouble optimising that masking/shifting version.
(It can do it in four x64 instructions, whereas I need nine working from
vd.[62..52] := vs.[20..10]. It could be improved though; I don't need to
extract the data to bits 10..0 first for example.)
The main advantage is that it is a LOT easier to write, read and
understand. The C would need macros to make it practical.
I'm skeptical that the C macro system is powerful enough to actually
create an operand like
bits(vd, 52, 62)
such that this constitutes an lvalue that can be assigned,
such that that range of bits will receive the value.
The closest C mechanism to that is the bitfield, which has
compartments decided at compile-time.
What if 52 and 62 could be variables?
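Kaz's point stands: a C macro can't make bits(vd, 52, 62) an assignable lvalue. What C can offer is getter/setter helpers where the caller assigns the result back, and those also handle bounds that are only known at run time. A sketch with invented names:

#include <stdint.h>
#include <stdio.h>

/* Extract bits hi..lo (inclusive) of x; hi and lo may be runtime values. */
static inline uint64_t get_bits(uint64_t x, unsigned hi, unsigned lo)
{
    uint64_t mask = (hi - lo == 63) ? ~0ULL : ((1ULL << (hi - lo + 1)) - 1);
    return (x >> lo) & mask;
}

/* Return x with bits hi..lo replaced by the low bits of val.
   Not an lvalue: the caller assigns the result back. */
static inline uint64_t set_bits(uint64_t x, unsigned hi, unsigned lo, uint64_t val)
{
    uint64_t mask = (hi - lo == 63) ? ~0ULL : ((1ULL << (hi - lo + 1)) - 1);
    return (x & ~(mask << lo)) | ((val & mask) << lo);
}

int main(void)
{
    uint64_t vd = 0, vs = 0x12345;
    /* Equivalent of the hypothetical vd[62:52] = vs[20:10]; */
    vd = set_bits(vd, 62, 52, get_bits(vs, 20, 10));
    printf("%#llx\n", (unsigned long long)vd);
    return 0;
}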
On 10/10/2025 08:27, BGB wrote:
On 10/9/2025 10:59 PM, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
One merit is if code can be copy-pasted, but if one has to change
all instances of:
  char *s0, *s1;
To:
  char* s0, s1;
Well, this is likely to get old, unless it still uses, or allows
C style declaration syntax in this case.
That one's been fixed (50 years late): you instead write:
  typeof(char*) s0, s1;
But you will need an extension if it's not part of C23.
Yes, that will work in C23, but it would never occur to me to
write that. I'd just write `char *s0, *s1;` or, far more likely,
define s0 and s1 on separate lines. Using typeof that way triggers
my WTF filter.
Agreed.
I think it can be contrasted with C# style syntax (with "unsafe")
where one would write:
  char* s0, s1;
Does C# treat s1 as "char*" in this case? That sounds like an extraordinarily bad design decision - having a syntax that is very
like the dominant C syntax yet subtly different.
On Fri, 10 Oct 2025 12:06:10 +0200
David Brown <david.brown@hesbynett.no> wrote:
On 10/10/2025 08:27, BGB wrote:
On 10/9/2025 10:59 PM, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
One merit is if code can be copy-pasted, but if one has to change
all instances of:
  char *s0, *s1;
To:
  char* s0, s1;
Well, this is likely to get old, unless it still uses, or allows
C style declaration syntax in this case.
That one's been fixed (50 years late): you instead write:
  typeof(char*) s0, s1;
But you will need an extension if it's not part of C23.
Yes, that will work in C23, but it would never occur to me to
write that. I'd just write `char *s0, *s1;` or, far more likely,
define s0 and s1 on separate lines. Using typeof that way triggers
my WTF filter.
Agreed.
I think it can be contrasted with C# style syntax (with "unsafe")
where one would write:
  char* s0, s1;
Does C# treat s1 as "char*" in this case? That sounds like an
extraordinarily bad design decision - having a syntax that is very
like the dominant C syntax yet subtly different.
Generally, I disagree with your rule. Not that it makes no sense at
all, but sometimes a violation makes more sense. For example, I strongly
prefer otherwise C-like languages to parse an 011 literal as decimal
11 rather than 9.
In this particular case it's more subtle.
What makes it a non-issue in practice is the fact that pointers in C# are
a very rarely used expert-level feature, especially since the language got
slices (Span<T>) 7 or 8 years ago.
A person who decides to use C# pointers has to understand at least
half a dozen things more arcane than this one.
Also, if somebody made such a mistake, it's very unlikely that his
code would pass compilation. After all, we're talking about C# here, not something like Python.
On 10/10/2025 08:27, BGB wrote:
On 10/9/2025 10:59 PM, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
One merit is if code can be copy-pasted, but if one has to change
all instances of:
  char *s0, *s1;
To:
  char* s0, s1;
Well, this is likely to get old, unless it still uses, or allows C
style declaration syntax in this case.
That one's been fixed (50 years late): you instead write:
  typeof(char*) s0, s1;
But you will need an extension if it's not part of C23.
Yes, that will work in C23, but it would never occur to me to
write that. I'd just write `char *s0, *s1;` or, far more likely,
define s0 and s1 on separate lines. Using typeof that way triggers
my WTF filter.
Agreed.
I think it can be contrasted with C# style syntax (with "unsafe") where
one would write:
  char* s0, s1;
Does C# treat s1 as "char*" in this case? That sounds like an extraordinarily bad design decision - having a syntax that is very like
the dominant C syntax yet subtly different.
Issues like this have been "solved" for decades - in the sense that
people who care about their code don't make mistakes from mixups of
"char" and "char*" declarations.’ There are a dozen different ways to be sure it is not an issue.’ Simplest of all is a style rule - never
declare identifiers of different types in the same declaration.’ I'd
have preferred that to be a rule baked into the language from the start,
but we all have things we dislike about the C syntax.
On 10/10/2025 16:28, Michael S wrote:
On Fri, 10 Oct 2025 12:06:10 +0200
David Brown <david.brown@hesbynett.no> wrote:
On 10/10/2025 08:27, BGB wrote:
On 10/9/2025 10:59 PM, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
One merit is if code can be copy-pasted, but if one has to change all instances of:
  char *s0, *s1;
To:
  char* s0, s1;
Well, this is likely to get old, unless it still uses, or allows C style declaration syntax in this case.
That one's been fixed (50 years late): you instead write:
  typeof(char*) s0, s1;
But you will need an extension if it's not part of C23.
Yes, that will work in C23, but it would never occur to me to
write that. I'd just write `char *s0, *s1;` or, far more likely,
define s0 and s1 on separate lines. Using typeof that way triggers
my WTF filter.
Agreed.
I think it can be contrasted with C# style syntax (with "unsafe")
where one would write:
  char* s0, s1;
Does C# treat s1 as "char*" in this case? That sounds like an
extraordinarily bad design decision - having a syntax that is very
like the dominant C syntax yet subtly different.
Generally, I disagree with your rule. Not that it makes no sense at
all, but sometimes a violation makes more sense. For example, I strongly
prefer otherwise C-like languages to parse an 011 literal as decimal
11 rather than 9.
I did not intend to describe a general rule (and I agree with you in
regard to octal).
In this particular case it's more subtle.
What makes it a non-issue in practice is the fact that pointers in C# are
a very rarely used expert-level feature, especially since the language got
slices (Span<T>) 7 or 8 years ago.
A person who decides to use C# pointers has to understand at least
half a dozen things more arcane than this one.
Also, if somebody made such a mistake, it's very unlikely that his
code would pass compilation. After all, we're talking about C# here, not
something like Python.
Sure.
It would seem to me, however, that it would have been better for the C# designers to pick a different syntax here rather than something that
looks like C, but has subtle differences that are going to cause newbies confusion when they try to google for explanations for their problems.
For example, if raw pointers are rarely used, then they should perhaps
be accessible using a more verbose syntax than a punctuation mark - "ptr<char> s0, s1;" might work.
However, I have no experience with C#, and don't know the reasons for
its syntax choices.
On 10/10/2025 08:27, BGB wrote:
On 10/9/2025 10:59 PM, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
One merit is if code can be copy-pasted, but if one has to change
all instances of:
  char *s0, *s1;
To:
  char* s0, s1;
Well, this is likely to get old, unless it still uses, or allows C
style declaration syntax in this case.
That one's been fixed (50 years late): you instead write:
  typeof(char*) s0, s1;
But you will need an extension if it's not part of C23.
Yes, that will work in C23, but it would never occur to me to
write that. I'd just write `char *s0, *s1;` or, far more likely,
define s0 and s1 on separate lines. Using typeof that way triggers
my WTF filter.
Agreed.
I think it can be contrasted with C# style syntax (with "unsafe") where
one would write:
  char* s0, s1;
Does C# treat s1 as "char*" in this case? That sounds like an extraordinarily bad design decision - having a syntax that is very like
the dominant C syntax yet subtly different.
Issues like this have been "solved" for decades - in the sense that
people who care about their code don't make mistakes from mixups of
"char" and "char*" declarations. There are a dozen different ways to be sure it is not an issue. Simplest of all is a style rule - never
declare identifiers of different types in the same declaration. I'd
have preferred that to be a rule baked into the language from the start,
but we all have things we dislike about the C syntax.
Yeah, '0' by itself indicating octal is weird, so I might agree here.
123 //decimal
0123 //maybe reinterpret as decimal?
0o123 //octal
0x123 //hexadecimal
0b101 //binary
On 2025-10-10, BGB <cr88192@gmail.com> wrote:
Yeah, '0' by itself indicating octal is weird, so I might agree here.
123 //decimal
0123 //maybe reinterpret as decimal?
0o123 //octal
0x123 //hexadecimal
0b101 //binary
Lisp people worked this out before the end of the 80s:
777777     =>  777777
00777777   =>  777777
#o777      =>  511
#x777      =>  1911
#b1001     =>  9
Leading zeros changing base is really a sneaky stupidity, and causes
problems in shell scripts also, from time to time.
$ printf "%d\n" 0777
511
$
$ echo $(( 0777 + 0 ))
511
On 10/8/2025 2:04 PM, Janis Papanagnou wrote:
On 08.10.2025 19:29, BGB wrote:
On 10/8/2025 8:59 AM, Janis Papanagnou wrote:
Throughout much of my life, C++ has been around, but using it has often turned into a footgun. Early on the code had a bad habit of breaking
from one compiler version to another, or the ability to compile C++ code
in general would be broken (primarily with Cygwin and MinGW; where
whether or not "g++" worked on a given install attempt, or with a given program, was very hit or miss).
[...]
In most cases, it left C as a more preferable option.
C can be made to do the same stuff at similar performance, with often
only minimal difference in expressive power.
And, the main "powerful" tool of C++, templates,
tending to do bad
things to build times and result in excessive code bloat.
And, if one tries to avoid C++'s drawbacks, the result was mostly code
that still looks mostly like C.
Though, similar was often a problem in my other language design
attempts: The most efficient way to do things was often also the C way.
[...]
Some amount of my stuff recently has involved various niche stuff.
Interfacing with hardware;
Motor controls;
Implementing things like an OpenGL back-end or similar;
Being used for a Boot ROM and OS kernel;
Sometimes neural nets.
Few traditional languages other than C work well at a lot of this.
A commonly argued weakness of C is that it requires manual memory
management. But, OTOH, you *really* don't want a GC in motor controls or
an OS kernel or similar.
Like, if the GC triggers, and an interrupt handler happens at a bad
time, then you have a problem.
Or, if you have a 1us timing tolerance for motor controls and this gets
blown because the GC takes 75ms, etc...
[...]
Maybe C will be around indefinitely for all I know.
Like, the passage of time still hasn't totally eliminated FORTRAN and
COBOL.
And, C is far more commonly used than either.
Unless maybe something can come along that is a better C than C...
[...]
I certainly agree to what a "clean language" can be.
My opinion on that is, though, that the "C" base of C++ is part of
the problem. Which doesn't let it appear to me "C" to be "better"
than C++, but that the "C" base is part of C++'s problem. (Here
I'm not speaking about "C++"'s own problems that probably entered
about with C++0x/C++11, IMO. - Mileages certainly vary.)
Possibly.
A new C-like language need not necessarily be strictly C based.
My thinking would be likely keeping a similar basic syntax though,
though likely more syntactically similar to C#,
but retaining more in
terms of implementation with C and C++.
Would likely simplify or eliminate some infrequently used features in C.
Possibly:
Preprocessor, still exists, but its role is reduced.
Its role can be partly replaced by compiler metadata.
Trigraphs and digraphs: Gone;
K&R style declarations, also gone;
Parser should not depend on previous declarations;
Non trivial types and declarator syntax: Eliminate;
...
Possibly:
Pointers and arrays can be specified on the type rather than declarator
(so, more like C# here)
[...]
Though, the harder problem here isn't necessarily that of designing or implementing it, but more in how to make its use preferable to just
staying with C.
One merit is if code can be copy-pasted, but if one has to change all instances of:
char *s0, *s1;
To:
char* s0, s1;
[...]
Java and C# had made 'char' 16-bit, but I now suspect this may have been
a mistake. It may be preferable instead to keep 'char' as 8 bits and make
UTF-8 the default string format. In the vast majority of cases, strings
hold primarily or entirely ASCII characters.
Also, can probably have a string type:
string str="Some String";
But, then allow that string is freely cast to "char*", ...
Well, and that the underlying representation of a string is still as a pointer into a string-table or similar.
Also the design of the standard library should remain conservative and
not add piles of needless wrappers or cruft.
[...]
Like, one can throw out the whole mess that is dealing with
Multiple-Inheritance
Well, when I started with C++ there wasn't multiple-inheritance
available. Personally, I think its omission would have been a mistake;
I missed it back in those days.
I'm not sure what "mess" you have in mind. - Explicit qualification
isn't a hindrance. Weakening the independence of classes in complex
multi-level class-topologies is something under control of the
program designer. - So it's fine to have it with all design options
it opens.
There is both implementation complexity of MI, and also some added
complexity with using it. The complexity gets messy.
The SI + Interfaces model can reduce both.
Granted, these can grow their own warts (like default methods or
similar), but arguably still not as bad as MI.
I am more thinking from the perspective of implementing a compiler.
[ implementation issues snipped and gracefully skipped ]
[...]
Virtual inheritance still means one can't just call the copy logic for
each parent class when copying a derived class;
[...]
(Sorry for the delayed reply; your ~450-line post was too long for
me to consider a timely reply.)
On 09.10.2025 05:49, BGB wrote:
On 10/8/2025 2:04 PM, Janis Papanagnou wrote:
On 08.10.2025 19:29, BGB wrote:
On 10/8/2025 8:59 AM, Janis Papanagnou wrote:
Throughout much of my life, C++ has been around, but using it has often
turned into a footgun. Early on the code had a bad habit of breaking
from one compiler version to another, or the ability to compile C++ code
in general would be broken (primarily with Cygwin and MinGW; where
whether or not "g++" worked on a given install attempt, or with a given
program, was very hit or miss).
I used it early on on various Unix platforms; all had some details
different - like the way how templates worked in the development
environment - but nothing was really an issue; as with current
configuration settings this was covered and handled by the build
system.
It doesn't astonish me the least if you've faced specific problems
on the Windows platforms.
[...]
In most cases, it left C as a more preferable option.
C can be made to do the same stuff at similar performance, with often
only minimal difference in expressive power.
The problem is, IMO, rather that "C", in the first place, doesn't
compare to C++ in its level of "expressive power".
And, the main "powerful" tool of C++, templates,
(IMO, the main powerful tool was primarily classes, polymorphisms,
also [real] references.)
tending to do bad
things to build times and result in excessive code bloat.
I recall that initially we had issues with code bloat, but I don't
recall that it would have been a problem; we handled that (but,
after that long time, don't ask me how).
And, if one tries to avoid C++'s drawbacks, the result was mostly code
that still looks mostly like C.
(That sounds as if you haven't used OO designs, reference parameters, overloading, and so on, obviously.)
Though, similar was often a problem in my other language design
attempts: The most efficient way to do things was often also the C way.
IME, *writing* software in "C" requires much more time than in C++;
presuming you meant that with "most efficient way to do things".
(Saving a few seconds in "C" compared to C++ programs can hardly be
relevant, I'd say; unless you were not really familiar with C++ ?
Or have special application areas, as I read below in the post.)
[...]
Some amount of my stuff recently has involved various niche stuff.
Interfacing with hardware;
Motor controls;
Implementing things like an OpenGL back-end or similar;
Being used for a Boot ROM and OS kernel;
Sometimes neural nets.
"Nice. - I've done Neural Net simulations with C++ back these days.)
Few traditional languages other than C work well at a lot of this.
A commonly argued weakness of C is that it requires manual memory
management. But, OTOH, you *really* don't want a GC in motor controls or
an OS kernel or similar.
Like, if the GC triggers, and an interrupt handler happens at a bad
time, then you have a problem.
Or, if you have a 1us timing tolerance for motor controls and this gets
blown because the GC takes 75ms, etc...
Sure, you should know where to use static memory, dynamic management organized yourself, or "I-don't-want-to-care" and use GC management,
or a sensible deliberate mixture of that (if the language allows).
(I've never used GC with C++; is that meanwhile possible?)
[...]
Maybe C will be around indefinitely for all I know.
Not unlikely.
Like, the passage of time still hasn't totally eliminated FORTRAN and
COBOL.
There's obviously some demand. *shrug* - I don't care much. - My last "contact" with FORTRAN was when one of my children was asked to handle
some legacy library code; my suggestion was to get rid of that task.
And, C is far more commonly used than either.
Unless maybe something can come along that is a better C than C...
There's so many languages meanwhile - frankly, there were already a
lot back then, four decades ago! - so I don't think the proliferation
will stop; I don't think that evolution is a good thing. It seems that
often the inventors have their own agenda and the success of languages depends mainly on the marketing efforts and the number of fan-people
that got triggered by newly invented buzzwords, and an own invented terminology [for already existing old concepts]!
[...]
I certainly agree to what a "clean language" can be.
My opinion on that is, though, that the "C" base of C++ is part of
the problem. Which doesn't let it appear to me "C" to be "better"
than C++, but that the "C" base is part of C++'s problem. (Here
I'm not speaking about "C++"'s own problems that probably entered
about with C++0x/C++11, IMO. - Mileages certainly vary.)
Possibly.
A new C-like language need not necessarily be strictly C based.
(There's a couple things I like in "C". But if I'd have to invent a
language it would certainly not be "C-like". I'd take a higher-level
[better designed] language as paragon and support the "C" features I
like, if not already present in that language.)
My thinking would be likely keeping a similar basic syntax though,
though likely more syntactically similar to C#,
(But the syntax is one of C's and its descendants' problems, IMO. - Part
of what was described in existing "C-like" languages is either the less-desired elements or deviations, but the latter will probably
just add to confusion if details are subtle. It's already bad enough
with subtle differences between different "C" standards it seems.)
but retaining more in
terms of implementation with C and C++.
(But weren't exactly these languages already [partly] invented with
such an agenda?)
Would likely simplify or eliminate some infrequently used features in C.
Possibly:
Preprocessor, still exists, but its role is reduced.
Its role can be partly replaced by compiler metadata.
Trigraphs and digraphs: Gone;
K&R style declarations, also gone;
Parser should not depend on previous declarations;
Non trivial types and declarator syntax: Eliminate;
...
Sounds all reasonable to me.
Possibly:
Pointers and arrays can be specified on the type rather than declarator
(so, more like C# here)
(Yeah, but mind the comments on effects of "subtle differences".)
[...]
Though, the harder problem here isn't necessarily that of designing or
implementing it, but more in how to make its use preferable to jus
staying with C.
Well, as formulated, that's an individual thing. Meanwhile I have the
freedom to use what I like in my recreational activities, but if we
consider professional projects there's conditions and requirements to
take into account.
One merit is if code can be copy-pasted, but if one has to change all
instances of:
char *s0, *s1;
To:
char* s0, s1;
Such changes would be annoying. (And I say that with a strong aversion
to C's declaration syntax.) - For me, "C" is not a good base; neither
to keep its bad syntax nor to have to change it in such subtle ways.
My style is different anyway; [mostly] separate declarations, and those initialized, as in
char * s0 = some_alloc (...);
char * s1 = 0;
More important is that such declarations may appear anywhere, not just
at the beginning of a block. (I'm still traumatized by K&R, I suppose.)
[...]
Java and C# had made 'char' 16-bit, but I now suspect this may have been
a mistake. It may be preferable instead to keep 'char' as 8 bits and make
UTF-8 the default string format. In the vast majority of cases, strings
hold primarily or entirely ASCII characters.
I think we should be careful here! A Unicode "character" may require
even 32 bits, but UTF-8 is just an "encoding" (in units of an octet).
If we want a sensible type system defined we should be aware of that difference. The question is: what shall be expressed by a 'char' type;
the semantic entity or the transfer syntax. (This question is similar
to the Unix file system, also based on octets; that made it possible
to represent any international multi-octet characters. There's some
layer necessary to get from the "transfer-syntax" (the encoding) to
the representation.) - What will, say, a "C" user expect from 'char';
just move it around or represent it on some output (or input) medium.
Also, can probably have a string type:
string str="Some String";
But, then allow that string is freely cast to "char*", ...
(Wasn't that so in C++? - And in addition there's the corresponding
template classes, IIRC. - But I don't recall all the gory details.)
Well, and that the underlying representation of a string is still as a
pointer into a string-table or similar.
Also the design of the standard library should remain conservative and
not add piles of needless wrappers or cruft.
Not sure what you have in mind here.
Personally, despite some resentment on some of the complex syntax
and constructs necessary, I liked the C++ STL; its orthogonality
and concepts in principle. (And especially if compared to some
other languages' ad hoc "tool-chest" libraries I stumbled across.)
[...]
Like, one can throw out the whole mess that is dealing with
Multiple-Inheritance
Well, when I started with C++ there wasn't multiple-inheritance
available. Personally I think its omission would be a mistake;
I missed it back in those days.
I'm not sure what "mess" you have in mind. - Explicit qualification
isn't a hindrance. Weakening the independence of classes in complex
multi-level class-topologies is something under control of the
program designer. - So it's fine to have it with all design options
it opens.
There is both implementation complexity of MI, and also some added
complexity with using it. The complexity gets messy.
(Okay, if that's what you took from it, I of course accept it.
But I'd rather have expected you to dislike some
STL parts than [multiple] inheritance.)
The SI + Interfaces model can reduce both.
I've used classes with only "pure virtual" functions to achieve
the interface abstraction; since I could easily design what I
needed with standard features and practically no overhead I thus
wasn't missing the 'interface' feature.
(But of course I can see the implementation argument you make.)
Granted, these can grow their own warts (like default methods or
similar), but arguably still not as bad as MI.
(Well, I appreciated it to have that feature available in C++,
even though my first OO language, Simula, didn't support it, so
I was used to not having it when I got into C++ and liked it.)
I am more thinking from the perspective of implementing a compiler.
Hah! Yeah. - Recently in another NG someone disliked a feature
because he had suffered from troubles implementing it. (It was
not MI but formatted I/O in that case.) - I'm not implementing
complex languages, so I guess I can feel lucky if someone else
did the language implementation job and I can just use it.
[ implementation issues snipped and gracefully skipped ]
[...]
Virtual inheritance still means one can't just call the copy logic for
each parent class when copying a derived class;
(I don't think I agree here. - Or are you still talking of the
implementers' challenges? - But never mind. Programming in C++
I could model everything I liked. That was really nice.)
Janis
[...]
Apparently the languages people are trying to push as C replacements are mostly Rust, Zig, and Go.
None of these particularly compel me though.
They seem more like needless deviations from C than a true successor.
I guess the older generations mostly had Pascal and Ada.
There was ALGOL, but both C and Pascal descended from ALGOL.
As noted elsewhere, my thinking is partly that pipeline looks like:
  Preprocessor (basic or optional, C like)
  Parser (Context-independent, generates ASTs)
  Front end compiler: Compiles ASTs to a stack IL.
Backend:
  IL -> 3AC/SSA;
On 15/10/2025 02:13, BGB wrote:
Apparently the languages people are trying to push as C replacements
are mostly Rust, Zig, and Go.
None of these particularly compel me though.
They seem more like needless deviations from C than a true successor.
So what would a true successor look like?
I guess the older generations mostly had Pascal and Ada.
There was ALGOL, but both C and Pascal descended from ALGOL.
I've heard before that C was somehow derived from Algol and even
Algol 68.
But it is so utterly unlike either of those that, if it's from the same
family, it must have been adopted.
As noted elsewhere, my thinking is partly that pipeline looks like:
  Preprocessor (basic or optional, C like)
  Parser (Context-independent, generates ASTs)
  Front end compiler: Compiles ASTs to a stack IL.
Backend:
  IL -> 3AC/SSA;
That's odd: you're going from a stack IL to a 3AC non-stack IR/IL?
Why not go straight to 3AC?
(I've tried both stack and 3AC ILs, but not both in the same compiler! I
finally decided to stay with stack; 3AC code *always* got too fiddly to
deal with.
So stack IL is directly translated to register-based, unoptimised native
code, which is reasonably efficient. Performance is usually somewhere in
between Tiny C and gcc-O2.)
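For readers not used to either form, here is roughly what the two styles
look like for one expression (the mnemonics below are invented for this
post, not taken from either compiler under discussion):

#include <stdio.h>

int f(int a, int b, int c)
{
    return a + b * c;
    /* Stack IL (operands live on an implicit stack, no names):
           push a
           push b
           push c
           mul          ; pops the top two, pushes b*c
           add          ; pops that and a, pushes the sum
           ret
       3AC/SSA (every intermediate result is a named temporary):
           t1 = b * c
           t2 = a + t1
           ret t2
    */
}

int main(void)
{
    printf("%d\n", f(2, 3, 4));   /* 14 */
    return 0;
}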
On 10/13/2025 11:29 PM, Janis Papanagnou wrote:
(Sorry for the delayed reply; your ~450 lines post was too long for
me to consider a timely reply.)
On 09.10.2025 05:49, BGB wrote:
On 10/8/2025 2:04 PM, Janis Papanagnou wrote:
On 08.10.2025 19:29, BGB wrote:
On 10/8/2025 8:59 AM, Janis Papanagnou wrote:
[...]
Well, and for a given Cygwin install attempt, whether or not "g++" would work, etc, was a bit like playing roulette.
[...]
In most cases, it left C as a more preferable option.
C can be made to do the same stuff at similar performance, with often
only minimal difference in expressive power.
The problem is, IMO, rather that "C", in the first place, doesn't
compare to C++ in its level of "expressive power".
?...
I have yet to find much that can be expressed in C++ but is not also expressible in C.
The main things that are fundamentally different, are things like
Exceptions and RTTI, but even in C++, these don't come free.
Though, if exceptions are implemented using an approach similar to VEH
in the Windows X64 ABI, it is at least modest.
And, the main "powerful" tool of C++, templates,
(IMO, the main powerful tool was primarily classes, polymorphisms,
also [real] references.)
These can be done in C via manually written vtables, and passing the
address of a variable.
[...]
We can do OO, just using a different approach, say:
[...]
It all works, and doesn't require significantly more LOC than it would
have in C++.
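For concreteness, a minimal sketch of the manual-vtable style being
described (my own toy example with made-up names, not the snipped code):

#include <stdio.h>
#include <stdlib.h>

typedef struct Shape Shape;

typedef struct ShapeVtbl {
    double (*area)(const Shape *self);
    void   (*destroy)(Shape *self);
} ShapeVtbl;

struct Shape {
    const ShapeVtbl *vt;      /* manually written vtable pointer */
};

typedef struct Circle {
    Shape  base;              /* "inheritance" by embedding the base first */
    double r;
} Circle;

static double circle_area(const Shape *self)
{
    const Circle *c = (const Circle *)self;
    return 3.141592653589793 * c->r * c->r;
}

static void circle_destroy(Shape *self)
{
    free(self);
}

static const ShapeVtbl circle_vt = { circle_area, circle_destroy };

static Shape *circle_new(double r)
{
    Circle *c = malloc(sizeof *c);
    if (!c) { perror("malloc"); exit(1); }
    c->base.vt = &circle_vt;
    c->r = r;
    return &c->base;          /* "passing the address of a variable" */
}

int main(void)
{
    Shape *s = circle_new(2.0);
    printf("area = %f\n", s->vt->area(s));  /* dynamic dispatch through the vtable */
    s->vt->destroy(s);
    return 0;
}

Clumsier to declare than the C++ equivalent, but the call sites stay short.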
Though, something similar was often a problem in my other language design
attempts: The most efficient way to do things was often also the C way.
IME, *writing* software in "C" requires much more time than in C++;
presuming you meant that with "most efficient way to do things".
(Saving a few seconds in "C" compared to C++ programs can hardly be
relevant, I'd say; unless you were not really familiar with C++ ?
Or have special application areas, as I read below in the post.)
The main limiting factor at present is that it is much harder to write a non-trivial C++ compiler.
I could write C++ code, but then it isn't really portable outside
running on my PC or similar.
[ snip "own compiler", speed and other topics ]
Like, the passage of time still hasn't totally eliminated FORTRAN and
COBOL.
There's obviously some demand. *shrug* - I don't care much. - My last
"contact" with FORTRAN was when one of my children was asked to handle
some legacy library code; my suggestion was to get rid of that task.
In my case, I don't have any descendants.
Apparently they still exist in some places, mostly as languages that no
one uses.
[...]
[...]
Apparently the languages people are trying to push as C replacements are mostly Rust, Zig, and Go.
None of these particularly compel me though.
They seem more like needless deviations from C than a true successor.
I guess the older generations mostly had Pascal and Ada.
[...]
[...]
A new C-like language need not necessarily be strictly C based.
(There's a couple of things I like in "C". But if I had to invent a
language it would certainly not be "C-like". I'd take a higher-level
[better designed] language as a paragon and support the "C" features I
like, if not already present in that language.)
[ ruminations about such new language snipped ]
but retaining more in
terms of implementation with C and C++.
(But weren't exactly these languages already [partly] invented with
such an agenda?)
[...]
I am imagining something that basically does similar stuff to what C
already does, and can ideally be used in a similar context.
The main downside is that C and C++ are more complicated than ideal in
many areas. This has a detrimental effect on compilers.
Not so much intending to make a language that tries to be more intuitive
or hand-holding though. However, if it is possible to make provisions
for things like static-analysis or bounds-checked arrays (in a way that ideally doesn't adversely affect performance), this can be nice.
[...]
[...]
Java and C# had made 'char' 16-bit, but I now suspect this may have been
a mistake. It may be preferable instead to keep 'char' as 8 bits and make
UTF-8 the default string format. In the vast majority of cases, strings
hold primarily or entirely ASCII characters.
I think we should be careful here! A Unicode "character" may require
even 32 bits, but UTF-8 is just an "encoding" (in units of an octet).
If we want a sensible type system we should be aware of that
difference. The question is: what shall be expressed by a 'char' type;
the semantic entity or the transfer syntax. (This question is similar
to the Unix file system, also based on octets; that made it possible
to represent any international multi-octet characters. There's some
layer necessary to get from the "transfer-syntax" (the encoding) to
the representation.) - What will, say, a "C" user expect from 'char';
just move it around or represent it on some output (or input) medium.
It is a tradeoff.
But, if "char*" can point to a string, then "char" needs to be the same
size as an item in memory (thus, probably a byte).
Otherwise, it would make sense to have "char" as an alias for "int" and
require "ubyte*" for use as strings. For consistency with C, it makes more
sense to assume char to be a byte.
[...]
Well, and that the underlying representation of a string is still as a
pointer into a string-table or similar.
Also the design of the standard library should remain conservative and
not add piles of needless wrappers or cruft.
Not sure what you have in mind here.
Personally, despite some resentment on some of the complex syntax
and constructs necessary, I liked the C++ STL; its orthogonality
and concepts in principle. (And especially if compared to some
other languages' ad hoc "tool-chest" libraries I stumbled across.)
I was primarily thinking of Java and its excessive piles of wrapper
classes. Like, C gives you the stdio functions, which are basic but effective.
Java has:
[...]
We don't need this. Java just sort of ran with it, creating piles of
random wrapper classes whose existence serves almost no practical
purpose (and would have been much better served, say, by simply
providing a File class that holds a mock-up of C's stdio interface;
which is, ironically, closer to the approach C# had taken here).
The great sin here of C++ is mostly things like iostream.
[...]
I am more thinking from the perspective of implementing a compiler.
Hah! Yeah. - Recently in another NG someone disliked a feature
because he had suffered from troubles implementing it. (It was
not MI but formatted I/O in that case.) - I'm not implementing
complex languages, so I guess I can feel lucky if someone else
did the language implementation job and I can just use it.
I am writing from the POV of someone who did start making an attempt to implement C++ support, and mostly gave up at roughly an early 1990s
feature level.
If you dropped MI, templates, and pretty much everything following from these, stuff would be a lot easier.
[...]
On 15/10/2025 02:13, BGB wrote:
There was ALGOL, but both C and Pascal descended from ALGOL.
I've heard before that C was somehow derived from Algol and even
Algol 68.
But it is so utterly unlike either of those that, if it's from the same
family, it must have been adopted.
On 15.10.2025 03:13, BGB wrote:
On 10/13/2025 11:29 PM, Janis Papanagnou wrote:
(Sorry for the delayed reply; your ~450 lines post was too long for
me to consider a timely reply.)
(Now ~800 lines; it escalates.)
On 09.10.2025 05:49, BGB wrote:
On 10/8/2025 2:04 PM, Janis Papanagnou wrote:
On 08.10.2025 19:29, BGB wrote:
On 10/8/2025 8:59 AM, Janis Papanagnou wrote:
[...]
Well, and for a given Cygwin install attempt, whether or not "g++" would
work, etc, was a bit like playing roulette.
I didn't "like" Cygwin, but also never had any "roulette" experience.
[...]
In most cases, it left C as a more preferable option.
C can be made to do the same stuff at similar performance, with often
only minimal difference in expressive power.
The problem is, IMO, rather that "C", in the first place, doesn't
compare to C++ in its level of "expressive power".
?...
I have yet to find much that can be expressed in C++ but is not also
expressible in C.
You may adhere to a different notion of expressiveness than I do. (For me
assembler, for example, is not more expressive than "C".) It's all
about expressing "complex" things in easy ways.
The main things that are fundamentally different, are things like
Exceptions and RTTI, but even in C++, these don't come free.
Back then they said that exceptions come for "almost free" (or so);
I've never counted the seconds of difference, since our project goals
and priorities lay on other factors.
RTTI, OTOH, I rarely used in the first place. Part of it was due to
my design principle to avoid casts; here (WRT RTTI), dynamic casts.
This feature wasn't often used in our projects.
Though, if exceptions are implemented using an approach similar to VEH
in the Windows X64 ABI, it is at least modest.
And, the main "powerful" tool of C++, templates,
(IMO, the main powerful tool was primarily classes, polymorphisms,
also [real] references.)
These can be done in C via manually written vtables, and passing the
address of a variable.
(Yes, and you can also do it in assembler. - But that's not the point
of using higher level structuring features. - Frankly, I'm so stumped
that you wrote such a strange thing that I suppose it makes no sense
to discuss that point further with you; our views here are obviously
fundamentally different.)
[...]
We can do OO, just using a different approach, say:
[...]
*shudder*
It all works, and doesn't require significantly more LOC than it would
have in C++.
IME, *writing* software in "C" requires much more time than in C++;
Though, something similar was often a problem in my other language design
attempts: The most efficient way to do things was often also the C way.
presuming you meant that with "most efficient way to do things".
(Saving a few seconds in "C" compared to C++ programs can hardly be
relevant, I'd say; unless you were not really familiar with C++ ?
Or have special application areas, as I read below in the post.)
The main limiting factor at present is that it is much harder to write a
non-trivial C++ compiler.
I could write C++ code, but then it isn't really portable outside
running on my PC or similar.
(We've used it in professional contexts on various platforms for
different customers without problem. - I cannot comment on your
opinion or experiences.)
[ snip "own compiler", speed and other topics ]
Like, the passage of time still hasn't totally eliminated FORTRAN and
COBOL.
There's obviously some demand. *shrug* - I don't care much. - My last
"contact" with FORTRAN was when one of my children was asked to handle
some legacy library code; my suggestion was to get rid of that task.
In my case, I don't have any descendants.
Apparently they still exist in some places, mostly as languages that no
one uses.
(In scientific areas FORTRAN is obviously still widely used. And
this is no "[geographically] local phenomenon" as I learned.)
[...]
[...]
Apparently the languages people are trying to push as C replacements are
mostly Rust, Zig, and Go.
I've heard so. (But don't care much.)
None of these particularly compel me though.
They seem more like needless deviations from C than a true successor.
I guess the older generations mostly had Pascal and Ada.
Not sure what you are thinking here.
While I knew of Pascal programs used even in professional projects
(like in a nuclear reprocessing plant), it never appeared to me
that it is well suited for larger real-world programs; at least in
its standardized form back then. Pascal successors addressed these
shortcomings to some degree, though. And Ada is (I think still) used
in avionics, space travel, and some military areas. (Myself [of an
older generation], I have never programmed in Ada, or "professionally"
in Pascal.)
[...]
[...]
A new C-like language need not necessarily be strictly C based.
(There's a couple of things I like in "C". But if I had to invent a
language it would certainly not be "C-like". I'd take a higher-level
[better designed] language as a paragon and support the "C" features I
like, if not already present in that language.)
[ ruminations about such new language snipped ]
but retaining more in
terms of implementation with C and C++.
(But weren't exactly these languages already [partly] invented with
such an agenda?)
[...]
I am imagining something that basically does similar stuff to what C
already does, and can ideally be used in a similar context.
The main downside is that C and C++ are more complicated than ideal in
many areas. This has a detrimental effect on compilers.
Not so much intending to make a language that tries to be more intuitive
or hand-holding though. However, if it is possible to make provisions
for things like static-analysis or bounds-checked arrays (in a way that
ideally doesn't adversely affect performance), this can be nice.
I see.
[...]
[...]
Java and C# had made 'char' 16-bit, but I now suspect this may have been
a mistake. It may be preferable instead to keep 'char' as 8 bits and make
UTF-8 the default string format. In the vast majority of cases, strings
hold primarily or entirely ASCII characters.
I think we should be careful here! A Unicode "character" may require
even 32 bits, but UTF-8 is just an "encoding" (in units of an octet).
If we want a sensible type system we should be aware of that
difference. The question is: what shall be expressed by a 'char' type;
the semantic entity or the transfer syntax. (This question is similar
to the Unix file system, also based on octets; that made it possible
to represent any international multi-octet characters. There's some
layer necessary to get from the "transfer-syntax" (the encoding) to
the representation.) - What will, say, a "C" user expect from 'char';
just move it around or represent it on some output (or input) medium.
It is a tradeoff.
But, if "char*" can point to a string, then "char" needs to be the same
size as an item in memory (thus, probably a byte).
Otherwise, it would make sense to have "char" as an alias for "int" and
require "ubyte*" for use as strings. For consistency with C, it makes more
sense to assume char to be a byte.
(I don't think that addresses what I was pointing at. But never mind.)
[...]
Well, and that the underlying representation of a string is still as a
pointer into a string-table or similar.
Also the design of the standard library should remain conservative and
not add piles of needless wrappers or cruft.
Not sure what you have in mind here.
Personally, despite some resentment on some of the complex syntax
and constructs necessary, I liked the C++ STL; its orthogonality
and concepts in principle. (And especially if compared to some
other languages' ad hoc "tool-chest" libraries I stumbled across.)
I was primarily thinking of Java and its excessive piles of wrapper
classes. Like, C gives you the stdio functions, which are basic but
effective.
Yes, Java follows one (quite common) way, "C" another (primitive).
But "tertium datur"! One other example is the STL.
Java has:
[...]
We don't need this. Java just sort of ran with it, creating piles of
random wrapper classes whose existence serves almost no practical
purpose (and would have been much better served, say, by simply
providing a File class that holds a mock-up of C's stdio interface;
which is, ironically, closer to the approach C# had taken here).
To be fair, and despite my dislike, there's of course a rationale
for Java's approach. Its concepts can often very flexibly be used.
(I recall I once wanted to use Regexps, found a simple Apache (I
think) library that was clearly and sensibly defined and could be
used easily and in a readable way. There was another flexible and
bulky library with a three-level object hierarchy (or so). - Guess
what became the Java standard some years later!)
The great sin here of C++ is mostly things like iostream.
It may appear so at first glance. But beyond some unfortunate design
details it allows very flexible and powerful (yet readable) software
designs in complex software architectures. (The problem is that folks
seem to stare at and stumble over the pebbles, thereby missing the
landscape; figuratively speaking.)
[...]
I am more thinking from the perspective of implementing a compiler.
Hah! Yeah. - Recently in another NG someone disliked a feature
because he had suffered from troubles implementing it. (It was
not MI but formatted I/O in that case.) - I'm not implementing
complex languages, so I guess I can feel lucky if someone else
did the language implementation job and I can just use it.
I am writing from the POV of someone who did start making an attempt to
implement C++ support, and mostly gave up at roughly an early 1990s
feature level.
If you dropped MI, templates, and pretty much everything following from
these, stuff would be a lot easier.
As a student in a more radical mood I considered templates to be less
important compared to inheritance; you can emulate them. But I had to
admit back then that it's a lot simpler to write nice code if you have
support for parameterized classes (templates); I wouldn't have wanted
to miss them. (Stroustrup, BTW, considered not inventing templates
earlier a mistake.)
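To illustrate the "you can emulate them" remark (in C, with the
preprocessor; a deliberately crude sketch with made-up names, nowhere
near real templates):

#include <stdio.h>

/* Poor man's parameterized type: stamp out a struct and a
   function per element type via token pasting. */
#define DEFINE_PAIR(T)                                  \
    typedef struct { T first; T second; } pair_##T;     \
    static T pair_##T##_sum(pair_##T p)                 \
    {                                                   \
        return p.first + p.second;                      \
    }

DEFINE_PAIR(int)
DEFINE_PAIR(double)

int main(void)
{
    pair_int    pi = { 2, 3 };
    pair_double pd = { 1.5, 2.5 };
    printf("%d %.1f\n", pair_int_sum(pi), pair_double_sum(pd));
    return 0;
}

It compiles and works, but every convenience a real template gives (type
deduction, overload-style selection, nesting) has to be rebuilt by hand,
which is roughly the admission above.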
Janis
[...]