Forum: d0p3 BBS

Re: Really beautiful

From Chris M. Thomasson@3:633/10 to All on Fri Jan 2 19:18:40 2026

On 1/2/2026 4:34 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:54 schrieb Chris M. Thomasson:

The flag should be completely isolated from callable? and pad does this.

The callable is only a number of references [&] which will be optimized away.

callable can call into god knows what... You want your flag to be isolated.

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Chris M. Thomasson@3:633/10 to All on Fri Jan 2 19:48:45 2026

On 1/2/2026 1:49 PM, Chris M. Thomasson wrote:

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

bool xcall_once( xonce_flag &xflag, std::invocable auto callable )
{
    using namespace std;
    xonce_flag::flag_t &flag = xflag.m_flag;
    for( signed char ref = flag.load( memory_order_relaxed ); ; )
        if( ref > 0 ) [[likely]]

Hummm... I would need to add the braces here. Totally forgot about that.

            std::atomic_thread_fence(std::memory_order_acquire);

            return true;

        if( ref > 0 ) [[likely]]
        {
            std::atomic_thread_fence(std::memory_order_acquire);

            return true;
        }

Yikes!

        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_relaxed );
        }
        else if( flag.compare_exchange_strong( ref, -1, memory_order_relaxed, memory_order_relaxed ) ) [[likely]]
            break;
    bool succ = true;

    std::atomic_thread_fence(std::memory_order_acquire);

    try
    {
        if constexpr( requires { (bool)callable(); } )
            succ = (bool)callable();
        else
            callable();
    }
    catch( ... )
    {
        std::atomic_thread_fence(std::memory_order_release);

        flag.store( 0, memory_order_relaxed );
        flag.notify_one();
        throw;
    }

    std::atomic_thread_fence(std::memory_order_release);

    flag.store( (char)succ, memory_order_relaxed );
    if( succ )
        flag.notify_all();
    else
        flag.notify_one();
    return succ;
}


From Bonita Montero@3:633/10 to All on Sat Jan 3 04:57:00 2026

You make the code more complicated for nothing.

Am 02.01.2026 um 22:49 schrieb Chris M. Thomasson:

On 12/31/2025 9:00 PM, Bonita Montero wrote:
[...]

#pragma once
#include <concepts>
#include <atomic>

struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    flag_t m_flag = 0;
};

bool xcall_once( xonce_flag &xflag, std::invocable auto callable )
{
    using namespace std;
    xonce_flag::flag_t &flag = xflag.m_flag;
    for( signed char ref = flag.load( memory_order_relaxed ); ; )
        if( ref > 0 ) [[likely]]

            std::atomic_thread_fence(std::memory_order_acquire);

            return true;
        else if( ref < 0 ) [[unlikely]]
        {
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_relaxed );
        }
        else if( flag.compare_exchange_strong( ref, -1, memory_order_relaxed, memory_order_relaxed ) ) [[likely]]
            break;
    bool succ = true;

    std::atomic_thread_fence(std::memory_order_acquire);

    try
    {
        if constexpr( requires { (bool)callable(); } )
            succ = (bool)callable();
        else
            callable();
    }
    catch( ... )
    {
        std::atomic_thread_fence(std::memory_order_release);

        flag.store( 0, memory_order_relaxed );
        flag.notify_one();
        throw;
    }

    std::atomic_thread_fence(std::memory_order_release);

    flag.store( (char)succ, memory_order_relaxed );
    if( succ )
        flag.notify_all();
    else
        flag.notify_one();
    return succ;
}


From Bonita Montero@3:633/10 to All on Sat Jan 3 04:58:14 2026

Am 03.01.2026 um 03:57 schrieb Chris M. Thomasson:

On 1/2/2026 4:35 PM, Bonita Montero wrote:

Am 02.01.2026 um 23:54 schrieb Chris M. Thomasson:

What if callable() returns zero, but zero is meant to denote
success? Are you familiar with the return values of a lot of POSIX
APIs? Returning zero means success. Would that mess up your logic
here? What if callable does not have a return value?

If the callable returns false the flag remains 0, i.e. uninitialized.

But, what if "callable" returning 0 means it succeeded? Akin to pthread_mutex_lock() returning 0?

It's the same behaviour as if the initialization code of a once_flag
throws an exception; the once_flag remains uninitialized.

So, a user provided callable needs to return a bool?


From Bonita Montero@3:633/10 to All on Sat Jan 3 04:59:13 2026

Am 03.01.2026 um 04:18 schrieb Chris M. Thomasson:

On 1/2/2026 4:34 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:54 schrieb Chris M. Thomasson:

The flag should be completely isolated from callable? and pad does
this.

The callable is only a number of references [&] which will be
optimized away.

callable can call into god knows what... You want your flag to be
isolated.

My code behaves the same way as with a std::call_once in that sense.


From Bonita Montero@3:633/10 to All on Sat Jan 3 05:00:04 2026

Am 03.01.2026 um 04:09 schrieb Chris M. Thomasson:

On 1/2/2026 4:32 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:52 schrieb Chris M. Thomasson:

I think you can make that logic all relaxed. +

Yes, with two additional barriers. But it's easier how I do it.

My way has the barriers exactly where they are actually needed, and
it's way easier to read.

Absolutely not.

But my way uses a single acquire barrier after the logic has done
its thing. That is simpler and more efficient.

It's not simpler your way; you would need two additional barriers and
I have two implicit barriers at runtime.

It's better than using the membars in the damn CAS wrt C++. One membar
for fail, one membar for success. Yeah. There can be rather major
issues with that...

Wrong.

You only need the actual membars once right before you call into
callable, and once right after it.

It's simpler how I do that.

Actually, not. Well, imvho.

Actually, std::atomic_thread_fence is more of a SPARC way to do
things wrt memory order.

SPARC is dead.

std::atomic_thread_fence can make things oh so much easier.

No, it makes my code more complicated.


From Chris M. Thomasson@3:633/10 to All on Fri Jan 2 20:02:33 2026

On 1/2/2026 8:00 PM, Bonita Montero wrote:

Am 03.01.2026 um 04:09 schrieb Chris M. Thomasson:

On 1/2/2026 4:32 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:52 schrieb Chris M. Thomasson:

I think you can make that logic all relaxed. +

Yes, with two additional barriers. But it's easier how I do it.

My way has the barriers exactly where they are actually needed, and
it's way easier to read.

Absolutely not.

But my way uses a single acquire barrier after the logic has done
its thing. That is simpler and more efficient.

It's not simpler your way; you would need two additional barriers and
I have two implicit barriers at runtime.

It's better than using the membars in the damn CAS wrt C++. One membar
for fail, one membar for success. Yeah. There can be rather major
issues with that...

Wrong.

Oh really? How?

You only need the actual membars once right before you call into
callable, and once right after it.

It's simpler how I do that.

Actually, not. Well, imvho.

Actually, std::atomic_thread_fence is more of a SPARC way to do
things wrt memory order.

SPARC is dead.

std::atomic_thread_fence can make things oh so much easier.

No, it makes my code more complicated.

Actually, it does not. Well, imvvho. I love the SPARC way of doing
things wrt memory ordering. Your memory order, afaict, is correct. But
the standalone one works, and it only executes a membar when it's 100%
needed.
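
To illustrate the standalone-fence style being argued for, here is a minimal sketch, not code from this thread. The type published_value and the function fence_example are made-up names. All atomic accesses are relaxed, and the acquire fence is only reached on the path that actually observed the flag as set.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical one-shot publication of a payload guarded by a ready flag.
struct published_value {
    int payload = 0;
    std::atomic<bool> ready{false};

    void publish(int v) {
        payload = v;
        // Release fence: orders the payload write before the relaxed flag store.
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    }

    // Returns true and fills `out` once the value has been published.
    bool try_read(int& out) const {
        if (!ready.load(std::memory_order_relaxed))
            return false;  // not published yet: no fence executed on this path
        // Acquire fence: pairs with the release fence in publish().
        std::atomic_thread_fence(std::memory_order_acquire);
        out = payload;
        return true;
    }
};

// Single-threaded walk through both paths of try_read().
bool fence_example() {
    published_value pv;
    int out = -1;
    assert(!pv.try_read(out));  // nothing published yet
    pv.publish(7);
    return pv.try_read(out) && out == 7;
}
```

Whether this reads better than putting the ordering on the load and store themselves is the matter of taste the thread is arguing about; both forms are valid under the C++ memory model.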


From Chris M. Thomasson@3:633/10 to All on Fri Jan 2 20:03:45 2026

On 1/2/2026 7:59 PM, Bonita Montero wrote:

Am 03.01.2026 um 04:18 schrieb Chris M. Thomasson:

On 1/2/2026 4:34 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:54 schrieb Chris M. Thomasson:

The flag should be completely isolated from callable? and pad does
this.

The callable is only a number of references [&] which will be
optimized away.

callable can call into god knows what... You want your flag to be
isolated.

My code behaves the same way as with a std::call_once in that sense.

You want your flag to be isolated from callable. Ideally aligned and
padded to an L2 cache line.
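
A minimal sketch of what such padding could look like. The names padded_flag and kAssumedLine are hypothetical, and the 64-byte line size is an assumption; some targets use 128 bytes, and the thread's xonce_flag does none of this.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. Adjust for the actual target.
inline constexpr std::size_t kAssumedLine = 64;

// alignas raises the alignment, and sizeof is rounded up to a multiple of it,
// so each flag occupies a whole line by itself.
struct alignas(kAssumedLine) padded_flag {
    std::atomic<signed char> state{0};
};

static_assert(alignof(padded_flag) == kAssumedLine, "flag starts on a line boundary");
static_assert(sizeof(padded_flag) == kAssumedLine, "flag fills exactly one line");

// Checks that two adjacent flags land on different lines.
bool neighbours_on_distinct_lines() {
    static padded_flag flags[2];
    auto a = reinterpret_cast<std::uintptr_t>(&flags[0]);
    auto b = reinterpret_cast<std::uintptr_t>(&flags[1]);
    assert(a % kAssumedLine == 0);
    return b - a == kAssumedLine;
}
```

The cost is 63 bytes of padding per flag, which is the trade-off being disputed here.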


From Chris M. Thomasson@3:633/10 to All on Fri Jan 2 20:05:14 2026

On 1/2/2026 7:57 PM, Bonita Montero wrote:

You make the code more complicated for nothing.

[...]

It makes the memory sync MUCH easier to read, imvho. Also, it's not more complicated, it's more concise.


From Bonita Montero@3:633/10 to All on Sat Jan 3 05:20:23 2026

Am 03.01.2026 um 05:03 schrieb Chris M. Thomasson:

On 1/2/2026 7:59 PM, Bonita Montero wrote:

Am 03.01.2026 um 04:18 schrieb Chris M. Thomasson:

On 1/2/2026 4:34 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:54 schrieb Chris M. Thomasson:

The flag should be completely isolated from callable? and pad
does this.

The callable is only a number of references [&] which will be
optimized away.

callable can call into god knows what... You want your flag to be
isolated.

My code behaves the same way as with a std::call_once in that sense.

You want your flag to be isolated from callable. Ideally aligned and
padded to an L2 cache line.

You're really really sick!
The flag is written only a few times until initialization succeeds.
Then it remains in a shared cacheline; so there's no false sharing.
And no need for alignment.


From Bonita Montero@3:633/10 to All on Sat Jan 3 05:21:11 2026

Am 03.01.2026 um 05:05 schrieb Chris M. Thomasson:

On 1/2/2026 7:57 PM, Bonita Montero wrote:

You make the code more complicated for nothing.

[...]

It makes the memory sync MUCH easier to read, imvho. Also, it's not
more complicated, it's more concise.

You're really sick. These are 24 lines of code.
If you think it's too hard to read don't program at all.


From Chris M. Thomasson@3:633/10 to All on Fri Jan 2 22:24:11 2026

On 1/2/2026 8:21 PM, Bonita Montero wrote:

Am 03.01.2026 um 05:05 schrieb Chris M. Thomasson:

On 1/2/2026 7:57 PM, Bonita Montero wrote:

You make the code more complicated for nothing.

[...]

It makes the memory sync MUCH easier to read, imvho. Also, it's not
more complicated, it's more concise.

You're really sick. These are 24 lines of code.
If you think it's too hard to read don't program at all.

I'm saying your original version with the membars is, as far as I can
tell, correct. Yes, I can read it just fine. The whole (bool)callable()
thing aside for a moment... ;^o

I just wanted to show another way to place the membars. The SPARC style
is neat, and C++ lets us express it cleanly. It's simply easier for me
to think about the protocol when the barriers are spelled out
explicitly. In this layout, the membars are exactly where they need to
be, and all the atomics are relaxed.

Your (bool)callable issue is interesting, by the way.

Anyway, here's the SPARC-style sketch I typed into the newsreader
(forgive any typos). This is the hazard-pointer load pattern. The
storeload membar makes the whole thing easy to reason about:
_________________
ct_tls& tls = ct_get_tls();

for (;;)
{
    void* ptr0 = atomic_load(&anchor);

    if (! ptr0)
    {
        tls.hazard = nullptr;
        return nullptr;
    }

    atomic_store(&tls.hazard, ptr0);

    membar_storeload(); // MEMBAR #StoreLoad | #StoreStore

    void* ptr1 = atomic_load(&anchor);

    if (ptr0 == ptr1)
    {
        // no-op (DEC Alpha (mb) aside for a moment,
        // compiler barrier aside...)
        membar_consume();

        return ptr0;
    }
}
_________________

C++ better NOT use an acquire barrier for my membar_consume()! GRRRRRR!

:^D
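
For comparison, a rough sketch of the same load pattern written with std::atomic. The names node, g_anchor, g_hazard and hazard_load are hypothetical, and a single global hazard slot stands in for the per-thread ct_tls. A seq_cst fence plays the role of the #StoreLoad membar. The reload uses an acquire load as a conservative stand-in for membar_consume(), which is precisely the cost objected to above; as far as I know, current compilers treat memory_order_consume as acquire anyway.

```cpp
#include <atomic>
#include <cassert>

struct node { int value; };

// Hypothetical globals: the shared anchor and one hazard slot
// (a real implementation would keep one slot per reader thread).
std::atomic<node*> g_anchor{nullptr};
std::atomic<node*> g_hazard{nullptr};

node* hazard_load() {
    for (;;) {
        node* p0 = g_anchor.load(std::memory_order_relaxed);
        if (!p0) {
            g_hazard.store(nullptr, std::memory_order_relaxed);
            return nullptr;
        }
        g_hazard.store(p0, std::memory_order_relaxed);
        // Full fence standing in for MEMBAR #StoreLoad: the hazard store
        // must be visible before the anchor is re-read.
        std::atomic_thread_fence(std::memory_order_seq_cst);
        node* p1 = g_anchor.load(std::memory_order_acquire);
        if (p0 == p1)
            return p0;  // anchor unchanged, so p0 is protected by the hazard slot
    }
}

// Single-threaded exercise of both exits of hazard_load().
bool hazard_example() {
    assert(hazard_load() == nullptr);  // empty anchor yields nullptr
    static node n{42};
    g_anchor.store(&n, std::memory_order_release);
    node* p = hazard_load();
    bool ok = p == &n && g_hazard.load(std::memory_order_relaxed) == &n
              && p->value == 42;
    g_anchor.store(nullptr, std::memory_order_relaxed);  // reset for reuse
    return ok;
}
```

The reclaiming side (which must scan the hazard slots after its own full fence) is omitted here.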


From Bonita Montero@3:633/10 to All on Sat Jan 3 10:08:07 2026

Am 03.01.2026 um 07:24 schrieb Chris M. Thomasson:

I just wanted to show another way to place the membars. The SPARC
style is neat, and C++ lets us express it cleanly. It's simply easier
for me to think about the protocol when the barriers are spelled out
explicitly. In this layout, the membars are exactly where they need to
be, and all the atomics are relaxed.

I like my minimalism.
If this were a more complex synchronization algorithm spanning screen pages
of lines you might be right, but not with this small amount of code.

Your (bool)callable issue is interesting, by the way.

That's why I wrote that. Otherwise I could have stuck with
std::once_flag.

Anyway, here's the SPARC-style sketch I typed into the newsreader
(forgive any typos). This is the hazard-pointer load pattern. The
storeload membar makes the whole thing easy to reason about:

SPARC is dead.
Neither Oracle nor Fujitsu has officially discontinued these CPUs,
but the last SPARC CPUs came out nine years ago. Fujitsu has moved
its development team to designing new ARM CPUs.

C++ better NOT use an acquire barrier for my membar_consume()! GRRRRRR!

I don't know whether a childish attitude is appropriate for software development.
But at least when it comes to such small details I might be right.


From Chris M. Thomasson@3:633/10 to All on Sat Jan 3 11:22:30 2026

On 1/2/2026 8:20 PM, Bonita Montero wrote:

Am 03.01.2026 um 05:03 schrieb Chris M. Thomasson:

On 1/2/2026 7:59 PM, Bonita Montero wrote:

Am 03.01.2026 um 04:18 schrieb Chris M. Thomasson:

On 1/2/2026 4:34 PM, Bonita Montero wrote:

Am 02.01.2026 um 21:54 schrieb Chris M. Thomasson:

The flag should be completely isolated from callable? and pad
does this.

The callable is only a number of references [&] which will be
optimized away.

callable can call into god knows what... You want your flag to be
isolated.

My code behaves the same way as with a std::call_once in that sense.

You want your flag to be isolated from callable. Ideally aligned and
padded to an L2 cache line.

You're really really sick!
The flag is written only a few times until initialization succeeds.
Then it remains in a shared cacheline; so there's no false sharing.
And no need for alignment.

I disagree. Try to get rid of any possibility of false sharing. Strive
for it. It's just good hygiene! :^)


From Chris M. Thomasson@3:633/10 to All on Sat Jan 3 11:39:56 2026

On 1/3/2026 1:08 AM, Bonita Montero wrote:

Am 03.01.2026 um 07:24 schrieb Chris M. Thomasson:

I just wanted to show another way to place the membars. The SPARC
style is neat, and C++ lets us express it cleanly. It's simply easier
for me to think about the protocol when the barriers are spelled out
explicitly. In this layout, the membars are exactly where they need to
be, and all the atomics are relaxed.

I like my minimalism.
If this were a more complex synchronization algorithm spanning screen pages
of lines you might be right, but not with this small amount of code.

To each their own.

Your (bool)callable issue is interesting, by the way.

That's why I wrote that. Otherwise I could have stuck with
std::once_flag.

It's just that (bool)callable is a bit scary to me.

Anyway, here's the SPARC-style sketch I typed into the newsreader
(forgive any typos). This is the hazard-pointer load pattern. The
storeload membar makes the whole thing easy to reason about:

SPARC is dead.

If you say so. I happen to like the way it handled memory order with its MEMBAR instruction.

Neither Oracle nor Fujitsu has officially discontinued these CPUs,
but the last SPARC CPUs came out nine years ago. Fujitsu has moved
its development team to designing new ARM CPUs.

Okay.

C++ better NOT use an acquire barrier for my membar_consume()! GRRRRRR!

I don't know whether a childish attitude is appropriate for software development.
But at least when it comes to such small details I might be right.

Oh my. If a damn compiler puts in a MEMBAR #LoadStore | #LoadLoad for a consume membar, I would be pissed off. You should be pissed off as well.

In a sense, if expecting a compiler to respect the memory model and
avoid unnecessary hardware fences is 'childish,' then I guess the entire
C++ Standards Committee is in preschool. Efficiency isn't a small
detail; it's the whole point.


From Chris M. Thomasson@3:633/10 to All on Sat Jan 3 11:57:11 2026

On 1/2/2026 7:58 PM, Bonita Montero wrote:

Am 03.01.2026 um 03:57 schrieb Chris M. Thomasson:

On 1/2/2026 4:35 PM, Bonita Montero wrote:

Am 02.01.2026 um 23:54 schrieb Chris M. Thomasson:

What if callable() returns zero, but zero is meant to denote
success? Are you familiar with the return values of a lot of POSIX
APIs? Returning zero means success. Would that mess up your logic
here? What if callable does not have a return value?

If the callable returns false the flag remains 0, i.e. uninitialized.

But, what if "callable" returning 0 means it succeeded? Akin to
pthread_mutex_lock() returning 0?

It's the same behaviour as if the initialization code of a once_flag
throws an exception; the once_flag remains uninitialized.

So, a user provided callable needs to return a bool?

Okay, but that means your design is implicitly imposing a contract on
the user: the callable must return a bool, and "true" must mean
successful initialization? That's fine if it's documented, but it's not
a general pattern in a sense. Humm...

Plenty of APIs, POSIX being an example, use 0 to mean success.
If someone passes a callable that follows that convention, your logic
treats a successful initialization as a failure and leaves the flag
uninitialized. Well, shit happens...

So the question isn't whether your approach works for your specific
use case... It's whether the interface is robust for "arbitrary"
callables? Right now, it isn't unless you require a bool returning
callable with a very specific meaning...

Fair enough?


From Richard Damon@3:633/10 to All on Sat Jan 3 17:59:19 2026

On 1/3/26 2:57 PM, Chris M. Thomasson wrote:

On 1/2/2026 7:58 PM, Bonita Montero wrote:

Am 03.01.2026 um 03:57 schrieb Chris M. Thomasson:

On 1/2/2026 4:35 PM, Bonita Montero wrote:

Am 02.01.2026 um 23:54 schrieb Chris M. Thomasson:

What if callable() returns zero, but zero is meant to denote
success? Are you familiar with the return values of a lot of POSIX
APIs? Returning zero means success. Would that mess up your logic
here? What if callable does not have a return value?

If the callable returns false the flag remains 0, i.e. uninitialized.

But, what if "callable" returning 0 means it succeeded? Akin to
pthread_mutex_lock() returning 0?

It's the same behaviour as if the initialization code of a once_flag
throws an exception; the once_flag remains uninitialized.

So, a user provided callable needs to return a bool?

Okay, but that means your design is implicitly imposing a contract on
the user: the callable must return a bool, and "true" must mean
successful initialization? That's fine if it's documented, but it's not
a general pattern in a sense. Humm...

Plenty of APIs, POSIX being an example, use 0 to mean success.
If someone passes a callable that follows that convention, your logic
treats a successful initialization as a failure and leaves the flag
uninitialized. Well, shit happens...

So the question isn't whether your approach works for your specific
use case... It's whether the interface is robust for "arbitrary"
callables? Right now, it isn't unless you require a bool returning
callable with a very specific meaning...

Fair enough?

Just says that if you have a function that doesn't return true on
success, you need to wrap it with a thin wrapper to return true on success.

Not uncommon to need thin shims like this in "generic" interfaces.

If int foo() returns 0 on success, you need a
bool shim_foo() { return 0 == foo(); }
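
The same shim can be written once as a generic adapter. This is only a sketch: posix_style_init, zero_is_success and shim_example are made-up names, not part of the posted code.

```cpp
#include <cassert>

// Hypothetical POSIX-style initializer: returns 0 on success, nonzero on error.
int posix_style_init(int* state) {
    *state = 42;
    return 0;
}

// Thin adapter: turns "0 means success" into "true means success",
// which is the convention a bool-returning callable has to follow.
template <typename F>
auto zero_is_success(F f) {
    return [f]() -> bool { return f() == 0; };
}

// Wraps the initializer and reports what the adapted callable returned.
bool shim_example() {
    int state = 0;
    auto wrapped = zero_is_success([&state] { return posix_style_init(&state); });
    bool ok = wrapped();
    assert(state == 42);  // the wrapped initializer really ran
    return ok;
}
```

The adapted callable is what would be handed to xcall_once in place of the raw function.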


From Bonita Montero@3:633/10 to All on Sun Jan 4 03:32:03 2026

Am 03.01.2026 um 20:57 schrieb Chris M. Thomasson:

On 1/2/2026 7:58 PM, Bonita Montero wrote:

Am 03.01.2026 um 03:57 schrieb Chris M. Thomasson:

On 1/2/2026 4:35 PM, Bonita Montero wrote:

Am 02.01.2026 um 23:54 schrieb Chris M. Thomasson:

What if callable() returns zero, but zero is meant to denote
success? Are you familiar with the return values of a lot of POSIX
APIs? Returning zero means success. Would that mess up your logic
here? What if callable does not have a return value?

If the callable returns false the flag remains 0, i.e. uninitialized.

But, what if "callable" returning 0 means it succeeded? Akin to
pthread_mutex_lock() returning 0?

It's the same behaviour as if the initialization code of a once_flag
throws an exception; the once_flag remains uninitialized.

So, a user provided callable needs to return a bool?

Okay, but that means your design is implicitly imposing a contract on
the user: the callable must return a bool, and "true" must mean
successful initialization? That's fine if it's documented, but it's
not a general pattern in a sense. Humm...

It can return a bool, but it doesn't have to (if constexpr( ... )).
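
A small sketch of that compile-time dispatch in isolation. To keep it buildable as C++17, it swaps the requires-expression from the posted code for the type trait std::is_constructible_v; run_and_report and dispatch_example are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

// Runs the callable and reports success. A result that converts to bool
// decides the outcome; anything else (e.g. void) counts as success.
template <typename F>
bool run_and_report(F callable) {
    using result_t = std::invoke_result_t<F&>;
    if constexpr (std::is_constructible_v<bool, result_t>) {
        return static_cast<bool>(callable());
    } else {
        callable();
        return true;
    }
}

// Exercises the three cases discussed in the thread.
bool dispatch_example() {
    bool void_ran = false;
    bool a = run_and_report([] { return true; });               // bool: success
    bool b = run_and_report([] { return 0; });                  // int 0: reported as failure
    bool c = run_and_report([&void_ran] { void_ran = true; });  // void: success
    assert(void_ran);
    return a && !b && c;
}
```

The second case is the POSIX-style pitfall raised earlier: a callable returning int 0 for success is read as a failed initialization.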

Plenty of APIs, POSIX being an example, use 0 to mean success.
If someone passes a callable that follows that convention, your logic
treats a successful initialization as a failure and leaves the flag
uninitialized. Well, shit happens...

So the question isn't whether your approach works for your specific
use case... It's whether the interface is robust for "arbitrary"
callables? Right now, it isn't unless you require a bool returning
callable with a very specific meaning...

Fair enough?


From Chris M. Thomasson@3:633/10 to All on Sat Jan 3 18:47:01 2026

On 1/3/2026 2:59 PM, Richard Damon wrote:

On 1/3/26 2:57 PM, Chris M. Thomasson wrote:

On 1/2/2026 7:58 PM, Bonita Montero wrote:

Am 03.01.2026 um 03:57 schrieb Chris M. Thomasson:

On 1/2/2026 4:35 PM, Bonita Montero wrote:

Am 02.01.2026 um 23:54 schrieb Chris M. Thomasson:

What if callable() returns zero, but zero is meant to denote
success? Are you familiar with the return values of a lot of POSIX
APIs? Returning zero means success. Would that mess up your logic
here? What if callable does not have a return value?

If the callable returns false the flag remains 0, i.e. uninitialized.

But, what if "callable" returning 0 means it succeeded? Akin to
pthread_mutex_lock() returning 0?

It's the same behaviour as if the initialization code of a once_flag
throws an exception; the once_flag remains uninitialized.

So, a user provided callable needs to return a bool?

Okay, but that means your design is implicitly imposing a contract on
the user: the callable must return a bool, and "true" must mean
successful initialization? That's fine if it's documented, but it's
not a general pattern in a sense. Humm...

Plenty of APIs, POSIX being an example, use 0 to mean success.
If someone passes a callable that follows that convention, your logic
treats a successful initialization as a failure and leaves the flag
uninitialized. Well, shit happens...

So the question isn't whether your approach works for your specific
use case... It's whether the interface is robust for "arbitrary"
callables? Right now, it isn't unless you require a bool returning
callable with a very specific meaning...

Fair enough?

Just says that if you have a function that doesn't return true on
success, you need to wrap it with a thin wrapper to return true on success.

Not uncommon to need thin shims like this in "generic" interfaces.

If int foo() returns 0 on success, you need a
bool shim_foo() { return 0 == foo(); }

Fine. It just needs to be documented.


From Bonita Montero@3:633/10 to All on Sun Jan 4 08:04:34 2026

Just says that if you have a function that doesn't return true on
success, you need to wrap it with a thin wrapper to return true on success.

If the return type is bool, use if( fn() ) or if( !fn() ).
If it returns an integer with 0 for success, use if( fn() == 0 ).
The rest is easily readable from the context.


From Bonita Montero@3:633/10 to All on Sun Jan 4 08:06:36 2026

Am 03.01.2026 um 20:22 schrieb Chris M. Thomasson:

I disagree. Try to get rid of any possibility of false sharing. Strive
for it. It's just good hygiene! :^)

No, false sharing needs to be avoided only if it happens at least sometimes.
Here false sharing might occur just once, when the "once-flag" is
initialized; otherwise the flag / the cacheline remains in shared mode. The
performance impact of completely avoiding false sharing here is nearly
zero.


From Bonita Montero@3:633/10 to All on Sun Jan 4 08:11:44 2026

Am 03.01.2026 um 20:39 schrieb Chris M. Thomasson:

Your (bool)callable issue is interesting, by the way.

That's why I wrote that. Otherwise I could have stuck with
std::once_flag.

It's just that (bool)callable is a bit scary to me.

I don't understand you; the interface is easy to understand.
And if you need simpler code inside xcall_once than in call_once and
don't need the boolean return feature, you could just return nothing.

Anyway, here's the SPARC-style sketch I typed into the newsreader
(forgive any typos). This is the hazard-pointer load pattern. The
storeload membar makes the whole thing easy to reason about:

SPARC is dead.

If you say so. I happen to like the way it handled memory order with
its MEMBAR instruction.

I'm using membars in my code in the most efficient way, since how I
did it is the simplest way to do that.

I don't know whether a childish attitude is appropriate for software
development.
But at least when it comes to such small details I might be right.

Oh my. If a damn compiler puts in a MEMBAR #LoadStore | #LoadLoad for
a consume membar, I would be pissed off. You should be pissed off as well.

I use as few membars as possible, and where I use them they're
in the right place. I don't follow the minimalism paradigm
all the time, but here it applies.

In a sense, if expecting a compiler to respect the memory model and
avoid unnecessary hardware fences is 'childish,' then I guess the
entire C++ Standards Committee is in preschool. Efficiency isn't a
small detail; it's the whole point.

_I_ do that the simplest way; you added superfluous code to make
the code more understandable. With a function that is only 34
lines of code ...


From Bonita Montero@3:633/10 to All on Sun Jan 4 08:32:49 2026

Am 03.01.2026 um 05:02 schrieb Chris M. Thomasson:

It's better than using the membars in the damn CAS wrt C++. One
membar for fail, one membar for success. Yeah. There can be rather
major issues with that...

Wrong.

Oh really? How?

I use the membars correctly and as minimally as possible.

No, it makes my code more complicated.

Actually, it does not. Well, imvvho. I love the SPARC way of doing
things wrt memory ordering. Your memory order, afaict, is correct.
But the standalone one works, and it only executes a membar when it's
100% needed.

Forget SPARC, it's dead.
And C++ has complete abstraction of any applicable barrier.


From Chris M. Thomasson@3:633/10 to All on Sun Jan 4 14:05:03 2026

On 1/3/2026 11:11 PM, Bonita Montero wrote:

Am 03.01.2026 um 20:39 schrieb Chris M. Thomasson:

Your (bool)callable issue is interesting, by the way.

That's why I wrote that. Otherwise I could have stuck with
std::once_flag.

It's just that (bool)callable is a bit scary to me.

I don't understand you; the interface is easy to understand.
And if you need simpler code inside xcall_once than in call_once and
don't need the boolean return feature, you could just return nothing.

You should put a clear comment about (bool)callable?

Anyway, here's the SPARC-style sketch I typed into the newsreader
(forgive any typos). This is the hazard-pointer load pattern. The
storeload membar makes the whole thing easy to reason about:

SPARC is dead.

If you say so. I happen to like the way it handled memory order with
its MEMBAR instruction.

I'm using membars in my code in the most efficient way, since how I
did it is the simplest way to do that.

That's fine.

I don't know whether a childish attitude is appropriate for software
development.
But at least when it comes to such small details I might be right.

Your code is fine, (bool)callable aside for a moment. I can read it. I
just wanted to show another way to use standalone fences. That's all.

Oh my. If a damn compiler puts in a MEMBAR #LoadStore | #LoadLoad for
a consume membar, I would be pissed off. You should be pissed off as
well.

I use as few membars as possible, and where I use them they're
in the right place. I don't follow the minimalism paradigm
all the time, but here it applies.

In a sense, if expecting a compiler to respect the memory model and
avoid unnecessary hardware fences is 'childish,' then I guess the
entire C++ Standards Committee is in preschool. Efficiency isn't a
small detail; it's the whole point.

_I_ do that the simplest way; you added superfluous code to make
the code more understandable. With a function that is only 34
lines of code ...

That is another matter. A consume membar. Your code does not use them.
But if you ever do, well, make sure a rogue compiler is not putting in
an acquire barrier when it does not have to. I should have made another
topic about it. Sorry for that.


From Chris M. Thomasson@3:633/10 to All on Sun Jan 4 14:06:40 2026

On 1/3/2026 11:04 PM, Bonita Montero wrote:

Just says that if you have a function that doesn't return true on
success, you need to wrap it with a thin wrapper to return true on success.

If the return type is bool, use if( fn() ) or if( !fn() ).
If it returns an integer with 0 for success, use if( fn() == 0 ).
The rest is easily readable from the context.

Right, but people might like a clear comment about it before reading the
code? Fair enough?


From Chris M. Thomasson@3:633/10 to All on Sun Jan 4 14:13:56 2026

On 1/3/2026 11:32 PM, Bonita Montero wrote:

Am 03.01.2026 um 05:02 schrieb Chris M. Thomasson:

Its better than using the membars in the damn cas wrt C++. One
membar for fail, one membar for success. Yeah. There can be rather
major issues with that...

Wrong.

Oh really? How?

I use the membars correctly and as minimal as possible.

The mebars for CAS success and fail can be a bit sketchy. We were
discussing them well before the C++11 std back in
comp.programming.threads. I wonder if Alex Terekhov is still reading
usenet.

No, it makes my code more complicated.

Actually, it does not. Well, imvvho. I love the SPARC way of doing
things wrt memory ordering. Your memory order, afaict, is correct.
But, the stand alone one works and it only executes a membar when it's
100% needed.
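
To make the two styles being compared concrete, here is a small sketch. It is not code from either poster; claim_with_cas_orders and claim_with_fence are hypothetical names. Variant 1 attaches the ordering to the CAS itself; variant 2 uses a relaxed CAS and reaches a stand-alone acquire fence only on the path that needs it.

```cpp
#include <atomic>
#include <cassert>

// Variant 1: ordering attached to the CAS (acquire on success,
// relaxed on failure).
inline bool claim_with_cas_orders(std::atomic<int>& state)
{
    int expected = 0;
    return state.compare_exchange_strong(
        expected, 1,
        std::memory_order_acquire,   // success ordering
        std::memory_order_relaxed);  // failure ordering
}

// Variant 2: relaxed CAS plus a stand-alone fence that is executed
// only when the claim succeeded.
inline bool claim_with_fence(std::atomic<int>& state)
{
    int expected = 0;
    if (!state.compare_exchange_strong(
            expected, 1,
            std::memory_order_relaxed,
            std::memory_order_relaxed))
        return false;                                   // no fence here
    std::atomic_thread_fence(std::memory_order_acquire);
    return true;
}

// Usage example: both variants claim the flag exactly once.
inline void claim_usage()
{
    std::atomic<int> a{0}, b{0};
    assert(claim_with_cas_orders(a) && !claim_with_cas_orders(a));
    assert(claim_with_fence(b) && !claim_with_fence(b));
}
```

Whether the two variants produce different machine code depends on the target. On x86 a locked CMPXCHG already acts as a full barrier, so any difference would show up mainly on weakly ordered architectures.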

Forget SPARC, it's dead.

Should the std get rid of stand alone fences?

And C++ has complete abstraction of any applicable barrier.

Even consume?

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Chris M. Thomasson@3:633/10 to All on Sun Jan 4 14:19:40 2026

On 1/3/2026 11:06 PM, Bonita Montero wrote:

Am 03.01.2026 um 20:22 schrieb Chris M. Thomasson:

I disagree. Try to get rid of any possibility of false sharing. Strive
for it. It's just good hygiene! :^)

No, false sharing needs to be avoided if it happens at least sometimes.
Here false sharing might occur just once when the "once-flag" is
initialized; otherwise the flag / the cacheline remains in shared mode.
The performance impact of completely avoiding false sharing here is
nearly zero.

I would still do it. But, that's just me. It's a bit more than that.
Every atomic RMW, store, load, on your flag. Think of a reservation
granule for systems with LL/SC. A user needs to be careful where they
place those flags. But, well, they can align it themselves. So, whatever.
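
A sketch of what "align and pad the flag" could look like. The 64-byte line size is an assumption (common on x86-64; other targets and LL/SC reservation granules may differ), and padded_flag is a hypothetical type, not the xonce_flag from the thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Assumption: 64-byte cache lines. Adjust for the actual target.
inline constexpr std::size_t assumed_line_size = 64;

// alignas rounds sizeof up to a multiple of the alignment, so the atomic
// does not share its line with neighbouring objects.
struct alignas(assumed_line_size) padded_flag
{
    std::atomic<signed char> value{0};
};

static_assert(alignof(padded_flag) == assumed_line_size);
static_assert(sizeof(padded_flag) == assumed_line_size);

// Usage example: adjacent flags in an array land on different lines.
inline void padding_usage()
{
    padded_flag flags[2];
    auto* first  = reinterpret_cast<char*>(&flags[0]);
    auto* second = reinterpret_cast<char*>(&flags[1]);
    assert(second - first == static_cast<std::ptrdiff_t>(assumed_line_size));
}
```

The standard also offers std::hardware_destructive_interference_size in <new> (C++17) as a portable constant, though not every standard library provides it.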

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Bonita Montero@3:633/10 to All on Mon Jan 5 01:00:36 2026

Am 04.01.2026 um 23:13 schrieb Chris M. Thomasson:

On 1/3/2026 11:32 PM, Bonita Montero wrote:

Am 03.01.2026 um 05:02 schrieb Chris M. Thomasson:

It's better than using the membars in the damn cas wrt C++. One
membar for fail, one membar for success. Yeah. There can be rather
major issues with that...

Wrong.

Oh really? How?

I use the membars correctly and as minimal as possible.

The membars for CAS success and fail can be a bit sketchy. We were
discussing them well before the C++11 std back in
comp.programming.threads. I wonder if Alex Terekhov is still reading
usenet.

My code is simple in that sense.

No, it makes my code more complicated.

Actually, it does not. Well, imvvho. I love the SPARC way of doing
things wrt memory ordering. Your memory order, afaict, is correct.
But, the stand alone one works and it only executes a membar when
it's 100% needed.

Forget SPARC, it's dead.

Should the std get rid of stand alone fences?

No, but they're mostly not needed.

And C++ has complete abstraction of any applicable barrier.

Even consume?

Acquire and release consistency is sufficient 95% of the time.

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Bonita Montero@3:633/10 to All on Mon Jan 5 01:01:34 2026

Am 04.01.2026 um 23:19 schrieb Chris M. Thomasson:

On 1/3/2026 11:06 PM, Bonita Montero wrote:

Am 03.01.2026 um 20:22 schrieb Chris M. Thomasson:

I disagree. Try to get rid of any possibility of false sharing.
Strive for it. It's just good hygiene! :^)

No, false sharing needs to be avoided if it happens at least sometimes.
Here false sharing might occur just once when the "once-flag" is
initialized; otherwise the flag / the cacheline remains in shared mode.
The performance impact of completely avoiding false sharing here is
nearly zero.

I would still do it. But, that's just me. It's a bit more than that.
Every atomic RMW, store, load, on your flag. Think of a reservation
granule for systems with LL/SC. A user needs to be careful where they
place those flags. But, well, they can align it themselves. So, whatever.

Sorry, the initialization happens only once and after that the flag
cache line is read-shared.

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Bonita Montero@3:633/10 to All on Mon Jan 5 01:03:59 2026

Am 04.01.2026 um 23:05 schrieb Chris M. Thomasson:

On 1/3/2026 11:11 PM, Bonita Montero wrote:

Am 03.01.2026 um 20:39 schrieb Chris M. Thomasson:

Your (bool)callable issue is interesting, by the way.

That's why I wrote that. Otherwise I could have stuck with
std::once_flag.

It's just that (bool)callable is a bit scary to me.

I don't understand you; the interface is understandable in an easy way.
And if you need simpler code inside xcall_once than in call_once and
not the boolean return feature you just could return nothing.

You should put a clear comment about (bool)callable?

That's sufficient:

    if constexpr( requires { (bool)callable(); } )

I don't know whether a childish attitude is appropriate for software
development.
But at least when it comes to such small details I might be right.

Your code is fine, (bool)callable aside for a moment. I can read it. I
just wanted to show another way to use stand alone fences. That's all.

I'm using fences as minimalized as possible.

That is another matter. A consume membar. Your code does not use them.

Of course, I'm using it twice. Once after the first flag load and once
after a failed CAS.

But, if you ever do, well, make sure a rogue compiler is not putting
in an acquire barrier when it does not have to. I should have made
another topic about it. Sorry for that.

I'm using the fences properly and as minimal as possible.

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Chris M. Thomasson@3:633/10 to All on Sun Jan 4 16:40:33 2026

On 1/4/2026 4:01 PM, Bonita Montero wrote:

Am 04.01.2026 um 23:19 schrieb Chris M. Thomasson:

On 1/3/2026 11:06 PM, Bonita Montero wrote:

Am 03.01.2026 um 20:22 schrieb Chris M. Thomasson:

I disagree. Try to get rid of any possibility of false sharing.
Strive for it. It's just good hygiene! :^)

No, false sharing needs to be avoided if it happens at least sometimes.
Here false sharing might occur just once when the "once-flag" is
initialized; otherwise the flag / the cacheline remains in shared mode.
The performance impact of completely avoiding false sharing here is
nearly zero.

I would still do it. But, that's just me. It's a bit more than that.
Every atomic RMW, store, load, on your flag. Think of a reservation
granule for systems with LL/SC. A user needs to be careful where they
place those flags. But, well, they can align it themselves. So, whatever.

Sorry, the initialization happens only once and after that the flag
cache line is read-shared.

What about that CAS in the loop?

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Chris M. Thomasson@3:633/10 to All on Sun Jan 4 18:46:58 2026

On 1/4/2026 4:03 PM, Bonita Montero wrote:

Am 04.01.2026 um 23:05 schrieb Chris M. Thomasson:

On 1/3/2026 11:11 PM, Bonita Montero wrote:

Am 03.01.2026 um 20:39 schrieb Chris M. Thomasson:

Your (bool)callable issue is interesting, by the way.

That's why I wrote that. Otherwise I could have stuck with
std::once_flag.

It's just that (bool)callable is a bit scary to me.

I don't understand you; the interface is understandable in an easy way.
And if you need simpler code inside xcall_once than in call_once and
not the boolean return feature you just could return nothing.

You should put a clear comment about (bool)callable?

That's sufficient:

    if constexpr( requires { (bool)callable(); } )

I don't know whether a childish attitude is appropriate for software
development.
But at least when it comes to such small details I might be right.

Your code is fine, (bool)callable aside for a moment. I can read it. I
just wanted to show another way to use stand alone fences. That's all.

I'm using fences as minimalized as possible.

That is another matter. A consume membar. Your code does not use them.

Of course, I'm using it twice. Once after the first flag load and once
after a failed CAS.

But, if you ever do, well, make sure a rogue compiler is not putting
in an acquire barrier when it does not have to. I should have made
another topic about it. Sorry for that.

I'm using the fences properly and as minimal as possible.

I basically agree. But, your code logic had no need for a consume membar.
So, it's a moot point in this context. I only brought it up in case you
ever do need one. NOT an acquire. :^)

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Chris M. Thomasson@3:633/10 to All on Sun Jan 4 18:49:00 2026

On 1/4/2026 6:46 PM, Chris M. Thomasson wrote:

On 1/4/2026 4:03 PM, Bonita Montero wrote:

Am 04.01.2026 um 23:05 schrieb Chris M. Thomasson:

On 1/3/2026 11:11 PM, Bonita Montero wrote:

Am 03.01.2026 um 20:39 schrieb Chris M. Thomasson:

Your (bool)callable issue is interesting, by the way.

That's why I wrote that. Otherwise I could have stuck with
std::once_flag.

It's just that (bool)callable is a bit scary to me.

I don't understand you; the interface is understandable in an easy way.
And if you need simpler code inside xcall_once than in call_once and
not the boolean return feature you just could return nothing.

You should put a clear comment about (bool)callable?

That's sufficient:

    if constexpr( requires { (bool)callable(); } )

I don't know whether a childish attitude is appropriate for software
development.
But at least when it comes to such small details I might be right.

Your code is fine, (bool)callable aside for a moment. I can read it. I
just wanted to show another way to use stand alone fences. That's all.

I'm using fences as minimalized as possible.

That is another matter. A consume membar. Your code does not use them.

Of course, I'm using it twice. Once after the first flag load and once
after a failed CAS.

But, if you ever do, well, make sure a rogue compiler is not putting
in an acquire barrier when it does not have to. I should have made
another topic about it. Sorry for that.

I'm using the fences properly and as minimal as possible.

I basically agree. But, your code logic had no need for a consume membar.
So, it's a moot point in this context. I only brought it up in case you
ever do need one. NOT an acquire. :^)

Actually, do you know when to use a consume membar? And why you SHOULD
get "pissed off" if a compiler injects a god damn acquire in there?
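
As background for the consume question, here is a sketch of the classic case: publishing a pointer and then reading through it, where only the data-dependent load needs ordering. The names config, g_published, publish and read_value_or are hypothetical. Note the caveat that, as far as I know, mainstream compilers currently implement memory_order_consume by promoting it to acquire, and the ordering is slated for deprecation in newer standards, which is exactly the complaint above.

```cpp
#include <atomic>
#include <cassert>

struct config { int value; };

// Hypothetical publication slot shared between a writer and readers.
inline std::atomic<config*> g_published{nullptr};

// Writer: fill in the object first, then publish the pointer with release.
inline void publish(config* c)
{
    g_published.store(c, std::memory_order_release);
}

// Reader: the dereference carries a data dependency on the loaded pointer,
// which is the only ordering that consume is meant to guarantee.
inline int read_value_or(int fallback)
{
    config* c = g_published.load(std::memory_order_consume);
    return c ? c->value : fallback;
}

// Usage example (single-threaded, just to show the call pattern).
inline void consume_usage()
{
    static config cfg{7};
    publish(&cfg);
    assert(read_value_or(-1) == 7);
}
```

On architectures that honour address dependencies in hardware, a true consume load would need no barrier instruction at all, whereas an acquire load may emit one.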

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Chris M. Thomasson@3:633/10 to All on Sun Jan 4 20:05:12 2026

On 1/4/2026 4:40 PM, Chris M. Thomasson wrote:

On 1/4/2026 4:01 PM, Bonita Montero wrote:

Am 04.01.2026 um 23:19 schrieb Chris M. Thomasson:

On 1/3/2026 11:06 PM, Bonita Montero wrote:

Am 03.01.2026 um 20:22 schrieb Chris M. Thomasson:

I disagree. Try to get rid of any possibility of false sharing.
Strive for it. It's just good hygiene! :^)

No, false sharing needs to be avoided if it happens at least sometimes.
Here false sharing might occur just once when the "once-flag" is
initialized; otherwise the flag / the cacheline remains in shared mode.
The performance impact of completely avoiding false sharing here is
nearly zero.

I would still do it. But, that's just me. It's a bit more than that.
Every atomic RMW, store, load, on your flag. Think of a reservation
granule for systems with LL/SC. A user needs to be careful where they
place those flags. But, well, they can align it themselves. So,
whatever.

Sorry, the initialization happens only once and after that the flag
cache line is read-shared.

What about that CAS in the loop?

It's hitting the cache line.

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Bonita Montero@3:633/10 to All on Mon Jan 5 10:18:54 2026

Am 05.01.2026 um 01:40 schrieb Chris M. Thomasson:

What about that CAS in the loop?

It's only executed once.

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Chris M. Thomasson@3:633/10 to All on Mon Jan 5 12:31:49 2026

On 1/5/2026 1:18 AM, Bonita Montero wrote:

Am 05.01.2026 um 01:40 schrieb Chris M. Thomasson:

What about that CAS in the loop?

It's only executed once.

Notice the loop?

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Bonita Montero@3:633/10 to All on Tue Jan 6 06:08:39 2026

Am 05.01.2026 um 21:31 schrieb Chris M. Thomasson:

On 1/5/2026 1:18 AM, Bonita Montero wrote:

Am 05.01.2026 um 01:40 schrieb Chris M. Thomasson:

What about that CAS in the loop?

It's only executed once.

Notice the loop?

Yes, but there's usually not so much contention that it
is executed more than once.

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Chris M. Thomasson@3:633/10 to All on Tue Jan 6 16:58:00 2026

On 1/5/2026 9:08 PM, Bonita Montero wrote:

Am 05.01.2026 um 21:31 schrieb Chris M. Thomasson:

On 1/5/2026 1:18 AM, Bonita Montero wrote:

Am 05.01.2026 um 01:40 schrieb Chris M. Thomasson:

What about that CAS in the loop?

It's only executed once.

Notice the loop?

Yes, but there's usually not so much contention that it
is executed more than once.

Never know. You are exposing your algo to user code. God knows what it
might do.

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Bonita Montero@3:633/10 to All on Wed Jan 7 06:56:55 2026

Am 07.01.2026 um 01:58 schrieb Chris M. Thomasson:

Never know. You are exposing your algo to user code. God knows what it
might do.

The xonce_flag flips to true only once, no matter how much user code is
surrounding that flag.

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Chris M. Thomasson@3:633/10 to All on Tue Jan 6 22:14:22 2026

On 1/6/2026 9:56 PM, Bonita Montero wrote:

Am 07.01.2026 um 01:58 schrieb Chris M. Thomasson:

Never know. You are exposing your algo to user code. God knows what it
might do.

The xonce_flag flips to true only once, no matter how much user code is
surrounding that flag.

Forget about the flips for a moment. How many times does a LOCK CMPXCHG
hit it?

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Bonita Montero@3:633/10 to All on Wed Jan 7 07:46:46 2026

Am 07.01.2026 um 07:14 schrieb Chris M. Thomasson:

On 1/6/2026 9:56 PM, Bonita Montero wrote:

Am 07.01.2026 um 01:58 schrieb Chris M. Thomasson:

Never know. You are exposing your algo to user code. God knows what
it might do.

The xonce_flag flips to true only once, no matter how much user code
is surrounding that flag.

Forget about the flips for a moment. How many times does a LOCK CMPXCHG
hit it?

Once in 99.999% of all cases.

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Chris M. Thomasson@3:633/10 to All on Wed Jan 7 12:55:43 2026

On 1/6/2026 10:46 PM, Bonita Montero wrote:

Am 07.01.2026 um 07:14 schrieb Chris M. Thomasson:

On 1/6/2026 9:56 PM, Bonita Montero wrote:

Am 07.01.2026 um 01:58 schrieb Chris M. Thomasson:

Never know. You are exposing your algo to user code. God knows what
it might do.

The xonce_flag flips to true only once, no matter how much user code
is surrounding that flag.

Forget about the flips for a moment. How many times does a LOCK CMPXCHG
hit it?

Once in 99.999% of all cases.

What if somebody uses a callable that throws all the time? ;^) Kidding,
but I still would align and pad the flag. But, that's just me. In the name
of "proper" hygiene...
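
The throwing-callable case can be illustrated with a deliberately simplified, single-threaded stand-in. counting_once is hypothetical and is NOT the xcall_once from the thread: waiting and notification are omitted, so it is not a usable once-primitive. It only counts how often the claiming CAS succeeds when the callable keeps throwing and the flag is rolled back to 0.

```cpp
#include <atomic>
#include <cassert>
#include <stdexcept>

// States: 0 = not run, -1 = running, 1 = done.
struct counting_once
{
    std::atomic<signed char> flag{0};
    int cas_claims = 0;   // plain int: this demo is single-threaded

    template <class F>
    void call(F f)
    {
        if (flag.load(std::memory_order_acquire) > 0)
            return;                       // fast path: already done
        signed char expected = 0;
        if (!flag.compare_exchange_strong(expected, -1,
                std::memory_order_acquire, std::memory_order_relaxed))
            return;                       // real code would wait here
        ++cas_claims;
        try { f(); }
        catch (...)
        {
            flag.store(0, std::memory_order_release);  // roll back
            throw;
        }
        flag.store(1, std::memory_order_release);
    }
};

// Usage example: three failures, then success, then a fast-path call.
inline int count_claims_with_failures()
{
    counting_once once;
    int failures_left = 3;
    for (int i = 0; i < 5; ++i)
    {
        try
        {
            once.call([&] {
                if (failures_left-- > 0)
                    throw std::runtime_error("init failed");
            });
        }
        catch (const std::runtime_error&) {}
    }
    assert(once.flag.load() == 1);
    return once.cas_claims;
}
```

In this sketch every failed attempt costs one more read-modify-write on the flag's cache line, so the "once" count holds only as long as the callable does not throw.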

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Bonita Montero@3:633/10 to All on Thu Jan 8 05:31:08 2026

Am 07.01.2026 um 21:55 schrieb Chris M. Thomasson:

What if somebody uses a callable that throws all the time? ;^) Kidding,
but I still would align and pad the flag. But, that's just me. In the name
of "proper" hygiene...

There's no measurable performance difference with a padded flag.

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
