On 02.01.2026 at 21:54, Chris M. Thomasson wrote:
The flag should be completely isolated from callable? And pad does this.
The callable is only a number of references [&] which will be optimized away.
#pragma once
#include <concepts>
#include <atomic>
struct xonce_flag
{
    xonce_flag() noexcept = default;
private:
    friend bool xcall_once( xonce_flag &, std::invocable auto );
    using flag_t = std::atomic<signed char>;
    // 0 = uninitialized, -1 = initialization in progress, 1 = initialized
    flag_t m_flag = 0;
};
bool xcall_once( xonce_flag &xflag, std::invocable auto callable )
{
    using namespace std;
    xonce_flag::flag_t &flag = xflag.m_flag;
    for( signed char ref = flag.load( memory_order_relaxed ); ; )
        if( ref > 0 ) [[likely]]
        {
            // already initialized: make the initializer's writes visible
            atomic_thread_fence( memory_order_acquire );
            return true;
        }
        else if( ref < 0 ) [[unlikely]]
        {
            // another thread is initializing: block until the flag changes
            flag.wait( ref, memory_order_relaxed );
            ref = flag.load( memory_order_relaxed );
        }
        else if( flag.compare_exchange_strong( ref, -1, memory_order_relaxed, memory_order_relaxed ) ) [[likely]]
            break; // this thread claimed the initialization
    bool succ = true;
    // pairs with the release fence of an earlier, failed initialization attempt
    atomic_thread_fence( memory_order_acquire );
    try
    {
        // a result convertible to bool reports success or failure;
        // any other callable counts as successful once it returns
        if constexpr( requires { (bool)callable(); } )
            succ = (bool)callable();
        else
            callable();
    }
    catch( ... )
    {
        // reset the flag and wake one waiter so it can retry
        atomic_thread_fence( memory_order_release );
        flag.store( 0, memory_order_relaxed );
        flag.notify_one();
        throw;
    }
    atomic_thread_fence( memory_order_release );
    flag.store( (char)succ, memory_order_relaxed );
    if( succ )
        flag.notify_all(); // everyone can proceed
    else
        flag.notify_one(); // one waiter retries the initialization
    return succ;
}
On 12/31/2025 9:00 PM, Bonita Montero wrote:
[...]
On 1/2/2026 4:35 PM, Bonita Montero wrote:
On 02.01.2026 at 23:54, Chris M. Thomasson wrote:
What if callable() returns zero, but zero is meant to denote success? Are you familiar with the return values of a lot of POSIX APIs? Returning zero means success. Would that mess up your logic here? What if callable does not have a return value?
If the callable returns false the flag remains 0, i.e. uninitialized.
But, what if "callable" returning 0 means it succeeded? Akin to pthread_mutex_lock() returning 0?
It's the same behaviour as if the initialization code of a once_flag throws an exception; the once_flag remains uninitialized.
So, a user provided callable needs to return a bool?
On 1/2/2026 4:34 PM, Bonita Montero wrote:
On 02.01.2026 at 21:54, Chris M. Thomasson wrote:
The flag should be completely isolated from callable? And pad does this.
The callable is only a number of references [&] which will be optimized away.
callable can call into god knows what... You want your flag to be isolated.
On 1/2/2026 4:32 PM, Bonita Montero wrote:
On 02.01.2026 at 21:52, Chris M. Thomasson wrote:
I think you can make that logic all relaxed. +
Yes, with two additional barriers. But it's easier how I do it.
My way has the barriers exactly where they are actually needed, and it's way easier to read.
Wrong.
But my way uses a single acquire barrier after the logic has done its thing. That is simpler and more efficient.
It's not simpler your way; you would need two additional barriers and I have two implicit barriers at runtime.
It's better than using the membars in the damn CAS wrt C++. One membar for fail, one membar for success. Yeah. There can be rather major issues with that...
Absolutely not.
You only need the actual membars once right before you call into callable, and once right after it.
It's simpler how I do that.
Actually, not. Well, imvho.
Actually, std::atomic_thread_fence is more of a SPARC way to do things wrt memory order.
SPARC is dead.
std::atomic_thread_fence can make things oh so much easier.
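For readers following the argument, here is a minimal sketch of the two styles being compared. It is not either poster's code: payload, ready and all function names are made up for illustration. Variant 1 attaches the ordering to the atomic operations themselves; variant 2 keeps the atomics relaxed and spells the ordering out with std::atomic_thread_fence. Both should give the same visibility guarantee for payload.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, used only for this illustration.
inline int payload = 0;                  // plain, non-atomic data
inline std::atomic<bool> ready{false};   // publication flag

// Variant 1: the memory order is attached to the atomic operations.
inline void publish_with_orders() {
    payload = 42;
    ready.store(true, std::memory_order_release);
}
inline int consume_with_orders() {
    while (!ready.load(std::memory_order_acquire))
        std::this_thread::yield();
    return payload;
}

// Variant 2: relaxed atomics plus standalone fences.
inline void publish_with_fences() {
    payload = 42;
    std::atomic_thread_fence(std::memory_order_release);  // before the store
    ready.store(true, std::memory_order_relaxed);
}
inline int consume_with_fences() {
    while (!ready.load(std::memory_order_relaxed))
        std::this_thread::yield();
    std::atomic_thread_fence(std::memory_order_acquire);  // after the load
    return payload;
}

// Runs one producer and one consumer and returns what the consumer saw.
inline int run_demo(void (*producer_fn)(), int (*consumer_fn)()) {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = 0;
    std::thread consumer([&] { seen = consumer_fn(); });
    std::thread producer(producer_fn);
    producer.join();
    consumer.join();
    return seen;
}
```

In the fence variant the acquire fence runs once, after the spin loop has finished, instead of on every iteration of the load. That is the property the standalone-fence side of the argument cares about; whether it matters for a function as small as xcall_once is exactly what is being disputed here.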
On 03.01.2026 at 04:09, Chris M. Thomasson wrote:
No, it makes my code more complicated.
On 03.01.2026 at 04:18, Chris M. Thomasson wrote:
callable can call into god knows what... You want your flag to be isolated.
My code behaves the same way as with a std::call_once in that sense.
You make the code more complicated for nothing.
[...]
On 1/2/2026 7:59 PM, Bonita Montero wrote:
My code behaves the same way as with a std::call_once in that sense.
You want your flag to be isolated from callable. Ideally aligned and padded to an L2 cache line.
On 1/2/2026 7:57 PM, Bonita Montero wrote:
You make the code more complicated for nothing.
[...]
It makes the memory sync MUCH easier to read, imvho. Also, it's not more complicated, it's more concise.
On 03.01.2026 at 05:05, Chris M. Thomasson wrote:
On 1/2/2026 7:57 PM, Bonita Montero wrote:
You make the code more complicated for nothing.
[...]
It makes the memory sync MUCH easier to read, imvho. Also, it's not more complicated, it's more concise.
You're really sick. These are 24 lines of code.
If you think it's too hard to read don't program at all.
I just wanted to show another way to place the membars. The SPARC style is neat, and C++ lets us express it cleanly. It's simply easier for me to think about the protocol when the barriers are spelled out explicitly. In this layout, the membars are exactly where they need to be, and all the atomics are relaxed.
I like my minimalism.
Your (bool)callable issue is interesting, by the way.
That's why I wrote that. Otherwise I could have stuck with std::once_flag.
Anyway, here's the SPARC-style sketch I typed into the newsreader (forgive any typos). This is the hazard-pointer load pattern. The storeload membar makes the whole thing easy to reason about:
SPARC is dead.
C++ better NOT use an acquire barrier for my membar_consume()! GRRRRRR!
On 03.01.2026 at 05:03, Chris M. Thomasson wrote:
You want your flag to be isolated from callable. Ideally aligned and padded to an L2 cache line.
You're really, really sick!
The flag is written only a few times until initialization succeeds. Then it remains in a shared cache line, so there's no false sharing. And no need for alignment.
On 03.01.2026 at 07:24, Chris M. Thomasson wrote:
I just wanted to show another way to place the membars. The SPARC style is neat, and C++ lets us express it cleanly. It's simply easier for me to think about the protocol when the barriers are spelled out explicitly. In this layout, the membars are exactly where they need to be, and all the atomics are relaxed.
I like my minimalism. If there were a more complex synchronization algorithm with screen pages of lines you might be right, but not with this small amount of code.
Your (bool)callable issue is interesting, by the way.
That's why I wrote that. Otherwise I could have stuck with std::once_flag.
Anyway, here's the SPARC-style sketch I typed into the newsreader (forgive any typos). This is the hazard-pointer load pattern. The storeload membar makes the whole thing easy to reason about:
SPARC is dead. Neither Oracle nor Fujitsu has officially discontinued these CPUs, but the last SPARC CPUs date from nine years ago. Fujitsu has moved its development team to designing new ARM CPUs.
C++ better NOT use an acquire barrier for my membar_consume()! GRRRRRR!
I don't know whether a childish attitude is appropriate for software development. But at least when it comes to such small details I might be right.
On 03.01.2026 at 03:57, Chris M. Thomasson wrote:
On 1/2/2026 4:35 PM, Bonita Montero wrote:
On 02.01.2026 at 23:54, Chris M. Thomasson wrote:
What if callable() returns zero, but zero is meant to denote success? Are you familiar with the return values of a lot of POSIX APIs? Returning zero means success. Would that mess up your logic here? What if callable does not have a return value?
If the callable returns false the flag remains 0, i.e. uninitialized.
But, what if "callable" returning 0 means it succeeded? Akin to pthread_mutex_lock() returning 0?
It's the same behaviour as if the initialization code of a once_flag throws an exception; the once_flag remains uninitialized.
So, a user provided callable needs to return a bool?
On 1/2/2026 7:58 PM, Bonita Montero wrote:
It's the same behaviour as if the initialization code of a once_flag throws an exception; the once_flag remains uninitialized.
So, a user provided callable needs to return a bool?
Okay, but that means your design is implicitly imposing a contract on the user: the callable must return a bool, and "true" must mean successful initialization? That's fine if it's documented, but it's not a general pattern in a sense. Humm...
Plenty of APIs, POSIX being an example, use 0 to mean success. If someone passes a callable that follows that convention, your logic treats a successful initialization as a failure and leaves the flag uninitialized. Well, shit happens...
So the question isn't whether your approach works for your specific use case... It's whether the interface is robust for "arbitrary" callables? Right now, it isn't unless you require a bool returning callable with a very specific meaning...
Fair enough?
On 1/2/2026 7:58 PM, Bonita Montero wrote:
It can return a bool, but it does not have to (if constexpr( ... )).
On 1/3/26 2:57 PM, Chris M. Thomasson wrote:
It just says that if you have a function that doesn't return true on success, you need to wrap it with a thin wrapper that returns true on success.
It's not uncommon to need thin shims like this in "generic" interfaces.
If int foo() returns 0 on success, you need a
bool shim_foo() { return 0 == foo(); }
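A slightly more general sketch of that shim idea, assuming nothing beyond the convention described above. posix_style_init and zero_is_success are hypothetical names invented for this example; they are not part of xcall_once or of any library.

```cpp
#include <cassert>
#include <utility>

// Hypothetical POSIX-style initializer: returns 0 on success, nonzero on error.
inline int posix_style_init() { return 0; }

// Adapts a "0 means success" callable into a "true means success" callable,
// which is the contract that (bool)callable() expects.
template <typename F>
auto zero_is_success(F f) {
    return [f = std::move(f)]() mutable -> bool { return f() == 0; };
}
```

Passed directly, posix_style_init would report failure, because (bool)0 is false. Wrapped as zero_is_success(posix_style_init), the same call reports success.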
I disagree. Try to get rid of any possibility of false sharing. Strive
for it. It's just good hygiene! :^)
Your (bool)callable issue is interesting, by the way.
That's why I wrote that. Otherwise I could have stuck with std::once_flag.
It's just that (bool)callable is a bit scary to me.
I don't understand you; the interface is understandable in an easy way.
Anyway, here's the SPARC-style sketch I typed into the newsreader (forgive any typos). This is the hazard-pointer load pattern. The storeload membar makes the whole thing easy to reason about:
SPARC is dead.
If you say so. I happen to like the way it handled memory order with its MEMBAR instruction.
I'm using membars in my code in the most efficient way, since how I did it is the simplest way to do that.
I don't know whether a childish attitude is appropriate for software development. But at least when it comes to such small details I might be right.
Oh my. If a damn compiler puts in a MEMBAR #LoadStore | #LoadLoad for a consume membar, I would be pissed off. You should be pissed off as well.
I use as few membars as possible and where I use them they're at their right place.
In a sense, if expecting a compiler to respect the memory model and avoid unnecessary hardware fences is 'childish,' then I guess the entire C++ Standards Committee is in preschool. Efficiency isn't a small detail; it's the whole point.
_I_ do that the simplest way; you added superfluous code to make the code more understandable.
It's better than using the membars in the damn CAS wrt C++. One membar for fail, one membar for success. Yeah. There can be rather major issues with that...
Wrong.
Oh really? How?
I use the membars correctly and as minimal as possible.
No, it makes my code more complicated.
Actually, it does not. Well, imvvho. I love the SPARC way of doing things wrt memory ordering. Your memory order, afaict, is correct. But, the stand alone one works and it only executes a membar when it's 100% needed.
On 03.01.2026 at 20:39, Chris M. Thomasson wrote:
Your (bool)callable issue is interesting, by the way.
That's why I wrote that. Otherwise I could have stuck with std::once_flag.
It's just that (bool)callable is a bit scary to me.
I don't understand you; the interface is understandable in an easy way. And if you need simpler code inside xcall_once than in call_once and not the boolean return feature, you could just return nothing.
Anyway, here's the SPARC-style sketch I typed into the newsreader (forgive any typos). This is the hazard-pointer load pattern. The storeload membar makes the whole thing easy to reason about:
SPARC is dead.
If you say so. I happen to like the way it handled memory order with its MEMBAR instruction.
I'm using membars in my code in the most efficient way, since how I did it is the simplest way to do that.
I don't know whether a childish attitude is appropriate for software development. But at least when it comes to such small details I might be right.
Oh my. If a damn compiler puts in a MEMBAR #LoadStore | #LoadLoad for a consume membar, I would be pissed off. You should be pissed off as well.
I use as few membars as possible and where I use them they're at their right place. I don't follow the minimalism paradigm all the time but here it applies.
In a sense, if expecting a compiler to respect the memory model and avoid unnecessary hardware fences is 'childish,' then I guess the entire C++ Standards Committee is in preschool. Efficiency isn't a small detail; it's the whole point.
_I_ do that the simplest way; you added superfluous code to make the code more understandable. With a function that is only 34 lines of code ...
Just says that if you have a function that doesn't return true on
success, you need to wrap it with a thin wrapper to return true on success.
If the return type is bool, use if( fn() ) or if( !fn() ).
If it returns an integral value with 0 for success, use if( fn() == 0 ).
The rest is easily readable from the context.
On 03.01.2026 at 05:02, Chris M. Thomasson wrote:
It's better than using the membars in the damn CAS wrt C++. One membar for fail, one membar for success. Yeah. There can be rather major issues with that...
Wrong.
Oh really? How?
I use the membars correctly and as minimal as possible.
No, it makes my code more complicated.
Actually, it does not. Well, imvvho. I love the SPARC way of doing things wrt memory ordering. Your memory order, afaict, is correct. But, the stand alone one works and it only executes a membar when it's 100% needed.
Forget SPARC, it's dead.
And C++ has complete abstraction of any applicable barrier.
On 03.01.2026 at 20:22, Chris M. Thomasson wrote:
I disagree. Try to get rid of any possibility of false sharing. Strive for it. It's just good hygiene! :^)
No, false sharing needs to be avoided if it happens at least sometimes.
Here false sharing might occur just once when the "once-flag" is initialized; otherwise the flag / the cache line remains in shared mode. The performance impact of completely avoiding false sharing here is nearly zero.
On 1/3/2026 11:32 PM, Bonita Montero wrote:
On 03.01.2026 at 05:02, Chris M. Thomasson wrote:
It's better than using the membars in the damn CAS wrt C++. One membar for fail, one membar for success. Yeah. There can be rather major issues with that...
Wrong.
Oh really? How?
I use the membars correctly and as minimal as possible.
The membars for CAS success and fail can be a bit sketchy. We were discussing them well before the C++11 std back in comp.programming.threads. I wonder if Alex Terekhov is still reading usenet.
My code is simple in that sense.
No, it makes my code more complicated.
Actually, it does not. Well, imvvho. I love the SPARC way of doing things wrt memory ordering. Your memory order, afaict, is correct. But, the stand alone one works and it only executes a membar when it's 100% needed.
Forget SPARC, it's dead.
Should the std get rid of stand alone fences?
No, but they're mostly not needed.
And C++ has complete abstraction of any applicable barrier.
Even consume?
On 1/3/2026 11:06 PM, Bonita Montero wrote:
On 03.01.2026 at 20:22, Chris M. Thomasson wrote:
I disagree. Try to get rid of any possibility of false sharing. Strive for it. It's just good hygiene! :^)
No, false sharing needs to be avoided if it happens at least sometimes.
Here false sharing might occur just once when the "once-flag" is initialized; otherwise the flag / the cache line remains in shared mode. The performance impact of completely avoiding false sharing here is nearly zero.
I would still do it. But, that's just me. It's a bit more than that. Every atomic RMW, store, load, on your flag. Think of a reservation granule for systems with LL/SC. A user needs to be careful where they place those flags. But, well, they can align it themselves. So, whatever.
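A minimal sketch of what "align it themselves" could look like on the user's side. The 64-byte line size is an assumption that has to be checked against the target; cache_isolated and isolated_flag are hypothetical names. std::atomic<signed char> stands in for the flag type here, and the same wrapper would presumably work for xonce_flag.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. Adjust for the actual target.
inline constexpr std::size_t cache_line = 64;

// Hypothetical wrapper: alignas rounds both the alignment and the size of
// the object up to one cache line, so nothing else shares the line.
template <typename T>
struct alignas(cache_line) cache_isolated {
    T value{};
};

using isolated_flag = cache_isolated<std::atomic<signed char>>;
static_assert(alignof(isolated_flag) == cache_line);
static_assert(sizeof(isolated_flag) == cache_line);
```

The cost is that a one-byte flag then occupies a full line. The standard library also offers std::hardware_destructive_interference_size in <new> (C++17) as a portable hint for this value, though not every implementation provides it.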
On 1/3/2026 11:11 PM, Bonita Montero wrote:
On 03.01.2026 at 20:39, Chris M. Thomasson wrote:
Your (bool)callable issue is interesting, by the way.
That's why I wrote that. Otherwise I could have stuck with std::once_flag.
It's just that (bool)callable is a bit scary to me.
I don't understand you; the interface is understandable in an easy way. And if you need simpler code inside xcall_once than in call_once and not the boolean return feature, you could just return nothing.
You should put a clear comment about (bool)callable?
I don't know whether a childish attitude is appropriate for software development. But at least when it comes to such small details I might be right.
Your code is fine, (bool)callable aside for a moment. I can read it. I just wanted to show another way to use stand alone fences. That's all.
I'm using fences as minimalized as possible.
That is another matter. A consume membar. Your code does not use them. But, if you ever do, well, make sure a rogue compiler is not putting in an acquire barrier when it does not have to. I should have made another topic about it. Sorry for that.
Of course, I'm using it twice. Once after the first flag load and once after a failed CAS.
On 04.01.2026 at 23:19, Chris M. Thomasson wrote:
Every atomic RMW, store, load, on your flag. Think of a reservation granule for systems with LL/SC. A user needs to be careful where they place those flags. But, well, they can align it themselves. So, whatever.
Sorry, the initialization happens only once and after that the flag cache line is read-shared.
On 04.01.2026 at 23:05, Chris M. Thomasson wrote:
You should put a clear comment about (bool)callable?
That's sufficient:
    if constexpr( requires { (bool)callable(); } )
Your code is fine, (bool)callable aside for a moment. I can read it. I just wanted to show another way to use stand alone fences. That's all.
I'm using fences as minimalized as possible.
That is another matter. A consume membar. Your code does not use them. But, if you ever do, well, make sure a rogue compiler is not putting in an acquire barrier when it does not have to. I should have made another topic about it. Sorry for that.
Of course, I'm using it twice. Once after the first flag load and once after a failed CAS.
I'm using the fences properly and as minimal as possible.
On 1/4/2026 4:03 PM, Bonita Montero wrote:
I basically agree. But, your code logic had no need for a consume membar.
So, it's a moot point in this context. I only brought it up in case you
ever do need one. NOT an acquire. :^)
On 1/4/2026 4:01 PM, Bonita Montero wrote:
Sorry, the initialization happens only once and after that the flag cache line is read-shared.
What about that CAS in the loop?
On 05.01.2026 at 01:40, Chris M. Thomasson wrote:
What about that CAS in the loop?
It's only executed once.
On 1/5/2026 1:18 AM, Bonita Montero wrote:
On 05.01.2026 at 01:40, Chris M. Thomasson wrote:
What about that CAS in the loop?
It's only executed once.
Notice the loop?
On 05.01.2026 at 21:31, Chris M. Thomasson wrote:
Notice the loop?
Yes, but there's usually not so much contention that it is executed more than once.
Never know. You are exposing your algo to user code. God knows what it
might do.
On 07.01.2026 at 01:58, Chris M. Thomasson wrote:
Never know. You are exposing your algo to user code. God knows what it might do.
The xonce_flag flips to true only once, no matter how much user code is surrounding that flag.
On 1/6/2026 9:56 PM, Bonita Montero wrote:
The xonce_flag flips to true only once, no matter how much user code is surrounding that flag.
Forget about the flips for a moment. How many times does a LOCK CMPXCHG hit it?
On 07.01.2026 at 07:14, Chris M. Thomasson wrote:
Forget about the flips for a moment. How many times does a LOCK CMPXCHG hit it?
Once in 99.999% of all cases.
What if somebody uses a callable that throws all the time? ;^) Kidding,
but I still would align and pad the flag. But, that's just me. In the name
of "proper" hygiene...
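Both positions on the CAS count can be checked with a simplified, sequential model of the flag's state machine. This is not the posted xcall_once: the names are hypothetical, waiting and notification are left out, and it is only meant for calls made one after another from a single thread.

```cpp
#include <atomic>
#include <cassert>

// Simplified model. state: 0 = uninitialized, -1 = in progress, 1 = done.
struct counting_once_flag {
    std::atomic<signed char> state{0};
    int cas_attempts = 0;  // how often the read-modify-write was executed
};

template <typename F>
bool counted_call_once(counting_once_flag &f, F init) {
    signed char ref = f.state.load(std::memory_order_acquire);
    if (ref > 0)
        return true;       // fast path: a plain load, no RMW on the line
    ++f.cas_attempts;
    if (!f.state.compare_exchange_strong(ref, -1, std::memory_order_acquire,
                                         std::memory_order_relaxed))
        return false;      // cannot happen sequentially; real code waits and retries
    bool ok = init();
    f.state.store(ok ? 1 : 0, std::memory_order_release);
    return ok;
}

// Calls the model `calls` times with an initializer that always returns
// `init_result`, then reports how many CAS operations were executed.
inline int cas_count_after(int calls, bool init_result) {
    counting_once_flag flag;
    for (int i = 0; i < calls; ++i)
        counted_call_once(flag, [init_result] { return init_result; });
    return flag.cas_attempts;
}
```

With an initializer that succeeds, the CAS runs once and every later call takes the load-only path. With an initializer that fails every time, which is the sequential analogue of a callable that always throws, every call executes the CAS again.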