• Solving thundering herd with glibc...

    From Chris M. Thomasson@3:633/280.2 to All on Fri Apr 25 06:39:27 2025
    Well, I don't have any more time to mess around with this, but is Bonita right? Does glibc 100% solve _all_ thundering herd problems? I know
    about wait morphing; however, it is not a 100% solution.

    Thanks.

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From Bonita Montero@3:633/280.2 to All on Fri Apr 25 07:10:46 2025
    On 24.04.2025 at 22:39, Chris M. Thomasson wrote:
    Well, I don't have any more time to mess around with this, but is Bonita right? Does glibc 100% solve _all_ thundering herd problems? I know
    about wait morphing; however, it is not a 100% solution.

    With wait morphing, a thundering herd is impossible.
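
    (Wait morphing: when a thread signals a condition variable while it
    holds the associated mutex, the implementation can move the waiters
    from the condvar's wait queue directly onto the mutex's wait queue
    instead of waking them, so they are released one at a time as the
    mutex becomes free rather than all waking up to contend for it.)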


    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From Chris M. Thomasson@3:633/280.2 to All on Fri Apr 25 07:33:40 2025
    On 4/24/2025 2:10 PM, Bonita Montero wrote:
    On 24.04.2025 at 22:39, Chris M. Thomasson wrote:
    Well, I don't have any more time to mess around with this, but is
    Bonita right? Does glibc 100% solve _all_ thundering herd problems? I
    know about wait morphing; however, it is not a 100% solution.

    With wait morphing, a thundering herd is impossible.


    "there’s no thundering herd, ever!" because a controlled test didn't
    "show it" is like saying race conditions do not exist because your code "worked fine this time."? Fair enough?

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From Scott Lurndal@3:633/280.2 to All on Fri Apr 25 07:35:30 2025
    Reply-To: slp53@pacbell.net

    Bonita Montero <Bonita.Montero@gmail.com> writes:
    On 24.04.2025 at 22:39, Chris M. Thomasson wrote:
    Well, I don't have any more time to mess around with this, but is Bonita
    right? Does glibc 100% solve _all_ thundering herd problems? I know
    about wait morphing; however, it is not a 100% solution.

    With wait morphing, a thundering herd is impossible.

    Wait morphing was removed from glibc in 2016, if I recall correctly.

    In any case, a programmer should never assume that the
    run-time system will support wait morphing when writing
    portable code. Release the mutex before signaling the
    condition variable and avoid pthread_cond_broadcast unless
    absolutely necessary.
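
    A minimal sketch of that pattern with std::condition_variable; the
    queue, the names, and the single-item handoff are illustrative, not
    from anyone's posted code:

    #include <condition_variable>
    #include <iostream>
    #include <mutex>
    #include <queue>
    #include <thread>

    std::mutex mtx;
    std::condition_variable cv;
    std::queue<int> work;   // protected by mtx

    void produce( int item )
    {
        {
            std::lock_guard<std::mutex> lock( mtx );
            work.push( item );
        }                   // release the mutex first ...
        cv.notify_one();    // ... then signal, waking only one waiter

    }

    int consume()
    {
        std::unique_lock<std::mutex> lock( mtx );
        cv.wait( lock, [] { return !work.empty(); } );
        int item = work.front();
        work.pop();
        return item;
    }

    int main()
    {
        std::thread t( [] { std::cout << consume() << '\n'; } );
        produce( 42 );
        t.join();
    }

    Without wait morphing, notifying while the mutex is still held can
    wake a waiter that immediately blocks again on the mutex; notifying
    after the unlock avoids that extra round trip, and notify_one avoids
    waking waiters that would find nothing to do.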

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: UsenetServer - www.usenetserver.com (3:633/280.2@fidonet)
  • From Bonita Montero@3:633/280.2 to All on Fri Apr 25 16:28:08 2025
    On 24.04.2025 at 23:35, Scott Lurndal wrote:

    Wait morphing was removed from glibc in 2016, if I recall correctly.

    I've shown with my first ping-pong code that there are no wakeups
    beyond one per loop iteration. Try to find the bug; there isn't one.

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From Bonita Montero@3:633/280.2 to All on Fri Apr 25 18:37:04 2025
    On 24.04.2025 at 23:33, Chris M. Thomasson wrote:

    "there’s no thundering herd, ever!" because a controlled test didn't
    "show it" is like saying race conditions do not exist because your code "worked fine this time."? Fair enough?

    Yes, controlled test with 10'000 iterations.
    The code is correct and trivial, but too much for you.


    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From Bonita Montero@3:633/280.2 to All on Fri Apr 25 21:27:28 2025
    On 25.04.2025 at 10:37, Bonita Montero wrote:

    Yes, controlled test with 10'000 iterations.
    The code is correct and trivial, but too much for you.

    A thundering herd problem with a condvar should occur if I notify more
    than one thread and then unlock the mutex, but it actually doesn't
    happen; never. So the glibc condvar is optimal.


    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From Chris M. Thomasson@3:633/280.2 to All on Sat Apr 26 06:01:21 2025
    On 4/25/2025 4:27 AM, Bonita Montero wrote:
    On 25.04.2025 at 10:37, Bonita Montero wrote:

    Yes, controlled test with 10'000 iterations.
    The code is correct and trivial, but too much for you.

    A thundering herd problem with a condvar should occur if I notify more
    than one thread and then unlock the mutex, but it actually doesn't
    happen; never. So the glibc condvar is optimal.


    If you say so... Too busy right now. Perhaps sometime later on tonight.

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From Bonita Montero@3:633/280.2 to All on Sat Apr 26 16:26:21 2025
    On 25.04.2025 at 22:01, Chris M. Thomasson wrote:

    If you say so... Too busy right now. Perhaps sometime later on tonight.


    If there were a thundering herd problem with glibc's condvar, it
    would happen very often with my code, since I wake 31 threads at once
    on my machine.


    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From Bonita Montero@3:633/280.2 to All on Sat Apr 26 17:25:22 2025
    On 26.04.2025 at 08:26, Bonita Montero wrote:
    On 25.04.2025 at 22:01, Chris M. Thomasson wrote:

    If you say so... Too busy right now. Perhaps sometime later on tonight.


    If there were a thundering herd problem with glibc's condvar, it
    would happen very often with my code, since I wake 31 threads at once
    on my machine.

    I just tried waking all 31 threads from outside the mutex, i.e. after
    unlocking it, rather than from inside:

    for( size_t r = N; r; --r )
    {
        unique_lock lock( mtx );
        signalled = nClients;
        ai.store( nClients, memory_order_relaxed );
        lock.unlock();
        if( argc < 2 )
            cv.notify_all();
        else
            for( int c = nClients; c; cv.notify_one(), --c );
        bs.acquire();
    }

    The result: about 7,500 context switches per thread, not 3,000.

    10000 rounds,
    7498.06 context switches per thread

    So never signal a condvar to multiple threads from outside!
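
    For contrast, the "inside" variant, notifying before the unlock as
    in the earlier ping-pong test, presumably looked like this
    (reconstructed from the ordering in the full program posted later
    in this thread):

    for( size_t r = N; r; --r )
    {
        unique_lock lock( mtx );
        signalled = nClients;
        if( argc < 2 )
            cv.notify_all();    // notify while still holding the mutex
        else
            for( int c = nClients; c; cv.notify_one(), --c );
        ai.store( nClients, memory_order_relaxed );
        lock.unlock();
        bs.acquire();
    }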

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From Chris M. Thomasson@3:633/280.2 to All on Sun Apr 27 07:41:31 2025
    On 4/26/2025 12:25 AM, Bonita Montero wrote:
    On 26.04.2025 at 08:26, Bonita Montero wrote:
    On 25.04.2025 at 22:01, Chris M. Thomasson wrote:

    If you say so... Too busy right now. Perhaps sometime later on
    tonight.


    If there were a thundering herd problem with glibc's condvar, it
    would happen very often with my code, since I wake 31 threads at once
    on my machine.

    I just tried waking all 31 threads from outside the mutex, i.e. after
    unlocking it, rather than from inside:

    for( size_t r = N; r; --r )
    {
        unique_lock lock( mtx );
        signalled = nClients;
        ai.store( nClients, memory_order_relaxed );
        lock.unlock();
        if( argc < 2 )
            cv.notify_all();
        else
            for( int c = nClients; c; cv.notify_one(), --c );
        bs.acquire();
    }

    The result: about 7,500 context switches per thread, not 3,000.

    10000 rounds,
    7498.06 context switches per thread

    So never signal a condvar to multiple threads from outside!

    So, do that. It's your software. Do what you like. This is a very old
    debate. Take your contrived test and just roll with it. Whatever.

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From Bonita Montero@3:633/280.2 to All on Sun Apr 27 13:18:02 2025
    On 26.04.2025 at 23:41, Chris M. Thomasson wrote:

    On 4/26/2025 12:25 AM, Bonita Montero wrote:


    I just tried waking all 31 threads from outside the mutex, i.e. after
    unlocking it, rather than from inside:

    for( size_t r = N; r; --r )
    {
        unique_lock lock( mtx );
        signalled = nClients;
        ai.store( nClients, memory_order_relaxed );
        lock.unlock();
        if( argc < 2 )
            cv.notify_all();
        else
            for( int c = nClients; c; cv.notify_one(), --c );
        bs.acquire();
    }

    The result: about 7,500 context switches per thread, not 3,000.

    10000 rounds,
    7498.06 context switches per thread

    So never signal a condvar to multiple threads from outside!

    So, do that. It's your software. Do what you like. This is a very old debate. Take your contrived test and just roll with it. Whatever.

    There's nothing to debate; I've measured it: if you wake a single
    thread, it doesn't matter whether you wake it from inside or outside,
    but if you wake multiple threads, waking them from inside is several
    times faster.


    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From Chris M. Thomasson@3:633/280.2 to All on Wed Apr 30 05:26:41 2025
    On 4/26/2025 8:18 PM, Bonita Montero wrote:
    On 26.04.2025 at 23:41, Chris M. Thomasson wrote:

    On 4/26/2025 12:25 AM, Bonita Montero wrote:


    I just tried waking all 31 threads from outside the mutex, i.e. after
    unlocking it, rather than from inside:

    for( size_t r = N; r; --r )
    {
        unique_lock lock( mtx );
        signalled = nClients;
        ai.store( nClients, memory_order_relaxed );
        lock.unlock();
        if( argc < 2 )
            cv.notify_all();
        else
            for( int c = nClients; c; cv.notify_one(), --c );
        bs.acquire();
    }

    The result: about 7,500 context switches per thread, not 3,000.

    10000 rounds,
    7498.06 context switches per thread

    So never signal a condvar to multiple threads from outside!

    So, do that. It's your software. Do what you like. This is a very old
    debate. Take your contrived test and just roll with it. Whatever.

    There's nothing to debate; I've measured it: if you wake a single
    thread, it doesn't matter whether you wake it from inside or outside,
    but if you wake multiple threads, waking them from inside is several
    times faster.


    Yawn.

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From Vir Campestris@3:633/280.2 to All on Thu May 15 01:05:02 2025
    On 25/04/2025 09:37, Bonita Montero wrote:
    On 24.04.2025 at 23:33, Chris M. Thomasson wrote:

    "there’s no thundering herd, ever!" because a controlled test didn't
    "show it" is like saying race conditions do not exist because your
    code "worked fine this time."? Fair enough?

    Yes, controlled test with 10'000 iterations.
    The code is correct and trivial, but too much for you.

    Once upon a time I put a race in a bit of code.

    It took us 3 years to track down why our customers were reporting
    occasional faults :(

    (It turned out the trick to reproduce it was a combination of lots of
    CPU load and disk transfers not more than once every 30 seconds)

    --
    Do not listen to rumour, but, if you do, do not believe it.
    Gandhi.

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From Chris M. Thomasson@3:633/280.2 to All on Thu May 15 09:22:16 2025
    On 5/14/2025 8:05 AM, Vir Campestris wrote:
    On 25/04/2025 09:37, Bonita Montero wrote:
    On 24.04.2025 at 23:33, Chris M. Thomasson wrote:

    "there’s no thundering herd, ever!" because a controlled test didn't
    "show it" is like saying race conditions do not exist because your
    code "worked fine this time."? Fair enough?

    Yes, controlled test with 10'000 iterations.
    The code is correct and trivial, but too much for you.

    Once upon a time I put a race in a bit of code.

    It took us 3 years to track down why our customers were reporting
    occasional faults :(

    Oh shit. Damn. Did it "totally" deadlock once in a blue moon or just
    give out some bugged data to the customers? I hope it just deadlocked...
    Wow. Shit man. ;^o


    (It turned out the trick to reproduce it was a combination of lots of
    CPU load and disk transfers not more than once every 30 seconds)

    Yup. I have been there, done that with some of my tests. I even put a
    bug in on purpose just to see how long it would take to trigger it...
    Decades ago. ;^o Sigh.

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From David Brown@3:633/280.2 to All on Thu May 15 16:49:28 2025
    On 14/05/2025 17:05, Vir Campestris wrote:
    On 25/04/2025 09:37, Bonita Montero wrote:
    On 24.04.2025 at 23:33, Chris M. Thomasson wrote:

    "there’s no thundering herd, ever!" because a controlled test didn't
    "show it" is like saying race conditions do not exist because your
    code "worked fine this time."? Fair enough?

    Yes, controlled test with 10'000 iterations.
    The code is correct and trivial, but too much for you.

    Once upon a time I put a race in a bit of code.

    It took us 3 years to track down why our customers were reporting
    occasional faults :(

    (It turned out the trick to reproduce it was a combination of lots of
    CPU load and disk transfers not more than once every 30 seconds)


    It's always fun dealing with a bug that only triggers in rare timing situations. We once had a mistake in a timing table in a program that
    could sometimes result in intermittent faults in the system if
    particular events occurred at the same time, on the 30th of September.
    Finding an issue that occurred at most once per year was a challenge!


    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From candycanearter07@3:633/280.2 to All on Fri May 16 22:30:03 2025
    David Brown <david.brown@hesbynett.no> wrote at 06:49 this Thursday (GMT):
    On 14/05/2025 17:05, Vir Campestris wrote:
    On 25/04/2025 09:37, Bonita Montero wrote:
    On 24.04.2025 at 23:33, Chris M. Thomasson wrote:

    "there’s no thundering herd, ever!" because a controlled test didn't >>>> "show it" is like saying race conditions do not exist because your
    code "worked fine this time."? Fair enough?

    Yes, controlled test with 10'000 iterations.
    The code is correct and trivial, but too much for you.

    Once upon a time I put a race in a bit of code.

    It took us 3 years to track down why our customers were reporting
    occasional faults :(

    (It turned out the trick to reproduce it was a combination of lots of
    CPU load and disk transfers not more than once every 30 seconds)


    It's always fun dealing with a bug that only triggers in rare timing situations. We once had a mistake in a timing table in a program that
    could sometimes result in intermittent faults in the system if
    particular events occurred at the same time, on the 30th of September. Finding an issue that occurred at most once per year was a challenge!


    If you knew what date the issue was happening on, could you force the
    system clock to be on that day?
    --
    user <candycane> is generated from /dev/urandom

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: the-candyden-of-code (3:633/280.2@fidonet)
  • From Bonita Montero@3:633/280.2 to All on Fri May 16 23:24:17 2025
    On 14.05.2025 at 17:05, Vir Campestris wrote:

    It took us 3 years to track down why our customers were reporting
    occasional faults :(

    I guess that your code wasn't a 101-LOC test program but something
    with much more code.

    Here's the code where I stopped development:

    #if defined(_WIN32)
    #include <Windows.h>
    #endif
    #include <iostream>
    #include <thread>
    #include <mutex>
    #include <condition_variable>
    #include <atomic>
    #include <semaphore>
    #include <vector>
    #include <string_view>
    #include <cstdint>
    #if defined(__unix__)
    #include <sys/resource.h>
    #endif

    using namespace std;

    struct params   // declared but never used
    {
        params( unsigned argc, char **argv );
        bool outside, add, all;
    };

    int main( int argc, char **argv )
    {
        constexpr size_t N = 10'000;
        cout << N << " rounds" << endl;
        int hc = thread::hardware_concurrency(), nClients = hc - 1;
        int64_t tLast = 0;
        for( unsigned outside = 0; outside <= 1; ++outside )
        {
            cout << (outside ? "outside:" : "inside:") << endl;
            for( unsigned all = 0; all <= 1; ++all )
            {
                cout << (all ? "\tall:" : "\tone:") << endl;
                mutex mtx;
                int signalled = 0;
                condition_variable cv;
                atomic_int ai( 0 );
                binary_semaphore bs( false );
                vector<jthread> threads;
                atomic_int64_t nVoluntary( 0 );
                atomic_bool stop( false );  // set below, never read
                // client threads: wait to be signalled; the last one
                // through releases the semaphore for the main thread
                for( int c = nClients; c; --c )
                    threads.emplace_back( [&]
                    {
                        for( size_t r = N; r; --r )
                        {
                            unique_lock lock( mtx );
                            cv.wait( lock, [&] { return (bool)signalled; } );
                            --signalled;
                            lock.unlock();
                            if( ai.fetch_sub( 1, memory_order_relaxed ) == 1 )
                                bs.release( 1 );
                        }
    #if defined(__unix__)
                        rusage ru;
                        getrusage( RUSAGE_THREAD, &ru );
                        nVoluntary.fetch_add( ru.ru_nvcsw, memory_order_relaxed );
    #endif
                    } );
                for( size_t r = N; r; --r )
                {
                    auto notify = [&]
                    {
                        if( all )
                            cv.notify_all();
                        else
                            for( int c = nClients; c; cv.notify_one(), --c );
                    };
                    unique_lock lock( mtx );
                    signalled = nClients;
                    if( !outside )
                        notify();   // notify while holding the mutex
                    ai.store( nClients, memory_order_relaxed );
                    lock.unlock();
                    if( outside )
                        notify();   // notify after releasing the mutex
                    bs.acquire();
                }
                stop.store( true, memory_order_relaxed );
                threads.resize( 0 );    // join all jthreads
    #if defined(_WIN32)
                FILETIME ftDummy, ftKernel, ftUser;
                GetProcessTimes( GetCurrentProcess(), &ftDummy, &ftDummy,
                    &ftKernel, &ftUser );
                auto ftToU64 = []( FILETIME const &ft )
                    { return (uint64_t)ft.dwHighDateTime << 32 | ft.dwLowDateTime; };
                int64_t t = ftToU64( ftKernel ) + ftToU64( ftUser );
                cout << "\t\t" << (double)(t - tLast) / 1.0e7 << " seconds" << endl;
                tLast = t;
    #elif defined(__unix__)
                rusage ru;
                getrusage( RUSAGE_SELF, &ru );
                auto tvToU64 = []( timeval const &tv )
                    { return (uint64_t)tv.tv_sec * 1'000'000u + tv.tv_usec; };
                int64_t t = tvToU64( ru.ru_utime ) + tvToU64( ru.ru_stime );
                cout << "\t\t" << (double)nVoluntary.load( memory_order_relaxed ) /
                    nClients << " context switches per thread" << endl;
                cout << "\t\t" << (double)(t - tLast) / 1.0e6 << " seconds" << endl;
                tLast = t;
    #endif
            }
        }
    }
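
    For reference: this needs C++20 (jthread and <semaphore>), so on
    Linux something like

    g++ -std=c++20 -O2 -pthread cond_bench.cpp -o cond_bench

    should build it (the file name here is just a placeholder). Unlike
    the earlier fragment, which picked notify_one vs. notify_all from
    argc, this version iterates over all four inside/outside and one/all
    combinations itself.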

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From David Brown@3:633/280.2 to All on Fri May 16 23:33:10 2025
    On 16/05/2025 14:30, candycanearter07 wrote:
    David Brown <david.brown@hesbynett.no> wrote at 06:49 this Thursday (GMT):
    On 14/05/2025 17:05, Vir Campestris wrote:
    On 25/04/2025 09:37, Bonita Montero wrote:
    On 24.04.2025 at 23:33, Chris M. Thomasson wrote:

    "there’s no thundering herd, ever!" because a controlled test didn't >>>>> "show it" is like saying race conditions do not exist because your
    code "worked fine this time."? Fair enough?

    Yes, controlled test with 10'000 iterations.
    The code is correct and trivial, but too much for you.

    Once upon a time I put a race in a bit of code.

    It took us 3 years to track down why our customers were reporting
    occasional faults :(

    (It turned out the trick to reproduce it was a combination of lots of
    CPU load and disk transfers not more than once every 30 seconds)


    It's always fun dealing with a bug that only triggers in rare timing
    situations. We once had a mistake in a timing table in a program that
    could sometimes result in intermittent faults in the system if
    particular events occurred at the same time, on the 30th of September.
    Finding an issue that occurred at most once per year was a challenge!


    If you knew what date the issue was happening on, could you force the
    system clock to be on that day?

    Yes, once we figured out that the issue was date-dependent. For the
    first few years, all we knew was that we were getting occasional rare
    bug reports and no one saw the coincidence. (This was a program on DOS
    - changing the system clock was easy.)


    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (3:633/280.2@fidonet)
  • From candycanearter07@3:633/280.2 to All on Mon May 19 05:20:05 2025
    David Brown <david.brown@hesbynett.no> wrote at 13:33 this Friday (GMT):
    On 16/05/2025 14:30, candycanearter07 wrote:
    David Brown <david.brown@hesbynett.no> wrote at 06:49 this Thursday (GMT):
    On 14/05/2025 17:05, Vir Campestris wrote:
    On 25/04/2025 09:37, Bonita Montero wrote:
    On 24.04.2025 at 23:33, Chris M. Thomasson wrote:

    "there’s no thundering herd, ever!" because a controlled test didn't >>>>>> "show it" is like saying race conditions do not exist because your >>>>>> code "worked fine this time."? Fair enough?

    Yes, controlled test with 10'000 iterations.
    The code is correct and trivial, but too much for you.

    Once upon a time I put a race in a bit of code.

    It took us 3 years to track down why our customers were reporting
    occasional faults :(

    (It turned out the trick to reproduce it was a combination of lots of
    CPU load and disk transfers not more than once every 30 seconds)


    It's always fun dealing with a bug that only triggers in rare timing
    situations. We once had a mistake in a timing table in a program that
    could sometimes result in intermittent faults in the system if
    particular events occurred at the same time, on the 30th of September.
    Finding an issue that occurred at most once per year was a challenge!


    If you knew what date the issue was happening on, could you force the
    system clock to be on that day?

    Yes, once we figured out that the issue was date-dependent. For the
    first few years, all we knew was that we were getting occasional rare
    bug reports and no one saw the coincidence. (This was a program on DOS
    - changing the system clock was easy.)


    Ah, fair enough.
    --
    user <candycane> is generated from /dev/urandom

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: the-candyden-of-code (3:633/280.2@fidonet)