hsearch_r is close to my definition of decent, but it has huge
limitations: fixed size that needs to be estimated before use, lack of a delete primitive, and it can only use string keys.
Did you know that these "limitations" are hash table attributes in 1. y. computer science students book?
The most annoying bit of C, to me, is the lack of a decent hash table implementation.
hsearch_r is close to my definition of decent, but it has huge
limitations: fixed size that needs to be estimated before use, lack of a delete primitive, and it can only use string keys.
I recently stumbled into klib: https://attractivechaos.github.io/klib/
Klib has a nice header-only hash-table implementation: khash.h
To be honest, I'm not a huge fan of vendoring header-only libraries.
It is also a little hard to understand at first, since it makes heavy
use of cpp macros. Yet it definitely captured my attention.
What I like most:
- A reasonable API
- It is possible to redefine malloc/calloc/free before including the
ÿ header. This allows me to use a memory pool, if so I wish
- Proper handling of malloc failures (by contrast with glib2 and
ÿ uthash)
- It has a kh_resize() primitive, that I can use to pre-allocate the
ÿ table to the maximum size, if I want to avoid re-allocations
ÿ (of course I need to free up space if kh_size() == max)
So far I couldn't find any obvious shortcoming, so I thought of sharing
it with this NG, in case anyone missed it.
Beyond "[not] any obvious shortcoming"; have you practical experiences
with using that library?
The CPP macros is certainly something I despise. It seems to be used to provide some primitive form of genericity; is that observed correctly?
Generic programming in C. Except my khash.h, all the other C hashlibraries use (void*) to achieve generic typing. Using void* is okey for strings, but will cause overhead for integers. This is why all C
Are recent "C" standards (meanwhile) providing such functionality?
Quoting:
Generic programming in C. Except my khash.h, all the other C hash
libraries use (void*) to achieve generic typing. Using void* is okey for
strings, but will cause overhead for integers. This is why all C
libraries, except khash.h, is slower than C++ libraries on integer keys,
but close to on string keys.
W dniu 7.04.2026 oÿ23:38, highcrew pisze:
hsearch_r is close to my definition of decent, but it has huge
limitations: fixed size that needs to be estimated before use, lack of a
delete primitive, and it can only use string keys.
Did you know that these "limitations" are hash table attributes in 1. y. computer science students book?
Take unordered_map<>, that's ten times more comfortable than any C
solution.
On 4/7/2026 10:14 PM, ??Jacek Marcin Jaworski???? wrote:
W dniu 7.04.2026 oÿ23:38, highcrew pisze:
hsearch_r is close to my definition of decent, but it has huge
limitations: fixed size that needs to be estimated before use,
lack of a delete primitive, and it can only use string keys.
Did you know that these "limitations" are hash table attributes in
1. y. computer science students book?
FWIW:
Hash tables, like linked lists, are one of those things that pretty
much everyone rolls custom nearly every time they need one.
On the other hand, it's harder work. Exact requirements for Hash are not specified in easy-to-read language. I don't know about you, but for me personally it feels like going on mine field.
C. std::unordered_map<> can be surprising.
I think, it was you who showed us such case couple of years ago, when
you tried to use string literals as keys.
E. Compilation time. Slow.
Could not be a problem when unordered_map used in just few compilation
units. Could be serious problem otherwise.
On Fri, 10 Apr 2026 01:17:44 -0500
BGB <cr88192@gmail.com> wrote:
On 4/7/2026 10:14 PM, ??Jacek Marcin Jaworski?? wrote:
W dniu 7.04.2026 oÿ23:38, highcrew pisze:
hsearch_r is close to my definition of decent, but it has huge
limitations: fixed size that needs to be estimated before use,
lack of a delete primitive, and it can only use string keys.
Did you know that these "limitations" are hash table attributes in
1. y. computer science students book?
FWIW:
Hash tables, like linked lists, are one of those things that pretty
much everyone rolls custom nearly every time they need one.
Not that I disagree with many of your points below, but here you are
wrong. Hash tables are very *unlike* linked lists. Be sure that
overwhelming majority of professional programmers, including good ones,
never implemented a hash table themselves after they left school.
It seems, you mostly converse with people that share your extreme DIY attitude. Please realize that for better or worse the bigger world is different.
On Fri, 10 Apr 2026 01:17:44 -0500
BGB <cr88192@gmail.com> wrote:
On 4/7/2026 10:14 PM, ??Jacek Marcin Jaworski?? wrote:
W dniu 7.04.2026 oÿ23:38, highcrew pisze:
hsearch_r is close to my definition of decent, but it has huge
limitations: fixed size that needs to be estimated before use,
lack of a delete primitive, and it can only use string keys.
Did you know that these "limitations" are hash table attributes in
1. y. computer science students book?
FWIW:
Hash tables, like linked lists, are one of those things that pretty
much everyone rolls custom nearly every time they need one.
Not that I disagree with many of your points below, but here you are
wrong. Hash tables are very *unlike* linked lists. Be sure that
overwhelming majority of professional programmers, including good ones,
never implemented a hash table themselves after they left school.
It seems, you mostly converse with people that share your extreme DIY attitude. Please realize that for better or worse the bigger world is different.
On 10/04/2026 11:25, Michael S wrote:
On Fri, 10 Apr 2026 01:17:44 -0500
BGB <cr88192@gmail.com> wrote:
On 4/7/2026 10:14 PM, ??Jacek Marcin Jaworski?? wrote:
W dniu 7.04.2026 oÿ23:38, highcrew pisze:
hsearch_r is close to my definition of decent, but it has huge
limitations: fixed size that needs to be estimated before use,
lack of a delete primitive, and it can only use string keys.
Did you know that these "limitations" are hash table attributes in
1. y. computer science students book?
FWIW:
Hash tables, like linked lists, are one of those things that pretty
much everyone rolls custom nearly every time they need one.
Not that I disagree with many of your points below, but here you are
wrong. Hash tables are very *unlike* linked lists. Be sure that
overwhelming majority of professional programmers, including good ones,
never implemented a hash table themselves after they left school.
C programmers or all programmers? Since I'd guess the majority of the
latter wouldn't know *how* to implement a hash table, if they even know
what it is.
It seems, you mostly converse with people that share your extreme DIY
attitude. Please realize that for better or worse the bigger world is
different.
C encourages DIY code, by not having any standard libraries for many
things, by omitting certain language features, by being used to /
implement such/ things for other languages where they are built-in.
For hash tables however, I agree with BGB: a hash algorithm, once you've
got one that works well for your task, is a few lines of code. If self- implemented, rather than buried in a library with a clunky 'general
purpose' interface, it can be made very tight and efficient.
Hash tables have simple criteria: how long it takes to work out the hash function, and how many clashes arise. Those are easy to test and compare.
On 4/10/2026 5:25 AM, Michael S wrote:...
On Fri, 10 Apr 2026 01:17:44 -0500
BGB <cr88192@gmail.com> wrote:
FWIW:
Hash tables, like linked lists, are one of those things that pretty
much everyone rolls custom nearly every time they need one.
Not that I disagree with many of your points below, but here you are
wrong. Hash tables are very *unlike* linked lists. Be sure that
overwhelming majority of professional programmers, including good ones,
never implemented a hash table themselves after they left school.
It seems, you mostly converse with people that share your extreme DIY
attitude. Please realize that for better or worse the bigger world is
different.
?...
So, then are people just naively looping over arrays and using linear
search for everything? Etc.
On 2026-04-10 14:00, BGB wrote:
On 4/10/2026 5:25 AM, Michael S wrote:...
On Fri, 10 Apr 2026 01:17:44 -0500
BGB <cr88192@gmail.com> wrote:
FWIW:
Hash tables, like linked lists, are one of those things that pretty
much everyone rolls custom nearly every time they need one.
Not that I disagree with many of your points below, but here you are
wrong. Hash tables are very *unlike* linked lists. Be sure that
overwhelming majority of professional programmers, including good ones,
never implemented a hash table themselves after they left school.
It seems, you mostly converse with people that share your extreme DIY
attitude. Please realize that for better or worse the bigger world is
different.
?...
So, then are people just naively looping over arrays and using linear
search for everything? Etc.
How do you reach that conclusion from the above comment? All that you
can conclude from the above is that if they think they need a hash
table, they use one that someone else has already implemented, rather
than implementing one of their own. I personally have never implemented
a hash table, but I have used the hashing capability that are built into several of the tools I do use. The closest I have come to implementing
my own hash table is to help someone else fix the hash function they
were passing to a hashing routine. Their function was producing a ridiculously high number of collisions, because they hadn't put any
thought into the distribution of the data they were hashing.
[...]
There are few options sometimes.
Say, for example, you want to look over a list of strings and map them
to an ordinal number:
ÿ char *strings[]={
ÿ "first",
ÿ "second",
ÿ ...
ÿ NULL
ÿ };
How do you do it then, exactly?...
[...]
Granted, I don't use hash tables in this particular way more often (for small key/value dictionaries had more often ended up using certain types
of B-Trees or similar).
Though, ironically, despite being "technically" B-Trees, often a very different sort of structure from that typically used in filesystems or databases (which are often built around disk blocks and for associations between name-strings and data-blobs, rather than associating an ordinal
with a pointer or similar).
On 2026-04-11 06:34, BGB wrote:
[...]
There are few options sometimes.
Say, for example, you want to look over a list of strings and map them
to an ordinal number:
ÿÿ char *strings[]={
ÿÿ "first",
ÿÿ "second",
ÿÿ ...
ÿÿ NULL
ÿÿ };
How do you do it then, exactly?...
Static lists of strings will get incorporated presorted in programs.
(I use Unix sort(1) for that, either directly or using my editor.)
In case an empty array will get populated dynamically from unknown
input there's the option to insert them in a sorted order instead
of sorting them.
[...]
Granted, I don't use hash tables in this particular way more often
(for small key/value dictionaries had more often ended up using
certain types of B-Trees or similar).
You really mean B-Trees here, not binary trees? - Compared to binary
trees, B-Trees are at least a bit more complex to organize. And the
linked structures of dynamic tree structures require more effort and
care compared to a static hash array.
Though, ironically, despite being "technically" B-Trees, often a very
different sort of structure from that typically used in filesystems or
databases (which are often built around disk blocks and for
associations between name-strings and data-blobs, rather than
associating an ordinal with a pointer or similar).
In case I'd need a fast tree structure I'd rather (for specific cases)
use B*-Trees instead of B-Trees; they have better properties. Usually, though, I never needed these sorts of trees; I used binary or AVL in
some (very few) cases.
BTW, in early days I assumed that Awk realized its associative arrays
with trees, but Arnold told me that internally he uses a hash array (I
never checked). Obvious one variant that fits for all application cases.
Janis
On 4/10/2026 4:31 PM, James Kuyper wrote:
On 2026-04-10 14:00, BGB wrote:
On 4/10/2026 5:25 AM, Michael S wrote:...
On Fri, 10 Apr 2026 01:17:44 -0500
BGB <cr88192@gmail.com> wrote:
FWIW:
Hash tables, like linked lists, are one of those things that
pretty much everyone rolls custom nearly every time they need
one.
Not that I disagree with many of your points below, but here you
are wrong. Hash tables are very *unlike* linked lists. Be sure
that overwhelming majority of professional programmers, including
good ones, never implemented a hash table themselves after they
left school.
It seems, you mostly converse with people that share your extreme
DIY attitude. Please realize that for better or worse the bigger
world is different.
?...
So, then are people just naively looping over arrays and using
linear search for everything? Etc.
How do you reach that conclusion from the above comment? All that
you can conclude from the above is that if they think they need a
hash table, they use one that someone else has already implemented,
rather than implementing one of their own. I personally have never implemented a hash table, but I have used the hashing capability
that are built into several of the tools I do use. The closest I
have come to implementing my own hash table is to help someone else
fix the hash function they were passing to a hashing routine. Their function was producing a ridiculously high number of collisions,
because they hadn't put any thought into the distribution of the
data they were hashing.
There are few options sometimes.
Say, for example, you want to look over a list of strings and map
them to an ordinal number:
char *strings[]={
"first",
"second",
...
NULL
};
How do you do it then, exactly?...
One could use a "for()" loop, say:
for(i=0; strings[i]; i++)
if(!strcmp(strings[i], str))
break;
Which is, often OK.
What else, use a 3rd party library? Far likely, that is a bigger
pain... Well, and 3rd party library dependencies are often not worth
it.
Michael says, 'Hash tables are very unlike linked lists'. Actually,
they are very much like linked lists, so I agree with BGB, but
I tend to think in more abstract terms.
On 4/10/2026 11:00 AM, BGB wrote:
[...]
Have you ever experimented with Cantor pairing has a hash?
On Sat, 11 Apr 2026 20:06:43 +0100
HashTables <invalid@invalid.invalid> wrote:
Michael says, 'Hash tables are very unlike linked lists'. Actually,
they are very much like linked lists, so I agree with BGB, but
It sounds like you didn't understand what I was talking about.
The context was "where do professional C programmers tend to use
pre-written libraries vs coding the damn thing from the first
principles as it suits the situation at hand". For hash tables it's predominantly the former, for linked lists it's often enough the latter.
I tend to think in more abstract terms.
I tend not to.
Michael says, 'Hash tables are very unlike linked lists'.
Actually, they are very much like linked lists, [...]
HashTables <invalid@invalid.invalid> writes:
Michael says, 'Hash tables are very unlike linked lists'.
Actually, they are very much like linked lists, [...]
Hash tables are nothing like linked lists. Some kinds of hash
tables (but not all) use linked lists internally, but that has
nothing to do with their public interface. A function to sort an
array might choose internally to produce a linked list and then sort
the list before putting elements back into the array, but that
doesn't mean arrays are like linked lists. The same is true for
hash tables.
HashTables <invalid@invalid.invalid> writes:
Michael says, 'Hash tables are very unlike linked lists'.
Actually, they are very much like linked lists, [...]
Hash tables are nothing like linked lists. Some kinds of hash
tables (but not all) use linked lists internally, but that has
nothing to do with their public interface. A function to sort an
array might choose internally to produce a linked list and then sort
the list before putting elements back into the array, but that
doesn't mean arrays are like linked lists. The same is true for
hash tables.
[...]
In the original sense, the idea was that they were like linked lists in
the sense that a programmer may often tend write both as one-off things whenever they come up (as opposed to using some 3rd party "foo_linked_list.h" library or something).
[...]
On 4/13/2026 8:27 AM, Tim Rentsch wrote:
HashTables <invalid@invalid.invalid> writes:
Michael says, 'Hash tables are very unlike linked lists'.
Actually, they are very much like linked lists, [...]
Hash tables are nothing like linked lists.ÿ Some kinds of hash
tables (but not all) use linked lists internally, but that has
nothing to do with their public interface.ÿ A function to sort an
array might choose internally to produce a linked list and then sort
the list before putting elements back into the array, but that
doesn't mean arrays are like linked lists.ÿ The same is true for
hash tables.
Linked lists for hash collisions are fairly common?
| Sysop: | Tetrazocine |
|---|---|
| Location: | Melbourne, VIC, Australia |
| Users: | 14 |
| Nodes: | 8 (0 / 8) |
| Uptime: | 94:01:12 |
| Calls: | 211 |
| Files: | 21,502 |
| Messages: | 82,381 |