|  | rules for iteration invalidation in std::map and tr1::unorde |  | |
| | | Guest |  |
| Posted: Thu Sep 04, 2008 4:25 am Post subject: rules for iteration invalidation in std::map and tr1::unorde |  |
Hello, what is the rationale for having different rules for iterator invalidation in std::map and tr1::unordered_map? In tr1::unordered_map, iterators can also be invalidated by insert() operations. I thought unordered_map was supposed to externally behave as a map so that it could be used anywhere where a std::map would be expected, by simply changing a typedef in the source code. Why the committee did not consider adoption of SGI's hash_map instead (which does not have this issue). Thanks. -Marek
-- [ See LINK for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |
| |
| | | Guest |  |
| Posted: Thu Sep 04, 2008 8:13 am Post subject: Re: rules for iteration invalidation in std::map and tr1::u |  |
| |  | |
On Sep 3, 9:25 pm, marek.vond...@gmail.com wrote:
| Quote: | Hello, what is the rationale for having different rules for iterator invalidation in std::map and tr1::unordered_map? In tr1::unordered_map, iterators can also be invalidated by insert() operations. I thought unordered_map was supposed to externally behave as a map so that it could be used anywhere where a std::map would be expected, by simply changing a typedef in the source code. Why the committee did not consider adoption of SGI's hash_map instead (which does not have this issue). Thanks. -Marek
|
The implementation and performance necessitates it. std::map is implemented as a binary tree of some kind (generally a red-black tree), so all currently existing elements do not move in memory when you add a new element. You just add a new node, and update some links. unordered_map is implemented as a hash map. "Effectively", there's a big array of lists of elements. To maintain good performance, the hash map must grow this array of lists when a certain load factor / threshold is reached. Thus inserting an element could cause a reallocation, meaning all of the old iterators now point to memory which is no longer being used by the hash map.
-- [ See LINK for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |
| |
| | | Guest |  |
| Posted: Fri Sep 05, 2008 4:19 am Post subject: Re: rules for iteration invalidation in std::map and tr1::u |  |
| |  | |
| Quote: | "Effectively", there's a big array of lists of elements. To maintain good performance, the hash map must grow this array of lists when a certain load factor / threshold is reached. Thus inserting an element could cause a reallocation, meaning all of the old iterators now point to memory which is no longer being used by the hash map.
|
Yes, but the iterators are supposed to point to nodes of the individual lists. These nodes do not have to move in memory when the big array gets resized (nodes only get unlinked from their former lists and spliced to new lists). Moreover, one can additionally splice all nodes to a global list of all elements in the map so that stepping from an element to an element during enumeration (calling hash_map::iterator::operator++()) would be a constant-time operation, like in std::map.
Just for your information, both SGI and Dinkumware provide hash_map as an extension to the standard library. Both the implementations follow the same rules about iteration invalidation like std::map. In Dinkumware's implementation, hash_map::iterator::operator++() is not guaranteed to be O( 1 ) though. I do not know how iteration is implemented in SGI's hash_map.
-Marek
-- [ See LINK for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |
| |
| | | Guest |  |
| Posted: Fri Sep 05, 2008 10:58 am Post subject: Re: rules for iteration invalidation in std::map and tr1::u |  |
| |  | |
On Sep 4, 9:19 pm, marek.vond...@gmail.com wrote:
| Quote: | "Effectively", there's a big array of lists of elements. To maintain good performance, the hash map must grow this array of lists when a certain load factor / threshold is reached. Thus inserting an element could cause a reallocation, meaning all of the old iterators now point to memory which is no longer being used by the hash map.
Yes, but the iterators are supposed to point to nodes of the individual lists. These nodes do not have to move in memory when the big array gets resized (nodes only get unlinked from their former lists and spliced to new lists). Moreover, one can additionally splice all nodes to a global list of all elements in the map so that stepping from an element to an element during enumeration (calling hash_map::iterator::operator++()) would be a constant-time operation, like in std::map.
Just for your information, both SGI and Dinkumware provide hash_map as an extension to the standard library. Both the implementations follow the same rules about iteration invalidation like std::map. In Dinkumware's implementation, hash_map::iterator::operator++() is not guaranteed to be O( 1 ) though. I do not know how iteration is implemented in SGI's hash_map.
|
To do as you want, each node would have to store extra information. Each node would have a pointer to the hash map, know which bucket it's in, and know the offset into the bucket (or an equivalent). Probably 3 words worth of memory. This increases the memory footprint Big O (number of items) (and by extension increased cache misses, etc.). It also adds an amortized constant runtime cost to every insert (both to set this information on insert, and to reset all of this information on each grow).
I guess they decided this extra cost wasn't worth the iterator guarantee.
-- [ See LINK for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |
| |
| | | Guest |  |
| Posted: Fri Sep 05, 2008 10:28 pm Post subject: Re: rules for iteration invalidation in std::map and tr1::u |  |
| Quote: | To do as you want, each node would have to store extra information. Each node would have a pointer to the hash map, know which bucket it's in, and know the offset into the bucket (or an equivalent).
|
Sorry, no. You do not need this information. I have implemented this before.
-Marek
-- [ See LINK for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |
| |
| | | Guest |  |
| Posted: Sat Sep 06, 2008 12:20 pm Post subject: Re: rules for iteration invalidation in std::map and tr1:: |  |
| |  | |
On Sep 5, 3:28 pm, marek.vond...@gmail.com wrote:
| Quote: | To do as you want, each node would have to store extra information. Each node would have a pointer to the hash map, know which bucket it's in, and know the offset into the bucket (or an equivalent).
Sorry, no. You do not need this information. I have implemented this before.
|
I apologize. It was just an off the cuff comment. Now, it's late so I'm sorry if I'm missing the obvious quick solution. If I am, please enlighten me. Here's a partial list of the ways it can be done off the top of my head.
An iterator which becomes invalid after an insert could be - pointer to node. Nodes point to adjacent nodes. [Hash map updates links on insert.] - pointer to hash map, an int for which bucket, and an int for the offset into the bucket.
An iterator which stays valid after an insert could be - pointer to node. Nodes point to adjacent nodes. [Hash map updates links on insert and grow.] - pointer to node. Nodes point to their containing hash map. - pointer to node. Nodes point to their containing hash map. Nodes cache their hash value. - pointer to node. Nodes point to their containing hash map. Iterators cache the hash value. - pointer to node and pointer to hash map. - pointer to node and pointer to hash map. Nodes cache their hash value. - pointer to node and pointer to hash map. Iterators cache the hash value.
Some are obviously cheaper. Most involve cost comparisons involve a balancing of the cost to insert and to iterate and the memory footprint.
This appears to definitely be getting into the area where one should measure and not make a snap judgment. So, why does the TR1 hash map invalidate its iterators? I don't know. I'm sorry I answered. I should not have.
However, by forcing me to think about it, I have started to share you curiosity as to why it was done with iterators that are invalidated on insert.
-- [ See LINK for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] |
| |
|
|