当前位置：首页 > article >正文

unordered_map和unordered_set相关知识详细梳理

article 2024/10/27 11:00:39

0. 引言

在C++98中，STL提供了底层为红黑树结构的一系列关联式容器，在查询时效率可达到O( $log_2N$ )，即最差情况下需要比较红黑树的高度次，当树中的节点非常多时，查询效率也不理想。为了只进行很少的比较次数就能够将元素找到，因此在C++11中，STL又提供了4个unordered系列的关联式容器，这四个容器与红黑树结构的关联式容器使用方式基本类似，但查询效率更高。

1. unordered_map的介绍

        1. unordered_map是存储<key, value>键值对的关联式容器，其允许通过keys快速的索引到与其对应的value。
        2. 在unordered_map中，键值通常用于惟一地标识元素，而映射值是一个对象，其内容与此
键关联。键和映射值的类型可能不同。
        3. 在内部,unordered_map没有对<key, value>按照任何特定的顺序排序, 为了能在常数范围内找到key所对应的value，unordered_map将相同哈希值的键值对放在相同的桶中。
        4. unordered_map容器通过key访问单个元素要比map快，但它通常在遍历元素子集的范围迭代方面效率较低。
        5. unordered_maps实现了直接访问操作符(operator[])，它允许使用key作为参数直接访问
value。
        6. 它的迭代器至少是前向迭代器。

2. unordered_set的介绍

unordered_set 是一种无序集合，用于存储唯一元素。其特点是没有特定的顺序，并且能够快速地根据元素值进行检索。
在 unordered_set 中，元素的值即为其键，这个键唯一地标识元素。键是不可变的，因此一旦元素被插入，无法再更改其值，只能插入或删除元素。
unordered_set 内部没有排序。元素根据其哈希值分配到不同的桶（buckets）中，这种结构使得可以根据值直接访问单个元素。在平均情况下，这种访问的时间复杂度是常数级（O(1)）。
与有序集合（set）相比，unordered_set 在通过键直接访问元素时更为迅速。然而，在对部分元素进行范围遍历时，unordered_set 的效率通常较低。
unordered_set 的迭代器至少是前向迭代器，能够遍历集合中的元素，但遍历顺序不确定。

3. 底层数据结构

unordered系列的关联式容器之所以效率比较高，是因为其底层使用了哈希结构。

3.1 哈希的概念

顺序结构以及平衡树中，元素关键码与其存储位置之间没有对应的关系，因此在查找一个元素时，必须要经过关键码的多次比较。顺序查找时间复杂度为O(N)，平衡树中为树的高度，即
O( $log_2N$ )，搜索的效率取决于搜索过程中元素的比较次数。
理想的搜索方法：可以不经过任何比较，一次直接从表中得到要搜索的元素。
如果构造一种存储结构，通过某种函数(hashFunc)使元素的存储位置与它的关键码之间能够建立
一一映射的关系，那么在查找时通过该函数可以很快找到该元素

这个方法成为哈希（散列）方法，该方法中使用的转换函数称为哈希(散列)函数，构造出来的结构称为哈希表(Hash Table)(或者称散列表)。

举例进行说明：

现有数据{1，4，5，6，7，9}，和一顺序结构。

通过哈希函数hash（key）= key（关键字） % capacity（容器总容量）

建立映射关系，并将数据在对应位置填入。

但是显然会出现两个元素映射到同一个位置的情况，该如何应对呢？于是便有了哈希冲突的概念。

3.2 哈希冲突

不同关键字通过相同哈希哈数计算出相同的哈希地址，该种现象称为哈希冲突或哈希碰撞。

例如

把具有不同关键码而具有相同哈希地址的数据元素称为“同义词”。

在说明哈希冲突的解决办法前，先说说哈希函数。

3.3 哈希函数

引起哈希冲突的一个原因可能是：哈希函数设计不够合理。

所以说要解决哈希冲突，第一步要设计合理的哈希函数。

哈希函数设计原则：
1. 哈希函数的定义域必须包括需要存储的全部关键码，而如果散列表允许有m个地址时，其值域必须在0到m-1之间。

2. 哈希函数计算出来的地址能均匀分布在整个空间中。
3. 哈希函数应该比较简单。

常见的哈希函数实现：

1. 直接定址法（常见）：
取关键字的某个线性函数为散列地址：Hash（Key）= A*Key + B。

该方法得到的散列地址简单、均匀，但是需要事先知道关键字的分布情况。

使用场景：适合查找比较小且连续的情况

2. 除留余数法（常见）：

设散列表中允许的地址数为m，取一个不大于m，但最接近或者等于m的质数p作为除数，
按照哈希函数：Hash(key) = key% p(p<=m),将关键码转换成哈希地址

3. 平方取中法：

假设关键字为1234，对它平方就是1522756，抽取中间的3位227作为哈希地址；
使用场景：适合不知道关键字的分布，其位数又不太大的情况。

4. 折叠法：

折叠法是将关键字从左到右分割成位数相等的几部分(最后一部分位数可以短些)，然后将这
几部分叠加求和，并按散列表表长，取后几位作为散列地址。

使用场景：适合事先不需要知道关键字的分布，适合关键字位数比较多的情况。

5. 随机数法：

选择一个随机函数，取关键字的随机函数值为它的哈希地址，即H(key) = random(key),其中
random为随机数函数。

使用场景：通常应用于关键字长度不等时。

6. 数学分析法：

设有n个d位数，每一位可能有r种不同的符号，这r种不同的符号在各位上出现的频率不一定
相同，可能在某些位上分布比较均匀，每种符号出现的机会均等，在某些位上分布不均匀只
有某几种符号经常出现。可根据散列表的大小，选择其中各种符号分布均匀的若干位作为散
列地址。

使用场景：通常适合处理关键字位数比较大的情况，如果事先知道关键字的分布且关键字的
若干位分布较均匀的情况。

3.4 处理哈希冲突

要处理哈希冲突，一方面要设计合理的哈希函数，哈希函数设计得越精妙，产生哈希冲突的可能性就越低，但是哈希冲突是无法避免的，为了解决哈希冲突，还要从哈希的存储结构上做处理。

在结构设计上，解决哈希冲突两种常见的方法是：闭散列和开散列。

3.5 散列的扩容

显然当数据大小一定，哈希函数一定的情况下，表的容量越小，出现哈希冲突的可能性就越大，反之越小。所以说为了减少哈希冲突，要合理的规划散列表的容量，不能过小，也不能过大（会导致空间浪费）。

为了处理扩容的问题，我们可以引入负载因子的概念：

4 闭散列

4.1 闭散列原理

闭散列：也叫开放定址法，当发生哈希冲突时，如果哈希表未被装满，说明在哈希表中必然还有空位置，那么可以把key存放到冲突位置中的“下一个” 空位置中去。

关于“下一个”空位置的定义，又分为几种：

1. 线性探测：

从发生冲突的位置开始，依次向后探测，直到寻找到下一个空位置为止。

例如：

插入时：

查找时：

删除：

采用闭散列处理哈希冲突时，不能随便物理删除哈希表中已有的元素，若直接删除元素
会影响其他元素的搜索。比如删除元素4，如果直接删除掉，44查找起来可能会受影
响。因此线性探测采用标记的伪删除法来删除一个元素。

2. 二次探测

线性探测的缺陷是产生冲突的数据会堆积在一块，可能会使冲突不断累积，而为了避免该问题设计了二次探测。

假设哈希函数为 h(key)，初始哈希地址为 $h_0$ = h(key) % table_size，探测序列的公式如下：

$h_i$ = ( $h_0$ + $c_1 \times i^2$ ) % table_size

其中：

$h_0$ 是由哈希函数计算的初始位置。
i 表示第 i 次探测。
$c_1$ 是一个常数，通常为 1。
table_size 是哈希表的大小，通常取素数以减少冲突。

因此，探测位置的序列为：, , , ，依此类推。如果某位置已经被占用，二次探测会尝试下一个平方位置。

优点：

相比于线性探测减少了一次聚集的问题，探测分布较为均匀。

不需要额外的存储空间（相比链地址法）。

缺点：

当装载因子较大时（接近 1），二次探测性能会下降，因为空位变少。

仍然可能产生二次聚集，不能完全避免聚集问题。

实现上比线性探测稍复杂，适用于负载较低的哈希表。

4.2 闭散列模拟实现

以线性探测为例：

#include <string>
#include <vector>
#include <utility>
template<class K>
struct HashFunc
{
	size_t operator()(const K& key)
	{
		return (size_t)key;
	}
};

// 哈希表中支持字符串的操作(特化)
template<>
struct HashFunc<std::string>
{
	size_t operator()(const std:: string& key)
	{
		size_t hash = 0;
		for (auto e : key)
		{
			hash *= 31;
			hash += e;
		}

		return hash;
	}
};

//采用开放定址法，即线性探测
namespace open_address
{
	//表中位置状态
	enum State
	{
		EXIST,//已有元素
		EMPTY,//没有元素
		DELETE//元素被删除
	};

	template<class K,class V>
	struct HashData
	{
		HashData(const std::pair<K, V>& val = std::pair<K,V>())
			:_val(val)
			,_state(EMPTY)
		{}
		std::pair<K, V> _val;
		State _state;
	};
	
	template<class K,class V,class Hash =HashFunc<K>>
	class HashTable
	{
	public:
		HashTable()
		{
			_tables.resize(10);//调用vector的resize 
		}

		//无需写析构函数，用默认生成的析构函数自动调用vector的析构
		// 
		//插入数据
		bool Insert(const std::pair<K, V>& val)
		{
			//如果有重复元素，返回false
			if (Find(val.first))
				return false;
			//检查是否需要扩容
			Checkcapcity();
			HashFunc<K> Hashf;
			size_t hash = Hashf(val.first)%_tables.size();
			//走到这步只用考虑插入了
			while (_tables[hash]._state == EXIST)
			{
				++hash;
				hash %= _tables.size();
			}
			_tables[hash] = HashData<K, V>(val);
			_tables[hash]._state = EXIST;
			++_n;
			return true;
		}

		//查找
		HashData<K, V>* Find(const K& key)
		{
			HashFunc<K> Hashf;
			size_t hash = Hashf(key)%_tables.size();
			size_t pos;
			while (_tables[hash]._state != EMPTY)
			{
				//找到对应位置返回
				if (_tables[hash]._state == EXIST && _tables[hash]._val.first == key)
					return &_tables[hash];
				++hash;
				//若找到表尾，模后变为0
				hash %= _tables.size();
			}
			//未找到返回空指针
			return nullptr;
		}
		bool Erase(const K& key)
		{
			HashData<K, V>* find = Find(key);
			if (find)
			{
				//找到后将状态置为DELETE
				find->_state = DELETE;
				return true;
			}
			//没找到，返回false
			return false;
		}
	private:
		//检查是否需要扩容，若需要则扩容
		void Checkcapcity()
		{
			size_t oldsize = _tables.size();
			//如果负载因子大于等于 0.7，扩容
			if (_n * 10 / oldsize >= 7)
			{
				HashTable<K, V> newht;
				newht._tables.resize(_tables.size() * 2);
				for (size_t i = 0; i < oldsize; ++i)
				{
					if (_tables[i]._state == EXIST)
						newht.Insert(_tables[i]._val);
				}
				_tables.swap(newht._tables);
			}

		}
		std::vector<HashData<K, V>> _tables; //用vector顺序存储
		size_t _n = 0; //存储数据个数
	};
}

需要注意的是在插入操作时，不仅状态为空的位置可以插入元素，状态为被删除的位置也可以，在查找元素时，一定要查找到对应元素或者查找到空状态的位置才算结束。

5 开散列

5.1 开散列原理

开散列：开散列法又叫链地址法(开链法)，首先对关键码集合用散列函数计算散列地址，具有相同地址的关键码归于同一子集合，每一个子集合称为一个桶，各个桶中的元素通过一个单链表链接起来，各链表的头结点存储在哈希表中。

示意图：

插入：

查找：

删除：

5.2 开散列模拟实现

#pragma once

#include <string>
#include <vector>
#include <utility>

template<class K, class V>
struct HashNode
{
	std::pair<K, V> _val; //元素
	HashNode* _next; //指向下一个节点的指针
	HashNode(const std::pair<K,V>& val)
		:_val(val)
		,_next(nullptr)
	{}
};

template<class K>
struct HashFunc
{
	size_t operator()(const K& key)
	{
		return (size_t)key;
	}
};

//特化
template<>
struct HashFunc<std::string>
{
	size_t operator()(const std::string& key)
	{
		size_t hash = 0;
		for (auto e : key)
		{
			hash *= 31;
			hash += e;
		}

		return hash;
	}
};

namespace hash_bucket
{
	template<class K,class V,class Hash = HashFunc<K>>
	class HashTable
	{
		typedef HashNode<K, V> Node;
	public:
		//默认构造
		HashTable()
			:_n(0)
		{
			_table.resize(10);
		}

		// 析构函数
		~HashTable()
		{
			for (int i = 0; i < _table.size(); ++i)
			{
				Node* del = _table[i];
				Node* next = del == nullptr ? nullptr : del->_next;
				while (next)
				{
					delete del;
					del = next;
					next = del->_next;
				}
				delete del;
			}
		}

		//插入
		bool Insert(const std::pair<K,V>& data)
		{
			HashFunc<K> Hash;
			size_t hash = Hash(data.first)%_table.size();
			if (Find(data.first))//如果已有相同key的数据存在，插入失败 
				return false;
			CheckCapacity();//若需要则扩容
			//无论对应位置存不存在节点，头插
			HashNode<K, V>* newnode = new HashNode<K, V>(data);
			newnode->_next = _table[hash];
			_table[hash] = newnode;
			++_n;
			return true;
		}

		//查找
		std::pair<K,V>* Find(const K& key)
		{
			HashFunc<K> Hash;
			size_t hash = Hash(key) % _table.size();
			//如果hash位置存在数据，就在桶中查找
			Node* pcur = _table[hash];
			while (pcur && pcur->_val.first != key)
			{
				pcur = pcur->_next;
			}
			//判断是否找到
			if (pcur)
				return &pcur->_val;
			else
				return nullptr;
		}

		//删除指定元素
		bool Erase(const K& key)
		{
			HashFunc<K> Hash;
			size_t hash = Hash(key) % _table.size();
			Node* pcur = _table[hash];
			Node* prev = nullptr;
			while (pcur && pcur->_val.first != key)
			{
				prev = pcur;
				pcur = pcur->_next;
			}
			//如果找到了pcur
			if (pcur)
			{
				//如果prev存在
				if (prev)
				{
					prev->_next = pcur->_next;
				}
				//头删 
				else
				{
					_table[hash] = pcur->_next;
				}
				delete pcur;
				--_n;
				return true;
			}
			//如果没找pcur
			else
			{
				return false;
			}
		}
	private:
		//检查是否需要扩容，如需要则扩容
		void CheckCapacity()
		{
			//若负载因子大于7，则扩容
			if (_n * 10 / _table.size() >= 7)
			{
				HashTable<K, V> newht;
				newht._table.resize(_table.size() * 2);
				for (size_t i = 0; i < _table.size(); ++i)
				{
					//如果节点指针不为空
					Node* pcur = _table[i];
					while (pcur)
					{
						newht.Insert(pcur->_val);
						pcur = pcur->_next;
					}
				}
				_table.swap(newht._table);
			}
		}

		std::vector<Node*> _table;//指针数字
		size_t _n;//表中元素个数
	};
}

总体来说就是维护各个单链表的增删查改，这里用的单链表，因为头插操作更简单，而且也不需要维护头节点。

6. 封装unordered_set

无论是unordered_set还是unordered_map，这里和STL库保持一致，底层都用开散列。

首先要加入新的模板参数，KeyofValue来让开散列可以同时兼容unordered_set和unordered_map两种结构。（KeyofValue需传入获得pair键值对的key值的仿函数类型）。同时，所有设计key值比较的地方，都要更改为调用仿函数获取key值再进行比较。

然后要实现开散列的迭代器，方便上层封装时直接调用底层接口。

主体结构：

	template<class K,class V,class KeyofValue,class Hash>
	class HashTable
	{
		//声明友元
		template<class K, class V, class Ptr, class Ref, class KeyofValue, class Hash>
		friend struct HTIterator;

		
	public:
		typedef HashNode<K, V> Node;
		typedef HTIterator<K, V, V*, V&, KeyofValue, Hash> Iterator;
		typedef HTIterator<K, V, const V*, const V&, KeyofValue, Hash> Const_Iterator;
		//默认构造
		HashTable()
			:_n(0)
		{
			_table.resize(10);
		}

		// 析构函数
		~HashTable()
		{
			for (int i = 0; i < _table.size(); ++i)
			{
				Node* del = _table[i];
				Node* next = del == nullptr ? nullptr : del->_next;
				while (next)
				{
					delete del;
					del = next;
					next = del->_next;
				}
				delete del;
			}
		}

		//迭代器
		Iterator Begin()
		{
			if (_n == 0)
				return End();
			for (size_t i = 0; i < _table.size(); ++i)
			{
				if (_table[i])
					return Iterator(_table[i],this);
			}
			return End();
		}

		Iterator End()
		{
			return Iterator( nullptr,this);
		}

		Const_Iterator Begin() const
		{
			if (_n == 0)
				return End();
			for (size_t i = 0; i < _table.size(); ++i)
			{
				if (_table[i])
					return Const_Iterator(_table[i], this);
			}
			return End();
		}

		Const_Iterator End() const
		{
			return Const_Iterator(nullptr,this);
		}


		//插入
		std::pair<Iterator,bool> Insert(const V& data)
		{
			KeyofValue kov;
			HashFunc<K> Hash;
			Iterator it = Find(kov(data));
			if (it._node)//如果已有相同key的数据存在，插入失败
			{
				return { Iterator(it._node, this),false };
			}
			CheckCapacity();//若需要则扩容
			size_t hash = Hash(kov(data)) % _table.size();
			//无论对应位置存不存在节点，头插
			HashNode<K, V>* newnode = new HashNode<K, V>(data);
			newnode->_next = _table[hash];
			_table[hash] = newnode;
			++_n;
			return { Iterator(newnode, this),true};
		}

		//查找
		//std::pair<K,V>* Find(const K& key)
		Iterator Find(const K& key)
		{
			KeyofValue kov;
			HashFunc<K> Hash;
			size_t hash = Hash(key) % _table.size();
			//如果hash位置存在数据，就在桶中查找
			Node* pcur = _table[hash];
			while (pcur && kov(pcur->_val) != key)
			{
				pcur = pcur->_next;
			}
			判断是否找到
			//if (pcur)
			//	return &pcur->_val;
			//else
			//	return nullptr;
			return Iterator(pcur,this);
		}

		Const_Iterator Find(const K& key) const
		{
			KeyofValue kov;
			HashFunc<K> Hash;
			size_t hash = Hash(key) % _table.size();
			//如果hash位置存在数据，就在桶中查找
			Node* pcur = _table[hash];
			while (pcur && kov(pcur->_val) != key)
			{
				pcur = pcur->_next;
			}
			判断是否找到
			//if (pcur)
			//	return &pcur->_val;
			//else
			//	return nullptr;
			return Const_Iterator(pcur, this);
		}

		//删除指定元素
		bool Erase(const K& key)
		{
			KeyofValue kov;
			HashFunc<K> Hash;
			size_t hash = Hash(key) % _table.size();
			Node* pcur = _table[hash];
			Node* prev = nullptr;
			while (pcur && kov(pcur->_val) != key)
			{
				prev = pcur;
				pcur = pcur->_next;
			}
			//如果找到了pcur
			if (pcur)
			{
				//如果prev存在
				if (prev)
				{
					prev->_next = pcur->_next;
				}
				//头删 
				else
				{
					_table[hash] = pcur->_next;
				}
				delete pcur;
				--_n;
				return true;
			}
			//如果没找pcur
			else
			{
				return false;
			}
		}
	private:
		//检查是否需要扩容，如需要则扩容
		void CheckCapacity()
		{
			//若负载因子大于7，则扩容
			if (_n * 10 / _table.size() >= 7)
			{
				HashTable<K, V, KeyofValue,Hash> newht;
				newht._table.resize(_table.size() * 2);
				for (size_t i = 0; i < _table.size(); ++i)
				{
					//如果节点指针不为空
					Node* pcur = _table[i];
					while (pcur)
					{
						newht.Insert(pcur->_val);
						pcur = pcur->_next;
					}
				}
				_table.swap(newht._table);
			}
		}

		std::vector<Node*> _table;//指针数字
		size_t _n;//表中元素个数
	};

迭代器部分：

//前置声明
template<class K, class V, class KeyofValue, class Hash = HashFunc<K>>
class HashTable;

template<class K, class V,class Ptr,class Ref, class KeyofValue, class Hash>
struct HTIterator
{
	typedef HashNode<K, V> Node;
	typedef HTIterator<K, V, Ptr, Ref, KeyofValue, Hash> Self;
	
	HTIterator(Node* node, const HashTable<K, V, KeyofValue, Hash>* pht)
		:_node(node)
		,_pht(pht)
	{}

	Ref operator*()
	{
		return _node->_val;
	}

	Ptr operator->()
	{
		return &_node->_val;
	}

	bool operator!=(const Self& s)
	{
		return _node != s._node;
	}
	
	Self& operator++()
	{
		//判断当前桶是否还有节点
		if (_node->_next)
		{
			//当前桶还有节点
			_node = _node->_next;
		}
		else//找到下一个不为空的桶
		{
			KeyofValue kov;
			Hash hs;
			size_t hashi = hs(kov(_node->_val)) % (_pht->_table.size()) + 1;
			while (hashi < _pht->_table.size())
			{
				if (_pht->_table[hashi])
				{
					break;
				}
					++hashi;
			}
			if (hashi == _pht->_table.size())
			{
				//如果找到末尾
				_node = nullptr;
			}
			else
			{
				//没找到末尾就找到了
				_node = _pht->_table[hashi];
			}
		}
		return *this;
	}
	//成员变量
	Node* _node;
	const HashTable<K, V, KeyofValue, Hash>* _pht;
};

上层封装：

#pragma once

#include "Bucket_HashTable.h"

namespace kia
{
	template<class K,class Hash = HashFunc<K>>
	class unordered_set
	{
		struct SetKeyofV
		{
			const K& operator()(const K& key)
			{
				return key;
			}
		};
	public:
        //这里体现了底层结构的泛型设计
		typedef typename hash_bucket::HashTable<K, K, SetKeyofV, Hash>::Iterator iterator;
		typedef typename hash_bucket::HashTable<K, K, SetKeyofV, Hash>::Const_Iterator const_iterator;
		
		iterator begin()
		{
			return _ht.Begin();
		}

		iterator end()
		{
			return _ht.End();
		}

		const_iterator begin() const
		{
			return _ht.Begin();
		}

		const_iterator end() const
		{
			return _ht.End();
		}
		//插入
		std::pair<iterator, bool> insert(const K& key)
		{
			return _ht.Insert(key);
		}

		
		
		//查找
		iterator find(const K& key)
		{
			return _ht.Find(key);
		}

		//查找
		const_iterator find(const K& key) const
		{
			return _ht.Find(key);
		}

		bool erase(const K& key)
		{
			return _ht.Erase(key);
		}
	private:
		hash_bucket::HashTable<K, K, SetKeyofV, Hash> _ht;
	};
}

基本调用开散列的实现就可以了。

7. 封装unordered_map

关于开散列的变更，上面已经说过，不再赘述。

主体结构：

template<class K,class V,class KeyofValue,class Hash>
class HashTable
{
	//声明友元
	template<class K, class V, class Ptr, class Ref, class KeyofValue, class Hash>
	friend struct HTIterator;

	
public:
	typedef HashNode<K, V> Node;
	typedef HTIterator<K, V, V*, V&, KeyofValue, Hash> Iterator;
	typedef HTIterator<K, V, const V*, const V&, KeyofValue, Hash> Const_Iterator;
	//默认构造
	HashTable()
		:_n(0)
	{
		_table.resize(10);
	}

	// 析构函数
	~HashTable()
	{
		for (int i = 0; i < _table.size(); ++i)
		{
			Node* del = _table[i];
			Node* next = del == nullptr ? nullptr : del->_next;
			while (next)
			{
				delete del;
				del = next;
				next = del->_next;
			}
			delete del;
		}
	}

	//迭代器
	Iterator Begin()
	{
		if (_n == 0)
			return End();
		for (size_t i = 0; i < _table.size(); ++i)
		{
			if (_table[i])
				return Iterator(_table[i],this);
		}
		return End();
	}

	Iterator End()
	{
		return Iterator( nullptr,this);
	}

	Const_Iterator Begin() const
	{
		if (_n == 0)
			return End();
		for (size_t i = 0; i < _table.size(); ++i)
		{
			if (_table[i])
				return Const_Iterator(_table[i], this);
		}
		return End();
	}

	Const_Iterator End() const
	{
		return Const_Iterator(nullptr,this);
	}


	//插入
	std::pair<Iterator,bool> Insert(const V& data)
	{
		KeyofValue kov;
		HashFunc<K> Hash;
		Iterator it = Find(kov(data));
		if (it._node)//如果已有相同key的数据存在，插入失败
		{
			return { Iterator(it._node, this),false };
		}
		CheckCapacity();//若需要则扩容
		size_t hash = Hash(kov(data)) % _table.size();
		//无论对应位置存不存在节点，头插
		HashNode<K, V>* newnode = new HashNode<K, V>(data);
		newnode->_next = _table[hash];
		_table[hash] = newnode;
		++_n;
		return { Iterator(newnode, this),true};
	}

	//查找
	//std::pair<K,V>* Find(const K& key)
	Iterator Find(const K& key)
	{
		KeyofValue kov;
		HashFunc<K> Hash;
		size_t hash = Hash(key) % _table.size();
		//如果hash位置存在数据，就在桶中查找
		Node* pcur = _table[hash];
		while (pcur && kov(pcur->_val) != key)
		{
			pcur = pcur->_next;
		}
		判断是否找到
		//if (pcur)
		//	return &pcur->_val;
		//else
		//	return nullptr;
		return Iterator(pcur,this);
	}

	Const_Iterator Find(const K& key) const
	{
		KeyofValue kov;
		HashFunc<K> Hash;
		size_t hash = Hash(key) % _table.size();
		//如果hash位置存在数据，就在桶中查找
		Node* pcur = _table[hash];
		while (pcur && kov(pcur->_val) != key)
		{
			pcur = pcur->_next;
		}
		判断是否找到
		//if (pcur)
		//	return &pcur->_val;
		//else
		//	return nullptr;
		return Const_Iterator(pcur, this);
	}

	//删除指定元素
	bool Erase(const K& key)
	{
		KeyofValue kov;
		HashFunc<K> Hash;
		size_t hash = Hash(key) % _table.size();
		Node* pcur = _table[hash];
		Node* prev = nullptr;
		while (pcur && kov(pcur->_val) != key)
		{
			prev = pcur;
			pcur = pcur->_next;
		}
		//如果找到了pcur
		if (pcur)
		{
			//如果prev存在
			if (prev)
			{
				prev->_next = pcur->_next;
			}
			//头删 
			else
			{
				_table[hash] = pcur->_next;
			}
			delete pcur;
			--_n;
			return true;
		}
		//如果没找pcur
		else
		{
			return false;
		}
	}
private:
	//检查是否需要扩容，如需要则扩容
	void CheckCapacity()
	{
		//若负载因子大于7，则扩容
		if (_n * 10 / _table.size() >= 7)
		{
			HashTable<K, V, KeyofValue,Hash> newht;
			newht._table.resize(_table.size() * 2);
			for (size_t i = 0; i < _table.size(); ++i)
			{
				//如果节点指针不为空
				Node* pcur = _table[i];
				while (pcur)
				{
					newht.Insert(pcur->_val);
					pcur = pcur->_next;
				}
			}
			_table.swap(newht._table);
		}
	}

	std::vector<Node*> _table;//指针数字
	size_t _n;//表中元素个数
};

迭代器部分：

//前置声明
template<class K, class V, class KeyofValue, class Hash = HashFunc<K>>
class HashTable;

template<class K, class V,class Ptr,class Ref, class KeyofValue, class Hash>
struct HTIterator
{
	typedef HashNode<K, V> Node;
	typedef HTIterator<K, V, Ptr, Ref, KeyofValue, Hash> Self;
	
	HTIterator(Node* node, const HashTable<K, V, KeyofValue, Hash>* pht)
		:_node(node)
		,_pht(pht)
	{}

	Ref operator*()
	{
		return _node->_val;
	}

	Ptr operator->()
	{
		return &_node->_val;
	}

	bool operator!=(const Self& s)
	{
		return _node != s._node;
	}
	
	Self& operator++()
	{
		//判断当前桶是否还有节点
		if (_node->_next)
		{
			//当前桶还有节点
			_node = _node->_next;
		}
		else//找到下一个不为空的桶
		{
			KeyofValue kov;
			Hash hs;
			size_t hashi = hs(kov(_node->_val)) % (_pht->_table.size()) + 1;
			while (hashi < _pht->_table.size())
			{
				if (_pht->_table[hashi])
				{
					break;
				}
					++hashi;
			}
			if (hashi == _pht->_table.size())
			{
				//如果找到末尾
				_node = nullptr;
			}
			else
			{
				//没找到末尾就找到了
				_node = _pht->_table[hashi];
			}
		}
		return *this;
	}
	//成员变量
	Node* _node;
	const HashTable<K, V, KeyofValue, Hash>* _pht;
};

上层封装：

#pragma once

#include "Bucket_HashTable.h"

namespace kia
{
	template<class K,class V,class Hash = HashFunc<K>>
	class unordered_map
	{
		struct MapKeyofV
		{
			const K& operator()(const std::pair<K, V>& kv)
			{
				return kv.first;
			}
		};
	public:
        //对比此处和unordered_set的不同
		typedef typename hash_bucket::HashTable<K, std::pair<const K, V>, MapKeyofV, Hash>::Iterator iterator;	
		typedef typename hash_bucket::HashTable<K, std::pair<const K, V>, MapKeyofV, Hash>::Const_Iterator const_iterator;
		
		iterator begin()
		{
			return _ht.Begin();
		}

		iterator end()
		{
			return _ht.End();
		}

		const_iterator begin() const
		{
			return _ht.Begin();
		}

		const_iterator end() const
		{
			return _ht.End();
		}
		//插入
		std::pair<iterator, bool> insert(const std::pair<K, V>& kv)
		{
			return _ht.Insert(kv);
		}

		//[]重载
		V& operator[](const K& key)
		{
			std::pair<iterator, bool> ret = insert({key,V()});
			return ret.first->second;
		}
		
		//查找
		iterator find(const K& key)
		{
			return _ht.Find(key);
		}

		//查找
		const_iterator find(const K& key) const
		{
			return _ht.Find(key);
		}

		bool erase(const K& key)
		{
			return _ht.Erase(key);
		}
	private:
		hash_bucket::HashTable<K, std::pair<const K, V>,MapKeyofV, Hash> _ht;
	};
}

查看全文

http://www.kler.cn/news/367170.html