当前位置：首页 > article >正文

【C++】模拟实现hash_table(哈希表)

article 2024/10/8 21:05:25

🦄个人主页:修修修也

🎏所属专栏:实战项目集

⚙️操作环境:Visual Studio 2022

目录

一.了解项目功能

二.逐步实现项目功能模块及其逻辑详解

📌实现HashNode类模板

🎏构造HashNode类成员变量

🎏实现HashNode类构造函数

📌实现HashTable类模板

🎏构造HashTable类成员变量

🎏实现HashTable类构造函数

🎏实现HashTable类插入函数

🎏实现HashTable类查找函数

🎏实现HashTable类删除函数

🎏实现HashTable类析构函数

三.项目完整代码

test.cpp文件

HashTable.h文件

结语

一.了解项目功能

在本次项目中我们的目标是使用开散列的拉链法解决哈希冲突来实现一个哈希表模板,还不了解哈希表概念的朋友可以先移步[【数据结构】什么是哈希表(散列表)?],其结构图示如下:

哈希结点(HashNode)需要包含两个成员:键值对_kv,后继结点指针域_next。逻辑结构图示如下:

哈希表类模板提供的功能有:

哈希表结点类的构造函数
哈希表构造函数
哈希表的析构函数
哈希表的插入函数
哈希表的查找函数
哈希表的删除函数

二.逐步实现项目功能模块及其逻辑详解

通过第一部分对项目功能的介绍，我们已经对哈希表的功能有了大致的了解，虽然看似需要实现的功能很多，貌似一时间不知该如何下手，但我们可以分步分模块来分析这个项目的流程，最后再将各部分进行整合，所以大家不用担心，跟着我一步一步分析吧！

！！！注意，该部分的代码只是为了详细介绍某一部分的项目实现逻辑，故可能会删减一些与该部分不相关的代码以便大家理解，需要查看或拷贝完整详细代码的朋友可以移步本文第四部分。

📌实现HashNode类模板

🎏构造HashNode类成员变量

我们在一开始需求分析时就已经明确了哈希结点(HashNode)需要包含两个成员:键值对_kv,后继结点指针域_next.结点(RBTreeNode)逻辑结构图示如下:

综上所述,该部分代码如下:
template<class K, class V>
struct HashNode
{
	pair<K, V> _kv;
	HashNode<K, V>* _next;
};

🎏实现HashNode类构造函数

HashNode的构造函数我们实现两个即可,一个是有参构造,一个是无参构造,而无参构造又可以通过给缺省值的方式和有参构造合二为一,所以我们用初始化列表来实现一下HashNode的构造函数:
HashNode(const pair<K,V>& kv = pair<K,V>())
	:_kv(kv)
	,_next(nullptr)
{}

📌实现HashTable类模板

🎏构造HashTable类成员变量

HashTable类成员变量比较简单,底层的链表数组用vector来实现就行, 为了简化模板中出现的HashNode<K,V>类型的名字,我们将其简化命名为:Node。然后再设置一个变量_n来记录当前哈希表中有效元素个数, 方便我们后续扩容使用.

该部分代码如下:
template<class K, class V, class HashFunc = DefaultHashFanc<K>>//最后一个参数是哈希函数模板
class HashTable
{
	typedef HashNode<K, V> Node;
public:

private:
		vector<Node*> _table;	//指针数组
		size_t _n;     //有效元素个数
};

🎏实现HashTable类构造函数

HashTable类的构造函数非常简单,因为只有两个成员变量_table和_n。对于_table,最开始我们可以调用vector自带的接口来先初始化10个空间的大小方便使用,对于_n最开始肯定是置为0,综上,代码如下:
HashTable()
{
	_table.resize(10, nullptr);
    _n = 0;
}

🎏实现HashTable类插入函数

哈希表的插入逻辑比红黑树简单不少,简单来讲就是先使用哈希函数计算插入位置,然后在表里找对应位置的链表将新结点头插即可。但是在插入之前还有一些小细节,比如要先判断结点在不在哈希表中,如果在就不用插入了。还要判断哈希表的负载因子是否到达1,即哈希表中有效结点个数/哈希表的大小是否=1,如果等于1就需要进行哈希表扩容, 具体的扩容逻辑见代码注释。

综上,代码如下:
bool Insert(const pair<K, V>& kv)
{
    //检查结点是否在哈希表中,如果在就返回插入失败
	if (Find(kv.first))
	{
		return false;
	}

	HashFunc hf;

	//扩容逻辑:负载因子到1就扩容
	if (_n == _table.size())
	{
		size_t newSize = _table.size() * 2;
		//这里复用插入反而会在拷贝链表结点部分浪费资源,不如直接拿老链表的结点挂在新桶上
		vector<Node*> newTable;
		newTable.resize(newSize, nullptr);

		for (size_t i = 0; i < _table.size(); i++)
		{
			Node* cur = _table[i];
			while (cur)
			{
				Node* next = cur->_next;

				size_t hashi = hf(cur->_kv.first) % newSize;
				cur->_next = newTable[hashi];
				newTable[hashi] = cur;

				cur = next;
			}
		}
		_table.swap(newTable);
	}
	
    //用哈希函数计算插入位置 
	size_t hashi = hf(kv.first) % _table.size();

	//链表头插
	Node* newnode = new Node(kv);

	newnode->_next = _table[hashi];
	_table[hashi] = newnode;

	++_n;
	return true;
}

🎏实现HashTable类查找函数

开链法哈希表的查找逻辑很简单,就是按照哈希函数去算key的位置,然后将该位置的链表向后遍历查找该元素即可,代码如下:
Node* Find(const K& key)
{
	HashFunc hf;
	size_t hashi = hf(key) % _table.size();
	Node* cur = _table[hashi];
	while (cur)
	{
		if (cur->_kv.first == key)
		{
			return cur;
		}
		cur = cur->_next;
	}
	return nullptr;
}

🎏实现HashTable类删除函数

开链法哈希表的删除逻辑也很简单,就是按照哈希函数去算key的位置,然后将该位置的链表向后遍历查找该元素,找到之后按照链表结点的删除逻辑删除该结点即可,代码如下:
bool Erase(const K& key)
{
	HashFunc hf;
    //计算位置
	size_t hashi = hf(key) % _table.size();
	Node* prev = nullptr;
	Node* cur = _table[hashi];
    //查找并删除链表中的待删结点
	while (cur)
	{
		if (cur->_kv.first == key)
		{
			if (prev == nullptr)
			{
				_table[hashi] = cur->_next;
			}
			else
			{
				prev->_next = cur->_next;
			}
			delete cur;
			return true;
		}
		prev = cur;
		cur = cur->_next;
	}
	return false;
}

🎏实现HashTable类析构函数

哈希表的析构函数我们必须自己实现, 因为无论是vector的析构函数还是默认生成的都不能做到有效释放vector链表中的一个一个结点, 会导致内存泄漏, 所以我们需要自己手动实现.实现逻辑也不难, 逐一遍历哈希表然后逐一释放所有表中结点元素即可, 代码如下:
~HashTable()
{
	for (size_t i = 0; i < _table.size(); i++)
	{
		Node* cur = _table[i];
		while (cur)
		{
			Node* next = cur->_next;
			delete cur;
			cur = next;
		}
		_table[i] = nullptr;
	}
}

三.项目完整代码

我们将程序运行的代码分别在两个工程文件中编辑,完整代码如下:

test.cpp文件

该文件主要包含哈希表功能测试代码,可酌情参考.

#include"HashTable.h"

void test_openadd()
{
	open_address::HashTable<int, int> ht;
	int a[] = { 1,111,4,7,15,25,44,9 };
	for (auto e : a)
	{
		ht.Insert(make_pair(e, e));
	}

	auto ret = ht.Find(4);
	//ret->_kv.first = 40;
	ret->_kv.second = 400;

	//字符串做key. 利用仿函数,类模板的特化    相关算法BKDR Hash
	open_address::HashTable<string, string> dict;
	dict.Insert(make_pair("sort", "排序"));
	dict.Insert(make_pair("left", "xxx"));
	dict.Insert(make_pair("insert", "插入"));
	auto dret = dict.Find("left");

	dret->_kv.second = "左边";
}


int main()
{
	//test_openadd();
	hash_bucket::HashTable<int, int> ht;
	int a[] = { 1,111,4,7,15,25,44,9,14,27,24 };
	for (auto e : a)
	{
		ht.Insert(make_pair(e, e));
	}
	ht.Print();

	//字符串做key. 利用仿函数,类模板的特化    相关算法BKDR Hash
	hash_bucket::HashTable<string, string> dict;
	dict.Insert(make_pair("sort", "排序"));
	dict.Insert(make_pair("left", "xxx"));
	dict.Insert(make_pair("insert", "插入"));
	dict.Insert(make_pair("string", "字符串"));
	dict.Insert(make_pair("erase", "删除"));
	dict.Insert(make_pair("find", "查找"));
	auto dret = dict.Find("left");

	dret->_kv.second = "左边";

	dict.Print();

	return 0;
}

HashTable.h文件

该文件中还实现了必散列的线性探测法实现哈希表,和文中主要讲的开链法分别实现在两个命名空间中,供大家参考.

#pragma once
#include<iostream>
#include<vector>
#include<string>
using namespace std;

template<class K>
struct DefaultHashFanc
{
	size_t operator()(const K& key)
	{
		return (size_t)key;
	}
};

template<>
struct DefaultHashFanc<string>
{
	size_t operator()(const string& str)
	{
		size_t hash = 0;
		for (auto e : str)
		{
			hash *= 131;
			hash += e;
		}
		return hash;
	}
};

//必散列的线性探测法实现哈希表
namespace open_address
{
	enum STATE
	{
	EXIST,	//存在
	EMPTY,	//空
	DELETE	//删除
	};

	template<class K, class V>
	struct HashData
	{
		pair<K, V> _kv;
		STATE _state = EMPTY;
	};


	template<class K,class V,class HashFunc = DefaultHashFanc<K>>
	class HashTable
	{
	public:
	HashTable()
	{
		_table.resize(10);
	}


	bool Insert(const pair<K, V>& kv)
	{
		if (Find(kv.first))
		{
			return false;
		}
		//扩容
		if ((double)_n / _table.size() >= 0.7)
		{
			size_t newSize = _table.size() * 2;
			//不能简单的只括容量,还要重新映射
			HashTable<K, V, HashFunc> newHT;
			newHT._table.resize(newSize);

			//遍历旧表的数据插入到新表
			for (size_t i = 0; i < _table.size(); i++)
			{
				if (_table[i]._state == EXIST)
				{
					newHT.Insert(_table[i]._kv);
				}
			}
			_table.swap(newHT._table);
		}
		HashFunc hf;
		//算插入位置
		size_t hashi = hf(kv.first) % _table.size();
		
		//线性探测找插入位置

		while (_table[hashi]._state == EXIST)
		{
			++hashi;

			hashi %= _table.size();	//如果找到表尾,回到表头继续找
		}

		//插入数据
		_table[hashi]._kv = kv;
		_table[hashi]._state = EXIST;
		++_n;

		return true;
	}

	HashData<const K, V>* Find(const K& key)
	{
		HashFunc hf;
		//线性探测
		size_t hashi = hf(key) % _table.size();

		while (_table[hashi]._state != EMPTY)
		{
			if (_table[hashi]._state == EXIST && _table[hashi]._kv.first == key)
			{
				return (HashData<const K, V>*) & _table[hashi];
			}
			++hashi;
			hashi %= _table.size();	//如果找到表尾,回到表头继续找
		}

		return nullptr;
	}

	bool Erase(const K& key)
	{
		HashData<K, V>* ret = Find(key);

		if (ret)
		{
			ret->_state = DELETE;
			--_n;
			return true;
		}
		return false;
	}

	private:
	vector<HashData<K,V>> _table;
	size_t _n = 10;
	};
}

//开散列的拉链法实现哈希表
namespace hash_bucket
{
	template<class K, class V>
	struct HashNode
	{
		HashNode(const pair<K,V>& kv = pair<K, V>())
			:_kv(kv)
			,_next(nullptr)
		{}

		pair<K, V> _kv;
		HashNode<K, V>* _next;
	};

	template<class K, class V, class HashFunc = DefaultHashFanc<K>>
	class HashTable
	{
		typedef HashNode<K, V> Node;
	public:
		HashTable()
		{
			_table.resize(10, nullptr);
			_n = 0;
		}

		~HashTable()
		{
			for (size_t i = 0; i < _table.size(); i++)
			{
				Node* cur = _table[i];
				while (cur)
				{
					Node* next = cur->_next;
					delete cur;
					cur = next;
				}
				_table[i] = nullptr;
			}
		}


		bool Insert(const pair<K, V>& kv)
		{
			if (Find(kv.first))
			{
				return false;
			}
			HashFunc hf;
			//负载因子到1就扩容
			if (_n == _table.size())
			{
				size_t newSize = _table.size() * 2;
				//这里复用插入反而会在拷贝链表结点部分浪费资源,不如直接拿老链表的结点挂在新桶上
				vector<Node*> newTable;
				newTable.resize(newSize, nullptr);

				for (size_t i = 0; i < _table.size(); i++)
				{
					Node* cur = _table[i];
					while (cur)
					{
						Node* next = cur->_next;

						size_t hashi = hf(cur->_kv.first) % newSize;
						cur->_next = newTable[hashi];
						newTable[hashi] = cur;

						cur = next;
					}
				}
				_table.swap(newTable);
			}
			 
			size_t hashi = hf(kv.first) % _table.size();

			//链表头插
			Node* newnode = new Node(kv);

			newnode->_next = _table[hashi];
			_table[hashi] = newnode;

			++_n;
			return true;
		}

		void Print()
		{
			for (size_t i = 0; i < _table.size(); i++)
			{
				printf("[%d]->", i);
				Node* cur = _table[i];
				while (cur)
				{
					cout << cur->_kv.first << ":"<<cur->_kv.second <<"->";
					cur = cur->_next;
				}
				printf("NULL\n");
			}
		}


		Node* Find(const K& key)
		{
			HashFunc hf;
			size_t hashi = hf(key) % _table.size();
			Node* cur = _table[hashi];
			while (cur)
			{
				if (cur->_kv.first == key)
				{
					return cur;
				}
				cur = cur->_next;
			}
			return nullptr;
		}

		bool Erase(const K& key)
		{
			HashFunc hf;
			size_t hashi = hf(key) % _table.size();
			Node* prev = nullptr;
			Node* cur = _table[hashi];
			while (cur)
			{
				if (cur->_kv.first == key)
				{
					if (prev == nullptr)
					{
						_table[hashi] = cur->_next;
					}
					else
					{
						prev->_next = cur->_next;
					}
					delete cur;
					return true;
				}
				prev = cur;
				cur = cur->_next;
			}
			return false;
		}

	private:
		vector<Node*> _table;	//指针数组
		size_t _n; 
	};
}

结语

希望这篇哈希表(hash_table)的模拟实现详解能对大家有所帮助,欢迎大佬们留言或私信与我交流.

学海漫浩浩,我亦苦作舟!关注我,大家一起学习,一起进步!

相关文章推荐

【C++】模拟实现红黑树(RB-Tree)

【C++】模拟实现AVL树

【C++】模拟实现二叉搜索(排序)树

【C++】模拟实现priority_queue(优先级队列)

【C++】模拟实现queue

【C++】模拟实现stack

【C++】模拟实现list

【C++】模拟实现vector

【C++】模拟实现string类