当前位置：首页 > article >正文

list底层的简单实现（万字长文详解！）

article 2024/11/15 1:36:41

list底层的简单实现

文章目录

list底层的简单实现
- list_node的实现！
- - list_node的构造函数
- list的迭代器！——重点！
- - list迭代器的成员变量
  - 迭代器的构造函数
  - * 重载
  - 前置++ 重载
  - 后置++ 重载
  - 前置-- 重载
  - 后置-- 重载
  - != 重载
  - == 重载
  - -- 重载
- list的const迭代器——重点
- 迭代器之-> 重载——重点
- list的成员变量
- list的构造函数（无参）
- insert
- begin
- end
- const版本的begin
- const版本的end
- push_back
- - 原始的写法
  - 复用的写法
- push_front
- swap
- 构造函数（带参）
- 拷贝构造
- - 传统写法
  - 现代写法
- 赋值重载
- - 传统写法
  - 现代写法
- erase
- pop_front
- pop_back
- clear
- 析构函数
- size
- empty
代码展示

list_node的实现！

要实现链表我们首先就要实现节点的结构体！

struct list_node
{
	list_node* _next;
	list_node* _prev;
	T _data;
};

list_node的构造函数

list_node(const T& x)
	:_data(x)，
	_next(nullptr),
	_prev(nullptr)
{
}

list的迭代器！——重点！

任何一个容器的迭代器都是内嵌类型！都要受到域的影响！

list的迭代器和string和vector不一样，并不是一个原生的指针！
typedef node* iterator;
为什么我们不可以像上面怎么写呢？

因为我们要求迭代器行为要类似指针！——1.解引用得到数据 2.支持++/–

像是我们解引用，我们就要得到这个节点的数据！但是如果我们将上面的这个迭代器解引用了我们其实得到的是一个节点！我们自己还得去x.data用来获得数据！，这就不符合迭代器的定义了！

因为节点本身不是连续的！所以上面的迭代器也不支持++/–

以前string和vector能使用原生指针的原因是因为存储的物理结构都是连续的！所以可以直接支持原生指针能完成迭代器行为！

所以list的迭代器是通过类的封装+函数重载实现的！

用函数重载来实现相似的行为！

list迭代器的成员变量

template<class T>
struct __list_iterator
{
	typedef list_node<T> node;
	typedef __list_iterator<T> self;//方便添加模板参数
	node* _pnode;
}

迭代器的构造函数

__list_iterator(node* p)
	:_pnode(p)
{}

使用一个节点用来构造一个迭代器

* 重载

T& operator*()
{
	return _pnode->_data;
}

因为我们一般都是解引用后可以修改数据的值！

所以我们要引用返回

前置++ 重载

self& operator++()
{
	_pnode = _pnode->_next;
	return *this;
}

这个是迭代器++ 要返回的是这个迭代器的类型__list_iterator！

不要忘记加上否则就是类型不完全！也不是node*

还是一个前置++ ，返回引用可以减少拷贝构造

后置++ 重载

self operator++(int)
{
	node* cur = _pnode;
	_pnode = _pnode->_next;

	return self(cur);
}

后置的定义为先用后 ++

本质就是一个先将原来的节点构造成一个迭代器然后返回

本身再移动到下一个节点！

实际上使用的是我们返回后的新创建的迭代器！

而不是原来的迭代器

前置-- 重载

self& operator--()
{
	_pnode = _pnode->_prev;
	return *this;
}

后置-- 重载

self operator--(int)
{
	node* cur = _pnode;
	_pnode = _pnode->_prev;
	return self(cur);
}

!= 重载

bool operator!=(const self& it)const
{
	return _pnode != it._pnode;
}

虽然传的参数类型是__list_iterator

但是实际上比较的是他们里面封装的pnode

== 重载

bool operator==(const self& it)const
{
	return _pnode == it._pnode;
}

– 重载

self& operator--()
{
	_pnode = _pnode->_prev;
	return *this;
}

list的const迭代器——重点

像是string和vector因为其迭代器的本质就是原生指针！所以const迭代器其实也是十分的简单的！但是list不一样！它不是一个原生的指针！是以一个内部类！这就导致了一些问题

如果我们仅仅在普通的迭代器前面加上const
const list<int>iterator cit;
这个const保护的其实是cit本身！

而不是cit指向的内容！
const T* p1;
T* const p2;
上面的行为要类似于p1，而不是p2

p1是本身可以修改！但是p1指向的内容却是不可修改的！
为了能够实现我们预想的行为我们真正要对修改的是返回值！

像是对于 *的重载我们原先返回的是它的引用

现在我们应该返回的是它的const T& 从而来进行权限的缩写！

那么我们是不是可以写一个如下的代码呢
const T operator*()const
{
	return _pnode->data;
}
让const调用这个const重载让非const调用非const重载

不行！我们不能对解引用进行重载！

为什么？答案是虽然上面的都可行了但是我们会发现const的对象无法去调用++了

const的对象无法去调用非const的成员函数

那我们如果实现一个const版本的++呢？
self& operator++()const
{
	_pnode = _pnode->_next;
	return *this;
}
可行吗？当然是不可行的！因为这个const修饰的是this指针！this指针指向的内容就无法修改了！就没办法做到修改pnode了！

那如何解决这个问题呢？——就是不要让迭代器本身const！

方法1 ——重新写一个类这两个类的唯一区别就是他们的解引用是不一样的！

一个返回引用，一个返回常引用其他都是完全一致的！
//迭代器
template<class T>
struct __list_const_iterator
{
	typedef list_node<T> node;
	node* _pnode;

	__list_iterator(node* p)
		:_pnode(p)
	{}
	const T operator*()const
	{
		return _pnode->data;
	}
	//唯一的区别返回常引用
	__list_iterator<T>& operator++();
	__list_iterator<T> operator++(int);
	__list_iterator<T>& operator--();
	__list_iterator<T> operator--(int);
	bool operator!=(const __list_iterator& it);
	bool operator==(const __list_iterator& it);
    Ptr operator->()
};
然后使用这个类去实现const版本的list函数调用接口例如begin和end
typedef __list_const_iterator const_iterator
const_iterator begin()const
{
	return const_iterator(_head->_next);
}
const_iterator end()const
{
	return const_iterator(_head);
}
然后我们就可以通过这个类来实现const的迭代器！使用的时候const和非const调用的就是两个完全不同的类了！

但是这样子实在是太繁琐了！为了一个不同而写一个类

所以有了方法二 ——使用类模板来实现！即使是同一个类模板但是使用的参数不一样的时候也就是完不同的类！
vector<int>;
vector<string>;
vector<vector<string>>:
虽然都是同一个类模板，这三种都是不同的类！
所以我们可以添加一个模板参数！
template<class T, class Ref>//Ref 引用
struct __list_iterator
{
	typedef list_node<T> node;
	typedef __list_iterator<T, Ref> self;//方便添加模板参数
	node* _pnode;

	__list_iterator(node* p)
		:_pnode(p)
	{}
	Ref operator*()//使用第二个模板参数
	{
		return _pnode->_data;
	}
	self& operator++();
	self operator++(int);
	self& operator--();
	self operator--(int);
	bool operator!=(const self& it)const;
	bool operator==(const self& it)const;
    Ptr operator->()
};
class list
{
	typedef list_node<T> node;
private:
	node* _head;
public:
	typedef __list_iterator<T, T&>  iterator;
	typedef __list_iterator<T, const T&> const_iterator;
}
typedef __list_iterator<T,T&>  iterator;//迭代器
typedef __list_iterator<T, const T&> const_iterator;//const 迭代器
这样子我们就通过复用来简洁的完成const的迭代器！

迭代器之-> 重载——重点

template<class T, class Ref,class Ptr>//多引入一个模板参数
struct __list_iterator
{
    typedef list_node<T> node;
	typedef __list_iterator<T, Ref,Ptr> self;
    Ptr operator->()
	{
		return &_pnode->_data;
	}
}

为什么我们需要重载 -> 呢？

当我遇到如下的如下的结构
struct pos
{
	int _row;
	int _col;
	pos(int row = 0, int col = 0)
	{
		_row = row;
		_col = col;
	}
};
当我们把它插入链表
void test_list()
{
	list<pos> lt;
	lt.push_back(pos(1, 2));
	lt.push_back(pos(2, 3));
	lt.push_back(pos(3, 4));
	lt.push_back(pos(3, 5));

	list<pos>::iterator it = lt.begin();
	while (it != lt.end())
	{
		cout << (*it)._col << " " << (*it)._row << endl;
		++it;
	}
}
如果我们想要访问那么就不得不先解引用！

但是我们一般在使用指针的时候都是使用 - > 更加符合我们的使用习惯！

但是当看到重载的时候或许会觉得很奇怪会什么要返回的是数据的指针呢？
it->_col;
//转换成函数调用
it.operator->();//返回值是一个地址一个pos* 类型的指针
那为什么是 pos* 类型的指针后面可以直接跟上成员变量呢？

其实这是因为编译器给我们进行了简化处理

真实的情况应该是 it->->_col转换成函数调用就是it.operator->()- > _col这样的情况但是这样子又不符合我们的日常使用的习惯和为了更好的可读性！所以编译器就简化了一个 ->
it->->_col;//当然了我们也不可以怎么使用就是了
it.operator->()->_col//但是如果显示的去调用的话我们也可以怎么使用
如果我们不引入第三个模板参数会怎么样
T* operator->()
{
	return &_pnode->_data;
}
看起来好像也毫无问题啊！

但是当我们遇到const对象的时候！
struct pos
{
	int _row;
	int _col;
	pos(int row = 0, int col = 0)
	{
		_row = row;
		_col = col;
	}
};
void print(const list<pos>& lt)
{
	list<pos>::const_iterator it = lt.begin();
	while (it != lt.end())
	{
        ++it->_col;
		cout << it->_col << " " << it->_row << endl;

		++it;
	}
    
}
void test_list()
{
	list<pos> lt;
	lt.push_back(pos(1, 2));
	lt.push_back(pos(2, 3));
	lt.push_back(pos(3, 4));
	lt.push_back(pos(3, 5));
}
我们发现const对象下仍然可以通过-> 来修改数据！这就和我们上面const迭代器讲的 * 的重载是一个道理！

对于const迭代器 - > 的返回值应该是一个 const指针！用来保护指针指向的内容！

所以我们才要引入第三个模板参数来给 - > 使用！或者就是再写一个类！
 typedef __list_iterator<T,T&,T>  iterator;//迭代器
 typedef __list_iterator<T, const T&,const T*> const_iterator;//const 迭代器
这样子迭代器才算是真正的完成了！

list的成员变量

class list
{
	typedef list_node<T> node;
public:
    typedef __list_iterator<T,T&,T>  iterator;//迭代器
    typedef __list_iterator<T, const T&,const T*> const_iterator;//const 迭代器
private:
	node* _head;
    size_t _size;
}

因为写list_node 不方便所以我们一般会进行重命名

list的构造函数（无参）

void empty_initialize()
{
	_head = new node(T());
	_head->_next = _head;
	_head->_prev = _head;
    _size = 0;
}
list()
{
	empty_initialize();
}

链表的底层是一个带头双向循环的链表

开始的头是一个哨兵位所以要new一个！

因为node 的构造函数是带参的构造，没有默认构造！

所以我们要给一个值进行初始化！不可以填0 什么的！

因为我们无法保证这个数据类型是一个内置类型！

所以我们要用T() 来进行初始化！如果是一个自定义类型那么T()就是一个匿名对象，就会去调用这个类的默认构造来初始化！

insert

iterator insert(iterator pos, const T& x = T())
{
	node* newnode = new node(x);
	node* cur = pos._pnode;//取到了这个节点的指针！
	node* prev = cur->_prev;

	newnode->_next = cur;
	newnode->_prev = prev;
	prev->_next = newnode;
	cur->_prev = newnode;
    ++_size;
	return iterator(cur);//返回当前这个新插入的元素的迭代器
}

begin

iterator begin()
{
	return iterator(_head->_next);
}

iterator是一个内部类！

假如有无参构造那么iterator() 就是一个匿名对象！

iterator(_head-> _next) 就是用 _head-> _next初始化这个对象！ begin() 是指向第一个元素的迭代器！

_head是哨兵位

也可以先创建一个对象然后返回！
iterator begin()
{
 	iterator it(_head->_next);
 	return it;
}

end

iterator end()
{
	return iterator(_head);
}

end指向的是最后一个元素的下一个元素！那么最后一个元素的下一个元素就是哨兵位！

const版本的begin

const_iterator begin()const
{
	return const_iterator(_head->_next);
}

const版本的end

const_iterator end()const
{
	return const_iterator(_head);
}

push_back

原始的写法

void push_back(const T& x)
{
	node* newnode = new node(x);
	node* tail = _head->_prev;
	tail->_next = newnode;
	newnode->_next = _head;
	newnode->_prev = tail;
	_head->_prev = newnode;
    ++_size;
}

要改变四个指针

newnode的prev和next

tail的next 和 head的prev！

其中head的prev要在要保存成tail才能能修改！否则newnode的prev就找不到原来链表的最后一个元素了了！

如果不保存tail就要最后修改_head->prev!

复用的写法

void push_back(const T& x)
{
    insert(end(), x);//在end的前面插入就是尾插
}

push_front

void push_front(const T& x)
{
	insert(begin(), x);//在begin的前面插入就是头插！
}

swap

void swap(list<T>& lt)
{
	std::swap(_head, lt._head);
    std::swap(_size, lt._size);
}

交换两个链表对象！

只要交换头结点和大小

构造函数（带参）

template<class InputIterator>
list(InputIterator first, InputIterator last)
{
	empty_initialize();
	while (first != last)
	{
		push_back(*first);
		++first;
	}
}

可以复用pushbak来完成一个带参的构造函数

拷贝构造

传统写法

list(const list<T>& lt)
{
	empty_initialize();
	//先初始化！创造哨兵位
	for (const auto& e : lt)//使用使用范围for之前要先实现const迭代器
	{
		push_back(e); //进行深拷贝
	}
	//范围for就是会自动的从lt里面取 其迭代器然后解引用得到T类型的数据
	//如果不使用auto&  一旦T是一个自定义类型 可能会多次发生深拷贝！极大的影响效率！
}

现代写法

list(const list<T>& lt)
{
    empty_initialize();
	list<T> temp(lt.begin(), lt.end());
	swap(temp);
}

现代写法是利用带参的构造来完成深拷贝！然后交换彼此的头结点来完成一次拷贝构造！

但是开始一定要完成一次初始化！

否则开始的我们我们要初始化的这个头结点是一个随机值，temp一旦出了作用域后就会立刻调用析构函数，析构一个野指针会导致程序崩溃！

或者我们能不能不能在初始化列表里面将_head初始化成nullptr呢？也不行！因为我们写的析构函数是默认有哨兵位的情况下进行析构的！所以必须进行初始化！

赋值重载

传统写法

//传统写法
iterator& operator=(const list<T>& lt)
{
	if (this != &lt)
	{
		clear();//每次赋值前都去清除掉原理的
		for (const auto& e : lt)
		{
			push_back(e);
		}
	}
    return *this;
}

现代写法

iterator& operator=(list<T> lt)
{
	swap(lt);
	return *this;
}

这个写法是利用了拷贝构造！传值传参会调用拷贝构造构造一个新的对象！lt

然后我们可以直接利用swap交换头结点！

而且出了作用域后lt也会调用析构函数来释放原来对象的资源！

erase

iterator erase(iterator pos)
{
	assert(pos != end());//不能等于哨兵位！
	node* cur = pos._pnode;
	node* next = cur->_next;
	node* prev = cur->_prev;

	prev->_next = next;
	next->_prev = prev;
	delete cur;
    --_size;
	return iterator(next);//返回删除的节点的下一个节点的迭代器
}

先把前后两个节点保存下来

然后改变前一个节点的next

改变后一个节点的prev

最后释放掉原来的节点！

pop_front

void pop_front()
{
	erase(begin());//删除掉头结点
}

pop_back

void pop_back()
{
	erase(--end());//end()是最后一个节点的下一个！所以要-- 是其指向最后一个节点！
}

clear

void clear()
{
	iterator it = begin();
	while (it != end())
	{
		it = erase(it);//erase之后不能++ 因为已经发生了迭代器失效!
	}
}

clear和析构的区别在于 clear不清头结点！

析构函数

~list()
{
	clear();
	delete _head;
	_head = nullptr;
}

析构函数记得要清楚头结点！

size

size_t size()const
{
	return _size;
}

关于list的size，因为有的源码可能不提供_size这个成员变量使用的是遍历一遍然后返回！时间复杂度为O(N)

所以使用list的size需要谨慎！

empty

bool empty()const
{
	return _size == 0;
}

也可以使用

bool empty()const
{
    return _head->_next == _head->_prev;
}

来判断是否为空

代码展示

#pragma once
#include<iostream>
#include<assert.h>
namespace My_STL
{
	template<class T>
	struct list_node
	{
		list_node* _next;
		list_node* _prev;
		T _data;
		list_node(const T& x)
			:_data(x),
			_next(nullptr),
			_prev(nullptr)
		{
		}

	};
	template<class T,class Ref,class Ptr>
	struct __list_iterator
	{
		typedef list_node<T> node;
		typedef __list_iterator<T, Ref, Ptr> self;
		node* _pnode;

		__list_iterator(node* p)
			:_pnode(p)
		{}

		Ref operator*()
		{
			return _pnode->_data;
		}
		Ptr operator->()
		{
			return &_pnode->_data;
		}

		self& operator++()
		{
			_pnode = _pnode->_next;
			return *this;
		}
		self operator++(int)
		{
			node* cur = _pnode;
			_pnode = _pnode->_next;

			return self(cur);
		}
		self& operator--()
		{
			_pnode = _pnode->_prev;
			return *this;
		}
		self operator--(int)
		{
			node* cur = _pnode;
			_pnode = _pnode->_prev;
			return self(cur);
		}
		bool operator!=(const self& it)const
		{
			return _pnode != it._pnode;
		}
		bool operator==(const self& it)const
		{
			return _pnode == it._pnode;
		}
		

	};


	template<class T>
	class list
	{
		typedef list_node<T> node;
	private:
		node* _head;
		size_t _size;
	public:
		typedef __list_iterator<T,T&,T>  iterator;
		typedef __list_iterator<T, const T&,const T*> const_iterator;
		void empty_initialize()
		{
			_head = new node(T());
			_head->_next = _head;
			_head->_prev = _head;
			_size = 0;
		}
		list()
		{
			empty_initialize();
		}
		template<class InputIterator>
		list(InputIterator first, InputIterator last)
		{
			empty_initialize();
			while (first != last)
			{
				push_back(*first);
				++first;
			}
		}
		void swap(list<T>& lt)
		{
			std::swap(_head, lt._head);
			std::swap(_size, lt._size);
		}

		list(const list<T>& lt)
		{
			empty_initialize();
			list<T> temp(lt.begin(), lt.end());
			swap(temp);
		}
		~list()
		{
			clear();
			delete _head;
			_head = nullptr;
		}
		iterator& operator=(list<T> lt)
		{
			swap(lt);
			return *this;
		}
		void clear()
		{
			iterator it = begin();
			while (it != end())
			{
				it = erase(it);
			}
		}
		iterator insert(iterator pos, const T& x = T())
		{
			node* newnode = new node(x);
			node* cur = pos._pnode;
			node* prev = cur->_prev;

			newnode->_next = cur;
			newnode->_prev = prev;
			prev->_next = newnode;
			cur->_prev = newnode;
			++_size;
			return iterator(cur);
		}

		iterator erase(iterator pos)
		{
			assert(pos != end());
			node* cur = pos._pnode;
			node* next = cur->_next;
			node* prev = cur->_prev;

			prev->_next = next;
			next->_prev = prev;
			delete cur;
			--_size;
			return iterator(next);
		}

		void push_back(const T& x)
		{
			insert(end(), x);
		}

		void push_front(const T& x)
		{
			insert(begin(), x);
		}
		void pop_front()
		{
			erase(begin());
		}
		void pop_back()
		{
			erase(--end());
		}
		iterator begin()
		{
			return iterator(_head->_next);
		}
		const_iterator begin()const
		{
			return const_iterator(_head->_next);
		}
		iterator end()
		{
			return iterator(_head);
		}

		const_iterator end()const
		{
			return const_iterator(_head);
		}
		size_t size()const
		{
			return _size;
		}
		bool empty()const
		{		
			//return _head->_next == _head->_prev;
			return _size == 0;
		}
	};

};