
Linux C/C++ Programming: A Chance to Clean Up When a Thread Exits

article 2025/3/1 13:49:43

Recommended book: 《Linux C与C++一线开发实践（第2版）》 (Linux C and C++ Frontline Development Practice, 2nd Edition; Linux技术丛书), by 朱文伟 and 李建英.


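The previous sections covered thread termination. A thread that ends itself terminates normally, and this kind of exit is predictable. A thread that is ended at the request of another thread terminates passively; this exit is unpredictable and counts as abnormal termination. Either way, resources have to be released. Leaving aside exits caused by runtime errors, the problem to solve is how a thread can reliably release the resources it holds, locks in particular, when it terminates. A common case involves an exclusive lock: a thread locks a critical resource in order to access it, and is cancelled from outside while it still holds the lock. If the cancellation succeeds, the resource stays locked forever. Because cancellation from outside is unpredictable, we need a mechanism that simplifies the code for releasing resources, that is, a chance to run cleanup when the thread exits. Locks are covered later; for now it is enough to know that whoever takes a lock is responsible for releasing it, otherwise the program can deadlock.

Consider this scenario, in which thread 1 runs the following code: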

void *thread1(void *arg)
{
	pthread_mutex_lock(&mutex);  	// lock
	// call a blocking function, such as accept on a socket, which waits for a client to connect
	sock = accept(...);
	pthread_mutex_unlock(&mutex);	// unlock
	return NULL;
}

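In this example, when thread 1 calls accept it blocks (it waits there and returns only when a client connects or some failure occurs). While thread 1 is waiting, thread 2 decides that it has waited too long and wants to shut it down, so it calls pthread_cancel or a similar function to ask thread 1 to exit immediately. Thread 1 is still blocked in accept. Because accept is a cancellation point, thread 1 acts on the cancellation request there and terminates without ever reaching pthread_mutex_unlock(&mutex);. The lock is never released, and every other thread waiting for it is stuck forever. We therefore need a way to guarantee that cleanup work (mainly unlocking) still happens when a thread that has received a cancellation request exits abnormally, that is, before reaching its end.

For this purpose, the POSIX thread library provides pthread_cleanup_push and pthread_cleanup_pop, which let a thread do cleanup work when it exits. The two manage cleanup handlers on a last-in, first-out stack: the former pushes a function onto the cleanup stack, and the latter pops the handler at the top of the stack and runs it or not depending on its argument. Each further call to pthread_cleanup_push pushes the current top handler down, and on popping, the handler at the top comes off first. pthread_cleanup_push is declared as follows: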

void pthread_cleanup_push(void (*routine)(void *), void *arg);

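Here, routine is a function pointer and arg is the argument passed to that function. A cleanup handler pushed by pthread_cleanup_push runs in the following three cases:

(1) When the thread ends itself by calling pthread_exit. Leaving through return between the push and the pop is undefined behavior under POSIX; with glibc the handler still runs when the code is compiled as C++ with g++, as in the examples below, but typically not when it is compiled as C.

(2) When pthread_cleanup_pop is called with a nonzero argument.

(3) When the thread is cancelled by another thread, that is, when another thread calls pthread_cancel on it.

pthread_cleanup_pop is declared as follows: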

void pthread_cleanup_pop(int execute);

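Here, execute decides whether the handler at the top of the stack is run as it is popped: 0 means it is popped without being run, and any nonzero value means it is run. Note that pthread_cleanup_pop and pthread_cleanup_push must appear in matched pairs in the same function, at the same nesting level. They are usually implemented as macros, with pthread_cleanup_push opening a brace that pthread_cleanup_pop closes, so an unmatched call causes a compile error.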
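To make the effect of the execute argument concrete, here is a minimal sketch. The names count_call and handler_runs are made up for illustration and are not from the original text; the sketch assumes Linux with glibc and compilation as C++.

```cpp
#include <pthread.h>

// Hypothetical handler: increments the int that arg points to.
static void count_call(void *arg)
{
    ++*static_cast<int *>(arg);
}

// Pushes one handler and pops it again with the given execute flag.
// Returns how many times the handler ran (0 or 1).
int handler_runs(int execute)
{
    int calls = 0;
    pthread_cleanup_push(count_call, &calls);
    pthread_cleanup_pop(execute);   // paired with the push in the same block
    return calls;                   // the return comes after the pop
}
```

With this sketch, handler_runs(0) should give 0 and handler_runs(1) should give 1, since the handler runs only when the argument is nonzero.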

With these two functions, the thread 1 code above, which could cause a deadlock, can be rewritten as follows:

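void *thread1(void *arg)
{
	pthread_mutex_lock(&mutex); 			// lock first, so the handler never unlocks a mutex that is not held
	pthread_cleanup_push(clean_func,...); 	// push the cleanup handler clean_func
	// call a blocking function, such as accept on a socket, which waits for a client to connect
	sock = accept(...);

	pthread_mutex_unlock(&mutex);  			// unlock
	pthread_cleanup_pop(0); 				// pop the handler without running it, because the argument is 0
	return NULL;
}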

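In the code above, if the thread is cancelled by another thread while blocked in accept, clean_func is called automatically, and it can release the lock. If the thread is not cancelled, it carries on: at pthread_mutex_unlock(&mutex); it releases the lock itself, and at pthread_cleanup_pop(0); the handler clean_func pushed earlier is popped without being run (because the argument is 0). The flow is now safe.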
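The pattern can be tried end to end with a small sketch. Everything here is an assumption made for illustration: the names g_mutex, unlock_mutex, worker and lock_released_after_cancel are hypothetical, sleep stands in for the blocking accept call, and the handler body is only one plausible way to write clean_func.

```cpp
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;

// Hypothetical cleanup handler: releases the mutex passed through arg.
static void unlock_mutex(void *arg)
{
    pthread_mutex_unlock(static_cast<pthread_mutex_t *>(arg));
}

static void *worker(void *)
{
    pthread_mutex_lock(&g_mutex);
    pthread_cleanup_push(unlock_mutex, &g_mutex);
    sleep(60);                  // cancellation point, standing in for accept
    pthread_cleanup_pop(1);     // normal path: pop the handler and let it unlock
    return NULL;
}

// Cancels the worker while it is blocked, then checks that the mutex is free.
bool lock_released_after_cancel()
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return false;
    usleep(100000);             // give the worker time to take the lock and block
    pthread_cancel(tid);

    void *ret = NULL;
    pthread_join(tid, &ret);

    int rc = pthread_mutex_trylock(&g_mutex);   // 0 only if nobody holds the lock
    if (rc == 0)
        pthread_mutex_unlock(&g_mutex);
    return ret == PTHREAD_CANCELED && rc == 0;
}
```

Here the normal path uses pthread_cleanup_pop(1) so that a single handler does the unlocking in both cases; the version above, with an explicit unlock followed by pthread_cleanup_pop(0), is equally valid.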

[Example 8.17] Cleanup handler called when the thread ends itself

(1) Open Visual Studio Code, create a file named test.cpp, and enter the following code:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h> // strerror
 
void mycleanfunc(void *arg) 					// cleanup handler
{
	printf("mycleanfunc:%d\n", *((int *)arg));	// print the argument passed in
}
void *thfrunc1(void *arg)
{
	int m=1;
	printf("thfrunc1 comes \n");
	pthread_cleanup_push(mycleanfunc, &m);  // push the cleanup handler
	return (void *)0;	 	// exit the thread
	pthread_cleanup_pop(0);	// pop the handler; never reached, but required or the code will not compile
}
 
void *thfrunc2(void *arg)
{
	int m = 2;
	printf("thfrunc2 comes \n");
	pthread_cleanup_push(mycleanfunc, &m); // push the cleanup handler
	pthread_exit(0); // exit the thread
	pthread_cleanup_pop(0); // pop the handler; never reached, but required or the code will not compile
}

int main(void)
{
	pthread_t pid1,pid2;
	int res;
	res = pthread_create(&pid1, NULL, thfrunc1, NULL); // create thread 1
	if (res) 
	{
		printf("pthread_create failed: %s\n", strerror(res)); // strerror returns a string, so use %s
		exit(1);
	}
	pthread_join(pid1, NULL); // wait for thread 1 to finish
	
	res = pthread_create(&pid2, NULL, thfrunc2, NULL); // create thread 2
	if (res) 
	{
		printf("pthread_create failed: %s\n", strerror(res));
		exit(1);
	}
	pthread_join(pid2, NULL); // wait for thread 2 to finish
	
	printf("main over\n");
	return 0;
}

(2) Upload test.cpp to Linux and run g++ -o test test.cpp -lpthread in a terminal, where pthread is the name of the thread library. Then run test. The output is:

[root@localhost cpp98]# g++ -o test test.cpp -lpthread
[root@localhost cpp98]# ./test
thfrunc1 comes 
mycleanfunc:1
thfrunc2 comes 
mycleanfunc:2
main over

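The example shows that, under g++, both return and pthread_exit cause the cleanup handler to run. Only the pthread_exit case is guaranteed by POSIX: leaving the block between pthread_cleanup_push and pthread_cleanup_pop with return is undefined behavior, and the handler runs here because glibc implements the pair in C++ as an object whose destructor calls the handler. Note that pthread_cleanup_pop must appear together with pthread_cleanup_push in the same function, otherwise the code will not compile; try commenting out pthread_cleanup_pop and compiling again. In this example the handler runs because the thread ends itself. Next is an example in which pthread_cleanup_pop runs the handler.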
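For code that should not depend on how a particular compiler treats return, one option is to avoid return between the push and the pop altogether. The following is a sketch under that assumption, not the book's code: early exits go through pthread_exit, and the normal path runs the handler with pthread_cleanup_pop(1). The names g_cleanup_runs, note_cleanup, portable_worker and cleanup_runs are hypothetical.

```cpp
#include <pthread.h>

// Hypothetical counter; it is read only after pthread_join, so no locking is used.
static int g_cleanup_runs = 0;

static void note_cleanup(void *)
{
    ++g_cleanup_runs;
}

// A non-NULL arg means "leave early".
static void *portable_worker(void *arg)
{
    pthread_cleanup_push(note_cleanup, NULL);
    if (arg != NULL)
        pthread_exit(NULL);     // early exit: the pushed handlers run
    pthread_cleanup_pop(1);     // normal path: pop the handler and run it
    return NULL;                // return only after the pop
}

// Runs one worker and reports how many times the handler ran (-1 on error).
int cleanup_runs(bool leave_early)
{
    int token = 0;              // only its address matters
    g_cleanup_runs = 0;
    pthread_t tid;
    if (pthread_create(&tid, NULL, portable_worker,
                       leave_early ? &token : NULL) != 0)
        return -1;
    pthread_join(tid, NULL);
    return g_cleanup_runs;
}
```

Both cleanup_runs(false) and cleanup_runs(true) should report exactly one handler call, without relying on what return does inside the push/pop block.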

[Example 8.18] Cleanup handler called by pthread_cleanup_pop

(1) Open Visual Studio Code, create a file named test.cpp, and enter the following code:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h> // strerror
 
void mycleanfunc(void *arg)					// cleanup handler
{
	printf("mycleanfunc:%d\n", *((int *)arg));
}
void *thfrunc1(void *arg) 					// thread function
{
	int m=1,n=2;
	printf("thfrunc1 comes \n");
	pthread_cleanup_push(mycleanfunc, &m); 	// push a cleanup handler
	pthread_cleanup_push(mycleanfunc, &n); 	// push a second cleanup handler
	pthread_cleanup_pop(1);					// pop the top handler and run it
	pthread_exit(0); 						// exit the thread
	pthread_cleanup_pop(0); 				// never reached, present only to pair with the push
}
  
int main(void)
{
	pthread_t pid1 ;
	int res;
	res = pthread_create(&pid1, NULL, thfrunc1, NULL); // create the thread
	if (res) 
	{
		printf("pthread_create failed: %s\n", strerror(res)); // strerror returns a string, so use %s
		exit(1);
	}
	pthread_join(pid1, NULL);				// wait for the thread to finish
	
	printf("main over\n");
	return 0;
}

(2) Upload test.cpp to Linux and run g++ -o test test.cpp -lpthread in a terminal, where pthread is the name of the thread library. Then run test. The output is:

[root@localhost cpp98]# g++ -o test test.cpp -lpthread
[root@localhost cpp98]# ./test
thfrunc1 comes 
mycleanfunc:2
mycleanfunc:1
main over

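In this example two cleanup handlers are pushed in a row. The first one pushed sits at the bottom of the stack and the second at the top, so the second is the first to come off. pthread_cleanup_pop(1); therefore runs the handler that was given n and prints 2. When pthread_exit ends the thread, the handler it triggers is the one that was given m, which prints 1. The last case to look at is a cleanup handler triggered by cancelling the thread.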
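The last-in, first-out order can also be checked programmatically. The sketch below is an illustration only: the names g_order, record_id, lifo_worker and cleanup_order are hypothetical. It pushes three handlers and lets pthread_exit run all of them, recording the order in which they fire.

```cpp
#include <pthread.h>
#include <vector>

// Hypothetical log of handler calls; read only after pthread_join.
static std::vector<int> g_order;

static void record_id(void *arg)
{
    g_order.push_back(*static_cast<int *>(arg));
}

static void *lifo_worker(void *)
{
    static int ids[3] = {1, 2, 3};
    pthread_cleanup_push(record_id, &ids[0]);   // bottom of the stack
    pthread_cleanup_push(record_id, &ids[1]);
    pthread_cleanup_push(record_id, &ids[2]);   // top of the stack
    pthread_exit(NULL);                         // runs every handler still on the stack
    pthread_cleanup_pop(0);                     // never reached, present only for pairing
    pthread_cleanup_pop(0);
    pthread_cleanup_pop(0);
    return NULL;
}

// Returns the ids in the order the handlers ran.
std::vector<int> cleanup_order()
{
    g_order.clear();
    pthread_t tid;
    if (pthread_create(&tid, NULL, lifo_worker, NULL) == 0)
        pthread_join(tid, NULL);
    return g_order;
}
```

The returned order should be 3, 2, 1: the handler pushed last runs first.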

[Example 8.19] Cleanup handler triggered by thread cancellation

(1) Open Visual Studio Code, create a file named test.cpp, and enter the following code:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h> // sleep

void mycleanfunc(void *arg) // cleanup handler
{
	printf("mycleanfunc:%d\n", *((int *)arg)); 
}
 
void *thfunc(void *arg)  
{  
	int i = 1;  
	printf("thread start-------- \n"); 
	pthread_cleanup_push(mycleanfunc, &i); 	// push the cleanup handler
	while (1)  
	{
		i++;  
		printf("i=%d\n", i);
	}	
	printf("this line will not run\n"); 	// never reached
	pthread_cleanup_pop(0);  				// present only to pair with the push
	
	return (void *)0;  
}  
int main()  
{  
	void *ret = NULL;  
	pthread_t tid;  
	pthread_create(&tid, NULL, thfunc, NULL);	// create the thread
	sleep(1); 					 // wait a moment so the child thread enters the while loop
	pthread_cancel(tid); 		// send the cancellation request
	pthread_join(tid, &ret);  	// wait for the thread to finish
	if (ret == PTHREAD_CANCELED) 				// check whether the thread was cancelled
		printf("thread has stopped,and exit code: %ld\n", (long)ret);  
		// print the return value, which should be -1
	else
		printf("some error occurred\n");
          
	return 0;  
}

(2) Upload test.cpp to Linux and run g++ -o test test.cpp -lpthread in a terminal, where pthread is the name of the thread library. Then run test. The output is:

[root@localhost cpp98]# g++ -o test test.cpp -lpthread
[root@localhost cpp98]# ./test
i=2
i=3
i=4
...
i=24383
i=24384
i=24385
i=24386
i=24387
i=24388
i=24389i=24389
mycleanfunc:24389
thread has stopped,and exit code: -1

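As the example shows, the child thread keeps printing the value of i until it is cancelled. The cancellation takes effect because the loop calls printf, a library function that POSIX allows to act as a cancellation point (on Linux it typically ends up in the write system call). When the cancellation is acted upon, the cleanup handler runs, and the value of i it prints is the value after many i++ operations. This is because the address of i was passed when the handler was pushed, and i has changed by the time the handler runs, so the handler prints the latest value.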
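A loop that does pure computation contains no cancellation point, so under the default deferred cancellation type a pending request would never be acted upon. A common remedy is to call pthread_testcancel inside the loop. The sketch below illustrates this under assumptions: the names g_seen, save_counter, spin_worker and cancel_spinning_worker are hypothetical and do not come from the original example.

```cpp
#include <pthread.h>
#include <unistd.h>

// Hypothetical slot for the value the handler sees; read only after pthread_join.
static long g_seen = -1;

static void save_counter(void *arg)
{
    g_seen = *static_cast<long *>(arg);   // reads the latest value through the pointer
}

static void *spin_worker(void *)
{
    long counter = 0;
    pthread_cleanup_push(save_counter, &counter);
    for (;;)
    {
        ++counter;                // pure computation, not a cancellation point
        pthread_testcancel();     // explicit cancellation point
    }
    pthread_cleanup_pop(0);       // never reached, present only for pairing
    return NULL;
}

// Cancels the spinning worker and checks that the handler saw an updated counter.
bool cancel_spinning_worker()
{
    g_seen = -1;
    pthread_t tid;
    if (pthread_create(&tid, NULL, spin_worker, NULL) != 0)
        return false;
    usleep(10000);                // let the worker spin for a short while
    pthread_cancel(tid);

    void *ret = NULL;
    pthread_join(tid, &ret);
    return ret == PTHREAD_CANCELED && g_seen >= 1;
}
```

As in Example 8.19, the handler receives the address of the counter, so it observes the value at the moment of cancellation rather than the value at the time of the push.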