当前位置：首页 > article >正文

Unix 进程的启动方式及经典和现代做法（中英双语）

article 2025/2/5 4:46:32

Unix 进程的启动方式及经典做法

1. 引言

Shell 的核心功能之一是启动进程（starting processes）。在 Unix/Linux 系统中，所有的用户进程（除了 init 进程）都是由已有进程派生出来的，因此理解进程的创建方式是编写 Shell 或管理系统进程的基础。

在 Unix 及其衍生系统（如 Linux）中，启动进程的经典方法是使用 fork() 和 exec() 组合。本文将详细介绍 Unix 进程的启动机制、经典方法、以及现代通用做法。

2. Unix 进程的启动方式

在 Unix 系统中，进程的创建有两种方式：

系统启动时，由 init（或 systemd）启动
通过 fork() 复制进程，再用 exec() 替换程序

2.1 `init` 进程

当 Unix 内核加载完成后，它启动的第一个用户空间进程就是 init（现代 Linux 采用 systemd）。init 负责：

初始化系统，启动后台服务（如 cron, syslogd）。
运行 getty 进程，提供登录界面。
作为所有孤儿进程的收容者（reaper）。

2.2 `fork()` + `exec()`：进程创建的标准方式

普通进程的创建方式是：

fork() 复制当前进程（创建子进程）。
子进程使用 exec() 运行新程序（替换自身）。

示例：

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
    pid_t pid = fork();  // 创建子进程

    if (pid < 0) {
        perror("fork failed");
        exit(EXIT_FAILURE);
    } else if (pid == 0) {
        // 子进程执行新的程序
        execlp("ls", "ls", "-l", NULL);
        perror("exec failed");  // 如果 exec 失败，打印错误
        exit(EXIT_FAILURE);
    } else {
        // 父进程等待子进程结束
        wait(NULL);
        printf("Child process finished.\n");
    }
    return 0;
}

2.3 `fork()` 和 `exec()` 解析

(1) `fork()`：复制当前进程

fork() 调用后，当前进程会被复制，成为两个几乎相同的进程（父进程和子进程）。
在子进程中，fork() 返回 0，表示自己是子进程。
在父进程中，fork() 返回子进程的 PID。

示例：

pid_t pid = fork();

if (pid == 0) {
    printf("我是子进程，PID=%d\n", getpid());
} else {
    printf("我是父进程，PID=%d，子进程 PID=%d\n", getpid(), pid);
}

(2) `exec()`：执行新程序

exec() 系列函数用于替换当前进程的代码，包括：

execl()
execv()
execle()
execvp()
execvpe()

示例：

execlp("ls", "ls", "-l", NULL);

进程调用 exec() 后，会加载 ls 命令，并运行它，原进程的代码完全被新进程的代码替换。
如果 exec() 成功，后面的代码不会执行，除非失败（此时会返回 -1）。

3. `fork()` + `exec()` 的经典使用

Shell 处理用户输入时，会：

解析命令，拆分参数。
调用 fork() 创建子进程。
子进程调用 exec() 执行新程序。
父进程调用 wait() 等待子进程结束。

3.1 经典 Shell 进程模型

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main() {
    char *cmd = "/bin/ls";
    char *args[] = {"ls", "-l", NULL};

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork failed");
        exit(EXIT_FAILURE);
    } else if (pid == 0) {
        // 子进程执行命令
        execv(cmd, args);
        perror("exec failed");
        exit(EXIT_FAILURE);
    } else {
        // 父进程等待子进程结束
        wait(NULL);
        printf("Child process completed.\n");
    }
    return 0;
}

4. 现代通用的进程创建方式

虽然 fork() + exec() 仍然是主流，但现代操作系统提供了更高效的替代方案：

4.1 `posix_spawn()`

fork() 会复制整个进程的 内存空间，但在 exec() 之后，原始数据会被丢弃，因此效率不高。
posix_spawn() 直接创建进程并执行新程序，避免 fork() 额外的资源消耗。

示例：

#include <spawn.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

extern char **environ;

int main() {
    pid_t pid;
    char *args[] = {"ls", "-l", NULL};
    
    if (posix_spawn(&pid, "/bin/ls", NULL, NULL, args, environ) != 0) {
        perror("posix_spawn failed");
        exit(EXIT_FAILURE);
    }

    printf("Spawned process PID=%d\n", pid);
    return 0;
}

优势：

适用于 低资源环境（如嵌入式系统）。
避免 fork() 造成的写时复制（Copy-On-Write）。

4.2 `clone()`（Linux 专用）

clone() 是 fork() 的更底层实现，允许创建共享资源的进程。
Docker、Linux 容器等技术广泛使用 clone() 以优化进程管理。

示例：

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int child_func(void *arg) {
    printf("Child process running\n");
    return 0;
}

int main() {
    char stack[1024*1024];  // 子进程的栈空间
    pid_t pid = clone(child_func, stack + sizeof(stack), SIGCHLD, NULL);

    if (pid == -1) {
        perror("clone failed");
        exit(EXIT_FAILURE);
    }

    printf("Created process with PID=%d\n", pid);
    return 0;
}

优势：

允许共享 内存、文件描述符、信号 等资源。
用于 线程（pthread）、轻量级进程（LWP）。

5. 总结

方式	适用场景	优势	劣势
`fork() + exec()`	传统进程创建方式	可靠，适用于 Shell	`fork()` 开销大
`posix_spawn()`	嵌入式/轻量级应用	避免 `fork()` 复制数据	兼容性较低
`clone()`	Linux 容器/线程	共享资源，高效	仅适用于 Linux

🚀 经典 Shell 仍然使用 fork() + exec()，但现代操作系统在高性能场景下采用 posix_spawn() 或 clone() 来优化进程管理！

How Unix Starts Processes: Classic and Modern Approaches

1. Introduction

One of the core functions of a Unix shell is starting processes. Every command you type into a shell results in the creation of a new process. But how does Unix actually start new processes?

The traditional approach in Unix and Linux is based on the fork() and exec() system calls. However, modern systems have introduced more efficient methods like posix_spawn() and clone(), particularly for performance-sensitive applications.

This article explores how processes start on Unix, including classic methods and modern alternatives.

2. How Processes Start in Unix

There are only two ways to start a process in Unix:

The init Process (or systemd on modern Linux)
Using fork() and exec()

2.1 `init` (or `systemd`) Starts System Processes

When a Unix system boots, the kernel loads into memory and starts a single process: init (or systemd in modern Linux).
init is responsible for:
- Launching system daemons (e.g., cron, syslogd).
- Starting getty to provide login prompts.
- Managing orphaned processes.

2.2 `fork()` + `exec()`: The Standard Way to Create Processes

For regular applications, new processes are created using fork() followed by exec():

fork() duplicates the current process, creating a child process.
The child process replaces itself with a new program using exec().

Example: Creating a Process with `fork()` + `exec()`

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main() {
    pid_t pid = fork();  // Create child process

    if (pid < 0) {
        perror("fork failed");
        exit(EXIT_FAILURE);
    } else if (pid == 0) {
        // Child process executes a new program
        execlp("ls", "ls", "-l", NULL);
        perror("exec failed");  // If exec fails
        exit(EXIT_FAILURE);
    } else {
        // Parent process waits for child to finish
        wait(NULL);
        printf("Child process finished.\n");
    }
    return 0;
}

3. Understanding `fork()` and `exec()`

(1) `fork()`: Duplicates the Process

After calling fork(), there are two nearly identical processes (parent and child).
The child process receives 0 as the return value of fork(), while the parent process receives the child’s PID.

Example:

pid_t pid = fork();

if (pid == 0) {
    printf("I am the child process, PID=%d\n", getpid());
} else {
    printf("I am the parent process, PID=%d, Child PID=%d\n", getpid(), pid);
}

(2) `exec()`: Runs a New Program

exec() replaces the current process image with a new program.
There are multiple exec() variants:
- execl()
- execv()
- execle()
- execvp()
- execvpe()

Example:

execlp("ls", "ls", "-l", NULL);

The process calls exec() to replace itself with ls.
If exec() succeeds, the original process code is completely replaced.
If exec() fails, it returns -1 (error).

4. Classic Shell Model Using `fork()` + `exec()`

When a shell processes a command:

Parses the command into arguments.
Calls fork() to create a child process.
Child calls exec() to run the command.
Parent calls wait() to wait for the child to finish.

Shell-style Process Management

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main() {
    char *cmd = "/bin/ls";
    char *args[] = {"ls", "-l", NULL};

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork failed");
        exit(EXIT_FAILURE);
    } else if (pid == 0) {
        execv(cmd, args);
        perror("exec failed");
        exit(EXIT_FAILURE);
    } else {
        wait(NULL);
        printf("Child process completed.\n");
    }
    return 0;
}

5. Modern Alternatives to `fork()` + `exec()`

5.1 `posix_spawn()`: More Efficient Process Creation

fork() duplicates memory pages (Copy-on-Write), which can be inefficient.
posix_spawn() creates a new process and immediately executes a new program.

Example:

#include <spawn.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

extern char **environ;

int main() {
    pid_t pid;
    char *args[] = {"ls", "-l", NULL};

    if (posix_spawn(&pid, "/bin/ls", NULL, NULL, args, environ) != 0) {
        perror("posix_spawn failed");
        exit(EXIT_FAILURE);
    }

    printf("Spawned process PID=%d\n", pid);
    return 0;
}

Advantages:

Avoids unnecessary memory duplication.
More efficient for embedded systems and low-resource environments.

5.2 `clone()`: Used in Containers and Lightweight Processes

clone() is a Linux-specific system call used for containerization and thread management.
Unlike fork(), clone() allows processes to share memory, file descriptors, etc..

Example:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int child_func(void *arg) {
    printf("Child process running\n");
    return 0;
}

int main() {
    char stack[1024*1024];  // Allocate child stack
    pid_t pid = clone(child_func, stack + sizeof(stack), SIGCHLD, NULL);

    if (pid == -1) {
        perror("clone failed");
        exit(EXIT_FAILURE);
    }

    printf("Created process with PID=%d\n", pid);
    return 0;
}

Advantages:

Used in Docker, Linux containers, and multithreading.
More efficient than fork() for creating lightweight processes.

6. Comparison of Process Creation Methods

Method	Use Case	Pros	Cons
`fork() + exec()`	Traditional process creation	Standard Unix method	High memory overhead
`posix_spawn()`	Embedded systems, performance-sensitive apps	Avoids memory duplication	Less flexible than `fork()`
`clone()`	Containers, lightweight processes	Efficient resource sharing	Linux-only

7. Conclusion

✅ Classic shells use fork() + exec() to create processes.
✅ Modern systems optimize process creation with posix_spawn() (efficient) and clone() (used in containers).
✅ Understanding process creation is essential for writing shells and managing system performance.

🚀 While fork() remains the standard, modern Unix-like OSes leverage alternatives for improved efficiency.