
Basic I/O Fundamentals

article 2025/1/30 16:41:51

Basic I/O


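Understanding "files"

Narrow sense

• Files live on disk

• A disk is a persistent storage medium, so files stored on disk persist

• A disk is a peripheral (both an output device and an input device)

• Every operation on a file on disk is, in essence, input to or output from a peripheral, or IO for short

Broad sense

In Linux, everything is a file (keyboard, display, network card, disk, and so on; all of these are abstracted as files)

Categorizing file operations

• A 0 KB empty file still occupies disk space, because its attributes must be stored

• A file is the combination of its attributes (metadata) and its content (file = attributes (metadata) + content)

• Every file operation is, in essence, an operation on file content or on file attributes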

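System perspective

• Operations on files are, in essence, operations performed by a process on files

• The disk is managed by the operating system

• Reading and writing files is ultimately not done by C / C++ library functions (these only offer convenience to the user) but by the file-related system call interfaces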

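System file I/O

fopen, ifstream, and other stream-style, language-level approaches are not the only ways to open a file; the system interface is the lowest-level way. Before looking at system file IO, we first need to see how to pass flags to a function, a technique that the system file IO interfaces use:

A way to pass flags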

#include <stdio.h>

#define ONE   0001 // 0000 0001
#define TWO   0002 // 0000 0010
#define THREE 0004 // 0000 0100

// Each flag occupies its own bit, so one int can carry several flags at once
void func(int flags) {
    if (flags & ONE) printf("flags has ONE! ");
    if (flags & TWO) printf("flags has TWO! ");
    if (flags & THREE) printf("flags has THREE! ");
    printf("\n");
}

int main() {
    func(ONE);
    func(THREE);
    func(ONE | TWO);
    func(ONE | THREE | TWO);
    return 0;
}

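Besides the C interfaces from the previous section (C++ and other languages have their own too), we can also access files through the system interfaces. Let's start by using system calls directly to implement the same functionality as the C-library version:

hello.c, writing a file: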

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

int main()
{
    umask(0);
    int fd = open("myfile", O_WRONLY | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    int count = 5;
    const char *msg = "hello bit!\n";
    size_t len = strlen(msg);
    while (count--) {
        // fd: explained later; msg: start of the buffer; len: number of bytes
        // we want to write this time. Returns the number of bytes actually written.
        if (write(fd, msg, len) < 0) {
            perror("write");
            break;
        }
    }
    close(fd);
    return 0;
}

hello.c, reading a file:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

int main()
{
    int fd = open("myfile", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    const char *msg = "hello bit!\n";
    char buf[1024];
    while (1) {
        ssize_t s = read(fd, buf, strlen(msg)); // analogous to write
        if (s > 0) {
            buf[s] = '\0'; // read does not null-terminate the buffer
            printf("%s", buf);
        } else {
            break;
        }
    }
    close(fd);
    return 0;
}

Interface overview

open (see man open)

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);
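pathname: the target file to open or create
flags: options for opening the file; combine one or more of the constants below with bitwise OR to form flags.
Options:
O_RDONLY: open read-only
O_WRONLY: open write-only
O_RDWR : open for reading and writing
Exactly one of these three constants must be specified
O_CREAT : create the file if it does not exist; requires the mode argument to specify the access permissions of the new file
O_APPEND: append on write
Return value:
On success: the newly opened file descriptor
On failure: -1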

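For mode_t, read the man page directly; it is the clearest reference.

Which form of open to use depends on the situation: if the target file does not exist and open must create it, the third argument gives the permissions for the new file (the process umask is then applied to them); otherwise, use the two-argument open.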

write, read, close, and lseek are analogous to the corresponding C file interfaces.

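Return value of open

Before looking at the return value, let's first distinguish two concepts: system calls and library functions

• fopen, fclose, fread, and fwrite are all functions in the C standard library; we call them library functions (libc).

• open, close, read, write, and lseek are interfaces provided by the system; we call them system call interfaces.

The relationship between the two is simple: library functions sit on top of the system call interfaces.

So the f* family of functions can be regarded as wrappers around system calls that make further development easier.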

File descriptor fd

A file descriptor is just a small integer

0 & 1 & 2

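• By default, a Linux process has 3 file descriptors already open: standard input 0, standard output 1, and standard error 2.

• The physical devices behind 0, 1, and 2 are usually the keyboard, the display, and the display

So input and output can also be done like this: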

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

int main()
{
    char buf[1024];
    // Leave room for the terminating '\0'
    ssize_t s = read(0, buf, sizeof(buf) - 1);
    if (s > 0) {
        buf[s] = 0;
        write(1, buf, strlen(buf)); // standard output
        write(2, buf, strlen(buf)); // standard error
    }
    return 0;
}

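We now know that a file descriptor is a small integer starting from 0. When a file is opened, the operating system creates a data structure in memory to describe it: the file structure, which represents an open file object. Because it is a process that executes the open system call, the process must be associated with the file. Each process has a pointer, *files, to a table, files_struct, whose most important part is an array of pointers, each element pointing to an open file. In essence, a file descriptor is an index into that array, so holding a file descriptor is enough to find the corresponding file.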

File descriptor allocation rule

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h> // declares close

int main()
{
    int fd = open("myfile", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    printf("fd: %d\n", fd);
    close(fd);
    return 0;
}

The output is fd: 3

Now close 0 or 2 and look again

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h> // declares close

int main()
{
    close(0);
    //close(2);
    int fd = open("myfile", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    printf("fd: %d\n", fd);
    close(fd);
    return 0;
}

The result is fd: 0 or fd: 2. The allocation rule is therefore: in the files_struct array, the smallest index not currently in use becomes the new file descriptor.

Redirection

What if we close 1? Look at the code:

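#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h> // declares close
#include <stdlib.h>

int main()
{
    close(1);
    // fd 1 is now the smallest free index, so open returns 1
    int fd = open("myfile", O_WRONLY | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    printf("fd: %d\n", fd);
    fflush(stdout); // push the buffered output to fd 1 before closing it
    close(fd);
    exit(0);
}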

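This time, the content that should have gone to the display ends up in the file myfile, and fd = 1. This is called output redirection. Common redirections are: > , >> , <

So what is the essence of redirection?

Using the dup2 system call

The function prototype is: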

#include <unistd.h>
int dup2(int oldfd, int newfd);

Example code

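#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main() {
    // O_CREAT requires a mode argument for the new file's permissions
    int fd = open("./log", O_CREAT | O_RDWR, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    // dup2 closes fd 1 itself if it is open, so no separate close(1) is needed
    if (dup2(fd, 1) < 0) {
        perror("dup2");
        return 1;
    }
    close(fd); // fd 1 now refers to ./log
    for (;;) {
        char buf[1024] = {0};
        ssize_t read_size = read(0, buf, sizeof(buf) - 1);
        if (read_size < 0) {
            perror("read");
            break;
        }
        if (read_size == 0) { // end of input
            break;
        }
        printf("%s", buf);
        fflush(stdout);
    }
    return 0;
}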

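printf is an IO function in the C library and normally writes to stdout, but when stdout accesses the underlying file it still looks up fd 1. By now, however, the entry at index 1 holds the address of the opened file instead of the display file, so every message is written to the file, which completes the output redirection. How, then, are append and input redirection done?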

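Understanding file descriptors

Each time a process opens a file, a corresponding file description structure, struct file, is created. It is added to the struct files_struct in the PCB, which manages these structures as an array, and the array index is returned to the user as the file descriptor used to operate on the file.

How redirection is implemented

Every file descriptor is an index into the kernel's array of file description structures, and the structure at that index is what is used to operate on the file. Redirection leaves the file descriptor being used unchanged and changes the file description structure it corresponds to, thereby changing which file is operated on.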
