
The Linux df command, inodes, data blocks and the stat link count, plus the awk built-in functions sub, gsub and sprintf

article 2025/2/23 14:25:38

I. The Linux df command, inodes, data blocks and the stat link count

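The Linux df command reports disk usage statistics for the file systems mounted on the system. It is used all the time, but usually only as df -h to check how much space is left on each partition. df can also show which file systems are in use: the -T option (--print-type, print file system type) combined with -a lists the file system type of every mount, including pseudo file systems. The output below shows that the root partition on this Alibaba Cloud host uses ext4, alongside other file systems such as proc, sysfs and tmpfs: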

# Use df -aT to list the file system type of every mount
[root@007 ~]# df -aT
Filesystem     Type        1K-blocks     Used Available Use% Mounted on
/dev/vda1      ext4         20510332 16449048   3012760  85% /
proc           proc                0        0         0    - /proc
sysfs          sysfs               0        0         0    - /sys
devpts         devpts              0        0         0    - /dev/pts
tmpfs          tmpfs          510004        0    510004   0% /dev/shm
none           binfmt_misc         0        0         0    - /proc/sys/fs/binfmt_misc
# Use df -i to view usage in terms of inode counts
[root@04007 ~]# df -i
Filesystem      Inodes  IUsed  IFree IUse% Mounted on
/dev/vda1      1310720 383303 927417   30% /
tmpfs           127501      1 127500    1% /dev/shm
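
Running out of inodes makes a file system unusable even when free space remains, so the IUse% column is worth watching. The following is a minimal sketch, not part of the original article: inode_alert is a hypothetical helper name, and it assumes df -i style input with IUse% in the fifth column and the mount point in the sixth.

```shell
#!/bin/sh
# Hypothetical helper: print mount points whose inode usage (IUse%) is at
# or above a threshold. Reads df -i style output on stdin.
inode_alert() {
    threshold="$1"
    awk -v limit="$threshold" '
        NR > 1 && $5 ~ /^[0-9]+%$/ {      # skip the header and "-" entries
            use = $5
            sub(/%$/, "", use)            # strip the trailing percent sign
            if (use + 0 >= limit + 0) print $6, use "%"
        }'
}

# Sample input copied from the df -i output above.
sample='Filesystem      Inodes  IUsed  IFree IUse% Mounted on
/dev/vda1      1310720 383303 927417   30% /
tmpfs           127501      1 127500    1% /dev/shm'

result=$(printf '%s\n' "$sample" | inode_alert 20)
echo "$result"
```

On a live system the same function could be fed directly, for example df -i | inode_alert 80.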

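An inode can be thought of as an index node for the disk. The sector is the smallest storage unit of a disk, and each sector traditionally holds 512 bytes (0.5 KB). When the system reads from disk, for example after MySQL has located a record through an index, it does not read one sector at a time but reads several contiguous sectors at once. Such a group of sectors is called a block, and the block is the smallest unit of file access. The most common block size is 4 KB, that is, eight consecutive sectors per block. The block size can be chosen when the file system is created (formatted), for example 1 KB or 2 KB. Running dumpe2fs on a partition shows the details of its file system, including the total number of inodes, the number of free inodes and so on: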

[root@007 ~]# dumpe2fs /dev/vda1 | more 
dumpe2fs 1.41.12 (17-May-2010)
Filesystem volume name:   <none>
Last mounted on:          /
Filesystem UUID:          94e4e384-0ace-437f-bc96-057dd64f42ee
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash 
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              1310720
Block count:              5242624
Reserved block count:     262131
Free blocks:              2009428
Free inodes:              1016904
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      1022
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Thu Aug 14 21:16:07 2014
Last mount time:          Sun Apr  8 06:20:13 2018
Last write time:          Sun Apr  8 06:18:55 2018
Mount count:              12
Maximum mount count:      -1
Last checked:             Thu Aug 14 21:16:07 2014
Check interval:           0 (<none>)
Lifetime writes:          4464 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
First orphan inode:       1064918
Default directory hash:   half_md4
Directory Hash Seed:      d5c54a86-d535-4c9b-9dea-e1b8e8088761
Journal backup:           inode blocks
Journal features:         journal_incompat_revoke
Journal size:             128M
Journal length:           32768
Journal sequence:         0x021b30cb
Journal start:            2774

Group 0: (Blocks 0-32767) [ITABLE_ZEROED]
  Checksum 0xa8dc, unused inodes 0
  Primary superblock at 0, Group descriptors at 1-2
  Reserved GDT blocks at 3-1024
  Block bitmap at 1025 (+1025), Inode bitmap at 1041 (+1041)
  Inode table at 1057-1568 (+1057)
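
The figures in this output can be cross-checked with a little shell arithmetic. This is only a sketch; the variable names are chosen for illustration and the values are copied from the dumpe2fs output above, with a 512-byte sector assumed.

```shell
#!/bin/sh
# Values copied from the dumpe2fs output above; sector size is assumed.
block_size=4096
sector_size=512
block_count=5242624
inode_count=1310720
inodes_per_group=8192
inode_size=256

# Eight 512-byte sectors make up one 4 KB block.
sectors_per_block=$((block_size / sector_size))
# Total file system size in MiB (blocks * KiB per block / 1024).
total_mib=$((block_count * (block_size / 1024) / 1024))
# Number of block groups, derived from the inode counts.
group_count=$((inode_count / inodes_per_group))
# Blocks needed for one group's inode table.
inode_blocks_per_group=$((inodes_per_group * inode_size / block_size))

echo "sectors per block:      $sectors_per_block"
echo "file system size (MiB): $total_mib"
echo "block groups:           $group_count"
echo "inode blocks per group: $inode_blocks_per_group"
```

The last value comes out at 512, which agrees with the "Inode blocks per group" line and with the inode table range 1057-1568 reported for group 0.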

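The operating system identifies files and directories by inode number. Every file and directory has an inode. Internally Linux does not use file names but inode numbers, and to the system a file name is just a convenient alias for an inode number. When a user reads a file by path, the system walks the path one component at a time: it reads each directory's inode and data to find the inode number of the next component, until it reaches the file's inode, which points to the blocks holding the file's data. An inode therefore stores the metadata of the file or directory whose data lives in those blocks. Specifically, an inode contains: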

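* The file size in bytes
* The User ID of the file owner
* The Group ID of the file
* The read, write and execute permissions
* Three timestamps: ctime is when the inode itself last changed, mtime is when the file content last changed, and atime is when the file was last accessed
* The link count, that is, how many file names point to this inode
* The location of the file's data blocks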
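
Most of the fields listed above can be printed individually. The sketch below assumes GNU coreutils stat and its -c format option; the file name demo.txt is made up for the demonstration. The block locations themselves are not shown by stat.

```shell
#!/bin/sh
# Assumes GNU coreutils stat; demo.txt is a made-up file for this demo.
workdir=$(mktemp -d)
printf 'hello inode\n' > "$workdir/demo.txt"
chmod 644 "$workdir/demo.txt"

# %s size in bytes, %u owner UID, %g group GID, %a permissions (octal),
# %h hard link count, %i inode number
stat -c 'size=%s uid=%u gid=%g mode=%a links=%h inode=%i' "$workdir/demo.txt"
# %x, %y and %z are atime, mtime and ctime
stat -c 'atime=%x' "$workdir/demo.txt"
stat -c 'mtime=%y' "$workdir/demo.txt"
stat -c 'ctime=%z' "$workdir/demo.txt"

# Capture a few fields for later use.
size=$(stat -c '%s' "$workdir/demo.txt")
mode=$(stat -c '%a' "$workdir/demo.txt")
links=$(stat -c '%h' "$workdir/demo.txt")

rm -rf "$workdir"
```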

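stat shows the inode information of a file or directory. An earlier article also mentioned that when a file name contains special characters that are hard to type on the command line, you can list the file's inode number with ls -i and delete the file by inode instead.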

[root@007 ~]# stat htpasswd 
  File: `htpasswd'
  Size: 20              Blocks: 8          IO Block: 4096   regular file
Device: fc01h/64513d    Inode: 1086541     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2019-10-11 18:22:35.172274217 +0800
Modify: 2019-04-11 10:13:59.226543338 +0800
Change: 2019-04-11 10:13:59.226543338 +0800
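
Deleting by inode, as mentioned above, could look like the following sketch. It assumes a find that supports -maxdepth, -inum and -delete (GNU find does), and the awkward file name is invented for the demonstration.

```shell
#!/bin/sh
# Sketch: delete a file by inode number when its name is awkward to type.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A deliberately awkward, made-up file name (leading dash, spaces, quotes).
awkward="-rf weird 'name'.txt"
: > "./$awkward"
: > keep.txt

# Look up the inode number; ls -i prints it in the first column.
inum=$(ls -i -- "./$awkward" | awk '{print $1}')

# Delete by inode; -maxdepth 1 keeps find in the current directory.
find . -maxdepth 1 -inum "$inum" -delete

remaining=$(ls -A)
echo "$remaining"

cd / && rm -rf "$workdir"
```

Note that find -inum matches every name in the searched directory that shares the inode, so hard links to the same file in that directory would be removed as well.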

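The link count reported by stat is the same value shown in the second column of ls -l (the third column of ls -il, after the inode number). It works much like variable references in PHP: the details differ, but the underlying principle is the same. ln can create both soft (symbolic) links and hard links, and soft links are the more common in practice. Normally a file name maps to exactly one inode number, but a hard link lets several file names point to the same inode and therefore the same data blocks. Deleting one of those names does not affect access through the others: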

[root@007 shell]# ln -P s.txt hard-s.txt 
[root@007 shell]# ll
total 20
-rwxr-xr-x 1 root root  40 Sep 24 20:23 a.sh
-rw-r--r-- 1 root root 240 Sep 24 19:15 a.txt
-rwxr-xr-x 1 root root  62 Oct  8 17:23 do.sed
-rw-r--r-- 2 root root 155 Oct 11 18:52 hard-s.txt
-rw-r--r-- 2 root root 155 Oct 11 18:52 s.txt
[root@007 shell]# ls -il
total 20
1087988 -rwxr-xr-x 1 root root  40 Sep 24 20:23 a.sh
1087933 -rw-r--r-- 1 root root 240 Sep 24 19:15 a.txt
1088279 -rwxr-xr-x 1 root root  62 Oct  8 17:23 do.sed
1088275 -rw-r--r-- 2 root root 155 Oct 11 18:52 hard-s.txt
1088275 -rw-r--r-- 2 root root 155 Oct 11 18:52 s.txt
[root@007 shell]# rm -f s.txt 
[root@007 shell]# stat hard-s.txt 
  File: `hard-s.txt'
  Size: 155             Blocks: 8          IO Block: 4096   regular file
Device: fc01h/64513d    Inode: 1088275     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2019-10-11 18:52:12.079865836 +0800
Modify: 2019-10-11 18:52:10.391907111 +0800
Change: 2019-10-12 15:03:16.935725573 +0800
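
The session above can be reproduced as a script, with a soft link added for contrast. This is a sketch assuming GNU stat; the file names mirror the transcript, and soft-s.txt is an extra name introduced here.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'some data\n' > s.txt
ln s.txt hard-s.txt        # hard link: a second name for the same inode
ln -s s.txt soft-s.txt     # soft link: a separate file storing the path s.txt

inode_orig=$(stat -c '%i' s.txt)
inode_hard=$(stat -c '%i' hard-s.txt)
links_before=$(stat -c '%h' hard-s.txt)

rm -f s.txt

# The data survives under the remaining hard link name.
links_after=$(stat -c '%h' hard-s.txt)
hard_content=$(cat hard-s.txt)

# The soft link now dangles, because the name it points to is gone.
if [ -e soft-s.txt ]; then soft_state=ok; else soft_state=dangling; fi

echo "links: $links_before -> $links_after, soft link: $soft_state"

cd / && rm -rf "$workdir"
```

The data blocks are only released once the link count drops to zero and no process still holds the file open.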

II. The awk built-in functions sub, gsub and sprintf

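awk is a common tool for log statistics, and an earlier article on the CSDN blog covered it in detail ("A detailed guide to awk, the Linux text analysis tool, with worked examples, and how to manually free memory and swap on a Linux server"). Even so, I have rarely used sub. The name is short for the English word substitute, meaning to replace. Besides sub, awk also has gsub, and the extra letter g in the name already tells you the difference. In vim or sed, a substitution can take a trailing g flag to replace globally; without it, only the first match on each line is replaced. A sed example: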

[root@007 shell]# cat s.txt 
what is you name, yes 890?aaz123
bettertest, dont konw this char
--hello--

1234567890
hellowolrd, some times this is good;
yes 890,the end is 1234567890.
[root@007 shell]# sed -n 's/890/===/p' s.txt 
what is you name, yes ===?aaz123
1234567===
yes ===,the end is 1234567890.
[root@007 shell]# sed -n 's/890/===/gp' s.txt 
what is you name, yes ===?aaz123
1234567===
yes ===,the end is 1234567===.

sub and gsub differ in exactly the same way: gsub replaces every match in the record, while sub replaces only the first.

[root@007 shell]# awk '{sub("890", "==="); print $0;}' s.txt             
what is you name, yes ===?aaz123
bettertest, dont konw this char
--hello--

1234567===
hellowolrd, some times this is good;
yes ===,the end is 1234567890.
[root@007 shell]# awk '{gsub("890", "==="); print $0;}' s.txt  
what is you name, yes ===?aaz123
bettertest, dont konw this char
--hello--

1234567===
hellowolrd, some times this is good;
yes ===,the end is 1234567===.
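
Two further details are worth a quick sketch: both functions return the number of replacements made, and an optional third argument names the target to modify instead of $0. The sample line is taken from s.txt; the regex literal /890/ is used here in place of the string "890", which matches the same text.

```shell
#!/bin/sh
line='yes 890,the end is 1234567890.'

# sub and gsub return the number of replacements they made.
sub_count=$(printf '%s\n' "$line" | awk '{ n = sub(/890/, "==="); print n }')
gsub_count=$(printf '%s\n' "$line" | awk '{ n = gsub(/890/, "==="); print n }')

# With a third argument only that target changes; here just field 5.
field_only=$(printf '%s\n' "$line" | awk '{ gsub(/890/, "===", $5); print $0 }')

# In the replacement text, & stands for the matched string.
wrapped=$(printf '%s\n' "$line" | awk '{ gsub(/890/, "[&]"); print $0 }')

echo "$sub_count $gsub_count"
echo "$field_only"
echo "$wrapped"
```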

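awk also has a sprintf function for formatting values, numbers in particular. It formats the expression given by the Expr argument according to the printf-style format string given by the Format argument and returns the resulting string. The example below uses %3.2f to keep two decimal places: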

[root@007 shell]#   awk '{if(NF) print NR,length($0)/NF}' s.txt 
1 5.33333
2 6.2
3 9
5 10
6 6
7 6
[root@007 shell]#   awk '{if(NF) print NR,sprintf("%3.2f",length($0)/NF)}' s.txt 
1 5.33
2 6.20
3 9.00
5 10.00
6 6.00
7 6.00

Besides %3.2f, sprintf also accepts %d and %i. There is also a printf statement which, unlike print, does not append a newline. Examples:

[root@007 shell]#   awk '{print sprintf("%4d%4i", NR,length($0),NF)}' s.txt     
   1  32
   2  31
   3   9
   4   0
   5  10
   6  36
   7  30
[root@007 shell]#   awk '{print sprintf("%-4d%-4i", NR,length($0),NF)}' s.txt 
1   32  
2   31  
3   9   
4   0   
5   10  
6   36  
7   30 
[root@007 shell]#   awk '{printf("%-4d%-4i", NR,length($0),NF)}' s.txt 
1   32  2   31  3   9   4   0   5   10  6   36  7   30  
[root@007 shell]#
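
Because printf does not add a newline, the format string usually ends in \n when one record per line is wanted. The sketch below uses made-up input lines and the portable flag order, with the - flag placed before the width for left alignment.

```shell
#!/bin/sh
# Input lines are made up for this demo; the second one is empty.
report=$(printf 'alpha beta\n\ngamma\n' |
    awk '{
        avg = NF ? length($0) / NF : 0    # avoid dividing by zero on empty lines
        # %-4d left-aligns, %4d right-aligns, %6.2f keeps two decimals
        printf("%-4d|%4d|%6.2f\n", NR, length($0), avg)
    }')
echo "$report"
```

The three columns are the line number, the line length and the average field length, each on its own output line.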

printf format specifiers and their meaning:

Format specifier	Function
%c	Prints a single ASCII character. printf("The character is %c\n",x) outputs: The character is A
%d	Prints a decimal integer. printf("The boy is %d years old\n",y) outputs: The boy is 15 years old
%e	Prints a number in e (scientific) notation. printf("z is %e\n",z) outputs, for example with z = 23: z is 2.300000e+01
%f	Prints a floating-point number. printf("z is %f\n", 2.3 * 2) outputs: z is 4.600000
%o	Prints a number in octal. printf("y is %o\n",y) outputs: y is 17
%s	Prints a string. printf("The name of the culprit is %s\n",$1) outputs: The name of the culprit is Bob Smith
%x	Prints a number in hexadecimal. printf("y is %x\n",y) outputs: y is f
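
The specifiers in the table can be tried out in a BEGIN block. This is a sketch only: the values x = 65, y = 15 and z = 23 are illustrative assumptions chosen to reproduce the kind of output shown in the table.

```shell
#!/bin/sh
# x, y and z are illustrative values chosen for this demo.
out=$(awk 'BEGIN {
    x = 65; y = 15; z = 23
    printf("%c\n", x)          # a numeric argument selects the character code
    printf("%d\n", y)          # decimal integer
    printf("%e\n", z)          # e notation, six decimals by default
    printf("%f\n", 2.3 * 2)    # floating point, six decimals by default
    printf("%o\n", y)          # octal
    printf("%x\n", y)          # hexadecimal
    printf("%s\n", "Bob Smith")
}')
echo "$out"
```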

For more detail on printf formatting, see these two articles: https://www.cnblogs.com/thefirstfeeling/p/5667053.html and https://blog.csdn.net/augusdi/article/details/41128911 . Besides the functions above, awk also provides built-in numeric functions such as int, sqrt, exp, log, sin, cos and rand.
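
A quick sketch of these numeric functions follows. srand is an addition not mentioned above; it seeds rand, and since the actual random value depends on the awk implementation only its range is checked.

```shell
#!/bin/sh
out=$(awk 'BEGIN {
    print int(3.9)                       # truncates toward zero
    print int(-3.9)                      # also truncates toward zero
    print sqrt(16)                       # square root
    print exp(0)                         # e to the power 0
    print log(1)                         # natural logarithm
    printf("%.4f\n", sin(0) + cos(0))    # arguments are in radians
    srand(42); r = rand()                # rand returns a value in [0, 1)
    print (r >= 0 && r < 1)              # prints 1 when r is in range
}')
echo "$out"
```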
