awk notes 231129
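An awk script follows a fixed pattern; the full template comes after this note on quoting.
Wrap the script part of awk in a pair of single quotes, as in 'BEGIN{} /pattern1/{} ... /patternN/{} END{}', because awk scripts use the $ sign all the time. Inside single quotes the shell leaves $ untouched, so awk receives it as written; inside double quotes the shell treats $ as the start of an expansion and substitutes a value before awk runs.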
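To make the quoting difference concrete, here is a minimal sketch. The sample line a b c and the shell variable n are made up for illustration; they do not come from the notes.

```shell
line="a b c"
n=3   # hypothetical shell variable, used only for this demonstration

# Single quotes: the shell passes $2 through untouched, so awk prints field 2.
single=$(echo "$line" | awk '{print $2}')

# Double quotes: the shell expands $n first, so awk receives {print $3}.
# The backslash keeps the first $ literal for awk.
double=$(echo "$line" | awk "{print \$$n}")

# Passing the value with -v lets the script stay in single quotes.
viaopt=$(echo "$line" | awk -v n="$n" '{print $n}')

echo "single=$single double=$double viaopt=$viaopt"
```

This prints single=b double=c viaopt=c. The -v form is usually easier to read than escaping $ inside double quotes.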
The template:
awk -F 'separator' 'BEGIN{begin block}
/pattern1/{block run on lines matching pattern1}
/pattern2/{block run on lines matching pattern2}
...
/patternN/{block run on lines matching patternN}
END{end block}'
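-F sets the field separator. It is optional; by default awk splits fields on runs of whitespace (spaces and tabs), not on a single space.
BEGIN{begin block} is optional.
END{end block} is optional.
BEGIN and END must be written in all capitals. In lowercase, awk reads begin or end as an ordinary unset variable used as a pattern; that pattern is false, so the block never runs.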
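The notes on -F, BEGIN, and END can be checked with a small sketch. The colon-separated input and the printed words start and done are invented for this example.

```shell
# -F ':' splits each line on colons; BEGIN and END each run exactly once.
with_sep=$(printf 'root:x:0\nbin:x:1\n' |
awk -F ':' 'BEGIN{print "start"} {print $1} END{print "done"}')

# Without -F, any run of spaces or tabs separates fields.
default_sep=$(printf 'a  \t b\n' | awk '{print $2}')

# Lowercase begin is just an unset variable used as a pattern: it is false,
# so this block never runs and nothing is printed.
lower=$(echo "x" | awk 'begin{print "never printed"}')

printf '%s\n' "$with_sep" "default_sep=$default_sep" "lower=[$lower]"
```

The first command prints start, root, bin, done on separate lines; the second yields b; the third yields an empty string.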
Example
echo "1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa" | awk '/[135]/{sub("aaa","b&b",$0); print $0}'
Result
1 baaab aaa aaa
3 baaab aaa aaa
5 baaab aaa aaa
The only difference from the command above is that sub is replaced by gsub
echo "1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa" | awk '/[135]/{gsub("aaa","b&b",$0); print $0}'
Result:
1 baaab baaab baaab
3 baaab baaab baaab
5 baaab baaab baaab
Verbatim console session
[z@fedora root]$ echo "1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa" | awk '/[135]/{sub("aaa","b&b",$0); print $0}'
1 baaab aaa aaa
3 baaab aaa aaa
5 baaab aaa aaa
[z@fedora root]$ echo "1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa" | awk '/[135]/{gsub("aaa","b&b",$0); print $0}'
1 baaab baaab baaab
3 baaab baaab baaab
5 baaab baaab baaab
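Explanation
# /[135]/ selects lines that contain 1, 3, or 5 anywhere in the line
# sub and gsub are substitution functions: sub replaces only the first match in the target (here the whole line $0), gsub replaces every match
# b&b wraps the matched text in the letter b on both sides; & stands for the matched text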
awk '/[135]/{sub("aaa","b&b",$0); print $0}'
awk '/[135]/{gsub("aaa","b&b",$0); print $0}'
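Two details of sub and gsub that the explanation above relies on can be seen in a short sketch: both functions return the number of replacements made, and & has to be escaped to get a literal ampersand. The input line and the variable n are chosen only for illustration.

```shell
# sub and gsub return the number of replacements they made.
# With no third argument they operate on $0.
first=$(echo "aaa aaa aaa" | awk '{n = sub("aaa", "b&b"); print n, $0}')
all=$(echo "aaa aaa aaa" | awk '{n = gsub("aaa", "b&b"); print n, $0}')

# To insert a literal ampersand, escape it; inside an awk string that is "\\&".
literal=$(echo "aaa aaa aaa" | awk '{sub("aaa", "[\\&]"); print}')

printf '%s\n' "$first" "$all" "$literal"
```

The first two lines printed are 1 baaab aaa aaa and 3 baaab baaab baaab. The third shows the first aaa replaced by [&] rather than by the matched text.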
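Example 2 (the strings 这是开始块 and 这是结束块 printed by the script mean "this is the begin block" and "this is the end block")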
echo "0 aaa aaa aaa
1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa
8 aaa aaa aaa
9 aaa aaa aaa" |
awk 'BEGIN{print "这是开始块" } /[135]/{gsub("aaa","b&b",$0); print $0} /[234]/{print $0} END{print "这是结束块" }'
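The command above and the one below are the same. A line break is allowed while a single or double quote is still open, and also right after the pipe symbol |.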
echo "0 aaa aaa aaa
1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa
8 aaa aaa aaa
9 aaa aaa aaa" |
awk 'BEGIN{print "这是开始块" }
/[135]/{gsub("aaa","b&b",$0); print $0}
/[234]/{print $0 }
END{print "这是结束块" }'
Result
这是开始块
1 baaab baaab baaab
2 aaa aaa aaa
3 baaab baaab baaab
3 baaab baaab baaab
4 aaa aaa aaa
5 baaab baaab baaab
这是结束块
Verbatim console session
[z@fedora root]$ echo "0 aaa aaa aaa
1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa
8 aaa aaa aaa
9 aaa aaa aaa" |
awk 'BEGIN{print "这是开始块" }
/[135]/{gsub("aaa","b&b",$0); print $0}
/[234]/{print $0 }
END{print "这是结束块" }'
这是开始块
1 baaab baaab baaab
2 aaa aaa aaa
3 baaab baaab baaab
3 baaab baaab baaab
4 aaa aaa aaa
5 baaab baaab baaab
这是结束块
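As the output shows, the data seen by the third block (the /[234]/ rule) is affected by the second block (the /[135]/ rule); the BEGIN block counts as the first.
The third block selects lines containing 2, 3, or 4; the second block selects lines containing 1, 3, or 5.
Line 3 is selected by both, so it appears twice: the second block modifies it, and the third block, which changes nothing, prints it in its modified form.
Lines 2 and 4 are not selected by the second block, so they stay unchanged and are printed by the third block.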
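If the double print of line 3 is not wanted, two common ways around it are sketched below, under the assumption that either "print once" or "print the original the second time" is the goal. The shortened three-line input and the variable s are made up for this example.

```shell
input=$(printf '%s\n' "2 aaa" "3 aaa" "4 aaa")

# next: stop processing this line after the first rule, so line 3 prints once.
once=$(echo "$input" |
awk '/[135]/{gsub("aaa","b&b"); print; next} /[234]/{print}')

# copy: edit a copy of the line, so the later rule still sees the original $0.
copy=$(echo "$input" |
awk '/[135]/{s = $0; gsub("aaa","b&b",s); print s} /[234]/{print}')

echo "$once"
echo "---"
echo "$copy"
```

With next, the output is 2 aaa, 3 baaab, 4 aaa. With the copy, line 3 still appears twice, first as 3 baaab and then unmodified as 3 aaa.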
Example 3
echo "0 aaa aaa aaa
1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa
8 aaa aaa aaa
9 aaa aaa aaa" |
awk 'BEGIN{print "这是开始块" }
/[135]/{gsub("aaa","b&b",$0); print $0 }
/[234]/{print $0 }
/[1579]/{sub("aaa","1579&1579",$0); print $0 }
END{print "这是结束块" }'
Result
这是开始块
1 baaab baaab baaab
1 b1579aaa1579b baaab baaab
2 aaa aaa aaa
3 baaab baaab baaab
3 baaab baaab baaab
4 aaa aaa aaa
5 baaab baaab baaab
5 b1579aaa1579b baaab baaab
7 1579aaa1579 aaa aaa
9 1579aaa1579 aaa aaa
这是结束块
Verbatim console session
[z@fedora root]$ echo "0 aaa aaa aaa
1 aaa aaa aaa
2 aaa aaa aaa
3 aaa aaa aaa
4 aaa aaa aaa
5 aaa aaa aaa
6 aaa aaa aaa
7 aaa aaa aaa
8 aaa aaa aaa
9 aaa aaa aaa" |
awk 'BEGIN{print "这是开始块" }
/[135]/{gsub("aaa","b&b",$0); print $0 }
/[234]/{print $0 }
/[1579]/{sub("aaa","1579&1579",$0); print $0 }
END{print "这是结束块" }'
这是开始块
1 baaab baaab baaab
1 b1579aaa1579b baaab baaab
2 aaa aaa aaa
3 baaab baaab baaab
3 baaab baaab baaab
4 aaa aaa aaa
5 baaab baaab baaab
5 b1579aaa1579b baaab baaab
7 1579aaa1579 aaa aaa
9 1579aaa1579 aaa aaa
这是结束块
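To see where the line 1 b1579aaa1579b baaab baaab in Example 3 comes from, the sketch below runs the same three rules on line 1 only and tags each print. The labels rule1 to rule3 are names chosen for this sketch, not awk features.

```shell
# Same three rules as Example 3, applied to line 1 only, with labeled output.
trace=$(echo "1 aaa aaa aaa" | awk '
/[135]/{gsub("aaa","b&b"); print "rule1: " $0}
/[234]/{print "rule2: " $0}
/[1579]/{sub("aaa","1579&1579"); print "rule3: " $0}')
echo "$trace"
```

rule1 rewrites every aaa to baaab and prints the line. rule2 does not match. rule3 then works on the already modified line, so its sub hits the aaa inside the first baaab and produces b1579aaa1579b.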
The Runoob (菜鸟教程) tutorial explains this well and matches my understanding; see its section on how AWK works (AWK 工作原理).