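awk is a powerful text-processing tool. It reads its input one line at a time, splits each line into fields using whitespace (spaces or tabs) as the default separator, and then runs whatever analysis you specify on those fields.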
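As a minimal sketch of the read-and-split behaviour described above, the snippet below feeds awk one made-up line with uneven spacing. The sample text and the variable names `line` and `summary` are assumptions chosen for illustration.

```shell
# Hypothetical sample line with irregular spacing (three spaces and a tab).
line=$(printf 'alpha   beta\tgamma')

# NF is the number of fields awk found; $2 is the second field.
summary=$(printf '%s\n' "$line" | awk '{ print NF ": " $2 }')
echo "$summary"   # 3: beta
```

Runs of spaces and tabs count as a single separator, which is why the line yields three fields rather than five.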
Origin
The name awk is formed from the initials of the family names of its three creators: Alfred Aho, Peter Weinberger, and Brian Kernighan.
awk exists in three variants: awk, nawk, and gawk. Unless stated otherwise, awk here means gawk, the GNU implementation of awk.
Installation
yum install -y gawk
Help
# awk --help
Usage: awk [POSIX or GNU style options] -f progfile [--] file ...
Usage: awk [POSIX or GNU style options] [--] 'program' file ...
POSIX options: GNU long options: (standard)
-f progfile --file=progfile
-F fs --field-separator=fs
-v var=val --assign=var=val
Short options: GNU long options: (extensions)
-b --characters-as-bytes
-c --traditional
-C --copyright
-d[file] --dump-variables[=file]
-e 'program-text' --source='program-text'
-E file --exec=file
-g --gen-pot
-h --help
-L [fatal] --lint[=fatal]
-n --non-decimal-data
-N --use-lc-numeric
-O --optimize
-p[file] --profile[=file]
-P --posix
-r --re-interval
-S --sandbox
-t --lint-old
-V --version
To report bugs, see node `Bugs' in `gawk.info', which is
section `Reporting Problems and Bugs' in the printed version.
gawk is a pattern scanning and processing language.
By default it reads standard input and writes standard output.
Examples:
gawk '{ sum += $1 }; END { print sum }' file
gawk -F: '{ print $1 }' /etc/passwd
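Built-in variables
$0
The current record (the whole line).
$1 to $n
The nth field of the current record; fields are separated by FS.
FS
Input field separator. The default is a single space, which makes awk split on runs of spaces and tabs.
NF
Number of fields in the current record, i.e. the number of columns.
NR
Number of records read so far, i.e. the line number, starting at 1. With several input files it keeps counting across files.
FNR
Record number within the current file. Unlike NR, it restarts at 1 for each input file.
RS
Input record separator. The default is a newline.
OFS
Output field separator. The default is a single space.
ORS
Output record separator. The default is a newline.
FILENAME
Name of the current input file.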
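The variables listed above can be seen working together in a small sketch. The two throwaway files, their contents, and the names `tmpdir` and `report` are assumptions made up for this illustration.

```shell
# Two small throwaway files (names and contents are hypothetical).
tmpdir=$(mktemp -d)
printf 'a:1\nb:2\n' > "$tmpdir/first.txt"
printf 'c:3\n'      > "$tmpdir/second.txt"

# FS=":" splits on colons, OFS="|" joins the printed fields.
# NR keeps counting across files, FNR restarts at 1 for each file.
report=$(cd "$tmpdir" && awk 'BEGIN { FS = ":"; OFS = "|" }
    { print FILENAME, NR, FNR, NF, $1 }' first.txt second.txt)
echo "$report"
rm -rf "$tmpdir"
```

For the third line printed, NR is 3 while FNR is back to 1, because that record is the first one in second.txt.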
Example
# cat netstat.txt
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 a1.xiexianbin.cn:55536 100.100.15.4:https TIME_WAIT
tcp 0 36 a1.xiexianbin.cn:ssh 101.226.164.165:53138 ESTABLISHED
tcp 0 0 localhost:36654 localhost:2379 ESTABLISHED
tcp 0 0 localhost:2379 localhost:36654 ESTABLISHED
tcp 0 0 localhost:59236 localhost:2379 ESTABLISHED
tcp 0 0 localhost:2379 localhost:59238 ESTABLISHED
tcp 0 0 localhost:59238 localhost:2379 ESTABLISHED
Basic usage
# awk '{print $1, $4}' netstat.txt
Proto Local
tcp a1.xiexianbin.cn:55536
tcp a1.xiexianbin.cn:ssh
tcp localhost:36654
tcp localhost:2379
tcp localhost:59236
tcp localhost:2379
tcp localhost:59238
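- The part inside the braces, within the single quotes, is the awk statement.
- Enclose the program in single quotes so that the shell does not expand $1, $4, and the like before awk sees them.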
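The quoting point above can be illustrated with a short sketch. The input text and the variable names (`single`, `escaped`, `wanted`, `matched`) are assumptions for the example; `-v` is the standard awk option for passing a value in from the shell.

```shell
# Single quotes: the shell passes $2 to awk untouched.
single=$(echo 'left right' | awk '{ print $2 }')

# Double quotes also work, but every $ must then be escaped by hand.
escaped=$(echo 'left right' | awk "{ print \$2 }")

# To use a shell value inside the program, pass it with -v
# rather than splicing it into the quoted text.
wanted=right
matched=$(echo 'left right' | awk -v w="$wanted" '$2 == w { print "match" }')

echo "$single $escaped $matched"   # right right match
```

Single quotes remain the simplest choice; the escaped form is shown only to make clear what the shell would otherwise do to `$2`.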
Formatted output
# awk '{printf "%-8s %-8s %-8s %-18s %-22s %-15s\n",$1,$2,$3,$4,$5,$6}' netstat.txt
Proto Recv-Q Send-Q Local Address Foreign
tcp 0 0 a1.xiexianbin.cn:55536 100.100.15.4:https TIME_WAIT
tcp 0 36 a1.xiexianbin.cn:ssh 101.226.164.165:53138 ESTABLISHED
tcp 0 0 localhost:36654 localhost:2379 ESTABLISHED
tcp 0 0 localhost:2379 localhost:36654 ESTABLISHED
tcp 0 0 localhost:59236 localhost:2379 ESTABLISHED
tcp 0 0 localhost:2379 localhost:59238 ESTABLISHED
tcp 0 0 localhost:59238 localhost:2379 ESTABLISHED
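To make the width specifiers in the printf call above easier to see, here is a small sketch on made-up input. The two sample rows and the variable name `formatted` are assumptions; the `|` characters are there only to expose the padding.

```shell
# %-10s left-justifies a string in 10 columns,
# %5d right-justifies an integer in 5 columns.
formatted=$(printf 'ssh 36\nhttps 0\n' |
    awk '{ printf "|%-10s|%5d|\n", $1, $2 }')
echo "$formatted"
# |ssh       |   36|
# |https     |    0|
```

Unlike print, printf does not append a newline on its own, so the format string ends with `\n`.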
Filtering
# awk '$3==0 && $6=="ESTABLISHED" ' netstat.txt # column 3 equals 0 and column 6 equals ESTABLISHED
tcp 0 0 localhost:36654 localhost:2379 ESTABLISHED
tcp 0 0 localhost:2379 localhost:36654 ESTABLISHED
tcp 0 0 localhost:59236 localhost:2379 ESTABLISHED
tcp 0 0 localhost:2379 localhost:59238 ESTABLISHED
tcp 0 0 localhost:59238 localhost:2379 ESTABLISHED
# awk ' $3>0 {print $0}' netstat.txt # column 3 is greater than 0; the header row also matches because "Send-Q" is compared as a string
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 36 a1.xiexianbin.cn:ssh 101.226.164.165:53138 ESTABLISHED
# awk '$3==0 && $6=="ESTABLISHED" || NR==1 ' netstat.txt
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 localhost:36654 localhost:2379 ESTABLISHED
tcp 0 0 localhost:2379 localhost:36654 ESTABLISHED
tcp 0 0 localhost:59236 localhost:2379 ESTABLISHED
tcp 0 0 localhost:2379 localhost:59238 ESTABLISHED
tcp 0 0 localhost:59238 localhost:2379 ESTABLISHED
# awk '$3==0 && $6=="ESTABLISHED" || NR==1 {printf "%-20s %-20s %s\n",$4,$5,$6}' netstat.txt
Local Address Foreign
localhost:36654 localhost:2379 ESTABLISHED
localhost:2379 localhost:36654 ESTABLISHED
localhost:59236 localhost:2379 ESTABLISHED
localhost:2379 localhost:59238 ESTABLISHED
localhost:59238 localhost:2379 ESTABLISHED
# awk 'BEGIN{FS=":"} {print $1,$3,$6}' /etc/passwd
root 0 /root
bin 1 /bin
daemon 2 /sbin
# awk -F: '{print $1,$3,$6}' /etc/passwd
root 0 /root
bin 1 /bin
daemon 2 /sbin
# awk -F: '{print $1,$3,$6}' OFS="\t" /etc/passwd
root 0 /root
bin 1 /bin
daemon 2 /sbin
# awk '$6 ~ /FIN|TIME/ || NR==1 {print NR,$4,$5,$6}' OFS="\t" netstat.txt
1 Local Address Foreign
2 a1.xiexianbin.cn:55536 100.100.15.4:https TIME_WAIT
# awk '/TIME_WAIT/' netstat.txt
tcp 0 0 a1.xiexianbin.cn:55536 100.100.15.4:https TIME_WAIT
# awk '$6 !~ /WAIT/ || NR==1 {print NR,$4,$5,$6}' OFS="\t" netstat.txt
1 Local Address Foreign
3 a1.xiexianbin.cn:ssh 101.226.164.165:53138 ESTABLISHED
4 localhost:36654 localhost:2379 ESTABLISHED
5 localhost:2379 localhost:36654 ESTABLISHED
6 localhost:59236 localhost:2379 ESTABLISHED
7 localhost:2379 localhost:59238 ESTABLISHED
8 localhost:59238 localhost:2379 ESTABLISHED
# awk '!/WAIT/' netstat.txt
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 36 a1.xiexianbin.cn:ssh 101.226.164.165:53138 ESTABLISHED
tcp 0 0 localhost:36654 localhost:2379 ESTABLISHED
tcp 0 0 localhost:2379 localhost:36654 ESTABLISHED
tcp 0 0 localhost:59236 localhost:2379 ESTABLISHED
tcp 0 0 localhost:2379 localhost:59238 ESTABLISHED
tcp 0 0 localhost:59238 localhost:2379 ESTABLISHED
- Comparison operators: ==, !=, >, <, >=, <=
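The filters above combine comparisons, regular-expression matches, and NR. A compact sketch on a hypothetical three-line input (the data and the names `data`, `busy`, and `waiting` are made up for illustration) shows how to keep the header row out of a numeric comparison:

```shell
# Hypothetical input: a header row plus two data rows.
data='Proto Send-Q State
tcp 36 ESTABLISHED
tcp 0 TIME_WAIT'

# $2 > 0 on its own can also match the header, because "Send-Q" is not
# numeric and is compared as a string. NR > 1 skips the header explicitly.
busy=$(printf '%s\n' "$data" | awk 'NR > 1 && $2 > 0 { print $3 }')

# ~ tests a field against a regular expression; !~ negates the test.
waiting=$(printf '%s\n' "$data" | awk 'NR > 1 && $3 ~ /WAIT$/ { print NR }')

echo "$busy $waiting"   # ESTABLISHED 3
```

When a pattern has no action, as in several of the commands above, awk prints the matching line unchanged.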
Get the IP address
ifconfig | awk 'BEGIN{RS="";FS="\n"}!/^lo:/{$0=$2;FS=" ";$0=$0;print $2;FS="\n"};'
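The one-liner above is dense, so here is the same logic spread over several lines and run against canned text instead of a live ifconfig call. The sample block is an assumption: it imitates the layout printed by recent net-tools versions, with made-up interface names and addresses, and it relies on the "inet" line being the second line of each block. Older ifconfig versions print "inet addr:..." and would need a different extraction.

```shell
# Hypothetical sample in the layout of recent net-tools ifconfig output.
sample='eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.10  netmask 255.255.255.0  broadcast 192.168.1.255
        ether 00:16:3e:00:00:01  txqueuelen 1000  (Ethernet)

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1000  (Local Loopback)'

addr=$(printf '%s\n' "$sample" | awk '
BEGIN { RS = ""; FS = "\n" }   # one record per interface, one field per line
!/^lo:/ {                      # skip the loopback block
    $0 = $2                    # keep only the second line, the inet line
    FS = " "; $0 = $0          # re-split that line on whitespace
    print $2                   # the word after inet
    FS = "\n"                  # restore line-based fields for the next record
}')
echo "$addr"   # 192.168.1.10
```

Setting RS to the empty string puts awk in paragraph mode, where blank lines separate records; that is what lets each interface block be handled as one unit.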