strace
来跟踪进程执行时系统调用
和所接收的信号
。strace
底层使用内核的 ptrace
特性来实现其功能
安装
yum install -y strace
help
$ strace -h
usage: strace [-CdffhiqrtttTvVwxxy] [-I n] [-e expr]...
[-a column] [-o file] [-s strsize] [-P path]...
-p pid... / [-D] [-E var=val]... [-u username] PROG [ARGS]
or: strace -c[dfw] [-I n] [-e expr]... [-O overhead] [-S sortby]
-p pid... / [-D] [-E var=val]... [-u username] PROG [ARGS]
Output format:
-a column alignment COLUMN for printing syscall results (default 40)
-i print instruction pointer at time of syscall
-k obtain stack trace between each syscall
-o file send trace output to FILE instead of stderr
-q suppress messages about attaching, detaching, etc.
-r print relative timestamp
-s strsize limit length of print strings to STRSIZE chars (default 32)
-t print absolute timestamp
-tt print absolute timestamp with usecs
-T print time spent in each syscall
-x print non-ascii strings in hex
-xx print all strings in hex
-X format set the format for printing of named constants and flags
-y print paths associated with file descriptor arguments
-yy print protocol specific information associated with socket file descriptors
Statistics:
-c count time, calls, and errors for each syscall and report summary
-C like -c but also print regular output
-O overhead set overhead for tracing syscalls to OVERHEAD usecs
-S sortby sort syscall counts by: time, calls, name, nothing (default time)
-w summarise syscall latency (default is system time)
Filtering:
-e expr a qualifying expression: option=[!]all or option=[!]val1[,val2]...
options: trace, abbrev, verbose, raw, signal, read, write, fault, inject, kvm
-P path trace accesses to path
Tracing:
-b execve detach on execve syscall
-D run tracer process as a detached grandchild, not as parent
-f follow forks
-ff follow forks with output into separate files
-I interruptible
1: no signals are blocked
2: fatal signals are blocked while decoding syscall (default)
3: fatal signals are always blocked (default if '-o FILE PROG')
4: fatal signals and SIGTSTP (^Z) are always blocked
(useful to make 'strace -o FILE PROG' not stop on ^Z)
Startup:
-E var remove var from the environment for command
-E var=val put var=val in the environment for command
-p pid trace process with process id PID, may be repeated
-u username run command as username handling setuid and/or setgid
Miscellaneous:
-d enable debug output to stderr
-v verbose mode: print unabbreviated argv, stat, termios, etc. args
-h print help message
-V print version
- -c 统计每个系统调用执行的时间,执行次数和出错次数
- -f 跟踪由fork调用生成的子进程
- -t 在输出的每一行前面加上时间信息
- -e trace=set 只跟踪指定的系统调用如,-e trace=open, close, read等
- -e trace=file 只跟踪文件操作相关的系统调用
- -e trace=network 只跟踪网络相关的系统调用
- -o 将结果输出到文件
- -p pid指定进程
demo
$ strace -c -p <pid> # ctrl-c 终止
$ strace -t -e trace=all -p <pid>
$ strace -f -tt -T -v -o out.log -p <pid>
$ strace -T -tt -e trace=all -p <pid>
$ strace ping qq.com
$ strace -c -p 128949
strace: Process 128949 attached
^Cstrace: Process 128949 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00 0.000018 0 294 poll
0.00 0.000000 0 7 process_vm_readv
------ ----------- ----------- --------- --------- ----------------
100.00 0.000018 301 total
说明:
- -c: 计数选项,提供了一些性能上的诊断信息,可以从
syscall
的CPU时间来查看
示例
$ ps -ef|grep xxx
root 1561 1 0 Aug16 ? 00:00:14 /usr/sbin/crond -n
root 31656 1561 0 10:00 ? 00:00:00 /usr/sbin/CROND -n
root 31664 31656 0 10:00 ? 00:00:00 /bin/sh -c bash run.sh 2>&1
root 31673 31664 0 10:00 ? 00:00:00 bash sub_task.sh
root 31729 31673 0 10:00 ? 00:00:17 python3 xxx
$ strace -tt -e trace=all -p 31656
strace: Process 31656 attached
15:40:46.557965 read(8,
$ strace -tt -e trace=all -p 31664
strace: Process 31664 attached
15:41:01.301453 wait4(-1,
$ strace -tt -e trace=all -p 31673
strace: Process 31673 attached
15:41:16.061078 wait4(-1,
$ strace -tt -e trace=all -p 31729
strace: Process 31729 attached
15:40:28.239911 futex(0x55dedf570be0, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY
- 通过进程
31656
等待 read(8,
,程序就卡在 pipe
这里:
$ cd /proc/31656/fd
$ ll
total 0
lr-x------ 1 root root 64 Aug 17 09:50 0 -> /dev/null
lrwx------ 1 root root 64 Aug 17 09:50 1 -> 'socket:[22525]'
lrwx------ 1 root root 64 Aug 17 09:50 2 -> 'socket:[22525]'
lrwx------ 1 root root 64 Aug 17 09:50 3 -> 'socket:[38981]'
lr-x------ 1 root root 64 Aug 17 09:50 4 -> anon_inode:inotify
l-wx------ 1 root root 64 Aug 17 09:50 7 -> /run/systemd/sessions/47374.ref
lr-x------ 1 root root 64 Aug 17 09:50 8 -> 'pipe:[43602808]'
$ cd /proc/31729
$ cat status
Name: python3
Umask: 0022
State: S (sleeping)
Tgid: 31729
Ngid: 0
Pid: 31729
...
$ cat stack
[<0>] futex_wait_queue_me+0xc1/0x120
[<0>] futex_wait+0x13f/0x240
[<0>] do_futex+0x321/0x5b0
[<0>] __x64_sys_futex+0x143/0x180
[<0>] do_syscall_64+0x5b/0x1b0
[<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca
[<0>] 0xffffffffffffffff
$ cat wchan
0
卡在 futex_wait_queue_me
是 Linux 内核的 bug,参考:https://bugs.centos.org/view.php?id=8703