追踪java 程序中的线程执行情况

当我们希望观察java程序运行的堆栈情况,多次执行jstack,或者通过cpu 采用发现某个线程都在同一个函数函数.

这可能是两种情况引起的:

  1. 在这个线程中,这个函数是高频函数.
  2. 线程当前阻塞在这个函数上.

Linux中没有真正的线程,java线程是由进程来模拟实现的. 这种进程又称作:轻量级进程(LWP).

因此可以通过strace -p 追踪其运行情况,来区分以上两种情况.

但是通过ps -ef或者jps只能看到java程序的pid. 如果需要追踪线程,可以jstack查找线程的pid.

线程第一个行中的 nid字段,就是该线程pid的16进制表示. 转化为10进制即可.

例如下面,线程显示在执行io上,读数据相对这个操作是比重多的. 但是是频繁执行io,还是文件系统出现问题,卡在io上,这是无法区分. 就可以这样追踪一下.

查找线程nid

[wjh@node1 ~]$ jps
1287939 AppMockRealDedupDelete
1287937 Launcher
1293889 Jps
1098227 Main
1099286 Launcher
1098463 RemoteMavenServer36

[wjh@node1 ~]$ jstack 1287939
2021-09-26 16:11:56
Full thread dump OpenJDK 64-Bit Server VM (11.0.12+7-LTS mixed mode, sharing):

Threads class SMR info:
_java_thread_list=0x00007f6570000e90, length=15, elements={
0x00007f66a8018000, 0x00007f66a8399000, 0x00007f66a839b000, 0x00007f66a83a4800,
0x00007f66a83a6800, 0x00007f66a83a8800, 0x00007f66a83b2800, 0x00007f66a83b4800,
0x00007f66a83d3000, 0x00007f66a8719800, 0x00007f66a8b66000, 0x00007f66a8bc0800,
0x00007f6648001000, 0x00007f65dc001000, 0x00007f65b0001800
}

"main" #1 prio=5 os_prio=0 cpu=2159223.33ms elapsed=2533.45s tid=0x00007f66a8018000 nid=0x13a705 runnable  [0x00007f66b24d7000]
   java.lang.Thread.State: RUNNABLE
    at java.io.FileInputStream.readBytes(java.base@11.0.12/Native Method)
    at java.io.FileInputStream.read(java.base@11.0.12/FileInputStream.java:279)
    at java.io.BufferedInputStream.fill(java.base@11.0.12/BufferedInputStream.java:252)
    at java.io.BufferedInputStream.read1(java.base@11.0.12/BufferedInputStream.java:292)
    at java.io.BufferedInputStream.read(java.base@11.0.12/BufferedInputStream.java:351)
    - locked <0x000000051d48ad88> (a java.io.BufferedInputStream)
    at java.io.FilterInputStream.read(java.base@11.0.12/FilterInputStream.java:107)
    at com.afonddream.algorithm.real.ReadRealData.loadData(ReadRealData.java:25)
    at com.afonddream.algorithm.app.AppMockRealDedupDelete.mockDedup(Unknown Source)
    at com.afonddream.algorithm.app.AppMockRealDedupDelete.mock(Unknown Source)
    at com.afonddream.algorithm.app.AppMockRealDedupDelete.main(Unknown Source)


转化为10进制

[wjh@node1 ~]$ python
Python 2.7.18 (default, Jan 22 2021, 11:25:37)
[GCC 8.4.1 20200928 (Red Hat 8.4.1-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 0x13a705
1287941
>>>

追踪线程执行情况

原来程序在频繁执行io函数.

[wjh@node1 ~]$ strace -p 1287941
strace: Process 1287941 attached
read(112, "", 38797312)                 = 0
munmap(0x7f6561afe000, 38801408)        = 0
mmap(NULL, 38801408, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6561afe000
read(112, "", 38797312)                 = 0
munmap(0x7f6561afe000, 38801408)        = 0
mmap(NULL, 38801408, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6561afe000
read(112, "", 38797312)                 = 0
munmap(0x7f6561afe000, 38801408)        = 0
mmap(NULL, 38801408, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6561afe000
...