Detection of a Java memory problem causing overuse of the garbage collector:
[orhus@lxinalta ~]$ top
top - 15:09:00 up 134 days, 14:14, 1 user, load average: 4,20, 3,91, 2,94
Tasks: 223 total, 1 running, 222 sleeping, 0 stopped, 0 zombie
%Cpu(s): 98,5 us, 1,5 sy, 0,0 ni, 0,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
MiB Mem : 31804,2 total, 334,3 free, 29177,8 used, 2292,1 buff/cache
MiB Swap: 4096,0 total, 2266,8 free, 1829,2 used. 1980,2 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1816128 orhus 20 0 9601648 7,4g 42824 S 387,9 23,8 170:32.46 java
1885835 orhus 20 0 54576 4512 3588 R 3,0 0,0 0:00.05 top
1 root 20 0 247756 10436 6372 S 0,0 0,0 52:29.29 systemd
2 root 20 0 0 0 0 S 0,0 0,0 0:08.29 kthreadd
3 root 0 -20 0 0 0 I 0,0 0,0 0:00.00 rcu_gp
4 root 0 -20 0 0 0 I 0,0 0,0 0:00.00 rcu_par_gp
5 root 0 -20 0 0 0 I 0,0 0,0 0:00.00 slub_flushwq
6 root 0 -20 0 0 0 I 0,0 0,0 0:00.00 netns
8 root 0 -20 0 0 0 I 0,0 0,0 0:00.00 kworker/0:0H-events_highpri
10 root 0 -20 0 0 0 I 0,0 0,0 1:26.86 kworker/0:1H-kblockd
11 root 0 -20 0 0 0 I 0,0 0,0 0:00.00 mm_percpu_wq
12 root 20 0 0 0 0 S 0,0 0,0 0:00.00 rcu_tasks_rude_
13 root 20 0 0 0 0 S 0,0 0,0 0:00.00 rcu_tasks_trace
14 root 20 0 0 0 0 S 0,0 0,0 2:02.70 ksoftirqd/0
15 root 20 0 0 0 0 I 0,0 0,0 21:31.90 rcu_sched
16 root rt 0 0 0 0 S 0,0 0,0 0:27.14 migration/0
18 root 20 0 0 0 0 S 0,0 0,0 0:00.00 cpuhp/0
...
We are on a server with 4 CPU cores. The summary line shows 98.5% of these 4 cores busy in user space; since top reports per-process CPU usage relative to a single core, that matches the 387.9% (almost 4 full cores) shown for the java process.
The culprit is the Java process with PID 1816128.
[orhus@lxinalta ~]$ top -H -p 1816128
top - 15:09:49 up 134 days, 14:15, 1 user, load average: 4,08, 3,91, 2,99
Threads: 245 total, 4 running, 241 sleeping, 0 stopped, 0 zombie
%Cpu(s): 93,8 us, 1,2 sy, 0,0 ni, 4,6 id, 0,0 wa, 0,0 hi, 0,0 si, 0,5 st
MiB Mem : 31804,2 total, 343,6 free, 29167,4 used, 2293,2 buff/cache
MiB Swap: 4096,0 total, 2266,8 free, 1829,2 used. 1990,3 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1816494 orhus 20 0 9603668 7,4g 42824 R 93,0 23,8 21:10.20 GC Thread#2
1816495 orhus 20 0 9603668 7,4g 42824 R 93,0 23,8 21:07.37 GC Thread#3
1816134 orhus 20 0 9603668 7,4g 42824 R 92,0 23,8 21:08.55 GC Thread#0
1816492 orhus 20 0 9603668 7,4g 42824 R 92,0 23,8 21:09.43 GC Thread#1
1816154 orhus 20 0 9603668 7,4g 42824 S 5,0 23,8 0:39.12 VM Thread
1816128 orhus 20 0 9603668 7,4g 42824 S 0,0 23,8 0:00.00 java
1816131 orhus 20 0 9603668 7,4g 42824 S 0,0 23,8 0:12.48 java
1816136 orhus 20 0 9603668 7,4g 42824 S 0,0 23,8 0:00.69 G1 Main Marker
1816138 orhus 20 0 9603668 7,4g 42824 S 0,0 23,8 21:15.46 G1 Conc#0
1816146 orhus 20 0 9603668 7,4g 42824 S 0,0 23,8 0:56.42 G1 Refine#0
1816147 orhus 20 0 9603668 7,4g 42824 S 0,0 23,8 0:28.82 G1 Young RemSet
The top command with the -H option displays the threads associated with the Java process. Almost all of the CPU is being consumed by four garbage collection threads: the Java application is looping in an attempt to free up memory, failing to reclaim enough of it, and trying again and again…
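For illustration, here is a minimal, self-contained sketch (the class name and block size are made up for the example) of the allocation pattern that produces this symptom: every allocation stays reachable, so each GC cycle frees almost nothing and the collector runs back to back. Started with a deliberately small heap, for example java -Xmx64m GcThrash, it quickly shows the same picture in top -H before ending in an OutOfMemoryError.

import java.util.ArrayList;
import java.util.List;

public class GcThrash {
    public static void main(String[] args) {
        // Every block stays reachable through this list, so the garbage
        // collector cannot reclaim memory no matter how often it runs.
        List<byte[]> retained = new ArrayList<>();
        while (true) {
            retained.add(new byte[1024 * 1024]); // 1 MiB per iteration
        }
    }
}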
The solution is to increase the -Xmx parameter (maximum heap size) of the Java application.
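How the flag is passed depends on how the application is started; as a purely illustrative example (the 8g value and the jar name are hypothetical), a direct launch would look like java -Xmx8g -jar application.jar. After restarting, the effective limit can be sanity-checked from inside the JVM, since Runtime.maxMemory() reflects the configured maximum heap size, roughly the -Xmx value; a minimal sketch:

public class MaxHeap {
    public static void main(String[] args) {
        // Runtime.maxMemory() returns the maximum amount of memory the JVM
        // will attempt to use, which corresponds to the -Xmx setting.
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.printf("Max heap: %.1f MiB%n", maxBytes / (1024.0 * 1024.0));
    }
}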