Java - Hang Troubleshooting

About

This page describes procedures for troubleshooting hanging or looping Java processes.

Reasons

A hang often stems from one of the following:

  • A deadlock in application code, API code, or library code.
  • A bug in the HotSpot virtual machine.

Steps

Find the process ID

Java - Process
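As a minimal sketch of this step from Java itself, the snippet below lists visible processes whose executable looks like a Java launcher. It assumes JDK 9 or later (for `ProcessHandle`), and the class and method names are hypothetical. The name filter is a heuristic that misses JVMs started through a custom launcher. From the command line, the JDK's `jps` tool serves the same purpose.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: lists candidate Java processes with their PIDs.
class FindJavaProcesses {

    // Returns "pid command" for each visible process whose executable
    // name ends with "java" or "java.exe" (a heuristic, not a guarantee).
    static List<String> javaProcesses() {
        return ProcessHandle.allProcesses()
                .filter(p -> p.info().command()
                        .map(c -> c.endsWith("java") || c.endsWith("java.exe"))
                        .orElse(false))
                .map(p -> p.pid() + " " + p.info().command().orElse("?"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("Current JVM pid: " + ProcessHandle.current().pid());
        javaProcesses().forEach(System.out::println);
    }
}
```

The PID printed here is the value that tools such as jstack expect as their process ID argument.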

CPU state

An initial step when diagnosing a hang is to find out if the VM process is idle or consuming all available CPU cycles.

If the process appears to be busy and is consuming all available CPU cycles, then the issue is more likely a looping thread than a deadlock.
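When the process is busy, it can help to see which threads account for the CPU time. The following is a rough sketch, not a definitive tool: it samples per-thread CPU time twice through the standard `ThreadMXBean` and reports the difference. It measures the JVM it runs in; inspecting another process would require obtaining the same bean over a JMX connection, which is not shown. The class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: CPU time consumed per thread over a short interval.
class BusyThreads {

    // Maps thread id -> CPU nanoseconds used during the interval.
    // Returns an empty map if the JVM cannot measure thread CPU time.
    static Map<Long, Long> cpuDelta(long intervalMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Long, Long> delta = new HashMap<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return delta;
        }
        Map<Long, Long> before = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            before.put(id, mx.getThreadCpuTime(id)); // -1 if the thread is gone
        }
        try {
            Thread.sleep(intervalMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        for (long id : mx.getAllThreadIds()) {
            long now = mx.getThreadCpuTime(id);
            Long then = before.get(id);
            if (now >= 0 && then != null && then >= 0) {
                delta.put(id, now - then);
            }
        }
        return delta;
    }

    public static void main(String[] args) {
        cpuDelta(1000).forEach((id, nanos) ->
                System.out.println("thread " + id + ": " + nanos / 1_000_000 + " ms CPU"));
    }
}
```

A thread whose delta is close to the sampling interval was on a CPU almost the whole time, which makes it a candidate for a looping thread. Deltas near zero across all threads suggest an idle process.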

Causes

Looping

  • Get a thread dump. In some cases it might be necessary to get a sequence of thread dumps in order to determine which threads appear to be continuously busy.

If the VM does not respond to Ctrl-\, this could indicate a VM bug rather than an issue with application or library code. In that case, use jstack with both the -F (force) and -m (mixed mode) options to get a thread stack for all threads (VM internal threads included):

jstack -F -m <process ID>

  • Focus initially on the threads that are in the RUNNABLE state. This is the most likely state for threads that are busy and possibly looping.
  • If a thread appears to be always in the RUNNABLE state, use the -m option of jstack to print the native frames, which can provide a further hint about what the thread is doing.
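The idea of comparing a sequence of thread dumps can be sketched in code. The example below is an illustration under stated assumptions rather than a replacement for jstack: it takes several in-process dumps with `ThreadMXBean.dumpAllThreads` and keeps only the threads that were RUNNABLE in every one of them. The class and method names are hypothetical. Note that RUNNABLE also covers threads blocked in native I/O, so the result is a list of candidates, not proof of a loop.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: threads that stay RUNNABLE across several dumps.
class RunnableAcrossDumps {

    // Takes `samples` thread dumps, `pauseMillis` apart, and returns the ids
    // of threads that were in the RUNNABLE state in all of them.
    static Set<Long> alwaysRunnable(int samples, long pauseMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Set<Long> candidates = null;
        for (int i = 0; i < samples; i++) {
            Set<Long> runnable = new HashSet<>();
            for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
                if (info != null && info.getThreadState() == Thread.State.RUNNABLE) {
                    runnable.add(info.getThreadId());
                }
            }
            if (candidates == null) {
                candidates = runnable;          // first dump seeds the set
            } else {
                candidates.retainAll(runnable); // keep only repeat offenders
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return candidates == null ? new HashSet<>() : candidates;
    }

    public static void main(String[] args) {
        System.out.println("Always RUNNABLE: " + alwaysRunnable(5, 200));
    }
}
```

The thread that runs this check is itself RUNNABLE while each dump is taken, so it always appears in the result and should be ignored.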

Hung

If the application appears to be hung and the process appears to be idle, then the first step is to try to obtain a thread dump.
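When a tool such as jstack is not at hand, a thread dump and a deadlock check can also be produced from inside the JVM. The sketch below uses the standard `ThreadMXBean` API; the class and method names are hypothetical, and it assumes the code can still run in the affected JVM (for example on a watchdog thread that is not itself blocked).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: in-process thread dump and deadlock detection.
class DeadlockCheck {

    // Ids of threads deadlocked on monitors or ownable synchronizers;
    // an empty array means no deadlock was detected.
    static long[] deadlockedThreadIds() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        return ids == null ? new long[0] : ids;
    }

    // Text dump of all live threads. ThreadInfo.toString() prints only the
    // top frames of each stack, so this is shorter than a jstack dump.
    static String threadDump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(),
                mx.isSynchronizerUsageSupported());
        for (ThreadInfo info : infos) {
            if (info != null) {
                sb.append(info);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(threadDump());
        System.out.println("Deadlocked threads: " + deadlockedThreadIds().length);
    }
}
```

An empty result from `deadlockedThreadIds()` corresponds to the case described next, where the dump is available but no deadlock is reported.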

Deadlock Not Detected

If the thread dump is printed and no deadlocks are found, then the issue might be a bug in which a thread is waiting on a monitor that is never notified. This could be a timing issue or a general logic bug.

To find out more about the issue, examine each thread in the thread dump, paying particular attention to every thread that is blocked in Object.wait(). The caller frame in the stack trace indicates the class and method that is invoking the wait() method. If the code was compiled with line number information (the default), the frame also points to the line of code to examine. In most cases you must have some knowledge of the application logic or library in order to diagnose this issue further. In general, you must understand how synchronization works in the application, and in particular when and where monitors are notified.
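The search for threads blocked in Object.wait() and their caller frames can be automated. The following is a minimal sketch with hypothetical class and method names: for each thread whose top frames belong to `java.lang.Object` wait methods, it reports the first frame below them, which is the caller described above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: who is calling Object.wait() right now?
class WaitingThreads {

    // Returns one line per thread blocked in Object.wait(), in the form
    // "<thread name> waits in <caller frame>".
    static List<String> waitCallers() {
        List<String> out = new ArrayList<>();
        ThreadInfo[] infos =
                ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue;
            }
            StackTraceElement[] stack = info.getStackTrace();
            int i = 0;
            // Skip the wait frames themselves; some JDKs show more than one
            // (for example a native frame below the public wait method).
            while (i < stack.length
                    && stack[i].getClassName().equals("java.lang.Object")
                    && stack[i].getMethodName().startsWith("wait")) {
                i++;
            }
            if (i > 0 && i < stack.length) {
                out.add(info.getThreadName() + " waits in " + stack[i]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        waitCallers().forEach(System.out::println);
    }
}
```

Each reported frame includes the file name and line number when the class was compiled with line number information. JVM housekeeping threads may also appear in the list, depending on the JDK version, and can usually be ignored.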
