To analyze why and how the slave gets stuck, I’ve added some diagnostics to the JPDA monitor: counters that display the number of successful updates, the number of delayed updates, and the number of synchronization points in the slave’s list at the time of the last update attempt.
There are three scenarios:
- If everything is going well, then the successful updates counter keeps incrementing, and the synchronization point counter displays the number of points transferred during each update.
- If the JPDA monitor cannot update because the slave always happens to be in SyncPoint.add() when an update is attempted, but the program is actually advancing, then the delayed updates counter keeps incrementing, and the synchronization point counter keeps growing as well, because the list never gets emptied.
- If the JPDA monitor cannot update because the slave is stuck in one and the same call to SyncPoint.add(), then the delayed updates counter keeps incrementing, but the counter for the synchronization points remains unchanged. That’s because more update attempts are made and get delayed, but no more synchronization points are added.
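The three scenarios above boil down to comparing two consecutive readings of the counters. Here is a minimal sketch of that decision logic. It is not the monitor's actual code: the class name `StallClassifier`, the `Sample` fields, and the state names are all made up for illustration.

```java
// Hypothetical sketch: classifies the slave's state from two consecutive
// readings of the three diagnostic counters. All names are illustrative.
class StallClassifier {

    /** One reading of the three counters shown by the monitor. */
    static final class Sample {
        final long successfulUpdates;
        final long delayedUpdates;
        final int syncPointsInList;

        Sample(long successfulUpdates, long delayedUpdates, int syncPointsInList) {
            this.successfulUpdates = successfulUpdates;
            this.delayedUpdates = delayedUpdates;
            this.syncPointsInList = syncPointsInList;
        }
    }

    enum State { HEALTHY, BUSY_BUT_ADVANCING, STUCK_IN_ADD, UNDETERMINED }

    static State classify(Sample previous, Sample current) {
        // Scenario 1: at least one update went through since the last reading.
        if (current.successfulUpdates > previous.successfulUpdates) {
            return State.HEALTHY;
        }
        // No successful update; were further attempts made and delayed?
        if (current.delayedUpdates > previous.delayedUpdates) {
            // Scenario 2: the list still grows, so the slave keeps entering add().
            if (current.syncPointsInList > previous.syncPointsInList) {
                return State.BUSY_BUT_ADVANCING;
            }
            // Scenario 3: attempts are delayed, but no new points arrive.
            return State.STUCK_IN_ADD;
        }
        // No update attempt between the two readings; nothing can be concluded.
        return State.UNDETERMINED;
    }

    public static void main(String[] args) {
        Sample before = new Sample(5, 10, 40);
        Sample after = new Sample(5, 13, 40);
        System.out.println(classify(before, after)); // prints STUCK_IN_ADD
    }
}
```

The `UNDETERMINED` case is an addition of this sketch: if no update was attempted between two readings, none of the three scenarios can be told apart.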
What I observe, unfortunately very often, is the last behavior. That means the slave program really is stuck in one and the same call to SyncPointRecorder.add(); there is no other explanation. If it had left that call without entering another one, the next update would succeed and I would see the first behavior. If it had left the call and entered another one, I would see the second.
I have no idea, though, why it gets stuck. Take a look at the source code in the previous posting. I don’t see any reason.
I’ve also noticed that every once in a while I still get a HotSpot VM error:
    #
    # An unexpected error has been detected by HotSpot Virtual Machine:
    #
    # EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x6d72b030, pid=3980, tid=2848
    #
    # Java VM: Java HotSpot(TM) Client VM (1.5.0_04-b05 mixed mode)
    # Problematic frame:
    # V [jvm.dll+0xeb030]
Again, no clue about that. And I have a NullPointerException in the deadlock detection after I do a manual update of the threads in the slave VM, but that’s probably just an oversight.