- A Concurrent Affair - https://www.concurrentaffair.org -

First Cut of the “Doomed wait” Solution

I made a small change to my random delay code to improve how tests with “doomed waits” are handled: instead of just calling SyncPointBuffer.randomDelay before a wait, I now call SyncPointBuffer.delayObjectWait. This method first checks whether only one user thread is alive and, if so, throws an error. Here’s the output of such an execution:

$ ant run-delay -Dtest-class-name=SyncProblem3
Buildfile: build.xml

run-delay:
     [echo] Test name = SyncProblem3
     [java] .Main thread starting worker thread...
     [java] Main thread started worker thread...
     [java] Worker thread running
     [java] Main thread waits...
     [java] Worker thread calling notify
     [java] E
     [java] Time: 3.328
     [java] There was 1 error:
     [java] 1) testNotifyTooEarly(SyncProblem3)java.lang.AssertionError:
              Call to Object.wait with only one user thread alive (_runningThreads==3)
     [java]     at edu.rice.cs.cunit.SyncPointBuffer.delayObjectWait
                   (SyncPointBuffer.java:825)
     [java]     at SyncProblem3.testNotifyTooEarly
                   (SyncProblem3.java:42)
     [java]     at sun.reflect.NativeMethodAccessorImpl.invoke0
                   (Native Method)
     [java]     at sun.reflect.NativeMethodAccessorImpl.invoke
                   (NativeMethodAccessorImpl.java:39)
     [java]     at sun.reflect.DelegatingMethodAccessorImpl.invoke
                   (DelegatingMethodAccessorImpl.java:25)

     [java] FAILURES!!!
     [java] Tests run: 1,  Failures: 0,  Errors: 1


BUILD FAILED
R:\Concutest\ClassLoader\build.xml:722: Java returned: 1

Total time: 4 seconds

But as I wrote in an update [1] to an earlier post, my tests indicate that a wait without timeout should be broken down into a series of wait calls with timeouts, interspersed with checks of the number of living threads. Because we’re dealing with concurrency here, there is a chance that the second-to-last thread has not died yet when the check is made, but then dies before wait is called. In that case a single check up front would miss the doomed wait.

To make this change, I have to write my own version of Object.wait and replace calls to Object.wait with calls to my version. I’ve done that before, so it shouldn’t be difficult.
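As a rough sketch of what such a replacement might look like (this is not the actual Concutest code): the class name CheckedWait, the method waitChecked, the sliceMillis parameter and the liveUserThreads supplier are all made up for illustration. The sketch assumes the caller already holds the monitor, just as with Object.wait, and that something else supplies the current number of living user threads.

```java
import java.util.function.IntSupplier;

// Hypothetical helper, not part of Concutest: an untimed wait broken into
// timed slices with a thread-count check before each slice.
final class CheckedWait {
    private CheckedWait() {}

    /**
     * Stand-in for lock.wait(). The caller must hold lock's monitor.
     * liveUserThreads is assumed to report the current number of living
     * user threads, including the calling thread.
     */
    static void waitChecked(Object lock, long sliceMillis, IntSupplier liveUserThreads)
            throws InterruptedException {
        if (sliceMillis <= 0) {
            // Object.wait(0) means "wait forever", which defeats the purpose.
            throw new IllegalArgumentException("sliceMillis must be positive");
        }
        while (true) {
            int alive = liveUserThreads.getAsInt();
            if (alive <= 1) {
                throw new AssertionError(
                    "Call to Object.wait with only one user thread alive (count==" + alive + ")");
            }
            long start = System.nanoTime();
            lock.wait(sliceMillis);
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
            if (elapsedMillis < sliceMillis) {
                // Woke up early: notify, notifyAll, or a spurious wake-up.
                return;
            }
            // Otherwise assume the slice timed out and check the count again.
        }
    }
}
```

One caveat with this approach: a timed Object.wait does not report whether it returned because of a notify or because of the timeout, so the sketch guesses from the elapsed time. A notify that arrives right at the end of a slice, or a slow re-acquisition of the monitor, could be mistaken for a timeout. A real version would probably need the instrumented notify calls to record that a notification happened.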

I have to admit, though, that this situation has so far been hard to provoke with random delays, and even then the detection only works on a small scale: if there are several other threads running in the background, e.g. a GUI event thread, the thread count will not drop far enough. For the GUI event thread, I can probably adjust the minimum number, but if additional user threads are alive, the “doomed wait” will not be detected.
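One way to handle known background threads, instead of raising the minimum number, would be to leave them out of the count. The following is only a sketch under assumptions: it counts threads through Thread.getAllStackTraces rather than through the instrumented counter my code actually uses, and the class UserThreadCount and its ignoredNamePrefixes parameter are hypothetical. The AWT event thread is typically named something like "AWT-EventQueue-0", so a prefix match should be able to skip it.

```java
import java.util.List;

// Hypothetical helper, not part of Concutest: counts live user threads
// while skipping threads that are known to stay around in the background.
final class UserThreadCount {
    private UserThreadCount() {}

    /**
     * Returns the number of live non-daemon threads whose names do not
     * start with any of the given prefixes.
     */
    static int liveUserThreads(List<String> ignoredNamePrefixes) {
        int count = 0;
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (!t.isAlive() || t.isDaemon()) {
                continue; // dead or daemon threads cannot rescue a waiting thread's test
            }
            boolean ignored = false;
            for (String prefix : ignoredNamePrefixes) {
                if (t.getName().startsWith(prefix)) {
                    ignored = true;
                    break;
                }
            }
            if (!ignored) {
                count++;
            }
        }
        return count;
    }
}
```

This does nothing for the second problem mentioned above: an unrelated user thread that is still alive keeps the count above one, and the “doomed wait” goes unnoticed.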
