I made a small change to my random delay code to improve how tests with “doomed waits” are handled. Instead of just calling SyncPointBuffer.randomDelay before a wait, I now call SyncPointBuffer.delayObjectWait. This method first checks whether only one user thread is alive and, if so, throws an error. Here’s the output of such an execution:
$ ant run-delay -Dtest-class-name=SyncProblem3
Buildfile: build.xml
run-delay:
[echo] Test name = SyncProblem3
[java] .Main thread starting worker thread...
[java] Main thread started worker thread...
[java] Worker thread running
[java] Main thread waits...
[java] Worker thread calling notify
[java] E
[java] Time: 3.328
[java] There was 1 error:
[java] 1) testNotifyTooEarly(SyncProblem3)java.lang.AssertionError: Call to Object.wait with only one user thread alive (_runningThreads==3)
[java] at edu.rice.cs.cunit.SyncPointBuffer.delayObjectWait(SyncPointBuffer.java:825)
[java] at SyncProblem3.testNotifyTooEarly(SyncProblem3.java:42)
[java] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[java] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[java] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[java] FAILURES!!!
[java] Tests run: 1, Failures: 0, Errors: 1
BUILD FAILED
R:\Concutest\ClassLoader\build.xml:722: Java returned: 1
Total time: 4 seconds
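To illustrate the kind of check that produces the error above, here is a minimal sketch. It is not the actual SyncPointBuffer code: the class and method names are hypothetical, and it counts user threads with Thread.getAllStackTraces rather than with an instrumented counter like _runningThreads.

```java
public class UserThreadCheck {
    /** Counts live non-daemon ("user") threads currently known to the JVM. */
    static int countLiveUserThreads() {
        int count = 0;
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon()) {
                count++;
            }
        }
        return count;
    }

    /**
     * Hypothetical pre-wait check: fails if fewer than minUserThreads user
     * threads are alive, because then nobody may be left to call notify.
     */
    static void checkBeforeWait(int minUserThreads) {
        int alive = countLiveUserThreads();
        if (alive < minUserThreads) {
            throw new AssertionError(
                "Call to Object.wait with too few user threads alive (alive=" + alive + ")");
        }
    }
}
```

Calling checkBeforeWait(2) right before a wait would correspond to the “only one user thread alive” condition, since the waiting thread itself counts as one.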
But as I wrote in an update to an earlier post, my tests indicate that a wait without a timeout should be broken down into a series of wait calls with timeouts, interspersed with checks of the number of living threads. Because we’re dealing with concurrency here, there is a chance that the second-to-last thread has not died at the time the check is made, but then dies before wait is called. A single check before the wait would miss that case, and the wait would block forever.
To make this change, I have to write my own version of Object.wait and replace calls to it with calls to my own version. I’ve done that before already, so it shouldn’t be difficult.
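A replacement along those lines might look like the sketch below. The names CheckedWait, waitChecked and SLICE_MILLIS are hypothetical, and so is the 50 ms slice length. The sketch also assumes the caller can supply a condition to wait for: Object.wait(long) gives no indication of whether it returned because of a notify or because of a timeout, so a true drop-in replacement for a bare wait() would need some additional bookkeeping that is not shown here.

```java
import java.util.function.BooleanSupplier;

public class CheckedWait {
    /** Hypothetical length of each timed wait, in milliseconds. */
    static final long SLICE_MILLIS = 50;

    /** Counts live non-daemon ("user") threads currently known to the JVM. */
    static int countLiveUserThreads() {
        int count = 0;
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon()) {
                count++;
            }
        }
        return count;
    }

    /**
     * Waits on lock until done becomes true, using short timed waits.
     * Before every slice the number of live user threads is re-checked,
     * so a thread that dies after an earlier check is still noticed.
     */
    static void waitChecked(Object lock, BooleanSupplier done, int minUserThreads)
            throws InterruptedException {
        synchronized (lock) {
            while (!done.getAsBoolean()) {
                int alive = countLiveUserThreads();
                if (alive < minUserThreads) {
                    throw new AssertionError(
                        "Doomed wait: only " + alive + " user thread(s) alive");
                }
                lock.wait(SLICE_MILLIS); // returns on notify or after the timeout
            }
        }
    }
}
```

Because the check is repeated after every slice, the window in which the second-to-last thread can die unnoticed shrinks from the whole wait to at most one slice.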
I have to admit, though, that this situation has so far been hard to provoke with random delays, and even then the detection only works on a small scale: if there are several other threads running in the background, e.g. a GUI event thread, then the thread count will not drop low enough for the check to fire. For the GUI event thread, I can probably adjust the minimum thread count, but if there are additional user threads alive, then the “doomed wait” will not be detected.
