At our last meeting, Corky suggested another approach for the first cut of our test framework. Instead of going for the full record-and-replay framework that works with fixed schedules that allow bugs to be reproduced, I should work on a version that inserts random delays at synchronization points. This would sort of be a derivative the heuristic version he initially proposed and that should be used when there are too many schedules to run them all. It has drawbacks, namely that it does not try all schedules and that a schedule which lead to an error cannot be reproduced, but it does not require the replay mechanism that I’ve just had so many difficulties with.
After making some more small improvements on the deadlock detector yesterday and earlier today, I’ve started to take a look at what would be necessary to make this new approach work. The deadlock detector isn’t finished either. I still have to make it work on bigger programs.
Which Synchronization Points to Watch?
I need to consider the following synchronization points (synchronized methods get converted to unsynchronized methods with a synchronized block again):
monitorenter
monitorexit
Object.wait
Object.notify
Object.notifyAll
Thread.join
Thread.start
Thread.run
(updated)Thread.exit
Where to Insert the Delays?
Where should the delays be inserted? For monitorenter
, there should definitely be a delay before the block is entered: At that time, any thread can still acquire the lock and dramatically change program execution. Conversely, for monitorexit
, there should probably be a delay after the block is exited: At that time, the thread has relinquished control of the lock and another thread might grab it. I’m not really sure this delay is necessary, though, since anything important should have a monitorenter
before it anyway, so if the same thread that just released a lock were to acquire another one, there would be another delay anyway. monitorexit
nonetheless is a place where the synchronization sequence can easily change, so I’m going to put a delay there too. It might also be prudent to insert a delay after the monitorenter
to expose situations in which a lock is used but it is ineffective.
The same that applied to monitorenter
also applies to Object.wait
, which can be thought of as a generalized version in which any thread — not just the JVM — can wake up any other thread. I will also handle Thread.join
that way: It’s sort of a “wake-up by thread death”.
For Object.notify
and Object.notifyAll
, I am going to insert a delay before the call to decouple the notification order from the order in which the threads doing the notifications execute. Right now I don’t think a delay after the call is necessary.
Finally, Thread.start
and Thread.exit
will have delays after the thread start and before the thread death, respectively. Here’s a summary:
Delays before:
monitorenter
Object.wait
Object.notify
Object.notifyAll
Thread.join
Thread.exit
Delays after:
monitorenter
monitorexit
Object.wait
Thread.join
Thread.start
Thread.run
(updated)
How to Determine the Lengths of the Delays?
The lengths should be random numbers with an upper and a lower bound, e.g. between 50 ms and 1000 ms. I still have to think about how I am going to generate these random numbers. It’s not important that the numbers appear very random, but the generation should be efficient. I guess just using Random.nextInt(int n)
is fast enough, but I’ve also considered using a pre-generated array of delay lengths. That way, getting the next random number could be sped up. I don’t think that’s necessary, though.
The seed that is being used for a particular execution should be determined randomly or by the user. If it was chosen randomly (by using the time; Java adds to System.nanoTime()
), then it should be printed out or stored somewhere. If errors are found, the user could than reuse that seed to generate the same delays. That would not guarantee that the same sequence of synchronization points is generated again, but it should strongly favor some schedules over others.
I think I can just use Thread.sleep
for the delays. It’s a native method and may have an impact on scheduling, i.e. it might immediately switch to another thread, but that’s ok. In fact, it is probably better than a tight for
loop that causes the thread to hog the CPU.
When to Delay?
I came up with three different strategies for determining when to delay the program (the section above that lists the synchronization points describes where to delay). These strategies range from simple (and inefficient) to complex (but perhaps more efficient):
- Always delay.
- Delay if more than one thread is running.
- Delay if more than one thread is running and the current thread does not already hold the lock.
Here are descriptions of the three strategies:
Strategy 1: Always delay.
When the program reaches one of the synchronization points and there is a delay (before it, after it, or both before and after it), that delay is always executed. As an example, here is what monitorenter
would look like in pseudocode:
sleep(getNextRandomDelay());
monitorenter;
sleep(getNextRandomDelay());
Strategy 2: Delay if more than one thread is running.
If there is only one thread that is currently running, then there is no point in delaying the execution. The program therefore keeps a counter public static volatile long _threadsRunning;
in the SyncPointBuffer
class. The counter gets changed at the synchronization points in the following way:
Thread.start
: incrementThread.exit
: decrement- before
monitorenter
: decrement - after
monitorenter
: increment monitorexit
: no change- before
Object.wait
: decrement - after
Object.wait
: increment - before
Thread.join
: decrement - after
Thread.join
: increment
The random delays are then only executed if the counter indicates there is more than one active thread (i.e. the counter is 0 or 1, depending on the sync point and the location of the delay). As an example, here is monitorenter
in pseudocode:
if (\_threadsRunning>1) { sleep(getNextRandomDelay()); }
synchronized(SyncPointBuffer.class) {
\_threadsRunning = \_threadsRunning - 1;
}
monitorenter;
synchronized(SyncPointBuffer.class) {
\_threadsRunning = \_threadsRunning - 1;
}
if (\_threadsRunning>1) { sleep(getNextRandomDelay()); }
Unfortunately, this code will get a little more complicated: Thread.sleep
throws a checked InterruptedException
which I need to catch, but I can just do that in a wrapper method.
The code to increment and decrement the counter needs to be synchronized because it involves both a read and a write, but I have to make sure the lock is released before the call to sleep
, otherwise the delay would not allow other interleavings and the whole exercise would be pointless.
Strategy 3: Delay if more than one thread is running and the current thread does not already hold the lock.
Program execution can be sped up by using strategy 2, but not invoking a delay at a monitorenter
if the thread already owned the lock. This often happens in recursive calls, e.g.
void foo(int n) {
if (n>0) {
synchronized(this) {
// ...
foo(n-1);
}
}
}
The synchronized block is entered n times, always with the same object, so the lock only has to be acquired and released once; all the monitorenter
instructions except for the first and all monitorexit
instructions except for the last are no-ops.
To figure out if a thread already owns a lock, the program has to maintain a set of locks for every thread. To not interfere with garbage collection, it’s probably a good idea to use object IDs and a set of the object IDs of the owned locks.
This requires a lot more effort just for bookkeeping than a simple counter, and object IDs are still a complicated issue. Considering that strategy 2 isn’t a lot harder to implement than strategy 1, but doesn’t require object IDs or abstract data structures, I am going to implement strategy 2. Perhaps I am going to implement strategy 1 first, though.
How to Run the Program?
Except for the seed value for the random number generator, I don’t need to report any data. The important behavior is a simply a failure of a unit test in one of the schedules. I will therefore try to make it possible to run the program in just one VM without an attached monitor program using JPDA. That should make execution a lot more efficient.
This is somewhat of a return to my implementation approach from the beginning of the project and may fail, but I think it’s worth the effort to reevaluate. I’ll try to make the instrumentors compatible with the other instrumentors I have, so a second VM can be used if the user wants to.
For now, I’ll also opt for an offline instrumentation, i.e. I will use the FileInstrumentor
program and not a class loader for dynamic instrumentation.
Corky has also suggested only instrumenting user classes and not the entire rt.jar file. For record and replay, I don’t think that was enough, but for this approach it should be sufficient.