Slow-Downs in DrJava

On Friday, before and during COMP 312, the production programming class that drives a lot of the development of DrJava, we discussed the reasons for the apparent slowdown of DrJava. At the beginning, I quite randomly mentioned that the reduced model was in list form and therefore wasn’t set up to handle more complicated requests, such as what class the cursor was in. Naturally it wasn’t set up to do that, that wasn’t on the programmers’ minds when the code was written, and it performs quite well for syntax highlighting, the main purpose, and pretty much acceptably for indenting. I just thought out loud if Barnaby’s attempt to write a tree-shaped reduced model would help satisfy those more complicated queries.

Corky vigorously defended the list-shaped reduced model. I’m not quite sure why, because I wasn’t even attacking it. Then he mentioned the slowdowns of DrJava that he had noticed, compared to about a year ago. A year ago was approximately when I started making major changes to the DrJava codebase and adding features like “Go to File”, persistent breakpoints and bookmarks, “Open Javadoc”, a simple form of code completion, a clipboard history, and “Find All”, all things that I’m sure absolutely nobody wants to miss anymore. Most of these features were driven by my own desire to use DrJava on bigger projects, and I always found something missing; unfortunately, I’m quite sure most people don’t even know about all the things I implemented, even though I documented them quite thoroughly (the first documentation update in a year and a half).

At the same time I had to admit that Corky was right: Opening the DrJava project took about three times as long with a recent version than with a version from January 2006, so something was definitely going on, although I was pretty sure that my changes had nothing to do with it. Sure, many of my changes, especially “Find All”, were not implemented in the best way, but they were implemented solidly with what the DrJava source code gave me. I added a few parts with extremely high code reuse, like the “predictive input dialog” and regions, then I recombined these new things with what was already there. To make “Find All” work, I have to cycle through all documents that contained matches, and this is actually visible in the title bar; however, now there’s a very convenient “Find All” feature (kind of like grep), before there wasn’t, and my code doesn’t do anything that a manual search by repeatedly pressing “Find Next” wouldn’t have done. Well, it does… it highlights the matches, if desired.

I felt like I wanted to prove that my changes had absolutely nothing to do with the slowdown, so I started playing around with the YourKit Java Profiler, a new and fantastic toy that we may use free of charge for our open source project. Just before Corky left for the weekend, I was able to demonstrate that one of the slow parts of DrJava, “Go to Line”, was due to horrible code that abused the list-shaped reduced model: It set the cursor of the reduced model to the beginning, and then moved it, counting newlines. Setting the line seems like a simple setter that should have constant runtime, but this algorithm was actually executing linearly! I promised I would rewrite the code by using Swing information.

When I had done that, I did some more profiling and received puzzling information: While the actual “Go to Line” operation was faster, something else took absolutely forever afterwards; with the old code, it didn’t, probably because something was cached. After some more profiling, I identified that the getter used to update the current line number in the status bar was the culprit. It was written in the same way “Go to Line” was written, and this code was being executed every time the cursor was moved. Absolutely horrible code, and the cause was abuse of the reduced model.

I had never claimed that the reduced model was bad, but I had stated that it wasn’t suited for more complicated queries. Unknowingly we were making unnecessarily complicated queries for “Go to Line” and to find the current line number, we were abusing the reduced model, and totally killing our performance.

By the time I left around 10 PM, I was relieved that my code actually had nothing to do with the slowdown, but I was a bit worried that my lucky guess about the reduced model made me sound “too right”. I was actually pretty worked up over this. I didn’t want to rub anyone’s nose in this, I just made a lucky guess, but I didn’t know how my statements and findings would be interpreted. I went to the lab to do some more profiling and perhaps fix the problems on Sunday, but Corky had already done all the work.

Today at the lab meeting, he lauded my good work profiling and how it helped him fix the problems, and said each and every member of the DrJava team should learn how to use YourKit. I agree, it’s a fantastic tool. I wrote a long email today that described what I had learned about it, but of course, knowing human nature (at least if it is anywhere close to my human nature), I doubt many people actually went through it and tried to use YourKit.

Here’s the email I sent out, just to share my small slice of knowledge with the world once more:

Hi,

I installed YourKit into the javaplt account on CSnet. Here are
instructions if you want to use it. Please note that this is for the
Linux environment on CSnet. For Windows and MacOS, or away from CSnet,
these steps don't apply.

1. The directory YourKit is in is /home/javaplt/packages/yjp-6.0.11
2. Please set the following environment variable, probably in your .bashrc file:

setenv LD_LIBRARY_PATH /home/javaplt/packages/yjp-6.0.11/bin/linux-x86-32

3. To start YourKit, run the following script from the command line:

/home/javaplt/packages/yjp-6.0.11/bin/yjp.sh &

The & at the end will make YourKit run in the background, i.e. you can
still use the same command line window you've been using.

The first time you start YourKit, it should ask you for your license
information as explained in the previous email.

4. You need to pass a special command line argument to your java
executable when you're running a program that you want to profile:
-agentlib:yjpagent

To make that easier to remember, I have written a script that passes
that arguments, and all the argument passed to the script, to java.
It's located in:

/home/javaplt/packages/yjp-6.0.11/bin/yjpjava

So if you want to use it to run a program that you would normally start with:

java MyClass

just use:

/home/javaplt/packages/yjp-6.0.11/bin/yjpjava MyClass

Here is a more complicated example to start DrJava. Instead of writing:

java -cp drjava-15.jar edu.rice.cs.drjava.DrJava

now write:

/home/javaplt/packages/yjp-6.0.11/bin/yjpjava -cp drjava-15.jar
edu.rice.cs.drjava.DrJava

You may want to include the bin directory in your path or make
symlinks to yjp.sh and yjpjava.

5. Note that even though YourKit is running AND you have started your
Java program now, the two aren't automatically linked. To do that,
click on "Connect to locally running profiled application..." in
YourKit's "Welcome" screen. That should open a new pane that displays
CPU and memory usage graphs.

6. When you actually want to record usage times and counts, press the
"Start CPU Profiling" button. That will open another pane. There are
two selections, "Sampling" and "Tracing", and as far as I know,
"Sampling" records times only and doesn't give you invocation counts
and call stacks, but is faster. "Tracing" is a lot slower but gives
you more information.

7. When you want to view the data, i.e. when you're done performing
the action you want to profile, press the "Capture Snapshot" button.
That will bring up a dialog that tells you the file name as which the
snapshot has been saved, and it asks you whether you want to open the
snapshot. You probably want to press "Open" if you want to look at the
data right now.

8. Note that taking a snapshot stops profiling, so if you want to
continue to profile, you need to press the "Start CPU Profiling"
button again.

9. On the page that displays CPU snapshot informations, there are lots
of ways to gather information. I haven't explored everything. I found
selecting "Method List" on the left the most useful. That will give
you a lsit of all methods that were called, and you can sort by name,
time, own time, and invocation count (if you used tracing). Time
includes time spent in calls to other methods, i.e. the time from
method entry to method exit. Own time is only the time spent in the
actual method, i.e. the time from method entry to method exit
EXCLUDING the time spent in other methods that were called.

Usually I found a very sharp break between the times, i.e. going from
40 seconds in some methods (that were probably in a call chain) to 5
seconds. My intuition tells me that the last method to have a lot of
time spent is one of the "hot spots". You should also look at the
invocation count: The program may spend a lot of time in a method, but
if it's called very often, that may not mean very much.

This should really only be a guide to get you started. I'm far from
being an expert in using YourKit, so please play with it and learn by
using it. If you find something useful, perhaps you can let the rest
of us know.

If there are any problems, please let me know. Please also take into
account that we only have licenses for development on DrJava; if we
need additional licenses for DrJava developers, let me know and I will
request them. If you want to get free licenses for other open source
projects, please contact the YourKit licensing people directly.

--Mathias

I really think DrJava is a lot better now than it was in January 2006. I enjoy using it. It has features it desperately needed for big-time development, such as developing DrJava in DrJava, our very mantra. Now that some of the problems that had nothing to do with my additions have been removed, it is even better.

Share

About Mathias

Software development engineer. Principal developer of DrJava. Recent Ph.D. graduate from the Department of Computer Science at Rice University.
This entry was posted in DrJava. Bookmark the permalink.

Leave a Reply