The -Xprof profiler is the HotSpot profiler. HotSpot works by running Java code in interpreted mode while running a profiler in parallel. The HotSpot profiler looks for “hot spots” in the code, i.e. methods that the JVM spends a significant amount of time running, so that the JVM can compile those methods into native code.
The embedded HotSpot profiler is a specialist low-overhead profiler, designed to run alongside the application without slowing it noticeably. It achieves this by being a very simple profiler that uses the lowest-overhead techniques available. First, it samples the runtime stack (the methods currently being run) at regular intervals, and to keep the impact of sampling minimal, the interval between samples is not too short. Second, and much more importantly, unlike most stack-sampling profilers it does not “walk” the stack, i.e. the elements on the stack are not identified in full. Instead, only the topmost element of the runtime stack is identified, i.e. the method in which code is being executed at the sample time.
This sampling of the topmost runtime stack element is sufficient to identify which methods need to be compiled into native code. Basically, if any method is found to be at the top of the stack more than a few times, then the application can probably benefit from having that method compiled. Simple, but powerful.
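The idea of counting only the topmost stack element can be sketched in plain Java. This is an illustration, not HotSpot's internal code: the class name TopFrameSampler, its methods and the threshold parameter are all hypothetical. Note also that Thread.getStackTrace() captures the whole stack, so this sketch pays a cost that the real profiler avoids; it only discards everything but the first element afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of top-of-stack sampling; not HotSpot's implementation.
final class TopFrameSampler {
    // Number of samples in which each method was found at the top of the stack.
    private final Map<String, Integer> hits = new TreeMap<>();

    /** Records one sample, looking only at the topmost frame. */
    void record(StackTraceElement[] stack) {
        if (stack.length == 0) {
            return; // thread not started, or already terminated
        }
        StackTraceElement top = stack[0];
        hits.merge(top.getClassName() + "." + top.getMethodName(), 1, Integer::sum);
    }

    /** Takes one sample of a live thread (this call captures the full stack). */
    void sample(Thread target) {
        record(target.getStackTrace());
    }

    /** Number of samples in which the given method was on top. */
    int hits(String method) {
        return hits.getOrDefault(method, 0);
    }

    /** Methods seen on top of the stack more than 'threshold' times. */
    List<String> hotMethods(int threshold) {
        List<String> hot = new ArrayList<>();
        for (Map.Entry<String, Integer> e : hits.entrySet()) {
            if (e.getValue() > threshold) {
                hot.add(e.getKey());
            }
        }
        return hot;
    }
}
```

A driver would call sample() on the thread of interest from a timer at a fixed interval, then treat whatever hotMethods() returns as the candidates for compilation.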
So if the HotSpot profiler is always running, what does -Xprof do? It tells the profiler to keep a record of the profile information and to output that information at termination. The output is sent to stdout. Since the profiler needs to keep most of this information in any case, -Xprof can actually be run in production without imposing excessive overhead on the application.
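To try this out, a small program with one obviously expensive method is enough. The sketch below is such a program; the class name HotLoop and the loop sizes are arbitrary choices made for illustration, and it assumes a JVM that accepts the -Xprof flag.

```java
// Hypothetical test program: one deliberately expensive method for the profiler to find.
final class HotLoop {
    /** Sums the squares 1*1 + 2*2 + ... + n*n. */
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        long result = 0;
        // Repeat the work so that samples are very likely to land in sumOfSquares.
        for (int round = 0; round < 2_000; round++) {
            result += sumOfSquares(100_000);
        }
        System.out.println(result);
    }
}
```

After compiling, running it as java -Xprof HotLoop should print the profile to stdout, as described above, alongside the program's own output.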
It is worth noting the severe limitations of this profiler. The information can be of very limited use. Because of its “minimal overhead” remit, there is minimal information available from the profiler:
Each thread has its profile recorded separately, and each profile is output separately on thread termination; there is no combined view of the application runtime.
Only the top runtime stack method at sample time is identified, so there is no contextual information; being told that java.lang.String.equals is a bottleneck in your application is almost useless, since you don’t know which of the many methods that call String.equals() is causing most of the trouble. (Note that this is fine for HotSpot. HotSpot doesn’t care about context; it just cares that String.equals() is a bottleneck, so that it knows it should spend some time compiling that method to native code.)
Only method execution is profiled; there is no object creation, garbage collection, or thread conflict profiling.
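The first two limitations can be made concrete with a small sketch. It works on made-up samples, each written as an array of method names with the topmost method first. The class ProfileViews and its three methods are hypothetical: flat() mimics the top-of-stack view, withCaller() shows the caller information that such a view discards, and merge() builds the combined view across threads that you would otherwise have to assemble yourself from the per-thread output.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for comparing profile views; not part of any JVM.
final class ProfileViews {
    /** Flat view: count only the topmost frame of each sample. */
    static Map<String, Integer> flat(String[][] samples) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] stack : samples) {
            if (stack.length > 0) {
                counts.merge(stack[0], 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Contextual view: count "caller -> callee" pairs from the top two frames. */
    static Map<String, Integer> withCaller(String[][] samples) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] stack : samples) {
            if (stack.length == 0) {
                continue;
            }
            String caller = stack.length > 1 ? stack[1] : "<root>";
            counts.merge(caller + " -> " + stack[0], 1, Integer::sum);
        }
        return counts;
    }

    /** Combined view: add up the flat profiles of several threads. */
    static Map<String, Integer> merge(List<Map<String, Integer>> perThread) {
        Map<String, Integer> total = new TreeMap<>();
        for (Map<String, Integer> profile : perThread) {
            profile.forEach((method, n) -> total.merge(method, n, Integer::sum));
        }
        return total;
    }
}
```

Given three samples with String.equals on top, two called from a hypothetical Parser.match and one from Cache.lookup, flat() reports only that String.equals was seen three times, while withCaller() separates the two call sites.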
So is -Xprof useful? Yes; after all, you are getting some profiling information for almost nothing. But it isn’t a sufficient profiler to performance tune with on its own, except for very simple applications. And, of course, you only get output from -Xprof when the JVM includes HotSpot, which is not the case for all JVMs.
Reprinted from http://www.javaperformancetuning.com/