Hi Mandy, On 6 December 2010 19:26, Mandy Chung <mandy.chung@xxxxxxxxxx> wrote: > Remi, Eamonn, Brian, David, Doug, > > Thanks for the feedback. > I don't know if you welcome external feedback, but I'd like to point out (if you're not already aware) that this change modifies the VM interface. While I'm cognisant of the reason for the change, my understanding of the existing map mechanism is that it makes the JDK class library independent of the underlying VM thread status values. The value of Thread.threadStatus is opaque, with the mapping from VM thread status being determined by the following VM interface functions (see hotspot/src/share/vm/prims/jvm.h): ------------------ /* * Returns an array of the threadStatus values representing the * given Java thread state. Returns NULL if the VM version is * incompatible with the JDK or doesn't support the given * Java thread state. */ JNIEXPORT jintArray JNICALL JVM_GetThreadStateValues(JNIEnv* env, jint javaThreadState); /* * Returns an array of the substate names representing the * given Java thread state. Returns NULL if the VM version is * incompatible with the JDK or the VM doesn't support * the given Java thread state. * values must be the jintArray returned from JVM_GetThreadStateValues * and javaThreadState. */ JNIEXPORT jobjectArray JNICALL JVM_GetThreadStateNames(JNIEnv* env, jint javaThreadState, jintArray values); ------------------ These two functions are used by the native method sun.misc.VM.getThreadStateValues to setup the arrays which are then used to initialise the map. This change breaks this abstraction, and requires Thread.threadStatus to be a JVM TI thread state (which happens to match Hotspot's internal thread state). This change will therefore break any VM which does not have an internal thread state based on JVM TI. As far as I'm aware, IKVM and CACAO are currently the only other users of OpenJDK (I'm also nearing completion of a port to JamVM). Unfortunately, from looking at CACAO I can see that this change will break it. It may also break IKVM, but I haven't checked. I, of course, can modify the internal thread states of JamVM to be compatible. I'm CC'ing CACAO's mailing list and GNU Classpath so that affected parties can be made aware of this change. As an aside, will there be any later clean-up of the native method implementation and VM interface? Thanks, Rob. > On 12/04/10 04:22, Eamonn McManus wrote: > > Hi Mandy, > > This test: > > if ((threadStatus & JVMTI_THREAD_STATE_RUNNABLE) == 1) { > > is always false, since JVMTI_THREAD_STATE_RUNNABLE is 4. (NetBeans 7.0 > helpfully flags this; I'm not sure if earlier versions do.) > > Good catch. This explains why the speed up for RUNNABLE was not as high in > the microbenchmark measurement. Correcting it shows that Thread.getState() > gets 3.5X speed up on a thread in RUNNABLE state. > > But, once corrected, I think you could use this idea further to write a much > simpler and faster method, on these lines: > > public static Thread.State toThreadState(int threadStatus) { > if ((threadStatus & JVMTI_THREAD_STATE_RUNNABLE) != 0) { > return RUNNABLE; > } else if ((threadStatus & > JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER) != 0) { > return BLOCKED; > } else if ((threadStatus & JVMTI_THREAD_STATE_WAITING_WITH_TIMEOUT) > != 0) { > return TIMED_WAITING; > } else if ((threadStatus & JVMTI_THREAD_STATE_WAITING_INDEFINITELY) > != 0) { > return WAITING; > } else if ((threadStatus & JVMTI_THREAD_STATE_TERMINATED) != 0) { > return TERMINATED; > } else { > return NEW; > } > } > > I forgot to mention in the email that I implemented this simpler approach to > compare with the table lookup approach. There were no significant > difference. I now rerun with the corrected fix (checking != 0 rather than > == 1) and the table lookup approach is about 2-6% faster than the sequence > of tests approach. > > I am also for the simpler approach but I post the table lookup approach as a > proposed fix to get any opinion on the performance aspect with that > approach. > > Given that the Fork-Join framework doesn't depend on it, I will go for a > simpler approach (sequence of tests) and further tune its performance when > there is a use case requiring a perf improvement. > > New webrev: > http://cr.openjdk.java.net/~mchung/6977034/webrev.01/ > > Can you review this version? > > Thanks > Mandy > > You could tweak the order of the tests based on what might be the relative > frequency of the different states but it probably isn't worth it. > > Regards, > > Éamonn > > On 3/12/10 11:52 PM, Mandy Chung wrote: > > Fix for 6977034: Thread.getState() very slow > > Webrev at: > http://cr.openjdk.java.net/~mchung/6977034/webrev.00/ > > This is an improvement to map a Thread's threadStatus field to > Thread.State. The VM updates the Thread.threadStatus field directly at > state transition with the value as defined in JVM TI [1]. The > java.lang.Thread.getState() implementation can directly access the > threadStatus value and do a direct lookup from an array of Thread.State. > The threadStatus value is a bit vector and we would have to create an array > of a minimum of 1061 (0x425) elements to do direct mapping. I took the > approach to use the first highest order bit set to 1 in the masked > threadStatus value as the index to the Thread.State element and only caches > 32 elements (could be fewer). I wrote a micro-benchmark measuring the > Thread.getState of a thread in different state that shows 1.7X to 6X speedup > (see below). There is possibly some issue with my micro-benchmark that I > didn't observe the 14X speed up as Doug did in his experiment. However, I'd > like to get this reviewed and pushed to the repository so that anyone can do > more experiment on the performance measurement. > > Thanks > Mandy > P.S. The discussion on this thread can be found at [2] [3]. > > [1] > http://download.java.net/jdk7/docs/platform/jvmti/jvmti.html#GetThreadState > [2] > http://mail.openjdk.java.net/pipermail/core-libs-dev/2010-July/004567.html > [3] > http://mail.openjdk.java.net/pipermail/core-libs-dev/2010-August/004721.html > > > JDK 7 b120 (in ms) With fix (in ms) Speed up > main 46465 22772 2.04 > NEW 50676 29921 1.69 > RUNNABLE 42202 14690 2.87 > BLOCKED 72773 12296 5.92 > WAITING 48811 13041 3.74 > TIMED_WAITING 45737 12849 3.56 > TERMINATED 40314 16376 2.46 > > >