[concurrency-interest] Intel's HLE and Java

Nathan Reynolds nathan.reynolds at oracle.com
Wed Apr 25 15:27:58 EDT 2012

TSX seems like another type of lock to be added to the current 
bias/thin/fat synchronized design.

I guess RTM could fit somewhere between thin and fat (experience and 
performance data will tell).  If the thin lock is too contended (i.e. 
too much spinning), then the JVM will switch the lock to an RTM lock.  
If too many aborts happen per transaction, then the JVM will switch to 
fat.  Or maybe RTM locks would be tried first, switching to thin locks 
only if RTM does not work out.
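
To make the thin -> RTM -> fat idea concrete, here is a rough sketch of 
the switching policy only.  It does not touch any real JVM internals or 
TSX instructions; the class name, method names and both thresholds are 
made up for illustration, and real thresholds would have to come from 
performance data.

```java
/** Hypothetical sketch of an adaptive thin -> RTM -> fat lock policy. */
public class LockModePolicy {
    public enum Mode { THIN, RTM, FAT }

    // Assumed thresholds, not taken from any real JVM.
    static final int MAX_SPINS_PER_ACQUIRE = 100;
    static final double MAX_ABORTS_PER_TXN = 2.0;

    private Mode mode = Mode.THIN;

    /** Report how many spins a thin-lock acquisition needed. */
    public synchronized Mode onThinAcquire(int spins) {
        // Too much spinning means the thin lock is too contended: try RTM.
        if (mode == Mode.THIN && spins > MAX_SPINS_PER_ACQUIRE) {
            mode = Mode.RTM;
        }
        return mode;
    }

    /** Report abort and transaction counts observed over some sampling window. */
    public synchronized Mode onRtmWindow(long aborts, long transactions) {
        // Too many aborts per transaction means RTM is not paying off: inflate.
        if (mode == Mode.RTM && transactions > 0
                && (double) aborts / transactions > MAX_ABORTS_PER_TXN) {
            mode = Mode.FAT;
        }
        return mode;
    }

    public synchronized Mode mode() {
        return mode;
    }
}
```

The alternative ordering (RTM first, then thin) would just start in 
Mode.RTM and add a transition from RTM to THIN.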

HLE seems like it could be added to thin locks.  If the transaction 
fails, then the thread will have to execute a CAS instruction to acquire 
the lock.  If the CAS fails too many times per transaction, then we know 
that HLE won't work here and the lock will be changed to a regular thin 
lock.  If threads time out while spinning, then the lock will be 
converted to a fat lock.
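
Again as a sketch only: the fallback path after a failed elision could 
look something like the following.  Java cannot issue the HLE prefixes 
itself, so this only models the CAS fallback and the two demotions 
described above.  The class name, MAX_CAS_FAILURES and SPIN_LIMIT are 
assumptions of mine, not anything a real JVM uses.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical model of a thin lock that starts out HLE-enabled. */
public class HleThinLockSketch {
    public enum Mode { THIN_HLE, THIN, FAT }

    // Assumed limits, chosen only for illustration.
    static final int MAX_CAS_FAILURES = 8;
    static final int SPIN_LIMIT = 1000;

    private final AtomicBoolean held = new AtomicBoolean(false);
    private final AtomicReference<Mode> mode =
            new AtomicReference<>(Mode.THIN_HLE);

    /**
     * Path taken after an elided attempt has aborted: acquire with CAS.
     * Returns false if the spin limit is hit, in which case the lock has
     * been marked FAT and the caller would block instead of spinning.
     */
    public boolean acquireAfterAbort() {
        int failures = 0;
        while (!held.compareAndSet(false, true)) {
            failures++;
            if (failures > MAX_CAS_FAILURES) {
                // Elision is clearly not helping: drop to a regular thin lock.
                mode.compareAndSet(Mode.THIN_HLE, Mode.THIN);
            }
            if (failures >= SPIN_LIMIT) {
                // Spinning timed out: convert to a fat lock.
                mode.set(Mode.FAT);
                return false;
            }
        }
        return true;
    }

    public void release() {
        held.set(false);
    }

    public Mode mode() {
        return mode.get();
    }
}
```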

It doesn't seem like we need both RTM and HLE.  It seems like one or the 
other will be better.  HLE has the advantage of providing transactions 
in all cases but at the cost of failed transactions (hence performance 
loss).  RTM is very explicit and gives power to the JVM to decide if 
transactions are working.

Nathan Reynolds 
<http://psr.us.oracle.com/wiki/index.php/User:Nathan_Reynolds> | 
Consulting Member of Technical Staff | 602.333.9091
Oracle PSR Engineering <http://psr.us.oracle.com/> | Server Technology

On 4/25/2012 4:35 AM, Andrew Haley wrote:
> I've been thinking about what to do with Intel's Hardware Lock Elision
> (HLE), if anything.
> Briefly, HLE allows locked regions to proceed transactionally,
> so something like Hashtable's
>      public synchronized V put(K key, V value) {
>          ...
>      }
> could proceed in parallel with other cores accessing the same
> Hashtable, as long as there were no conflict with the same hash slot.
> If any unsafe instructions (interrupts, I/O, etc.) were executed or if
> there were a conflict with another thread, the hardware would abort the
> transaction (i.e. discard all pending memory updates and automagically
> fall back to locking).  If there were no conflict, all threads would
> proceed in parallel, and all accesses to the lock would be elided.
> In theory this could be used for Java.  There is, as far as I can see,
> one significant downside: if conflicts are highly probable, there will
> be a performance degradation because speculation on transactions that
> will abort wastes CPU cycles.
> Also, I was worried that this might not actually respect the JMM
> because on the x86, a StoreLoad barrier requires some sort of fence,
> such as MFENCE.  However, according to the Intel documentation MFENCE
> is allowed in a transaction and will not abort, so I think we're OK.
> To handle the performance worry, a JIT could use some kind of cost
> measure (length, number of memory accesses, etc.) to determine whether
> to use HLE, on the assumption that short transactions are unlikely to
> abort.  But this doesn't actually tell us what we need to know, which
> is whether a conflict is likely.  Only the programmer knows that, and
> even short transactions might abort if they are very frequent.
> We could create a family of HLE-enabled locks and leave it all to the
> programmer, but this seems like a lot of work and means that this
> potentially very useful option will be wasted.  It also means a lot
> of bloat in j.u.c.
> Thoughts welcome...
> Andrew.
> Intel® Architecture Instruction Set Extensions Programming Reference
> http://software.intel.com/file/41604/319433-012a.pdf
> _______________________________________________
> Concurrency-interest mailing list
> Concurrency-interest at cs.oswego.edu
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest