[concurrency-interest] How bad can volatile long++ be?

Osvaldo Pinali Doederlein osvaldo at visionnaire.com.br
Wed Dec 12 08:54:53 EST 2007


Hi David,

I am aware of the generated bytecode and the other issues you mention, so I 
think I was not very clear in my comment. Trying again:
1) Code like the inc() below can be compiled down to a single 
instruction, which (at least on uniprocessors, as several people pointed 
out) is an atomic instruction. (*) Even on multicore/multiprocessor 
machines, I think the instruction will be atomic, provided that the long 
value is aligned to 8 bytes, which typically (always?) has the nice side 
effect of ensuring the value doesn't straddle cache-line boundaries, so 
there are no atomicity problems with the propagation of writes through 
the memory hierarchy. (A minimal sketch of the kind of counter in 
question appears below.)
2) But a good JIT compiler may see that this single instruction is not 
the optimal compilation; it can get better performance with separate 
instructions for the load, increment, and store.
3) Now, the confusing part of my argument is: perhaps the JIT compiler 
knows how to do the optimizations that result in separate instructions, 
but decides not to, preferring instead to emit single instructions for 
++ and -- just because they happen to be atomic (on uniprocessors 
anyway) and there are enough applications with concurrency bugs.
I am probably being paranoid; I don't think any major optimizer would 
make this kind of tradeoff between performance and support for broken 
code, at least in scenarios like concurrency where "broken" usually 
means "may fail in rare circumstances". In other scenarios, where 
"broken" means "will always fail", it is well known that Sun and other 
JVM implementers often hold back enhancements and even bug fixes because 
they know of important applications that would break, because those 
applications are buggy.
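
For reference, here is a minimal sketch (my own code, not something from 
this thread, and the class name is made up) of the kind of counter in 
question. The JMM guarantees that each individual read and write of the 
volatile long is atomic and visible, but it says nothing about the 
read-increment-write sequence as a whole:

public class VolatileCounter {
    // volatile guarantees atomic 64-bit loads/stores and visibility of
    // writes, but NOT atomicity of the ++ read-modify-write sequence.
    private volatile long count;

    public void inc() {
        // At the bytecode level this is still three separate steps:
        // getfield, lconst_1 + ladd, putfield.
        count++;
    }

    public long get() {
        return count;
    }
}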

(*) BTW, most JVMs can and will run a method in interpreted mode many 
times before compiling it to native code. So the code is buggy anyway, 
as it may fail during interpreted runs. ANY safety assumption that 
depends on JIT compilation is flawed, but people often don't realize 
that, because the chances of catching the error are abysmally low - you 
have to be really unlucky to hit that dreadful thread/CPU interleaving 
within a few hundred or thousand interpreted executions of a buggy method.
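
To see this in practice, a quick-and-dirty test like the one below 
(again my own sketch, names made up) usually loses a good number of 
increments. Run it with -Xint to keep everything interpreted, or with 
-XX:+PrintCompilation to watch when the method actually gets compiled:

public class LostUpdateDemo {
    // Plain volatile long, incremented with no synchronization at all.
    static volatile long count;

    public static void main(String[] args) throws InterruptedException {
        final int nThreads = 4;
        final int perThread = 1000000;
        Thread[] workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < perThread; j++) {
                        count++; // racy read-modify-write on the shared field
                    }
                }
            });
            workers[i].start();
        }
        for (int i = 0; i < nThreads; i++) {
            workers[i].join();
        }
        // With an atomic increment this would always print 4000000; with
        // the racy ++ it usually prints less, because updates are lost.
        System.out.println("expected " + ((long) nThreads * perThread)
                + ", got " + count);
    }
}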

A+
Osvaldo
> Hi Osvaldo,
>  
> Taking a simple example:
>  
>    int x;  // field
>  
>     public void inc() { x++; }
>  
> the bytecode generated by javac is:
>  
> public void inc();
>   Code:
>    0:   aload_0
>    1:   dup
>    2:   getfield        #2; //Field x:I
>    5:   iconst_1
>    6:   iadd
>    7:   putfield        #2; //Field x:I
>    10:  return
>  
> }
>  
> And you can see that there is nothing atomic in that. Trying to recognize 
> that the above might be replaced by a single atomic assembly 
> instruction is not a worthwhile "optimization":
> a) if the field is not accessed concurrently then there is no need for 
> the atomic update, and atomic instructions have a cost in terms of being 
> atomic, so such a change would actually degrade performance;
> b) if the field is accessed concurrently then either:
>    i) there is synchronization protecting the field - in which case 
> we're in the same boat as (a): the atomic is unnecessary and 
> expensive; or
>    ii) there is no sync, so the code is broken anyway and making this 
> atomic is unlikely to actually make the overall program correct.
>  
> Hence no point even attempting such an "optimization". :)
>  
> Cheers,
> David
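
[For completeness, the two standard ways to make an increment like the 
x++ above thread-safe - just a rough sketch of my own, not code from 
this thread: either protect the field with a lock, as in (b)(i), or use 
the java.util.concurrent.atomic classes, which do the read-modify-write 
as a single atomic hardware operation.]

import java.util.concurrent.atomic.AtomicLong;

public class SafeCounters {
    // Option 1: protect the field with the object's monitor, as in (b)(i).
    private long lockedCount;

    public synchronized void incLocked() {
        lockedCount++;
    }

    // Option 2: use an atomic class; incrementAndGet() performs the whole
    // read-modify-write as one atomic operation (CAS or equivalent).
    private final AtomicLong atomicCount = new AtomicLong();

    public void incAtomic() {
        atomicCount.incrementAndGet();
    }
}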
>  
> -----Original  Message-----
> *From:* concurrency-interest-bounces at cs.oswego.edu 
> [mailto:concurrency-interest-bounces at cs.oswego.edu]*On Behalf Of 
> *Osvaldo Pinali Doederlein
> *Sent:* Tuesday, 11 December 2007 9:29 PM
> *To:* dholmes at ieee.org
> *Cc:* Concurrency-interest at cs.oswego.edu
> *Subject:* Re: [concurrency-interest] How bad can volatile long++ be?
>
>     David Holmes escreveu:
>>     David Gallardo writes:
>>       
>>>     ++ is not atomic; while it may effectively be so on a single processor
>>>     machine, this is not the case on multiprocessor machines.
>>>         
>>
>>     It isn't the case on single processor machines either. ++ is a
>>     read-modify-write sequence and a thread can be preempted at any point in the
>>     sequence.
>>
>>     ++ is just syntactic shorthand. Write it out in full and you'd never expect
>>     it to be atomic.
>>
>>       
>     Perhaps the problem is that on CISC platforms like the
>     over-popular x86, this /can/ be compiled down to a single
>     instruction that does the fetch, increment and store on a memory
>     address operand. People get used to this; they often read the
>     assembly output from compilers, see a single pretty, atomic
>     instruction like INC DWORD PTR [EBX], and expect this to be the
>     rule - "it's atomic in practice". The problem is, it's not a
>     portable assumption. And even on the platforms that allow this code
>     generation, I'd expect the best optimizers to often /not/ do it,
>     for example because they see that a new read is unnecessary on a
>     previously used field, or that the write can be delayed (e.g. if the
>     increments are inside a loop, this would provide a huge boost). I
>     wonder, though, whether any optimizers that could do that avoid it -
>     giving priority to performing an atomic increment - just to
>     compensate for buggy application code?...
>
>     A+
>     Osvaldo
>>     Cheers,
>>     David Holmes
>>
>
>
>


-- 
-----------------------------------------------------------------------
Osvaldo Pinali Doederlein                   Visionnaire Informática S/A
osvaldo at visionnaire.com.br                http://www.visionnaire.com.br
Arquiteto de Tecnologia                          +55 (41) 337-1000 #223


