[concurrency-interest] A little lock free algorithm [the code]

David Holmes dcholmes at optusnet.com.au
Fri Mar 31 20:38:45 EST 2006


> studdugie writes:
>
> You are quite right about the race. It's been nagging at me since I
> first wrote it. It's the reason why I wanted to post the code here in
> the first place because after testing the code and not seeing the race
> happen I wasn't so sure anymore.So I wanted to see if anyone else
> thought it was a problem besides me and whatever else they might find.

Sorry about that. I was focussing on the visibility issues and overlooked
the details of the semantic requirements.

If you want the execution of the if block to only be performed by one thread
per "time window" then the test of the time has to be atomic with respect to
updating expiration. Simplest way is to re-do the test when you have the
"lock":

    void increment()
    {
        long millis;
        // quick check if nothing to do
        if( dead || (millis = System.currentTimeMillis()) < expiration )
            return;

        if( locked.compareAndSet( false, true ) )
        {
            // check again as things might have changed since the first
check
            if( dead || millis < expiration )
              return;

            backoff += BACKOFF_INCREMENT;
            if( backoff > BACKOFF_MAX )
                dead = true;
            else
                expiration = millis + backoff;
            locked.set( false );
        }
    }

This is still approximate as you really need to snapshot the current time as
close as possible to the detection of the failure - otherwise you might get
preempted and get a distorted view of when the failure actually happened. Of
course there is never a guarantee as you can be preempted at any time.

Looking closer I'm not clear on exactly how you want this to work. When and
how are TFailure objects created? I'm wondering how long might elapse
between the initialization of "expiration" and the first call to increment.
I suspect your logic is something like:

if connect_to_host_failed then
   if host.tfailure == null
       host.tfailure = new Tfailure()
   else
      host.tfailure.increment()

in which case how do you make the construction/publication thread-safe?

Also your expiration logic would seem to lead to a host being declared dead
after a certain number of failures, regardless of successes in between the
failures. Eventually backoff will be greater than BACKOFF_MAX because it is
never reduced. I also don't understand why you want the failure window to
get larger and larger.

Cheers,
David Holmes



More information about the Concurrency-interest mailing list