[concurrency-interest] Fences

Doug Lea dl at cs.oswego.edu
Mon Aug 17 15:01:10 EDT 2009


Paul E. McKenney wrote:
> I believe that the best way for me
> to contribute is to show example code that I believe could benefit
> from weaker guarantees.  

Thanks! Very helpful.

> 1.	Split counters.  

Here's a shot at generalizing:
* You have a variable (like a counter) that is normally private
   to a thread, but sometimes needs inter-thread visibility.
* You don't want to penalize the normal private update.
* You do not need to achieve an accurate consensus/snapshot value
   when aggregating across per-thread values.

> 	In C++0x, one would use atomic variables with
> 	memory_order_relaxed.  I don't see the Java equivalent.

In C++, using atomic with memory_order_relaxed is sometimes in
part a permit for having a race without your program having
completely undefined semantics.
In Java, we ascribe (complicated and weak) meanings
even to programs with races, so on those grounds, you could just
declare the variables as normal variables and take what you get.
Without some further care though, you might not like what you get;
for example you may see all counters as zero.
Among the ways of avoiding this is to infrequently
use a fence in order to, conceptually, sometimes "promote"
the variable to being volatile-like.
Maybe just something of the form:
   if ((++stat.counter & 1023) == 0) Fences.orderAccesses(stat);

(There are better styles for doing this, but they don't fit
on one line :-). Plus a another single fence on reader side
before accumulation loop. You could use orderWrites here
instead, but it might require more care/thought, and if
they are infrequent enough, then it
won't matter much performance-wise. And of course you can use
some other infrequent trigger rather than count values.
(Also, while not guaranteed, having even a conditional
rare fence there is likely to force regular visible updates because
that would become the cheapest implementation.)

Not too relevant to Fences, but another line of attack here
is to let threads occasionally push values to a global atomic, as in:
   if ((++counter & 1023) == 0) {
      globalAtomicCounter.addAndget(counter);
      counter = 0;
   }
The CAS inside addAndGet will rarely contend, so is probably about as
cheap as a fence, and saves you from needing accumulation loop.
Or you can transfer counts only when the thread is known not to be doing
anything important, like before blocking. The ForkJoin framework does
this in a few places.

(And even less relevantly for present purposes, for the case where
aggregates must be accurate, there are the various techniques
described in the Herlihy & Shavit book.)

> 
> 2.	Publication with read-only subscription.  
> ...
> 
> 	The "[]" are supposed to map to the proposal.  If they do,
> 	I believe that this example is handled by the proposal.

Me too.

> 
> 3.	Publication with read-only subscription, but interspersed
> 	with reads from other variables.  
> ...
> 
> 	This appears to me to handled by the latest proposal, the one
> 	with three sub-bullets under "rf".  

(Or, more clearly, the revised version that separates out that third
case to clarify that this form only applies to dependent reads.)

> 	But if we add:
> 
> 		int offset = 0;
> 
> 	And suppose we then change subscribe_a() to read:
> 
> 	int subscribe_a(int *a)
> 	{
> 		int i;
> 		struct s1 *q; /* [ref] */
> 
> 		q = rcu_dereference(gp); /* [r] */
> 		/* [rf] */
> 		if (q == NULL)
> 			return 0;
> 		i = index;
> 		*a = q->a[i]; /* first read by t using ref. */
> 		*a += offset;
> 		return 1;
> 	}
> 
> 	The proposal would require a memory barrier to be emitted
> 	so as to force ordering of the fetch from "offset" on some
> 	architectures.  The weaker RCU semantics would require no
> 	such memory barrier, nor does C++0x memory_order_consume.

You are right that we don't have a version of fence/acquire
that ONLY governs dependent reads, regardless of whether they
are initial. So I think this example would need to use plain orderReads
fence but am not sure about details of "weaker RCU semantics" --
it might be that they are no weaker than minimal JMM guarantees
for plain (aka relaxed) variables?


> 
> 4.	But, strangely enough, RCU read-side critical sections
> 	(the stuff between an rcu_read_lock() and an rcu_read_unlock(),
> 	and the only place where an rcu_dereference() is legal, at
> 	least to a first order of approximation) can contain writes.
> 
> 	struct s1 {
> 		int a[2], b, c, accessed;
> 	};
> 
> 	struct s1 *gp = NULL;
> 	int index = 0;
> 
> 	void publish_new(a0, a1, b, c)
> 	{
> 		struct s1 *p;
> 
> 		p = malloc(sizeof(*p));
> 		if (p == NULL)
> 			handle_OOM();  /* Doesn't return. */
> 		p->a[0] = a0;
> 		p->a[1] = a1;
> 		p->b = b;
> 		p->c = c;
> 		p->accessed = 0;
> 		rcu_assign_pointer(gp, p);  /* similar to orderWrites() [wf] */
> 					    /*  followed by assignment */
> 					    /*  to gp [w] */
> 	}
> 
> 	int subscribe_a(int *a)
> 	{
> 		int i;
> 		struct s1 *q; /* [ref] */
> 
> 		q = rcu_dereference(gp); /* [r] */
> 		q->accessed = 1; /* not ordered by proposal!!! */
> 		/* [rf] */
> 		if (q == NULL)
> 			return 0;

Do you really mean to write through possibly null pointer?
Maybe you mean:
...
  		q = rcu_dereference(gp); /* [r] */
  		/* [rf] */
  		if (q == NULL)
  			return 0;
  		q->accessed = 1;

...

> 		i = index;
> 		*a = q->a[i]; /* first read by t using ref. */
> 		return 1;
> 	}
> 
> 	If I understand the proposal correctly, subscribe_a()'s
> 	write to ->accessed could be ordered before publish_new()'s
> 	write to ->accessed, leaving the value of this field equal
> 	to zero despite the fact that it has been accessed.

Hopefully my alternative form is what you actually want,
in which case "accessed = 0" is before wf and
"accessed = 1" after rf, so all is well.

All together, modulo the lack of pure-dependent-read
fence (which I don't think is supportable within JMM,
but maybe you don't need anyway?) I'm encouraged that
these look to work out, especially for RCU, which is
now a sort of canonical example of encapsulating
expert-level lightweight sync.

-Doug


More information about the Concurrency-interest mailing list