[concurrency-interest] On A Formal Definition of 'Data-Race'

oleksandr otenko oleksandr.otenko at oracle.com
Mon Apr 15 15:18:29 EDT 2013


On 15/04/2013 17:55, thurstonn wrote:
> oleksandr otenko wrote
>> When "data race" means "broken logic", there must be a place where that
>> logic is defined.
> Yes, there is an assumption in my OP that "data-race" ==> "incorrect" (or
> "broken logic" in your words)
> Vitaly does not necessarily equate "raciness" with "incorrectness" (and
> probably Brian as well) and that's OK with me
>
> oleksandr otenko wrote
>> Literature on linearizability introduces a history of events (not the
>> only place where it is done so). If a valid reordering of events
>> produces a invalid history, you have a data race. But you need a
>> definition of a valid history.
> Yes, what I'm in (a quixotic?) search for is a process that does the
> following:
> given a set of operations that can execute across multiple threads (like the
> example in the OP)
> -define a set of "execution histories" (your "history of events") that are
> possible given the MM in effect and consistent with the original program's
> set of operations (of course there will be multiple such histories)
> -each "execution history" defines at least a partial ordering among
> conflicting operations (r/w or w/w on the same shared data item)
> -analyze each execution history for "correctness"
> if each possible history is correct, then you're good
> else add explicit happens-before relations. Repeat
This sounds awfully like Java PathFinder.

Not tractable for less-trivial code. (I couldn't test a semaphore-like 
primitive with more than 4 threads).


Alex


> oleksandr otenko wrote
>> Your argument is that any reordering of instructions in your example is
>> a valid history. But in order to state that, we would need to see the
>> rest of history. The subsequent use of local will determine the validity
>> of history.
>> Alex
> Agreed (the example 'program' is simplistic at best).
>
> What my original post described was exactly the kind of process (viz. an
> acyclic serialization graph) that I'm in search of, but is applied to
> database concurrency control.  The problems are very similar (turning
> concurrent executions into serial, partially-ordered ones; operations are
> reads/writes of data items), but they are not exactly the same.  I was
> wondering if we could use the same techniques (with a serialization graph
> ==> "execution graph") to analyze the "correctness" of, e.g. non-locking
> concurrent algorithms/data structures.
>
> Thurston
>
>
>
> On 12/04/2013 19:55, thurstonn wrote:
>> In thinking about whether code is thread-safe or not, one can attempt to
>> identify whether it 'contains a data-race'.  If not, you're good.  Else
>> you
>> need to add an explicit happens-before relationship.
>>
>> Which begs the question: what exactly constitutes a 'data-race'?  And here
>> I'm interested in something a little more formal than the famed judicial
>> judgement of obscenity (I know it when I see it)
>>
>> If you do a web search, you unfortunately get quite a few divergent
>> definitions, many of which seem to be inconsistent.
>> IIRC, the official JMM defines a data-race as any two conflicting
>> operations
>> from two or more threads on shared data (where at least one of the two
>> operations is a write).
>>
>> Brian Goetz (in his excellent  article
>> <http://www.ibm.com/developerworks/library/j-jtp03304/>  ) defines
>> data-race
>> thusly:
>>
>> "A program is said to have a data race, and therefore not be a "properly
>> synchronized" program, when there is a variable that is read by more than
>> one thread, written by at least one thread, and the write and the reads
>> are
>> not ordered by a happens-before relationship."
>>
>> But this would mark the following code as a data-race
>>
>> int shared = 0
>>
>> Thread 1                  Thread 2                 Thread 3
>> local = this.shared      this.shared = 10       local = this.shared
>>
>> This clearly meets his definition, yet I do not consider this a
>> 'data-race'.
>>
>> I've always relied on traditional database concurrency control theory (I
>> still find the treatise by Bernstein, Hadzilacos, and Goodman to be the
>> best), which has a formal definition of 'serializability', viz. that any
>> transaction log is 'serializable', if and only if, its serialization graph
>> is acyclic.  Why can we not use this as the basis for a formal definition
>> of
>> 'data-race' (excluding the notion of commit and abort of course):
>>
>> "A program is said to have a data-race, if any legal (as prescribed by the
>> MM) execution order produces a serialization graph that is *cyclic*"
>>
>> It has the advantage of a formal, mathematical model and although it is
>> has
>> historically been confined to databases (and transactions), it seems
>> applicable to concurrent execution of any kind?
>>
>> Hoping that I don't get flamed.
>>
>>
>>
>> --
>> View this message in context:
>> http://jsr166-concurrency.10961.n7.nabble.com/On-A-Formal-Definition-of-Data-Race-tp9408.html
>> Sent from the JSR166 Concurrency mailing list archive at Nabble.com.
>> _______________________________________________
>> Concurrency-interest mailing list
>> Concurrency-interest at .oswego
>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>
> _______________________________________________
> Concurrency-interest mailing list
> Concurrency-interest at .oswego
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>
>
>
>
>
> --
> View this message in context: http://jsr166-concurrency.10961.n7.nabble.com/On-A-Formal-Definition-of-Data-Race-tp9408p9425.html
> Sent from the JSR166 Concurrency mailing list archive at Nabble.com.
> _______________________________________________
> Concurrency-interest mailing list
> Concurrency-interest at cs.oswego.edu
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest



More information about the Concurrency-interest mailing list