[concurrency-interest] Garbage First vs a counting collector

Endre Stølsvik Online at Stolsvik.com
Thu Mar 26 13:59:04 EDT 2009


Way on the fringe of my knowledge, but I've long wondered: Would it be
an idea to make a GCer that is a combination of reference counting and
tracking? I'm musing that one then would have three types of
collections: The two types in a generational collector, but below
this, a "nonzero copy", which was utterly dumb and simply copies all
objects that have a nonzero reference count. This would be the one
running most often - so the characteristics of this collector would
probably shape the overall performance. The "real", reference tracked
collector would only really be necessary for collecting object cycles
every once and then (The real GC'er could note which objects' classes
are prone to being collected with zero in reference count, possibly
use this for some optimization? Or do some statistics on how often a
real collection should be done, based on counts of collected such
objects? Count in particular the "cycle prone" type instantiations,
and compare this to the amount of collected such objects, trigger a
real GC on some threshold?). Has something like this been tried? Or
would this just obviously create too much overhead because of the
necessary added counting, where following references only when needed
is less costly than continuously updating counts?

Endre.

On Thu, Mar 26, 2009 at 16:55, Gregg Wonderly <gregg at cytetech.com> wrote:
> Brian S O'Neill wrote:
>>
>> How did your collector deal with reference cycles? Was it slow enough to
>> not be a problem, or did the language prohibit cycles from arising?
>
> Practically, we didn't write software using this language which had cycles.
> This language was an automation language (on the 5ESS switch) which tied
> into the UI mechanisms and provide parallel process management (fork->exec
> tracking) so that we could take programs and scripts and stitch together
> procedures for software updates/retrofits etc.  It provided forward and
> reverse execution sequencing to do and undo steps repeatedly to restrict how
> the switch changed states.
>
> I guess everyone knows how counting can be used to track references?
> Every assignment, or pass of a value to a method call, created an explicit
> decrement on the lost reference and an increment on the new reference.
>  Anytime the count went to zero, something was freed, and there was an
> explicit decrement of all references within that 'object' too.  Reference
> cycles were not handled explicitly, so they would have been a problem which
> I never saw materialize in the small software modules that were developed
> using the language.
>
> For example if you had a cycle (with counts shown) such as
>
> F->A(2)->B(1)->C(1)->A(2)
>
> and you drop the F reference to A, A's count goes to 1, it is no longer
> globally referenced and yet you don't free it because the count doesn't go
> to zero.
>
> I think this is the issue you are referring to right?
>
> As a bit of history, initially, the language had no objects, only numeric
> and string values and things like pipe and process etc, that all went one
> way in terms of reference.  So, there was never a cycle reference issue.
>  Only 3 years after I wrote first wrote it, did I add objects as an
> extension of maps.  At that time, it became possible for circular references
> to be created.  We just didn't have object structures that complicated to
> see any issues with that happening.
>
> Gregg Wonderly
>
>> Gregg Wonderly wrote:
>>>
>>> One of the things about concurrent programming that bothers me the most
>>> is latency control.  GC activity can often introduce unwanted latency by
>>> causing arbitrary contention for resources that a thread may not be able to
>>> recognize as contentious.  A long time ago, I created a language with a
>>> counting collector which I found to be very easy to do on a uniprocessor
>>> environment.  With the advent of CAS, it seems at casual grant pretty
>>> straight forward to do a counting collector on a multi-processor machine.
>>>
>>> The upcomming introduction of the garbage first collector intrigues me
>>> from the perspective that there is a "ton" of managed resources that it uses
>>> to make itself smart enough to try and accurately estimate and manage the
>>> cost of GC activity.  There is use of CAS, plus space and time based
>>> isolation schemes to try and reduce latency that is introduced into the
>>> application.
>>>
>>> However, it seems to me that there is going to be a new cost introduced
>>> into every assignment for mutator tracking that might just be as expensive
>>> as a counting collector is.
>>>
>>> I am not really familiar with the real costs of much of the concurrency
>>> management instructions, can anyone on the list give me some pointers to
>>> more information or some insight into whether the amount of "tracking" that
>>> the garbage first collector will use, is still cheaper than a CAS on every
>>> reference change to a global object?
>>>
>>> Gregg Wonderly
>>> _______________________________________________
>>> Concurrency-interest mailing list
>>> Concurrency-interest at cs.oswego.edu
>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>>
>>
>>
>
> _______________________________________________
> Concurrency-interest mailing list
> Concurrency-interest at cs.oswego.edu
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>



More information about the Concurrency-interest mailing list