[concurrency-interest] some MapMaker design decisions

Charles Fry fry at google.com
Tue May 11 14:52:11 EDT 2010


[resending, since the original missed the list]

We are currently in the process of implementing several new features
in  com.google.common.collect.MapMaker. We've had plenty of internal
discussions about them, and last week Josh Bloch suggested that we
raise the issues on concurrency-interest to get additional feedback.

One of the new features we're adding is eviction notification. This
will allow users to learn of eviction events that occur due to
expiration, garbage collection of weak/soft keys/values, or due to
size eviction. The other new feature is size eviction, whereby a
maximum size can be specified, with entries being automatically
evicted when the map grows too large.

Here are some of the design decisions we've been discussing. I'll try
to keep this initial summary on the brief end, but would be more than
happy to provide additional details as desired. We'd love feedback on
communal wisdom and best practices in these areas.

a) Should eviction notifications be callbacks, or should we simply add
evicted entries to a queue that users could then poll? Callback
notifications definitely seem conceptually simpler from the users
perspective, but they do require the execution of the callback code.
Using a queue makes the job of performing eviction simpler, but then
is more complicated for users to integrate with.

b) If eviction notifications are made using callbacks, where should
the callback be executed? We could execute them on user threads when
they make other calls on MapMaker maps, but that implies that
exceptions and blocking within callbacks could then impact threads
performing other operations on the map. The other obvious alternative
would be to require users to provide an ExecutorService for executing
callbacks, but that feels a bit overkill when all you really want is
an eviction notification.

c) Is it worth doing removal notifications, that include a callback
when remove is called manually, or is it sufficient to limit
notifications to eviction events?

d) While there are plenty of algorithms for concurrent
recency/frequency based eviction policies, it is also simple to
maintain a mutation queue that records a memento of each modification
operation, occasionally locking and applying them on a single thread.
The overhead of such a mechanism is trivial, and it allows the exact
application of non-concurrent eviction algorithms. It has the minor
caveat of requiring care to be taken with elements that are removed
from the cache while one of their mutation mementos is still enqueued.
Are there other concerns to be aware of, or is this a reasonable
approach?

e) With respect to size eviction, our initial thought was to put a
hard limit on the size of each segment. This has the advantage of
being very simple conceptually, and the notable disadvantage of
applying the eviction algorithm at the segment level instead of
globally. If a mutation queue is used to manage evictions then it can
very simply be extended to apply them globally instead of per-segment.
Are there other major considerations to be aware of here? Does it
sound reasonable to make global eviction decisions?

Ben has already implemented many of these ideas in
http://code.google.com/p/concurrentlinkedhashmap/ and we have pending
internal changes in MapMaker.  We've discussed all issues in great
detail, but would welcome any external feedback on the specific issues
raised here, or other potentially relevant issues we should be
thinking about.

thanks,
Charles


More information about the Concurrency-interest mailing list