[concurrency-interest] A concurrent, linked HashMap impl

Shaffer, Darron Darron_Shaffer at stercomm.com
Thu May 28 10:43:31 EDT 2009


One eviction strategy that mirrors hardware caches is to use the concept of a "cache line".  Make each hash bucket a LRU list and evict if the bucket size grows over a threshold.
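A minimal sketch of this per-bucket scheme (hypothetical names; a single lock for brevity, where a real concurrent map would lock per bucket or per segment), using LinkedHashMap's access order to get each bucket's LRU list for free:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BucketLruCache<K, V> {
    private final LinkedHashMap<K, V>[] buckets;
    private final int maxBucketSize;   // the per-bucket "cache line" size

    @SuppressWarnings("unchecked")
    public BucketLruCache(int bucketCount, int maxBucketSize) {
        this.maxBucketSize = maxBucketSize;
        this.buckets = new LinkedHashMap[bucketCount];
        for (int i = 0; i < bucketCount; i++) {
            // accessOrder=true keeps each bucket ordered least- to most-recently used
            buckets[i] = new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    // Evict this bucket's LRU entry once the bucket outgrows its "line"
                    return size() > BucketLruCache.this.maxBucketSize;
                }
            };
        }
    }

    private LinkedHashMap<K, V> bucketFor(K key) {
        return buckets[(key.hashCode() & 0x7fffffff) % buckets.length];
    }

    public synchronized V get(K key) {
        return bucketFor(key).get(key);   // get() refreshes the entry's LRU position
    }

    public synchronized V put(K key, V value) {
        return bucketFor(key).put(key, value);
    }

    public synchronized int size() {
        int n = 0;
        for (Map<K, V> b : buckets) n += b.size();
        return n;
    }
}
```

Because eviction is per bucket, a hot bucket can evict entries while cold buckets stay below threshold; that approximation is exactly what keeps the scheme cheap, much like set-associative hardware caches.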

-----Original Message-----
From: concurrency-interest-bounces at cs.oswego.edu [mailto:concurrency-interest-bounces at cs.oswego.edu] On Behalf Of Doug Lea
Sent: Thursday, May 28, 2009 8:04 AM
To: Bhasin, Vishal
Cc: concurrency-interest at cs.oswego.edu
Subject: Re: [concurrency-interest] A concurrent, linked HashMap impl

Bhasin, Vishal wrote:
> There is a high need for LRU in CHM - folks end up either creating their
> own or using a caching product (like JCS or EHcache) for this
> functionality.

First, we cannot hope to supply something that automates all
possible caching needs, especially those with persistent
(disk-based) components or integrated with J2EE services.

But we can supply some eviction policies in custom maps.
Basically, a cache is a map with an eviction policy.
An eviction policy in turn has two main parts:
   * (approximately) when to evict
   * (approximately) what to evict

For concurrent maps we only want to consider policies
compatible with our concurrency properties.
So strict list-based LRU is out. However, this is not
a big loss. People usually pick LRU because it is
cheap (which it is in non-concurrent maps) and usually
performs acceptably. But there are many other related
policies that usually perform at least as well, especially
for workloads that are ultimately tied to web-based requests.
For a slightly dated survey (ACM membership may be required), see
http://portal.acm.org/citation.cfm?id=954341
"A Survey of Web Cache Replacement Strategies",
Stefan Podlipnig, Laszlo Böszörmenyi, ACM Computing Surveys, 2003.
Most of these policies choose eviction victims based on
some combination of recency, frequency, and size.

We can provide generalized support for a number of
these by allowing users to tell us, for each
cached value:
   * The item's "utility" as an eviction preference score.
     For example, for LRU-ish behavior, the score is the
     last-access timestamp.
   * The item's "weight" as its contribution to the decision of
     whether any item should be evicted. For example, for
     a bounded cache of equal-sized items, each weight is just 1.

Given these (definable in subclassable ValueRecords), we plan to
support methods that will evict low-utility items when a given
total-weight threshold is exceeded. These methods can have only
probabilistic specs, both in terms of when eviction triggers
and of which items are removed.
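As a toy illustration of the utility/weight scoring (all class and method names here are made up; this non-concurrent version evicts exactly the lowest-utility entries by linear scan, whereas the planned support is only probabilistically specified and would sample candidates instead):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a user-definable scored value record
interface Scored {
    long utility();  // eviction preference; lower = evict first
    long weight();   // contribution to the total-weight bound
}

public class ScoredCache<K, V extends Scored> {
    private final Map<K, V> map = new HashMap<>();
    private final long maxTotalWeight;
    private long totalWeight;

    public ScoredCache(long maxTotalWeight) { this.maxTotalWeight = maxTotalWeight; }

    public synchronized V get(K key) { return map.get(key); }

    public synchronized void put(K key, V value) {
        V old = map.put(key, value);
        totalWeight += value.weight() - (old == null ? 0 : old.weight());
        // Evict lowest-utility entries until back under the weight threshold
        while (totalWeight > maxTotalWeight && !map.isEmpty()) {
            K victim = null;
            long worst = Long.MAX_VALUE;
            for (Map.Entry<K, V> e : map.entrySet()) {
                if (e.getValue().utility() < worst) {
                    worst = e.getValue().utility();
                    victim = e.getKey();
                }
            }
            totalWeight -= map.remove(victim).weight();
        }
    }

    public synchronized int size() { return map.size(); }

    // Convenience factory for fixed-score values
    public static Scored scored(long utility, long weight) {
        return new Scored() {
            public long utility() { return utility; }
            public long weight() { return weight; }
        };
    }
}
```

For LRU-ish behavior, utility() would return a last-access timestamp; for frequency-biased policies, an access count; for size-aware web caches, some mix of recency and fetch cost.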

Aside: It is really too bad that there is no way to
automate size calculations. Just bounding by total number
of items can lead to unbounded footprint growth. The only
recourse is to also declare the values as "Soft" references
so the GC can reclaim them; we will also support this, even
though it can seriously slow down GC.
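That Soft-reference fallback can be sketched as follows (illustrative names, not a proposed API; values become reclaimable under memory pressure, at the cost of extra GC work and non-deterministic eviction):

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;

public class SoftValueCache<K, V> {
    private final ConcurrentHashMap<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    public void put(K key, V value) {
        // The map holds only a soft reference, so the GC may clear the value
        map.put(key, new SoftReference<>(value));
    }

    public V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) return null;
        V v = ref.get();
        if (v == null) map.remove(key, ref);  // GC cleared it; drop the stale entry
        return v;
    }
}
```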

-Doug




_______________________________________________
Concurrency-interest mailing list
Concurrency-interest at cs.oswego.edu
http://cs.oswego.edu/mailman/listinfo/concurrency-interest


