-
Notifications
You must be signed in to change notification settings - Fork 36
Conversation
d3a6c5a
to
60c2d8e
Compare
// arccache wraps a BlockStore with an Adaptive Replacement Cache (ARC) that | ||
// does not store the actual blocks, just metadata about them: existence and | ||
// size. This provides block access-time improvements, allowing | ||
// to short-cut many searches without querying the underlying datastore. | ||
type arccache struct { | ||
lklk sync.Mutex | ||
lks map[string]*lock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the allocations here become an issue, we could pool the locks. But really, I doubt that'll make much of a difference in practice.
arc_cache.go
Outdated
func (b *arccache) refLock(k string) *lock { | ||
b.lklk.Lock() | ||
lk, ok := b.lks[k] | ||
if !ok { | ||
lk = new(lock) | ||
b.lks[k] = lk | ||
} | ||
lk.refcnt++ | ||
b.lklk.Unlock() | ||
return lk | ||
} | ||
|
||
func (b *arccache) lock(k string) { | ||
lk := b.refLock(k) | ||
lk.mu.Lock() | ||
} | ||
|
||
func (b *arccache) rlock(k string) { | ||
lk := b.refLock(k) | ||
lk.mu.RLock() | ||
} | ||
|
||
func (b *arccache) decLock(key string) *lock { | ||
b.lklk.Lock() | ||
lk := b.lks[key] | ||
lk.refcnt-- | ||
if lk.refcnt == 0 { | ||
delete(b.lks, key) | ||
} | ||
b.lklk.Unlock() | ||
return lk | ||
} | ||
|
||
func (b *arccache) unlock(key string) { | ||
lk := b.decLock(key) | ||
lk.mu.Unlock() | ||
} | ||
|
||
func (b *arccache) runlock(key string) { | ||
lk := b.decLock(key) | ||
lk.mu.RUnlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please review this section carefully.
} | ||
|
||
func (b *arccache) DeleteBlock(k cid.Cid) error { | ||
if !k.Defined() { | ||
return nil | ||
} | ||
|
||
if has, _, ok := b.queryCache(k); ok && !has { | ||
key := cacheKey(k) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm now reusing this key to avoid having to compute/allocate it twice. That's probably why I'm seeing equivalent perf with the striped version.
b.cacheHave(k, false) | ||
b.cacheHave(key, false) | ||
} else { | ||
b.cacheInvalidate(key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we fail to delete for some random reason, we can't be sure that it wasn't actually deleted. This is the safe bet.
|
||
b.cache.Remove(k) // Invalidate cache before deleting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was invalidating the wrong key. Also, I think it was trying to fix the race that we're now fixing with locks (but it didn't work).
arc_cache.go
Outdated
return b.viewer.View(k, callback) | ||
var cberr error | ||
var size int | ||
if err := b.viewer.View(k, func(buf []byte) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We weren't caching here before, now we are. Technically a new feature, but eh.
60c2d8e
to
788f454
Compare
@Stebalien @whyrusleeping what's the status of this PR, does it need review, are we closing it, etc? |
I think it just needs a review? |
Unfortunately, stripped locking breaks down when doing many concurrent operations. Luckily, per-cid locking isn't that expensive. In my benchmarks, it doesn't even make a noticeable difference.
788f454
to
f8f446d
Compare
b.lock(key, false) | ||
defer b.unlock(key, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's not worthwhile/sufficiently rare, but is it worthwhile to be checking the cache again or taking the read lock earlier? Otherwise you're going to hit the underlying storage twice if you get to requests at a similar time.
It's probably fine to say that that pattern is sufficiently rare as to not be worth the extra work for other block reads though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could take the read lock earlier, but I'd like to avoid the overhead in case the item is in the cache. I'd just optimize this later if it's an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just left a few questions it'd be nice to clear up before merging @Stebalien
return b.viewer.View(ctx, k, callback) | ||
var cberr error | ||
var size int | ||
if err := b.viewer.View(ctx, k, func(buf []byte) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't reviewed this section carefully since I'm not familiar with the Viewer abstractions, but it seems reasonable.
if !k.Defined() { | ||
log.Error("undefined cid in arccache") | ||
// Return cache invalid so the call to blockstore happens | ||
// in case of invalid key and correct error is created. | ||
return false, -1, false | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we removing this because it's extraneous since the ""
key will always be invalidated/removed from the cache and therefore we'll hit the blockstore anyway, or is it a different change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're removing this because the cache key is now pre-computed and passed into this function so we can't call cid.Defined()
. We now check if it's defined at every call site before computing the cache key.
@aschmahmann merge when you feel comfortable with this. |
Unfortunately, stripped locking breaks down when doing many concurrent operations. Luckily, per-cid locking isn't that expensive.
In my benchmarks, it doesn't even make a noticeable difference. Probably because I also optimized some things to avoid re-computing the cache key multiple times.