cache: add timeout for groupcache's fetch operation #5206

Merged
merged 5 commits into from Mar 3, 2022
Changes from 2 commits
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -13,7 +13,9 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re
### Fixed

### Changed

- [#5205](https://github.com/thanos-io/thanos/pull/5205) Rule: Add ruler labels as external labels in stateless ruler mode.
- [#5206](https://github.com/thanos-io/thanos/pull/5206) Cache: Add timeout for groupcache's fetch operation.

### Removed

3 changes: 3 additions & 0 deletions docs/components/store.md
@@ -429,6 +429,7 @@ config:
    - http://10.123.22.100:8080
  groupcache_group: test_group
  dns_interval: 1s
  timeout: 500ms
```

In this case, three Thanos Store nodes are running in the same group meaning that they all point to the same remote object storage.
@@ -441,6 +442,8 @@ In the `peers` section it is possible to use the prefix form to automatically lo

Note that there must be no trailing slash in the `peers` configuration i.e. one of the strings must be identical to `self_url` and others should have the same form. Without this, loading data from peers may fail.

If `timeout` is set to zero, fetching has no timeout of its own and its lifetime equals the lifetime of the original request. It is recommended to set it to a value greater than zero.

## Index Header

In order to query series inside blocks from object storage, Store Gateway has to know certain initial info from each block index. In order to achieve so, on startup the Gateway builds an `index-header` for each block and stores it on local disk; such `index-header` is built by downloading specific pieces of original block's index, stored on local disk and then mmapped and used by Store Gateway.
12 changes: 12 additions & 0 deletions pkg/cache/groupcache.go
@@ -36,6 +36,7 @@ type Groupcache struct {
	galaxy   *galaxycache.Galaxy
	universe *galaxycache.Universe
	logger   log.Logger
	timeout  time.Duration
}

// GroupcacheConfig holds the in-memory cache config.
@@ -59,13 +60,17 @@ type GroupcacheConfig struct {

	// How often we should resolve the addresses.
	DNSInterval time.Duration `yaml:"dns_interval"`

	// Timeout specifies the timeout of a fetch operation. Zero means no timeout.
	Timeout time.Duration `yaml:"timeout"`
}

var (
	DefaultGroupcacheConfig = GroupcacheConfig{
		MaxSize:       250 * 1024 * 1024,
		DNSSDResolver: dns.GolangResolverType,
		DNSInterval:   1 * time.Minute,
		Timeout:       500 * time.Millisecond,
	}
)

@@ -255,6 +260,7 @@ func NewGroupcacheWithConfig(logger log.Logger, reg prometheus.Registerer, conf
		logger:   logger,
		galaxy:   galaxy,
		universe: universe,
		timeout:  conf.Timeout,
	}, nil
}

@@ -265,6 +271,12 @@ func (c *Groupcache) Store(ctx context.Context, data map[string][]byte, ttl time
func (c *Groupcache) Fetch(ctx context.Context, keys []string) map[string][]byte {
	data := map[string][]byte{}

	if c.timeout != 0 {
		timeoutCtx, cancel := context.WithTimeout(ctx, c.timeout)
		ctx = timeoutCtx
		defer cancel()
	}

	for _, k := range keys {
		codec := galaxycache.ByteCodec{}
