Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sidecar/compact/store/receiver - Add the prefix option to buckets #5337

Merged
merged 36 commits into from Jun 9, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
36 commits
Select commit Hold shift + click to select a range
de8dd26
Create prefixed bucket
jademcosta Apr 27, 2022
64642db
started PrefixedBucket tests
dudaduarte Apr 29, 2022
8e3e102
finish objstore tests
dudaduarte May 2, 2022
124bc43
Simplify string removal logic
jademcosta May 4, 2022
0e55117
Test more prefix cases on PrefixedBucket
jademcosta May 4, 2022
fffb2ef
Only use a prefixedbucket if we have a valid prefix
jademcosta May 4, 2022
eec2750
Add single unit test for prefixedBucket prefix
jademcosta May 4, 2022
d4a209f
test other prefixes on UsesPrefixTest
dudaduarte May 4, 2022
d494c9e
add remaining methods to UsesPrefixTest
dudaduarte May 5, 2022
4161825
add prefix to docs examples
dudaduarte May 5, 2022
e6a9fb5
Simplify Iter method
jademcosta May 6, 2022
e7ae910
add prefix explanation to S3 docs
dudaduarte May 6, 2022
6645a12
Conclusion of prefix sentence on docs
jademcosta May 6, 2022
cba5d77
Use DirDelim instead of magic string
jademcosta May 6, 2022
ba98b22
Add log when using prefixed bucket
jademcosta May 6, 2022
d4acbb7
Remove "@" from test string to make them simpler
jademcosta May 6, 2022
5177900
fix BucketConfig Config type - back to interface
dudaduarte May 10, 2022
14db917
add changelog
dudaduarte May 11, 2022
c30eb10
add missing checks in UsesPrefixTest
dudaduarte May 11, 2022
776c17e
fix linter and test errors
dudaduarte May 11, 2022
4ec141e
Add license to new files
jademcosta May 11, 2022
6ee2638
Remove autogenerated docs
jademcosta May 11, 2022
661b755
Remove duplicated transformation of string->[]byte
jademcosta May 17, 2022
bd05bd9
Add prefixed bucket on all e2e tests for S3
jademcosta May 17, 2022
5c8baee
Add e2e tests using prefixed bucket to all providers
jademcosta May 17, 2022
2c4fbc4
refactor: move validPrefix to prefixed_bucket logic
dudaduarte May 20, 2022
5173567
Enhance the documentation about prefix.
jademcosta May 24, 2022
435c130
Fix format
jademcosta May 24, 2022
5c0c3cc
Add prefix entry on bucket config example
jademcosta May 17, 2022
e611858
Removing redundancies on prefix checks and tests
jademcosta May 25, 2022
8926d2a
Remove redundant YAML unmarshal
jademcosta May 25, 2022
39c8dbf
Remove unused parameter
jademcosta May 25, 2022
3e04c41
Remove docs that should be auto-geneated
jademcosta May 26, 2022
e4a03da
refactor: move prefix to config root level
dudaduarte May 31, 2022
a7a9229
add auto generated docs
dudaduarte Jun 2, 2022
d72d645
fix changelog
dudaduarte Jun 9, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Expand Up @@ -16,6 +16,7 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re

### Added

- [#5337](https://github.com/thanos-io/thanos/pull/5337) Thanos Object Store: Add the `prefix` option to buckets
- [#5352](https://github.com/thanos-io/thanos/pull/5352) Cache: Add cache metrics to groupcache.
- [#5391](https://github.com/thanos-io/thanos/pull/5391) Receive: Add relabeling support.

Expand Down
1 change: 1 addition & 0 deletions docs/components/receive.md
Expand Up @@ -44,6 +44,7 @@ type: GCS
config:
bucket: ""
service_account: ""
prefix: ""
```

The example content of `hashring.json`:
Expand Down
1 change: 1 addition & 0 deletions docs/components/sidecar.md
Expand Up @@ -56,6 +56,7 @@ type: GCS
config:
bucket: ""
service_account: ""
prefix: ""
```

## Upload compacted blocks
Expand Down
1 change: 1 addition & 0 deletions docs/components/store.md
Expand Up @@ -15,6 +15,7 @@ type: GCS
config:
bucket: ""
service_account: ""
prefix: ""
```

In general, an average of 6 MB of local disk space is required per TSDB block stored in the object storage bucket, but for high cardinality blocks with large label set it can even go up to 30MB and more. It is for the pre-computed index, which includes symbols and postings offsets as well as metadata JSON.
Expand Down
3 changes: 3 additions & 0 deletions docs/components/tools.md
Expand Up @@ -101,6 +101,7 @@ type: GCS
config:
bucket: ""
service_account: ""
prefix: ""
```

Bucket can be extended to add more subcommands that will be helpful when working with object storage buckets by adding a new command within [`/cmd/thanos/tools_bucket.go`](../../cmd/thanos/tools_bucket.go) .
Expand Down Expand Up @@ -601,6 +602,7 @@ type: GCS
config:
bucket: ""
service_account: ""
prefix: ""
```

```$ mdox-exec="thanos tools bucket downsample --help"
Expand Down Expand Up @@ -675,6 +677,7 @@ type: GCS
config:
bucket: ""
service_account: ""
prefix: ""
```

```$ mdox-exec="thanos tools bucket mark --help"
Expand Down
10 changes: 10 additions & 0 deletions docs/storage.md
Expand Up @@ -96,12 +96,15 @@ config:
kms_encryption_context: {}
encryption_key: ""
sts_endpoint: ""
prefix: ""
```

At a minimum, you will need to provide a value for the `bucket`, `endpoint`, `access_key`, and `secret_key` keys. The rest of the keys are optional.

However if you set `aws_sdk_auth: true` Thanos will use the default authentication methods of the AWS SDK for go based on [known environment variables](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html) (`AWS_PROFILE`, `AWS_WEB_IDENTITY_TOKEN_FILE` ... etc) and known AWS config files (~/.aws/config). If you turn this on, then the `bucket` and `endpoint` are the required config keys.

The field `prefix` can be used to transparently use prefixes in your S3 bucket. This allows you to separate blocks coming from different sources into paths with different prefixes, making it easier to understand what's going on (i.e. you don't have to use Thanos tooling to know from where which blocks came).

The AWS region to endpoint mapping can be found in this [link](https://docs.aws.amazon.com/general/latest/gr/s3.html).

Make sure you use a correct signature version. Currently AWS requires signature v4, so it needs `signature_version2: false`. If you don't specify it, you will get an `Access Denied` error. On the other hand, several S3 compatible APIs use `signature_version2: true`.
Expand Down Expand Up @@ -255,6 +258,7 @@ type: GCS
config:
bucket: ""
service_account: ""
prefix: ""
```

##### Using GOOGLE_APPLICATION_CREDENTIALS
Expand Down Expand Up @@ -356,6 +360,7 @@ config:
key_file: ""
server_name: ""
insecure_skip_verify: false
prefix: ""
```

If `msi_resource` is used, authentication is done via system-assigned managed identity. The value for Azure should be `https://<storage-account-name>.blob.core.windows.net`.
Expand Down Expand Up @@ -396,6 +401,7 @@ config:
connect_timeout: 10s
timeout: 5m
use_dynamic_large_objects: false
prefix: ""
```

#### Tencent COS
Expand All @@ -421,6 +427,7 @@ config:
max_idle_conns: 100
max_idle_conns_per_host: 100
max_conns_per_host: 0
prefix: ""
```

The `secret_key` and `secret_id` field is required. The `http_config` field is optional for optimize HTTP transport settings. There are two ways to configure the required bucket information:
Expand All @@ -442,6 +449,7 @@ config:
bucket: ""
access_key_id: ""
access_key_secret: ""
prefix: ""
```

Use --objstore.config-file to reference to this configuration file.
Expand All @@ -457,6 +465,7 @@ config:
endpoint: ""
access_key: ""
secret_key: ""
prefix: ""
```

#### Filesystem
Expand All @@ -469,6 +478,7 @@ NOTE: This storage type is experimental and might be inefficient. It is NOT advi
type: FILESYSTEM
config:
directory: ""
prefix: ""
```

### How to add a new client to Thanos?
Expand Down
4 changes: 3 additions & 1 deletion pkg/objstore/client/factory.go
Expand Up @@ -41,6 +41,7 @@ const (
type BucketConfig struct {
Type ObjProvider `yaml:"type"`
Config interface{} `yaml:"config"`
Prefix string `yaml:"prefix" default:""`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we generate those docs automatically from the structs. Since this option is "hidden" now in the types, the check cannot pass. What about adding a new field to the BucketConfig type? Then we wouldn't have to have hacks like this.

Hey @GiedriusS , I followed your suggestion here :) The only thing I don't love about bringing the prefix to the BucketConfig is that all the bucket options are inside Config 🤔
But with this change, we also keep the PrefixedBucket layer with its validations and don't take the risk to break something by changing all the Config types that BucketConfig handles, so I think it's the best alternative!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 yeah, I think this is the best solution out of the alternatives

}

// NewBucket initializes and returns new object storage clients.
Expand Down Expand Up @@ -81,5 +82,6 @@ func NewBucket(logger log.Logger, confContentYaml []byte, reg prometheus.Registe
if err != nil {
return nil, errors.Wrap(err, fmt.Sprintf("create %s client", bucketConf.Type))
}
return objstore.NewTracingBucket(objstore.BucketWithMetrics(bucket.Name(), bucket, reg)), nil

return objstore.NewTracingBucket(objstore.BucketWithMetrics(bucket.Name(), objstore.NewPrefixedBucket(bucket, bucketConf.Prefix), reg)), nil
}
8 changes: 8 additions & 0 deletions pkg/objstore/objtesting/foreach.go
Expand Up @@ -64,6 +64,7 @@ func ForeachStore(t *testing.T, testFn func(t *testing.T, bkt objstore.Bucket))
b, err := filesystem.NewBucket(dir)
testutil.Ok(t, err)
testFn(t, b)
testFn(t, objstore.NewPrefixedBucket(b, "some_prefix"))
})

// Optional GCS.
Expand All @@ -77,6 +78,7 @@ func ForeachStore(t *testing.T, testFn func(t *testing.T, bkt objstore.Bucket))

// TODO(bwplotka): Add goleak when https://github.com/GoogleCloudPlatform/google-cloud-go/issues/1025 is resolved.
testFn(t, bkt)
testFn(t, objstore.NewPrefixedBucket(bkt, "some_prefix"))
})
}

Expand All @@ -95,6 +97,7 @@ func ForeachStore(t *testing.T, testFn func(t *testing.T, bkt objstore.Bucket))
// This needs to be investigated more.

testFn(t, bkt)
testFn(t, objstore.NewPrefixedBucket(bkt, "some_prefix"))
})
}

Expand All @@ -108,6 +111,7 @@ func ForeachStore(t *testing.T, testFn func(t *testing.T, bkt objstore.Bucket))
defer closeFn()

testFn(t, bkt)
testFn(t, objstore.NewPrefixedBucket(bkt, "some_prefix"))
})
}

Expand All @@ -121,6 +125,7 @@ func ForeachStore(t *testing.T, testFn func(t *testing.T, bkt objstore.Bucket))
defer closeFn()

testFn(t, container)
testFn(t, objstore.NewPrefixedBucket(container, "some_prefix"))
})
}

Expand All @@ -134,6 +139,7 @@ func ForeachStore(t *testing.T, testFn func(t *testing.T, bkt objstore.Bucket))
defer closeFn()

testFn(t, bkt)
testFn(t, objstore.NewPrefixedBucket(bkt, "some_prefix"))
})
}

Expand All @@ -147,6 +153,7 @@ func ForeachStore(t *testing.T, testFn func(t *testing.T, bkt objstore.Bucket))
defer closeFn()

testFn(t, bkt)
testFn(t, objstore.NewPrefixedBucket(bkt, "some_prefix"))
})
}

Expand All @@ -160,6 +167,7 @@ func ForeachStore(t *testing.T, testFn func(t *testing.T, bkt objstore.Bucket))
defer closeFn()

testFn(t, bkt)
testFn(t, objstore.NewPrefixedBucket(bkt, "some_prefix"))
})
}
}
97 changes: 97 additions & 0 deletions pkg/objstore/prefixed_bucket.go
@@ -0,0 +1,97 @@
// Copyright (c) The Thanos Authors.
// Licensed under the Apache License 2.0.

package objstore

import (
"context"
"io"
"strings"
)

type PrefixedBucket struct {
bkt Bucket
prefix string
}

func NewPrefixedBucket(bkt Bucket, prefix string) Bucket {
if validPrefix(prefix) {
return &PrefixedBucket{bkt: bkt, prefix: strings.Trim(prefix, DirDelim)}
}

return bkt
}

func validPrefix(prefix string) bool {
prefix = strings.Replace(prefix, "/", "", -1)
return len(prefix) > 0
}

func conditionalPrefix(prefix, name string) string {
if len(name) > 0 {
return withPrefix(prefix, name)
}

return name
}

func withPrefix(prefix, name string) string {
return prefix + DirDelim + name
}

func (p *PrefixedBucket) Close() error {
return p.bkt.Close()
}

// Iter calls f for each entry in the given directory (not recursive.). The argument to f is the full
// object name including the prefix of the inspected directory.
// Entries are passed to function in sorted order.
func (p *PrefixedBucket) Iter(ctx context.Context, dir string, f func(string) error, options ...IterOption) error {
pdir := withPrefix(p.prefix, dir)

return p.bkt.Iter(ctx, pdir, func(s string) error {
return f(strings.TrimPrefix(s, p.prefix+DirDelim))
}, options...)
}

// Get returns a reader for the given object name.
func (p *PrefixedBucket) Get(ctx context.Context, name string) (io.ReadCloser, error) {
return p.bkt.Get(ctx, conditionalPrefix(p.prefix, name))
}

// GetRange returns a new range reader for the given object name and range.
func (p *PrefixedBucket) GetRange(ctx context.Context, name string, off int64, length int64) (io.ReadCloser, error) {
return p.bkt.GetRange(ctx, conditionalPrefix(p.prefix, name), off, length)
}

// Exists checks if the given object exists in the bucket.
func (p *PrefixedBucket) Exists(ctx context.Context, name string) (bool, error) {
return p.bkt.Exists(ctx, conditionalPrefix(p.prefix, name))
}

// IsObjNotFoundErr returns true if error means that object is not found. Relevant to Get operations.
func (p *PrefixedBucket) IsObjNotFoundErr(err error) bool {
return p.bkt.IsObjNotFoundErr(err)
}

// Attributes returns information about the specified object.
func (p PrefixedBucket) Attributes(ctx context.Context, name string) (ObjectAttributes, error) {
return p.bkt.Attributes(ctx, conditionalPrefix(p.prefix, name))
}

// Upload the contents of the reader as an object into the bucket.
// Upload should be idempotent.
func (p *PrefixedBucket) Upload(ctx context.Context, name string, r io.Reader) error {
return p.bkt.Upload(ctx, conditionalPrefix(p.prefix, name), r)
}

// Delete removes the object with the given name.
// If object does not exists in the moment of deletion, Delete should throw error.
func (p *PrefixedBucket) Delete(ctx context.Context, name string) error {
return p.bkt.Delete(ctx, conditionalPrefix(p.prefix, name))
}

// Name returns the bucket name for the provider.
func (p *PrefixedBucket) Name() string {
return p.bkt.Name()
}
92 changes: 92 additions & 0 deletions pkg/objstore/prefixed_bucket_test.go
@@ -0,0 +1,92 @@
// Copyright (c) The Thanos Authors.
// Licensed under the Apache License 2.0.

package objstore

import (
"context"
"io/ioutil"
"sort"
"strings"
"testing"

"github.com/thanos-io/thanos/pkg/testutil"
)

func TestPrefixedBucket_Acceptance(t *testing.T) {

prefixes := []string{
"/someprefix/anotherprefix/",
"someprefix/anotherprefix/",
"someprefix/anotherprefix",
"someprefix/",
"someprefix"}

for _, prefix := range prefixes {
AcceptanceTest(t, NewPrefixedBucket(NewInMemBucket(), prefix))
UsesPrefixTest(t, NewInMemBucket(), prefix)
}
}

func UsesPrefixTest(t *testing.T, bkt Bucket, prefix string) {
testutil.Ok(t, bkt.Upload(context.Background(), strings.Trim(prefix, "/")+"/file1.jpg", strings.NewReader("test-data1")))

pBkt := NewPrefixedBucket(bkt, prefix)
rc1, err := pBkt.Get(context.Background(), "file1.jpg")
testutil.Ok(t, err)

testutil.Ok(t, err)
defer func() { testutil.Ok(t, rc1.Close()) }()
content, err := ioutil.ReadAll(rc1)
testutil.Ok(t, err)
testutil.Equals(t, "test-data1", string(content))

testutil.Ok(t, pBkt.Upload(context.Background(), "file2.jpg", strings.NewReader("test-data2")))
rc2, err := bkt.Get(context.Background(), strings.Trim(prefix, "/")+"/file2.jpg")
testutil.Ok(t, err)
GiedriusS marked this conversation as resolved.
Show resolved Hide resolved
defer func() { testutil.Ok(t, rc2.Close()) }()
contentUpload, err := ioutil.ReadAll(rc2)
testutil.Ok(t, err)
testutil.Equals(t, "test-data2", string(contentUpload))

testutil.Ok(t, pBkt.Delete(context.Background(), "file2.jpg"))
_, err = bkt.Get(context.Background(), strings.Trim(prefix, "/")+"/file2.jpg")
testutil.NotOk(t, err)
testutil.Assert(t, pBkt.IsObjNotFoundErr(err), "expected not found error got %s", err)

rc3, err := pBkt.GetRange(context.Background(), "file1.jpg", 1, 3)
testutil.Ok(t, err)
defer func() { testutil.Ok(t, rc3.Close()) }()
content, err = ioutil.ReadAll(rc3)
testutil.Ok(t, err)
testutil.Equals(t, "est", string(content))

ok, err := pBkt.Exists(context.Background(), "file1.jpg")
testutil.Ok(t, err)
testutil.Assert(t, ok, "expected exits")
jademcosta marked this conversation as resolved.
Show resolved Hide resolved

attrs, err := pBkt.Attributes(context.Background(), "file1.jpg")
testutil.Ok(t, err)
testutil.Assert(t, attrs.Size == 10, "expected size to be equal to 10")

testutil.Ok(t, bkt.Upload(context.Background(), strings.Trim(prefix, "/")+"/dir/file1.jpg", strings.NewReader("test-data1")))
seen := []string{}
testutil.Ok(t, pBkt.Iter(context.Background(), "", func(fn string) error {
seen = append(seen, fn)
return nil
}, WithRecursiveIter))
expected := []string{"dir/file1.jpg", "file1.jpg"}
sort.Strings(expected)
sort.Strings(seen)
testutil.Equals(t, expected, seen)

seen = []string{}
testutil.Ok(t, pBkt.Iter(context.Background(), "", func(fn string) error {
seen = append(seen, fn)
return nil
}))
expected = []string{"dir/", "file1.jpg"}
sort.Strings(expected)
sort.Strings(seen)
testutil.Equals(t, expected, seen)
}