
Data usage crawler refactor #9075

Merged
8 commits merged into minio:master from klauspost:data-usage-crawler-recfactor on Mar 18, 2020

Conversation

@klauspost (Contributor) commented Mar 2, 2020

Description

Implementation overview: https://gist.github.com/klauspost/1801c858d5e0df391114436fdad6987b

Includes a staticcheck upgrade and fixes for the issues it flagged.

Some quick performance tests of crawls:

This is with a cycle size of 16 and 174,082 XL objects, ~4K folders at prefix level 2, on NVMe.

```
BEFORE:
Crawl time 1m17.3025259s
Disk access: 350MB/s
Kernel  Time =    59.468 =   70%
User    Time =    21.984 =   25%
Process Time =    81.453 =   95%    Virtual  Memory =    303 MB

AFTER:
SET MINIO_DISK_USAGE_CRAWL_DELAY=10 (default)
Cycle scan time: 45.354097s
Disk Access: 15MB/s
Kernel  Time =     0.562 =    1%
User    Time =     0.640 =    1%
Process Time =     1.203 =    2%    Virtual  Memory =    304 MB

SET MINIO_DISK_USAGE_CRAWL_DELAY=1
Cycle scan time: 3.3367481s
Disk Access: 160MB/s
Kernel  Time =     0.468 =    3%
User    Time =     1.421 =   12%
Process Time =     1.890 =   16%    Virtual  Memory =    303 MB

SET MINIO_DISK_USAGE_CRAWL_DELAY=0
Cycle scan time: 2.1245395s
Disk Access: 175MB/s
Kernel  Time =     1.500 =   14%
User    Time =     0.765 =    7%
Process Time =     2.265 =   22%    Virtual  Memory =    304 MB
```
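
The numbers above suggest the refactored crawler throttles itself by sleeping between filesystem operations, scaled by MINIO_DISK_USAGE_CRAWL_DELAY. The following is only a minimal sketch of that idea, assuming a plain walk-and-sleep loop; the helper names and the per-entry delay are illustrative, not the actual MinIO implementation:

```go
// Illustrative sketch only; not MinIO's crawler code.
package main

import (
	"os"
	"path/filepath"
	"strconv"
	"time"
)

// crawlDelayMultiplier reads MINIO_DISK_USAGE_CRAWL_DELAY and falls back to
// 10, the default used in the benchmarks above (hypothetical helper).
func crawlDelayMultiplier() int {
	if v, err := strconv.Atoi(os.Getenv("MINIO_DISK_USAGE_CRAWL_DELAY")); err == nil && v >= 0 {
		return v
	}
	return 10
}

// throttledCrawl walks a tree and backs off between entries; a higher
// multiplier trades a longer cycle scan time for much lower disk pressure.
func throttledCrawl(root string, perEntry time.Duration) error {
	mult := time.Duration(crawlDelayMultiplier())
	return filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		// Account for this entry in the usage totals here (omitted), then
		// sleep before touching the next one.
		time.Sleep(perEntry * mult)
		return nil
	})
}

func main() {
	_ = throttledCrawl("/data", time.Millisecond)
}
```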

How to test this PR?

For now, the server will display extra information on crawling.

Types of changes

  • New feature (non-breaking change which adds functionality)

Checklist:

  • Documentation needed
  • Unit tests needed

@klauspost klauspost marked this pull request as ready for review March 5, 2020 14:43

@vadmeste (Member) left a comment:

Some comments

@klauspost klauspost mentioned this pull request Mar 10, 2020

@fwessels (Contributor) left a comment:

Nice work, just minor comments. LGTM.

@harshavardhana (Member):

Can all your intermediate commits be squashed, or at least squashed to a relevant number of commits?

@minio minio deleted a comment from minio-trusted Mar 13, 2020

@klauspost (Contributor, Author):

> Can all your intermediate commits be squashed, or at least squashed to a relevant number of commits?

@harshavardhana Don't we squash when merging?

@harshavardhana (Member):

There are way too many commits; the merger won't know what to keep in terms of the commit subject. So it is better for you to squash and provide a healthy commit subject.

## Description
Clarify disk and node.
Remove the limit paragraph, since there is no longer a maximum of 16 disks.

## Motivation and Context


## How to test this PR?


## Types of changes
- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Checklist:
- [ ] Fixes a regression (If yes, please add `commit-id` or `PR #` here)
- [ ] Documentation needed
- [ ] Unit tests needed
- [ ] Functional tests needed (If yes, add [mint](https://github.com/minio/mint) PR # here: )

@krishnasrinivas (Contributor):

```go
func storeDataUsageInBackend(ctx context.Context, objAPI ObjectLayer, gui <-chan DataUsageInfo) {
        for dataUsageInfo := range gui {
```

It looks like gui is never closed, causing a goroutine leak here.

@klauspost (Contributor, Author):

> It looks like gui is never closed, causing a goroutine leak here.

It is closed by the caller, here:

```go
close(results)
```
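
For readers following along, this is the standard Go pattern under discussion: a `for ... range` over a channel only terminates when the sending side closes it, so the caller must eventually call close to let the consumer goroutine exit. A self-contained sketch with simplified types (not the actual MinIO code):

```go
package main

import "fmt"

// DataUsageInfo is a simplified stand-in for the real type.
type DataUsageInfo struct {
	ObjectsCount uint64
}

// consume mirrors the storeDataUsageInBackend shape: the range loop only
// exits once the sending side closes the channel; otherwise this goroutine
// would leak.
func consume(gui <-chan DataUsageInfo) {
	for info := range gui {
		fmt.Println("store usage:", info.ObjectsCount)
	}
}

func main() {
	results := make(chan DataUsageInfo)
	done := make(chan struct{})
	go func() {
		consume(results)
		close(done)
	}()

	for i := uint64(1); i <= 3; i++ {
		results <- DataUsageInfo{ObjectsCount: i}
	}
	// Closing on the producer/caller side is what lets the range loop finish.
	close(results)
	<-done // wait for the consumer to drain and exit
}
```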

klauspost added a commit to klauspost/minio that referenced this pull request Mar 16, 2020
Squashed version of/replaces minio#9075

@klauspost klauspost mentioned this pull request Mar 16, 2020
@klauspost klauspost force-pushed the data-usage-crawler-recfactor branch from 7a51742 to 003cb18 on March 16, 2020 10:45

@klauspost (Contributor, Author):

OK, did it, reluctantly. It is very annoying for anyone working on this to no longer be able to pull and get straightforward fixes.

I know you like force-pushing stuff, but that is extremely annoying for anyone else working on it (and I will have to spend time re-merging the changes into the bloom filter branch).

To get a reasonable commit message, may I recommend going to the top comment, opening Edit, and copy-pasting the message as the commit message. It takes 5 seconds and far outweighs having everything force-pushed.

@harshavardhana (Member):

> And copy-pasting the message as the commit message. It takes 5 seconds and far outweighs having everything force-pushed.

That is definitely easy and well understood, but when we merge, the squashed commits become bullet items, which are harder to sift through when intermediate commits have titles like the one below:

* fix up something odd

Should we save it? Is it intended to be like this? The onus is on the developer to write concise titles and messages, even for intermediate commits that need to be saved.

A merger cannot make up commit titles; this also saves the one who merges ample time, because this is not the only PR the merger is handling in their day-to-day work.

So, as a practice, for a large PR we generally ask for a squash at the end, so that we know exactly what to save as part of the commit message and keep the history clean as per the original author's intention.

@harshavardhana (Member):

@klauspost it was decided that we should move the usage file to usage.msgp as part of the buckets sub-folder in .minio.sys:

```
~ tree -d /home/play/data{1..4}/.minio.sys/buckets
/home/play/data1/.minio.sys/buckets
└── usage.msgp
/home/play/data2/.minio.sys/buckets
└── usage.msgp
/home/play/data3/.minio.sys/buckets
└── usage.msgp
/home/play/data4/.minio.sys/buckets
└── usage.msgp
```

So we can remove the background-ops based approach with this PR.

@klauspost (Contributor, Author):

> it was decided that we should move the usage file to usage.msgp as part of the buckets sub-folder in .minio.sys

Yes, but there are many more files than this. I've changed it to:

```
/home/play/data1/.minio.sys/buckets
  - usage.json         << Contains data usage totals as JSON.
  - usage-cache.bin    << Contains the cache for the total cluster.
  / usage-caches/      << Contains per-bucket caches.
      - bucket1.bin    << Cache for bucket1 (name of the bucket).
      - bucket2.bin
```

The caches are not strict msgpack, so I don't want to give that impression.
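
To make that layout concrete, here is a hedged sketch of how the JSON totals file could be written under .minio.sys/buckets/usage.json. The struct, field, and helper names are hypothetical (not MinIO's actual types), and the binary usage-cache.bin and usage-caches/*.bin files are deliberately not shown, since their format is internal:

```go
package main

import (
	"encoding/json"
	"os"
	"path/filepath"
)

// usageTotals is a hypothetical, simplified stand-in for the cluster totals;
// it is not MinIO's actual DataUsageInfo type.
type usageTotals struct {
	ObjectsCount uint64            `json:"objectsCount"`
	TotalSize    uint64            `json:"totalSize"`
	BucketSizes  map[string]uint64 `json:"bucketSizes"`
}

// writeUsageJSON writes the human-readable totals to
// <disk>/.minio.sys/buckets/usage.json, mirroring the layout described above.
func writeUsageJSON(diskRoot string, totals usageTotals) error {
	dir := filepath.Join(diskRoot, ".minio.sys", "buckets")
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return err
	}
	data, err := json.MarshalIndent(totals, "", "  ")
	if err != nil {
		return err
	}
	return os.WriteFile(filepath.Join(dir, "usage.json"), data, 0o644)
}

func main() {
	_ = writeUsageJSON("/home/play/data1", usageTotals{
		ObjectsCount: 174082,
		TotalSize:    1 << 30,
		BucketSizes:  map[string]uint64{"bucket1": 1 << 20},
	})
}
```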

@minio-trusted (Contributor):

Mint Automation

| Test | Result |
|---|---|
| mint-xl.sh | ✔️ |
| mint-large-bucket.sh | ✔️ |
| mint-worm.sh | ✔️ |
| mint-fs.sh | ✔️ |
| mint-dist-xl.sh | ✔️ |
| mint-gateway-nas.sh | ✔️ |
| mint-gateway-azure.sh | see log below |

9075-8b7a814/mint-gateway-azure.sh.log:

Running with
SERVER_ENDPOINT:      minio-dev3.minio.io:31557
ACCESS_KEY:           minioazure
SECRET_KEY:           ***REDACTED***
ENABLE_HTTPS:         0
SERVER_REGION:        us-east-1
MINT_DATA_DIR:        /mint/data
MINT_MODE:            full
ENABLE_VIRTUAL_STYLE: 0

To get logs, run 'docker cp 0bb0d2fa132c:/mint/log /tmp/mint-logs'

(1/15) Running aws-sdk-go tests ... done in 8 seconds
(2/15) Running aws-sdk-java tests ... done in 2 seconds
(3/15) Running aws-sdk-php tests ... done in 6 minutes and 52 seconds
(4/15) Running aws-sdk-ruby tests ... done in 18 seconds
(5/15) Running awscli tests ... done in 2 minutes and 36 seconds
(6/15) Running healthcheck tests ... done in 5 seconds
(7/15) Running mc tests ... done in 7 minutes and 0 seconds
(8/15) Running minio-dotnet tests ... done in 2 minutes and 37 seconds
(9/15) Running minio-go tests ... done in 21 minutes and 33 seconds
(10/15) Running minio-java tests ... FAILED in 8 minutes and 22 seconds
{
  "name": "minio-java",
  "function": "composeObject(String bucketName, String objectName,List<ComposeSource> composeSources, Map <String,String > headerMap, ServerSideEncryption sseTarget)",
  "args": "size: 6 MB & 6 MB",
  "duration": 11638,
  "status": "FAIL",
  "error": "error occurred\nErrorResponse(code=NotImplemented, message=A header you provided implies functionality that is not implemented, bucketName=minio-java-test-2p9bprd, objectName=minio-java-test-12q6ud9, resource=/minio-java-test-2p9bprd/minio-java-test-12q6ud9, requestId=15FD1C8E3FAB397C, hostId=a9acec85-a16c-493d-ad0b-b7c5048505e0)\nrequest={method=PUT, url=http://minio-dev3.minio.io:31557/minio-java-test-2p9bprd/minio-java-test-12q6ud9?uploadId=6bdb9dfbbefc1e29&partNumber=1, headers=x-amz-copy-source-if-match: 5ce8233c756fab8a99990e5071ec0749-2\nx-amz-copy-source: minio-java-test-2p9bprd/minio-java-test-340ltmj\nAccept-Encoding: identity\nHost: minio-dev3.minio.io:31557\nUser-Agent: MinIO (amd64; amd64) minio-java/dev\nx-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855\nx-amz-date: 20200317T140721Z\nAuthorization: AWS4-HMAC-SHA256 Credential=*REDACTED*/20200317/us-east-1/s3/aws4_request, SignedHeaders=accept-encoding;host;x-amz-content-sha256;x-amz-copy-source;x-amz-copy-source-if-match;x-amz-date, Signature=*REDACTED*\n}\nresponse={code=501, headers=Accept-Ranges: bytes\nContent-Length: 410\nContent-Security-Policy: block-all-mixed-content\nContent-Type: application/xml\nServer: MinIO/DEVELOPMENT.2020-03-17T13-04-38Z\nVary: Origin\nX-Amz-Request-Id: 15FD1C8E3FAB397C\nX-Xss-Protection: 1; mode=block\nDate: Tue, 17 Mar 2020 14:07:21 GMT\n}\n >>> [io.minio.MinioClient.executeReq(MinioClient.java:1220), io.minio.MinioClient.execute(MinioClient.java:1082), io.minio.MinioClient.executePut(MinioClient.java:1446), io.minio.MinioClient.executePut(MinioClient.java:1468), io.minio.MinioClient.uploadPartCopy(MinioClient.java:2619), io.minio.MinioClient.composeObject(MinioClient.java:2572), FunctionalTest.composeObject_test1(FunctionalTest.java:2587), FunctionalTest.runTests(FunctionalTest.java:3396), FunctionalTest.main(FunctionalTest.java:3514)]"
}

Executed 9 out of 15 tests successfully.

Deleting image on docker hub
Deleting image locally

@minio minio deleted a comment from minio-ops Mar 18, 2020
@harshavardhana harshavardhana merged commit 8d98662 into minio:master Mar 18, 2020