Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Side loading compressed layer files into the cacheDir #220

Open
1 task
simha-db opened this issue Jun 8, 2023 · 13 comments
Open
1 task

Side loading compressed layer files into the cacheDir #220

simha-db opened this issue Jun 8, 2023 · 13 comments
Labels
question Further information is requested

Comments

@simha-db
Copy link

simha-db commented Jun 8, 2023

What happened in your environment?

is side loading compressed layer files into the cacheDir supported?

I am looking into fetching the layer files out of band and saving them in the cacheDir as sha256:. Does overlaybd pickup the new files added externally to the directory after it has started? Or does it hydrate once and maintains the inventory internally until a restart?

What did you expect to happen?

I still see a few downloads happening - the logging wasn't verbose and didn't easily tell me which digest it was referring to - i will turn on debug logging and try again - but wanted to check if the above scenario is supposed to work.

How can we reproduce it?

copy the compressed layer files to the cacheDir after overlaybd has started - then start the container - expectation is overlaybd doesn't issue any blob download requests.

What is the version of your Overlaybd?

NA

What is your OS environment?

Ubuntu 18

Are you willing to submit PRs to fix it?

  • Yes, I am willing to fix it.
@simha-db simha-db added the bug Something isn't working label Jun 8, 2023
@simha-db
Copy link
Author

simha-db commented Jun 8, 2023

Sorry i mistagged the request. It is more of a question.

@yuchen0cc
Copy link
Contributor

yuchen0cc commented Jun 9, 2023

@simha-db thanks for asking!

  1. Is side loading compressed layer files into the cacheDir supported?
    Yes, but it has limitations.
    First, you should make sure the loaded files will not override existing files. Those files may have been opened by running containers. Overriding won't let overlaybd reopen these files. (downloading from source will still happen for this layer)
    Second, cacheDir size cacheConfig.cacheSizeGB should large enough to avoid loaded files triggering cache gc, which is unexpected behavior.
    So, it is highly recommanded that you stop overlaybd first, then load files to cache, and restart overlaybd.
  2. Why downloads happening ?
    I think it's because background download is enable in overlaybd config (download.enable). Caching and background download are seprate operations, and background download files are not stored in cache dir. If you are trying to load files into cache dir, background download can be disabled.

@simha-db
Copy link
Author

simha-db commented Jun 9, 2023

Ah ok. When is background download triggered?If there is no prefetch involved - do any of the ondemand loads go through background download?

Some what off topic: If i use the ctr rpull --download-blobs - does that go through the cache?

@yuchen0cc
Copy link
Contributor

@simha-db background download starts 600s after container started by default, which can be set by download.delay.
Background download is performed in a seprate thread, and on-demand load won't go through it, neither will prefetch.

ctr rpull --download-blobs download blobs to snapshotter dir, will not go through cache.

@simha-db
Copy link
Author

simha-db commented Jun 9, 2023

Thanks - And the background downloader fetches the full image to the snapshot directory?

@yuchen0cc
Copy link
Contributor

@simha-db Yes, exactly.
But when blobs have been downloaded by ctr rpull --download-blobs, container will start from local files, and background download will not be triggered.

@simha-db
Copy link
Author

Do all registry requests get logged in overlaybd-audit.log?

One thing i notice is - if i prepull the entire image into registry_cache - pull image request to image service - which should be doing only rpull (without the download) takes much longer. When an image file is opened - if the layer files are in the registry_cache does the snapshotter do additional work to pull them to the snapshot directory or something of that sort?

@yuchen0cc yuchen0cc added question Further information is requested and removed bug Something isn't working labels Jun 15, 2023
@yuchen0cc
Copy link
Contributor

@simha-db all overlaybd registry requests (after rpull) will be logged in overlaybd-audit.log (background download audit is added in v0.6.9 #222).
If layers have been in registry_cache, container open will not request data from registry. Snapshotter won't pull layers to snapshott directory after container started, but background download will. Please make sure background download feature is disabled before your test.
Prepull should not affect time of rpull (without download), could you please show me some data ?

@simha-db
Copy link
Author

Turns out the time is the same - i incorrectly attributed it to an increase.

@simha-db simha-db reopened this Jul 25, 2023
@simha-db
Copy link
Author

I think i have some data on the slowness - maybe it is exposed when we upgraded our overlaybd - but here's what i observe.

  1. My side loading the layer files finishes in 4 seconds.
  2. Then we call CRI pull image and i see in the containerd logs the following
Jul 25 07:01:25 ip-10-1-166-126 containerd[936]: time="2023-07-25T07:01:25.841633618Z" level=info msg="PullImage \”<removed>@sha256:a65d68704b22c9752c023cec0a2ef0868d31623574dabaf9949fcf525645b32e\""
Jul 25 07:01:25 ip-10-1-166-126 containerd[936]: time="2023-07-25T07:01:25.841757128Z" level=info msg="experimental: PullImage \”<removed>@sha256:a65d68704b22c9752c023cec0a2ef0868d31623574dabaf9949fcf525645b32e\" for runtime runc-overlaybd, using snapshotter overlaybd"
**Jul 25 07:02:10** ip-10-1-166-126 containerd[936]: time="2023-07-25T07:02:10.703589621Z" level=info msg="ImageCreate event &ImageCreate{Name:<removed>@sha256:a65d68704b22c9752c023cec0a2ef0868d31623574dabaf9949fcf525645b32e,Labels:map[string]string{io.cri-containerd.image: managed,},XXX_unrecognized:[],}"

the CRI pull image took close to 45 seconds.

  1. I look in overlaybd-auditlog - indeed it is empty - no pulls were made from network
2023/07/25 07:01:47|AUDIT|th=00007F08D4F5F300|file:read[pathname=][offset=0][size=512][latency=34292]
2023/07/25 07:02:26|AUDIT|th=00007F0743D35E80|file:read[pathname=][offset=108004587][size=2922][latency=29157]
2023/07/25 07:02:26|AUDIT|th=00007F07D35F7E80|file:read[pathname=][offset=1412525637][size=30698][latency=43467]
2023/07/25 07:02:26|AUDIT|th=00007F07D0DD7F40|file:read[pathname=][offset=1412556335][size=63151][latency=43467]
2023/07/25 07:02:27|AUDIT|th=00007F07D0DD7F40|file:read[pathname=][offset=1396719871][size=859][latency=12106]
2023/07/25 07:02:27|AUDIT|th=00007F0743D35E80|file:read[pathname=][offset=107999319][size=2456][latency=20176]
2023/07/25 07:02:27|AUDIT|th=00007F06E6BC7300|file:read[pathname=][offset=512][size=512][latency=19326]
2023/07/25 07:02:36|AUDIT|th=00007F06E4BAF2C0|file:read[pathname=][offset=213191565][size=45214][latency=12454]

  1. When i look in the overlaybd-snapshotter service logs i see the following - the last entry in the snapshotter - roughly matches the time when the image reference was returned to containerd. Seems like the snapshotter is doing something synchronously if it finds the cache files. When we don't side load - we don't see this delay. What is the snapshotter doing here? is it validating the files synchronously?
Jul 25 07:01:33 ip-10-1-166-126 overlaybd-snapshotter[1543]: time="2023-07-25T07:01:33Z" level=info msg="{\n  \"repoBlobUrl\": \”<removed>/blobs\",\n  \"lowers\": [\n    {\n      \"digest\": \"sha256:0d5554fe6fff2aa24f4065881f061f4c1b03749341e119a93174cc0858f0ed57\",\n      \"size\": 47891968,\n      \"dir\": \"/var/lib/overlaybd/snapshots/29/fs\"\n    }\n  ],\n  \"upper\": {},\n  \"resultFile\": \"/var/lib/overlaybd/snapshots/29/block/init-debug.log\"\n}"


**Jul 25 07:02:10** ip-10-1-166-126 overlaybd-snapshotter[1543]: time="2023-07-25T07:02:10Z" level=info msg="{\n  \"repoBlobUrl\": \”<removed>/blobs\",\n  \"lowers\": [\n    {\n      \"digest\": \"sha256:0d5554fe6fff2aa24f4065881f061f4c1b03749341e119a93174cc0858f0ed57\",\n      \"size\": 47891968,\n      \"dir\": \"/var/lib/overlaybd/snapshots/29/fs\"\n    },\n    {\n      \"digest\": \"sha256:82d331f6b157c3ee944d1064c55fa41f20024c34b2d7138598cec6942bc6bead\",\n      \"size\": 6144,\n      \"dir\": \"/var/lib/overlaybd/snapshots/32/fs\"\n    },\n    {\n      \"digest\": \"sha256:187a69a0ef15042fa944bc05217cf116ffc1de923bf1bbaa21241bf471324a55\",\n      \"size\": 7680,\n      \"dir\": \"/var/lib/overlaybd/snapshots/35/fs\"\n    },\n    {\n      \"digest\": \"sha256:2d0b9a2eff4cb17c414ca3cee0eb48d9e000ffd9661994997b08baa3a837d742\",\n      \"size\": 1020676608,\n      \"dir\":

@yuchen0cc
Copy link
Contributor

@simha-db overlaybd-snapshotter will not try to find and validate files. There is a global lock in overlaybd-snapshotter, which makes prepare/mount layer running in serial. Usually prepare is quite fast for each layer, does it happens when there are many rpull requests and prepares are queueing, or disk io load is rather high ?

@simha-db
Copy link
Author

Something doesn't add up - when all this activity is happening in overlaybd snapshotter - like this - containerd is still waiting for the image reference - there is no lazy load happening - what is all this activity in snapshotter? It seems to touch all the layer files one by one -does it setup the snapshot directory for all the registry cache layer files before the image reference is created?

Jul 25 07:01:33 ip-10-1-166-126 overlaybd-snapshotter[1543]: time="2023-07-25T07:01:33Z" level=info msg="{\n \"repoBlobUrl\": \”<removed>/blobs\",\n \"lowers\": [\n {\n \"digest\": \"sha256:0d5554fe6fff2aa24f4065881f061f4c1b03749341e119a93174cc0858f0ed57\",\n \"size\": 47891968,\n \"dir\": \"/var/lib/overlaybd/snapshots/29/fs\"\n }\n ],\n \"upper\": {},\n \"resultFile\": \"/var/lib/overlaybd/snapshots/29/block/init-debug.log\"\n}"

@yuchen0cc
Copy link
Contributor

@simha-db overlaybd-snapshotter indeed prepare snapshot directory for each image layer. snapshot directory contains some meta info for overlaybd-tcmu.
when rpull, containerd calls overlaybd-snapshotter to prepare snapshot for each image layer (ro layers).
when run, containerd calls overlaybd-snapshotter to prepare rootfs mountpoint (rw layer), which will call overlaybd-tcmu to setup block device based on those layers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants