Workaround assumption that layers are unique in zstdchunk converter #1847
Hi @apostasie, did you check this PR with the ctr-remote command, for example? Unfortunately, this change works for me in nerdctl but not in ctr-remote image optimize :(
Hey @GrigoryEvko Yep. If you do not specify it, because of the extra opts from Analyze, the converter function can no longer acquire a lock and the calls are again parallelized. Clearly there are deep-seated issues in the codebase making assumptions about images that are not true: Bottom-line: With this patch, use nerdctl, or use ctr-remote with If the maintainers here find this direction acceptable, we can also patch None of this appears to be safe to use concurrently (IMO the uncompress/defer delete step on the content store is problematic).
Oh no, it was the whole point of using optimize command 😄 For today, if I'd like to have optimized zstdchunked image, would It's interesting that converting to estargz with gzip doesn't have all these issues, I thought it's mostly a compression algorithm change, looks like it's implemented differently. Thanks again!
Signed-off-by: apostasie <[email protected]>
Oh, stop making these puppy eyes sir! 😂 Here. Latest version works for both (note that I hate the updated version even more than the previous one, because of the global vars). Enjoy ;-).
Afk for most of the day - will be slow to answer.
@AkihiroSuda @ktock gentle nudge
Hey,
(Hopefully) fixes #1842 and fixes containerd/nerdctl#3623
The zstdchunked convert function assumes that layers are unique and cannot be repeated.
stargz-snapshotter/nativeconverter/zstdchunked/zstdchunked.go
Lines 112 to 116 in a6b9bdb
When this is not the case, the deferred delete of the uncompressed version removes it while other competing goroutines may still need it, which makes the conversion fail.
This PR offers one possible solution that is minimally intrusive and does not change the overall design - just use a mutex when uncompressing a specific layer to ensure there cannot be concurrent processing for the same desc.
Note that if the layers are already uncompressed, no locking is enforced, hence the same class of issue may manifest itself elsewhere in the code (this PR does not try to address that - and it is unclear if such a situation is a problem or not).
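For readers who want to see the idea in isolation, here is a minimal sketch of per-digest locking. It is not the code from this PR: `digestLocks`, `newDigestLocks`, and `maxConcurrentPerDigest` are hypothetical names, and the sleep merely stands in for the uncompress, convert, and deferred delete sequence.

```go
package main

import (
	"sync"
	"time"
)

// digestLocks hands out one mutex per layer digest.
// Hypothetical helper for illustration, not the PR's implementation.
type digestLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newDigestLocks() *digestLocks {
	return &digestLocks{locks: make(map[string]*sync.Mutex)}
}

// lock blocks until the caller owns the mutex for digest,
// then returns the function that releases it.
func (d *digestLocks) lock(digest string) func() {
	d.mu.Lock()
	m, ok := d.locks[digest]
	if !ok {
		m = &sync.Mutex{}
		d.locks[digest] = m
	}
	d.mu.Unlock()

	m.Lock()
	return m.Unlock
}

// maxConcurrentPerDigest starts one goroutine per entry in layers and
// reports the highest number of goroutines that were ever working on
// the same digest at the same time.
func maxConcurrentPerDigest(layers []string) int {
	locks := newDigestLocks()
	var (
		wg      sync.WaitGroup
		statsMu sync.Mutex
		active  = map[string]int{}
		peak    int
	)
	for _, digest := range layers {
		wg.Add(1)
		go func(digest string) {
			defer wg.Done()
			unlock := locks.lock(digest)
			defer unlock() // released only after the simulated cleanup

			statsMu.Lock()
			active[digest]++
			if active[digest] > peak {
				peak = active[digest]
			}
			statsMu.Unlock()

			// Stand-in for: uncompress, convert, delete the temporary blob.
			time.Sleep(time.Millisecond)

			statsMu.Lock()
			active[digest]--
			statsMu.Unlock()
		}(digest)
	}
	wg.Wait()
	return peak
}
```

With a repeated digest in the input, the goroutines for that digest run one after another, while different digests still proceed in parallel. In this sketch the map of mutexes is never pruned, which is acceptable for a short-lived conversion but would need cleanup in a long-running process.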
I also opened a PR on nerdctl (containerd/nerdctl#3628); it should be closed over there if this one can be merged and released.
Finally, there may be more issues involved with large images like the ones in the initial reports - and this PR might not be a silver bullet, though it did successfully convert the nvidia image locally.
Thanks.
cc @AkihiroSuda @ktock