crash with high concurrency in warp put #301

Open
harshavardhana opened this issue Feb 9, 2024 · 15 comments

@harshavardhana
Member

warp put --tls --insecure --host 10.10.100.61:9000 --access-key minio --secret-key minio123 --autoterm --concurrent 168
panic: runtime error: slice bounds out of range [24560:16400]

goroutine 953 [running]:
github.com/secure-io/sio-go.(*EncReader).Read(0xc00033ebb0, {0xc002394000?, 0xc0072929a8?, 0x41d4f6?})
        github.com/secure-io/[email protected]/reader.go:57 +0x1ea
io.ReadAtLeast({0xe309e0, 0xc00033ebb0}, {0xc002394000, 0x2000, 0x2000}, 0x2000)
        io/io.go:335 +0x90
io.ReadFull(...)
        io/io.go:354
github.com/minio/warp/pkg/generator.(*scrambler).Read(0xc000640090, {0xc002394000?, 0x452ae9?, 0x2000?})
        github.com/minio/warp/pkg/generator/scambler.go:116 +0x6c
github.com/minio/minio-go/v7.(*hookReader).Read(0xc002c2aa80, {0xc002394000, 0x6?, 0x2000})
        github.com/minio/minio-go/[email protected]/hook-reader.go:76 +0xbe
io.discard.ReadFrom({}, {0x7f6a69f55180, 0xc0069cedb0})
        io/io.go:658 +0x6d
io.copyBuffer({0xe2f5e0, 0x13ba060}, {0x7f6a69f55180, 0xc0069cedb0}, {0x0, 0x0, 0x0})
        io/io.go:416 +0x147
io.Copy(...)
        io/io.go:389
net/http.(*transferWriter).doBodyCopy(0xc006054a00, {0xe2f5e0?, 0x13ba060?}, {0x7f6a69f55180?, 0xc0069cedb0?})
        net/http/transfer.go:412 +0x48
net/http.(*transferWriter).writeBody(0xc006054a00, {0xe2fa20, 0xc003212140})
        net/http/transfer.go:375 +0x408
net/http.(*Request).write(0xc006ffdc00, {0xe2fa20, 0xc003212140}, 0x0, 0x0, 0x0)
        net/http/request.go:738 +0xbad
net/http.(*persistConn).writeLoop(0xc0039c8a20)
        net/http/transport.go:2424 +0x18f
created by net/http.(*Transport).dialConn in goroutine 2172
        net/http/transport.go:1777 +0x16f1
@klauspost
Collaborator

Maybe @aead can help a bit since it is sio? It could also be some concurrent access; I haven't looked at the code yet.

@harshavardhana
Member Author

@aead ^^

@klauspost
Collaborator

I think this could actually be related to the issues we are having with multipart uploads.

@harshavardhana
Member Author

I think this could actually be related to the issues we are having with multipart uploads.

Which one, @klauspost?

@klauspost
Collaborator

@harshavardhana The one that forced us to turn off checksums on multipart replication or tiering - forget which.

@harshavardhana
Member Author

@harshavardhana The one that forced us to turn off checksums on multipart replication or tiering - forget which.

We didn't turn off checksums for that; we turned off computing SHA-256 and MD5 sums, which are expensive.

We still enable CRC checksums.

@akshay8043

Sorry to jump in; I think I am in the same boat.

I am running warp mixed to fill a bucket on NVMe object storage with 500-600 million objects of less than 150KB each, at 500 concurrent requests using 2 clients.

The clients seem to get killed automatically and my warp script stops.

warp put doesn't have an option to upload a fixed number of objects, which is why I am using warp mixed with all other distributions at zero and put-distrib at 100 (roughly as sketched below).

  1. Feature request: add a number-of-objects option/parameter to warp put.
  2. What could be the issue when the warp client shows "Killed"?
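
For reference, the workaround described above would look roughly like the following (host, credentials, and sizes are illustrative placeholders, not taken from this report):

warp mixed --host <endpoint>:9000 --access-key <key> --secret-key <secret> --obj.size 150KiB --concurrent 500 --put-distrib 100 --get-distrib 0 --stat-distrib 0 --delete-distrib 0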

@klauspost
Collaborator

@akshay8043 You are just running out of memory, and that is not related to this. Use --stress, and requests will no longer be logged. Use warp get if you want to upload a specific number of objects.
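
A hedged sketch of that suggestion (endpoint, credentials, and counts are placeholders): the prepare stage of warp get uploads the number of objects given by --objects.

warp get --host <endpoint>:9000 --access-key <key> --secret-key <secret> --objects 1000000 --obj.size 150KiB --concurrent 500 --stress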

@romayalon

Hey @klauspost
We also experience a crash of the warp client when running with high concurrency; --stress did not help. Are there any recommendations for debugging this issue?
This is the command we run:

warp versioned --host="$host_address" --access-key="$access_key" --secret-key="$secret_key" --obj.size=1k --duration=1h --stress --objects=10000 --concurrent=100 --bucket="bucket1" --insecure --tls

@klauspost
Collaborator

@romayalon Provide a trace from the crash. Without that there is nothing to go on.

@romayalon

@klauspost Is there a way to get a trace if the server is not MinIO? We run NooBaa as the server; this is all I got from the person who ran it:
warp dies
351316 Killed
warp versioned --host={10 hosts addresses} --access-key="$access_key" --secret-key="$secret_key" --obj.size=1k --stress --duration=8h --objects=10000 --concurrent=1000 --bucket="bucket5004" --insecure --tls

@klauspost
Collaborator

@romayalon Sounds like you are getting OOM killed.

@romayalon

@klauspost I thought so too, but we usually see an OOMKilled 137 error. Is there a way to get warp logs?

@klauspost
Collaborator

@romayalon Either way it is being killed externally.
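
A quick way to confirm an external OOM kill, assuming a Linux host where the kernel log is readable:

# check the kernel ring buffer and syslog for the OOM killer
dmesg -T | grep -i 'out of memory'
grep -i 'killed process' /var/log/messages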

@romayalon

Updating for the community: we found proof that warp was OOM-killed in /var/log/messages:
kernel: Out of memory: Killed process <pid> (warp)
