Speed up hashing by introducing 2 passes #1423

Open
quizac- opened this issue Dec 27, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

quizac- commented Dec 27, 2024

Hi there,

I haven't checked the code yet, but judging by the results shown by iostat, czkawka hashes the full contents of every file that shares its size with another file. This could be improved by introducing two-pass hashing. The first pass would read and hash only a fixed-size block (64 kB, 1 MB, etc.) from the beginning of each file. The second pass would hash the full file, but only for files whose first-pass hashes matched and which might therefore be identical.
This could speed up duplicate searches, especially on network-attached volumes with a large number of files to compare.
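
To illustrate the idea, here is a minimal sketch of two-pass hashing. It is not czkawka's code (czkawka is written in Rust and I have not checked how it hashes); Python is used only for brevity. The names `hash_file`, `group_by`, `find_duplicates` and `PREFIX_BYTES`, as well as the choice of BLAKE2b, are assumptions made for this example.

```python
import hashlib
import os
from collections import defaultdict

# Hypothetical first-pass block size; the proposal mentions 64 kB or 1 MB.
PREFIX_BYTES = 64 * 1024


def hash_file(path, limit=None, chunk_size=1024 * 1024):
    """Hash up to `limit` bytes of a file (the whole file if limit is None)."""
    digest = hashlib.blake2b()
    remaining = limit
    with open(path, "rb") as handle:
        while remaining is None or remaining > 0:
            to_read = chunk_size if remaining is None else min(chunk_size, remaining)
            block = handle.read(to_read)
            if not block:
                break  # reached end of file
            digest.update(block)
            if remaining is not None:
                remaining -= len(block)
    return digest.hexdigest()


def group_by(paths, key):
    """Group paths by key(path), keeping only groups with two or more members."""
    groups = defaultdict(list)
    for path in paths:
        groups[key(path)].append(path)
    return [group for group in groups.values() if len(group) > 1]


def find_duplicates(paths, prefix_bytes=PREFIX_BYTES):
    """Return lists of paths whose contents hash identically."""
    duplicates = []
    for same_size in group_by(paths, os.path.getsize):
        # Pass 1: hash only the first prefix_bytes of each same-size candidate.
        first_pass = group_by(same_size, lambda p: hash_file(p, prefix_bytes))
        for same_prefix in first_pass:
            # Pass 2: hash the full file, only for files whose prefixes matched.
            duplicates.extend(group_by(same_prefix, hash_file))
    return duplicates
```

With this layout, a file whose first block differs from every other file of the same size is read only up to `prefix_bytes`, which is where the savings on slow or network-attached storage would come from. A further refinement, not shown here, would be to skip the second pass for files no larger than `prefix_bytes`, since the first pass has already hashed them completely.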

quizac- added the enhancement label Dec 27, 2024