Speed up hashing by introducing 2 passes #1423

quizac- · 2024-12-27T10:39:44Z

Hi there,

I haven't checked the code yet, but judging by the results shown by iostat, czkawka hashes files of the same size in full. This could be improved by introducing two-pass hashing. The first pass would read and hash only a fixed-size block (64 kB, 1 MB, etc.) starting from the beginning of each file. The second pass would continue hashing the full file only if the first pass determined that the files might be the same.
This could speed up duplicate searches, especially on network-attached volumes with a large number of files to compare.
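
To make the proposal concrete, here is a minimal Python sketch of the idea. It is not czkawka's code (which I haven't read). The names `hash_file`, `group_by`, `find_duplicates` and `PREFIX_BYTES` are made up for illustration, and BLAKE2b from the standard library is just a stand-in hash.

```python
import hashlib
import os
from collections import defaultdict

# Hypothetical first-pass block size; 64 kB is one of the values suggested above.
PREFIX_BYTES = 64 * 1024


def hash_file(path, limit=None, chunk_size=1024 * 1024):
    """Hash the first `limit` bytes of a file, or the whole file if limit is None."""
    digest = hashlib.blake2b()
    remaining = limit
    with open(path, "rb") as handle:
        while remaining is None or remaining > 0:
            want = chunk_size if remaining is None else min(chunk_size, remaining)
            block = handle.read(want)
            if not block:
                break
            digest.update(block)
            if remaining is not None:
                remaining -= len(block)
    return digest.hexdigest()


def group_by(paths, key):
    """Group paths by key(path) and keep only groups with at least two members."""
    groups = defaultdict(list)
    for path in paths:
        groups[key(path)].append(path)
    return [group for group in groups.values() if len(group) > 1]


def find_duplicates(paths, prefix_bytes=PREFIX_BYTES):
    """Return lists of paths whose contents hash identically."""
    duplicates = []
    for same_size in group_by(paths, os.path.getsize):
        # Pass 1: hash only the leading block of each same-size candidate.
        for same_prefix in group_by(same_size, lambda p: hash_file(p, prefix_bytes)):
            # Files no larger than the block were already hashed in full.
            if os.path.getsize(same_prefix[0]) <= prefix_bytes:
                duplicates.append(sorted(same_prefix))
                continue
            # Pass 2: hash the full file, only for candidates that survived pass 1.
            for same_content in group_by(same_prefix, hash_file):
                duplicates.append(sorted(same_content))
    return duplicates
```

For simplicity, the second pass here re-reads each file from the start rather than continuing from where the first pass stopped. A real implementation could keep the hasher state and resume after the first block to avoid reading it twice.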

quizac- added the enhancement New feature or request label Dec 27, 2024
