OSS-Fuzz: OSS-Fuzz fuzzing integration #385

arthurscchan · 2024-10-23T19:45:54Z

Hi! Would you be interested in setting up fuzzing for the tar-rs module via OSS-Fuzz?

Fuzzing is essentially a stress-testing approach used to find bugs in software, and OSS-Fuzz is a free service run by Google that continuously fuzzes important open-source projects. Integrating your module with OSS-Fuzz could help uncover memory corruption issues that may exist.

This PR adds a Cargo fuzz configuration along with 3 fuzzers for the tar-rs module. In combination with an initial attempt in OSS-Fuzz (google/oss-fuzz#12645), it enables OSS-Fuzz to fuzz the tar-rs module while keeping the fuzzers upstream for further modification and expansion. If you're happy to proceed with the integration and store the fuzzers upstream, please let me know, and I'd be glad to provide more details if needed.

The only thing required at this point is an email associated with a Google account, which will be used to receive notifications when bugs are found.

Signed-off-by: Arthur Chan <[email protected]>

cgwalters

This is great, thanks! I've only started reviewing this, but I love the idea.

cgwalters · 2024-10-24T17:42:23Z

> The only thing required at this point is an email associated with a Google account, which will be used to receive notifications when bugs are found.

My gmail is [email protected], the other most active maintainers are @xzfc and @alexcrichton so let's see if they have and want to provide gmail accounts too.

(But that said...why can't oss-fuzz learn to e.g. file security issues on GH?)

cgwalters

Marking changes requested to track review state above

alexcrichton · 2024-10-24T18:55:23Z

Personally I'm all for more fuzzing. I'm cc'd on the wasmtime project on oss-fuzz and we use it to great benefit there. In that sense 👍 from me.

On the specifics of the fuzzers here, one thing I might recommend is to use the arbitrary crate more heavily to make parsing the input data easier. I'd also be a bit worried about the coverage of these fuzzers. For example, the tar.rs fuzzer looks more like a unit test than a fuzzer since the data is just written to the filesystem. builder.rs also looks more like a unit test since the set of things that can happen in the fuzzer is so small (more or less only looking at the first byte). The archive.rs fuzzer looks like it's got some interesting bits going on, but I find it a bit hard to read and follow, and I think that arbitrary would help clean it up.

Overall though I think the idea of fuzzing is good, but I think it might be best to shore up what's being fuzzed here. For example this might be a good candidate for differential fuzzing against a different tar library or similar (e.g. linking to libtar in C if that's reasonably easy to do). Either that or perhaps fuzzing things like assigning arbitrary filenames and ensuring that no files are created outside of the folder during extraction (stuff like that). I'll note though that in general fuzzers are most effective when they do no I/O since that enables them to go much faster.
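The "no files created outside of the folder" property mentioned above could be expressed as a check that a fuzzer asserts for every entry path. The following is only a minimal sketch using the standard library: `resolve_inside` is a hypothetical helper, not part of tar-rs, and the check is purely lexical, so it would not catch escapes through symlinks.

```rust
use std::path::{Component, Path, PathBuf};

/// Lexically resolve `entry` against `root` and return the resulting path,
/// or `None` if the entry is absolute or climbs above `root` via `..`.
/// Hypothetical helper for illustration; it never touches the filesystem.
fn resolve_inside(root: &Path, entry: &Path) -> Option<PathBuf> {
    // Number of components currently pushed below `root`.
    let mut depth = 0usize;
    let mut out = root.to_path_buf();
    for component in entry.components() {
        match component {
            Component::Normal(part) => {
                out.push(part);
                depth += 1;
            }
            Component::CurDir => {}
            Component::ParentDir => {
                // A `..` with nothing left to pop would leave `root`.
                if depth == 0 {
                    return None;
                }
                out.pop();
                depth -= 1;
            }
            // Absolute paths (and Windows prefixes) are rejected outright.
            Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    Some(out)
}
```

A fuzz target could assert that every path it is about to extract resolves to `Some(_)`. For example, `resolve_inside(Path::new("sandbox"), Path::new("../etc/passwd"))` returns `None`, while `a/../b.txt` resolves to `sandbox/b.txt`.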

arthurscchan · 2024-10-28T16:49:00Z

@cgwalters I have updated the fuzzers to include more randomness and cleaned them up with the arbitrary crate. I have also removed builder.rs, which duplicated targets from the other two fuzzers, to make things clearer.

arthurscchan · 2024-10-29T17:35:28Z

@alexcrichton Thank you for your suggestions. I have updated the fuzzers accordingly. I have removed builder.rs, as it duplicated much of the logic in the other two fuzzers. I have also added more randomness to the file structure and contents using the arbitrary crate for both tar.rs and archive.rs. The aim of these fuzzers is to exercise the tar archiving and extraction code. I believe the fuzzers are much improved now, both in randomness and in the functions they target. Please do share any further comments on improving the fuzzing targets or approaches. Thanks again for your suggestions and feedback.

cgwalters · 2024-11-01T21:16:03Z

Thanks for your work on this! Your filtering of the pathnames looks sane, but I still worry about having code that reads or writes to the filesystem as a result of the fuzzer. In CI, it doesn't really matter - I'm sure that the oss-fuzz runners are sandboxed heavily.

But someone trying to reproduce a fuzzing failure locally is probably not sandboxed by default, and it'd be pretty unfortunate if we somehow ended up writing into their $HOME, for example.

The safest fix is really to avoid all filesystem APIs. I should have recommended this earlier but we could also use https://crates.io/crates/cap-std which itself acts as a sandbox.

Alternatively - and this may be the easiest - we could have the fuzz targets bomb out like this:

if std::env::var_os("TAR_FUZZ_ACKNOWLEDGE_SANDBOX_RECOMMENDED").is_none() {
    anyhow::bail!("These fuzzing targets may read/write to the filesystem; please run in a sandbox and set TAR_FUZZ_ACKNOWLEDGE_SANDBOX_RECOMMENDED");
}

or so?

NobodyXu · 2024-11-02T03:18:10Z

> The safest fix is really to avoid all filesystem APIs. I should have recommended this earlier but we could also use https://crates.io/crates/cap-std which itself acts as a sandbox.

Just my 2c, but it seems that having a sans-io interface would help fuzzing, as it ensures no I/O?
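To illustrate what a sans-io entry point might look like, here is a minimal sketch that pulls the name and size out of a single 512-byte header block held in memory. The offsets follow the ustar layout (name in bytes 0..100, size as octal ASCII in bytes 124..136). `HeaderSummary`, `parse_octal`, and `parse_header` are hypothetical names, not tar-rs APIs, and extensions such as GNU base-256 sizes are ignored.

```rust
/// Size of a tar header block in bytes.
const BLOCK_SIZE: usize = 512;

/// The two header fields this sketch extracts (hypothetical type).
#[derive(Debug, PartialEq)]
struct HeaderSummary {
    name: Vec<u8>,
    size: u64,
}

/// Parse an octal ASCII field padded with NULs and/or spaces.
/// Returns `None` for an empty field, a non-octal digit, or overflow.
fn parse_octal(field: &[u8]) -> Option<u64> {
    let digits: Vec<u8> = field
        .iter()
        .copied()
        .skip_while(|b| *b == b' ')
        .take_while(|b| *b != 0 && *b != b' ')
        .collect();
    if digits.is_empty() {
        return None;
    }
    let mut value: u64 = 0;
    for d in digits {
        if !(b'0'..=b'7').contains(&d) {
            return None;
        }
        value = value.checked_mul(8)?.checked_add(u64::from(d - b'0'))?;
    }
    Some(value)
}

/// Read the name and size from one header block without performing any I/O.
fn parse_header(block: &[u8]) -> Option<HeaderSummary> {
    if block.len() < BLOCK_SIZE {
        return None;
    }
    let name_field = &block[0..100];
    // The name is NUL-terminated unless it fills the whole field.
    let name_len = name_field.iter().position(|b| *b == 0).unwrap_or(100);
    let size = parse_octal(&block[124..136])?;
    Some(HeaderSummary {
        name: name_field[..name_len].to_vec(),
        size,
    })
}
```

Because `parse_header` only ever sees a byte slice, a fuzz target could call it directly on the fuzzer's input with no filesystem access at all.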

alexcrichton

Fuzzers look good to me, thanks! I might second the cap-std suggestion for perhaps creating the directory to create an archive from. That can help remove the tests for safe paths, I think, since it should automatically deny access outside the directory (if I understand it right).

While skipping I/O entirely would be best I do realize that tar is intimately tied with the filesystem most of the time so it might just not be possible to skip it for these fuzzers.

arthurscchan · 2024-11-04T11:51:39Z

@alexcrichton @cgwalters I have revamped the logic, using the derive_arbitrary crate and changing the size check to use int_in_range. I have also implemented the sandbox_dir with cap_std and removed the unnecessary path checking and sanitisation (since cap_std won't allow going outside the sandbox_dir). I agree that it is a better way to handle the I/O. I also second the idea that I/O is more or less unavoidable when fuzzing this project, since tar is coupled with I/O in some sense, and that I/O is itself one of the worthwhile fuzzing targets in the project.

cgwalters

Thanks so much for your work on this! I think we can get this in and see if fuzzing at scale turns up anything.

Am I correct in that basically what this fuzzing coverage should find is an unexpected panic or so? We're swallowing/ignoring most Err.

I think it'd be quite interesting to have a followup for Alex's suggestion of differential fuzzing, e.g. validating that a tar we create can "round trip", especially through other tar implementations like GNU tar, libarchive, Go's archive/tar, etc. That would likely turn up a lot of edge cases in interoperability.
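The general shape of such a differential check can be sketched in a few lines. This is an illustration only: a real follow-up would compare tar-rs against another tar implementation, whereas here two small octal parsers stand in for "ours" and "the reference". `differential`, `parse_octal_manual`, and `parse_octal_std` are hypothetical names.

```rust
use std::fmt::Debug;

/// Run two implementations on the same input and report any disagreement.
fn differential<O: PartialEq + Debug>(
    input: &[u8],
    ours: impl Fn(&[u8]) -> O,
    reference: impl Fn(&[u8]) -> O,
) -> Result<(), String> {
    let (a, b) = (ours(input), reference(input));
    if a == b {
        Ok(())
    } else {
        Err(format!(
            "mismatch on {:?}: ours={:?}, reference={:?}",
            input, a, b
        ))
    }
}

/// Stand-in implementation: accepts only the digits 0-7.
fn parse_octal_manual(field: &[u8]) -> Option<u64> {
    if field.is_empty() {
        return None;
    }
    let mut value: u64 = 0;
    for &d in field {
        if !(b'0'..=b'7').contains(&d) {
            return None;
        }
        value = value.checked_mul(8)?.checked_add(u64::from(d - b'0'))?;
    }
    Some(value)
}

/// Stand-in reference: delegates to the standard library, which also
/// accepts an optional leading `+` sign.
fn parse_octal_std(field: &[u8]) -> Option<u64> {
    let text = std::str::from_utf8(field).ok()?;
    u64::from_str_radix(text, 8).ok()
}
```

In a fuzz target the `Err` case would be turned into a panic so the fuzzer records the input. Even this toy pair disagrees on `b"+17"`, which is the kind of interoperability edge case a differential setup is meant to surface.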

arthurscchan · 2024-11-04T15:46:22Z

@cgwalters Yes, we are aiming to find unexpected panics, or even fatal signals such as segmentation faults. Thanks for merging the fuzzers in. I will go ahead and add your email to the OSS-Fuzz integration and it should be ready to go. I'm not sure if @alexcrichton also wants to be added as an OSS-Fuzz contact for this project. If so, please give me an email address linked to a Google account and I am happy to add it for you. Thanks.

This PR initialises OSS-Fuzz integration for the tar-rs project in Rust. New fuzzers have been created, and a PR (alexcrichton/tar-rs#385) has been submitted upstream to merge the fuzzers. Signed-off-by: Arthur Chan <[email protected]>

OSS-Fuzz: OSS-Fuzz fuzzing integration

ff2b133

Signed-off-by: Arthur Chan <[email protected]>

arthurscchan mentioned this pull request Oct 23, 2024

tar-rs: Initial integation google/oss-fuzz#12645

Merged

cgwalters reviewed Oct 24, 2024

cgwalters requested changes Oct 24, 2024

Update archive.rs

6249324

Signed-off-by: Arthur Chan <[email protected]>

Update fuzzers and Cargo.toml

d620fec

Signed-off-by: Arthur Chan <[email protected]>

arthurscchan requested a review from cgwalters October 29, 2024 13:47

cgwalters reviewed Oct 29, 2024

Update fuzzer

cbc6f86

Signed-off-by: Arthur Chan <[email protected]>

alexcrichton reviewed Nov 2, 2024

alexcrichton reviewed Nov 2, 2024

Fix fuzzers and Cargo.toml

3d2d0d0

Signed-off-by: Arthur Chan <[email protected]>

cgwalters approved these changes Nov 4, 2024

cgwalters merged commit 9189d3c into alexcrichton:main Nov 4, 2024
7 checks passed

arthurscchan deleted the oss-fuzz branch November 4, 2024 15:46
