Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fuzzing bug fixes #3989

Open
wants to merge 759 commits into
base: main
Choose a base branch
from
Open

Conversation

shankarseal
Copy link
Collaborator

@shankarseal shankarseal commented Nov 8, 2024

Description

This PR fixes #4010. These are a collection of bugs found by fuzz testing.
This PR also fixes #3992, #3998 and #4001.

Testing

New fuzz tests were added.

Documentation

No.

Installation

No.

gtrevi and others added 30 commits January 23, 2024 15:00
Fix CICD build failures.
This reverts commit e7311fe.
…o-Update requirements.

This PR changes the following with the eBPF VM Extension:

- Adds workarounds for handling the new Auto-Update requirements with the existing VM Platform context handling limitations:
  - The *Update command* now handles atomically the entire updating process.
  - Before running, the *Update command* backs-up the current installation (can be the initial WinPA provisioning, or from a previous VM Extension).
  - Adds handling of a local "Updating" persistent state (as a registry key), in order to compensate the lack of a global state handled by the VM Extension platform.
  - The *Update operation* will fail the overall update if restarting the GuestProxyAgent service fails and, upon any failure, it will attempt to rollback to the previous installation.
    >Background: although restarting the GuestProxyAgent service is an extended operation that is not part of the eBPF VM Extension's scope, we should account success/failure of this operation in the overall update status, so that the VM Agent will stop rolling out updates to Azure VMs.
  - Allows the update to downgrade the current installation only if in the context of a rollback. Any other downgrade attempt will fail the *Update operation*.
- Adds more error return codes, defined in global constants for improved logging and troubleshooting.

Related work items: #158377
libs/execution_context/ebpf_core.c Outdated Show resolved Hide resolved
libs/execution_context/ebpf_program.c Outdated Show resolved Hide resolved
libs/runtime/ebpf_bitmap.c Outdated Show resolved Hide resolved
libs/runtime/ebpf_bitmap.h Show resolved Hide resolved
scripts/config_test_vm.psm1 Outdated Show resolved Hide resolved
tests/libfuzzer/include/fuzz_helper_function.hpp Outdated Show resolved Hide resolved
tests/libfuzzer/include/fuzz_helper_function.hpp Outdated Show resolved Hide resolved
tests/libfuzzer/include/fuzz_helper_function.hpp Outdated Show resolved Hide resolved
tests/libfuzzer/include/fuzz_helper_function.hpp Outdated Show resolved Hide resolved
tests/sample/sample.vcxproj.filters Outdated Show resolved Hide resolved
@Alan-Jowett
Copy link
Member

In my view "These are a collection of bug fixes found by fuzz testing." is not a sufficient PR description. I don't know how to review this since I don't know what's being fixed so I can't tell whether it fixes them right. (And I suspect not since the tests appear to be failing.)

I second this. As much as it is going to be painful to do, we should file each internally found bug in github, and fix them in the open.

@dthaler
Copy link
Collaborator

dthaler commented Nov 14, 2024

Issue #4001 is fixed by PR #4002 so I'd recommend that be merged first and this be rebased, or else remove it from this PR.

@dthaler
Copy link
Collaborator

dthaler commented Nov 14, 2024

Issue #3992 is fixed by PR #3991 so I'd recommend that be merged first and this be rebased, or else remove it from this PR.

@dthaler dthaler added the security Related to security hardening label Nov 14, 2024
@dthaler
Copy link
Collaborator

dthaler commented Nov 14, 2024

Issue #3998 is just a perf optimization according to that issue, and is apparently non-trivial, so should be separated from security bug fixes. The principle is that a fix for #4010 would be a backport candidate (e.g., to v.0.19.1) but a PR for #3998 would not be since it doesn't fix any security issue that affects production.

@shankarseal
Copy link
Collaborator Author

shankarseal commented Nov 15, 2024

Issue #3998 is just a perf optimization according to that issue, and is apparently non-trivial, so should be separated from security bug fixes. The principle is that a fix for #4010 would be a backport candidate (e.g., to v.0.19.1) but a PR for #3998 would not be since it doesn't fix any security issue that affects production.

0.19.1 already has these changes that are being merged to main. Without fix to 3998 first, many of the issues would not be even found, that this PR is fixing as otherwise the fuzz tests took too long to run and often timed out without finding any issues.

Hopefully, this is the first and last time where we have to merge so many fixes together. Please resolve this comment so that we can get this important payload merged.

@shankarseal
Copy link
Collaborator Author

In my view "These are a collection of bug fixes found by fuzz testing." is not a sufficient PR description. I don't know how to review this since I don't know what's being fixed so I can't tell whether it fixes them right. (And I suspect not since the tests appear to be failing.)

I second this. As much as it is going to be painful to do, we should file each internally found bug in github, and fix them in the open.

Due to special circumstances we had to take these set of changes this way, after discussing with Microsoft security experts. Hopefully we will only rarely get security bugs, which we will fix according to the MSRC process.

@shankarseal
Copy link
Collaborator Author

Issue #4001 is fixed by PR #4002 so I'd recommend that be merged first and this be rebased, or else remove it from this PR.

#4002 is changed to draft.

@shankarseal
Copy link
Collaborator Author

Issue #3992 is fixed by PR #3991 so I'd recommend that be merged first and this be rebased, or else remove it from this PR.

#3991 is changed to draft.

@shankarseal
Copy link
Collaborator Author

In my view "These are a collection of bug fixes found by fuzz testing." is not a sufficient PR description. I don't know how to review this since I don't know what's being fixed so I can't tell whether it fixes them right. (And I suspect not since the tests appear to be failing.)

PR description is updated with more context and a new issue #4010 is created that lists all the defects found by fuzzing

@shankarseal shankarseal added this pull request to the merge queue Nov 16, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 16, 2024
@Alan-Jowett Alan-Jowett added this pull request to the merge queue Nov 16, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 16, 2024
@shankarseal shankarseal added this pull request to the merge queue Nov 16, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
security Related to security hardening
Projects
None yet
10 participants