[sotw][linear] Fix missing watch cleanup in linear cache for sotw watches subscribing to multiple resources #4

@valerian-roche valerian-roche commented Jan 10, 2024

Datadog-branch-specific PR for envoyproxy#854

When using the linear cache in sotw mode, there are multiple issues with non-wildcard requests.
This PR addresses the two with the most potential impact:

  • The request provided to callbacks and servers is a fake one that does not include the actual data. This is confusing and can cause issues (e.g. the node section is nil, while a user might expect it to be set, as it is when using the simple cache).
  • If the request is not wildcard, the cache stores the watch under each requested resource but only clears it from the updated resources, leaving stale channels that may already be full. If two updates arrive before the server consumes the channel or closes it fully, the cache blocks on the channel while holding the mutex, and the server calling CreateWatch deadlocks as well. As this mutex is shared by all nodes, the entire cache is then deadlocked. The current sotw server implementation makes this unlikely unless multiple updates are sent to the cache (e.g. iterating on UpdateResource instead of using UpdateResources). Any server implementation that considers a watch properly cleaned up once triggered (which is the initial design of the cache interface) would leak the channel and potentially trigger a deadlock later on.
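
As a rough illustration of the two fixes, here is a minimal single-goroutine sketch. It is not the go-control-plane code: the `request`, `response`, `watch` and `cache` types and their methods are hypothetical simplifications, and locking is omitted. It assumes each watch owns a channel with a buffer of one, carries the real request into the response, and is removed from every resource it subscribed to as soon as it is triggered.

```go
package main

import "fmt"

// request is a hypothetical stand-in for a sotw discovery request.
type request struct {
	Node          string
	ResourceNames []string
}

// response carries the originating request so callbacks can inspect it.
type response struct {
	Request   *request
	Resources map[string]string
}

// watch ties a response channel (buffer of one) to the request that opened it.
type watch struct {
	req *request
	ch  chan response
}

// cache is a simplified model of a linear cache; the real cache guards
// this state with a mutex, which is left out here.
type cache struct {
	resources map[string]string
	watches   map[string]map[*watch]struct{} // resource name -> watches
}

func newCache() *cache {
	return &cache{
		resources: map[string]string{},
		watches:   map[string]map[*watch]struct{}{},
	}
}

// createWatch registers the watch under every requested resource name.
func (c *cache) createWatch(req *request) *watch {
	w := &watch{req: req, ch: make(chan response, 1)}
	for _, name := range req.ResourceNames {
		if c.watches[name] == nil {
			c.watches[name] = map[*watch]struct{}{}
		}
		c.watches[name][w] = struct{}{}
	}
	return w
}

// removeWatch drops w from every resource it subscribed to,
// not only from the resource that was just updated.
func (c *cache) removeWatch(w *watch) {
	for _, name := range w.req.ResourceNames {
		delete(c.watches[name], w)
		if len(c.watches[name]) == 0 {
			delete(c.watches, name)
		}
	}
}

// updateResource stores the value, then triggers and fully cleans up
// every watch registered for that resource.
func (c *cache) updateResource(name, value string) {
	c.resources[name] = value
	for w := range c.watches[name] {
		res := map[string]string{}
		for _, n := range w.req.ResourceNames {
			if v, ok := c.resources[n]; ok {
				res[n] = v
			}
		}
		w.ch <- response{Request: w.req, Resources: res} // real request, not a fake
		c.removeWatch(w)
	}
}

func main() {
	c := newCache()
	w := c.createWatch(&request{Node: "node-1", ResourceNames: []string{"a", "b"}})
	c.updateResource("a", "v1")
	c.updateResource("b", "v1") // would block on the full channel if the watch were still registered under "b"
	resp := <-w.ch
	fmt.Println(resp.Request.Node, len(resp.Resources), len(c.watches))
}
```

In this sketch, the second `updateResource` call is the case described above: if `removeWatch` only cleared the entry for `"a"`, the stale entry under `"b"` would attempt a second send on an already full channel and block.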

Other issues are not yet tackled in this PR, as some of them depend on changes to the cache interface proposed in other PRs:

  • properly triggering watches when a request adds a new resource but the version is unchanged
  • support for explicit wildcard request

Other issues to be addressed in other PRs:

  • resource ordering when using Mux Cache in ADS mode
  • full updates or partial updates in sotw depending on resource type (lds/cds with full update, partial for rds/eds) and not on cache/protocol type (current implementation)

…ches subscribing to multiple resources

Properly return the request in sotw responses to allow proper handling in callbacks

Signed-off-by: Valerian Roche <[email protected]>
@valerian-roche valerian-roche merged commit 4709834 into DataDog:dd/sotw-fixes Jan 16, 2024
3 checks passed