[Linear] Ensure a resource is only serialized/hashed at most once to reply it #21

Merged: 4 commits, Dec 23, 2024

Conversation

valerian-roche
Collaborator

@valerian-roche valerian-roche commented Dec 18, 2024

Currently the go-control-plane caches (both linear and snapshot) serialize a resource as many times as there are clients receiving it.
This is an issue for control planes watched by a lot of clients, especially with large resources (e.g. endpoints).

This PR ensures that the serialization occurs at most once per resource, in all cases (sotw/delta watches and linear/snapshot cache). A resource will still only be serialized if:

  • it is returned to at least one client.
  • its version had to be considered to be returned (i.e. the resource was added again with the same stable version).

@valerian-roche valerian-roche changed the title Vr/serialize once lock [Linear] Ensure a resource is only serialized/hashed at most once to reply it Dec 18, 2024

@rob05c rob05c left a comment

LGTM

cacheVersion: cacheVersion,
}
}

// getMarshaledResource lazily marshals the resource and returns the bytes.

@zhiyanfoo zhiyanfoo Dec 20, 2024

When you say lazily, you mean it returns the marshaled resource if it already exists, and otherwise marshals it, right?

When I think of lazy returns, I think of Haskell-style lazy returns.

Collaborator Author

I'm not familiar with Haskell, but here it is lazily computed within a getter.
In this case it painfully has to return an error, which can be triggered if the user passes in a type unknown to the protoregistry.
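
As an illustration of what "lazily computed within a getter" means here, the following is a minimal sketch of a mutex-guarded lazy getter. It is not the PR's code: `lazyResource`, `getMarshaled`, and the `marshals` counter are hypothetical names, and `json.Marshal` stands in for proto marshaling only because it can also fail on unsupported input.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// lazyResource is a hypothetical stand-in for the cache's cachedResource.
type lazyResource struct {
	resource any // stand-in for the proto message held by the cache

	mu        sync.Mutex // guards marshaled and marshals
	marshaled []byte     // nil until the first successful marshal
	marshals  int        // counts marshal attempts, for illustration only
}

// getMarshaled marshals on first use and returns the cached bytes afterwards.
// json.Marshal is an assumption standing in for proto marshaling; because it
// can fail, the getter has to return an error as well.
func (r *lazyResource) getMarshaled() ([]byte, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.marshaled != nil {
		return r.marshaled, nil
	}
	r.marshals++
	b, err := json.Marshal(r.resource)
	if err != nil {
		return nil, err
	}
	r.marshaled = b
	return b, nil
}

func main() {
	ok := &lazyResource{resource: map[string]int{"endpoints": 3}}
	for i := 0; i < 3; i++ {
		if _, err := ok.getMarshaled(); err != nil {
			fmt.Println("unexpected error:", err)
		}
	}
	fmt.Println("marshal attempts:", ok.marshals) // prints 1

	bad := &lazyResource{resource: make(chan int)} // JSON cannot encode channels
	_, err := bad.getMarshaled()
	fmt.Println("error returned:", err != nil) // prints true
}
```

Note that in this sketch a failed marshal is not cached, so a failing resource is retried on every call.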

@atollena atollena left a comment

lgtm, just a suggestion to make things simpler.

Comment on lines 45 to 50
// marshaledResource contains the marshaled version of the resource.
// It is lazy initialized and should be accessed through getMarshaledResource
marshaledResource []byte

// mu is the mutex used to lazy compute the marshaled resource and stable version.
mu sync.Mutex

@atollena atollena Dec 23, 2024

This could be simplified quite a bit by using sync.Once with a custom struct that ensures we compute the stable version and the serialized proto once:

[...]
	// marshalResource computes the resource bytes and version...
	marshalResource func() (ser, error)
}

In the initialization:

[...]
		marshalResource: sync.OnceValues(func() (s ser, err error) {
			s.res, err = MarshalResource(res)
			h := sha256.New()
			h.Write(s.res)
			s.v = hex.EncodeToString(h.Sum(nil))
			return s, err
		}),

And then you can freely call marshalResource anywhere without having to worry about locking.

Collaborator Author

Great suggestion, updated. In the future I'd like to support user provided too, so it might become a bit more complex, but it should still be clearer than the mutex-based approach.

Collaborator Author

valerian-roche commented Dec 23, 2024

Merge activity

  • Dec 23, 1:45 PM EST: A user started a stack merge that includes this pull request via Graphite.
  • Dec 23, 1:45 PM EST: Graphite couldn't merge this PR because it failed for an unknown reason (Stack merges are not currently supported for forked repositories. Please create a branch in the target repository in order to merge).

return c.stableVersion, nil
// getMarshaledResource lazily marshals the resource and returns the bytes.
func (c *cachedResource) getMarshaledResource() ([]byte, error) {
return sync.OnceValues(func() ([]byte, error) {
Member

I'm not sure I understand how this effectively synchronizes the calls here. If you build the function with sync.OnceValues inside getMarshaledResource and call it right away, every call gets its own fresh function, so different calls to getMarshaledResource are not synchronized. AFAICT, the call to sync.OnceValues is either unnecessary or not doing its job.

Collaborator Author

Fixed. I had missed the implementation part. I did a quick test to confirm it's now serialized only once.
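
The pitfall discussed in this thread can be reproduced in a few lines. This is a sketch under assumptions, not the code from the PR: `resource`, `getBroken`, `getFixed`, and the `computations` counter are hypothetical names. It contrasts wrapping the computation in `sync.OnceValues` on every call with building the wrapped function once and storing it.

```go
package main

import (
	"fmt"
	"sync"
)

type resource struct {
	computations int                    // how many times the expensive step ran
	once         func() ([]byte, error) // built a single time in newResource
}

// expensive stands in for marshaling; it only counts its invocations.
func (r *resource) expensive() ([]byte, error) {
	r.computations++
	return []byte("payload"), nil
}

func newResource() *resource {
	r := &resource{}
	r.once = sync.OnceValues(r.expensive)
	return r
}

// getBroken creates a new OnceValues wrapper on every call, so no state is
// shared between calls and the computation reruns each time.
func (r *resource) getBroken() ([]byte, error) {
	return sync.OnceValues(r.expensive)()
}

// getFixed reuses the wrapper stored at construction time, so the
// computation runs at most once for this resource.
func (r *resource) getFixed() ([]byte, error) {
	return r.once()
}

func main() {
	a := newResource()
	for i := 0; i < 3; i++ {
		a.getBroken()
	}
	fmt.Println("broken computations:", a.computations) // prints 3

	b := newResource()
	for i := 0; i < 3; i++ {
		b.getFixed()
	}
	fmt.Println("fixed computations:", b.computations) // prints 1
}
```

The calls here are sequential on purpose, so the plain `int` counter is safe. With concurrent callers the broken variant would additionally race on whatever the computation touches.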

@valerian-roche valerian-roche merged commit 966204d into DataDog:dd/main Dec 23, 2024
2 checks passed