Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Does not consider daemonset resources for node pool if the daemonset matches the nodepool using a node Affininty #1337

Open
myaser opened this issue Jun 20, 2024 · 3 comments · May be fixed by #1727
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@myaser
Copy link

myaser commented Jun 20, 2024

Description

Observed Behavior:
I craeted a daemonset that has the following nodeAffinity

      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: zalando.org/nvidia-gpu
                operator: Exists
            - matchExpressions:
              - key: karpenter.k8s.aws/instance-gpu-manufacturer
                operator: In
                values:
                - nvidia

observing karpenter logs and experimenting with scheduling pods of different sizes, I could find that karpenter's calculations for daemonset resources excludes this pod

I could confirm this by checking the code.

here it is reading only the first of the affinities relying on an outer loop to remove the first affinity and continue with the next one

but, this is not happening for daemonset calculation as shown here

to validate my findings, I flipped the affinities order and the calculations were corrected

      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: karpenter.k8s.aws/instance-gpu-manufacturer
                operator: In
                values:
                - nvidia
            - matchExpressions:
              - key: zalando.org/nvidia-gpu
                operator: Exists

Expected Behavior:
all affinities should be considered for calculating daemonset resources

Reproduction Steps (Please include YAML):

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: datalab-workloads
spec:
  disruption:
    budgets:
    - nodes: 10%
    consolidationPolicy: WhenUnderutilized
    expireAfter: Never
  template:
    metadata:
    spec:
      kubelet:
        clusterDNS:
        - 10.0.1.100
        cpuCFSQuota: false
        kubeReserved:
          cpu: 100m
          memory: 282Mi
        maxPods: 32
        systemReserved:
          cpu: 100m
          memory: 164Mi
      nodeClassRef:
        name: datalab-workloads
      requirements:
      - key: node.kubernetes.io/instance-type
        operator: In
        values:
        - g4dn.xlarge
        - g4dn.4xlarge
        - g4dn.12xlarge
        - g4dn.16xlarge
        - g4dn.metal
        - g5.xlarge
        - g5.4xlarge
        - g5.16xlarge
        - g5.24xlarge
        - g5.48xlarge
        - g3s.xlarge
        - g4dn.2xlarge
      - key: karpenter.sh/capacity-type
        operator: In
        values:
        - spot
        - on-demand
      - key: kubernetes.io/arch
        operator: In
        values:
        - arm64
        - amd64
      - key: topology.kubernetes.io/zone
        operator: In
        values:
        - eu-central-1a
        - eu-central-1b
        - eu-central-1c
      startupTaints:
      - effect: NoSchedule
        key: zalando.org/node-not-ready
      taints:
      - effect: NoSchedule
        key: dedicated
        value: datalab-workloads

Versions:

  • karpenter Version: 0.36.2
  • Kubernetes Version (kubectl version): v1.30.2
  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@myaser myaser added the kind/bug Categorizes issue or PR as related to a bug. label Jun 20, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jun 20, 2024
@jigisha620
Copy link
Contributor

Hi @myaser, I was able to track this down. It definitely is a bug since we are not considering all affinities for calculating daemonset resources. Do you have a workaround for your use-case that can get you unblocked on this?

@myaser
Copy link
Author

myaser commented Jun 26, 2024

Hi @myaser, I was able to track this down. It definitely is a bug since we are not considering all affinities for calculating daemonset resources. Do you have a workaround for your use-case that can get you unblocked on this?

for my specific case, I just flipped the order of the affinities. since the other affinity is meant for cluster autoscaler.

since we are in a middle of migration we run both CA (legacy) and Karpenter together.

@jonathan-innis
Copy link
Member

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
4 participants