[occm] Add Openstack server hostId as k8s node label #2579

chess-knight · 2024-04-22T14:19:52Z

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:
This issue follows a discussion in kubernetes-sigs/cluster-api-provider-openstack#1605, as a request to automatically label nodes with underlying hostId information, so e.g. workload can be scheduled on different physical hosts. It can be used as a topology differentiator when all other topology labels are the same.

Anything else we need to know?:
Issue kubernetes/cloud-provider#67 is closed and potentially resolved by kubernetes/kubernetes#123223 so now based on comment kubernetes/cloud-provider#67 (comment), AdditionalLabels with hostId information can be added to InstanceMetadata:

cloud-provider-openstack/pkg/openstack/instancesv2.go

Lines 136 to 142 in dab0f06

    return &cloudprovider.InstanceMetadata{
        ProviderID:    i.makeInstanceID(&server.Server),
        InstanceType:  instanceType,
        NodeAddresses: addresses,
        Zone:          server.AvailabilityZone,
        Region:        i.region,
    }, nil

This information should be available in the server struct AFAIK, so there should not be too much work I think now.
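A minimal sketch of that idea, for illustration only. It uses local stand-in types rather than the real `cloudprovider.InstanceMetadata` and server structs, so it runs on its own. The label key, the `Server` fields, the `buildMetadata` helper, and the provider ID format are assumptions, not the actual OCCM code.

```go
package main

import "fmt"

// InstanceMetadata is a local stand-in for cloudprovider.InstanceMetadata,
// reduced to the fields relevant here.
type InstanceMetadata struct {
	ProviderID       string
	InstanceType     string
	Zone             string
	Region           string
	AdditionalLabels map[string]string
}

// Server is a hypothetical, trimmed-down view of the Nova server record.
type Server struct {
	ID               string
	HostID           string
	AvailabilityZone string
}

// hostIDLabel is an assumed label key, used only for this illustration.
const hostIDLabel = "topology.openstack.org/host-id"

// buildMetadata copies the server's hostId into AdditionalLabels when it is
// present and omits the label otherwise.
func buildMetadata(s Server, region, instanceType string) *InstanceMetadata {
	md := &InstanceMetadata{
		ProviderID:   "openstack:///" + s.ID, // illustrative provider ID format
		InstanceType: instanceType,
		Zone:         s.AvailabilityZone,
		Region:       region,
	}
	if s.HostID != "" {
		md.AdditionalLabels = map[string]string{hostIDLabel: s.HostID}
	}
	return md
}

func main() {
	srv := Server{ID: "abc", HostID: "f00d", AvailabilityZone: "az1"}
	md := buildMetadata(srv, "RegionOne", "m1.small")
	fmt.Println(md.AdditionalLabels[hostIDLabel])
}
```

Leaving the label out when hostId is empty matches the "add it if it exists" behaviour discussed in this thread.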

One potential issue that I see is live migration, e.g. see #1801, where occm will have to update the label because hostId will change.

Environment:

openstack-cloud-controller-manager(or other related binary) version:
OpenStack version:
Others:

dulek · 2024-04-26T11:26:18Z

This seems like a valid request from my point of view. @mdbooth, what do you think?

@chess-knight: Are you planning to contribute implementation?

dulek · 2024-04-26T11:27:09Z

@gryf, @stephenfin: This sounds like a low hanging fruit you can grab to get up to speed with CPO code.

zetaab · 2024-05-25T06:41:23Z

is hostid really available to normal user in openstack? I do not have access now to openstack, but if I remember correctly normal user cannot see that?

chess-knight · 2024-05-27T06:05:42Z

According to our research in SovereignCloudStack/issues#540, hostId should be available for all users. You can see e.g. in the nova code, that it is supported from API version 2.62 https://opendev.org/openstack/nova/commit/c2f7d6585818c04e626aa4b6c292e5c2660cb8b3. hostId is different from host. The host is available only to admins but hostId is a hash of (project_id + host) so it can be available to all.

zetaab · 2024-05-27T06:26:50Z

actually I can see both hostId and host_id which has same value. Perhaps that label could be added if it exists. This can be added in https://github.com/kubernetes/cloud-provider-openstack/blob/master/pkg/openstack/instancesv2.go#L136-L142 as AdditionalLabels https://github.com/kubernetes/cloud-provider/blob/d2f5e75a5fd6d31f75b1519fe20879b1ab5347b8/cloud.go#L300

chess-knight · 2024-05-28T04:37:07Z

> actually I can see both hostId and host_id which has same value. Perhaps that label could be added if it exists. This can be added in https://github.com/kubernetes/cloud-provider-openstack/blob/master/pkg/openstack/instancesv2.go#L136-L142 as AdditionalLabels https://github.com/kubernetes/cloud-provider/blob/d2f5e75a5fd6d31f75b1519fe20879b1ab5347b8/cloud.go#L300

Yes, exactly as I originally wrote in the issue, thanks.
As @dulek said, it shouldn't be hard to implement this, but IMO discussion is needed here, on what should/will happen when live migration happens. When we initially set this custom label, it may not be correct later because the underlying host might change. So, should we consider this custom label only as "initial_host_id"? Or is there anything that we can do to reconcile it?

zetaab · 2024-05-28T10:10:00Z

we are not using live migrations at all, so difficult to say how it works.

mdbooth · 2024-05-28T12:46:39Z

IMHO the node controller should update node labels to reflect their current reality, i.e. a live migration will trigger node relabelling the next time the node is reconciled. Most things currently running on the Node are unlikely to act on it, but:

- Newly scheduled workloads can act on it
- Existing workloads at least have the possibility to notice that they're in the wrong place now
- We should document the known limitations and recommend an alternative (node reprovisioning)

We should also validate this decision with the cloud-provider folks in case there are caveats we're not aware of.

mdbooth · 2024-05-28T12:48:58Z

I'm very much in favour of adding a HostID label. I can confirm that HostID is unprivileged. It's an opaque value which can't be used to determine anything about the host including its name.

artificial-intelligence · 2024-06-06T08:58:43Z

> I'm very much in favour of adding a HostID label. I can confirm that HostID is unprivileged. It's an opaque value which can't be used to determine anything about the host including its name.

Hi,

could you - or anybody - please clarify: You say this HostID can't be used to determine anything about the host, but from the name I would suppose it's some kind of unique(?) distinct value per host, no?

So can't I at least infer, that when the HostID changes, the underlying host has changed?

If the answer to the above is "no", so you can't infer anything from this ID, I don't see where adding it would bring any benefit, at least for our use case in the Sovereign Cloudstack project.

So any clarification around this would be highly appreciated, thanks!

mdbooth · 2024-06-07T09:38:04Z

> I'm very much in favour of adding a HostID label. I can confirm that HostID is unprivileged. It's an opaque value which can't be used to determine anything about the host including its name.
>
> Hi,
>
> could you - or anybody - please clarify: You say this HostID can't be used to determine anything about the host, but from the name I would suppose it's some kind of unique(?) distinct value per host, no?

It's a sha224 of project_id and hostname: https://github.com/openstack/nova/blob/7dc4b1ea627d864a0ee2745cc9de4336fc0ba7b5/nova/utils.py#L1028-L1043

So hostID can't be compared between tenants.
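A small sketch of that derivation, assuming the hash input is simply the project ID concatenated with the host name, as described above. The `hostID` function and the sample project and host names are made up for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hostID follows the derivation described above: a SHA-224 digest over
// project_id + host, rendered as a hex string.
func hostID(projectID, host string) string {
	sum := sha256.Sum224([]byte(projectID + host))
	return hex.EncodeToString(sum[:])
}

func main() {
	a := hostID("project-a", "compute-01")
	b := hostID("project-b", "compute-01") // same hypervisor, other tenant
	c := hostID("project-a", "compute-02") // same tenant, other hypervisor

	fmt.Println("same host, different project, IDs differ:", a != b)
	fmt.Println("same project, different host, IDs differ:", a != c)
	fmt.Println("length of the hex string:", len(a))
}
```

Within one project, equal values mean the same host and different values mean different hosts, which is all a scheduler needs. A SHA-224 hex digest is 56 characters long, so it fits within the 63-character limit for Kubernetes label values.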

> So can't I at least infer, that when the HostID changes, the underlying host has changed?
>
> If the answer to the above is "no", so you can't infer anything from this ID, I don't see where adding it would bring any benefit, at least for our use case in the Sovereign Cloudstack project.
>
> So any clarification around this would be highly appreciated, thanks!

@stephenfin may be able to confirm that hostID will change if a VM live migrates, but I'm pretty sure that it would.

In general, k8s isn't going to handle a live migration well because, in general, we don't continuously reconcile the placement of things which have already been scheduled.

I think the value of HostID to a kubernetes cluster is the ability to schedule Pods on different underlying hypervisors. This means that an end-user can ensure their workload can survive a maintenance outage of a single hypervisor.
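To make the scheduling benefit concrete, here is a toy model of spreading replicas across hypervisors by host-id label. It is not the Kubernetes scheduler; it only mimics the effect that an anti-affinity rule keyed on such a label would aim for. The label key, the `Node` type, and the `spread` helper are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// hostIDLabel is an assumed label key, used only for this illustration.
const hostIDLabel = "topology.openstack.org/host-id"

// Node is a minimal stand-in for a Kubernetes Node: a name plus labels.
type Node struct {
	Name   string
	Labels map[string]string
}

// spread picks up to `replicas` nodes so that no two picked nodes share a
// host-id value. Nodes without the label are skipped.
func spread(nodes []Node, replicas int) []string {
	sorted := append([]Node(nil), nodes...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })

	used := map[string]bool{}
	var picked []string
	for _, n := range sorted {
		if len(picked) == replicas {
			break
		}
		id, ok := n.Labels[hostIDLabel]
		if !ok || used[id] {
			continue
		}
		used[id] = true
		picked = append(picked, n.Name)
	}
	return picked
}

func main() {
	nodes := []Node{
		{Name: "worker-1", Labels: map[string]string{hostIDLabel: "aaa"}},
		{Name: "worker-2", Labels: map[string]string{hostIDLabel: "aaa"}},
		{Name: "worker-3", Labels: map[string]string{hostIDLabel: "bbb"}},
	}
	// worker-1 and worker-2 share a hypervisor, so only one of them is picked.
	fmt.Println(spread(nodes, 2))
}
```

With two replicas placed this way, a maintenance outage of either hypervisor leaves one replica running.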

chess-knight · 2024-06-07T10:46:03Z

> I think the value of HostID to a kubernetes cluster is the ability to schedule Pods on different underlying hypervisors. This means that an end-user can ensure their workload can survive a maintenance outage of a single hypervisor.

I am thinking of the following scenario:

1. OCCM will add the correct hostId label when the node starts
2. Live migration happens
3. hostId label is incorrect now so worker nodes with different hostId can be placed on the same host. And when the user uses this incorrect label, the pods can end up on the same host.

mdbooth · 2024-06-07T12:06:47Z

> I think the value of HostID to a kubernetes cluster is the ability to schedule Pods on different underlying hypervisors. This means that an end-user can ensure their workload can survive a maintenance outage of a single hypervisor.
>
> I am thinking of the following scenario:
>
> 1. OCCM will add the correct hostId label when the node starts
> 2. Live migration happens
> 3. hostId label is incorrect now so worker nodes with different hostId can be placed on the same host. And when the user uses this incorrect label, the pods can end up on the same host.

I expect it to be updated.

However, live migrating k8s hosts already violates the scheduling constraints of everything which was running on it. Live migrating k8s workers is not a good idea if it can be avoided. Simply draining the node and shutting it down during maintenance is preferable.

chess-knight · 2024-06-07T12:28:24Z

> I think the value of HostID to a kubernetes cluster is the ability to schedule Pods on different underlying hypervisors. This means that an end-user can ensure their workload can survive a maintenance outage of a single hypervisor.
>
> I am thinking of the following scenario:
>
> 1. OCCM will add the correct hostId label when the node starts
> 2. Live migration happens
> 3. hostId label is incorrect now so worker nodes with different hostId can be placed on the same host. And when the user uses this incorrect label, the pods can end up on the same host.
>
> However, live migrating k8s hosts already violates the scheduling constraints of everything which was running on it. Live migrating k8s workers is not a good idea if it can be avoided. Simply draining the node and shutting it down during maintenance is preferable.

I agree with you, that live-migrating k8s node is not a good idea.

> I expect it to be updated.

But which controller should be responsible for that? Right now, I am not aware of any.

mdbooth · 2024-06-12T11:55:12Z

> But which controller should be responsible for that? Right now, I am not aware of any.

This PR is requesting that OpenStack CCM sets it, so OpenStack CCM would also be responsible for updating it. IIRC there is now a mechanism for returning arbitrary node labels, but I don't recall what it is.

chess-knight · 2024-06-17T05:20:07Z

> But which controller should be responsible for that? Right now, I am not aware of any.
>
> This PR is requesting that OpenStack CCM sets it, so OpenStack CCM would also be responsible for updating it. IIRC there is now a mechanism for returning arbitrary node labels, but I don't recall what it is.

Do you mean AdditionalLabels in the InstanceMetadata mentioned in this issue or something else? If so, I still don't know how these labels can be updated.

chess-knight · 2024-07-19T08:09:38Z

I created a PR so the discussion can move on. Can someone try it, please? You can use the registry.scs.community/occm-rh/openstack-cloud-controller-manager:v1.30.0-8-ga00ee1d8 image.

chess-knight · 2024-08-01T10:17:23Z

#2628 is approved. Should we merge it immediately and close this issue, or does someone still want to look at it? I am not able to test migrations where host-id will change. I only tried deleting the label with kubectl, and the label is not reconciled back into place, so I assume that live migration will have the same effect (a wrong host-id label afterwards).
Maybe we can just document this new host-id label with my observations and proceed...

kayrus · 2024-08-01T10:45:50Z

@chess-knight I agree, the label must be reconciled and updated once the node is live-migrated.

chess-knight · 2024-08-01T11:18:54Z

> @chess-knight I agree, the label must be reconciled and updated once the node is live-migrated.

I am not sure if OCCM is capable of doing that. Maybe after all we should go back to the original issue and implement it in CAPO, where machine reconciliation happens (I hope so). CAPO can introduce e.g. a node.cluster.x-k8s.io/host-id label, set it on the Machine object, and it will be propagated to the k8s Node labels.
Or just document this limitation, e.g. that host-id serves only as the "initial" host-id, without any guarantees in the future.

mdbooth · 2024-09-05T09:29:02Z

@chess-knight As well as reconciling new Nodes, the node controller resyncs nodes periodically: https://github.com/kubernetes/kubernetes/blob/03fe89c2339a1582733649faab5f5df471f65f09/staging/src/k8s.io/cloud-provider/controllers/node/node_controller.go#L191-L198

However, it looks like that job:

- Ignores tainted nodes. These are handled separately by sync(), which sets all values.
- Only sets Addresses

It sounds like if we want to continuously reconcile zone information that should be a discussion with the cloud-provider folks. Maybe @aojea can let us know if this has been discussed before, and if not the best place to start the discussion.

My view: Kubernetes doesn't expect zone information to change, and in general will not respond to changes in zone information. We should advise users that there are alternatives which will give better behaviour. Despite that, zone information can still change, which means it will occasionally change. An example is a managed cloud service where the user has no influence over the migration of workloads. By updating the zone information on the Node when it does change we:

- at least make it detectable
- allow new workloads to be created in the correct place

I believe we should continuously reconcile zone information, including node labels.

For now, this is an edge case. Let's return HostID in the instance metadata as is done by #2628. This is an immediate win for anybody wanting to schedule with hypervisor anti-affinity. The problem of continuous reconciliation is somewhat independent as it covers more than just the HostID label.
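For reference, a sketch of what continuously reconciling a single label could look like. This is not what the node controller does today; as noted above, the periodic job only sets addresses. The label key and the `reconcileHostID` helper are assumptions, and a real implementation would patch the Node object through the API server instead of mutating a map.

```go
package main

import "fmt"

// hostIDLabel is an assumed label key, used only for this illustration.
const hostIDLabel = "topology.openstack.org/host-id"

// reconcileHostID brings the node's host-id label in line with the value the
// cloud currently reports. It returns true when the label was added or changed.
func reconcileHostID(labels map[string]string, currentHostID string) bool {
	if currentHostID == "" {
		return false // the cloud reported nothing, so leave the label alone
	}
	if labels[hostIDLabel] == currentHostID {
		return false // already up to date
	}
	labels[hostIDLabel] = currentHostID
	return true
}

func main() {
	labels := map[string]string{hostIDLabel: "host-a"}

	// After a live migration the cloud reports a different hostId.
	changed := reconcileHostID(labels, "host-b")
	fmt.Println(changed, labels[hostIDLabel])

	// A manually deleted label is restored on the next pass.
	delete(labels, hostIDLabel)
	changed = reconcileHostID(labels, "host-b")
	fmt.Println(changed, labels[hostIDLabel])
}
```

Run on every periodic resync, a check like this would cover both the live-migration case and the manual label deletion tested earlier in this thread.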

chess-knight · 2024-09-05T11:01:46Z

Hi @mdbooth,
thank you for your review.

> @chess-knight As well as reconciling new Nodes, the node controller resyncs nodes periodically: https://github.com/kubernetes/kubernetes/blob/03fe89c2339a1582733649faab5f5df471f65f09/staging/src/k8s.io/cloud-provider/controllers/node/node_controller.go#L191-L198
>
> However, it looks like that job:
>
> - Ignores tainted nodes. These are handled separately by sync(), which sets all values.
> - Only sets Addresses

Interestingly, the code comment there suggests that the job reconciles the nodes' addresses and labels, not only addresses.

> It sounds like if we want to continuously reconcile zone information that should be a discussion with the cloud-provider folks. Maybe @aojea can let us know if this has been discussed before, and if not the best place to start the discussion.
>
> My view: Kubernetes doesn't expect zone information to change, and in general will not respond to changes in zone information. We should advise users that there are alternatives which will give better behaviour. Despite that, zone information can still change, which means it will occasionally change. An example is a managed cloud service where the user has no influence over the migration of workloads. By updating the zone information on the Node when it does change we:
>
> - at least make it detectable
> - allow new workloads to be created in the correct place
>
> I believe we should continuously reconcile zone information, including node labels.
>
> For now, this is an edge case. Let's return HostID in the instance metadata as is done by #2628. This is an immediate win for anybody wanting to schedule with hypervisor anti-affinity. The problem of continuous reconciliation is somewhat independent as it covers more than just the HostID label.

I agree, that updating labels needs to be discussed. Maybe it can be configurable on/off behaviour. Do you think that I should write also some docs about the HostID label, so users are aware of it?

k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 22, 2024

chess-knight mentioned this issue Apr 24, 2024

Create v2 of node distribution standard (issues/#494) SovereignCloudStack/standards#524

Merged

chess-knight linked a pull request Jul 19, 2024 that will close this issue

[occm] Add Openstack server hostId as k8s node label #2628

Open

This was referenced Aug 7, 2024

Add Openstack compute labels to k8s nodes nodes kubernetes-sigs/cluster-api-provider-openstack#1605

Closed

✨ Add OpenStackServerGroup CRD and Controller kubernetes-sigs/cluster-api-provider-openstack#1912

Open
