-
Notifications
You must be signed in to change notification settings - Fork 610
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[occm] Improve route controller reconciling to ensure the cluster's nodes can access each other #2484
base: master
Are you sure you want to change the base?
Conversation
Hi @jeffyjf. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Why did this ended up in Routes controller? The issue description feels like this is a general issue in cases CAPO is missing SG rules to allow interconnectivity of Node. |
According to the offical document:
I think that this is route controller's duty. I user don't activate route controller, the interconnectivity of pods should be ensured by other mechanisms. |
/ok-to-test |
6b498a6
to
f9ab6ef
Compare
3c6af74
to
d639248
Compare
The PR needs a rebase. However the #2499 is on the way, and I adopted your |
d639248
to
eaaea60
Compare
/hold |
eaaea60
to
35f8dca
Compare
/remove hold |
87dea84
to
81236c6
Compare
…onnectivity In oreder to ensure the connectivity between different nodes, route controller need to do the following things: 1. Check and set openstack router's route rules, so that the packets can be forwarded to correct nodes. 2. Check and set the node port's AllowAddressPair, to permit the packets from the pods pass through the node's port and leave the node. 3. Check and set openstack security group's rules, so that the nodes that bind the security group permit the packets from other nodes enter into. But, the previous occm codes just check router's route rules, just set router's route and AllowAddressPair, this patch completed the other steps.
81236c6
to
ee22e72
Compare
/retest pull-cloud-provider-openstack-test |
@jeffyjf: The
Use
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/test pull-cloud-provider-openstack-test |
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
/remove-lifecycle rotten |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the PR. Could you take a look at comments I just left?
@@ -483,7 +485,7 @@ func (os *OpenStack) Routes() (cloudprovider.Routes, bool) { | |||
return nil, false | |||
} | |||
|
|||
r, err := NewRoutes(os, network, netExts["extraroute-atomic"], netExts["allowed-address-pairs"]) | |||
r, err := NewRoutes(os, network, netExts["extraroute-atomic"], netExts["allowed-address-pairs"], os.routeOpts.AutoConfigSecurityGroup) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the os
is already passed to the NewRoutes
func. There is no need to pass the os.routeOpts.AutoConfigSecurityGroup
.
if err != nil { | ||
if errors.IsNotFound(err) { | ||
mc := metrics.NewMetricContext("security_group", "create") | ||
klog.V(2).InfoS("Create node security group", "name", sgName) | ||
group, err := groups.Create(r.network, groups.CreateOpts{Name: sgName}).Extract() | ||
if mc.ObserveRequest(err) != nil { | ||
return err | ||
} | ||
sgId = group.ID | ||
} else { | ||
return err | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if err != nil { | |
if errors.IsNotFound(err) { | |
mc := metrics.NewMetricContext("security_group", "create") | |
klog.V(2).InfoS("Create node security group", "name", sgName) | |
group, err := groups.Create(r.network, groups.CreateOpts{Name: sgName}).Extract() | |
if mc.ObserveRequest(err) != nil { | |
return err | |
} | |
sgId = group.ID | |
} else { | |
return err | |
} | |
} | |
if err != nil { | |
if !errors.IsNotFound(err) { | |
return err | |
} | |
mc := metrics.NewMetricContext("security_group", "create") | |
klog.V(2).InfoS("Create node security group", "name", sgName) | |
group, err := groups.Create(r.network, groups.CreateOpts{Name: sgName}).Extract() | |
if mc.ObserveRequest(err) != nil { | |
return err | |
} | |
r.nodeSecurityGroupId = group.ID | |
return nil | |
} |
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jeffyjf could you please rebase + see code review comments?
if r.autoConfigSecurityGroup { | ||
// check if the node security group bind to the port | ||
nodePortBindSecurityGroup := false | ||
for _, sg := range port.SecurityGroups { | ||
if sg == r.nodeSecurityGroupId { | ||
nodePortBindSecurityGroup = true | ||
break | ||
} | ||
} | ||
if !nodePortBindSecurityGroup { | ||
return false | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if r.autoConfigSecurityGroup { | |
// check if the node security group bind to the port | |
nodePortBindSecurityGroup := false | |
for _, sg := range port.SecurityGroups { | |
if sg == r.nodeSecurityGroupId { | |
nodePortBindSecurityGroup = true | |
break | |
} | |
} | |
if !nodePortBindSecurityGroup { | |
return false | |
} | |
} | |
// check if the node security group bind to the port | |
if if r.autoConfigSecurityGroup && !slices.Contains(port.SecurityGroups, r.nodeSecurityGroupId) | |
return false | |
} |
if r.allowedAddressPairs { | ||
// check whether the related AllowAddressPair is existing | ||
for _, addrPair := range port.AllowedAddressPairs { | ||
if addrPair.IPAddress == route.DestinationCIDR { | ||
return true | ||
} | ||
} | ||
return false | ||
} | ||
return true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if r.allowedAddressPairs { | |
// check whether the related AllowAddressPair is existing | |
for _, addrPair := range port.AllowedAddressPairs { | |
if addrPair.IPAddress == route.DestinationCIDR { | |
return true | |
} | |
} | |
return false | |
} | |
return true | |
// check whether the related AllowAddressPair is existing | |
if r.allowedAddressPairs && !slices.Contains(port.AllowedAddressPairs, route.DestinationCIDR) { | |
return false | |
} | |
return true |
@@ -177,6 +177,9 @@ The options in `Global` section are used for openstack-cloud-controller-manager | |||
|
|||
* `router-id` | |||
Specifies the Neutron router ID to activate [route controller](https://kubernetes.io/docs/concepts/architecture/cloud-controller/#route-controller) to manage Kubernetes cluster routes. | |||
`auto-config-node-security-group` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
`auto-config-node-security-group` | |
* `auto-config-node-security-group` |
if port.PortSecurityEnabled { | ||
if !r.checkPortSecurityRules(port, item) { | ||
continue | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if port.PortSecurityEnabled { | |
if !r.checkPortSecurityRules(port, item) { | |
continue | |
} | |
} | |
if port.PortSecurityEnabled && !r.checkPortSecurityRules(port, item) { | |
continue | |
} |
r.Lock() | ||
defer r.Unlock() | ||
mc := metrics.NewMetricContext("router", "get") | ||
router, err := routers.Get(r.network, r.os.routeOpts.RouterID).Extract() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're adding a complexity here. before, the router API was called only once when atomic API was not available. Now this API is called even atomic method is available.
What this PR does / why we need it:
In order to the nodes of one cluster can access each other, route controller need to check and set three things:
The current codes just check router's route rule when controller call
ListRoutes
and just setRoute
andAllowAddressPair
when controller callCreateRoute
. This PR complements all of the other works.Which issue this PR fixes(if applicable):
fixes #2482
Special notes for reviewers:
This PR lack of unit tests, because of I found occm lack of a mechanism to mock openstack client. Further more the whole occm lack of lots of unit tests due to the reason. I plan to dive deeper into gophercloud in the next several days try to study whether it is possible to mock it. If it is possible I'll commit anohter PR to add the unit tests.
Release note: