
[cinder-csi-plugin] csi-cinder storage capacity #2597

Draft
wants to merge 1 commit into base: master from the csi-cinder-capacity branch
Conversation

@sergelogvinov (Contributor) commented May 16, 2024

Available capacity of disk storage

What this PR does / why we need it:

To help the Kubernetes scheduler select the optimal zone/region based on the account's available disk-capacity limits, and to provide statistics/alerts in the cluster for a specific storageClass.

This PR doesn't aim to solve the real capacity problem, because the Kubernetes CSI driver doesn't have permission to see all the details of the disk infrastructure.

Which issue this PR fixes (if applicable):
for #2035, #2551

Special notes for reviewers:

kubectl get csistoragecapacities -ocustom-columns=CLASS:.storageClassName,AVAIL:.capacity,ZONE:.nodeTopology.matchLabels -A
CLASS                    AVAIL     ZONE
csi-cinder-classic-xfs   19990Gi   map[topology.cinder.csi.openstack.org/zone:nova]
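
For illustration only, a minimal sketch (not this PR's exact code) of how the controller's GetCapacity handler could report the quota-derived free space to the scheduler; the cs.Cloud.GetFreeCapacity helper name and the GiB-to-bytes conversion are assumptions based on the discussion in this PR:

```go
// Sketch only: surface quota-derived free space via the CSI GetCapacity RPC.
// cs.Cloud and GetFreeCapacity are assumed names, not the merged implementation.
func (cs *controllerServer) GetCapacity(ctx context.Context, req *csi.GetCapacityRequest) (*csi.GetCapacityResponse, error) {
	freeGiB, err := cs.Cloud.GetFreeCapacity() // free block-storage quota, in GiB
	if err != nil {
		return nil, status.Errorf(codes.Internal, "GetCapacity failed: %v", err)
	}
	return &csi.GetCapacityResponse{
		// The CSI spec expects bytes, while the Cinder limits API reports gigabytes.
		AvailableCapacity: freeGiB * 1024 * 1024 * 1024,
	}, nil
}
```

The external-provisioner then publishes this value as CSIStorageCapacity objects per topology segment, which is what the kubectl output above shows.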

Release note:

required documentation update/helm-chart/migration process

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 16, 2024
@k8s-ci-robot (Contributor)

Hi @sergelogvinov. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 16, 2024
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign jichenjc for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label May 16, 2024
@jichenjc (Contributor)

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 16, 2024
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 16, 2024
@sergelogvinov (Contributor, Author)

/retest

Review thread on pkg/csi/cinder/openstack/openstack_volumes.go (outdated, resolved):
@@ -396,6 +397,16 @@ func (os *OpenStack) GetMaxVolLimit() int64 {
	return defaultMaxVolAttachLimit
}

// GetFreeCapacity returns free capacity of the block storage
func (os *OpenStack) GetFreeCapacity() (int64, error) {
	res, err := volumelimit.Get(os.blockstorage).Extract()
Contributor

I'm not 100% sure, but this sounds like an admin API. We can't make admin calls from the CSI drivers; all of CPO is supposed to work normally under a regular tenant.

Contributor (Author)

See https://docs.openstack.org/api-ref/block-storage/v3/#limits-limits. I did not find the role/permission list required for csi-cinder.
I think this feature should be optional.

Contributor

Can you confirm it with DevStack and a demo tenant?

Contributor (Author)

On OVH Cloud, the OpenStack account has the computeOperator and volumeOperator permissions; docs: https://help.ovhcloud.com/csm/en-public-cloud-authenticate-api-openstack-service-account?id=kb_article_view&sysparm_article=KB0059364

But I do not know, and cannot get, the real OpenStack permissions behind them.
If you have a permission list for DevStack, I'd like to check...

Contributor

I've checked that the API is available. What puzzles me now is: how often is GetCapacity() called? Would calling this API put a significant load on the Cinder API in the cloud?

/ok-to-test

Contributor (Author)

It is called once per minute (the capacity-poll-interval flag of csi-provisioner, https://github.com/kubernetes-csi/external-provisioner/blob/c7f94435bd29e49edf1af0eda9c4ee2907c4d160/cmd/csi-provisioner/csi-provisioner.go#L111) for each storageClass and accessibleTopology (availability zone).

The enable-capacity flag is off by default, so csi-provisioner does not collect these metrics unless it is explicitly enabled.

@sergelogvinov (Contributor, Author)

/retest-required

@sergelogvinov (Contributor, Author)

/retest-required

1 similar comment
@sergelogvinov (Contributor, Author)

/retest-required

@@ -396,6 +397,21 @@ func (os *OpenStack) GetMaxVolLimit() int64 {
	return defaultMaxVolAttachLimit
}

// GetFreeCapacity returns free capacity of the block storage, in GB
Contributor

At least, this is not my understanding: the free storage usually comes from the backend, and I don't know whether every storage provider has this.

And the API seems to come from https://github.com/openstack/cinder/blob/master/cinder/api/v3/limits.py#L25 ?
If so, it's actually the quota, not the real storage? E.g. you can have a 100G max NFS pool but set the quota to 1T.

Contributor (Author)

Yep, this is a quota.
Based on this quota, I cannot allocate more disk space as I am not the owner of the OpenStack cluster.

These metrics provide the scheduler with additional information about which availability zone the pod can be scheduled in.

Contributor

Um... at least the CSI GetFreeSpace is expected to give real free space? This might be a little confusing.

And if the quota is bigger than the real storage, this is not accurate info?

@sergelogvinov (Contributor, Author) commented May 29, 2024

Oh, now I see what you mean, thanks.
I've tried to get the real size of the pool, but it requires the rule:admin_api role for the API method scheduler_extension:scheduler_stats:get_pools.

So it is better to rename the function to GetFreeQuotaStorageSpace, since we cannot get the real size of the pool.

@sergelogvinov force-pushed the csi-cinder-capacity branch 3 times, most recently from a5a1e4d to 57f388b on May 29, 2024 08:08
@sergelogvinov (Contributor, Author)

/retest-required

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 3, 2024
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 8, 2024
@kayrus (Contributor) left a comment

Overall this change looks fine. However, I wonder whether we can consider the maxTotalVolumes and totalVolumesUsed data returned by the limits API. In theory there can be a case when you have a lot of capacity but are running low on the number of volumes.
The csi.GetCapacityResponse struct has only AvailableCapacity, MaximumVolumeSize and MinimumVolumeSize: https://pkg.go.dev/github.com/container-storage-interface/spec/lib/go/csi#GetCapacityResponse. I guess it makes sense to add an option to toggle GetCapacity (e.g. whether to advertise the csi.ControllerServiceCapability_RPC_GET_CAPACITY capability), plus a check of maxTotalVolumes vs totalVolumesUsed to fake a zero-capacity response when the total number of volumes has been reached (see the sketch below).
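
A rough sketch of that zero-capacity check (illustrative only, not part of this PR), assuming the gophercloud blockstorage limits types already used in this diff; exact field types may differ slightly:

```go
// Illustrative only: clamp the reported capacity to zero once the volume-count
// quota is exhausted, so the scheduler stops placing new volumes in this segment.
func freeCapacityGiB(abs limits.Absolute) int64 {
	// No room for another volume object: report zero regardless of free gigabytes.
	if abs.MaxTotalVolumes >= 0 && abs.TotalVolumesUsed >= abs.MaxTotalVolumes {
		return 0
	}
	capacity := abs.MaxTotalVolumeGigabytes - abs.TotalGigabytesUsed
	if capacity < 0 {
		capacity = 0
	}
	return int64(capacity)
}
```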

Review thread on pkg/csi/cinder/controllerserver.go (outdated, resolved):
Comment on lines +404 to +413
	res, err := limits.Get(context.TODO(), os.blockstorage).Extract()
	if mc.ObserveRequest(err) != nil {
		return 0, err
	}

	capacity := res.Absolute.MaxTotalVolumeGigabytes - res.Absolute.TotalGigabytesUsed
	if capacity < 0 {
		capacity = 0
	}
	return capacity, nil


@kayrus asked me to take a look at this, since I'm our team's resident OpenStack quota/usage measurement expert.

This calculation is going to yield incorrect (or at the least, misleading) results when Cinder supports multiple volume types. The /limits endpoint lumps quota and usage for all volume types together into a grand total, so if you get capacity = 10 here, it could mean "6 GiB for volume type A plus 4 GiB for volume type B", and actually creating a 10 GiB volume of either type will fail.

The correct endpoint for the thing that you want to do is https://docs.openstack.org/api-ref/block-storage/v3/index.html#show-quota-usage-for-a-project, modelled in Gophercloud as https://pkg.go.dev/github.com/gophercloud/gophercloud/v2/openstack/blockstorage/v3/quotasets#GetUsage. Unfortunately, the modelling in Gophercloud is not very good, so the relevant quota_set.gigabytes_$VOLUMETYPE fields need to be extracted in a custom way. This is how I do it in my own code: https://github.com/sapcc/limes/blob/7153750c217e51e668ba67375e64a2938b97532b/internal/liquids/cinder/usage.go#L39-L52

Hope that helps!
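
To make that suggestion concrete, here is a rough sketch of such a custom extraction (not the linked sapcc/limes code and not part of this PR). It assumes gophercloud v2's ServiceClient helpers and the Cinder quota-usage response layout described in the API reference; every name below is illustrative:

```go
package quotasketch // illustrative package name

import (
	"context"
	"encoding/json"
	"strings"

	"github.com/gophercloud/gophercloud/v2"
)

// quotaDetail mirrors the per-field objects returned when ?usage=true is set.
type quotaDetail struct {
	Limit int64 `json:"limit"`
	InUse int64 `json:"in_use"`
}

// perVolumeTypeFreeGiB returns free gigabytes per volume type, decoding the
// "gigabytes_<volumetype>" fields that the standard gophercloud model does not expose.
func perVolumeTypeFreeGiB(ctx context.Context, client *gophercloud.ServiceClient, projectID string) (map[string]int64, error) {
	// GET /v3/{project_id}/os-quota-sets/{target_project_id}?usage=true
	var body struct {
		QuotaSet map[string]json.RawMessage `json:"quota_set"`
	}
	url := client.ServiceURL("os-quota-sets", projectID) + "?usage=true"
	if _, err := client.Get(ctx, url, &body, nil); err != nil {
		return nil, err
	}

	free := make(map[string]int64)
	for key, raw := range body.QuotaSet {
		// Per-type entries look like "gigabytes_standard", "gigabytes_ssd", ...
		volumeType, ok := strings.CutPrefix(key, "gigabytes_")
		if !ok {
			continue
		}
		var q quotaDetail
		if err := json.Unmarshal(raw, &q); err != nil {
			continue // skip non-object fields such as "id"
		}
		avail := q.Limit - q.InUse
		if q.Limit < 0 {
			avail = -1 // Cinder uses -1 for "unlimited"
		} else if avail < 0 {
			avail = 0
		}
		free[volumeType] = avail
	}
	return free, nil
}
```

A real implementation would also need to decide how volume types map onto storage classes before advertising per-class capacity.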

Contributor (Author)

Thank you for the interesting thoughts! However, my focus is on addressing customer-side issues, specifically dealing with account limits for storage space. This limit includes both block volumes and snapshots, and having this information helps the Kubernetes scheduler place pods correctly, and also alerts us when we are nearing the usage limit.

The actual storage capacity is managed by the OpenStack team, not the customers. I trust that they have several methods in place to monitor and determine when additional storage capacity is needed.

@sergelogvinov (Contributor, Author)

> Overall this change looks fine. However I wonder whether we can consider the maxTotalVolumes and totalVolumesUsed data returned by the limits API. [...]

This is indeed a very interesting idea. However, I can imagine that receiving an alert about running out of space when there is still space available might cause some confusion, similar to the situation with file systems and inodes. In my opinion, it would be ideal if Kubernetes first supported something like totalVolumesUsed before implementing this feature.

Available capacity of disk storage

Signed-off-by: Serge Logvinov <[email protected]>
@k8s-ci-robot (Contributor)

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 14, 2024