tsh db connect cannot be run concurrently #51096

Open
Ezzahhh opened this issue Jan 16, 2025 · 3 comments
Labels
bug database-access Database access related issues and PRs

Comments

@Ezzahhh

Ezzahhh commented Jan 16, 2025

Expected behavior:
When scripts run tsh db connect against multiple databases simultaneously to execute SQL queries, tsh should handle the certificates in its keystore concurrently without running into race conditions.

Current behavior:
Running multiple tsh db connect commands simultaneously against different databases causes some or all of them to fail.
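
A minimal sketch of how such simultaneous runs could be launched from a script, for anyone trying to reproduce this. The helper names run_concurrently and tsh_db_connect_argv are hypothetical, and selecting the database by name is an assumption: the runs in this report selected databases by label, and the exact arguments are not shown.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(commands):
    """Start all commands at roughly the same time; return (exit code, stderr) pairs."""

    def run_one(argv):
        # stdin is closed so an interactive client exits instead of waiting for input.
        proc = subprocess.run(
            argv, stdin=subprocess.DEVNULL, capture_output=True, text=True
        )
        return proc.returncode, proc.stderr

    # One worker per command, so none of them waits for another to finish.
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return list(pool.map(run_one, commands))


def tsh_db_connect_argv(db_name):
    # Assumption: the database is selected by name. Adjust to match the real invocation.
    return ["tsh", "db", "connect", db_name]
```

Calling run_concurrently([tsh_db_connect_argv(name) for name in names]) with a few real database names, and then checking which exit codes are non-zero, should show whether some of the runs fail.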

Bug details:

  • Teleport version: 17.1.5
  • Debug logs
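    Here are example logs from simultaneous runs. The run starting at 09:53:06 fails with a "no credentials: ... admin.key is empty" error; the run starting at 09:52:56 succeeds and launches psql.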
2025-01-16T09:53:06.395+10:00 INFO [CLIENT]    ALPN connection upgrade required for "teleport.[REDACT].com:443": false. client/api.go:915
2025-01-16T09:53:06.395+10:00 INFO [CLIENT]    no host login given. defaulting to ericlin client/api.go:1260
2025-01-16T09:53:06.395+10:00 INFO [CLIENT]    [KEY AGENT] Connected to the system agent: "/private/tmp/com.apple.launchd.VSdbfUmW4F/Listeners" client/api.go:4692
2025-01-16T09:53:06.399+10:00 INFO [KEYAGENT]  Loading SSH key for user "admin" and cluster "teleport.[REDACT].com". client/keyagent.go:198
2025-01-16T09:53:06.407+10:00 DEBU [TSH]       Listing databases with predicate () and labels map[KubernetesCluster:[REDACT].[REDACT].com ReadReplica:false] common/db.go:1190
2025-01-16T09:53:07.346+10:00 DEBU [TSH]       Selected database "[REDACT]" common/db.go:1079
2025-01-16T09:53:08.528+10:00 DEBU [TSH]       "Defaulting to the allowed database user \"admin\"\n" common/db.go:1004
2025-01-16T09:53:09.697+10:00 DEBU [TSH]       Fetching database access certificate for Database(Service=[REDACT], Protocol=postgres, Username=admin, Database=[REDACT], Roles=[]) on cluster teleport.[REDACT].com. common/db.go:299
2025-01-16T09:53:10.606+10:00 DEBU [CLIENT]    MFA not required for access. client/cluster_client.go:564
2025-01-16T09:53:10.609+10:00 DEBU [CLIENT]    Activating relogin on error="no credentials: /Users/ericlin/.tsh/keys/teleport.[REDACT].com/admin.key is empty" (type=*client.noCredentialsError) client/api.go:662
2025-01-16T09:52:56.819+10:00 INFO [CLIENT]    ALPN connection upgrade required for "teleport.[REDACT].com:443": false. client/api.go:915
2025-01-16T09:52:56.819+10:00 INFO [CLIENT]    no host login given. defaulting to ericlin client/api.go:1260
2025-01-16T09:52:56.819+10:00 INFO [CLIENT]    [KEY AGENT] Connected to the system agent: "/private/tmp/com.apple.launchd.VSdbfUmW4F/Listeners" client/api.go:4692
2025-01-16T09:52:56.827+10:00 INFO [KEYAGENT]  Loading SSH key for user "admin" and cluster "teleport.[REDACT].com". client/keyagent.go:198
2025-01-16T09:52:56.835+10:00 DEBU [TSH]       Listing databases with predicate () and labels map[KubernetesCluster:jet-boa-86684d.customer.k8s.[REDACT].com ReadReplica:false] common/db.go:1190
2025-01-16T09:52:57.772+10:00 DEBU [TSH]       Selected database "[REDACT]" common/db.go:1079
2025-01-16T09:52:59.064+10:00 DEBU [TSH]       "Defaulting to the allowed database user \"admin\"\n" common/db.go:1004
2025-01-16T09:52:59.974+10:00 DEBU [TSH]       Fetching database access certificate for Database(Service=[REDACT]., Protocol=postgres, Username=admin, Database=[REDACT], Roles=[]) on cluster teleport.[REDACT].com. common/db.go:299
2025-01-16T09:53:00.889+10:00 DEBU [CLIENT]    MFA not required for access. client/cluster_client.go:564
2025-01-16T09:53:00.900+10:00 DEBU [CLIENT]    not using loopback pool for remote proxy addr: teleport.[REDACT].com:443 client/api.go:4647
2025-01-16T09:53:00.900+10:00 DEBU  Attempting request to Proxy web api method:GET host:teleport.[REDACT].com:443 path:/webapi/ping/local trace_id:6e153d4bf6501aba89741bf2ecb9acd2 span_id:f194bc837467828a webclient/webclient.go:131
2025-01-16T09:53:02.045+10:00 DEBU  ALPN connection upgrade test complete address:teleport.[REDACT].com:443 upgrade_required:false trace_id:6e153d4bf6501aba89741bf2ecb9acd2 span_id:f194bc837467828a client/alpn_conn_upgrade.go:96
2025-01-16T09:53:02.472+10:00 DEBU [KEYAGENT]  Deleting obsolete stored keyring with index {ProxyHost:teleport.[REDACT].com Username:admin ClusterName:teleport.[REDACT].com}. client/keyagent.go:550
2025-01-16T09:53:03.177+10:00 DEBU [KEYSTORE]  Adding known host teleport.[REDACT].com with proxy teleport.[REDACT].com client/trusted_certs_store.go:395
2025-01-16T09:53:03.186+10:00 DEBU [TSH]       Starting local proxy because: cluster teleport.[REDACT].com proxy is using TLS routing common/db.go:616
2025-01-16T09:53:03.203+10:00 DEBU [TSH]       /opt/homebrew/opt/postgresql@16/bin/psql postgres://admin@localhost:62244/[REDACT]?sslrootcert=/Users/ericlin/.tsh/keys/teleport.[REDACT].com/cas/teleport.[REDACT].com.pem&sslcert=/Users/ericlin/.tsh/keys/teleport.[REDACT].com/admin-db/teleport.[REDACT].com/[REDACT].crt&sslkey=/Users/ericlin/.tsh/keys/teleport.[REDACT].com/admin-db/teleport.[REDACT].com/[REDACT].key&sslmode=verify-full common/db.go:805
2025-01-16T09:53:03.214+10:00 DEBU [LOCALPROX] Accepted downstream connection. alpnproxy/local_proxy.go:200

Every log from my runs looks like one or the other of the above. My theory is that this is related to the "Deleting obsolete stored keyring with index" log line, which might be removing the keys/certificates. This comparison must be returning false for this code path to be followed.
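
To illustrate the kind of race suspected here, the sketch below shows the general failure mode in which one process rewrites a key file in place while another reads it. This is not Teleport's keystore code, and whether tsh writes its keys this way is an assumption; it only shows why a reader can observe an empty file like the "admin.key is empty" error above, and how a write-then-rename avoids that window. All function names are hypothetical.

```python
import os
import tempfile


def read_key(path):
    with open(path, "rb") as handle:
        return handle.read()


def write_atomically(path, data):
    """Write to a temporary file in the same directory, then rename it over the target."""
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        # Readers see either the old file or the new one, never an empty file.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise


def demo_truncation_window(path):
    """Replay an in-place rewrite step by step, reading the file in the middle."""
    write_atomically(path, b"old key")
    handle = open(path, "wb")        # opening for write truncates the file to 0 bytes
    seen_mid_write = read_key(path)  # what a concurrent reader would see at this point
    handle.write(b"new key")
    handle.close()
    seen_after = read_key(path)
    return seen_mid_write, seen_after
```

In demo_truncation_window the mid-write read returns an empty byte string even though the file had contents before and after, which is the same symptom a concurrent tsh process would report as an empty key file.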

As a side note, if the tsh user is logged in through GitHub (rather than as a local user), running these tsh commands concurrently also causes problems with GitHub (redirects to reauthorization pages during the runs). However, that is a separate issue, and it goes away when using local users.

@Ezzahhh Ezzahhh added the bug label Jan 16, 2025
@zmb3 zmb3 added the database-access Database access related issues and PRs label Jan 16, 2025
@zmb3
Collaborator

zmb3 commented Jan 16, 2025

Looks like a potential duplicate of #48664

@Ezzahhh
Author

Ezzahhh commented Jan 16, 2025

Sorry, my bad, I didn't find those issues before filing this one. Both #15577 and #48664 seem relevant and likely share the same underlying issue.

@Ezzahhh
Author

Ezzahhh commented Jan 16, 2025

I would also add that this issue isn't limited to tsh db connect. If I run automation in parallel using the kubeconfig generated by tsh kube login, I encounter similar issues:

no credentials: open /Users/ericlin/.tsh/keys/teleport.[REDACT].com/Ezzahhh: no such file or directory

I suppose this is due to the exec in the kubeconfig that calls tsh kube credentials.
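
Until the underlying race is fixed, one possible workaround is to serialize the tsh invocations behind a file lock. The sketch below is untested against Teleport and rests on the assumption that the failures come from overlapping access to the files under ~/.tsh; exclusive_lock, run_serialized, and the lock path are hypothetical names, and fcntl is Unix-only.

```python
import fcntl
import os
import subprocess
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until no other holder remains
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


def run_serialized(argv, lock_path):
    """Run argv while holding the lock, so wrapped invocations never overlap."""
    with exclusive_lock(lock_path):
        return subprocess.run(argv).returncode
```

Note that this holds the lock for the whole command. For a short-lived call such as the tsh kube credentials exec that is cheap, but for tsh db connect it also serializes the database sessions themselves, which gives up the concurrency the scripts wanted.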
