master in aws not able to connect to rds 🐛[bug] #7979

Open
humbleearth opened this issue Sep 24, 2023 · 3 comments

@humbleearth

Describe the bug

INFO[2023-09-24T06:39:21Z] Determined master 0.26.0-dev0 (built with go1.21.0)
INFO[2023-09-24T06:39:21Z] connecting to database determined-ai-stack-database-8.cluster-clmqsrouqcuk.us-gov-west-1.rds.amazonaws.com:5432
WARN[2023-09-24T06:39:25Z] failed to connect to postgres, trying again in 4s error="failed to connect to host=determined-ai-stack-database-gmauvb4ghjrp.cluster-8.us-gov-west-1.rds.amazonaws.com user=postgres database=determined: hostname resolving error (lookup determined-ai-stack-database-8.cluster-clmqsrouqcuk.us-gov-west-1.rds.amazonaws.com on 127.0.0.11:53: no such host)"
WARN[2023-09-24T06:39:29Z] failed to connect to postgres, trying again in 4s error="failed to connect to host=determined-ai-stack-database-8.cluster-clmqsrouqcuk.us-gov-west-1.rds.amazonaws.com user=postgres database=determined: hostname resolving error (lookup determined-ai-stack-database-8.cluster-clmqsrouqcuk.us-gov-west-1.rds.amazonaws.com on 127.0.0.11:53: no such host)"
WARN[2023-09-24T06:39:33Z] failed to connect to postgres, trying again in 4s error="failed to connect to host=determined-ai-stack-database-8.cluster-clmqsrouqcuk.us-gov-west-1.rds.amazonaws.com user=postgres database=determined: hostname resolving error (lookup determined-ai-stack-database-8.cluster-clmqsrouqcuk.us-gov-west-1.rds.amazonaws.com on 127.0.0.11:53: no such host)"
WARN[2023-09-24T06:39:37Z] failed to connect to postgres, trying again in 4s error="failed to connect to host=determined-ai-stack-database-8.cluster-clmqsrouqcuk.us-gov-west-1.rds.amazonaws.com user=postgres database=determined: hostname resolving error (lookup determined-ai-stack-database-8.cluster-clmqsrouqcuk.us-gov-west-1.rds.amazonaws.com on 127.0.0.11:53: no such host)"

Reproduction Steps

  1. deploy determined
  2. check docker logs for agent
  3. the logs show the hostname lookup failing ("no such host")

Expected Behavior

should be able to connect to database

Screenshot

na

Environment

  • Device or hardware: [aws]
  • OS: [linux]
  • Browser [chrome]
  • Version [25]

Additional Context

No response

@ioga
Contributor

ioga commented Sep 24, 2023

in another issue of yours, master is on govcloud and has successfully connected to RDS. what changed?

@humbleearth
Author

It's still happening. I manually run nslookup, try other ways to connect, and restart the docker container many times until it connects again.
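
For anyone trying to narrow this down, the manual nslookup-and-restart loop could be replaced by a small probe that resolves the endpoint repeatedly and records each outcome. This is only a sketch: `probe_dns` and its parameters are hypothetical names, not part of Determined, and the 4-second default delay simply mirrors the retry interval in the logs above.

```python
import socket
import time


def probe_dns(hostname, port=5432, attempts=5, delay=4.0,
              resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve hostname several times and record every outcome.

    Hypothetical helper for diagnosing intermittent lookups. The resolver
    and sleep arguments are injectable so the loop can be tested offline.
    """
    results = []
    for attempt in range(1, attempts + 1):
        try:
            infos = resolver(hostname, port)
            # Each getaddrinfo entry ends with a sockaddr tuple whose
            # first element is the resolved IP address.
            addresses = sorted({info[4][0] for info in infos})
            results.append((attempt, True, addresses))
        except socket.gaierror as exc:
            # gaierror is what a "no such host" style failure raises.
            results.append((attempt, False, str(exc)))
        if attempt < attempts:
            sleep(delay)
    return results
```

Running `probe_dns("<your RDS endpoint>", attempts=20)` inside the master container and again on the EC2 host would show whether the failures are specific to the container's resolver or also happen on the host.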

@ioga
Contributor

ioga commented Sep 24, 2023

first of all, I am sorry you're seeing all these issues. we do not have any testing for govcloud today, and we primarily rely on the community for support.

I don't understand what's different about DNS configuration in govcloud which would make it break intermittently like this.
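
One thing that may be worth comparing: the lookups in the log go to 127.0.0.11:53, which is Docker's embedded DNS server for containers on user-defined networks; it forwards external names to the resolvers configured on the host. Whether that forwarding is the problem here is not established. A minimal sketch for listing the nameservers in a resolv.conf-style file, so the container's and the host's view can be put side by side (`parse_nameservers` is a hypothetical helper name):

```python
def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines, which start with '#' or ';'.
        if not stripped or stripped.startswith(("#", ";")):
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers
```

For example, `parse_nameservers(open("/etc/resolv.conf").read())` could be run once inside the master container and once on the host. If the host list points at a resolver that cannot answer for the RDS endpoint, that would be a place to look next.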
