setsockopt: invalid argument #8841
Comments
Hello @DuckThom, thanks for your interest in Traefik! We can see that you are using a custom version of Traefik.
Ah, I see. I installed this version with
However, just FYI, it can take some time for the issue to appear.
Hello @DuckThom, I have just tried to reproduce the issue you are experiencing. I am on macOS 12.1 and downloaded the latest Traefik version, which is 2.6.1:
❯ traefik version
Version: 2.6.1
Codename: cheddar
Go version: go1.17.6
Built: I don't remember exactly
OS/Arch: darwin/arm64
I used almost the same static and dynamic configuration as you shared. So far, everything works as expected. Do you know how long I should wait to get the error message you were experiencing?
@jakubhajek Unfortunately, it usually only happens when traefik has been running for a couple of days. I just checked the logs again and it appears that this morning, the
This is with debug logging enabled. Unfortunately, there doesn't seem to be anything related to this error in there; the log lines before and after it are from more than an hour earlier and later.
I've done some more digging and it appears that the error itself is coming from traefik/pkg/server/server_entrypoint_tcp.go Line 539 in 79aab5a
I haven't been able to reliably reproduce this yet, but I have a feeling that a dropped connection or something similar isn't being handled correctly, which kills the entrypoint, and the entrypoint doesn't get restarted. But that's just a theory; I cannot say for certain.
Can you please check whether the issue you described might be related to computer hibernation? May we also ask you to share the complete debug log? Thank you. FYI: I ran Traefik on my laptop for some time and was not able to reproduce the issue. My laptop was hibernated and then woken up, and the issue did not occur.
The computer doesn't go into hibernation; it's a server I have running 24/7 at home. Below is the long debug log. I did strip out some repetitive lines (mostly GET/POST requests trying to get .env files and such).
Thank you for all the information you have provided, @DuckThom. However, we have not been able to reproduce this bug in our environment, and it looks like it is time to get more eyes on it. If any community member can help us find or verify steps to reproduce, we would love the help. To summarize what we have already tried to verify:
I'm still trying to debug this (got a bit slowed down as I caught covid last week...). From what I can gather, it could be related to TCP KeepAlive. As such, I've currently added some debugging to this method: traefik/pkg/server/server_entrypoint_tcp.go Line 367 in 8c56d1a
Specifically:
I got the idea from these PRs/Issues:
@jakubhajek The error is indeed coming from this line:
if err := tc.SetKeepAlive(true); err != nil {
I modified the Accept function to look like this:
func (ln tcpKeepAliveListener) Accept() (net.Conn, error) {
	tc, err := ln.AcceptTCP()
	if err != nil {
		log.WithoutContext().Errorf("failed to accept TCP: %v", err)
		return nil, err
	}
	if err := tc.SetKeepAlive(true); err != nil {
		log.WithoutContext().Errorf("failed to enable TCP keepalive: %v", err)
		return nil, err
	}
	if err := tc.SetKeepAlivePeriod(3 * time.Minute); err != nil {
		// Some systems, such as OpenBSD, have no user-settable per-socket TCP
		// keepalive options.
		if !errors.Is(err, syscall.ENOPROTOOPT) {
			log.WithoutContext().Errorf("failed to set TCP keepalive period: %v", err)
			return nil, err
		}
	}
	return tc, nil
}
I let it run again for a couple of days, and it just logged this error message:
Which seems to indicate that it's unable to call |
Hello @DuckThom, sorry for the late reply, and thanks for your investigation. While it is still unclear why we haven't been able to reproduce the issue on darwin/arm64, we have everything we need to try again. We will let you know the outcome.
Welcome!
What did you do?
Nothing in particular. The error seems to be happening without a specific action occurring.
What did you see instead?
The entrypoints seem to die out of the blue with the error "setsockopt: invalid argument" in the logs.
After this error appears in the log, Traefik won't accept any new incoming connections.
I've found #8071; however, it was closed without a resolution. The thread at https://community.traefik.io/t/traffic-stops-accepting-connections-on-certain-entrypoints-after-running-for-about-a-week/10390 also has not been updated since April 2021.
What version of Traefik are you using?
What is your environment & configuration?
System:
traefik.toml
traefik_dynamic.yml
If applicable, please paste the log output in DEBUG level
Note: This is INFO level logging, I'm still collecting debug output logging