Failed to send/receive updates: high-load bot stops working from time to time #760

Closed
grinrill opened this issue May 8, 2022 · 5 comments
Labels
bug Something isn't working

Comments

@grinrill
grinrill commented May 8, 2022

What version of gotd are you using?

github.com/gotd/td v0.56.0

Can this issue be reproduced with the latest version?

Yes, I think so.

What did you do?

I have a high-load bot. Every 5-10 minutes it stops receiving updates and sending messages for a while:
image

Here are the errors that occur during this time:

  • First, for 10-15 seconds: rpcDoRequest: retryUntilAck: send: write: write intermediate: write tcp <server ip here>:45706->149.154.167.41:443: write: broken pipe
  • Then, for 30-40 seconds, nothing happens
  • Then, for 10-15 seconds: engine was closed

The problem does not depend on which bot account the code runs under. It happens whenever the bot is under high load, and it does not happen when the bot is not under high load.

What did you expect to see?

The bot is working properly.

What did you see instead?

The bot stops responding from time to time.

What Go version and environment are you using?

go version go1.18 linux/amd64

go env Output

Where the code is built:

GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/grinrill/.cache/go-build"
GOENV="/home/grinrill/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/grinrill/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/grinrill/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.18"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/grinrill/bots/gomentiondev/go.mod"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3530652061=/tmp/go-build -gno-record-gcc-switches"

Where the code runs:

GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/grinrill/.cache/go-build"
GOENV="/home/grinrill/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/grinrill/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/grinrill/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.16.2"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build614313604=/tmp/go-build -gno-record-gcc-switches"

grinrill added the bug label May 8, 2022

@borzovplus

borzovplus commented Jul 8, 2023

Any updates? When this happens, the context is canceled but client.Run does not return.
I am on version 0.83.

@ernado
Member

ernado commented Jul 8, 2023

This can be caused by #1030 and is not fixed. Currently I'm out of capacity.

@borzovplus

> This can be caused by #1030 and is not fixed. Currently I'm out of capacity.

Maybe there is a workaround to catch this and reconnect the account myself?
My main problem is that even though the context is canceled, (*updates.Manager).Run does not return and keeps blocking the goroutine that called it.

@ioukarbro

ioukarbro commented Dec 15, 2023

I encountered this issue with my high-load bot too. It happens regularly, and the bot recovers after 1-2 minutes. Can anyone help? It's really important to me.
BTW, I have some other similar projects built with Telethon that have never encountered this issue.
I am using github.com/gotd/td v0.91.0.

@ernado
Member

ernado commented Jan 21, 2024

I'm trying to find a way to reproduce this issue.
I tried something like this:

sudo ss -K dst 149.154.167.50

Just closing the connection does not reproduce this issue, so another approach is probably needed.

If anybody has ideas on how to reproduce it in a controlled environment (I'm using an echobot started locally), I will be very happy.

Closing in favor of #1030 to track this problem in single issue.
