Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Upload Error]: [upload part: send upload part 81 RPC: rpcDoRequest: retryUntilAck: engine forcibly closed: context canceled] #1021

Closed
jhemmmm opened this issue Mar 15, 2023 · 13 comments
Assignees
Labels
bug Something isn't working

Comments

@jhemmmm
Copy link

jhemmmm commented Mar 15, 2023

What version of gotd are you using?

github.com/gotd/td v0.71.0

Can this issue be reproduced with the latest version?

Yes

What did you do?

func (d *Telegram) upload(path string) (int, *tg.Document, error) {
	f, err := uploader.NewUploader(d.TGClient).WithPartSize(524288).WithProgress(d).WithThreads(12).FromPath(d.Context, path)
	if err != nil {
		return 0, nil, err
	}
	msg, err := unpack.Message(message.NewSender(d.TGClient).Media(d.Context, message.UploadedDocument(f)))
	if err != nil {
		return 0, nil, err
	}
	doc, ok := msg.Media.(*tg.MessageMediaDocument).Document.AsNotEmpty()
	if !ok {
		return 0, nil, fmt.Errorf("invalid document")
	}
	return msg.ID, doc, nil
}

What did you expect to see?

No error

What did you see instead?

upload part: send upload part 81 RPC: rpcDoRequest: retryUntilAck: engine forcibly closed: context canceled
upload part: send upload part 154 RPC: rpcDoRequest: engine forcibly closed: context canceled

What Go version and environment are you using?

go version go1.19.2 linux/amd64
go env Output
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/user/.cache/go-build"
GOENV="/home/user/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/user/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/user/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.19.2"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/user/gotelegram/go.mod"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1435934855=/tmp/go-build -gno-record-gcc-switches"
@jhemmmm jhemmmm added the bug Something isn't working label Mar 15, 2023
@ernado
Copy link
Member

ernado commented Mar 15, 2023

Looks like client was closed during upload.

@jhemmmm
Copy link
Author

jhemmmm commented Mar 15, 2023

I am using the bg.Connect

It says that Background returns a non-nil, empty Context. It is never canceled, has no values, and has no deadline. It is typically used by the main function, initialization, and tests, and as the top-level Context for incoming requests.

So, there's no way unless I forcibly close my app.

@ernado
Copy link
Member

ernado commented Mar 15, 2023

Client can close itself on some critical failure.
Can you please pass logger to client and enable debug logs?

@jhemmmm
Copy link
Author

jhemmmm commented Mar 15, 2023

Client can close itself on some critical failure. Can you please pass logger to client and enable debug logs?

I'll check it out, BTW; that error only happens sometimes, like 2 out of 10. But still, I want it to be fixed.

@ernado
Copy link
Member

ernado commented Mar 15, 2023

Let's fix it, I just need some debug logs just before this error happens to investigate.

@ernado ernado self-assigned this Mar 15, 2023
@jhemmmm
Copy link
Author

jhemmmm commented Mar 16, 2023

Let's fix it, I just need some debug logs just before this error happens to investigate.

Here's the log. I found 3 errors while uploading some files.
https://gist.github.com/jhemmmm/5817377cca135542aedd60d707969b47

@jhemmmm
Copy link
Author

jhemmmm commented Mar 18, 2023

@ernado any news?.

@borzovplus
Copy link

@ernado any news?

I have the same problem

@ernado
Copy link
Member

ernado commented Mar 24, 2023

Sorry, pretty busy right now, I'll try to find some time to investigate it soon.

General workaround for this error is use client.Run() directly without bg helper.

@ernado
Copy link
Member

ernado commented Mar 24, 2023

After inspecting logs, I've found out that telegram client observed connection timeout:

2023-03-16T06:29:57.022Z	DEBUG	conn	manager/conn.go:125	Connection dead	{"v": "v0.79.0", "conn_id": 0, "dc_id": 5, "error": "group: task pingLoop: disconnect (pong missed): write: write: write intermediate: write tcp 51.91.229.155:46430->91.108.56.152:443: i/o timeout", "errorVerbose": "group:\n    github.com/gotd/td/internal/mtproto.(*Conn).Run\n        /root/go/pkg/mod/github.com/gotd/td@v0.79.0/internal/mtproto/conn.go:214\n  - task pingLoop:\n    github.com/gotd/td/internal/tdsync.(*LogGroup).Go.func1\n        /root/go/pkg/mod/github.com/gotd/td@v0.79.0/internal/tdsync/log_group.go:48\n  - disconnect (pong missed):\n    github.com/gotd/td/internal/mtproto.(*Conn).pingLoop\n        /root/go/pkg/mod/github.com/gotd/td@v0.79.0/internal/mtproto/ping.go:115\n  - write:\n    github.com/gotd/td/internal/mtproto.(*Conn).pingDelayDisconnect\n        /root/go/pkg/mod/github.com/gotd/td@v0.79.0/internal/mtproto/ping.go:71\n  - write:\n    github.com/gotd/td/transport.(*connection).Send\n        /root/go/pkg/mod/github.com/gotd/td@v0.79.0/transport/connection.go:48\n  - write intermediate:\n    github.com/gotd/td/internal/proto/codec.Intermediate.Write\n        /root/go/pkg/mod/github.com/gotd/td@v0.79.0/internal/proto/codec/intermediate.go:64\n  - write tcp 51.91.229.155:46430->91.108.56.152:443: i/o timeout"}
github.com/gotd/td/telegram/internal/manager.(*Conn).Run.func1
	/root/go/pkg/mod/github.com/gotd/td@v0.79.0/telegram/internal/manager/conn.go:125
github.com/gotd/td/telegram/internal/manager.(*Conn).Run
	/root/go/pkg/mod/github.com/gotd/td@v0.79.0/telegram/internal/manager/conn.go:128
github.com/gotd/td/internal/tdsync.(*CancellableGroup).Go.func1
	/root/go/pkg/mod/github.com/gotd/td@v0.79.0/internal/tdsync/cancel_group.go:48
golang.org/x/sync/errgroup.(*Group).Go.func1
	/root/go/pkg/mod/golang.org/x/sync@v0.1.0/errgroup/errgroup.go:75
2023-03-16T06:29:57.022Z	INFO	telegram/connect.go:75	Restarting connection	{"v": "v0.79.0", "error": "group: task pingLoop: disconnect (pong missed): write: write: write intermediate: write tcp 51.91.229.155:46430->91.108.56.152:443: i/o timeout", "errorVerbose": "group:\n    github.com/gotd/td/internal/mtproto.(*Conn).Run\n        /root/go/pkg/mod/github.com/gotd/td@v0.79.0/internal/mtproto/conn.go:214\n  - task pingLoop:\n    github.com/gotd/td/internal/tdsync.(*LogGroup).Go.func1\n        /root/go/pkg/mod/github.com/gotd/td@v0.79.0/internal/tdsync/log_group.go:48\n  - disconnect (pong missed):\n    github.com/gotd/td/internal/mtproto.(*Conn).pingLoop\n        /root/go/pkg/mod/github.com/gotd/td@v0.79.0/internal/mtproto/ping.go:115\n  - write:\n    github.com/gotd/td/internal/mtproto.(*Conn).pingDelayDisconnect\n        /root/go/pkg/mod/github.com/gotd/td@v0.79.0/internal/mtproto/ping.go:71\n  - write:\n    github.com/gotd/td/transport.(*connection).Send\n        /root/go/pkg/mod/github.com/gotd/td@v0.79.0/transport/connection.go:48\n  - write intermediate:\n    github.com/gotd/td/internal/proto/codec.Intermediate.Write\n        /root/go/pkg/mod/github.com/gotd/td@v0.79.0/internal/proto/codec/intermediate.go:64\n  - write tcp 51.91.229.155:46430->91.108.56.152:443: i/o timeout", "backoff": "569.375796ms"}

This led to rpc engine restart and re-connection, which drops all ongoing requests.
All invokations will fail after this.

While re-connection is in progress, no requests can be successfully done.

Also I've found out that reconnect was successfull and invocations should work after that.

So, there is no bug in client that we can fix.
To mitigate this issue, implement retries for uploads on your side.

Probably we can add opt-in retry for uploading individual chunks in uploader, but this will be not a bugfix, but a feature.

@ernado ernado closed this as completed Mar 24, 2023
@jhemmmm
Copy link
Author

jhemmmm commented Mar 25, 2023

@ernado at you can see on my func, there's already a retry feature on it, but I'm still having the same issue, for now, I have a temporary fix by closing the client if the upload fails and create a new client and retry the upload. What's the main reason why the observed connection got a timeout?

@ernado
Copy link
Member

ernado commented Mar 25, 2023

What's the main reason why the observed connection got a timeout?

It got timeout while sending data via TCP connection, i.e. some network problems, that lead to ping loop to fail.

Looks like re-creating client is better workaround. This can't be implemented in contrib/bg package now.

Anyway, client should work after reconnect. I'll try to reproduce this issue, but logs shows that connection was restarted.

@ernado
Copy link
Member

ernado commented Mar 25, 2023

I've simulated connection loss and looks like client can't recover for some reason.

So more general issue is #1030

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants