Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partial pipelined request blocks outstanding responses #1043

Closed
wcs1only opened this issue Jun 7, 2021 · 11 comments · Fixed by #1050
Closed

Partial pipelined request blocks outstanding responses #1043

wcs1only opened this issue Jun 7, 2021 · 11 comments · Fixed by #1050

Comments

@wcs1only
Copy link

wcs1only commented Jun 7, 2021

If N pipelined requests are sent in the same packet, with the N+1th request split across a packet boundary. All replies are blocked until the N+1th request is finally sent. You can reproduce this behavior with the following client (run against helloworldserver example):

// socket client for golang
// https://golangr.com
package main

import "net"
import "fmt"
import "bufio"
import "time"

func main() {

  // connect to server
  conn, _ := net.Dial("tcp", "127.0.0.1:8080")
  reader := bufio.NewReader(conn)

  go func () {
    for { 
      message, _ := reader.ReadString('\n')
      fmt.Printf("%s reply from server: %s", time.Now(), message)
    }
  } ()
  payload := "GET / HTTP/1.1\r\nhost: localhost\r\n\r\nGE"
  fmt.Fprintf(conn, payload)
  fmt.Printf("%s client sent: %s", time.Now(), payload)
  time.Sleep(10 * time.Second) 
  
  payload = "T / HTTP/1.1\r\nhost: localhost\r\n\r\n"
  fmt.Fprintf(conn, payload)
  fmt.Printf("%s client sent: %s", time.Now(), payload)
  time.Sleep(10 * time.Second) 

}

I would expect the first request to respond immediately, instead it is blocked until the second request is fully parsed. I confirmed via tcpdump that no response packet is sent for a full 10 seconds.

Output for the above is the following:

2021-06-07 13:42:41.375146 -0700 PDT m=+0.002485389 client sent: GET / HTTP/1.1
host: localhost

GE2021-06-07 13:42:51.375805 -0700 PDT m=+10.003223534 client sent: T / HTTP/1.1
host: localhost

2021-06-07 13:42:51.376106 -0700 PDT m=+10.003523940 reply from server: HTTP/1.1 200 OK
2021-06-07 13:42:51.376213 -0700 PDT m=+10.003631153 reply from server: Server: fasthttp
2021-06-07 13:42:51.376222 -0700 PDT m=+10.003639915 reply from server: Date: Mon, 07 Jun 2021 20:42:41 GMT
2021-06-07 13:42:51.376228 -0700 PDT m=+10.003645898 reply from server: Content-Type: text/plain; charset=utf8
2021-06-07 13:42:51.376233 -0700 PDT m=+10.003651249 reply from server: Content-Length: 457
2021-06-07 13:42:51.376248 -0700 PDT m=+10.003666252 reply from server: X-My-Header: my-header-value
2021-06-07 13:42:51.37626 -0700 PDT m=+10.003678224 reply from server: Set-Cookie: cookie-name=cookie-value
2021-06-07 13:42:51.376266 -0700 PDT m=+10.003684119 reply from server: 
2021-06-07 13:42:51.376272 -0700 PDT m=+10.003690002 reply from server: Hello, world!
2021-06-07 13:42:51.376278 -0700 PDT m=+10.003695596 reply from server: 
2021-06-07 13:42:51.376286 -0700 PDT m=+10.003703790 reply from server: Request method is "GET"
2021-06-07 13:42:51.376292 -0700 PDT m=+10.003709599 reply from server: RequestURI is "/"
2021-06-07 13:42:51.376298 -0700 PDT m=+10.003716135 reply from server: Requested path is "/"
2021-06-07 13:42:51.376304 -0700 PDT m=+10.003721616 reply from server: Host is "localhost"
2021-06-07 13:42:51.37631 -0700 PDT m=+10.003728093 reply from server: Query string is ""
2021-06-07 13:42:51.376316 -0700 PDT m=+10.003733513 reply from server: User-Agent is ""
2021-06-07 13:42:51.376339 -0700 PDT m=+10.003756996 reply from server: Connection has been established at 2021-06-07 13:42:41.374607 -0700 PDT m=+1743.737173557
2021-06-07 13:42:51.37635 -0700 PDT m=+10.003767764 reply from server: Request has been started at 2021-06-07 13:42:41.375181 -0700 PDT m=+1743.737747673
2021-06-07 13:42:51.376355 -0700 PDT m=+10.003773139 reply from server: Serial request number for the current connection is 1
2021-06-07 13:42:51.37636 -0700 PDT m=+10.003778111 reply from server: Your ip is "127.0.0.1"
2021-06-07 13:42:51.376365 -0700 PDT m=+10.003783169 reply from server: 
2021-06-07 13:42:51.37637 -0700 PDT m=+10.003788145 reply from server: Raw request is:
2021-06-07 13:42:51.376375 -0700 PDT m=+10.003793212 reply from server: ---CUT---
2021-06-07 13:42:51.37638 -0700 PDT m=+10.003798132 reply from server: GET / HTTP/1.1
2021-06-07 13:42:51.376386 -0700 PDT m=+10.003803539 reply from server: Host: localhost
2021-06-07 13:42:51.376391 -0700 PDT m=+10.003808755 reply from server: 
2021-06-07 13:42:51.376397 -0700 PDT m=+10.003815315 reply from server: 
2021-06-07 13:42:51.376403 -0700 PDT m=+10.003820696 reply from server: ---CUT---HTTP/1.1 200 OK
2021-06-07 13:42:51.376408 -0700 PDT m=+10.003825883 reply from server: Server: fasthttp
2021-06-07 13:42:51.376417 -0700 PDT m=+10.003834568 reply from server: Date: Mon, 07 Jun 2021 20:42:51 GMT
2021-06-07 13:42:51.376423 -0700 PDT m=+10.003840440 reply from server: Content-Type: text/plain; charset=utf8
2021-06-07 13:42:51.37643 -0700 PDT m=+10.003847483 reply from server: Content-Length: 457
2021-06-07 13:42:51.376436 -0700 PDT m=+10.003853503 reply from server: X-My-Header: my-header-value
2021-06-07 13:42:51.376442 -0700 PDT m=+10.003860314 reply from server: Set-Cookie: cookie-name=cookie-value
2021-06-07 13:42:51.376448 -0700 PDT m=+10.003866027 reply from server: 
2021-06-07 13:42:51.376456 -0700 PDT m=+10.003873585 reply from server: Hello, world!
2021-06-07 13:42:51.376461 -0700 PDT m=+10.003879020 reply from server: 
2021-06-07 13:42:51.376467 -0700 PDT m=+10.003884499 reply from server: Request method is "GET"
2021-06-07 13:42:51.376522 -0700 PDT m=+10.003940311 reply from server: RequestURI is "/"
2021-06-07 13:42:51.376544 -0700 PDT m=+10.003962110 reply from server: Requested path is "/"
2021-06-07 13:42:51.376553 -0700 PDT m=+10.003971169 reply from server: Host is "localhost"
2021-06-07 13:42:51.37656 -0700 PDT m=+10.003977898 reply from server: Query string is ""
2021-06-07 13:42:51.376566 -0700 PDT m=+10.003983786 reply from server: User-Agent is ""
2021-06-07 13:42:51.376573 -0700 PDT m=+10.003990454 reply from server: Connection has been established at 2021-06-07 13:42:41.374607 -0700 PDT m=+1743.737173557
2021-06-07 13:42:51.376578 -0700 PDT m=+10.003996189 reply from server: Request has been started at 2021-06-07 13:42:51.375906 -0700 PDT m=+1753.738551447
2021-06-07 13:42:51.376584 -0700 PDT m=+10.004001803 reply from server: Serial request number for the current connection is 2
2021-06-07 13:42:51.37659 -0700 PDT m=+10.004007581 reply from server: Your ip is "127.0.0.1"
2021-06-07 13:42:51.376598 -0700 PDT m=+10.004016335 reply from server: 
2021-06-07 13:42:51.376605 -0700 PDT m=+10.004022976 reply from server: Raw request is:
2021-06-07 13:42:51.376611 -0700 PDT m=+10.004029062 reply from server: ---CUT---
2021-06-07 13:42:51.376622 -0700 PDT m=+10.004039669 reply from server: GET / HTTP/1.1
2021-06-07 13:42:51.376633 -0700 PDT m=+10.004050679 reply from server: Host: localhost
2021-06-07 13:42:51.376639 -0700 PDT m=+10.004056976 reply from server: 
2021-06-07 13:42:51.376648 -0700 PDT m=+10.004065554 reply from server: 
@wcs1only
Copy link
Author

wcs1only commented Jun 7, 2021

So I did some step debugging and I found this:
https://github.com/valyala/fasthttp/blob/master/server.go#L2273-L2282

Which looks like this is a known (and intentional?) behavior. Would you accept a PR to make this behavior configurable? Or perhaps, add a very short timer event (a few ms, maybe?) to allow response packing without waiting for the unfinished request to arrive indefinitely?

@wcs1only
Copy link
Author

wcs1only commented Jun 7, 2021

A bit of context about how we discovered this, might provide a little motivation for why this behavior is undesirable. A buggy client was setting a Content-Length header that was too small on a PUT request with a JSON body. The extra few bytes are interpreted by the server as the beginning of the next HTTP request. Rather than responding right away with a 400 when our callback failed to parse the PUT body, the server response hangs until ReadTimeout is reached.

@kirillDanshin
Copy link
Collaborator

That's a tricky case you found here, thank you for the report. While it sounds like we have to provide a way to fix this, I'm not sure how we can do that while keeping fasthttp behavior backward compatible. I would appreciate a PR, let's see if we can fix that

@wcs1only
Copy link
Author

wcs1only commented Jun 8, 2021

So I know how I'd fix this in C. I'd do a non-blocking read of the socket, and if I got EAGAIN, I'd know that the OS was still waiting on packets from the network. That'd be an easy way to tell the difference between "waiting on network" and "there are more pipelined requests in the buffer". It seems that Go doesn't have a non-blocking way to do this and the buffer peek operation is always blocking. If you know a good way to do this in Go, I'd love to hear it.

If we can't do some version of the above, then perhaps adding a very short (like ~1-10ms ?) timer event to flush the buffer if the unfinished pipelined request still hasn't arrived yet? That might give us a window to pack in a few pipelined responses without costing an entire round trip, or more in time to first byte on the first request? Does this sound sane?

Third option would be to add a configurable "FlushBufferAfterPipelinedRequests" option, or something like that, which would just flush the buffer every time for users that care more about time to first byte rather than packing more responses into a TCP packet.

@erikdubbelboer
Copy link
Collaborator

EAGAIN is not an indicator that more data is coming so its impossible to use that.

I would prefer to have an option, I don't see any way to automatically fix this.

@wcs1only what I'm confused about is why you have to wait for ReadTimeout in your case. Unless the extra few bytes happen to be a correct start of a HTTP request the parser should give an error and return a response immediately.

@wcs1only
Copy link
Author

wcs1only commented Jun 9, 2021

EAGAIN is not an indicator that more data is coming so its impossible to use that.

I would prefer to have an option, I don't see any way to automatically fix this.

What EAGAIN would tell you, is that the kernel hasn't processed any more bytes from the network. It doesn't need to know that no more data is coming, just so long as all of your pipelined requests fit within the first congestion window, it is reasonable to expect that all of them would be received within the window of servicing the first response, thus not regressing on the benchmark. That's the hypothesis anyway. If I knew how to do that in Go, we wouldn't have to speculate, we could run the benchmark in question and see if it regresses.

@wcs1only what I'm confused about is why you have to wait for ReadTimeout in your case. Unless the extra few bytes happen to be a correct start of a HTTP request the parser should give an error and return a response immediately.

My experimentation suggests otherwise. The parser does not seem to bail early with an error. It remains hung until it gets an \r\n. If \r\n is never sent, the replies are blocked until ReadTimeout, a connection close, or forever whichever is first.

You can repeat my experiment with the following code:

// socket client for golang
// https://golangr.com
package main

import "net"
import "fmt"
import "bufio"
import "time"

func main() {

  // connect to server
  conn, _ := net.Dial("tcp", "127.0.0.1:8080")
  reader := bufio.NewReader(conn)

  go func () {
    for { 
      message, _ := reader.ReadString('\n')
      if len(message) > 0 {
	fmt.Printf("%s reply from server: %s", time.Now(), message)
      }
    }
  } ()
  payload := "GET / HTTP/1.1\r\nhost: localhost\r\n\r\nasdklfjhsdjfh@(#$*$&WSEJVHBSDFasldkfjlskdjflkjsdflkjsdkfjkhasdkfjhjkhasdfkjhsdjfhksjdhfkjshdfkjh"
  fmt.Fprintf(conn, payload)
  fmt.Printf("%s client sent: %s", time.Now(), payload)
  time.Sleep(10 * time.Second) 

  payload = "\r\n"
  fmt.Fprintf(conn, payload)
  fmt.Printf("%s client sent: %s", time.Now(), payload)
  time.Sleep(10 * time.Second) 

}

Running this code gives the following:

2021-06-09 07:56:33.678629 -0700 PDT m=+0.000604236 client sent: GET / HTTP/1.1
host: localhost

asdklfjhsdjfh@(#$*$&WSEJVHBSDFasldkfjlskdjflkjsdflkjsdkfjkhasdkfjhjkhasdfkjhsdjfhksjdhfkjshdfkjh2021-06-09 07:56:43.681598 -0700 PDT m=+10.003462566 client sent: 
2021-06-09 07:56:43.68187 -0700 PDT m=+10.003734519 reply from server: HTTP/1.1 200 OK
2021-06-09 07:56:43.681894 -0700 PDT m=+10.003758150 reply from server: Server: fasthttp
2021-06-09 07:56:43.681907 -0700 PDT m=+10.003771280 reply from server: Date: Wed, 09 Jun 2021 14:56:33 GMT
2021-06-09 07:56:43.681914 -0700 PDT m=+10.003777911 reply from server: Content-Type: text/plain; charset=utf8
2021-06-09 07:56:43.681953 -0700 PDT m=+10.003816820 reply from server: Content-Length: 456
2021-06-09 07:56:43.681972 -0700 PDT m=+10.003836164 reply from server: X-My-Header: my-header-value
2021-06-09 07:56:43.681979 -0700 PDT m=+10.003843043 reply from server: Set-Cookie: cookie-name=cookie-value
2021-06-09 07:56:43.681985 -0700 PDT m=+10.003849132 reply from server: 
2021-06-09 07:56:43.681992 -0700 PDT m=+10.003855620 reply from server: Hello, world!
2021-06-09 07:56:43.682023 -0700 PDT m=+10.003887671 reply from server: 
2021-06-09 07:56:43.682043 -0700 PDT m=+10.003906896 reply from server: Request method is "GET"
2021-06-09 07:56:43.682051 -0700 PDT m=+10.003914924 reply from server: RequestURI is "/"
2021-06-09 07:56:43.682058 -0700 PDT m=+10.003921666 reply from server: Requested path is "/"
2021-06-09 07:56:43.682064 -0700 PDT m=+10.003927830 reply from server: Host is "localhost"
2021-06-09 07:56:43.682071 -0700 PDT m=+10.003934649 reply from server: Query string is ""
2021-06-09 07:56:43.682077 -0700 PDT m=+10.003940805 reply from server: User-Agent is ""
2021-06-09 07:56:43.682084 -0700 PDT m=+10.003947876 reply from server: Connection has been established at 2021-06-09 07:56:33.67863 -0700 PDT m=+1241.087259450
2021-06-09 07:56:43.68209 -0700 PDT m=+10.003954064 reply from server: Request has been started at 2021-06-09 07:56:33.678643 -0700 PDT m=+1241.087272529
2021-06-09 07:56:43.682098 -0700 PDT m=+10.003961908 reply from server: Serial request number for the current connection is 1
2021-06-09 07:56:43.682103 -0700 PDT m=+10.003967294 reply from server: Your ip is "127.0.0.1"
2021-06-09 07:56:43.682109 -0700 PDT m=+10.003973101 reply from server: 
2021-06-09 07:56:43.682117 -0700 PDT m=+10.003981172 reply from server: Raw request is:
2021-06-09 07:56:43.682123 -0700 PDT m=+10.003987124 reply from server: ---CUT---
2021-06-09 07:56:43.682131 -0700 PDT m=+10.003994694 reply from server: GET / HTTP/1.1
2021-06-09 07:56:43.682136 -0700 PDT m=+10.003999825 reply from server: Host: localhost
2021-06-09 07:56:43.682142 -0700 PDT m=+10.004005982 reply from server: 
2021-06-09 07:56:43.68215 -0700 PDT m=+10.004013652 reply from server: 
2021-06-09 07:56:43.682155 -0700 PDT m=+10.004019411 reply from server: ---CUT---HTTP/1.1 400 Bad Request
2021-06-09 07:56:43.682162 -0700 PDT m=+10.004026107 reply from server: Server: fasthttp
2021-06-09 07:56:43.682168 -0700 PDT m=+10.004031902 reply from server: Date: Wed, 09 Jun 2021 14:56:43 GMT
2021-06-09 07:56:43.682176 -0700 PDT m=+10.004040100 reply from server: Content-Type: text/plain; charset=utf-8
2021-06-09 07:56:43.682184 -0700 PDT m=+10.004047777 reply from server: Content-Length: 26
2021-06-09 07:56:43.682206 -0700 PDT m=+10.004070314 reply from server: Connection: close
2021-06-09 07:56:43.682218 -0700 PDT m=+10.004082010 reply from server: 
2021-06-09 07:56:43.682229 -0700 PDT m=+10.004093185 reply from server: Error when parsing request

@wcs1only
Copy link
Author

wcs1only commented Jun 9, 2021

I think this is largely due to the fact that the parser seems to accept any string as an HTTP verb, and any string as a URL. In fact, the only way that I could get it to even generate a 400 at all is if I didn't include a space before \r\n. It would seem that just about any string that contains at least one space delimiter is accepted as a valid (if partial) HTTP request. There is nothing wrong with being more permissive than the HTTP spec here, but I think doing so means that the resulting behavior of responses never actually being sent needs to be addressed. I'm working on a timer delayed write buffer flush patch that I think should address the issue without compromising the benchmark performance that this behavior was created to fix.

21-06-09 12:25:54.972115 -0700 PDT m=+10.006704000 reply from server: ---CUT---HTTP/1.1 200 OK                                                                                                                      
2021-06-09 12:25:54.972119 -0700 PDT m=+10.006707640 reply from server: Server: fasthttp                                                                                                                              
2021-06-09 12:25:54.972123 -0700 PDT m=+10.006711286 reply from server: Date: Wed, 09 Jun 2021 19:25:54 GMT                                                                                                           
2021-06-09 12:25:54.972126 -0700 PDT m=+10.006715165 reply from server: Content-Type: text/plain; charset=utf8                                                                                                        
2021-06-09 12:25:54.972131 -0700 PDT m=+10.006719830 reply from server: Content-Length: 583                                                                                                                           
2021-06-09 12:25:54.972191 -0700 PDT m=+10.006779746 reply from server: X-My-Header: my-header-value                                                                                                                  
2021-06-09 12:25:54.972197 -0700 PDT m=+10.006785366 reply from server: Set-Cookie: cookie-name=cookie-value                                                                                                          
2021-06-09 12:25:54.972203 -0700 PDT m=+10.006791872 reply from server: Connection: close                                                                                                                             
2021-06-09 12:25:54.972207 -0700 PDT m=+10.006795826 reply from server:                                                                                                                                               
2021-06-09 12:25:54.972211 -0700 PDT m=+10.006799371 reply from server: Hello, world!                                                                                                                                 
2021-06-09 12:25:54.972214 -0700 PDT m=+10.006802974 reply from server:                                                                                                                                               
2021-06-09 12:25:54.972218 -0700 PDT m=+10.006806558 reply from server: Request method is "asdklfjhsdjfh@(#$*$&WSEJVHBSDFasldkfj"                                                                                     
2021-06-09 12:25:54.972222 -0700 PDT m=+10.006810566 reply from server: RequestURI is "lskdjflkjsdflkjsdkfjkhasdk!@#$%!^(MISSING)^&**())!@(*&!*@&^$}{\":><?"
2021-06-09 12:25:54.972226 -0700 PDT m=+10.006814497 reply from server: Requested path is "/lskdjflkjsdflkjsdkfjkhasdk!@" 

Basically the fix would look something like this. Disclaimer: I don't think this patch is safe as written, but it illustrates the fix I am thinking:

diff --git a/server.go b/server.go
index 82eeef7..6d3b68a 100644
--- a/server.go
+++ b/server.go
@@ -2280,6 +2280,14 @@ func (s *Server) serveConn(c net.Conn) (err error) {
 				if err != nil {
 					break
 				}
+			} else if br.Buffered() != 0 {
+				flushTimer := time.NewTimer(10 * time.Millisecond)
+				go func () {
+					<-flushTimer.C
+					if bw != nil {
+						bw.Flush()
+					}
+				}()
 			}
 			if connectionClose {
 				break

@erikdubbelboer
Copy link
Collaborator

You're right it waits for the \n here:

fasthttp/header.go

Lines 2290 to 2292 in 6233fbc

nNext := bytes.IndexByte(b, nChar)
if nNext < 0 {
return nil, nil, errNeedMore

In theory we could read a couple of bytes and see if they are ASCII as the method is the first token and should be all ASCII.

I get what you want with EAGAIN and I think we could do this.
What you want is here:

fasthttp/header.go

Lines 1503 to 1507 in 6233fbc

if err != errNeedMore {
h.resetSkipNormalize()
return err
}
n = r.Buffered() + 1

When we get an errNeedMore here we don't want to keep reading in a loop, instead we want to break out to here:
if err = ctx.Request.Header.Read(br); err == nil {

Where we can flush the send buffer if needed and then retry the Header.Read.
But this will require some bigger changes I think. I don't have time for this right now but I will have in the near future.

erikdubbelboer added a commit that referenced this issue Jun 18, 2021
Don't wait for the next request as this can take some time, instead
flush the outstanding responses already.

Fixes #1043
@erikdubbelboer
Copy link
Collaborator

@wcs1only this is what I had in mind: #1050 Would you be willing to test this branch on your end to see if it completely fixes your problem?

@wcs1only
Copy link
Author

wcs1only commented Jun 25, 2021

@erikdubbelboer Awesome, this does seem to fix the issue and deliver the desired behavior for most cases. There is still an outstanding case when the buffer contains <4 bytes, it seems to get blocked here:

b, err = br.Peek(4)

Code to reproduce:

// socket client for golang
// https://golangr.com
package main

import "net"
import "fmt"
import "bufio"
import "time"

func main() {

  // connect to server
  conn, _ := net.Dial("tcp", "127.0.0.1:8080")
  reader := bufio.NewReader(conn)

  go func () {
    for { 
      message, _ := reader.ReadString('\n')
      if len(message) > 0 {
	fmt.Printf("%s reply from server: %s", time.Now(), message)
      }
    }
  } ()
  payload := "GET / HTTP/1.1\r\nhost: localhost\r\n\r\na "
  fmt.Fprintf(conn, payload)
  fmt.Printf("%s client sent: %s", time.Now(), payload)
  time.Sleep(100 * time.Second) 

  payload = "fooo\r\n\r\n"
  fmt.Fprintf(conn, payload)
  fmt.Printf("%s client sent: %s", time.Now(), payload)
  time.Sleep(10 * time.Second) 


}

erikdubbelboer added a commit that referenced this issue Jun 25, 2021
Don't wait for the next request as this can take some time, instead
flush the outstanding responses already.

Fixes #1043
@erikdubbelboer
Copy link
Collaborator

You're right, I have fixed it in my pull. I'm going to merge it on Monday and do a new release on Wednesday.

erikdubbelboer added a commit that referenced this issue Jun 28, 2021
* Flush buffered responses if we have to wait for the next request

Don't wait for the next request as this can take some time, instead
flush the outstanding responses already.

Fixes #1043

* Only peek 1 byte

Make sure old clients that send bogus \r\n still work.
See: golang/go@bf5e19f
@wcs1only wcs1only mentioned this issue Jul 14, 2021
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants