
Intermittent Error: write EPIPE when running stripe client in AWS Lambda #1040

Closed
hisham opened this issue Oct 13, 2020 · 14 comments · Fixed by #1336 or #1803


hisham commented Oct 13, 2020

We're using the Stripe Node client 8.71.0 in an AWS Lambda running Node 12.x. A stripe customers.list call is made first thing when the Lambda executes. About 33% of the time we get the error below on that call. It happens consistently, so it does not seem to be transient.

I did read #650, and setting maxNetworkRetries in stripe to 2 seems to resolve the issue. However, that seems to just mask the issue rather than fix it.
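
For reference, we're enabling the retries roughly like this (a minimal sketch; the env var name is ours and the value is just what we settled on):

```js
// Minimal sketch of how we configure the client; the env var name is ours.
const stripe = require('stripe')(process.env.STRIPE_SECRET_KEY, {
  maxNetworkRetries: 2, // one or two retries is enough to hide the EPIPE error
});
```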

Is this a Stripe issue or an AWS Lambda issue? Probably Lambda; I've submitted a support request with AWS. But I'm posting it here in case others run into it.

2020-10-13T12:02:58.032Z c184006d-fe96-490a-9bfe-696b8271769a ERROR StripeConnectionError: An error occurred with our connection to Stripe.
at /var/task/node_modules/stripe/lib/StripeResource.js:234:9
at ClientRequest.<anonymous> (/var/task/node_modules/stripe/lib/StripeResource.js:489:67)
at ClientRequest.emit (events.js:315:20)
at ClientRequest.EventEmitter.emit (domain.js:483:12)
at TLSSocket.socketErrorListener (_http_client.js:426:9)
at TLSSocket.emit (events.js:315:20)
at TLSSocket.EventEmitter.emit (domain.js:483:12)
at emitErrorNT (internal/streams/destroy.js:92:8)
at emitErrorAndCloseNT (internal/streams/destroy.js:60:3)
at processTicksAndRejections (internal/process/task_queues.js:84:21) {
type: 'StripeConnectionError',
raw: {
message: 'An error occurred with our connection to Stripe.',
detail: Error: write EPIPE
at WriteWrap.onWriteComplete [as oncomplete] (internal/stream_base_commons.js:92:16)
at writevGeneric (internal/stream_base_commons.js:132:26)
at TLSSocket.Socket._writeGeneric (net.js:784:11)
at TLSSocket.Socket._writev (net.js:793:8)
at doWrite (_stream_writable.js:401:12)
at clearBuffer (_stream_writable.js:519:5)
at TLSSocket.Writable.uncork (_stream_writable.js:338:7)
at ClientRequest.end (_http_outgoing.js:774:17)
at ClientRequest.<anonymous> (/var/task/node_modules/stripe/lib/StripeResource.js:506:15)
at Object.onceWrapper (events.js:422:26) {
errno: 'EPIPE',
code: 'EPIPE',
syscall: 'write'
}
},
rawType: undefined,
code: undefined,
doc_url: undefined,
param: undefined,
detail: Error: write EPIPE
at WriteWrap.onWriteComplete [as oncomplete] (internal/stream_base_commons.js:92:16)
at writevGeneric (internal/stream_base_commons.js:132:26)
at TLSSocket.Socket._writeGeneric (net.js:784:11)
at TLSSocket.Socket._writev (net.js:793:8)
at doWrite (_stream_writable.js:401:12)
at clearBuffer (_stream_writable.js:519:5)
at TLSSocket.Writable.uncork (_stream_writable.js:338:7)
at ClientRequest.end (_http_outgoing.js:774:17)
at ClientRequest.<anonymous> (/var/task/node_modules/stripe/lib/StripeResource.js:506:15)
at Object.onceWrapper (events.js:422:26) {
errno: 'EPIPE',
code: 'EPIPE',
syscall: 'write'
},
headers: undefined,
requestId: undefined,
statusCode: undefined,
charge: undefined,
decline_code: undefined,
payment_intent: undefined,
payment_method: undefined,
setup_intent: undefined,
source: undefined
}

paulasjes-stripe self-assigned this Oct 14, 2020

paulasjes-stripe commented Oct 14, 2020

We've seen this before with AWS Lambda and believe it's an issue/configuration setting on their end. Using maxNetworkRetries seems to do the trick in most cases, but as you correctly stated it's more masking the problem than solving it.

When you hear back from AWS would you mind updating this issue with your findings?


hisham commented Oct 14, 2020

Yeah, I have an AWS premium support subscription, so I should have a response soon.

I did find similar issues that people have reported with other libs:

So my latest theory is that it's something related to keep-alive and sockets expiring, but at this point I've added the retry and I'm waiting for AWS to get back to me.


hisham commented Oct 14, 2020

Hi @paulasjes-stripe - here's the response we got from AWS:

Starting with the error, "EPIPE" error [0] is generally caused when data is piped into closed streams [1]. In the case of the NodeJS Lambda function, the error might be caused when the NodeJS event loop didn't clean-up closed TCP connections from the HTTP connection pool and then the NodeJS runtime attempted to use the closed TCP connection.

To understand the error better, below is what happens behind the scenes:

  • AWS Lambda function runs in an isolated container and usually each Invoke starts a new Lambda function execution in a new container.
  • However, if delay between two requests is very small, then the container used by the previous Invoke might be reused to cater to the later request as well. This is known as container reuse [2].
  • While finishing execution, Lambda does not consider the state of active processes in background other than handler function. Thus, when the execution is finished, the active processes turn into frozen state.
  • When the next request is processed by the container, the previously frozen asynchronous processes are started again.
  • If any of the frozen processes has dependency on the piping/streaming, then that process fails to continue execution as it does not find the pipeline/connection/stream it used in previous request.

To avoid these errors the following is suggested:

  1. Revisit the function code and ensure that the processes (dependent on connection/stream) are finished before lambda completes execution.
  2. Use the retry which will create new connection/stream for new request.

I hope the above information gives an idea on EPIPE errors and why adding retries may help in resolving the EPIPE errors.

However, If there are any further queries/concerns please let me know and I will be happy to assist.

References:
[0] https://nodejs.org/api/errors.html
[1] EPIPE error - nodejs/node#947
[2] https://aws.amazon.com/blogs/compute/container-reuse-in-lambda/

So I'll just use Stripe's retry logic for now, as I don't seem to have control over Stripe's background processes. Is it the keep-alive connection that is causing this issue? Not sure.

Our Lambda is very simple; it basically just returns the results from this line:

await this.stripeClient.customers.list({ email })

It's a 2048 MB Lambda running Node.js 12. It is called via a GraphQL function transformer (https://docs.amplify.aws/cli/function), but I don't think those details matter much.

Interestingly, I have other Lambdas that also call the above REST API, but they have other network calls and more involved logic, and I've never run into the EPIPE issue with them before.
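
For context, the handler is roughly shaped like this (a simplified sketch; the event shape and names are illustrative, since the real wiring is generated by Amplify):

```js
const stripe = require('stripe')(process.env.STRIPE_SECRET_KEY, {
  maxNetworkRetries: 2,
});

// Simplified sketch only: the real handler is wired up by the Amplify
// GraphQL function transformer, so the event shape here is illustrative.
exports.handler = async (event) => {
  const { email } = event.arguments || event;
  // The only network call this Lambda makes.
  const customers = await stripe.customers.list({ email });
  return customers.data;
};
```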

@paulasjes-stripe
Contributor

Thanks @hisham! We're going to look into this to see if there's anything that can be done from our end, but it looks like maxNetworkRetries is a suitable workaround for now.


hisham commented Oct 16, 2020

Great. Yes, maxNetworkRetries does the job. AWS seems to agree with me that calling the destroy method on the HTTP agent before the Lambda exits will probably also resolve this issue:

It is mentioned in AWS Lambda Best Practices [1][2] to use a keep-alive directive to maintain persistent connections. Quoting from the documentation:
Lambda purges idle connections over time. Attempting to reuse an idle connection when invoking a function will result in a connection error. To maintain your persistent connection, use the keep-alive directive associated with your runtime.

However, in certain situations, depending upon the time difference between two Lambda invocations, there is a chance that an idle connection is still present, causing the error.

Therefore, it sounds right to use agent.destroy() before exiting the Lambda to destroy all connections. But it needs to be made sure that the code to close/destroy all connections is executed before the Lambda exits. This would ensure that the socket connections are not left hanging open.

As a workaround, retries, as you mentioned, have been found to work fine.

I hope this information helps. However, if there are any further queries please let me know.

[1] https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html
[2] https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/node-reusing-connections.html
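
In code, AWS's suggestion boils down to something like this (a sketch of the idea, not what we actually shipped; we stayed with maxNetworkRetries):

```js
const https = require('https');

// One keep-alive agent shared across invocations of a warm container.
const agent = new https.Agent({ keepAlive: true });
const stripe = require('stripe')(process.env.STRIPE_SECRET_KEY, {
  httpAgent: agent,
});

// Event shape is illustrative.
exports.handler = async (event) => {
  try {
    const customers = await stripe.customers.list({ email: event.email });
    return customers.data;
  } finally {
    // AWS's suggestion: tear down every socket before the runtime freezes the
    // container, so the next invocation can never reuse a stale connection.
    agent.destroy();
  }
};
```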

@huntedman

@hisham Do you happen to know if I can wrap my Stripe method calls inside try/catch if I want to use maxNetworkRetries?
I'm also using AWS Lambdas, and I'm worried that it will prematurely exit in that case...


hisham commented Nov 8, 2020

@huntedman we are using maxNetworkRetries and are not wrapping calls in try/catch. Stripe seems to handle this stuff internally.
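
So the call sites look roughly like this (a sketch; names are illustrative). If the request still fails after the retries, the rejection just propagates and the invocation errors out:

```js
const stripe = require('stripe')(process.env.STRIPE_SECRET_KEY, {
  maxNetworkRetries: 2,
});

// No try/catch is needed just for the retries: stripe-node retries the request
// internally (up to maxNetworkRetries) and only rejects once those are spent.
// Wrapping the call in try/catch is still fine if you want a custom error
// response; it doesn't interfere with the retry behavior.
async function listCustomersByEmail(email) {
  const customers = await stripe.customers.list({ email });
  return customers.data;
}

module.exports = { listCustomersByEmail };
```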

@suz-stripe

Hi @hisham, sorry for the radio silence about this recently, but I'm checking in with a quick update. The response from AWS was very helpful to us (thank you!) and we're actively investigating this issue to provide a better fix than our suggested workaround. When we know more we'll definitely update you here again via this open issue. Thank you for your patience!


richardm-stripe commented Dec 19, 2020

I've spent some time experimenting with AWS Lambda, and have a better understanding of these errors. They are happening due to the interaction between

  1. how Lambda freezes/unfreezes processes, and
  2. how stripe-node (by default) uses a single HTTP agent with keep-alive enabled.

In case you're not familiar, keep-alive is a way for http clients like stripe-node to be more efficient when your application is making multiple requests to Stripe. Rather than making a new connection for each request, which has a performance cost, it keeps the connection to the server open after a request is finished, so that it can be reused on the next request.
In order for keep-alive to work, the open connection must ping the server every so often to let the server know that it is still active. If it doesn't, the server will assume the connection isn't active anymore and close the connection to make room for others.

The problem arises when Lambda freezes your Node process. While the process is frozen, the TCP connections can't ping the server to remain active, and the server closes them. When Lambda unfreezes your process, Node isn't aware that the connections have been closed, and it attempts to re-use them. As soon as it does, it gets EPIPE or ECONNRESET.

One option for eliminating these errors would be to disable keep-alive when you initialize stripe-node.

const https = require('https')
const stripe = require('stripe')('sk_live_xyz', {httpAgent: new https.Agent({keepAlive: false})})

This does mean sacrificing the benefits of keep-alive, but I expect that's an acceptable trade-off especially for low-traffic lambdas.

Another possibility would be initializing a new Stripe client with its own keep-alive-enabled agent inside the Lambda handler. This is roughly equivalent to Amazon's suggestion of calling .destroy on the http agent before exiting, but this isn't ideal either because it only allows you to re-use connections within each individual Lambda invocation, and not from one Lambda invocation to the next.
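
A rough sketch of that second approach (the handler shape and env var name are just for illustration):

```js
const https = require('https');
const Stripe = require('stripe');

// Handler and event shape are illustrative.
exports.handler = async (event) => {
  // Fresh agent and client per invocation: connections are reused for any
  // requests made during this invocation, but nothing is left behind to go
  // stale while the container is frozen.
  const agent = new https.Agent({ keepAlive: true });
  const stripe = Stripe(process.env.STRIPE_SECRET_KEY, { httpAgent: agent });
  try {
    const customers = await stripe.customers.list({ email: event.email });
    return customers.data;
  } finally {
    agent.destroy();
  }
};
```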

From my perspective, handling these errors by retrying is likely the proper approach, and shouldn't necessarily be viewed as a workaround or as masking an underlying issue. It is expected and unavoidable that these broken connections will come to exist, and there doesn't seem to be an obvious way of asking Node "how long has it been since the last keep-alive probe on this connection" besides writing to the connection and triggering the error.

At the same time, I think we should look into the possibility of making stripe-node handle errors like this by default/more transparently, so that users don't have to configure the retries themselves. That seems to be what Amazon started doing for errors like this for their own SDKs about a month ago (thank you @hisham for linking to that issue, by the way).

Anyway I hope this clarifies things and we'll keep you posted.


theoBLT commented Feb 9, 2021

Thank you for opening this thread! I had the same issue on a very low-traffic site (a side project). I used Stripe's Node library inside Netlify Functions and got 502 errors with the error message write EPIPE in the Netlify function logs.

I moved forward with the fix you recommended @richardm-stripe, but the syntax didn't work. The below worked though:

const https = require('https');
const stripe = require('stripe')('secret_key_xyz', {
  httpAgent: new https.Agent({keepAlive: false})
});

theoBLT added a commit to theoBLT/indonesian that referenced this issue Feb 9, 2021
@richardm-stripe
Contributor

Thanks @theoBLT, I've corrected the syntax in the original comment.

bpinto added a commit to bpinto/stripe-node that referenced this issue Jan 21, 2022
Requests that fail with closed connection errors (ECONNRESET, EPIPE) are
automatically retried.

- `ECONNRESET` (Connection reset by peer): A connection was forcibly
  closed by a peer. This normally results from a loss of the connection
  on the remote socket due to a timeout or reboot. Commonly encountered
  via the http and net modules.

- `EPIPE` (Broken pipe): A write on a pipe, socket, or FIFO for which
  there is no process to read the data. Commonly encountered at the net
  and http layers, indicative that the remote side of the stream being
  written to has been closed.

Fixes: stripe#1040
richardm-stripe pushed a commit that referenced this issue May 9, 2022
* feat(http-client): Retry requests that failed with closed connection

Fixes: #1040
@richardm-stripe
Contributor

Oof, #1336 claimed to fix this, so it auto-closed, but I disagree that it's entirely fixed until retries are enabled by default.

pakrym-stripe added a commit that referenced this issue Jun 8, 2022
* feat(http-client): retry closed connection errors (#1336)

Fixes: #1040

* Bump version to 9.0.0

FeliceGeracitano commented Jul 24, 2022

I also got this issue in v8; I upgraded to v9 and all looks good now.

Automatic retry is in place now for CONNECTION_CLOSED_ERROR_CODES --> 47776ef
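
Conceptually the check behaves something like this (an illustrative sketch, not the actual stripe-node source):

```js
// Illustrative sketch only, not the actual stripe-node implementation.
const CONNECTION_CLOSED_ERROR_CODES = ['ECONNRESET', 'EPIPE'];

function shouldRetryConnectionError(error, numRetries, maxNetworkRetries) {
  // Retry when the socket was closed underneath us and retries remain.
  return (
    error != null &&
    CONNECTION_CLOSED_ERROR_CODES.includes(error.code) &&
    numRetries < maxNetworkRetries
  );
}

// e.g. shouldRetryConnectionError({ code: 'EPIPE' }, 0, 1) === true
```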

@anniel-stripe
Contributor

Hello! maxNetworkRetries has been set to 1 by default with the release of stripe-node v13 today (enabled by this change). I'll be closing this issue, as the default behavior in v13 should prevent this error.
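
For anyone who wants a different number of retries than the new default, it can still be set explicitly (a sketch; the env var name is illustrative):

```js
const stripe = require('stripe')(process.env.STRIPE_SECRET_KEY, {
  // v13 already defaults to 1; override only if you want a different value.
  maxNetworkRetries: 3,
});
```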
