Skip to content
This repository has been archived by the owner on Feb 13, 2024. It is now read-only.

Timer is scheduled to flush even if queue is empty #121

Closed
chris-pardy opened this issue Aug 28, 2017 · 16 comments · Fixed by #172
Closed

Timer is scheduled to flush even if queue is empty #121

chris-pardy opened this issue Aug 28, 2017 · 16 comments · Fixed by #172

Comments

@chris-pardy
Copy link

At the end of the enqueue method a timer is started to flush the queue. However if the queue has already reached the flushAt length this is not necessary.

Even if this behavior is expected we'd like a mechanism to turn off the timer based flushing since we're running on lambda and want to more aggressively flush.

@f2prateek
Copy link
Contributor

Tracking internally https://segment.atlassian.net/browse/LIB-119.

A workaround for now seems to use a really high value for your timer? I'm not sure what the downside of starting the timer is - it seems harmless in this case.

@chris-pardy
Copy link
Author

@f2prateek our primary issue with starting the timer is that the lambda runtime waits until the event loop is empty before sending the response to a request. Since the timer is left in the event loop this timer was adding several seconds to our response time.

@hey-aw
Copy link

hey-aw commented Apr 15, 2018

Having the same issue with Lambda. Flush timer causes node to hang and not respond to the next request. The observed result is that every other Lambda request times out unless you set the lambda timeout really high (11 s).

@hey-aw
Copy link

hey-aw commented Apr 18, 2018

I've been able to get analytics-node to play nicely with Lambda by passing a very small number for the time value of the (undocumented) flushInterval option:

const analytics = new Analytics(ANALYTICS_KEY, { flushAfter: 1, flushInterval: .000000000001 })

The events are coming through, and I haven't gotten a timeout from Lambda yet. My Lambda timeout is set to 7s.

@penkeysuresh
Copy link

Along with the above setting, do note that context.callbackWaitsForEmptyEventLoop parameter should be set to true (default value). If it's not set to true, Lambda will go into frozen state, and it may result in missing events if AWS Lambda chooses not to use the frozen process the next time lambda gets invoked. Do refer to the documentation for more details

@f2prateek
Copy link
Contributor

The flushInterval is very much documented and part of the public API. Documented both here https://github.com/segmentio/analytics-node/blob/master/index.js#L25 and https://segment.com/docs/sources/server/node/#configuration.

@chris-pardy
Copy link
Author

@f2prateek Regardless I think using the flush interval and setting a timer is wrong. The timer will just no-op (because the flush has already been invoked) while adding something to the event loop that will prevent node from exiting. I think what is needed is a a pre-emptive return after line 213 of index.js (similar to the pre-emptive return on line 209) I can open a PR to resolve that later today if you're ok with that direction.

Something like this

 if (!this.flushed) {
   this.flushed = true
   this.flush()
   return
 }

 if (this.queue.length >= this.flushAt) {
    this.flush()
+  return 
 }

  if (this.flushInterval && !this.timer) {
    this.timer = setTimeout(this.flush.bind(this), this.flushInterval)
 }

@f2prateek
Copy link
Contributor

I think that should be fine, I would love to have a few folks using lambda confirm that this approach works though.

@chris-pardy
Copy link
Author

@f2prateek opened PR #172 for this issue. Can't seem to add reviewers.

@chris-pardy
Copy link
Author

@AWzone @penkeysuresh could you guys take a look at the PR, and maybe help with testing it.

@penkeysuresh
Copy link

penkeysuresh commented May 29, 2018

@chris-pardy sorry for the delay, we went ahead with segment http api for our use case. Using an SDK which is maintaining a queue inside Lambda didn't make sense for us.

@alex-stabile
Copy link

I've run into this issue as well when using the SDK in a Lambda environment – we work around it by explicitly calling flush() when done enqueueing , since per the source ( https://github.com/segmentio/analytics-node/blob/master/index.js#L235-L238 ) it seems that the flush method clears any existing interval timer.

I think it would make sense to have some way to disable the flushInterval, maybe by setting to null in the config.

@simonprickett
Copy link

I wasted a lot of time wondering what was going on with Lambda and the Node API today before stumbling on this issue. Eventually I stopped using the Node API and made a HTTP call directly instead.

@uccmen
Copy link

uccmen commented Aug 6, 2019

followed the footsteps of @penkeysuresh & @simonprickett after spending a lot of hours debugging the intermittent case of event delivery failure to segment using the node api.

@himanshu-onscale
Copy link

I'd like to echo some of the comments made here. I attempted to follow the methods laid out here by setting the flushInterval and flushAt parameters and manually calling flush() but event tracking was still inconsistent compared to the standard HTTP API.

@leonkyr
Copy link

leonkyr commented Apr 2, 2020

Calling flush manually is not optimal and reliability is important here..

@nd4p90x nd4p90x added this to To do in analytics-node via automation Jul 26, 2021
@nd4p90x nd4p90x moved this from To do to In progress in analytics-node Jul 26, 2021
analytics-node automation moved this from In progress to Done Aug 18, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
Development

Successfully merging a pull request may close this issue.

9 participants