This repository has been archived by the owner on Feb 13, 2024. It is now read-only.

Support for serverless environments? #245

Closed
Blatty opened this issue Jul 13, 2020 · 19 comments · Fixed by #281

Comments

@Blatty

Blatty commented Jul 13, 2020

Hello,

First of all, thank you to the segment team for providing this library.

I'm currently trying to implement analytics-node in my serverless app (GraphQL + now.sh, now known as Vercel).

I tried using a shorter flushInterval as mentioned in this comment: #121 (comment), but it looks like context.callbackWaitsForEmptyEventLoop needs to be set to true, and Vercel, under the hood, sets context.callbackWaitsForEmptyEventLoop = false by default.

Is there any recommended guide to using this library in a serverless environment, and is it supported?

@sbassin

sbassin commented Nov 4, 2020

I'm having similar issues in a Lambda even with context.callbackWaitsForEmptyEventLoop set to true. Did you ever find a solution to this? Did you end up integrating with the Segment HTTP services?

@mccraveiro

I don't think it's the best solution, but you could call .flush() at the end of your function handler.
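
The suggestion above looks roughly like this in a handler. A minimal sketch: a stub client stands in for analytics-node here so the example is self-contained, and `handler`/`makeStubAnalytics` are illustrative names, not part of any SDK. (Whether `await analytics.flush()` actually blocks depends on flush returning a promise, which is the crux of this thread.)

```javascript
// Stub with the same track()/flush() shape as analytics-node, so the
// pattern can be shown without the real library or a network call.
function makeStubAnalytics() {
  const queue = [];
  return {
    sent: [],
    track(message) { queue.push(message); },
    // flush() resolves once every queued message has been "delivered".
    async flush() {
      while (queue.length) this.sent.push(queue.shift());
    },
  };
}

const analytics = makeStubAnalytics();

async function handler(event) {
  analytics.track({ userId: event.userId, event: 'Function Invoked' });
  // Drain the queue before returning; without this, a serverless runtime
  // with callbackWaitsForEmptyEventLoop = false may freeze the process
  // before the batch is delivered.
  await analytics.flush();
  return { statusCode: 200 };
}
```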

@Blatty
Author

Blatty commented Nov 30, 2020

.flush() won't work, as it will just process the queue synchronously.

@sbassin We forked it to use the same API, removed the queue system, and made async calls with undici instead of axios.

@sbassin

sbassin commented Dec 10, 2020

I ended up posting to the Segment track HTTP endpoint.
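
Posting directly to the HTTP Tracking API, as described above, sidesteps the client-side queue entirely. A sketch, with the endpoint and Basic-auth scheme taken from Segment's public HTTP API docs; the helper names (`buildTrackRequest`, `sendTrack`) are illustrative, not part of any SDK, and `sendTrack` assumes Node 18+ for the global `fetch`.

```javascript
// Build the request for Segment's HTTP Tracking API track endpoint.
function buildTrackRequest(writeKey, message) {
  return {
    url: 'https://api.segment.io/v1/track',
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      // The write key is the Basic-auth username with an empty password.
      Authorization:
        'Basic ' + Buffer.from(writeKey + ':').toString('base64'),
    },
    body: JSON.stringify(message),
  };
}

// Fire the request and await it, so the serverless runtime cannot
// freeze the process before delivery completes.
async function sendTrack(writeKey, message) {
  const req = buildTrackRequest(writeKey, message);
  const res = await fetch(req.url, {
    method: req.method,
    headers: req.headers,
    body: req.body,
  });
  return res.ok;
}
```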

@ItsWendell

ItsWendell commented Feb 5, 2021

I'm glad I ran into this, @Blatty do you have your fork available? I'm also starting to implement it in Lambda / Vercel. Support for serverless environments would be great for this Segment SDK!

Edit: After hours of debugging and trying to figure out how to build my own client efficiently, I figured out a way to fire and forget, saving 30-40% of execution time in my Lambdas while still successfully sending events.

Couple of things that I've learned while trying to implement segment within a serverless environment:

  • It seems there is no need to actually call the .identify() method; you can just send the userId / anonymousId to the API with the traits in the context. See the docs for context here: https://segment.com/docs/connections/spec/common/#context
  • The API endpoint is quite 'slow': it takes 400ms to 1200ms (from Europe) to successfully send a request and receive an answer, so a fire-and-forget approach is about 30-40% faster.
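
The first point above can be sketched as a message builder: the traits ride along in `context` on each call rather than in a separate identify request. The field names follow Segment's common spec linked above; the helper itself (`buildTrackMessage`) and the `library` value are illustrative.

```javascript
// Illustrative helper: attach traits to a track message via the common
// `context` object instead of issuing a separate identify call.
function buildTrackMessage(userId, event, properties, traits) {
  return {
    userId,
    event,
    properties,
    context: {
      traits, // e.g. { email, firstName, ... }
      library: { name: 'custom-serverless-client', version: '0.0.1' },
    },
    timestamp: new Date().toISOString(),
  };
}
```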

I created a small library (TypeScript), linked below, that I now use in my Lambda functions on Vercel (I might clean it up and package it later). It implements the identify call, but by default only stores it in a cache to be used in a later .track() call.

https://gist.github.com/ItsWendell/45ebbb7d2ecc7e35f0a87b2f0cf62476

Example usage:

```js
await analytics?.identify(user?.id, {
  id: user?.id,
  email: user?.email,
  avatar: user?.avatar || undefined,
  createdAt: user?.createdAt,
  firstName: user?.firstName,
  lastName: user?.lastName,
  fullName: user?.fullName,
  roles: user?.roles?.map((item) => item.role),
  recruiterId: user?.recruiterProfile?.id,
  companies: user?.companies?.map((item) => item.id),
  passports: user?.passports?.map((item) => item.provider),
});

await analytics.track("Signed Up", {
  provider: "local",
});
```

@davidparys

I'm having a similar issue with Next.js and Vercel. It doesn't seem to flush, no matter whether I call analytics.flush() or set flushAt: 1 as mentioned in the Segment Node docs.

@pbassut
Contributor

pbassut commented Apr 19, 2021

Did anyone trace the root cause of the issue? Do Lambdas finish and kill all threads (unusual) before the queue is actually flushed?

@tclass

tclass commented Apr 26, 2021

I have the same problem. Even when I call flush(), since it's non-blocking it can happen that the function ends before flush() finishes. Can't we have a blocking flush() function? It might not be useful in client-facing apps, but on the server side it's totally fine.
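
A "blocking" flush in the sense asked for here can be had by wrapping the callback-style flush(cb) in a Promise and awaiting it. A sketch under that assumption: `flushBlocking` is an illustrative name, and the stub below stands in for any client exposing analytics-node's flush(callback) shape so the example runs on its own.

```javascript
// Wrap a callback-style flush(cb) so a handler can `await` it.
function flushBlocking(client) {
  return new Promise((resolve, reject) => {
    client.flush((err, batch) => (err ? reject(err) : resolve(batch)));
  });
}

// Stub with analytics-node's flush(callback) signature, for demonstration.
const stubClient = {
  flush(cb) {
    // Simulate asynchronous delivery before invoking the callback.
    setImmediate(() => cb(null, { batch: [{ event: 'Signed Up' }] }));
  },
};
```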

@sayertindall

Any status on this from the Segment team? This is pretty crucial for us, and we would love to hear a timeline on when this package will be available for serverless environments.

@segment, any updates??

@pbassut
Contributor

pbassut commented May 22, 2021

Hey everyone. We're working on this as of now and should have a solution ASAP.

I'm wondering whether just returning .post here (https://github.com/segmentio/analytics-node/blob/master/index.js#L278) would allow us to await it, thus making it sync/blocking.

Some input here would be nice. Maybe I'm not in the loop here.
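
The idea floated above reduces to: have flush() return the underlying HTTP promise instead of discarding it, so callers can await delivery. A minimal sketch of that change in isolation; `TinyQueue` is illustrative, and `post` stands in for the axios call inside analytics-node.

```javascript
// A tiny queue whose flush() returns the delivery promise.
class TinyQueue {
  constructor(post) {
    this.post = post; // stand-in for the HTTP client call
    this.queue = [];
  }
  enqueue(msg) {
    this.queue.push(msg);
  }
  flush() {
    const batch = this.queue.splice(0);
    if (!batch.length) return Promise.resolve([]);
    // Returning the promise (rather than ignoring it) is the whole fix:
    // `await queue.flush()` now blocks until the request settles.
    return this.post({ batch });
  }
}
```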

@milo-

milo- commented Jun 21, 2021

@pbassut are you able to release a new version on npm with your flush promise fix? I've been testing in our environment (Google cloud functions) and it seems to be working well 👍

@pbassut
Contributor

pbassut commented Jun 21, 2021

Yeah, sure.
That's good cause I've been testing on AWS Lambda and it seems to be working well too.
But I was looking to have confirmation from someone else also using Lambda.

But we'll make a release anyway.

@jkarsrud

jkarsrud commented Jun 21, 2021

> But I was looking to have confirmation from someone else using lambda.

@pbassut If you are still looking at confirmation, I just tested this on my end and the fix seems to be working just fine 👍

@nd4p90x nd4p90x added this to To do in analytics-node via automation Jun 29, 2021
@nd4p90x nd4p90x linked a pull request Jun 29, 2021 that will close this issue
@nd4p90x nd4p90x moved this from To do to Needs Review in analytics-node Jun 29, 2021
@nd4p90x nd4p90x moved this from Needs Review to Done in analytics-node Jun 29, 2021
@nd4p90x nd4p90x closed this as completed Jun 29, 2021
@silvio-e

Do I understand correctly that this now works in serverless environments like Vercel? I can't find anything in the documentation, and #281 has no further description.

@mikeblanton

@pbassut can you confirm what specifically needs to be done to take advantage of this? We're working with one of your SAs and he pointed us to this fix to help with some visibility problems we're having in Lambda. We're triggering events off of Lambdas that process DynamoDB streams. I've got our flushAt set to 1. I'm also explicitly calling via an await. I can see in Lumigo that the call to Segment is going out. I can see in my logs that the flush call completes, but I don't see confirmation in the logs (or in Lumigo) that the calls actually finish.

For example, I've got these 2 helper functions:

```js
module.exports.flush = async (logger) => {
  await analytics.flush(function (err, batch) {
    if (logger) {
      if (err) {
        logger.error(err);
        return;
      }
      logger.debug('Segment cache flushed', {batch});
    }
  });
}

module.exports.track = async (payload, logger) => {
  analytics.track(payload, function (err, batch) {
    if (logger) {
      if (err) {
        logger.error(err, payload);
        return;
      }
      logger.debug('Batch flushed', {batch, payload});
    }
  });
  if (logger) {
    logger.debug('Segment track call sent', {payload});
  }
}
```

In my logs, I see...

  • Segment track call sent
  • Segment cache flushed

But I never see Batch flushed.

@andreaminieri

@mikeblanton I'm having exactly the same issue in a Vercel serverless function: did you find a solution for this?

@pbassut
Contributor

pbassut commented Oct 17, 2021

@mikeblanton Sorry I missed this. Can you create an issue for that so we can track it more closely?
I'm assuming you did upgrade to the latest library version.

@mikeblanton

@andreaminieri see the comment in #305 for my workaround. Not sure if it will work for you or not.

@andreaminieri

@mikeblanton thanks for your reply. Apparently we had the same idea to fix this: I did something very similar to your workaround, but wrapped the track call in a new Promise instead of flush, as I described in #303.
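
The workaround described above, as a standalone sketch: wrap the callback-style track(message, cb) in a new Promise so each call can be awaited before the function returns. `trackAsync` is an illustrative name, and a stub with analytics-node's track signature stands in for the real client so the example runs alone.

```javascript
// Promisify a callback-style track(message, cb) call.
function trackAsync(client, message) {
  return new Promise((resolve, reject) => {
    client.track(message, (err, batch) =>
      err ? reject(err) : resolve(batch)
    );
  });
}

// Stub with analytics-node's track(message, callback) signature.
const stubClient = {
  delivered: [],
  track(message, cb) {
    // Simulate async delivery, then invoke the callback like the SDK does.
    setImmediate(() => {
      this.delivered.push(message);
      cb(null, [message]);
    });
  },
};
```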
