[dont merge to main yet] Don't flush after step completes #977

Gonzalo-Avalos-Ribas · 2023-10-05T21:00:35Z

The motivation for this PR is to avoid flushing every time a step completes. We should flush to disk/upload in only two cases:

When the data in memory reaches graphObjectBufferThresholdInBytes.
When all steps are done.

Also, I don't think it is optimal to divide the data we upload by step; that would generate many unnecessary uploads.

Tried it on dev, on the jupiterone-integration-dev instance: we went from 432 uploads in a single job to 3.
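To illustrate the two flush conditions described above, here is a minimal sketch, not the SDK's actual implementation. The class name `BufferedGraphObjectStore`, the `onStepComplete` hook, and the synchronous `upload` callback are assumptions made for brevity; only `graphObjectBufferThresholdInBytes` comes from this PR. Size is approximated by the serialized string length rather than an exact byte count.

```typescript
// Hypothetical sketch: flush only when the buffer threshold is reached
// or when the caller performs the final flush after all steps are done.
type GraphObject = { _key: string; [prop: string]: unknown };

class BufferedGraphObjectStore {
  private buffer: GraphObject[] = [];
  private bufferedSize = 0;
  private readonly graphObjectBufferThresholdInBytes: number;
  private readonly upload: (batch: GraphObject[]) => void;

  constructor(
    graphObjectBufferThresholdInBytes: number,
    upload: (batch: GraphObject[]) => void,
  ) {
    this.graphObjectBufferThresholdInBytes = graphObjectBufferThresholdInBytes;
    this.upload = upload;
  }

  add(obj: GraphObject): void {
    this.buffer.push(obj);
    // Approximation: serialized length in characters, not exact bytes.
    this.bufferedSize += JSON.stringify(obj).length;
    if (this.bufferedSize >= this.graphObjectBufferThresholdInBytes) {
      this.flush();
    }
  }

  // Completing a step intentionally does not trigger a flush.
  onStepComplete(_stepId: string): void {
    // no-op
  }

  // Called on threshold, and once more after all steps are done.
  flush(): void {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.bufferedSize = 0;
    this.upload(batch);
  }
}
```

With this shape, the number of uploads depends on the volume of data rather than on the number of steps, which is consistent with the drop in uploads reported above.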

Gonzalo-Avalos-Ribas · 2023-10-06T12:24:43Z

packages/integration-sdk-runtime/src/execution/dependencyGraph.ts

+          name: IntegrationErrorEventName.UnexpectedError,
+          description: 'Upload to persister failed',
+        });
+        //How can we fail gracefully here?


Today, if an upload fails, we fail that step, but that is really an incorrect decision, since the upload usually contains data from other steps as well. Should we just fail the job if an upload fails? Note that this is not the only place where we upload, but we can change the behavior there too.
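One option raised here, and later reflected in the commit titles, is to fail every step that contributed data to a failed upload. A minimal sketch of that idea follows. The function `uploadBatch`, the `StepStatus` type, and the synchronous `upload` callback are hypothetical names for illustration, not the SDK's API.

```typescript
// Hypothetical sketch: when one upload mixes entities from several steps,
// a failed upload marks all of those steps as failed, not only the last one.
type Entity = { _key: string };
type StepStatus = 'success' | 'failure';

function uploadBatch(
  entitiesByStep: Map<string, Entity[]>,
  upload: (entities: Entity[]) => void,
  stepStatus: Map<string, StepStatus>,
): boolean {
  // Merge the entities of every step into a single batch.
  const batch: Entity[] = [];
  entitiesByStep.forEach((entities) => {
    batch.push(...entities);
  });

  try {
    upload(batch);
    return true;
  } catch {
    // Every step with data in the failed batch is affected.
    entitiesByStep.forEach((_entities, stepId) => {
      stepStatus.set(stepId, 'failure');
    });
    return false;
  }
}
```

Steps that had no data in the failed batch keep their previous status, so this is less drastic than failing the whole job.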

zemberdotnet · 2023-10-06T13:38:47Z

...integration-sdk-runtime/src/storage/FileSystemGraphObjectStore/FileSystemGraphObjectStore.ts

+    await this.lockOperation(async () => {
+      const entitiesByStep = this.localGraphObjectStore.collectEntitiesByStep();
+      let entitiesToUpload: Entity[] = [];
+      for (const [stepId, entities] of entitiesByStep) {


Why did we move this from pMap to a plain function?

There might be no reason to divide the uploads we do, although there might be a point in doing it for writes to disk.
Let's say a flushed batch of 6 MB of entities gets into the function: do we want to generate 100 uploads if the entities come from 100 different steps?

That makes sense. In that case, we may want to change how this whole flushing process works in the future. We could remove the concept of steps from it entirely, but that can be future work.
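The trade-off discussed in this thread can be sketched as follows. Both functions are hypothetical illustrations (the names `uploadsPerStep` and `singleUpload` do not come from the SDK); each returns the list of upload payloads it would produce for the same flushed data.

```typescript
// Hypothetical sketch comparing one upload per step with one combined upload.
type Entity = { _key: string };

// One payload per step: 100 steps with data produce 100 uploads.
function uploadsPerStep(entitiesByStep: Map<string, Entity[]>): Entity[][] {
  const uploads: Entity[][] = [];
  entitiesByStep.forEach((entities) => {
    if (entities.length > 0) uploads.push(entities);
  });
  return uploads;
}

// One combined payload: the same data produces a single upload.
function singleUpload(entitiesByStep: Map<string, Entity[]>): Entity[][] {
  const entitiesToUpload: Entity[] = [];
  entitiesByStep.forEach((entities) => {
    entitiesToUpload.push(...entities);
  });
  return entitiesToUpload.length > 0 ? [entitiesToUpload] : [];
}
```

The combined form loses the per-step grouping of a payload, which is why a failed upload then has to be attributed to every step that contributed to it.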

zemberdotnet · 2023-12-21T15:57:27Z

Looks good to me. Let's make an alpha version.

Gonzalo Avalos Ribas added 2 commits October 5, 2023 17:58

Dont flush after steps

0147cf6

format

7f5ce50

Gonzalo-Avalos-Ribas requested a review from a team as a code owner October 5, 2023 21:00

Removed .only from test

08d9dc3

Gonzalo-Avalos-Ribas commented Oct 5, 2023

packages/integration-sdk-runtime/src/execution/dependencyGraph.ts (outdated)

Remove logger.info

bdd38f7

Gonzalo-Avalos-Ribas marked this pull request as draft October 5, 2023 22:02

Gonzalo-Avalos-Ribas changed the title ~~Don't flush after steps~~ Don't flush after step completes Oct 5, 2023

Gonzalo-Avalos-Ribas commented Oct 5, 2023

packages/integration-sdk-runtime/src/execution/dependencyGraph.ts (outdated)

Corrected tests - Added comments - format

0267c15

Gonzalo-Avalos-Ribas commented Oct 6, 2023

zemberdotnet reviewed Oct 6, 2023

Gonzalo-Avalos-Ribas marked this pull request as ready for review October 9, 2023 18:28

Gonzalo Avalos Ribas and others added 7 commits October 9, 2023 15:31

Added the failure of the last steps in case upload fails

4f9eaaf

Added logic to fail all steps involved in an upload in case the upload fails

753c85c

add graphObjectStore optional function

530ae9c

Added a bit more efficiency

4cb0e15

Merge pull request #979 from JupiterOne/INT-9336-dont-flush-after-steps-2 (Int 9336 dont flush after steps 2)

52e7821

Minor changes

c85dbad

Merge main

a09a47c

zemberdotnet approved these changes Dec 21, 2023

Gonzalo-Avalos-Ribas changed the title ~~Don't flush after step completes~~ [dont merge to main yet] Don't flush after step completes Dec 21, 2023
