Simplify archery integration and disable JS temporarily #2449

tustvold · 2022-08-15T09:58:45Z

Which issue does this PR close?

Rationale for this change

The current integration test has the following flow

Github CI orchestrates a python script
Which orchestrates docker compose
Which orchestrates docker
Which runs scripts in the arrow repo
Which runs different python scripts

This is not only immensely confusing, but also makes reproducing issues very difficult and complicates opting out of certain integration tests, e.g. because they are broken (#2448).

What changes are included in this PR?

Are there any user-facing changes?

tustvold · 2022-08-15T10:05:21Z

.github/workflows/integration.yml

+      - name: Build Rust
+        run: ci/scripts/rust_build.sh . /build
+      - name: Build C++
+        run: ci/scripts/cpp_build.sh . /build
+      - name: Build C#
+        run: ci/scripts/csharp_build.sh . /build
+      - name: Build Go
+        run: ci/scripts/go_build.sh .
+      - name: Build Java
+        run: ci/scripts/java_build.sh . /build
+      - name: Build JS
+        run: ci/scripts/js_build.sh . /build
+      - name: Install archery
+        run: pip install -e dev/archery


Sourced from https://github.com/apache/arrow/blob/master/docker-compose.yml#L1605

tustvold · 2022-08-15T10:05:41Z

.github/workflows/integration.yml

+        run: ci/scripts/js_build.sh . /build
+      - name: Install archery
+        run: pip install -e dev/archery
+      - name: Run integration tests


Sourced from https://github.com/apache/arrow/blob/master/ci/scripts/integration_arrow.sh

tustvold · 2022-08-15T17:46:36Z

See #2453 for an example of the failing JS build, which this disables

alamb

I am +0 on this change (I don't oppose it, but I am not a fan either). I would like some other opinions.

My hesitation with moving away from archery.py (which is complicated, as you note in the description) is that archery is used for testing the other implementations against each other, so by making this change we are effectively committing to maintaining the equivalent integration scripts (e.g. adding new images and/or keeping up to date with new dependencies or tests).

Maybe this isn't a big deal, but I would prefer not to have to maintain it

cc @viirya

tustvold · 2022-08-15T18:54:33Z

> My hesitation with moving away from archery.py

This PR doesn't move away from using archery.py; archery is still used to orchestrate the tests - see here. All it moves away from is the docker-compose CI plumbing within the arrow repo proper, which isn't necessary and is more a part of the arrow CI than of archery itself.

viirya · 2022-08-15T19:57:56Z

.github/workflows/integration.yml

+        run: conda run --no-capture-output pip install -e dev/archery
+      - name: Run integration tests
+        run: |
+          conda run --no-capture-output archery integration \


Do we need conda run for each command? I think the docker image enables the conda environment by default?

Sadly GitHub Actions overrides the entrypoint, among other things, and so this is the only way to make it work.
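As a sketch of the workaround described in this thread: since the image's entrypoint is not run, each step has to enter the conda environment explicitly. The helper below is hypothetical (the name `conda_step` and the `DRY_RUN` switch are illustrative and not part of the PR); it only prefixes a command with `conda run --no-capture-output`, the invocation used in the workflow lines quoted above.

```shell
# Hypothetical helper: run a command inside the default conda environment.
# With DRY_RUN=1 it prints the command instead of executing it, which is
# handy for checking what a step would do on a machine without conda.
conda_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "conda run --no-capture-output $*"
  else
    conda run --no-capture-output "$@"
  fi
}

# Example usage:
#   conda_step pip install -e dev/archery
```

Wrapping the prefix in one place keeps the individual steps short, at the cost of one more layer of indirection, which is presumably why the workflow spells the prefix out on each step instead.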

viirya · 2022-08-15T20:04:13Z

.github/workflows/integration.yml

+            --gold-dirs=testing/data/arrow-ipc-stream/integration/1.0.0-bigendian \
+            --gold-dirs=testing/data/arrow-ipc-stream/integration/1.0.0-littleendian \
+            --gold-dirs=testing/data/arrow-ipc-stream/integration/2.0.0-compression \
+            --gold-dirs=testing/data/arrow-ipc-stream/integration/4.0.0-shareddict


I think the change here is to move executing commands from docker-compose (conda-integration) to separate CI steps. The advantage is that it is easier to control each step for each implementation (e.g. disabling JS).

Although we don't move away from archery, there is still a slight disadvantage: we maintain the execution here, and if any change is made in the arrow repo, we might miss it at first.

There is some overhead but not as much as moving from archery, I think.

Correct, this should be relatively straightforward to maintain, and we can always revert back should issues arise.

Aside from giving more control, the CI reports will be more helpful as you can see the stage that failed, and it is significantly easier imo to understand what is going on and how to reproduce it locally.

Somewhat related, I don't run docker on my machine, instead using podman, and so cannot run the archery docker commands.
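A minimal sketch of what reproducing the steps locally might look like, assuming the build scripts named in the workflow diff are run from the root of an arrow checkout. The `run_stage` helper is hypothetical and not part of the PR; it only labels each step so that a failure points at a named stage, much as separate CI steps do.

```shell
# Hypothetical helper: run one named stage and report which stage failed.
run_stage() {
  name="$1"
  shift
  echo "==> ${name}"
  if "$@"; then
    return 0
  fi
  echo "stage failed: ${name}" >&2
  return 1
}

# Example usage (script paths are taken from the workflow diff in this PR;
# /build is the build directory used there):
#   run_stage "Build Rust" ci/scripts/rust_build.sh . /build &&
#   run_stage "Build C++" ci/scripts/cpp_build.sh . /build &&
#   run_stage "Install archery" pip install -e dev/archery
```

Chaining the stages with `&&` stops at the first failure, so the last `==>` line printed names the stage to investigate.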

> Aside from giving more control, the CI reports will be more helpful as you can see the stage that failed, and it is significantly easier imo to understand what is going on and how to reproduce it locally.

Oh, yea, that's right, forgot this point.

viirya · 2022-08-15T20:19:24Z

This looks good to me. Alternatively, if we don't like this approach, we may need to update the docker-compose command in the arrow repo to disable JS, as it doesn't take any environment variables for that.

tustvold · 2022-08-15T20:37:10Z

I think the JS issue has possibly been fixed upstream, but regardless, the capability to easily and selectively disable broken integration tests still feels valuable, not to mention the other advantages of this PR for reproducibility and intelligibility.

I'll leave this PR open for now in case anyone else wishes to weigh in, otherwise I'll merge it tomorrow morning
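The idea of selectively disabling an implementation, mentioned in the comment above, could be sketched as follows. This assumes archery accepts per-implementation options of the form `--with-<impl>=1`; that option shape, the `ENABLED_IMPLS` variable, and the `archery_impl_flags` helper are assumptions for illustration and are not taken from this PR's diff.

```shell
# Implementations to exercise. JS is left out here to mirror the temporary
# opt-out discussed in this PR; the exact list is illustrative.
ENABLED_IMPLS="cpp csharp go java rust"

# Hypothetical helper: print one --with-<impl>=1 flag per line.
# ASSUMPTION: archery accepts per-implementation flags of this form.
archery_impl_flags() {
  for impl in "$@"; do
    printf '%s\n' "--with-${impl}=1"
  done
}

# Example usage (intentionally unquoted so each flag becomes its own argument):
#   archery integration $(archery_impl_flags $ENABLED_IMPLS) \
#     --gold-dirs=testing/data/arrow-ipc-stream/integration/4.0.0-shareddict
```

Re-enabling an implementation is then a one-word change to the list rather than a change to the docker-compose setup in another repository.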

ursabot · 2022-08-16T08:11:44Z

Benchmark runs are scheduled for baseline = a144f69 and contender = 3b59adc. 3b59adc is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

alamb · 2022-08-17T15:23:46Z

I wonder when we will re-enable the JavaScript integration test?

tustvold force-pushed the simplify-integration-test-harness branch from e3654b4 to d03dd5b Compare August 15, 2022 10:01

tustvold force-pushed the simplify-integration-test-harness branch 5 times, most recently from 496926e to 37db501 Compare August 15, 2022 12:36

tustvold marked this pull request as ready for review August 15, 2022 12:43

tustvold marked this pull request as draft August 15, 2022 12:43

tustvold force-pushed the simplify-integration-test-harness branch 17 times, most recently from 39d2263 to 1dfa9a1 Compare August 15, 2022 16:24

Simplify archery integration and disable JS temporarily (apache#2448)

f80c3e0

tustvold force-pushed the simplify-integration-test-harness branch from 1dfa9a1 to f80c3e0 Compare August 15, 2022 16:55

tustvold changed the title ~~Simplify integration test harness~~ Simplify archery integration and disable JS temporarily Aug 15, 2022

tustvold marked this pull request as ready for review August 15, 2022 16:55

tustvold mentioned this pull request Aug 15, 2022

Simplify integration test harness with js #2453

Closed

alamb reviewed Aug 15, 2022

viirya reviewed Aug 15, 2022

viirya approved these changes Aug 15, 2022

tustvold merged commit 3b59adc into apache:master Aug 16, 2022

tustvold added the development-process Related to development process of arrow-rs label Aug 18, 2022

alamb mentioned this pull request Sep 16, 2022

SchemaResult in IPC deviates from other implementations #2445

Closed
