Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify archery integration and disable JS temporarily #2449

Merged
merged 1 commit into from
Aug 16, 2022

Conversation

tustvold
Copy link
Contributor

@tustvold tustvold commented Aug 15, 2022

Which issue does this PR close?

Closes #2448

Rationale for this change

The current integration test has the following flow

  • Github CI orchestrates a python script
  • Which orchestrates docker compose
  • Which orchestrates docker
  • Which runs scripts in the arrow repo
  • Which runs different python scripts

This is not only immensely confusing, but makes reproducing issues very difficult and complicates opting out of certain integration tests, e.g. because they are broken #2448

What changes are included in this PR?

Are there any user-facing changes?

@tustvold tustvold force-pushed the simplify-integration-test-harness branch from e3654b4 to d03dd5b Compare August 15, 2022 10:01
Comment on lines 56 to 83
- name: Build Rust
run: ci/scripts/rust_build.sh . /build
- name: Build C++
run: ci/scripts/cpp_build.sh . /build
- name: Build C#
run: ci/scripts/csharp_build.sh . /build
- name: Build Go
run: ci/scripts/go_build.sh .
- name: Build Java
run: ci/scripts/java_build.sh . /build
- name: Build JS
run: ci/scripts/js_build.sh . /build
- name: Install archery
run: pip install -e dev/archery
Copy link
Contributor Author

@tustvold tustvold Aug 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run: ci/scripts/js_build.sh . /build
- name: Install archery
run: pip install -e dev/archery
- name: Run integration tests
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tustvold tustvold force-pushed the simplify-integration-test-harness branch 5 times, most recently from 496926e to 37db501 Compare August 15, 2022 12:36
@tustvold tustvold marked this pull request as ready for review August 15, 2022 12:43
@tustvold tustvold marked this pull request as draft August 15, 2022 12:43
@tustvold tustvold force-pushed the simplify-integration-test-harness branch 17 times, most recently from 39d2263 to 1dfa9a1 Compare August 15, 2022 16:24
@tustvold tustvold force-pushed the simplify-integration-test-harness branch from 1dfa9a1 to f80c3e0 Compare August 15, 2022 16:55
@tustvold tustvold changed the title Simplify integration test harness Simplify archery integration and disable JS temporarily Aug 15, 2022
@tustvold tustvold marked this pull request as ready for review August 15, 2022 16:55
@tustvold
Copy link
Contributor Author

See #2453 for an example of the failing JS build, which this disables

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am +0 on this change (I don't oppose it but I am not a fan either). I would like some other opinions

My hesitation with moving away from archery.py (which is complicated, as you note in the description) is that archery is used for testing the other implementations against each other, so by making this change we are effectively committing to maintaining the equivalent integration scripts (e.g. adding new images and/or keeping up to date with new dependencies or tests).

Maybe this isn't a big deal, but I would prefer not to have to maintain it

cc @viirya

@tustvold
Copy link
Contributor Author

tustvold commented Aug 15, 2022

My hesitation with moving away from archery.py

This PR doesn't move away from using archery.py, archery is still used to orchestrate the tests - see here. All it moves away from is the docker-compose CI plumbing within the arrow repo proper, which isn't necessary and is more part of the arrow CI than archery itself...

run: conda run --no-capture-output pip install -e dev/archery
- name: Run integration tests
run: |
conda run --no-capture-output archery integration \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need conda run for each commend? I think the docker image enables the conda environment by default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly github actions overrides the entrypoint, among other things, and so this is the only way to make it work..

--gold-dirs=testing/data/arrow-ipc-stream/integration/1.0.0-bigendian \
--gold-dirs=testing/data/arrow-ipc-stream/integration/1.0.0-littleendian \
--gold-dirs=testing/data/arrow-ipc-stream/integration/2.0.0-compression \
--gold-dirs=testing/data/arrow-ipc-stream/integration/4.0.0-shareddict
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the change here is to move executing commends from docker-compose (conda-integration) to separate CI steps. The advantage is easier to control each step for each implementations (e.g. disabling JS).

Although we don't move from archery, there is still a slight disadvantage that we maintain the execution here and if any change is made in arrow repo, we might miss it at first.

There is some overhead but not as much as moving from archery, I think.

Copy link
Contributor Author

@tustvold tustvold Aug 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, this should be relatively straightforward to maintain, and we can always revert back should issues arise.

Aside from giving more control, the CI reports will be more helpful as you can see the stage that failed, and it is significantly easier imo to understand what is going on and how to reproduce it locally.

Somewhat related I don't run docker on my machine, instead using podman, and so cannot run the archery docker commands

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from giving more control, the CI reports will be more helpful as you can see the stage that failed, and it is significantly easier imo to understand what is going on and how to reproduce it locally.

Oh, yea, that's right, forgot this point.

@viirya
Copy link
Member

viirya commented Aug 15, 2022

This looks good to me. Alternatively, if we don't like this approach, we may need to update docker-compose command at arrow repo to disable JS as it doesn't take any environment variables for that.

@tustvold
Copy link
Contributor Author

tustvold commented Aug 15, 2022

I think the JS issue has possibly been fixed upstream, but regardless the capability to easily selectively disable broken integration tests still feels valuable, not to mention the other advantages to reproducibility and intelligibility of this PR

I'll leave this PR open for now in case anyone else wishes to weigh in, otherwise I'll merge it tomorrow morning

@tustvold tustvold merged commit 3b59adc into apache:master Aug 16, 2022
@ursabot
Copy link

ursabot commented Aug 16, 2022

Benchmark runs are scheduled for baseline = a144f69 and contender = 3b59adc. 3b59adc is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@alamb
Copy link
Contributor

alamb commented Aug 17, 2022

I wonder when we will re-enable the javacript integration test?

@tustvold tustvold added the development-process Related to development process of arrow-rs label Aug 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
development-process Related to development process of arrow-rs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Integration Test Failures
4 participants