Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEDE-7220] arrow upgrade - check we can use memory unsafe instead of memory netty #50

Open
wants to merge 27 commits into
base: branch-siren-15.0.x
Choose a base branch
from

Conversation

prateeknima77
Copy link

No description provided.

GeorgeAp and others added 20 commits February 23, 2024 14:41
* [FEDE-5623] Upgrade Siren Arrow from 7.0.0 to 9.0.0 (#35)

* [FEDE-5150] Upgrade arrow version from 4.0.0 to 7.0.0 (#26)

* [FEDE-4869] Upgrade Arrow version to 4.0.0 (#16)

* Make BaseValueVector#MAX_ALLOCATION_SIZE configurable

This closes apache#65

Some of the tests are based on the assumption that the JVM can allocate at least
2GB of memory, which is not a common occurence (JVM usually defaults at 512MB).
Current Travis CI VM only have 3GB of memory total, which would have make challenging
to run some of the tests on them

Add a system property to change BaseValueVector.MAX_ALLOCATION_SIZE to allow to use
a much smaller value during tests.

* prefix arrow's version with siren

* use our version of netty

* updated readme about siren's changes

* fixed dependency issue with our own artifactory

* use our version of netty

* shade the arrow memory jar

* improved doc

* fix readme

* document the changes done to the arrow fork

* ARROW-5856: [Python] [Packaging] Fix use of C++ / Cython API from wheels

Author: Antoine Pitrou <antoine@python.org>

Closes apache#4884 from pitrou/ARROW-5856-cython-so-version and squashes the following commits:

a411d7a <Antoine Pitrou> Avoid C++ ABI issues with DictionaryMemo
eaede5b <Antoine Pitrou> Revert ARROW-5082 (" Stop exporting copies of shared libraries in wheel")
4594f78 <Antoine Pitrou> ARROW-5856:  Try to fix Cython API from wheels

* use our version of netty

* set drill's default value

* use our version of netty

* bumped to 0.8.0

* update to 0.14.1

* comment unneeded modules

* update release procedure with unneeded modules commented out

* bump version to siren-0.14.1-1 and update readme

* do not allocate a new offset buffer if the slice starts at 0 since the relative offset pointer would be unchanged

* fix slice bounds

* improved readme

* Upgraded netty dependendcy to siren-4.1.27-3

* [FEDE-3917] netty direct memory counter deprecation with bump to siren-0.14.1-6-SNAPSHOT (#10)

* Release siren-0.14.1-5

* Bump to version siren-0.14.1-6-SNAPSHOT

* Use Siren Netty fork version siren-4.1.27-4 to release siren-0.14.1-6

* Bump version to release siren-0.14.1-5 using Netty Siren siren-4.1.27-4

* Set the version to siren-0.14.1-6-SNAPSHOT

Co-authored-by: Martin Anseaume <martin.anseaume@siren.io>

* Fix rebase

* Fix pom

* Fix rebase - Bump version to siren-4.0.0-1-SNAPSHOT

* Clean-up

* Comment out memory-netty module and remove use siren netty

* Update siren netty version

* Fix rebase

* Comment netty

* Fix TestValueVector

* Remove unused imports

* Remove unused import

* Clean up

* Clean up - fix checkstyle

* Clean up - fix checkstyle 2

* WIP

* WIP - add siren netty plugin

* WIP - ignore failing test

* Fix rebase as per peer review

* Fix rebase python

* Fix ZeroVector unused imports

* Fix checkstyle

* Fix rebase

* Fix rebase

* Fix checkstyle imports violation

* Fix checkstyle and update netty version

* Use Arrow's ArrowBuf instead of our own

* Fix imports

* Fix checkstyle

* Tentative fix: workaround for the shading issue but creat new problems

* fix missing import

* Fix import

* Fix imports

* Revert ignored unit tests

* Revert automatic format changes

* Revert format to original format

* Update vector pom as per peer review

* Revert ignored TestArrowBufHasher.testHasherNegative

* Clean up

* Update as per peer review

* Undo unnecessary change in tasks.yml

* Undo unnecessary changes

* Revert changes

* Revert changes

* Revert changes as per peer review

* Update as per peer review

* Add back class path dependency exclusion

* Remove whitespace

* Update memory access with default value

* Remove class path exclusion

* Update readme

* Exclude memory-core from the shaded netty in memory-netty package

* Add information for checking that Siren version of Netty is used

* [FEDE-5144] Fix the static initialization of MemoryUtil (#17)

* Remove unecessary setAccessible call

* Catch all errors thrown by the setAccessible call

New version of java can throw InaccessibleObjectException which Arrow
didn't handle

* Update netty version to the stable version siren-4.1.48-1

Co-authored-by: Laurent Goujon <laurent@dremio.com>
Co-authored-by: Stéphane Campinas <stephane.campinas@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Co-authored-by: Martin Anseaume <martin.anseaume@siren.io>
Co-authored-by: Johnny Hujol <itudoben@users.noreply.github.com>
Co-authored-by: ggdupont <ger.dupont@gmail.com>

* Bumped version number to siren-4.0.0-1

* Fix rebase

* Remove scope from arrow-memory-netty to make netty available for entire project else our shading will fail.

Signed-off-by: George Apaaboah <george.apaaboah@gmail.com>

* Fix readme - remove duplicate section on how to contribute

* Update readme as per review

* Use stable netty version siren-4.1.68-1 after the release of netty

Signed-off-by: George Apaaboah <george.apaaboah@gmail.com>

Co-authored-by: Laurent Goujon <laurent@dremio.com>
Co-authored-by: Stéphane Campinas <stephane.campinas@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Co-authored-by: Martin Anseaume <martin.anseaume@siren.io>
Co-authored-by: Johnny Hujol <itudoben@users.noreply.github.com>
Co-authored-by: ggdupont <ger.dupont@gmail.com>

* Release siren-7.0.0-1

Signed-off-by: George Apaaboah <george.apaaboah@gmail.com>

* Fix rebase

* Update services configuration (#32)

* Update services configuration

This is needed for the module declaration in federate

* Create a uber jar with netty and arrow memory dependencies

This avoids conflicts with java modules. Arrows provides different
modules (memory-netty, memory-core) that export the same packages
(org.apache.arrow.memory), which is forbidden with java modules.

The federate code can then just dependend on the shaded jar from vector.

* use reduced pom in jar

Co-authored-by: Issac <issac.garcia@siren.io>

* Add developer tip link to readme

* Remove test scope for arrow-memory-netty

* Change version from siren-9.0.0-1-SNAPSHOT to siren-9.0.0-2-SNAPSHOT since 2 contain the module changes

* Update as per review

* Update comment as per review

* Use stable netty version siren-4.1.78-1

Signed-off-by: George Apaaboah <george.apaaboah@gmail.com>
Co-authored-by: Laurent Goujon <laurent@dremio.com>
Co-authored-by: Stéphane Campinas <stephane.campinas@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Co-authored-by: Martin Anseaume <martin.anseaume@siren.io>
Co-authored-by: Johnny Hujol <itudoben@users.noreply.github.com>
Co-authored-by: ggdupont <ger.dupont@gmail.com>
Co-authored-by: Issac <issac.garcia@siren.io>

* Fix cherry-pick

* Temporarily comment out the jdk version

* chore: revert commented out jdk version in pom

* update: use stable version of netty after release of siren-4.1.82-1

---------

Signed-off-by: George Apaaboah <george.apaaboah@gmail.com>
Co-authored-by: Laurent Goujon <laurent@dremio.com>
Co-authored-by: Stéphane Campinas <stephane.campinas@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Co-authored-by: Martin Anseaume <martin.anseaume@siren.io>
Co-authored-by: Johnny Hujol <itudoben@users.noreply.github.com>
Co-authored-by: ggdupont <ger.dupont@gmail.com>
Co-authored-by: Issac <issac.garcia@siren.io>
Copy link

github-actions bot commented Apr 3, 2024

Thanks for opening a pull request!

If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

In the case of PARQUET issues on JIRA the title also supports:

PARQUET-${JIRA_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

See also:

@prateeknima77 prateeknima77 marked this pull request as ready for review April 18, 2024 08:15
java/vector/pom.xml Outdated Show resolved Hide resolved
Copy link

@GeorgeAp GeorgeAp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please remove extra spaces or added them back where it applies so that the fork only applies what we want to add?

java/vector/pom.xml Outdated Show resolved Hide resolved
java/vector/pom.xml Outdated Show resolved Hide resolved
java/vector/pom.xml Outdated Show resolved Hide resolved
@GeorgeAp GeorgeAp requested a review from scampi April 23, 2024 08:59
@prateeknima77
Copy link
Author

Can you please remove extra spaces or added them back where it applies so that the fork only applies what we want to add?

Yes, the spacing has been adjusted in 1c369e8

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants