Stop using Java8 because we no longer support spark < 3.0 #5234

harupy · 2022-01-07T10:45:51Z

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

What changes are proposed in this pull request?

Stop using Java 8 because we no longer support Spark < 3.0.

How is this patch tested?

Existing checks

Does this PR change the documentation?

No. You can skip the rest of this section.
Yes. Make sure the changed pages / sections render correctly by following the steps below.

Check the status of the ci/circleci: build_doc check. If it's successful, proceed to the
next step, otherwise fix it.
Click Details on the right to open the job page of CircleCI.
Click the Artifacts tab.
Click docs/build/html/index.html.
Find the changed pages / sections and make sure they render correctly.

Release Notes

Is this a user-facing change?

No. You can skip the rest of this section.
Yes. Give a description of this change to be included in the release notes for MLflow users.

(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)

What component(s), interfaces, languages, and integrations does this PR affect?

Components

Interface

area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support

Language

language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages

Integrations

integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
rn/feature - A new user-facing feature worth mentioning in the release notes
rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
rn/documentation - A user-facing documentation change worth mentioning in the release notes

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

harupy · 2022-01-07T10:49:51Z

mlflow/spark.py

-def _format_exception(ex):
-    return "".join(traceback.format_exception(type(ex), ex, ex.__traceback__))


Removed this unused function to run cross version tests for spark.

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

harupy · 2022-01-07T12:19:46Z

mlflow/mleap.py

+    # This import statement adds `serializeToBundle` and `deserializeFromBundle` to `Transformer`:
+    # https://github.com/combust/mleap/blob/37f6f61634798118e2c2eb820ceeccf9d234b810/python/mleap/pyspark/spark_support.py#L32-L33


Change for running cross version tests for mleap.
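For readers unfamiliar with import side effects, the comment added above can be illustrated with a minimal sketch. It assumes nothing about MLeap's real code: `Transformer` here is a stand-in class, and `_serialize_to_bundle` and `install_bundle_support` are hypothetical names used only to show how a method can be attached to an existing class after it is defined.

```python
class Transformer:
    """Stand-in for pyspark.ml.Transformer (hypothetical, illustration only)."""

    def __init__(self, name):
        self.name = name


def _serialize_to_bundle(self, path):
    # The real MLeap method writes a bundle to disk; this stub only reports the call.
    return f"serialized {self.name} to {path}"


def install_bundle_support():
    # In MLeap, the equivalent attachment happens as a side effect of the import.
    Transformer.serializeToBundle = _serialize_to_bundle


# Before the "import", the method does not exist on the class.
before = hasattr(Transformer("pipeline"), "serializeToBundle")

install_bundle_support()

# Afterwards, every instance (existing or new) exposes the method.
after = Transformer("pipeline").serializeToBundle("jar:file:/tmp/model.zip")
```

This is why the import looks unused to a linter but cannot be removed: the attribute only exists once the patching code has run.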

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

harupy · 2022-01-07T12:35:39Z

.github/workflows/cross-version-tests.yml

+      - name: Get Java version
+        id: get-java-version
+        run: |
+          if [ "${{ matrix.package }}" = "mleap" ]


mleap still uses spark 2.4.

Does the latest mleap version already support Spark 3.x?

@WeichenXu123

https://github.com/combust/mleap#requirements says:

MLeap is built against Scala 2.11 and Java 8. Because we depend heavily on Typesafe config for MLeap, we only support Java 8 at the moment.

Until MLeap supports >3.1.x the solution in this PR will be needed for the netty serialization issue, right?
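The version selection discussed in this thread can be sketched in Python as below. The workflow step itself is a shell conditional and is only partially shown in the diff, so this is an illustration rather than the actual implementation: `java_version_for` is a hypothetical helper, Java 8 for mleap comes from the "use java8 for mleap" commit, and Java 11 as the default for every other package is an assumption based on the java11 issue mentioned elsewhere in this PR.

```python
def java_version_for(package: str) -> str:
    """Return the Java major version to install for a cross-version test package."""
    # MLeap's README states that it only supports Java 8.
    if package == "mleap":
        return "8"
    # Assumption: all other packages run on Java 11.
    return "11"


# Example matrix entries (package names chosen for illustration).
selected = {pkg: java_version_for(pkg) for pkg in ["mleap", "spark", "sklearn"]}
```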

harupy · 2022-01-07T12:36:04Z

tests/mleap/test_mleap_model_export.py

+
+
+@pytest.mark.large
+def test_spark_module_model_save_with_mleap_and_unsupported_transformer_raises_exception(


Moved this test here from tests/spark/test_spark_model_export.py since it's related to mleap.

harupy · 2022-01-07T12:37:22Z

tests/spark/test_spark_model_export.py

+    if Version(pyspark.__version__) < Version("3.1"):
+        # A workaround for this issue:
+        # https://stackoverflow.com/questions/62109276/errorjava-lang-unsupportedoperationexception-for-pyspark-pandas-udf-documenta
+        spark_home = (
+            os.environ.get("SPARK_HOME")
+            if "SPARK_HOME" in os.environ
+            else os.path.dirname(pyspark.__file__)
+        )
+        conf_dir = os.path.join(spark_home, "conf")
+        os.makedirs(conf_dir, exist_ok=True)
+        with open(os.path.join(conf_dir, "spark-defaults.conf"), "w") as f:
+            conf = """
+spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true"
+spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true"
+"""
+            f.write(conf)


A workaround for an issue with spark < 3.1, java11, and pandas_udf:
https://stackoverflow.com/questions/62109276/errorjava-lang-unsupportedoperationexception-for-pyspark-pandas-udf-documenta

Error logs: https://github.com/mlflow/mlflow/runs/4738073376?check_suite_focus=true#step:11:164
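The workaround above can be restated as a small self-contained sketch that runs without Spark installed. `needs_netty_workaround` and `write_netty_workaround` are hypothetical helper names, the version check is a simplified stand-in for the `Version(...)` comparison in the diff, and a temporary directory plays the role of `SPARK_HOME`.

```python
import os
import tempfile

NETTY_FLAG = "-Dio.netty.tryReflectionSetAccessible=true"


def needs_netty_workaround(spark_version: str) -> bool:
    # Simplified check on (major, minor); the diff uses packaging's Version instead.
    major, minor = (int(part) for part in spark_version.split(".")[:2])
    return (major, minor) < (3, 1)


def write_netty_workaround(spark_home: str) -> str:
    # Create <spark_home>/conf if needed and write spark-defaults.conf into it.
    conf_dir = os.path.join(spark_home, "conf")
    os.makedirs(conf_dir, exist_ok=True)
    conf_path = os.path.join(conf_dir, "spark-defaults.conf")
    lines = [
        f'spark.driver.extraJavaOptions="{NETTY_FLAG}"',
        f'spark.executor.extraJavaOptions="{NETTY_FLAG}"',
    ]
    with open(conf_path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return conf_path


# Usage: a temporary directory stands in for SPARK_HOME.
with tempfile.TemporaryDirectory() as fake_spark_home:
    path = write_netty_workaround(fake_spark_home)
    with open(path) as f:
        written = f.read()
```

Like the code in the diff, this opens the file in "w" mode, so any existing spark-defaults.conf is overwritten. That is acceptable in a throwaway CI environment but would need care elsewhere.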

+1 writing to spark-defaults config file. Good solution for this.

BenWilson2

LGTM. Nice config solution.

stop using java8

6f1b8f9

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

github-actions bot added the rn/none label Jan 7, 2022

remove unused _format_exception

a9bce56

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

harupy added the enable-dev-tests label Jan 7, 2022

harupy added 4 commits January 7, 2022 19:58

use java8 for mleap

6ecdfeb

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

comment

5ce1e8b

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

move mleap test

107de7f

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

try workaround

52ad3d2

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

harupy requested a review from WeichenXu123 January 7, 2022 12:00

makedirs

e90cdf5

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

run setup-java

927e97e

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

BenWilson2 approved these changes Jan 7, 2022

harupy merged commit d5b0b55 into mlflow:master Jan 7, 2022

harupy deleted the stop-using-java8 branch January 7, 2022 17:01
