Releases: apache/orc

v2.0.1

15 May 04:31
23044d7

Improvements (tools)

  • ORC-1644: Add merge tool to merge multiple ORC files into a single ORC file
  • ORC-1647: Tips for supporting ORC in the convert command
  • ORC-1667: Add check tool to check the index of the specified column

Bug Fix

  • ORC-1646: Close the reader when reading the schema with the convert command
  • ORC-1654: [C++] Count up EvaluatedRowGroupCount correctly
  • ORC-1684: [C++] Find tzdb without TZDIR when in conda-environments
  • ORC-1688: [C++] Do not access TZDB if there is no timestamp type
  • ORC-1696: Fix ClassCastException when reading avro decimal type in benchmark

Task

  • ORC-1649: [C++][Conan] Add 2.0.0 to conan recipe and update release guide
  • ORC-1669: [C++] Deprecate HDFS support
  • ORC-1686: [C++] Avoid using std::filesystem

Test

  • ORC-1648: Add test to convert ORC in the convert command
  • ORC-1663: [C++] Enable TestTimezone.testMissingTZDB on Windows
  • ORC-1672: Remove test packages o.a.o.tools.check
  • ORC-1673: Remove test packages o.a.o.tools.[count|merge|sizes]
  • ORC-1676: Use Hive 4.0.0 in benchmark
  • ORC-1681: Remove redundant import statement in tests to fix checkstyle failures
  • ORC-1699: Fix SparkBenchmark in Parquet format according to SPARK-40918
  • ORC-1704: Migration to Scala 2.13 of Apache Spark 3.5.1 at SparkBenchmark
  • ORC-1707: Fix sun.util.calendar IllegalAccessException when SparkBenchmark runs on JDK17
  • ORC-1708: Support data/compress options in Hive benchmark

Documentation

  • ORC-1668: Add merge command to Java tools documentation

v1.8.7

14 Apr 12:47
8d3c982

Changelog

Bug

ORC-1528: Fix readBytes potential overflow in RecordReaderUtils.ChunkReader#create
ORC-1602: [C++] limit compression block size

Test

ORC-1556: Add Rocky Linux 9 Docker Test
ORC-1557: Add GitHub Action CI for Docker Test
ORC-1560: Remove Java11 and clang variants from docker/os-list.txt in branch-1.8
ORC-1562: Bump guava to 33.0.0-jre
ORC-1578: Fix SparkBenchmark on sales data according to SPARK-40918
ORC-1621: Switch to oraclelinux9 from rocky9

Documentation

ORC-1536: Remove hive-storage-api link from maven-javadoc-plugin
ORC-1563: Fix orc.bloom.filter.fpp default value and orc.compress notes of Spark and Hive config docs

v1.9.3

21 Mar 05:18

Changelog

BugFix

  • ORC-634 Fix the json output for double NaN and infinite
  • ORC-1553 Reading information from Row group, where there are 0 records of SArg column
  • ORC-1563 Fix orc.bloom.filter.fpp default value and orc.compress notes of Spark and Hive config docs
  • ORC-1578 Fix SparkBenchmark according to SPARK-40918
  • ORC-1586 Fix IllegalAccessError when SparkBenchmark runs on JDK17
  • ORC-1602 [C++] limit compression block size
  • ORC-1607 Fix testDoubleNaNAndInfinite to use TestFileDump.checkOutput
  • ORC-1609 Fix the compilation problem of TestJsonFileDump in branch 1.9

Test

  • ORC-1556 Add Rocky Linux 9 Docker Test
  • ORC-1557 Add GitHub Action CI for Docker Test
  • ORC-1559 Remove Java11 and clang variants from docker/os-list.txt from branch-1.9

Task

  • ORC-1532 Upgrade opencsv to 5.9
  • ORC-1536 Remove hive-storage-api link from maven-javadoc-plugin
  • ORC-1576 Upgrade spark.jackson.version to 2.15.2 in bench module
  • ORC-1591 Lower log level from INFO to DEBUG in *ReaderImpl/WriterImpl/PhysicalFsWriter
  • ORC-1592 Suppress KeyProvider missing log
  • ORC-1616 Upgrade aircompressor to 0.26
  • ORC-1618 Disable building tests for snappy

Documentation

  • ORC-1535 Remove generated Java docs from source tree

v2.0.0

08 Mar 21:20
46eb6ff

This is a new major release for which we cannot provide a changelog.

Summary of notable changes

ORC-1547: Spin-off ORC Format
ORC-1572: Use Apache ORC Format 1.0.0
ORC-1507: Support Java 21
ORC-1512: Drop Java 8/11 and make Java 17 by default
ORC-1577: Use ZSTD as the default compression
ORC-1430: Use Hadoop 3.3.5 shaded clients
ORC-1456: Update Hadoop to 3.3.6
ORC-1251: Use Hadoop Vectored IO
ORC-1463: Support brotli codec
ORC-1100: Support vcpkg
ORC-1620: Add Apple Silicon Test Coverage

New Feature

ORC-998: Refactor compression output buffer within OutStream for better portability
ORC-1088: Support ZSTD_JNI and column compress to set compression level
ORC-1100: Support vcpkg
ORC-1251: Use Hadoop Vectored IO
ORC-1387: [C++] Support schema evolution from decimal to numeric/decimal
ORC-1440: Check for protobuf config based module
ORC-1463: Support brotli codec
ORC-1507: Use Zulu JDK distribution and switch from 21-ea to 21
ORC-1512: Drop Java 8/11 and make Java 17 by default
ORC-1531: Create orc-format module and repo
ORC-1545: Use orc-format 1.0.0-SNAPSHOT
ORC-1546: Use orc-format 1.0.0-alpha
ORC-1547: Spin-off ORC Format
ORC-1551: Use orc-format 1.0.0-beta
ORC-1572: Use Apache ORC Format 1.0.0
ORC-1585: [C++] Add orc-format_ep as a dependency of orc

Improvement

ORC-1459: Mark DataBuffer::size() and DataBuffer::capacity() as const
ORC-1460: specification: Clarify how dictionary entries are sorted
ORC-1461: Mark Int128::getHighBits() and Int128::getLowBits() as const
ORC-1472: Replace deprecated method in TestMurmur3.java
ORC-1479: Enhance example usage message to use Uber jar
ORC-1481: [C++] Better error message when TZDB is unavailable
ORC-1504: Add lower bound check in get API for DynamicIntArray
ORC-1506: Replacing deprecated valueOf() with recommended forNumber()
ORC-1509: Auto grant contributor role to first-time contributors
ORC-1520: Remove JDK 8 settings from pom
ORC-1567: Add the -ignoreExtension configuration to the sizes and count commands of orc-tools
ORC-1570: Add supportVectoredIO API to HadoopShimsCurrent and use it
ORC-1571: Supports displaying raw data size in the meta command of orc-tools
ORC-1577: Use ZSTD as the default compression
ORC-1580: Change default DataBuffer constructor to use reserve instead of resize
ORC-1595: Add a short-cut to skip tiny inputs for ZstdCodec.compress
ORC-1596: Remove redundant Zstd.isError JNI usage
ORC-1597: Set bloom filter fpp to 1%
ORC-1600: Reduce getStaticMemoryManager sync block in OrcFile
ORC-1601: Reduce get HadoopShims sync block in HadoopShimsFactory
ORC-1610: Reduce the number of hash computation in CuckooSetBytes
ORC-1613: Zstd decompression supports direct buffer
ORC-1631: Supports summary output in sizes command
ORC-1637: [C++] Port conan recipe from upstream conan center
ORC-1638: Avoid System.exit(0) in count command
ORC-1639: [C++] Reduce unnecessary compiler flags in CMake
ORC-1641: Remove sourceFileExcludes from maven-javadoc-plugin
ORC-1642: Avoid System.exit(0) in scan command
ORC-1593: Set orc.compression.zstd.level to 3 by default

Bug Fix

ORC-634: Fix the json output for double NaN and infinite
ORC-1455: [C++] Fix build failure on non-x86 with unused macro in CpuInfoUtil.cc
ORC-1473: Zero-copy zeroCopyReadRanges and releaseBuffer bugs
ORC-1476: Maven build fail with unsupported platform: protoc-3.17.3-osx-aarch_64.exe
ORC-1480: [C++] Build failed when the BUILD_CPP_ENABLE_METRICS is ON
ORC-1500: [C++] The partition field does not support English special characters
ORC-1528: When using the orc.min.disk.seek.size configuration to read extremely large ORC files, a java.nio.BufferOverflowException may occur.
ORC-1553: Reading information from Row group, where there are 0 records of SArg column
ORC-1563: Fix orc.bloom.filter.fpp default value and orc.compress notes of Spark and Hive config docs
ORC-1568: Use readDiskRanges if orc.use.zerocopy is enabled
ORC-1575: Use ASF Archive URL instead of Download URL
ORC-1578: Fix SparkBenchmark according to SPARK-40918
ORC-1588: Fix incorrect Decimal assert in LeafFilterFactory
ORC-1602: [C++] limit compression block size

Task

ORC-1422: Setting version to 2.0.0-SNAPSHOT
ORC-1434: Remove org.apache.hadoop from dependabot.yml
ORC-1484: Use JIRA_ACCESS_TOKEN in merge_orc_pr.py
ORC-1485: Enable checkstyle checks for test classes
ORC-1486: Fix checkstyle violations for tests in orc-core module
ORC-1492: Fix checkstyle violations for tests in mapreduce, tools, bench modules
ORC-1496: Use iterator to suggest backporting branches
ORC-1515: Skip publishing orc-example module
ORC-1516: Fix minor typo in comments in IOUtils
ORC-1518: Remove findbugs folders
ORC-1529: Fix minor typos in pom.xml
ORC-1530: Rename variables in RecordReaderUtils.ChunkReader#create
ORC-1535: Remove generated Java docs from source tree
ORC-1536: Remove hive-storage-api link from maven-javadoc-plugin
ORC-1540: Remove MacOS 11 from GitHub Action CI
ORC-1542: Use Pattern Matching for instanceof (JEP-394)
ORC-1549: Update libhdfspp.tar.gz by adding #include <cstdint>
ORC-1569: Remove HadoopShimsPre2_3, HadoopShimsPre2_6, HadoopShimsPre2_7 classes
ORC-1579: Add ASF Generative Tooling Guidance to PR template
ORC-1591: Lower log level from INFO ...

v1.9.2

10 Nov 10:19
5c0ad30

Changelog

Bug

ORC-1475: [C++] Fix the failure of UT when char is unsigned
ORC-1480: [C++] Fix build break w/ BUILD_CPP_ENABLE_METRICS=ON
ORC-1482: Adaptation to read ORC files created by CUDF
ORC-1489: Assign a writer id to CUDF
ORC-1525: Fix bad read in RleDecoderV2::readByte

Test

ORC-1431: Use parquet to 1.13.1 in bench module
ORC-1454: Update Spark to 3.4.1
ORC-1487: Enable checkstyle on src/test with checkstyle-suppressions.xml
ORC-1498: Add Debian 12 Docker test
ORC-1502: Upgrade Maven to 3.9.4
ORC-1505: Upgrade Spark to 3.5.0
ORC-1511: Bump Avro to 1.11.3 in bench module
ORC-1513: Upgrade snappy-java to 1.1.10.4 in bench module
ORC-1517: Bump snappy-java to 1.1.10.5 in bench module

Task

ORC-1497: Bump maven-enforcer-plugin to 3.4.0
ORC-1499: Add MacOS 13 and 14 to building.md
ORC-1507: Use Zulu JDK distribution and switch from 21-ea to 21
ORC-1518: Remove findbugs folders

Documentation

ORC-1503: Updated README.md with Maven version 3.9.4

v1.8.6

10 Nov 10:11
d9fcd78

Changelog

Bug

ORC-1525: Fix bad read in RleDecoderV2::readByte

Test

ORC-1432: Add MacOS 13 GitHub Action Job

Documentation

ORC-1499: Add MacOS 13 and 14 to building.md

v1.7.10

10 Nov 10:07
492c850

Changelog

Bug

ORC-1304: [C++] Fix seeking over empty PRESENT stream
ORC-1413: Fix for ORC row level filter issue with ACID table

Task

ORC-1482: Adaptation to read ORC files created by CUDF
ORC-1489: Assign a writer id to CUDF

v1.8.5

05 Sep 16:11

Changelog

Bug

ORC-1315: [C++] Byte to integer conversions fail on platforms with unsigned char type
ORC-1482: RecordReaderImpl.evaluatePredicateProto assumes floating point stats are always present

Task

ORC-1489: Assign a writer id to CUDF

v1.9.1

16 Aug 08:40

Changelog

Bug

  • ORC-1455 Fix build failure on non-x86 with unused macro in CpuInfoUtil.cc
  • ORC-1457 Fix ambiguous overload of Type::createRowBatch
  • ORC-1462 Bump aircompressor to 0.25 to fix JDK-8081450

v1.9.0

28 Jun 21:35
9d43439

Changelog

New Feature and Notable Changes

  • ORC-961 Expose metrics of the reader
  • ORC-1167 Support orc.row.batch.size configuration
  • ORC-1252 Expose io metrics for write operation
  • ORC-1301 Enforce C++17
  • ORC-1310 Allowlist support for plugin filter
  • ORC-1356 Use Intel AVX-512 instructions to accelerate the Rle-bit-packing decode
  • ORC-1385 Support schema evolution from numeric to numeric
  • ORC-1386 Support schema evolution from primitive to string group/decimal/timestamp

Improvement

  • ORC-827 Utilize Array copyOf
  • ORC-1170 Optimize the RowReader::seekToRow function
  • ORC-1232 Disable metrics collector by default
  • ORC-1278 Update Readme.md cmake to 3.12
  • ORC-1279 Update cmake version
  • ORC-1286 Replace DataBuffer with BlockBuffer in the BufferedOutputStream
  • ORC-1298 Support dedicated ColumnVectorBatch of numeric types
  • ORC-1302 Upgrade Github workflow to build on Windows
  • ORC-1306 Fixed indented code style for Java modules
  • ORC-1307 Add coding style enforcement
  • ORC-1314 Remove macros defined before C++11
  • ORC-1347 Use make_unique and make_shared when creating unique_ptr and shared_ptr
  • ORC-1348 TimezoneImpl constructor should pass std::vector<> & instead of std::vector<>
  • ORC-1349 Remove useless bufStream definition
  • ORC-1352 Remove ORC_[NOEXCEPT|NULLPTR|OVERRIDE|UNIQUE_PTR] macro usages
  • ORC-1355 Writer::addUserMetadata change parameter to reference
  • ORC-1373 Add log when DynamicByteArray length overflow
  • ORC-1401 Allow writing an intermediate footer
  • ORC-1421 Use PyArrow 12.0.0 in document

Bug

  • ORC-1225 Bump maven-assembly-plugin to 3.4.2
  • ORC-1266 DecimalColumnVector resets the isRepeating flag in the nextVector method
  • ORC-1273 Bump opencsv to 5.7.0
  • ORC-1297 Bump opencsv to 5.7.1
  • ORC-1304 throw ParseError when using SearchArgument with nested struct
  • ORC-1315 Byte to integer conversions fail on platforms with unsigned char type
  • ORC-1320 Fix build break of C++ code on docker images
  • ORC-1363 Upgrade zookeeper to 3.8.1
  • ORC-1368 Bump commons-csv to 1.10.0
  • ORC-1398 Bump aircompressor to 0.24
  • ORC-1399 Fix boolean type with useTightNumericVector enabled
  • ORC-1433 Fix comment in the Vector.hh
  • ORC-1447 Fix a bug in CpuInfoUtil.cc to support ARM platform
  • ORC-1449 Add -Wno-unused-macros for Clang 14.0
  • ORC-1450 Stop enforcing override keyword
  • ORC-1453 Fix fall-through warning cases

Task

  • ORC-1164 Setting version to 1.9.0-SNAPSHOT
  • ORC-1218 Bump apache pom to 27
  • ORC-1219 Remove redundant toString
  • ORC-1237 Remove a wrong image link to article-footer.png
  • ORC-1239 Upgrade maven-shade-plugin to 3.3.0
  • ORC-1256 Publish test-jar to maven central
  • ORC-1259 Bump slf4j to 2.0.0
  • ORC-1269 Remove FindBugs
  • ORC-1270 Move opencsv dependency to the tools module.
  • ORC-1274 Add a checkstyle rule to ban starting LAND and LOR
  • ORC-1275 Bump maven-jar-plugin to 3.3.0
  • ORC-1276 Bump slf4j to 2.0.1
  • ORC-1277 Bump maven-shade-plugin to 3.4.0
  • ORC-1284 Add permissions to GitHub Action labeler
  • ORC-1296 Bump reproducible-build-maven-plugin to 0.16
  • ORC-1311 Bump maven-shade-plugin to 3.4.1
  • ORC-1316 Bump slf4j.version to 2.0.4
  • ORC-1334 Bump slf4j.version to 2.0.6
  • ORC-1335 Bump netty-all to 4.1.86.Final
  • ORC-1351 Update PR Labeler definition
  • ORC-1358 Use spotless to format pom files
  • ORC-1371 Remove unsupported SLF4J bindings from classpath
  • ORC-1372 Bump zstd to v1.5.4
  • ORC-1375 Cancel old running ci tasks when a pr has a new commit
  • ORC-1377 Enforce override keyword
  • ORC-1383 Upgrade aircompressor to 0.22
  • ORC-1395 Enforce license check
  • ORC-1396 Bump slf4j to 2.0.7
  • ORC-1410 Bump zstd to v1.5.5
  • ORC-1411 Remove Ubuntu18.04 from docker-based tests
  • ORC-1419 Bump protobuf-java to 3.22.3
  • ORC-1428 Setup GitHub Action CI on branch-1.9
  • ORC-1443 Enforce Java version
  • ORC-1444 Enforce JDK Bytecode version
  • ORC-1446 Publish snapshot from branch-1.9

Test

  • ORC-1231 Update supported OS list in building.md
  • ORC-1233 Bump junit to 5.9.0
  • ORC-1234 Upgrade objenesis to 3.2 in Spark benchmark
  • ORC-1235 Bump avro to 1.11.1
  • ORC-1240 Update site README to use apache/orc-dev
  • ORC-1241 Use apache/orc-dev DockerHub repository in Docker tests
  • ORC-1250 Bump mockito to 4.7.0
  • ORC-1254 Add spotbugs check
  • ORC-1258 Bump byte-buddy to 1.12.14
  • ORC-1262 Bump maven-checkstyle-plugin to 3.2.0
  • ORC-1265 Upgrade spotbugs to 4.7.2
  • ORC-1267 Bump mockito to 4.8.0
  • ORC-1271 Bump spotbugs-maven-plugin to 4.7.2.0
  • ORC-1272 Bump byte-buddy to 1.12.16
  • ORC-1300 Update Spark to 3.3.1 and its dependencies
  • ORC-1303 Upgrade GoogleTest to 1.12.1
  • ORC-1318 Upgrade mockito.version to 4.9.0
  • ORC-1319 Upgrade byte-buddy to 1.12.19
  • ORC-1321 Bump checkstyle to 10.5.0
  • ORC-1322 Upgrade centos7 docker image to use gcc9
  • ORC-1324 Use Java 19 instead of 18 in GHA
  • ORC-1333 Bump mockito to 4.10.0
  • ORC-1341 Bump mockito to 4.11.0
  • ORC-1353 Bump byte-buddy to 1.12.21
  • ORC-1359 Bump byte-buddy to 1.12.22
  • ORC-1366 Bump checkstyle to 10.7.0
  • ORC-1367 Bump maven-enforcer-plugin to 3.2.1
  • ORC-1369 Bump byte-buddy to 1.12.23
  • ORC-1370 Bump snappy-java to 1.1.9.1
  • ORC-1374 Update Spark to 3.3.2
  • ORC-1378 Add slf4j impl to avoid warning message in example module
  • ORC-1379 Upgrade spotbugs to 4.7.3.2
  • ORC-1380 Upgrade checkstyle to 10.8.0
  • ORC-1394 Bump maven-assembly-plugin to 3.5.0
  • ORC-1397 Bump checkstyle to 10.9.2
  • ORC-1405 Bump spotbugs-maven-plugin to 4.7.3.4
  • ORC-1406 Bump maven-enforcer-plugin to 3.3.0
  • ORC-1408 Add testVectorBatchHasNull test case and comment
  • ORC-1415 Add Java 20 to GitHub Action CI
  • ORC-1417 Bump checkstyle to 10.10.0
  • ORC-1418 Bump junit to 5.9.3
  • ORC-1426 Use Java 21-ea instead of 20 in GitHub Action
  • ORC-1435 Bump maven-checkstyle-plugin to 3.3.0
  • ORC-1436 Bump snappy-java to 1.1.10.0
  • ORC-1452 Use the latest OS versions in variant tests