Update version to 23.0.0 and update CHANGELOG, add label_issue.py script #2734

Merged
merged 18 commits into from Sep 16, 2022
Changes from 13 commits
2 changes: 1 addition & 1 deletion .gitignore
@@ -20,7 +20,7 @@ __blobstorage__

# .bak files
*.bak

*.bak2
# OS-specific .gitignores

# Mac .gitignore
115 changes: 114 additions & 1 deletion CHANGELOG-old.md

Large diffs are not rendered by default.

171 changes: 74 additions & 97 deletions CHANGELOG.md

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions arrow-flight/Cargo.toml
@@ -18,7 +18,7 @@
[package]
name = "arrow-flight"
description = "Apache Arrow Flight"
-version = "22.0.0"
+version = "23.0.0"
edition = "2021"
rust-version = "1.62"
authors = ["Apache Arrow <dev@arrow.apache.org>"]
@@ -27,7 +27,7 @@ repository = "https://github.com/apache/arrow-rs"
license = "Apache-2.0"

[dependencies]
-arrow = { path = "../arrow", version = "22.0.0", default-features = false, features = ["ipc"] }
+arrow = { path = "../arrow", version = "23.0.0", default-features = false, features = ["ipc"] }
base64 = { version = "0.13", default-features = false }
tonic = { version = "0.8", default-features = false, features = ["transport", "codegen", "prost"] }
bytes = { version = "1", default-features = false }
2 changes: 1 addition & 1 deletion arrow-flight/README.md
@@ -27,7 +27,7 @@ Add this to your Cargo.toml:

```toml
[dependencies]
-arrow-flight = "22.0.0"
+arrow-flight = "23.0.0"
```

Apache Arrow Flight is a gRPC based protocol for exchanging Arrow data between processes. See the blog post [Introducing Apache Arrow Flight: A Framework for Fast Data Transport](https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/) for more information.
4 changes: 2 additions & 2 deletions arrow-pyarrow-integration-testing/Cargo.toml
@@ -18,7 +18,7 @@
[package]
name = "arrow-pyarrow-integration-testing"
description = ""
-version = "22.0.0"
+version = "23.0.0"
homepage = "https://github.com/apache/arrow-rs"
repository = "https://github.com/apache/arrow-rs"
authors = ["Apache Arrow <dev@arrow.apache.org>"]
@@ -32,7 +32,7 @@ name = "arrow_pyarrow_integration_testing"
crate-type = ["cdylib"]

[dependencies]
-arrow = { path = "../arrow", version = "22.0.0", features = ["pyarrow"] }
+arrow = { path = "../arrow", version = "23.0.0", features = ["pyarrow"] }
pyo3 = { version = "0.17", features = ["extension-module"] }

[package.metadata.maturin]
2 changes: 1 addition & 1 deletion arrow/Cargo.toml
@@ -17,7 +17,7 @@

[package]
name = "arrow"
-version = "22.0.0"
+version = "23.0.0"
description = "Rust implementation of Apache Arrow"
homepage = "https://github.com/apache/arrow-rs"
repository = "https://github.com/apache/arrow-rs"
4 changes: 2 additions & 2 deletions arrow/README.md
@@ -35,7 +35,7 @@ This crate is tested with the latest stable version of Rust. We do not currently

The arrow crate follows the [SemVer standard](https://doc.rust-lang.org/cargo/reference/semver.html) defined by Cargo and works well within the Rust crate ecosystem.

-However, for historical reasons, this crate uses versions with major numbers greater than `0.x` (e.g. `22.0.0`), unlike many other crates in the Rust ecosystem which spend extended time releasing versions `0.x` to signal planned ongoing API changes. Minor arrow releases contain only compatible changes, while major releases may contain breaking API changes.
+However, for historical reasons, this crate uses versions with major numbers greater than `0.x` (e.g. `23.0.0`), unlike many other crates in the Rust ecosystem which spend extended time releasing versions `0.x` to signal planned ongoing API changes. Minor arrow releases contain only compatible changes, while major releases may contain breaking API changes.

## Feature Flags

@@ -61,7 +61,7 @@ The [Apache Arrow Status](https://arrow.apache.org/docs/status.html) page lists

## Safety

-Arrow seeks to uphold the Rust Soundness Pledge as articulated eloquently [here](https://raphlinus.github.io/rust/22.0.01/18/soundness-pledge.html). Specifically:
+Arrow seeks to uphold the Rust Soundness Pledge as articulated eloquently [here](https://raphlinus.github.io/rust/23.0.01/18/soundness-pledge.html). Specifically:

> The intent of this crate is to be free of soundness bugs. The developers will do their best to avoid them, and welcome help in analyzing and fixing them

2 changes: 1 addition & 1 deletion dev/release/README.md
@@ -78,7 +78,7 @@ CHANGELOG_GITHUB_TOKEN=<TOKEN> ./dev/release/update_change_log.sh
git commit -a -m 'Create changelog'

# update versions
-sed -i '' -e 's/14.0.0/22.0.0/g' `find . -name 'Cargo.toml' -or -name '*.md' | grep -v CHANGELOG.md`
+sed -i '' -e 's/14.0.0/23.0.0/g' `find . -name 'Cargo.toml' -or -name '*.md' | grep -v CHANGELOG.md`
git commit -a -m 'Update version'
```
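
The `sed` pattern in this step leaves its dots unescaped, so each `.` matches any character and the substitution can also rewrite strings that only resemble the old version, such as a dated URL path. A minimal Python sketch of a literal replacement follows; `bump_version` is a hypothetical helper that is not part of this repository, and the sample URL is made up for illustration.

```python
import re


def bump_version(text: str, old: str, new: str) -> str:
    """Replace literal occurrences of `old` with `new` (hypothetical helper).

    re.escape makes every '.' a literal dot, and the lookarounds skip
    matches that are embedded in a longer run of digits or dots.
    """
    pattern = re.compile(r"(?<![\d.])" + re.escape(old) + r"(?!\d)")
    return pattern.sub(new, text)


cargo_line = 'version = "22.0.0"'
# Illustrative URL only; the date segment looks a little like a version.
url = "https://example.invalid/rust/2020/01/18/post.html"

# With unescaped dots, the regex 20.0.0 matches the date fragment "2020/0".
loose_hit = re.search("20.0.0", url)

bumped = bump_version(cargo_line, "22.0.0", "23.0.0")
untouched = bump_version(url, "20.0.0", "21.0.0")
```

The same effect is available in `sed` by escaping the dots in the pattern (`22\.0\.0`).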

151 changes: 151 additions & 0 deletions dev/release/label_issues.py
@@ -0,0 +1,151 @@
##############################################################################
FYI @iajoiner -- here is my hacky python script to update github labels

It does appear to work... But is definitely a bit of a hack

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
##############################################################################

# Python script to add labels to github issues from the PRs that closed them
#
# Required setup:
# $ pip install PyGithub
#
# ARROW_GITHUB_API_TOKEN needs to be set to your github token
from github import Github
import os
import re



# get all cross referenced issues from the named issue
# (aka linked PRs)
# issue = arrow_repo.get_issue(issue_number)
def get_cross_referenced_issues(issue):
    all_issues = set()
    for timeline_item in issue.get_timeline():
        if timeline_item.event == 'cross-referenced' and timeline_item.source.type == 'issue':
            all_issues.add(timeline_item.source.issue)

    # convert to list
    return [i for i in all_issues]


# labels not to transfer
BLACKLIST_LABELS = {'development-process', 'api-change'}

# Adds labels to the specified issue with the labels from linked pull requests
def relabel_issue(arrow_repo, issue_number):
    #print(issue_number, 'fetching issue')
    issue = arrow_repo.get_issue(issue_number)
    print('considering issue', issue.html_url)
    linked_issues = get_cross_referenced_issues(issue)
    #print(' ', 'cross referenced issues:', linked_issues)

    # Figure out what labels need to be added, if any
    existing_labels = set()
    for label in issue.labels:
        existing_labels.add(label.name)

    # find all labels to add
    for linked_issue in linked_issues:
        if linked_issue.pull_request is None:
            print(' ', 'not pull request, skipping', linked_issue.html_url)
            continue

        if linked_issue.repository.name != 'arrow-rs':
            print(' ', 'not in arrow-rs, skipping', linked_issue.html_url)
            continue

        print(' ', 'finding labels for linked pr', linked_issue.html_url)
        linked_labels = set()
        for label in linked_issue.labels:
            linked_labels.add(label.name)
        #print(' ', 'existing labels:', existing_labels)

        labels_to_add = linked_labels.difference(existing_labels)

        # remove any blacklist labels, if any
        for l in BLACKLIST_LABELS:
            labels_to_add.discard(l)

        if len(labels_to_add) > 0:
            print(' ', 'adding labels: ', labels_to_add, 'to', issue.number)
            for label in labels_to_add:
                issue.add_to_labels(label)
                print(' ', 'added', label)
                existing_labels.add(label)

            # leave a note about what updated these labels
            issue.create_comment('`label_issue.py` automatically added labels {} from #{}'.format(labels_to_add, linked_issue.number))


# what section headings in the CHANGELOG.md file contain closed issues that may need relabeling
ISSUE_SECTION_NAMES = ['Closed issues:', 'Fixed bugs:', 'Implemented enhancements:']

# find all possible issues / bugs by scraping CHANGELOG.md
#
# TODO: Find all tickets merged since this tag
# The compare api can find all commits since that tag
# I could not find a good way in the github API to find the PRs connected to a commit
#since_tag = '22.0.0'

def find_issues_from_changelog():
    script_dir = os.path.dirname(os.path.realpath(__file__))
    path = os.path.join(script_dir, '..', '..', 'CHANGELOG.md')

    issues = set()

    # Flag that
    in_issue_section = False

    with open(path, 'r') as f:
        for line in f:
            #print('line: ', line)
            line = line.strip()
            if line.startswith('**'):
                section_name = line.replace('**', '')
                if section_name in ISSUE_SECTION_NAMES:
                    #print(' ', 'is issue section', section_name)
                    in_issue_section = True
                else:
                    #print(' ', 'is not issue section', section_name)
                    in_issue_section = False

            if in_issue_section:
                match = re.search('#([\d]+)', line)
                if match is not None:
                    #print(' ', 'reference', match.group(1))
                    issues.add(match.group(1))

    # Convert to list of number
    return sorted([int(i) for i in issues])


if __name__ == '__main__':
    print('Attempting to label github issues from their corresponding PRs')

    issues = find_issues_from_changelog()
    print('Issues found in CHANGELOG: ', issues)

    github_token = os.environ.get("ARROW_GITHUB_API_TOKEN")

    print('logging into GITHUB...')
    github = Github(github_token)

    print('getting github repo...')
    arrow_repo = github.get_repo('apache/arrow-rs')

    for issue in issues:
        relabel_issue(arrow_repo, issue)
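
The core of `relabel_issue` is a set computation: the labels on the linked PR, minus the labels the issue already carries, minus the blacklisted ones. A minimal sketch of that step in isolation, with no GitHub calls, might look like the following. The function name `labels_to_transfer` and the sample labels `bug` and `parquet` are illustrative assumptions; only the two blacklist entries come from the script.

```python
# Labels that should never be copied from a PR to its issue (from the script).
BLACKLIST_LABELS = {'development-process', 'api-change'}


def labels_to_transfer(issue_labels, pr_labels, blacklist=BLACKLIST_LABELS):
    """Return the PR labels the issue lacks, excluding blacklisted ones.

    Hypothetical helper: it mirrors the set difference plus discard loop
    in relabel_issue, but works on plain sets of label names.
    """
    return set(pr_labels) - set(issue_labels) - set(blacklist)


# The issue already has 'bug'; the PR adds 'parquet' and a blacklisted label.
result = labels_to_transfer({'bug'}, {'bug', 'parquet', 'api-change'})
print(sorted(result))  # prints ['parquet']
```

Keeping this step free of API calls makes it straightforward to check before running the script against real issues.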
35 changes: 32 additions & 3 deletions dev/release/update_change_log.sh
@@ -29,16 +29,45 @@

set -e

-SINCE_TAG="21.0.0"
-FUTURE_RELEASE="22.0.0"
+SINCE_TAG="22.0.0"
+FUTURE_RELEASE="23.0.0"

SOURCE_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SOURCE_TOP_DIR="$(cd "${SOURCE_DIR}/../../" && pwd)"

OUTPUT_PATH="${SOURCE_TOP_DIR}/CHANGELOG.md"
+OLD_OUTPUT_PATH="${SOURCE_TOP_DIR}/CHANGELOG-old.md"

# remove license header so github-changelog-generator has a clean base to append
-sed -i.bak '1,18d' "${OUTPUT_PATH}"
+sed -i.bak '1,21d' "${OUTPUT_PATH}"
+sed -i.bak '1,21d' "${OLD_OUTPUT_PATH}"
+# remove the github-changelog-generator footer from the old CHANGELOG.md
+LINE_COUNT=$(wc -l <"${OUTPUT_PATH}")
+sed -i.bak2 "$(( $LINE_COUNT-4+1 )),$ d" "${OUTPUT_PATH}"
+
+# Copy the previous CHANGELOG.md to CHANGELOG-old.md
+echo '<!---
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Historical Changelog
+' | cat - "${OUTPUT_PATH}" "${OLD_OUTPUT_PATH}" > "${OLD_OUTPUT_PATH}".tmp
+mv "${OLD_OUTPUT_PATH}".tmp "${OLD_OUTPUT_PATH}"

# use exclude-tags-regex to filter out tags used for object_store
# crates and only look at tags that DO NOT begin with `object_store_`
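
The archiving steps in this script (strip the 21-line header from both files, drop the last 4 lines of the current changelog, then write a fresh header followed by the current and old entries) can be sketched in Python on plain strings. This is only an illustration under stated assumptions: `archive_changelog` and its parameters are hypothetical names, and the defaults simply mirror the `1,21d` and last-4-lines deletions above.

```python
def archive_changelog(current: str, old: str, header: str,
                      header_lines: int = 21, footer_lines: int = 4) -> str:
    """Return new contents for the historical changelog (hypothetical helper).

    current: text of CHANGELOG.md, including its license header and footer
    old:     text of CHANGELOG-old.md, including its license header
    header:  text to place at the top of the combined file
    """
    cur = current.splitlines(keepends=True)
    prev = old.splitlines(keepends=True)

    # Drop the leading license header from both files.
    cur = cur[header_lines:]
    prev = prev[header_lines:]

    # Drop the generator footer from the end of the current changelog.
    keep = max(0, len(cur) - footer_lines)
    cur = cur[:keep]

    # New header first, then the most recent entries, then the older ones.
    return header + "".join(cur) + "".join(prev)


# Tiny example with a 2-line header and a 1-line footer.
combined = archive_changelog(
    "h1\nh2\nnew entry\nfooter\n",
    "h1\nh2\nold entry\n",
    "HEADER\n",
    header_lines=2,
    footer_lines=1,
)
```

Working on strings keeps the line arithmetic easy to test; the shell script instead edits the files in place and leaves `.bak` and `.bak2` backups behind, which is why `*.bak2` is added to `.gitignore`.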
2 changes: 1 addition & 1 deletion integration-testing/Cargo.toml
@@ -18,7 +18,7 @@
[package]
name = "arrow-integration-testing"
description = "Binaries used in the Arrow integration tests"
-version = "22.0.0"
+version = "23.0.0"
homepage = "https://github.com/apache/arrow-rs"
repository = "https://github.com/apache/arrow-rs"
authors = ["Apache Arrow <dev@arrow.apache.org>"]
6 changes: 3 additions & 3 deletions parquet/Cargo.toml
@@ -17,7 +17,7 @@

[package]
name = "parquet"
-version = "22.0.0"
+version = "23.0.0"
license = "Apache-2.0"
description = "Apache Parquet implementation in Rust"
homepage = "https://github.com/apache/arrow-rs"
@@ -41,7 +41,7 @@ zstd = { version = "0.11.1", optional = true, default-features = false }
chrono = { version = "0.4", default-features = false, features = ["alloc"] }
num = { version = "0.4", default-features = false }
num-bigint = { version = "0.4", default-features = false }
-arrow = { path = "../arrow", version = "22.0.0", optional = true, default-features = false, features = ["ipc"] }
+arrow = { path = "../arrow", version = "23.0.0", optional = true, default-features = false, features = ["ipc"] }
base64 = { version = "0.13", default-features = false, features = ["std"], optional = true }
clap = { version = "3", default-features = false, features = ["std", "derive", "env"], optional = true }
serde_json = { version = "1.0", default-features = false, features = ["std"], optional = true }
@@ -61,7 +61,7 @@ flate2 = { version = "1.0", default-features = false, features = ["rust_backend"
lz4 = { version = "1.23", default-features = false }
zstd = { version = "0.11", default-features = false }
serde_json = { version = "1.0", features = ["std"], default-features = false }
-arrow = { path = "../arrow", version = "22.0.0", default-features = false, features = ["ipc", "test_utils", "prettyprint", "json"] }
+arrow = { path = "../arrow", version = "23.0.0", default-features = false, features = ["ipc", "test_utils", "prettyprint", "json"] }

[package.metadata.docs.rs]
all-features = true
4 changes: 2 additions & 2 deletions parquet_derive/Cargo.toml
@@ -17,7 +17,7 @@

[package]
name = "parquet_derive"
-version = "22.0.0"
+version = "23.0.0"
license = "Apache-2.0"
description = "Derive macros for the Rust implementation of Apache Parquet"
homepage = "https://github.com/apache/arrow-rs"
@@ -35,4 +35,4 @@ proc-macro = true
proc-macro2 = { version = "1.0", default-features = false }
quote = { version = "1.0", default-features = false }
syn = { version = "1.0", default-features = false }
-parquet = { path = "../parquet", version = "22.0.0" }
+parquet = { path = "../parquet", version = "23.0.0" }
4 changes: 2 additions & 2 deletions parquet_derive/README.md
@@ -32,8 +32,8 @@ Add this to your Cargo.toml:

```toml
[dependencies]
-parquet = "22.0.0"
-parquet_derive = "22.0.0"
+parquet = "23.0.0"
+parquet_derive = "23.0.0"
```

and this to your crate root:
6 changes: 3 additions & 3 deletions parquet_derive_test/Cargo.toml
@@ -17,7 +17,7 @@

[package]
name = "parquet_derive_test"
-version = "22.0.0"
+version = "23.0.0"
license = "Apache-2.0"
description = "Integration test package for parquet-derive"
homepage = "https://github.com/apache/arrow-rs"
@@ -29,6 +29,6 @@ publish = false
rust-version = "1.62"

[dependencies]
-parquet = { path = "../parquet", version = "22.0.0", default-features = false }
-parquet_derive = { path = "../parquet_derive", version = "22.0.0", default-features = false }
+parquet = { path = "../parquet", version = "23.0.0", default-features = false }
+parquet_derive = { path = "../parquet_derive", version = "23.0.0", default-features = false }
chrono = { version="0.4.19", default-features = false, features = [ "clock" ] }