Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: python notebooks template repo #1195

Merged
merged 30 commits into from Oct 19, 2021

Conversation

loferris
Copy link
Contributor

Adding a template repo for Python notebooks, updating common.py

@loferris loferris requested a review from a team August 25, 2021 23:14
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Aug 25, 2021
@loferris loferris changed the title Python notebooks templates feat: python notebooks template repo Aug 25, 2021
import sys

MINIMUM_MAJOR_VERSION = 3
MINIMUM_MINOR_VERSION = 3
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be 5?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is current with AI platform (updated 12 days ago), but will do!

sys.version_info.major < MINIMUM_MAJOR_VERSION
and sys.version_info.minor < MINIMUM_MINOR_VERSION
):
print("Error: Python version less than 3.5")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe dynamically generate the string to reflect the constants?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I understand that this might be old code that was inherited from several people.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fixing it in next commit!

@@ -0,0 +1,40 @@
steps:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make sure you have the most up-to-date version of this folder? I see this file is using an older version.

Compare with: https://github.com/GoogleCloudPlatform/ai-platform-samples/blob/master/.cloud-build/notebook-execution-test-cloudbuild.yaml

nbconvert>=6.0
papermill
numpy
pandas
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update file

@ivanmkc
Copy link

ivanmkc commented Aug 25, 2021

Question, I see you are including synthtool/gcp/templates/python_notebooks/.cloud-build/test_folders.txt, which is an empty file.

Does that mean this will overwrite any changes in the downstream repos?
Perhaps better to leave this file out.

Regarding non-code files, such as MD, READMES, .gitignore:
I prefer to leave out all of these files and scope this synthtool integration to only the .cloud-build folder.

I imagine different teams can pick and choose what to integrate from synthtool. They might want their own README's but just want the notebook testing pipeline in .cloud-build.

@loferris
Copy link
Contributor Author

shall we include just .github and .cloud-build in this PR? @ivanmkc


if (
sys.version_info.major < MINIMUM_MAJOR_VERSION
and sys.version_info.minor < MINIMUM_MINOR_VERSION
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might need to be or? Wouldn't 2.7 result in the else statement?

Alternatively, you could flip the if/else and check that major and minor are >= expected value. (Probably worth checking for == 3 for the major since who knows what Python 4 will be if that ever happens.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in latest commit!

@@ -0,0 +1,8 @@
steps:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not 100% necessary, but YAML supports comments, so we really should have a license header here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added!

@@ -0,0 +1,20 @@
# Google Cloud [PRODUCT] Notebooks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably best to omit this file for now. Or perhaps template it (see other files with .j2 extension)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed with recent commit!

Copy link

@ivanmkc ivanmkc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except that I flagged some files to remove.

@@ -0,0 +1,31 @@
---
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unrelated to notebook testing so perhaps remove it?

@@ -0,0 +1,20 @@
---
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unrelated to notebook testing so perhaps remove it?

@@ -0,0 +1,12 @@
Fixes #<issue_number_goes_here>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unrelated to notebook testing so perhaps remove it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm! removed in latest commit

else:
manager.delete(resource)

print("")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this print statement need to be here?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was probably to add a newline for formatting purposes. Feel free to remove.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing, but depending we can add back formatting if it makes things confusing.

print("")


is_dry_run = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this setting of the dry run override any time it's set to true?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is hard-coded, but could be changed to accept an arg. There was no need at the time for an arg, so it was left hard-coded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ivanmkc I've rewritten the cleanup file and updated the source files for the repo in the last two commits if you want to take a look!

for notebook in notebooks:
print(f"Running notebook: {notebook}")

# TODO: Handle cases where multiple notebooks have the same name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an issue open tracking this todo somewhere?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not at the moment, could track in the synthtool issue tracker.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, removing and let's track via issues.

stderr_file=sys.stderr,
)

except Exception:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little skittish about catching ALL exceptions, but at least we're raising them so I think I feel better about that. Do other @googleapis/python-samples-owners have opinions here?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an exception isn't caught, then Cloud Build will fail on that step, basically failing all the notebooks (even ones that would have otherwise passed.)

Also, the cleanup and debug steps at the end of the Cloud Build pipeline will never run, thus the user wouldn't be able to debug the error.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this whole file seems out of date, it should be matching this file: https://github.com/GoogleCloudPlatform/ai-platform-samples/blob/master/.cloud-build/ExecuteNotebook.py

@@ -0,0 +1 @@
google-cloud-aiplatform
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For most other requirements files, we prefer to pin to specific versions and have them updated by renovate bot. (You can add this non-standard filename to the renovate.json). Is there a reason we're not doing that here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh, I think we just weren't aware of the other standards for synthtool. I'll look into changing this in the PR.

args:
- -c
- 'python3 -m pip install -U -r .cloud-build/cleanup/cleanup-requirements.txt && python3 .cloud-build/cleanup/cleanup.py'
timeout: 86400s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why we have such a high timeout?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leaving as is!

# limitations under the License.
steps:
# Show the gcloud info and check if gcloud exists
- name: ${_PYTHON_IMAGE}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the cloud build variables needed documented somewhere for our internal reference? I'm not sure if we do this well in general or not - @dinagraves opinions welcome here 😁

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These would generally be in trigger definition, which doesn't have a lot of visibility outside of the project itself -- unless you're using terraform. However, I would suggest adding defaults in the cloudbuild.yaml - these can be overwritten by anyone when they run it, but it'll provide more context and a good default.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, adding defaults would work fine @loferris
Also, +1 on documentation.

FYI $_BASE_BRANCH is injected by Cloud Build (see https://cloud.google.com/build/docs/configuring-builds/substitute-variable-values#using_default_substitutions)

$_FORCE_BASE_BRANCH is used because sometimes that variable is not populated (may be a bug), when you rerun a build.

args:
- -c
- 'echo "Download executed notebooks with this command: \"mkdir -p artifacts && gsutil rsync -r gs://${_GCS_ARTIFACTS_BUCKET}/test-artifacts/PR_${_PR_NUMBER}/BUILD_${BUILD_ID} artifacts/\"" && if [ "$(ls -A /workspace/${BUILD_ID}/failure | wc -l)" -ne 0 ]; then exit 1; else exit 0; fi'
timeout: 86400s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flagging another very high timeout

Copy link

@ivanmkc ivanmkc Sep 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's 24 hours, VertexAI notebooks can take many hours to run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leaving as is!

@@ -0,0 +1,8 @@
ipython>=7.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar to above comment, can we pin to exact versions and use renovate to keep up with it? or is that less of an issue with a cloud build script compared to with samples maintenance?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's possible, although I've never used renovate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leahecole I think my main question here is if this template will be incorporated into repos downstream, wouldn't that mean that the template itself should contain its own renovate bot so its usable downstream? And if I do that, will that interfere with the renovate set up for this repo? I can add the requirements to this repo's requirements.txt at project root but I'm not sure how that would be passed to all the downstream repos using this template.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. Will it be incorporated into samples libraries with owlbot, which already have renovatebot enabled? If so, we would want to modify this template. I believe that adding a new renovate.json template would make our 🤖 friend and the folks who maintain it extremely confused.

Or, are these notebook templates going to be used ad-hoc in random repos? If it's going to be used in random repos under the GoogleCloudPlatform and googleapis orgs, we do have renovatebot enabled at the org level and the way to bring it to life would be to create a new renovate.json template (like that other one I linked).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than python-docs-samples I'm not sure any of the other repos already have owlbot enabled, though that would be our next step! However, both of the main repos currently are under GCP not googleapis.

Would the suggestion be to add a renovate.json to the root of this template, make sure the right file extensions are flagged, and then leave blank (?) requirements files?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They all probably have it, but similar to renovate, it can magically be turned on with the addition of its config file (in owlbots case, owlbot.py)

Given that it sounds like the main repos are in the GCP org, I think that we actually might be covered - renovate will match any requirements.txt file now that I think about it, so as long as it's in a repo that has a renovate.json we are good!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome, so I've gone ahead and made sure all dependencies have specific versions and made sure the file names are easy for renovatebot to parse. see latest two commits!

Copy link

@ivanmkc ivanmkc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking on execute_notebooks.

It should match: https://github.com/GoogleCloudPlatform/ai-platform-samples/blob/master/.cloud-build/ExecuteNotebook.py

Can you double-check that all the other files match https://github.com/GoogleCloudPlatform/ai-platform-samples/blob/master/.cloud-build/**, unless there are intentional changes?

@@ -0,0 +1,160 @@
#!/usr/bin/env python
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do - I think it's just out of date because the PR has been in flight for a minute.

Copy link

@ivanmkc ivanmkc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

entrypoint: /bin/sh
args:
- -c
- 'python3 -m pip freeze && python3 .cloud-build/ExecuteChangedNotebooks.py --test_paths_file "${_TEST_PATHS_FILE}" --base_branch "${_FORCED_BASE_BRANCH}" --output_folder ${BUILD_ID} --variable_project_id ${PROJECT_ID} --variable_region ${_GCP_REGION}'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExecuteChangedNotebooks.py -> execute_changed_notebooks.py

Copy link

@ivanmkc ivanmkc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how to add it in this repo, but can we add CI tests to test that all of this works?

# limitations under the License.
steps:
# Show the gcloud info and check if gcloud exists
- name: ${_PYTHON_IMAGE}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These would generally be in trigger definition, which doesn't have a lot of visibility outside of the project itself -- unless you're using terraform. However, I would suggest adding defaults in the cloudbuild.yaml - these can be overwritten by anyone when they run it, but it'll provide more context and a good default.

entrypoint: /bin/sh
args:
- -c
- 'gcloud config list'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this going to work? I guess I don't know what your python image is. Usually I would use the gcloud SDK image for this. If you're using something custom, I would add some comments about it.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leaving as is for now

args:
- -c
- 'gcloud config list'
# Check the Python version

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do all of these need to be separate steps? Since you're using the shell entrypoint, you could combine some of them, unless you want them to be distinct failures.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ivanmkc what was your thinking on combining these?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are kept separate for readability.

entrypoint: /bin/sh
args:
- -c
- 'python3 -m pip freeze && python3 .cloud-build/ExecuteChangedNotebooks.py --test_paths_file "${_TEST_PATHS_FILE}" --base_branch "${_FORCED_BASE_BRANCH}" --output_folder ${BUILD_ID} --variable_project_id ${PROJECT_ID} --variable_region ${_GCP_REGION}'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend splitting this up on separate lines since it's quite long. See example here: https://github.com/dinagraves/cloud-build-empathy-solution/blob/main/cloudbuild.yaml#L36

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed in last commit - I followed that pattern for the cited example as I didn't see more multi-line commands.

- 'python3 -m pip freeze && python3 .cloud-build/ExecuteChangedNotebooks.py --test_paths_file "${_TEST_PATHS_FILE}" --base_branch "${_FORCED_BASE_BRANCH}" --output_folder ${BUILD_ID} --variable_project_id ${PROJECT_ID} --variable_region ${_GCP_REGION}'
env:
- 'IS_TESTING=1'
# Manually copy artifacts to GCS

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed in latest commit!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Dinah - I had to revert this change as it wasn't working for our particular setup!

@loferris
Copy link
Contributor Author

Not sure how to add it in this repo, but can we add CI tests to test that all of this works?

Agreed, let's add it as an issue for a followup PR and I'll investigate with Synthtool folks.

@loferris
Copy link
Contributor Author

@dinagraves the current commit reflects some adjustments based on your comments and @ivanmkc's input! would love any additional feedback before I ping folks on approvals!

Copy link
Contributor

@chingor13 chingor13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used by multiple repositories? If not, we can just manage the files in the one python-notebooks repository rather than indirectly managing the files here.

@loferris
Copy link
Contributor Author

@chingor13 It's being used in multiple repositories and hopefully many more.

@loferris loferris dismissed ivanmkc’s stale review October 18, 2021 16:09

I made the changes requested.

Copy link
Contributor

@bcoe bcoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very much in support of codifying how notebooks are created.

A few non blocking questions:

  • it looks like unlike other templates this will use Cloud Build for testing, rather than kokoro?
  • where will the environment variables be set, such as project name, for running the notebook tests.
  • what project will the notebook tests run against?

@loferris
Copy link
Contributor Author

  1. Yes, we're using Cloud Build rather than Kokoro at the moment and are in active convo about whether that may become a variant standard for Python sample testing.
  2. I think right now we're leaning into hardcoding defaults, per @dinagraves suggestion of cloud-build.yaml best practices. The current PR is sort of splitting the difference, and the environmental defaults at the moment are set in Cloud Build I believe.
  3. Our current expectation is that participating repos would decide where they wanted their cloud-build testing pipeline to live in their projects, and we'll be creating documentation for that side of things. (At the moment the current users of this template are all directly involved in the notebooks working group.)

I'm going to make one more commit to fix a little bug I found on my testing branch before merging!

@loferris loferris merged commit 02cd0ee into googleapis:master Oct 19, 2021
@loferris loferris deleted the python-notebooks-templates branch October 19, 2021 23:04
@leahecole
Copy link
Contributor

I just came here to catch up on the comments and am thrilled this is merged!! Great work @loferris !!! 😁

@leahecole
Copy link
Contributor

(and great work many many reviewers!! 😀 )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants