Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Storage] Smoke tests for Storage mounting #1445

Merged
merged 36 commits into from Dec 15, 2022
Merged

Conversation

romilbhardwaj
Copy link
Collaborator

Our current smoke tests for Storage only test the Storage class. This PR adds smoke tests for the mounting functionality to make sure our mounting backends (goofys/s3fs/gcsfuse) are also functioning correctly.

Tested:

  • pytest tests/test_smoke.py::test_storage_mounts tests/test_smoke.py::test_file_mounts

f'sky launch -y -c {name} examples/using_file_mounts.yaml',
f'sky logs {name} 1 --status', # Ensure the job succeeded.
],
test_commands,
f'sky down -y {name}',
timeout=20 * 60, # 20 mins
)
run_one_test(test)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to clear the folders/files creased in storage_setup_commands?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added cleanup commands in d7afd8d, but that causes transient test failures because a cleanup at the end of a test might delete the files in use by another test. I tried to use pytest_sessionfinish hooks but that does not work with the parallelization we use in pytest-xdist - each process runs just one test and terminates, so the cleanup code is called n times instead of just once (see issue).

To avoid adding more complexity to our testing code, I decided to exclude cleanup for now (since these are tiny files anyway). lmk if you think we should still implement something

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! It seems pretty difficult to implement cleanup. I have another idea if you have time; instead of creating a bunch of files and directories, can we somehow move these commands and use import tempfile to generate temporary files and directories on local in that case?

yaml_str = pathlib.Path(
'tests/test_yamls/test_storage_mounting.yaml').read_text()
storage_name = f'sky-test-{int(time.time())}'
yaml_str = yaml_str.replace('random_bucket_name', storage_name)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The design choice to replace random_bucket_name with another name stands out a bit. Running the yaml file standalone will def not work as random_bucket_name is most likely taken, so it probably better to make it a jinja file. Why can't we make test_storage_mounting.yaml a jinja file instead and fill in the placeholder variable?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done for test_storage_mounting! I didn't do jinja for test_spot_storage because it uses a YAML in the examples folder, and it might be confusing for users to see jinja templates in the examples.

tests/test_smoke.py Show resolved Hide resolved
# Mounting private buckets in MOUNT mode
/mount_private_mount:
name: random_bucket_name
source: ~/tmp-workdir
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, since the same source is specified twice to the same bucket, there is redundant uploading done? Should we allow for users to do this? (assert unique?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it would we uploaded twice... we should do some optimizations for this in storage.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sg, ig lets keep it separate from this PR then

@romilbhardwaj
Copy link
Collaborator Author

romilbhardwaj commented Nov 30, 2022

Ready for another look @michaelzhiluo!

Tested

  • pytest -n 16 tests/test_smoke.py::test_storage_mounts tests/test_smoke.py::test_file_moun ts tests/test_smoke.py::test_spot_storage

Copy link
Collaborator

@michaelzhiluo michaelzhiluo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good to me! I have one big refactor suggestion (create temp files and dirs instead for storage_setup_commands) and it can be implemented, that's be nice^

f'sky launch -y -c {name} examples/using_file_mounts.yaml',
f'sky logs {name} 1 --status', # Ensure the job succeeded.
],
test_commands,
f'sky down -y {name}',
timeout=20 * 60, # 20 mins
)
run_one_test(test)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! It seems pretty difficult to implement cleanup. I have another idea if you have time; instead of creating a bunch of files and directories, can we somehow move these commands and use import tempfile to generate temporary files and directories on local in that case?

# Mounting private buckets in MOUNT mode
/mount_private_mount:
name: random_bucket_name
source: ~/tmp-workdir
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sg, ig lets keep it separate from this PR then

@romilbhardwaj
Copy link
Collaborator Author

I have another idea if you have time; instead of creating a bunch of files and directories, can we somehow move these commands and use import tempfile to generate temporary files and directories on local in that case?

That's a great point @michaelzhiluo - I looked into it and unfortunately tempfile doesn't do symlinks which is required in the tests :(

If everything else looks good, should we merge this?

Copy link
Collaborator

@michaelzhiluo michaelzhiluo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, if symlinks aren't compatible with tempfile library, then I think its good to go^ Thanks for adding these tests!

@romilbhardwaj romilbhardwaj merged commit a0760df into master Dec 15, 2022
@romilbhardwaj romilbhardwaj deleted the storage_mountsmoketest branch December 15, 2022 16:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants