-
Notifications
You must be signed in to change notification settings - Fork 728
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Metaflow crashes on AWS Batch if folder called metaflow
is present in the working directory
#1672
Comments
Hey @shchur, I wasn't able to reproduce this...attaching a screenshot of a successful run with Could you perhaps:
|
Thank you for a quick response! I tried using
I suspect that the problem lies in the AWS configuration, but I'm not sure how to get to its root cause. I used the CloudFormation template without any modifications, and stack creation finished with Are there any additional log statements that I could add to the Metaflow code to get a more informative error message / understand which exact operation is failing? |
I found the source of the problem: my working directory included a folder called
It might be helpful to check for presence of this folder before submitting the job and raise an informative error message to the user. As for debugging jobs on AWS Batch, I was able to find the detailed log with error message in the My problem is solved, but I'm keeping this issue open in case you want to add an informative error message for the edge case. Thank you for building and maintaining such an amazing framework! |
metaflow
is present in the working directory
@shchur I wasn't able to reproduce this, can you please share your directory structure? Mine looks the following: and I submit the flow with |
@madhur-ob The problem was that I built my Docker image used by Metaflow in the same directory. In other words, if the |
I have created the stack using the MetaFlow CloudFormation template.
I can run flows locally, but flows fail if I add the
@batch
decorator.For example, this flow
Code
fails with the following error message
The S3 folder for the failing step (
HellowFlow/1704839212206922/hello/2
) contains the following files:0.attempt.json
0.runtime
0.runtime_stderr.log
0.runtime_stdout.log
I would really appreciate any pointers on how to debug this issue.
The text was updated successfully, but these errors were encountered: