import os

# save some files
os.makedirs("my_data", exist_ok=True)
with open("my_data/file.txt", "w") as file:
    file.write("Test")

import numpy as np
from litdata import optimize

def process(filename):
    with open(filename, "r"):
        pass
    # do some processing
    return np.array([1, 2, 3])

if __name__ == "__main__":
    optimize(
        fn=process,
        inputs=["my_data/file.txt"],
        output_dir="my_optimized_dataset",
        chunk_bytes="64MB"
    )
raises the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/litdata/processing/data_processor.py", line 626, in _handle_data_chunk_recipe
    item_data_or_generator = self.data_recipe.prepare_item(current_item)
  File "/usr/local/lib/python3.10/dist-packages/litdata/processing/functions.py", line 148, in _prepare_item
    return self._fn(item_metadata)
  File "<ipython-input-3-f77cf781dbef>", line 6, in process
    with open(filename, "r"):
IsADirectoryError: [Errno 21] Is a directory: '/tmp/data'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/litdata/processing/data_processor.py", line 423, in run
    self._loop()
  File "/usr/local/lib/python3.10/dist-packages/litdata/processing/data_processor.py", line 472, in _loop
    self._handle_data_chunk_recipe(index)
  File "/usr/local/lib/python3.10/dist-packages/litdata/processing/data_processor.py", line 638, in _handle_data_chunk_recipe
    raise RuntimeError(f"Failed processing {self.items[index]}") from e
RuntimeError: Failed processing /tmp/data
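The traceback suggests that `process` was handed '/tmp/data' rather than the 'my_data/file.txt' entry from `inputs`. To confirm this while debugging, a small guard can be called at the top of `process`. This is only a diagnostic sketch: `ensure_is_file` is a hypothetical helper, not part of litdata, and it does not fix the underlying path resolution.

```python
import os


def ensure_is_file(path):
    """Hypothetical debugging helper (not a litdata API).

    Returns the path unchanged if it points to a regular file; otherwise
    raises an error that names the value actually received and the
    current working directory.
    """
    if os.path.isdir(path):
        raise ValueError(
            f"Expected a file path but received a directory: {path!r} "
            f"(cwd={os.getcwd()!r})"
        )
    if not os.path.isfile(path):
        raise ValueError(f"Path does not exist or is not a regular file: {path!r}")
    return path
```

Calling `ensure_is_file(filename)` as the first statement of `process` would replace the bare `IsADirectoryError` with a message showing exactly which value reached the function.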
Expected behavior
This works locally and in Studios, so we would also expect it to work in Google Colab.
Environment
PyTorch Version (e.g., 1.0): N/A
OS (e.g., Linux): Linux
How you installed PyTorch (conda, pip, source): N/A
Build command you used (if compiling from source): N/A
🐛 Bug
In Google Colab, the cache dir resolution leads to a directory error when using the optimize function.
To Reproduce
Minimal repro in Colab: run the code sample at the top of this report, which raises the traceback shown after it.
Additional context
The issue was raised originally in LitGPT:
Lightning-AI/litgpt#1402