Docs

huggingface · github-actions · Sep 8, 2022
1 parent 0e241dd
commit 3aa43e7

Showing 2 changed files with 17 additions and 0 deletions.
diff --git a/docs/source/loading.mdx b/docs/source/loading.mdx
@@ -220,6 +220,22 @@ Load a list of Python dictionaries with [`~Dataset.from_list`]:
 >>> dataset = Dataset.from_list(my_list)
 ```
 
+### Python generator
+
+Create a dataset from a Python generator with [`~Dataset.from_generator`]:
+
+```py
+>>> from datasets import Dataset
+>>> def my_gen():
+...     yield {"a": 1}
+...     yield {"a": 2}
+...     yield {"a": 3}
+...
+>>> dataset = Dataset.from_generator(my_gen)
+```
+
+Because the generator yields examples one at a time, this approach supports loading data larger than available memory.
+
 ### Pandas DataFrame
 
 Load Pandas DataFrames with [`~Dataset.from_pandas`]:

diff --git a/docs/source/package_reference/main_classes.mdx b/docs/source/package_reference/main_classes.mdx
@@ -16,6 +16,7 @@ The base class [`Dataset`] implements a Dataset backed by an Apache Arrow table.
     - from_buffer
     - from_pandas
     - from_dict
+    - from_generator
     - data
     - cache_files
     - num_columns