Run scripts/nlp_language_modeling/prepare_packed_ft_dataset.py without GPUs #8935

alex-ht · 2024-04-16T03:35:42Z

How can I run scripts/nlp_language_modeling/prepare_packed_ft_dataset.py without GPUs?
GPU nodes are very expensive, especially when the models are large.

Additionally, do you have any plans to improve this?
https://github.com/NVIDIA/NeMo/blob/main/scripts/nlp_language_modeling/prepare_packed_ft_dataset.py#L55-L56

Currently, we require a full nemo model file for simplicity and readability of code, but in theory only a tokenizer file is needed.
This part can be improved in a future iteration of the script.
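
To illustrate why a tokenizer alone could in principle be enough, here is a minimal CPU-only sketch of sequence packing. It is not the NeMo script's code: `toy_tokenize` is a hypothetical stand-in for a real tokenizer, and `first_fit_pack` is a generic first-fit packer written for this example. The point is only that packing depends on token counts, which need no model weights and no GPU.

```python
from typing import List


def toy_tokenize(text: str) -> List[int]:
    # Hypothetical stand-in for a real tokenizer: one "token" per
    # whitespace-separated word. Only the token count matters here.
    return [len(word) for word in text.split()]


def first_fit_pack(lengths: List[int], pack_size: int) -> List[List[int]]:
    """Place each sequence index into the first bin that still has room."""
    bins: List[List[int]] = []  # sequence indices assigned to each bin
    free: List[int] = []        # remaining token capacity of each bin
    for idx, n in enumerate(lengths):
        if n > pack_size:
            raise ValueError(f"sequence {idx} has {n} tokens, more than pack_size={pack_size}")
        for b, room in enumerate(free):
            if n <= room:
                bins[b].append(idx)
                free[b] -= n
                break
        else:
            # No existing bin fits this sequence, so open a new one.
            bins.append([idx])
            free.append(pack_size - n)
    return bins


examples = ["a b c", "d e", "f g h i", "j"]
lengths = [len(toy_tokenize(t)) for t in examples]
packed = first_fit_pack(lengths, pack_size=5)
print(lengths)  # token count per example
print(packed)   # example indices grouped into packed sequences
```

Everything above runs on a CPU with the standard library only. Swapping `toy_tokenize` for a real tokenizer loaded from a tokenizer file would be the assumed next step, but how the actual script would need to change is not covered here.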

Replies: 0 comments