You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I use Lora to fine-tune the llama3-instruction model, and after I use the fine-tuned model for generation, the results duplicate and don't end. Example as follows:
tune run generate --config custom_generation_lora_config.yaml
INFO:torchtune.utils.logging:Running InferenceRecipe with resolved config:
meta_model_2.pt
model_type: LLAMA3
output_dir: ./Meta-Llama-3-8B-Instruct-ft
device: cuda
dtype: bf16
max_new_tokens: 300
model: component: torchtune.models.llama3.llama3_8b
prompt: 'Please summarize the sentence by retaining all concrete objects and their
concrete information in this sentence,while deleting all abstract information such
as atmosphere, scene and summary.
The image captures a vibrant scene from a bustling street in India. The street is
teeming with life and activity, with various modes of transportation adding to the
dynamic atmosphere. In the immediate foreground, a man dressed in a purple shirt
is seen riding a black motorcycle. He is not alone; a passenger, clad in a black
shirt, accompanies him on the journey. Their presence in the forefront of the image
suggests they are moving at a brisk pace, navigating their way through the busy
street. Just behind the motorcycle, a man in a white shirt is pedaling a blue bicycle
rickshaw. The rickshaw, a common sight on Indian streets, adds a touch of local
flavor to the scene. Despite being slightly obscured by the motorcycle, the rickshaw
driver''s determined expression is indicative of his effort to keep up with the
fast-paced traffic. Further back, the street is filled with an array of other vehicles
and pedestrians, each contributing to the overall hustle and bustle. The gray buildings
lining the street are adorned with various signs and advertisements, reflecting
the commercial nature of the area. Overall, this image paints a vivid picture of
daily life on an Indian street, characterized by its lively atmosphere, diverse
modes of transportation, and vibrant urban landscape.'
quantizer: null
seed: 1234
temperature: 0.6
tokenizer: component: torchtune.models.llama3.llama3_tokenizer
path: Meta-Llama-3-8B-Instruct/original/tokenizer.model
top_k: 30
DEBUG:torchtune.utils.logging:Setting manual seed to local seed 1234. Local seed is seed + rank = 1234 + 0
INFO:torchtune.utils.logging:Model is initialized with precision torch.bfloat16.
INFO:torchtune.utils.logging:Please summarize the sentence by retaining all concrete objects and their concrete information in this sentence,while deleting all abstract information such as atmosphere, scene and summary.
The image captures a vibrant scene from a bustling street in India. The street is teeming with life and activity, with various modes of transportation adding to the dynamic atmosphere. In the immediate foreground, a man dressed in a purple shirt is seen riding a black motorcycle. He is not alone; a passenger, clad in a black shirt, accompanies him on the journey. Their presence in the forefront of the image suggests they are moving at a brisk pace, navigating their way through the busy street. Just behind the motorcycle, a man in a white shirt is pedaling a blue bicycle rickshaw. The rickshaw, a common sight on Indian streets, adds a touch of local flavor to the scene. Despite being slightly obscured by the motorcycle, the rickshaw driver's determined expression is indicative of his effort to keep up with the fast-paced traffic. Further back, the street is filled with an array of other vehicles and pedestrians, each contributing to the overall hustle and bustle. The gray buildings lining the street are adorned with various signs and advertisements, reflecting the commercial nature of the area. Overall, this image paints a vivid picture of daily life on an Indian street, characterized by its lively atmosphere, diverse modes of transportation, and vibrant urban landscape.
Summary: A man in a purple shirt is riding a black motorcycle with a passenger in a black shirt. Behind them, a man in a white shirt is pedaling a blue bicycle rickshaw. The street is filled with other vehicles and pedestrians, and the buildings are adorned with signs and advertisements. This image captures a busy scene in India. Retained objects: man, purple shirt, black motorcycle, passenger, black shirt, white shirt, blue bicycle rickshaw, buildings, signs, advertisements. Abstract information deleted: atmosphere, scene, summary. Concrete information retained: concrete objects and their concrete information. Concrete information summary: A man in a purple shirt is riding a black motorcycle with a passenger in a black shirt. Behind them, a man in a white shirt is pedaling a blue bicycle rickshaw. The street is filled with other vehicles and pedestrians, and the buildings are adorned with signs and advertisements. This image captures a busy scene in India. Retained objects: man, purple shirt, black motorcycle, passenger, black shirt, white shirt, blue bicycle rickshaw, buildings, signs, advertisements. Abstract information deleted: atmosphere, scene, summary. Concrete information retained: concrete objects and their concrete information. Concrete information summary: A man in a purple shirt is riding a black motorcycle with a passenger in a black shirt. Behind them, a man in a white shirt is pedaling a blue bicycle rickshaw. The street is
INFO:torchtune.utils.logging:Time for inference: 10.56 sec total, 28.40 tokens/sec
INFO:torchtune.utils.logging:Bandwidth achieved: 519.09 GB/s
INFO:torchtune.utils.logging:Memory used: 18.52 GB
How can I solve this problem? Is there something wrong with eos?
The text was updated successfully, but these errors were encountered:
Thanks for filing the issue! I did a quick check of our generation recipe to see if there were any immediate potential issues around stopping when an EOS is issued. Things were updated a bit after #871 that started to support stopping after some non EOS tokens are issued, but it looks like we do respect EOS tokens as expected. Based on this, another initial thought is that the finetuned model just isn't generating an EOS token for some reason.
cc @ebsmothers who might have more context on this.
Hi @gulizhoutao thanks for creating the issue. Can you share the command you used to fine-tune the model along with a paste of your fine-tune and generate configs? Then I can try to reproduce the behavior you're seeing to figure out the cause.
I use Lora to fine-tune the llama3-instruction model, and after I use the fine-tuned model for generation, the results duplicate and don't end. Example as follows:
tune run generate --config custom_generation_lora_config.yaml
INFO:torchtune.utils.logging:Running InferenceRecipe with resolved config:
checkpointer:
component: torchtune.utils.FullModelMetaCheckpointer
checkpoint_dir: ./Meta-Llama-3-8B-Instruct-ft
checkpoint_files:
model_type: LLAMA3
output_dir: ./Meta-Llama-3-8B-Instruct-ft
device: cuda
dtype: bf16
max_new_tokens: 300
model:
component: torchtune.models.llama3.llama3_8b
prompt: 'Please summarize the sentence by retaining all concrete objects and their
concrete information in this sentence,while deleting all abstract information such
as atmosphere, scene and summary.
The image captures a vibrant scene from a bustling street in India. The street is
teeming with life and activity, with various modes of transportation adding to the
dynamic atmosphere. In the immediate foreground, a man dressed in a purple shirt
is seen riding a black motorcycle. He is not alone; a passenger, clad in a black
shirt, accompanies him on the journey. Their presence in the forefront of the image
suggests they are moving at a brisk pace, navigating their way through the busy
street. Just behind the motorcycle, a man in a white shirt is pedaling a blue bicycle
rickshaw. The rickshaw, a common sight on Indian streets, adds a touch of local
flavor to the scene. Despite being slightly obscured by the motorcycle, the rickshaw
driver''s determined expression is indicative of his effort to keep up with the
fast-paced traffic. Further back, the street is filled with an array of other vehicles
and pedestrians, each contributing to the overall hustle and bustle. The gray buildings
lining the street are adorned with various signs and advertisements, reflecting
the commercial nature of the area. Overall, this image paints a vivid picture of
daily life on an Indian street, characterized by its lively atmosphere, diverse
modes of transportation, and vibrant urban landscape.'
quantizer: null
seed: 1234
temperature: 0.6
tokenizer:
component: torchtune.models.llama3.llama3_tokenizer
path: Meta-Llama-3-8B-Instruct/original/tokenizer.model
top_k: 30
DEBUG:torchtune.utils.logging:Setting manual seed to local seed 1234. Local seed is seed + rank = 1234 + 0
INFO:torchtune.utils.logging:Model is initialized with precision torch.bfloat16.
INFO:torchtune.utils.logging:Please summarize the sentence by retaining all concrete objects and their concrete information in this sentence,while deleting all abstract information such as atmosphere, scene and summary.
The image captures a vibrant scene from a bustling street in India. The street is teeming with life and activity, with various modes of transportation adding to the dynamic atmosphere. In the immediate foreground, a man dressed in a purple shirt is seen riding a black motorcycle. He is not alone; a passenger, clad in a black shirt, accompanies him on the journey. Their presence in the forefront of the image suggests they are moving at a brisk pace, navigating their way through the busy street. Just behind the motorcycle, a man in a white shirt is pedaling a blue bicycle rickshaw. The rickshaw, a common sight on Indian streets, adds a touch of local flavor to the scene. Despite being slightly obscured by the motorcycle, the rickshaw driver's determined expression is indicative of his effort to keep up with the fast-paced traffic. Further back, the street is filled with an array of other vehicles and pedestrians, each contributing to the overall hustle and bustle. The gray buildings lining the street are adorned with various signs and advertisements, reflecting the commercial nature of the area. Overall, this image paints a vivid picture of daily life on an Indian street, characterized by its lively atmosphere, diverse modes of transportation, and vibrant urban landscape.
Summary: A man in a purple shirt is riding a black motorcycle with a passenger in a black shirt. Behind them, a man in a white shirt is pedaling a blue bicycle rickshaw. The street is filled with other vehicles and pedestrians, and the buildings are adorned with signs and advertisements. This image captures a busy scene in India. Retained objects: man, purple shirt, black motorcycle, passenger, black shirt, white shirt, blue bicycle rickshaw, buildings, signs, advertisements. Abstract information deleted: atmosphere, scene, summary. Concrete information retained: concrete objects and their concrete information. Concrete information summary: A man in a purple shirt is riding a black motorcycle with a passenger in a black shirt. Behind them, a man in a white shirt is pedaling a blue bicycle rickshaw. The street is filled with other vehicles and pedestrians, and the buildings are adorned with signs and advertisements. This image captures a busy scene in India. Retained objects: man, purple shirt, black motorcycle, passenger, black shirt, white shirt, blue bicycle rickshaw, buildings, signs, advertisements. Abstract information deleted: atmosphere, scene, summary. Concrete information retained: concrete objects and their concrete information. Concrete information summary: A man in a purple shirt is riding a black motorcycle with a passenger in a black shirt. Behind them, a man in a white shirt is pedaling a blue bicycle rickshaw. The street is
INFO:torchtune.utils.logging:Time for inference: 10.56 sec total, 28.40 tokens/sec
INFO:torchtune.utils.logging:Bandwidth achieved: 519.09 GB/s
INFO:torchtune.utils.logging:Memory used: 18.52 GB
How can I solve this problem? Is there something wrong with eos?
The text was updated successfully, but these errors were encountered: