
Repeated results in the output generated by a model fine-tuned with LoRA. #939

Open
gulizhoutao opened this issue May 5, 2024 · 2 comments

Comments

@gulizhoutao

I used LoRA to fine-tune the Llama 3 Instruct model. When I run generation with the fine-tuned model, the output repeats itself and never terminates. Example:

tune run generate --config custom_generation_lora_config.yaml
INFO:torchtune.utils.logging:Running InferenceRecipe with resolved config:

checkpointer:
  _component_: torchtune.utils.FullModelMetaCheckpointer
  checkpoint_dir: ./Meta-Llama-3-8B-Instruct-ft
  checkpoint_files:
  - meta_model_2.pt
  model_type: LLAMA3
  output_dir: ./Meta-Llama-3-8B-Instruct-ft
device: cuda
dtype: bf16
max_new_tokens: 300
model:
  _component_: torchtune.models.llama3.llama3_8b
prompt: 'Please summarize the sentence by retaining all concrete objects and their
  concrete information in this sentence,while deleting all abstract information such
  as atmosphere, scene and summary.

  The image captures a vibrant scene from a bustling street in India. The street is
  teeming with life and activity, with various modes of transportation adding to the
  dynamic atmosphere. In the immediate foreground, a man dressed in a purple shirt
  is seen riding a black motorcycle. He is not alone; a passenger, clad in a black
  shirt, accompanies him on the journey. Their presence in the forefront of the image
  suggests they are moving at a brisk pace, navigating their way through the busy
  street. Just behind the motorcycle, a man in a white shirt is pedaling a blue bicycle
  rickshaw. The rickshaw, a common sight on Indian streets, adds a touch of local
  flavor to the scene. Despite being slightly obscured by the motorcycle, the rickshaw
  driver''s determined expression is indicative of his effort to keep up with the
  fast-paced traffic. Further back, the street is filled with an array of other vehicles
  and pedestrians, each contributing to the overall hustle and bustle. The gray buildings
  lining the street are adorned with various signs and advertisements, reflecting
  the commercial nature of the area. Overall, this image paints a vivid picture of
  daily life on an Indian street, characterized by its lively atmosphere, diverse
  modes of transportation, and vibrant urban landscape.'
quantizer: null
seed: 1234
temperature: 0.6
tokenizer:
  _component_: torchtune.models.llama3.llama3_tokenizer
  path: Meta-Llama-3-8B-Instruct/original/tokenizer.model
top_k: 30

DEBUG:torchtune.utils.logging:Setting manual seed to local seed 1234. Local seed is seed + rank = 1234 + 0
INFO:torchtune.utils.logging:Model is initialized with precision torch.bfloat16.
INFO:torchtune.utils.logging:Please summarize the sentence by retaining all concrete objects and their concrete information in this sentence,while deleting all abstract information such as atmosphere, scene and summary.

The image captures a vibrant scene from a bustling street in India. The street is teeming with life and activity, with various modes of transportation adding to the dynamic atmosphere. In the immediate foreground, a man dressed in a purple shirt is seen riding a black motorcycle. He is not alone; a passenger, clad in a black shirt, accompanies him on the journey. Their presence in the forefront of the image suggests they are moving at a brisk pace, navigating their way through the busy street. Just behind the motorcycle, a man in a white shirt is pedaling a blue bicycle rickshaw. The rickshaw, a common sight on Indian streets, adds a touch of local flavor to the scene. Despite being slightly obscured by the motorcycle, the rickshaw driver's determined expression is indicative of his effort to keep up with the fast-paced traffic. Further back, the street is filled with an array of other vehicles and pedestrians, each contributing to the overall hustle and bustle. The gray buildings lining the street are adorned with various signs and advertisements, reflecting the commercial nature of the area. Overall, this image paints a vivid picture of daily life on an Indian street, characterized by its lively atmosphere, diverse modes of transportation, and vibrant urban landscape.

Summary: A man in a purple shirt is riding a black motorcycle with a passenger in a black shirt. Behind them, a man in a white shirt is pedaling a blue bicycle rickshaw. The street is filled with other vehicles and pedestrians, and the buildings are adorned with signs and advertisements. This image captures a busy scene in India. Retained objects: man, purple shirt, black motorcycle, passenger, black shirt, white shirt, blue bicycle rickshaw, buildings, signs, advertisements. Abstract information deleted: atmosphere, scene, summary. Concrete information retained: concrete objects and their concrete information. Concrete information summary: A man in a purple shirt is riding a black motorcycle with a passenger in a black shirt. Behind them, a man in a white shirt is pedaling a blue bicycle rickshaw. The street is filled with other vehicles and pedestrians, and the buildings are adorned with signs and advertisements. This image captures a busy scene in India. Retained objects: man, purple shirt, black motorcycle, passenger, black shirt, white shirt, blue bicycle rickshaw, buildings, signs, advertisements. Abstract information deleted: atmosphere, scene, summary. Concrete information retained: concrete objects and their concrete information. Concrete information summary: A man in a purple shirt is riding a black motorcycle with a passenger in a black shirt. Behind them, a man in a white shirt is pedaling a blue bicycle rickshaw. The street is
INFO:torchtune.utils.logging:Time for inference: 10.56 sec total, 28.40 tokens/sec
INFO:torchtune.utils.logging:Bandwidth achieved: 519.09 GB/s
INFO:torchtune.utils.logging:Memory used: 18.52 GB

How can I solve this problem? Is something wrong with how the EOS token is handled?
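
A quick sanity check on the log above: multiplying the reported inference time by the reported throughput gives roughly the configured `max_new_tokens`, which suggests generation ran to the cap rather than stopping on its own. This is only a sketch; it assumes the tokens/sec figure counts newly generated tokens only, and the variable names are illustrative.

```python
# Values copied from the log and config above.
elapsed_sec = 10.56
tokens_per_sec = 28.40
max_new_tokens = 300

# Assumption: the reported rate counts only newly generated tokens.
approx_generated = elapsed_sec * tokens_per_sec

# If this is close to max_new_tokens, no stop token ended generation early.
hit_cap = abs(approx_generated - max_new_tokens) < 1.0
print(f"approx. generated tokens: {approx_generated:.1f}, hit cap: {hit_cap}")
```

If the estimate had come out well below the cap, the repetition would more likely be a display or post-processing issue than a missing stop token.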

@rohan-varma
Member

Thanks for filing the issue! I did a quick check of our generation recipe to see if there were any immediate issues around stopping when an EOS token is emitted. The logic changed a bit in #871, which added support for stopping on certain non-EOS tokens as well, but it looks like we still respect EOS tokens as expected. Based on this, my initial guess is that the fine-tuned model just isn't generating an EOS token for some reason.

cc @ebsmothers who might have more context on this.
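
To make the two cases in the comment above concrete, here is a minimal sketch of a decoding loop that stops on any token in a stop set or at the token cap. It is not torchtune's implementation: `generate_until_stop`, the toy "models", and the token IDs are all hypothetical and exist only for illustration.

```python
from typing import Callable, List, Sequence, Set


def generate_until_stop(
    next_token_fn: Callable[[Sequence[int]], int],
    prompt_ids: Sequence[int],
    stop_ids: Set[int],
    max_new_tokens: int,
) -> List[int]:
    """Decode one token at a time; end on a stop token or at the cap."""
    tokens = list(prompt_ids)
    for _ in range(max_new_tokens):
        tok = next_token_fn(tokens)
        tokens.append(tok)
        if tok in stop_ids:
            break
    # Return only the newly generated tokens.
    return tokens[len(prompt_ids):]


# Hypothetical ID for illustration; real values come from the tokenizer.
EOS_ID = 2


def looping_model(tokens: Sequence[int]) -> int:
    # Never emits EOS; cycles through token IDs 10, 11, 12.
    return 10 + len(tokens) % 3


def stopping_model(tokens: Sequence[int]) -> int:
    # Emits EOS once the sequence reaches five tokens.
    return EOS_ID if len(tokens) >= 5 else 10


out_loop = generate_until_stop(looping_model, [1, 1], {EOS_ID}, max_new_tokens=9)
out_stop = generate_until_stop(stopping_model, [1, 1], {EOS_ID}, max_new_tokens=9)
print(len(out_loop), EOS_ID in out_loop)  # runs to the cap, no EOS
print(len(out_stop), out_stop[-1] == EOS_ID)  # stops early on EOS
```

A loop like this can handle stop tokens correctly and still run to the cap if the model never emits one, which matches the behavior reported in this issue.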

@ebsmothers
Contributor

Hi @gulizhoutao, thanks for creating the issue. Can you share the command you used to fine-tune the model, along with your fine-tune and generate configs? Then I can try to reproduce the behavior you're seeing and figure out the cause.
