For finetuning, add an alternative to LlamaFactory #134
base: main
Conversation
https://github.com/zhangfaen/finetune-Qwen2-VL has 50+ stars and many reposts on Twitter.
Thank you deeksha! I hope this PR will be merged into the main branch soon.
Do you have a plan to add continued-pretraining support?
I think finetuning itself is a kind of continued pretraining.
Yes, you are right. I want to add some new tasks in stage 2 (multi-task pretraining) and then do SFT in stage 3. Could your PR make that easier for our project? The biggest obstacle is that we cannot obtain the stage-1 model output from the Qwen2-VL team, which has only released the stage-3 model (Qwen2-VL-7B-Instruct).
According to the Qwen2-VL tech report, it seems stage 2 just optimizes all parameters. For your purpose, I think it should be fine to use https://github.com/zhangfaen/finetune-Qwen2-VL and prepare your own data. If you want to do stage 3, you can just add a few lines to my training script, for example as sketched below.
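A minimal sketch of those few lines, assuming the Hugging Face transformers `Qwen2VLForConditionalGeneration` class (in its modeling code the vision-tower parameters sit under the `visual` submodule); stage 3 in the tech report freezes the ViT and finetunes only the LLM:

```python
# Minimal sketch (not the author's actual snippet): freeze the vision encoder
# so that only the LLM is updated, mirroring stage 3 of the tech report.
# Assumes the Hugging Face transformers Qwen2-VL implementation, where the
# ViT parameters are named under the `visual` submodule.
from transformers import Qwen2VLForConditionalGeneration

model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct", torch_dtype="auto"
)

# Freeze every vision-tower parameter; all other parameters stay trainable.
for name, param in model.named_parameters():
    if "visual" in name:
        param.requires_grad = False

# Sanity check: report how many parameters will actually be trained.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable:,} / {total:,}")
```

With `requires_grad` set to `False`, those parameters receive no gradients, so a standard optimizer built over `model.parameters()` leaves them unchanged.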
Thanks, good idea.
Why did the Qwen-VL release include both Qwen-VL and Qwen-VL-Chat, while Qwen2-VL only releases Qwen2-VL-Instruct and does not release a base Qwen2-VL, which would presumably be the pretrained model?
I agree with @CarlHuangNuc. Can you guys please provide your pretraining code? @deekshaaneja