Hello, I would like to inquire about an issue related to replicating the study "Dual-Modal Attention-Enhanced Text-Video Retrieval with Triplet Partial Margin Contrastive Learning". The paper mentions that both the text and video encoders utilize CLIP, yet in the code you provided, the base.yml configuration file specifies the text encoder as BERT and does not disclose what is used for the video encoder. Could you provide the configuration file for CLIP that you used?
Thank you very much for your interest in our work, and we apologize that the reproduction fails due to certain company regulations.
Let me clarify the following points:
Firstly, for a fair comparison with other methods (e.g., minimizing the impact of the code framework and runtime environment on the results), the results in our paper were produced with the ts2net code framework (https://github.com/yuqi657/ts2_net).
Secondly, due to company regulations, the code accompanying our paper had to be open-sourced on the current Antmmf framework.
For these reasons, the existing scripts cannot fully reproduce the results in the paper. To reproduce them, the code needs to be migrated to the ts2net framework.
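For readers attempting the migration, a swap of the text encoder from BERT to CLIP in a base.yml-style configuration might look like the sketch below. This is only an illustration: the key names (`text_encoder`, `video_encoder`, `type`, `pretrained_name`) are assumptions, not the actual schema of the Antmmf or ts2net configs, and must be checked against the real files.

```yaml
# Hypothetical sketch only -- key names are assumptions,
# not the repository's actual configuration schema.
model_config:
  text_encoder:
    type: clip              # replacing the BERT entry from base.yml
    params:
      pretrained_name: ViT-B/32   # assumed CLIP variant; verify against the paper
  video_encoder:
    type: clip_visual       # CLIP's visual branch applied to video frames
    params:
      pretrained_name: ViT-B/32
```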