
Caption generation from news articles and images: Transform and Tell

Transform and Tell: Entity-Aware News Image Captioning paper https://arxiv.org/abs/2004.08070 Alasdair Tran, Alexander Mathews, Lexing Xie github https://github.com/alasd…


Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks paper https://arxiv.org/abs/2004.06165 github https://github.com/microsoft/Oscar Dataset: COCO etc. project Summary: What is it? Language embeddings, image object-detection features…


Learning Transferable Visual Models From Natural Language Supervision paper https://cdn.openai.com/papers/Learning_Transferable_Visual_Models_From_Natural_Language.pdf github https://github.com/openai/CLIP Dataset: WebImageText (WIT) pr…