COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval

IEEE Conference Publication | IEEE Xplore
