
Rank Multimodal (RankMM)
RankMM combines a text query, page context, and images to aid image and video retrieval. RankMM models are Visual Language (VL) models that take page context into account to improve retrieval performance in a web-scale search engine.