The rapid advancements in generative Artificial Intelligence (AI) have ignited a revolutionary wave across information retrieval and recommender systems. The MMGenSR workshop serves as a premier interdisciplinary platform to explore how generative models, particularly Large Language Models (LLMs) and Large Multimodal Models (LMMs), are transforming both multimodal search and recommendation paradigms.
We aim to bring together researchers and practitioners to discuss innovative architectures, methodologies, and evaluation strategies for generative document retrieval, generative image retrieval, grounded answer generation, generative recommendation, and other multimodal tasks.
The workshop will foster discussions on improving algorithms, generating personalized content, evolving user-system interactions, enhancing trustworthiness, and refining evaluation methodologies for these cutting-edge systems. This timely workshop seeks to identify promising future research directions, address key challenges, and catalyze collaborations towards the development of next-generation intelligent systems.
| Activity type | Time | Title |
|---|---|---|
| Welcome & Opening remarks | 09:00am-09:10am | |
| Keynote 1 | 09:10am-09:40am | Smarter Retrieval for Smarter Generation: When and How to Retrieve for Retrieval-Augmented Generation |
| Oral presentations (4 papers, 15 minutes each) | 09:40am-10:40am | 09:40-09:55: Unifying Inductive, Cross-Domain, and Multimodal Learning for Robust and Generalizable Recommendation<br>09:55-10:10: Rank-Aware Indigo-DPO: Scalable Preference Optimization for Industrial Talent Search Ranking<br>10:10-10:25: Turning Adversaries into Allies: Reversing Typographic Attacks for Multimodal Product Retrieval<br>10:25-10:40: Let Multimodal Embedders Learn When to Augment Query via Adaptive Query Augmentation |
| Coffee break | 10:40am-11:00am | |
| Keynote 2 | 11:00am-11:30am | Large-scale Generative and Multimodal Recommendation Systems: An Overview |
| Lightning spotlight presentations (6 papers, 8 minutes each) | 11:30am-12:20pm | 11:30-11:38: V-Agent: An Interactive Video Search System Using Vision-Language Models<br>11:38-11:46: EcomCLIP: Leveraging multimodal models for generating semantic embeddings<br>11:46-11:54: XX-Qwen-OmniEmbed: A Unified Multimodal Retrieval Model for Text, Image, Audio, and Video<br>11:54-12:02: TG-S&P: Time-Series Data Generation for Improved Fashion Demand Forecasting<br>12:02-12:10: Enhancing Medical Cross-Modal Hashing Retrieval using Dropout-Voting Mixture-of-Experts Fusion<br>12:10-12:18: SARCH: Multimodal Search for Archaeological Archives |
Keynote 2 Abstract
This keynote surveys the rapidly evolving landscape of generative and multimodal recommendation systems in large-scale industrial applications. I will first highlight recent advances that incorporate action tokens into generative models, exemplified by HSTU, PinRec, and GenRank. I will then discuss approaches that integrate LLM/VLM knowledge into recommenders, including representation alignment, distance transfer, and semantic ID learning. Across both directions, I will outline the key challenges in real-world deployment (data sparsity, feature heterogeneity, and latency constraints) and summarize emerging solutions and open problems. The goal of this talk is to shed light on the future evolution of generative and multimodal recommendation systems and their role in shaping the next generation of intelligent, foundation-level recommender architectures.
Bio
Dr. Junwei Pan is a Research Scientist on the Tencent Ads Science team. Before joining Tencent, he was a Principal Research Engineer at Yahoo Labs and Yahoo Research, where he worked on news personalization, search relevance, and demand-side platforms. His research interests lie in computational advertising, recommender systems, and LLMs for recommendation. He has published over 30 papers at top-tier conferences such as SIGKDD, WWW, ICML, ICLR, NeurIPS, SIGIR, AAAI, and CIKM.
Keynote 1 Abstract
Retrieval-Augmented Generation (RAG) has emerged as a core paradigm for integrating external knowledge into large language models (LLMs), helping to mitigate hallucinations and compensate for outdated or missing information. However, retrieval introduces additional computational overhead due to longer input contexts, and it does not always improve generation quality, particularly when the retrieved content is irrelevant or of low quality. In this talk, I will discuss when retrieval should be triggered, namely only when the LLM lacks sufficient internal knowledge, and how retrieval can be optimized to better complement generation. I will present recent advances on enhancing LLMs' perception of their own knowledge boundaries, leveraging LLM-based retrievers to improve retrieval quality, and adopting utility-aware retrieval strategies that prioritize the information most beneficial for downstream generation. Together, these directions aim to reduce unnecessary retrieval overhead and provide more effective supporting evidence for reliable and efficient knowledge-augmented generation.
Bio
Keping Bi is an Associate Professor at the Institute of Computing Technology, Chinese Academy of Sciences. She received her Ph.D. from the Center for Intelligent Information Retrieval at the University of Massachusetts Amherst. Her research focuses on retrieval-augmented AI, universal text and multimodal representation learning, and large language model alignment guided by the Honest, Helpful, and Harmless (3H) principle. She currently serves as General Co-Chair of SIGIR-AP 2025 and as an Editor of SIGIR Forum, and has previously served as Registration Chair of SIGIR-AP 2023, Tutorial Chair of NLPCC 2025, and Program Committee or Senior Program Committee member for numerous leading IR and NLP conferences. In addition to her academic experience, she also held full-time industry positions at Baidu (China) and Microsoft (U.S.).
The main objective of this workshop is to encourage pioneering research at the intersection of generative models with multimodal search and recommendation. The overarching theme is leveraging generative AI to enhance and transform information access and personalized content delivery in multimodal scenarios. This workshop aims to attract a diverse audience, including academic researchers and industry experts working on or interested in generative models, multimodal information retrieval, and recommender systems. It offers a unique forum for these stakeholders to share innovative ideas, methods, and accomplishments, encouraging interdisciplinary collaboration and the exploration of novel applications. Specifically, we invite contributions addressing three key areas: (1) Generative Retrieval and Recommendation utilizing Large Multimodal Models (LMMs) and Multimodal Large Language Models (MLLMs), (2) Advanced Content Generation methodologies within Generative Search and Recommendation systems, and (3) Domain-specific Applications, Benchmarks, and Deployment strategies.
Topics of interest include, but are not limited to:
Submission Guidelines: Authors are invited to submit original, full-length research papers that have not been previously published, accepted for publication, or submitted for consideration at any other venue. Manuscripts should be submitted to the CIKM 2025 EasyChair site in PDF format, using the two-column sigconf format of the ACM proceedings template. Submissions may range from 4 to 9 pages, plus unlimited pages for references; authors may choose the appropriate length, as no distinction is made between long and short papers. The review process will be double-blind, and submissions that are not properly anonymized will be desk-rejected without review. Each submission will be assigned to at least three program committee members without conflicts of interest. After collecting reviewers' comments, the organizers will hold a meeting to discuss the reviews and make final decisions. At least one author of each accepted paper must register and present the work on-site in Seoul, Korea, as scheduled in the official CIKM 2025 conference program.
In addition, we welcome submissions of outstanding papers recently accepted or published at top-tier conferences or journals, to foster broader discussion and engagement with the latest research advances; please contact yi.bin@hotmail.com.
Submission site: https://easychair.org/conferences?conf=mmgensr2025