Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures

Accepted Projects [Details]

jh230040 Text Generation Using Large-Scale Diffusion Models
Principal Investigator: Li Zihui (Data Science Research Division, Information Technology Center, The University of Tokyo)
Abstract

This research project explores the integration of diffusion models into natural language processing (NLP), building on their success in computer vision. Specifically, it addresses the open research question of how to incorporate diffusion methods into existing autoregressive models in NLP. The study also takes into account the growing trend toward large pretrained models and the computational challenges they pose. The objective is to optimize the integration of diffusion models with pretrained models, improving inference speed to the benefit of both academia and industry. The research focuses on text generation tasks such as summarization and machine translation, where the main challenge is integrating diffusion methods into autoregressive sequence-to-sequence models. By building on pretrained models rather than training from scratch, and by optimizing their use through diffusion models, the project aims to improve efficiency, work within resource limitations, and advance the field of NLP.

Reports: Research introduction poster / Final report
Related web pages
Unauthorized reproduction prohibited