Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures

Adopted Project [Details]

jh230040 Text Generation Using Large-Scale Diffusion Models
Principal Investigator: Li Zihui (Data Science Research Division, Information Technology Center, The University of Tokyo)
Abstract

This research project investigates the integration of diffusion models into natural language processing (NLP), building on their success in computer vision. We explore incorporating diffusion methods into existing auto-regressive models and compare text generation with Large Language Models (LLMs). Our findings show that diffusion models are not superior to Transformer-based models. We assess the proficiency of LLMs in generating survey articles for NLP, focusing on 99 topics. Automated benchmarks indicate that GPT-4 outperforms GPT-3.5, PaLM2, and LLaMa2 by 2% to 20%. While GPT-created surveys are more contemporary and accessible, GPT-4 occasionally misses details or includes factual errors. We also found systematic bias in GPT-based evaluations compared to human evaluations.

Reports: Research Introduction Poster, Final Report