Accepted Project [Details]
EX24402 | Curvature Regularization in Large Language Model Training for Generalization on Downstream Tasks |
---|---|
Principal Investigator | 長沼 大樹 (University of Montreal, Montreal Institute for Learning Algorithms) |
Abstract |
Sharpness-aware minimization (SAM) has recently attracted growing attention because it generalizes well to unseen data. SAM seeks flatter (local) minima by optimizing a minimax objective. An immediate challenge in applying SAM is tuning its two pivotal step sizes, which strongly influence its effectiveness. We introduce a novel, straightforward approach that adapts these step sizes to the smoothness of the objective function, reducing the need for manual tuning. This method, termed Smoothness-Adaptive SAM (SA-SAM), not only simplifies the optimization process but also reinforces SAM's inherent tendency to converge towards flatter minima, improving performance in specific models. |
Reports | Research Introduction Poster / Final Report |
Related Web Pages |
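
As a reading aid for the abstract above, the following is a minimal sketch of a standard SAM update, showing where the two step sizes enter. It is not the project's SA-SAM method: the abstract does not specify the smoothness-adaptive rule, so both step sizes are simply held fixed here. The toy quadratic objective and the names `loss`, `grad`, `sam_step`, `run_sam`, `rho` (ascent radius), and `eta` (descent step size) are all assumptions made for illustration.

```python
import math


def loss(w):
    # Toy quadratic objective (hypothetical), minimized at w = (0, 0).
    return 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)


def grad(w):
    # Analytic gradient of the toy objective above.
    return [w[0], 10.0 * w[1]]


def sam_step(w, rho, eta):
    """One SAM update: move to a nearby worst-case point, then descend."""
    g = grad(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return list(w)  # stationary point: nothing to do
    # Ascent step: perturbation of length rho along the normalized gradient,
    # a first-order approximation of the inner maximization.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: apply the gradient taken at the perturbed point
    # to the original weights, scaled by eta.
    g_adv = grad(w_adv)
    return [wi - eta * gi for wi, gi in zip(w, g_adv)]


def run_sam(w0, rho=0.05, eta=0.05, steps=200):
    # Fixed rho and eta are assumed values; SA-SAM would adapt them instead.
    w = list(w0)
    for _ in range(steps):
        w = sam_step(w, rho, eta)
    return w


if __name__ == "__main__":
    w_final = run_sam([1.0, 1.0])
    print("final weights:", w_final, "final loss:", loss(w_final))
```

With fixed step sizes the iterates settle into a small neighborhood of the minimum rather than converging exactly, and the size of that neighborhood depends on both `rho` and `eta`. That sensitivity is the tuning burden the abstract refers to.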