Research
♠: Both authors contributed equally.
LOTUS: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
Jing He♠,
Haodong Li♠,
Wei Yin,
Yixun Liang,
Leheng Li,
Kaiqiang Zhou,
Hongbo Zhang,
Bingbing Liu,
Ying-Cong Chen
arXiv 2024
arXiv / Project Page / GitHub / Demo (Depth) / Demo (Normal)
Lotus is a diffusion-based visual foundation model with a simple yet effective adaptation protocol that aims to fully leverage the powerful visual priors of pre-trained diffusion models for dense prediction.
With minimal training data, Lotus achieves state-of-the-art (SoTA) performance on two key geometry perception tasks: zero-shot monocular depth estimation and zero-shot surface normal estimation.