Research
✱: Both authors contributed equally.
LOTUS: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
Jing He✱,
Haodong Li✱,
Wei Yin,
Yixun Liang,
Leheng Li,
Kaiqiang Zhou,
Hongbo Zhang,
Bingbing Liu,
Ying-Cong Chen
ICLR 2025
arXiv / Project Page / GitHub / Demo (Depth) / Demo (Normal) / ComfyUI
Lotus is a diffusion-based visual foundation model with a simple yet effective adaptation protocol,
designed to fully leverage the powerful visual priors of pre-trained diffusion models for dense prediction.
With minimal training data, Lotus achieves state-of-the-art performance on two key geometry perception tasks: zero-shot monocular depth estimation and surface normal estimation.