Research
✱: Both authors contributed equally.
StereoDiff: Stereo-Diffusion Synergy for Video Depth Estimation
Haodong Li, Chen Wang, Jiahui Lei, Zhiyang Dou, Kostas Daniilidis, Jiatao Gu, Lingjie Liu
arXiv 2024
arXiv / Project Page / Github (Soon)
Video depth estimation is not merely an extension of image depth estimation. The consistency requirements for dynamic and static regions in videos are fundamentally different.
To tackle these challenges, StereoDiff synergizes stereo matching with video depth diffusion models, achieving superior video depth estimation performance.
LOTUS: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
Jing He✱, Haodong Li✱, Wei Yin, Yixun Liang, Kaiqiang Zhou, Hongbo Zhang, Bingbing Liu, Ying-Cong Chen
arXiv 2024
arXiv / Project Page / Github / Demo (Depth) / Demo (Normal)
Lotus is a diffusion-based visual foundation model with a simple yet effective adaptation protocol that aims to fully leverage the powerful visual priors of pre-trained diffusion models for dense prediction.
With minimal training data, Lotus achieves SoTA performance in two key geometry perception tasks, i.e., zero-shot monocular depth and normal estimation.