Abstract: Salient object detection (SOD) has been in the spotlight recently, yet it remains less studied for high-resolution (HR) images. Unfortunately, HR images and their pixel-level annotations are ...
Vision-Language Pre-training (VLP) has recently attracted rapidly growing attention from both the computer vision and NLP communities, especially due to the emergence of multimodal foundation models ...