Abstract: This paper aims at providing an effective multi-modal images invariant feature extraction and matching algorithm for the application of multi-source data analysis. Focusing on the ...
As large language models (LLMs) evolve into multimodal systems that can handle text, images, voice and code, they’re also becoming powerful orchestrators of external tools and connectors. With this ...
Abstract: Multi-modal emotion recognition plays a crucial role in human-computer interaction. Nowadays, many studies have developed fusion algorithms for this purpose. However, two challenges are ...