The codebase is benchmark code for audio-visual sound event localization and detection (SELD) in STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations ...