Publication Year: | 2025 |
---|---|
Subject Terms: | Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Artificial Intelligence, Computer Science - Com |
Description: | Multimodal large language models (MLLMs) have advanced perception across text, vision, and audio, yet they often struggle with structured cr |
Database: | arXiv |