Code and data
State-of-the-art (SOTA) Automatic Speech Recognition (ASR) systems primarily rely on acoustic information while disregarding additional multi-modal context. However, visual information is essential for disambiguation and adaptation. ...
| Main Author: | Supriti Sinhamahapatra (22271917) (author) |
|---|---|
| Other Authors: | Jan Niehues (22272010) (author) |
| Published: | 2025 |
Similar Items
- ssRSA for EMEG data.
  by: Cai Wingfield (554068)
  Published: (2017)
- Relating brain data dRDMs to phone model dRDMs and converting to feature fits.
  by: Cai Wingfield (554068)
  Published: (2017)
- Maps of fit for each feature.
  by: Cai Wingfield (554068)
  Published: (2017)
- Similarities between model RDMs and phonetic features.
  by: Cai Wingfield (554068)
  Published: (2017)
- Second-order similarity structure of phone models.
  by: Cai Wingfield (554068)
  Published: (2017)