CASRA+

Bibliographic Details
Main Author: Haraty, Ramzi A. (author)
Other Authors: El Ariss, Omar (author)
Format: article
Published: 2007
Online Access: http://hdl.handle.net/10725/2320
http://dx.doi.org/10.3844/ajassp.2007.23.32
http://ku7rj9xt8c.scholar.serialssolutions.com/?sid=google&auinit=RA&aulast=Haraty&atitle=CASRA%2B:+A+colloquial+arabic+speech+recognition+application&id=doi:10.3844/ajassp.2007.23.32&title=American+journal+of+applied+sciences&volume=4&issue=1&date=2007&spage=23&issn=1546-9239
Description
Summary: The research proposed here is for an Arabic speech recognition application that concentrates on the Lebanese dialect. The system starts by sampling the speech, that is, converting the sound from analog to digital, and then extracts features using Mel-Frequency Cepstral Coefficients (MFCC). The extracted features are then compared with the system's stored model, which in this case is a phoneme-based model. This reference model differs from direct word template matching, in which the speech features extracted from the input are compared directly with word templates. Each word template in the direct matching model is stored as a vector of feature parameters, so when the vocabulary of the ASR system becomes large, the memory required for the word templates becomes very large. In contrast, the model used here is phoneme-like template matching: word templates are stored as phoneme-like template parameters, so the memory required for the word templates does not grow as fast as in the direct matching model.
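
The memory argument in the summary can be illustrated with a rough back-of-the-envelope sketch. The Python below is not taken from the paper; every figure in it (13 coefficients per frame, 50 frames per word, 40 phoneme-like units, 10 frames per unit, 5 units per word) is a hypothetical assumption chosen only to show how the two storage schemes scale with vocabulary size.

```python
# Rough storage comparison: direct word templates vs. phoneme-like templates.
# All constants below are illustrative assumptions, not values from the paper.

COEFFS_PER_FRAME = 13      # assumed number of MFCC values kept per frame
FRAMES_PER_WORD = 50       # assumed average word length in frames
NUM_PHONEMES = 40          # assumed size of the phoneme-like inventory
FRAMES_PER_PHONEME = 10    # assumed average unit length in frames
PHONEMES_PER_WORD = 5      # assumed average number of units per word


def direct_template_values(vocab_size: int) -> int:
    """Stored values when every word keeps its own full feature template."""
    return vocab_size * FRAMES_PER_WORD * COEFFS_PER_FRAME


def phoneme_template_values(vocab_size: int) -> int:
    """Stored values when words are sequences of shared phoneme-like units.

    The feature templates are stored once for the whole inventory; each word
    only adds a short list of unit indices.
    """
    inventory = NUM_PHONEMES * FRAMES_PER_PHONEME * COEFFS_PER_FRAME
    lexicon = vocab_size * PHONEMES_PER_WORD
    return inventory + lexicon


if __name__ == "__main__":
    for vocab_size in (100, 1_000, 10_000):
        direct = direct_template_values(vocab_size)
        shared = phoneme_template_values(vocab_size)
        print(f"{vocab_size:>6} words: direct={direct:>9} phoneme-like={shared:>7}")
```

Under these assumed figures, each new word adds 650 stored values in the direct scheme but only 5 index entries in the phoneme-like scheme, on top of a fixed inventory cost. That matches the summary's claim that memory for word templates grows more slowly with a phoneme-based model.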