Abstract. Numeral recognition is one of the most important problems in pattern recognition. It has numerous applications, such as reading postal ZIP codes, passport numbers and employee codes, bank cheque processing, and video gaming. To the best of our knowledge, little work has been done for the Marathi language compared with other Indian and non-Indian languages. This paper presents a novel technique for the recognition of isolated Marathi numerals. It describes a Marathi database and an isolated numeral recognition system that uses Mel-Frequency Cepstral Coefficients (MFCC) for feature extraction and Dynamic Time Warping (DTW) for feature matching. Recognition accuracy is higher for pre-recorded samples than for real-time test samples, and higher for speaker-dependent samples than for speaker-independent samples. A second method, the Hidden Markov Model (HMM), which models the words statistically, is also presented. Experiments show that recognition accuracy is higher for HMM than for DTW, but the training procedure for DTW is much simpler and faster than for HMM. Recognizing a numeral also takes longer with HMM than with DTW, because HMM must work through many states and iterations and involves more mathematical modeling, so DTW is preferred for real-time applications.
Keywords: Hidden Markov Model (HMM), Mel-Frequency Cepstral Coefficient (MFCC), Dynamic Time Warping (DTW).
1. Introduction
Speech recognition systems are used in many areas of daily life. Owing to rapid advances in this field worldwide, many systems and devices now accept voice input [1]. Speech synthesis and speech recognition together form a speech interface. A speech synthesizer converts text into speech, so it can read out the textual content on the screen. A speech recognizer identifies spoken words and transforms them into text. Such software needs to be available for Indian languages.
Speech recognition is the ability to listen to spoken words, recognize the different sounds present in them, and identify them as words of a known language. On a computer, speech recognition involves several steps, each with its own issues. The steps needed for a computer to perform speech recognition are: voice recording, word boundary detection, feature extraction, and recognition using knowledge models.
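As an illustration of the word boundary detection step, the following minimal Python sketch marks the start and end of a word using short-time energy. The text does not specify a boundary detection method, so the energy-threshold approach, the function names, the 160-sample frame length, and the 0.1 threshold ratio are all assumptions made for this example.

```python
def frame_energies(samples, frame_len):
    """Short-time energy of consecutive, non-overlapping frames."""
    return [
        sum(s * s for s in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_word_boundaries(samples, frame_len=160, ratio=0.1):
    """Return (start, end) sample indices of the active region, or None.

    A frame counts as active when its energy is at least `ratio` times the
    largest frame energy (an assumed, illustrative threshold rule).
    """
    energies = frame_energies(samples, frame_len)
    if not energies or max(energies) == 0:
        return None  # empty or completely silent recording
    threshold = ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e >= threshold]
    return active[0] * frame_len, (active[-1] + 1) * frame_len


# Synthetic recording: silence, a burst standing in for a spoken word, silence.
burst = [0.5 if n % 2 == 0 else -0.5 for n in range(480)]
recording = [0.0] * 320 + burst + [0.0] * 320
print(detect_word_boundaries(recording))
```

Real recordings contain background noise, so a practical system would likely need a noise-adaptive threshold or an additional cue such as the zero-crossing rate.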
2. Problem Definition
The primary goal of this paper is to build an isolated-word speech recognition system for the Marathi language that uses Mel-Frequency Cepstral Coefficients (MFCC) for feature extraction and Dynamic Time Warping (DTW) for feature matching, that is, for comparing the test patterns.
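The feature matching step can be sketched as follows. This is a minimal Python illustration of the standard DTW recurrence with a Euclidean local distance, followed by nearest-template classification. It is not the paper's MATLAB implementation; the function names, the dictionary of templates, and the labels "ek" and "don" are hypothetical and chosen only for the example.

```python
import math


def euclidean(a, b):
    """Local distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b, dist=euclidean):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # advance in seq_a only
                                 cost[i][j - 1],      # advance in seq_b only
                                 cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


def recognize(test_features, templates):
    """Return the label of the stored template closest to the test pattern."""
    return min(templates,
               key=lambda label: dtw_distance(test_features, templates[label]))


# Hypothetical one-dimensional "features"; real input would be MFCC vectors.
templates = {"ek": [[0.0], [1.0], [2.0]], "don": [[5.0], [5.0], [5.0]]}
spoken = [[0.0], [1.0], [1.0], [2.0]]  # same shape as "ek", spoken more slowly
print(recognize(spoken, templates))
```

Because the warping path may repeat frames, the slower utterance still aligns with its template at zero cost, which is the property that makes DTW suitable for words spoken at different rates.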
3. Marathi Numeral Recognition using MFCC and DTW Features
MFCC, a cepstrum-based feature representation, and DTW, a technique for measuring the similarity between two patterns, are widely used together for this task. MATLAB is used to implement the MFCC and DTW stages.
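To make the MFCC stage concrete, the sketch below computes MFCCs for a single frame in plain Python, following the conventional pipeline: Hamming window, power spectrum, triangular mel filterbank, logarithm, and DCT-II. It is written in Python rather than MATLAB for illustration only. The 8000 Hz sample rate, 20 filters, 13 coefficients, the common 2595 log10(1 + f/700) mel formula, and all function names are assumptions, since the text does not state these details.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window the frame; return its one-sided power spectrum (naive DFT)."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    edges_hz = [mel_to_hz(top * i / (num_filters + 1))
                for i in range(num_filters + 2)]
    bins = [int((n_fft + 1) * hz / sample_rate) for hz in edges_hz]
    bank = []
    for f in range(1, num_filters + 1):
        left, center, right = bins[f - 1], bins[f], bins[f + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising slope
            row[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate=8000, num_filters=20, num_coeffs=13):
    """MFCC vector for one frame of samples."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Floor the filter energies so the logarithm is always defined.
    log_e = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
             for row in bank]
    # DCT-II decorrelates the log filterbank energies.
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_e))
            for c in range(num_coeffs)]


# A 256-sample frame of a 440 Hz tone, used only as a stand-in for speech.
frame = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(256)]
print(len(mfcc_frame(frame)))
```

Applying `mfcc_frame` to successive overlapping frames of an utterance yields the sequence of feature vectors that DTW then compares against stored reference patterns. The naive DFT keeps the example dependency-free; an FFT would normally be used in practice.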