How to Convert Different Language Audio to Text using Python

13 Jul 2020

Speech recognition is an important feature in many applications and fields, such as home automation and artificial intelligence. This article introduces converting an audio file to text using Python's SpeechRecognition library.

How does speech recognition work?

First, the physical sound is converted into an electrical signal, typically by a microphone. An analog-to-digital converter then turns the electrical signal into digital data. Finally, a speech recognition model transcribes the digitized audio into text.
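To make the digitization step concrete, here is a minimal sketch that uses only Python's standard library. It is not part of the SpeechRecognition workflow itself. The tone frequency, sample rate, duration, and file name are illustrative assumptions. The code samples a sine wave (standing in for the electrical signal), quantizes each sample to a 16-bit integer, and stores the result as a WAV file.

```python
import math
import struct
import wave

SAMPLE_RATE = 16000   # samples per second (assumed value)
FREQUENCY = 440.0     # tone frequency in Hz (assumed value)
DURATION = 0.5        # seconds of audio to generate (assumed value)


def digitize_tone(path="tone.wav"):
    """Sample a sine wave and store it as 16-bit mono PCM."""
    n_samples = int(SAMPLE_RATE * DURATION)
    frames = bytearray()
    for n in range(n_samples):
        # Value of the continuous signal, in [-1.0, 1.0], at time n / SAMPLE_RATE
        value = math.sin(2 * math.pi * FREQUENCY * n / SAMPLE_RATE)
        # Quantize to a signed 16-bit little-endian integer
        frames += struct.pack("<h", int(value * 32767))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(bytes(frames))
    return path


def describe_wav(path):
    """Return (channels, bytes per sample, sample rate, frame count)."""
    with wave.open(path, "rb") as wav_file:
        return (wav_file.getnchannels(), wav_file.getsampwidth(),
                wav_file.getframerate(), wav_file.getnframes())
```

Calling describe_wav(digitize_tone()) reports the parameters of the digital data: one channel, two bytes per sample, and the chosen sample rate.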

Installing the Python Speech Recognition Module

sudo pip3 install SpeechRecognition

This is the simplest way to install the SpeechRecognition module.
The audio file formats supported by the library are WAV, AIFF, AIFF-C, and FLAC. I have used a WAV file in this example.
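A quick way to catch unsupported files early is to check the file extension before opening the file. The sketch below is only a pre-filter and does not guarantee the file can be decoded. The helper name is_supported_audio and the extension list are assumptions made for illustration, based on the extensions commonly used for these formats.

```python
from pathlib import Path

# Extensions commonly used for WAV, AIFF, AIFF-C, and FLAC (an assumed mapping).
SUPPORTED_EXTENSIONS = {".wav", ".aiff", ".aif", ".aifc", ".flac"}


def is_supported_audio(path):
    """Return True if the file extension suggests a supported format."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, is_supported_audio("sample.wav") is true, while an MP3 file would be rejected.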

Steps to convert audio file to text

Step 1: import speech_recognition as sr # Import the library.

Step 2: r = sr.Recognizer() # Initialize the recognizer class, which is used to recognize the speech. We are using Google speech recognition.

Step 3: r.recognize_google(audio_text) # Convert the recorded audio into text.

Step 4: Convert audio in a specific language to text by passing a language code.

Code Snippet
Audio file to text conversion

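import speech_recognition as sr

r = sr.Recognizer()

def startConversion(path='sample.wav', lang='en-IN'):
    with sr.AudioFile(path) as source:
        print('Fetching File')
        # record() reads the entire audio file into memory
        audio_text = r.record(source)
        try:
            # using Google speech recognition; without a language argument,
            # recognize_google() defaults to en-US (lang is not used yet)
            print('Converting audio into text ...')
            text = r.recognize_google(audio_text)
            print(text)
        except sr.UnknownValueError:
            # raised when the speech cannot be understood
            print('Sorry, could not understand the audio.')
        except sr.RequestError:
            # raised when the API is unreachable
            print('Sorry.. run again...')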

How about converting different audio languages?

For example, to transcribe a Hindi audio file, pass a language option to recognize_google. The rest of the code stays the same.

recognize_google(audio_text, language = lang)

Refer to this link for the language codes:
https://cloud.google.com/speech-to-text/docs/languages

Example:

  recognize_google(audio_text, language = "hi-IN")
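If the language is chosen by name rather than by code, a small lookup table keeps the codes in one place. This is a minimal sketch: the helper language_code and the three entries are illustrative assumptions, and the table would need extending from the Google list linked above.

```python
# Illustrative subset of language codes; extend it from the Google list.
LANGUAGE_CODES = {
    "english": "en-IN",
    "hindi": "hi-IN",
    "kannada": "kn-IN",
}


def language_code(name):
    """Look up the language code for a language name, case-insensitively."""
    try:
        return LANGUAGE_CODES[name.strip().lower()]
    except KeyError:
        raise ValueError(f"No language code configured for {name!r}") from None
```

The result can then be passed straight through, for example language = language_code("Hindi").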

Full Code:
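The following is a minimal sketch that assembles the pieces described in this article. It is not necessarily the author's original listing. The file names english.wav, hindi.wav, and kannada.wav and the helper names transcribe and main are hypothetical. The sketch uses record() to read each file, passes the language code to recognize_google, and handles the two exceptions the library raises for unintelligible speech and for an unreachable API.

```python
from pathlib import Path

# Hypothetical sample files paired with their language codes.
SAMPLES = [
    ("english.wav", "en-IN"),
    ("hindi.wav", "hi-IN"),
    ("kannada.wav", "kn-IN"),
]


def transcribe(path, lang="en-IN"):
    """Return the transcript of an audio file, or None on failure."""
    # Imported inside the function so this module can be loaded even if
    # SpeechRecognition is not installed; the ImportError surfaces on first use.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(str(path)) as source:
        audio = recognizer.record(source)  # read the whole file
    try:
        return recognizer.recognize_google(audio, language=lang)
    except sr.UnknownValueError:
        print(f"Could not understand the speech in {path}")
    except sr.RequestError as error:
        print(f"Could not reach the recognition service: {error}")
    return None


def main(samples=SAMPLES):
    """Transcribe every sample file that exists and print the result."""
    for path, lang in samples:
        if not Path(path).exists():
            print(f"Skipping {path}: file not found")
            continue
        text = transcribe(path, lang)
        if text is not None:
            print(f"{path} ({lang}): {text}")


if __name__ == "__main__":
    main()
```

Returning None on failure, rather than printing inside a bare except, lets the caller decide whether to retry or skip the file.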

Output

[Screenshots: transcription output for English, Hindi, and Kannada audio]

This blog demonstrates how to convert audio files in different languages to text using the Google speech recognition API. The API is an easy way to convert speech into text, but it requires an internet connection to operate.

If you have anything to add, please feel free to leave a comment.

Thanks for reading.