October 01, 2025

Coding Your Own Virtual Assistant with Python and Speech Recognition


Ethan Kim

@ethan-kim

Have you ever wished for a personal assistant to create reminders, send emails, or give you the weather? You can code a virtual assistant with Python and speech recognition! Imagine using your voice to control your computer. Cool, right? This tutorial will teach you how to code a small virtual assistant using Python, speech recognition, and a few libraries. Ready to code your own assistant? Let's begin!

What You Need

Before coding, let's review what you need for this project. Download the latest version of Python from python.org. We will use the SpeechRecognition library (imported as speech_recognition) to turn speech into text, and the pyttsx3 library for text-to-speech so your assistant can speak. Reading from the microphone with sr.Microphone() also requires the PyAudio package. Keep reading; I will walk you through installation!

Let's install these libraries using pip:

pip install SpeechRecognition pyttsx3 PyAudio

After setup, we can start coding our assistant!

Setting Up Speech Recognition

Our virtual assistant relies on voice commands. The speech_recognition library lets your software transform microphone audio into text. Building your assistant's capabilities starts here.

To set up speech recognition, do the following:

import speech_recognition as sr

recognizer = sr.Recognizer()

# Use the default microphone as the source for audio
with sr.Microphone() as source:
    print("Say something...")
    audio = recognizer.listen(source)

try:
    # Recognize speech using Google Speech Recognition
    print("You said: " + recognizer.recognize_google(audio))
except sr.UnknownValueError:
    print("Sorry, I did not understand that.")
except sr.RequestError:
    print("Could not request results; check your internet connection.")

What's happening here?

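The recognizer.listen() function records audio from the microphone until it detects a pause.
The recognizer.recognize_google() call sends that audio to Google's speech recognition API, which returns the text. This step needs an internet connection.
If the speech cannot be understood (sr.UnknownValueError) or the API cannot be reached (sr.RequestError), the script prints an error message instead.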

This simple setup listens to your speech and prints it. Now, make it respond!
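As written, a single failed recognition ends the script. One way to make it more forgiving is to retry a few times. The sketch below assumes a hypothetical `capture` callable that wraps the listen-and-recognize code above and returns the recognized text, or `None` when the speech was not understood. The names `listen_with_retries` and `max_attempts` are illustrative and not part of the speech_recognition library.

```python
from typing import Callable, Optional


def listen_with_retries(
    capture: Callable[[], Optional[str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Call capture() until it returns text or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        text = capture()
        if text:  # a non-empty string means recognition succeeded
            return text
        print(f"Attempt {attempt} of {max_attempts} failed, please try again.")
    return None


if __name__ == "__main__":
    # Stand-in for real microphone capture: fails once, then succeeds.
    fake_results = iter([None, "hello assistant"])
    print(listen_with_retries(lambda: next(fake_results)))
```

Passing the capture step in as a function keeps the retry logic easy to try out without a microphone; in the real assistant you would pass a function that calls recognizer.listen() and recognizer.recognize_google().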

Adding Text-to-Speech

For full interaction, your assistant should respond out loud to voice commands. This is where pyttsx3 comes in: it is a text-to-speech library that lets your program speak. Let's integrate it with our existing code.

First, import pyttsx3 and initialize the engine:

import pyttsx3

# Initialize the pyttsx3 engine
engine = pyttsx3.init()

# Set the speech rate (optional)
engine.setProperty('rate', 150)  # Speed of speech

# Set the volume (optional)
engine.setProperty('volume', 1)  # Volume level (0.0 to 1.0)

# Function to make the assistant speak
def speak(text):
    engine.say(text)
    engine.runAndWait()

To have the assistant speak back to the user, update the speech recognition code as follows:

import speech_recognition as sr
import pyttsx3

recognizer = sr.Recognizer()
engine = pyttsx3.init()

# Function to make the assistant speak
def speak(text):
    engine.say(text)
    engine.runAndWait()

# Using the microphone to capture audio
with sr.Microphone() as source:
    print("Say something...")
    audio = recognizer.listen(source)

try:
    command = recognizer.recognize_google(audio)
    print("You said: " + command)
    speak("You said: " + command)  # Assistant repeats what you said
except sr.UnknownValueError:
    speak("Sorry, I did not understand that.")
except sr.RequestError:
    speak("Could not request results; check your internet connection.")

Now, when you talk into the microphone, the assistant will print and repeat what it heard!
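The script above handles one utterance and then exits. A minimal sketch of a loop that keeps echoing until it hears a stop word might look like this. It assumes a hypothetical `capture` callable (returning the recognized text, or `None` on failure) and takes the speak function as a parameter so it can be run without a microphone. The name `echo_loop` and the stop word "stop" are illustrative choices.

```python
from typing import Callable, Optional


def echo_loop(
    capture: Callable[[], Optional[str]],
    speak: Callable[[str], None],
    stop_word: str = "stop",
) -> int:
    """Repeat what was heard until stop_word is heard; return the echo count."""
    echoed = 0
    while True:
        command = capture()
        if command is None:
            speak("Sorry, I did not understand that.")
            continue
        if stop_word in command.lower():
            speak("Stopping.")
            return echoed
        speak("You said: " + command)
        echoed += 1


if __name__ == "__main__":
    # Simulated input instead of a microphone; print stands in for speak().
    phrases = iter(["hello", None, "please stop"])
    echo_loop(lambda: next(phrases), print)
```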

Adding Simple Commands

Now that your assistant can talk and listen, give it a few simple commands to make it useful. Our assistant will learn to tell the time, open a website, and say goodbye on voice command. Here is how to add that functionality:

import datetime
import sys
import webbrowser

def respond_to_command(command):
    # The command is lowercased before it gets here, so keywords must be lowercase too
    if 'time' in command:
        now = datetime.datetime.now()
        current_time = now.strftime("%H:%M:%S")
        speak(f"The time is {current_time}")

    elif 'open youtube' in command:
        webbrowser.open("https://www.youtube.com")
        speak("Opening YouTube")

    elif 'bye' in command:
        speak("Goodbye!")
        sys.exit()

# Now, let's combine everything
with sr.Microphone() as source:
    print("Say something...")
    audio = recognizer.listen(source)

try:
    command = recognizer.recognize_google(audio).lower()
    print("You said: " + command)
    speak("You said: " + command)
    respond_to_command(command)
except sr.UnknownValueError:
    speak("Sorry, I did not understand that.")
except sr.RequestError:
    speak("Could not request results; check your internet connection.")

This code adds respond_to_command(), which checks the recognized text for keywords. If you ask "What is the time?", the assistant tells you the current time. Saying "Open YouTube" opens YouTube in your browser, and saying "bye" ends the program.
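As the list of commands grows, a chain of if/elif branches gets harder to maintain. One possible alternative is a lookup table that maps keywords to handler functions. The sketch below is only an illustration: `HANDLERS`, `build_reply`, `tell_time`, and `open_youtube` are hypothetical names, and the handlers return the reply text instead of speaking it so the logic can be checked without audio hardware.

```python
import datetime
from typing import Callable, Dict, Optional


def tell_time(now: datetime.datetime) -> str:
    """Build a spoken-friendly time such as '2:05 PM'."""
    hour = now.hour % 12 or 12
    suffix = "AM" if now.hour < 12 else "PM"
    return f"The time is {hour}:{now.minute:02d} {suffix}"


def open_youtube(now: datetime.datetime) -> str:
    # A real handler would also call webbrowser.open(...) here.
    return "Opening YouTube"


# Keyword -> handler; keys are lowercase to match the lowercased command.
HANDLERS: Dict[str, Callable[[datetime.datetime], str]] = {
    "time": tell_time,
    "open youtube": open_youtube,
}


def build_reply(command: str, now: datetime.datetime) -> Optional[str]:
    """Return the reply for the first keyword found in command, else None."""
    text = command.lower()
    for keyword, handler in HANDLERS.items():
        if keyword in text:
            return handler(now)
    return None
```

With this layout, adding a command means writing one function and adding one dictionary entry; the assistant would then pass the result of build_reply() to speak().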

Conclusion

With Python and speech recognition, you can build a simple but useful virtual assistant! By combining speech-to-text, text-to-speech, and basic keyword matching, you get an assistant that follows your spoken commands. From here, you can add more advanced commands, connect it to APIs, or use machine learning for smarter results. There are a lot of options! Keep experimenting, and your assistant will grow more capable with each command you add.
