
Creating a Speaking AI Assistant with Voice Control in Python

A Step-by-Step Guide

By Rio Vijey · Published 3 years ago · 9 min read

To create a speaking AI assistant with voice control in Python, you can combine a few libraries: SpeechRecognition to convert speech to text, pyttsx3 for text-to-speech, and PyAudio, which SpeechRecognition needs for microphone access. Here is an example to get started:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

import speech_recognition as sr
import pyttsx3

# initialize recognizer class
r = sr.Recognizer()

# initialize text-to-speech engine
engine = pyttsx3.init()

# voice control function
def voice_control():
    with sr.Microphone() as source:
        print("Say something!")
        audio = r.listen(source)

    try:
        text = r.recognize_google(audio)
        print("You said: {}".format(text))
        engine.say(text)
        engine.runAndWait()
    except sr.UnknownValueError:
        print("Could not understand audio")
    except sr.RequestError as e:
        print("Error with the recognition service; {0}".format(e))

# run voice control function
voice_control()

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

In this code, SpeechRecognition is used to listen to and recognize the speech from the microphone, and pyttsx3 is used to convert the recognized text into speech. The code initializes the text-to-speech engine, then defines a voice_control function that listens to the microphone input and converts it to text using the recognize_google method. If the audio is recognized successfully, the code uses the text-to-speech engine to say the recognized text. If the audio is not recognized, the code prints an error message.

To make this AI assistant more functional, you can add various commands and corresponding responses to it. Here is an example:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

import speech_recognition as sr
import pyttsx3
import time  # needed for time.strftime below

# initialize recognizer class
r = sr.Recognizer()

# initialize text-to-speech engine
engine = pyttsx3.init()

# voice control function
def voice_control():
    with sr.Microphone() as source:
        print("Say something!")
        audio = r.listen(source)

    try:
        text = r.recognize_google(audio)
        print("You said: {}".format(text))
        # compare against lowercased text so capitalization from the recognizer doesn't matter
        if "hello" in text.lower():
            engine.say("Hello! How can I help you today?")
        elif "time" in text.lower():
            engine.say("The current time is: " + time.strftime("%H:%M:%S"))
        else:
            engine.say("I'm sorry, I didn't understand what you said.")
        engine.runAndWait()
    except sr.UnknownValueError:
        print("Could not understand audio")
    except sr.RequestError as e:
        print("Error with the recognition service; {0}".format(e))

# run voice control function
voice_control()

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

In this code, we added an if-elif-else statement to respond to certain commands such as "hello" and "time". If the recognized text contains the word "hello", the AI assistant responds with "Hello! How can I help you today?". If it contains the word "time", the assistant says the current time. If the text doesn't match any of the commands, the assistant responds with "I'm sorry, I didn't understand what you said."

You can also add more commands and responses to this AI assistant, as well as incorporate other functionality, to make it more useful and sophisticated.

To make the AI assistant even more advanced, you can use additional libraries to incorporate different functionalities into it. For example, you can use the wikipedia library to retrieve information from Wikipedia, a weather helper or API client to get weather information, and Python's built-in time and calendar modules for date and calendar information.

Here is an example code that incorporates these functionalities:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

import speech_recognition as sr
import pyttsx3
import wikipedia
import weather   # assumes a module or helper in your project that exposes get_weather()
import time
import calendar  # imported for calendar-related commands (not used in this example)

# initialize recognizer class
r = sr.Recognizer()

# initialize text-to-speech engine
engine = pyttsx3.init()

# voice control function
def voice_control():
    with sr.Microphone() as source:
        print("Say something!")
        audio = r.listen(source)

    try:
        text = r.recognize_google(audio)
        print("You said: {}".format(text))
        # lowercase so the keyword checks are not case sensitive
        text = text.lower()
        if "hello" in text:
            engine.say("Hello! How can I help you today?")
        elif "time" in text:
            engine.say("The current time is: " + time.strftime("%H:%M:%S"))
        elif "date" in text:
            engine.say("Today is " + time.strftime("%B %d, %Y"))
        elif "weather" in text:
            weather_info = weather.get_weather()
            engine.say(weather_info)
        elif "wikipedia" in text:
            try:
                query = text.replace("wikipedia", "").strip()
                results = wikipedia.summary(query, sentences=2)
                engine.say("According to Wikipedia, " + results)
            except Exception:
                engine.say("I'm sorry, I couldn't find any information on that topic.")
        else:
            engine.say("I'm sorry, I didn't understand what you said.")
        engine.runAndWait()
    except sr.UnknownValueError:
        print("Could not understand audio")
    except sr.RequestError as e:
        print("Error with the recognition service; {0}".format(e))

# run voice control function
voice_control()

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

In this code, we added functionalities such as getting the date, weather information, and information from Wikipedia. If the recognized text contains the word "date", the AI assistant will say the current date. If the recognized text contains the word "weather", the AI assistant will say the current weather information. If the recognized text contains the word "wikipedia", the AI assistant will retrieve a summary of the Wikipedia article corresponding to the text following "wikipedia".

This AI assistant can be further improved by adding more functionalities and making the code more robust and efficient. However, this code provides a good starting point for creating a speaking AI assistant with voice control.

Another important aspect to consider when creating an AI assistant is error handling. In the previous code, we added some basic error handling for speech recognition and the Wikipedia lookup, but you may want to handle other kinds of failures as well. For example, you can wrap the weather lookup in its own try-except block in case the API returns an error or there is a problem with the network connection.

Here's an example of how you can add additional error handling:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

import speech_recognition as sr
import pyttsx3
import wikipedia
import weather   # assumes a module or helper in your project that exposes get_weather()
import time
import calendar  # imported for calendar-related commands (not used in this example)

# initialize recognizer class
r = sr.Recognizer()

# initialize text-to-speech engine
engine = pyttsx3.init()

# voice control function
def voice_control():
    with sr.Microphone() as source:
        print("Say something!")
        audio = r.listen(source)

    try:
        text = r.recognize_google(audio)
        print("You said: {}".format(text))
        # lowercase so the keyword checks are not case sensitive
        text = text.lower()
        if "hello" in text:
            engine.say("Hello! How can I help you today?")
        elif "time" in text:
            engine.say("The current time is: " + time.strftime("%H:%M:%S"))
        elif "date" in text:
            engine.say("Today is " + time.strftime("%B %d, %Y"))
        elif "weather" in text:
            try:
                weather_info = weather.get_weather()
                engine.say(weather_info)
            except Exception:
                engine.say("I'm sorry, there was a problem getting weather information.")
        elif "wikipedia" in text:
            try:
                query = text.replace("wikipedia", "").strip()
                results = wikipedia.summary(query, sentences=2)
                engine.say("According to Wikipedia, " + results)
            except Exception:
                engine.say("I'm sorry, I couldn't find any information on that topic.")
        else:
            engine.say("I'm sorry, I didn't understand what you said.")
        engine.runAndWait()
    except sr.UnknownValueError:
        engine.say("I'm sorry, I didn't understand what you said.")
        engine.runAndWait()
    except sr.RequestError:
        engine.say("I'm sorry, there was a problem with the recognition service.")
        engine.runAndWait()

# run voice control function
voice_control()

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

In this code, we added a try-except block around the weather lookup. If it fails, the AI assistant responds with "I'm sorry, there was a problem getting weather information." Additionally, the speech recognition errors are now spoken back to the user rather than only printed: if the recognition service returns an error or there is a problem with the network connection, the AI assistant responds with "I'm sorry, there was a problem with the recognition service."

This error handling helps to make the AI assistant more user-friendly and resilient to potential errors. You can add more error handling as necessary to make your AI assistant as robust and reliable as possible.

Another aspect you might want to consider is to expand the functionality of your AI assistant. For example, you can add more voice commands and integrate with other APIs to provide more information and services. Some popular APIs that you can use include the following:

News API: Provides access to news articles from a variety of sources. You can integrate this API to provide the latest news headlines or specific news articles based on the user's request.

Google Maps API: Provides access to Google Maps data, including maps, directions, and location information. You can use this API to provide directions, nearby locations, and more.

OpenWeatherMap API: Provides weather information for cities around the world. You can use this API to provide the current weather, forecast, and other weather-related information (a minimal integration sketch appears after this list).

Wolfram Alpha API: Provides access to a vast collection of knowledge and information on a wide range of topics. You can use this API to provide answers to questions and provide information on a variety of topics.

By integrating with these APIs, you can significantly expand the functionality of your AI assistant and provide more information and services to the users.
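For instance, here is a minimal sketch of how an OpenWeatherMap lookup could back the assistant's "weather" command. It uses the requests library and OpenWeatherMap's current-weather endpoint; the API key, the default city, and the get_weather helper name are illustrative choices, not part of the earlier examples.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

import requests

# Minimal sketch: fetch current weather from OpenWeatherMap.
# You need to sign up for an API key; the key and city below are placeholders.
def get_weather(city="London", api_key="YOUR_OPENWEATHERMAP_API_KEY"):
    url = "https://api.openweathermap.org/data/2.5/weather"
    params = {"q": city, "appid": api_key, "units": "metric"}
    response = requests.get(url, params=params, timeout=10)
    response.raise_for_status()  # raise an exception for bad HTTP status codes
    data = response.json()
    description = data["weather"][0]["description"]
    temperature = data["main"]["temp"]
    return "It is currently {} with a temperature of {:.0f} degrees Celsius in {}.".format(
        description, temperature, city)

# Inside the assistant's "weather" branch you could then call:
# engine.say(get_weather("Paris"))

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -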

It's also a good idea to add user authentication and privacy protection when integrating with these APIs. For example, you can use OAuth or other authentication methods to ensure that user data is protected and secure. Additionally, you should also ensure that you follow the terms of use and privacy policies of the APIs you integrate with.
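A simple first step in that direction is keeping API keys out of your source code. The sketch below reads a key from an environment variable instead of hardcoding it; the variable name OPENWEATHERMAP_API_KEY is just an illustrative choice.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

import os

# Read the API key from the environment instead of hardcoding it in the script.
# Set the variable in your shell or deployment environment before running the assistant.
api_key = os.environ.get("OPENWEATHERMAP_API_KEY")
if api_key is None:
    raise RuntimeError("Please set the OPENWEATHERMAP_API_KEY environment variable.")

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -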

You can also consider making your AI assistant more natural and user-friendly by using natural language processing (NLP) techniques. NLP can help your assistant understand the meaning and context of the user's requests and respond more naturally and effectively.
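As a lightweight step in that direction, you can replace exact substring checks with fuzzy keyword matching, so near-misses from the recognizer still map to the right command. This sketch uses difflib from the standard library; the command list is illustrative.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

import difflib

# Map spoken keywords to command names; extend this list as you add features.
COMMANDS = ["hello", "time", "date", "weather", "wikipedia"]

def match_command(text):
    """Return the best-matching command for a spoken phrase, or None."""
    for word in text.lower().split():
        matches = difflib.get_close_matches(word, COMMANDS, n=1, cutoff=0.8)
        if matches:
            return matches[0]
    return None

# Example: a slightly misrecognized phrase still maps to the "weather" command.
print(match_command("what's the whether like"))  # -> weather

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -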

Another important aspect to consider is making your AI assistant more accessible and user-friendly. Here are a few tips for improving the user experience:

Provide clear and concise responses: Make sure the responses from your AI assistant are clear, concise, and easy to understand. Avoid using technical jargon or overly complex language.

Add context and personalization: You can use the user's name or other information to personalize the responses and make the AI assistant feel more human. You can also provide context for the responses, such as the source of the information or the time frame for the request.

Improve speech recognition accuracy: The accuracy of the speech recognition can significantly impact the user experience. You can improve accuracy by using high-quality microphones, reducing background noise, and fine-tuning the recognizer settings (a short sketch of these settings appears after these tips).

Allow for multiple languages: If your AI assistant will be used by users from different countries or regions, consider adding support for multiple languages. This can be done by integrating with speech recognition and text-to-speech engines that support multiple languages.

Make the AI assistant easily accessible: Consider adding shortcuts or hotkeys to make the AI assistant easily accessible. You can also consider adding a graphical user interface (GUI) to make it easier for users to interact with the AI assistant.

By following these tips, you can make your AI assistant more accessible and user-friendly, which can lead to increased adoption and usage.
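Two of these tips are easy to try with the libraries already in use. The sketch below adjusts the recognizer for ambient noise, passes a language code to recognize_google, and changes the pyttsx3 voice and speaking rate; the specific language code and voice index are illustrative values to tune for your setup.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

import speech_recognition as sr
import pyttsx3

r = sr.Recognizer()
engine = pyttsx3.init()

with sr.Microphone() as source:
    # Sample the background noise for a second so the energy threshold
    # adapts to the room instead of relying on the default value.
    r.adjust_for_ambient_noise(source, duration=1)
    r.dynamic_energy_threshold = True
    print("Say something!")
    audio = r.listen(source)

try:
    # Pass a language code to recognize speech in another language,
    # e.g. "es-ES" for Spanish or "fr-FR" for French.
    text = r.recognize_google(audio, language="es-ES")
    print("You said: {}".format(text))
except sr.UnknownValueError:
    print("Could not understand audio")

# pyttsx3 exposes the installed voices and the speaking rate.
voices = engine.getProperty("voices")
if voices:
    engine.setProperty("voice", voices[0].id)  # pick any voice installed on your system
engine.setProperty("rate", 160)  # words per minute
engine.say("Hola, ¿en qué puedo ayudarte?")
engine.runAndWait()

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -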

In addition, here are some best practices to keep in mind while developing your AI assistant:

Test thoroughly: It's important to thoroughly test your AI assistant to ensure that it works as intended. Test the assistant in different environments, with different types of input, and with different user scenarios (a sketch of a testable structure appears after these practices).

Make it easy to maintain: Make sure your code is well organized, commented, and easy to understand. This will make it easier for others to understand and maintain the code, and it will also make it easier for you to make updates and improvements in the future.

Document the code: Make sure to document the code, including descriptions of the algorithms, data structures, and APIs used. This documentation can be helpful for others who want to understand the code, and it can also serve as a reference for you when you need to make updates or improvements in the future.

Keep it secure: Security is a key concern when developing AI assistants, especially if they will be used by a large number of users. Make sure to use encryption and secure communication protocols to protect user data and ensure that the AI assistant is not vulnerable to hacking or other security threats.

Follow ethical guidelines: AI assistants can have a significant impact on people's lives, so it's important to ensure that they are developed ethically. Follow ethical guidelines and best practices for AI development, and make sure that your AI assistant does not violate any privacy rights or ethical principles.

By following these best practices, you can ensure that your AI assistant is well-designed, reliable, and secure. Additionally, by following ethical guidelines, you can help ensure that AI technology is used for the benefit of society and not to cause harm.
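One practical way to act on the testing and maintainability advice is to separate the command logic from the microphone and speech code, so it can be exercised with ordinary unit tests. The sketch below shows that structure; handle_command and the pytest-style tests are illustrative, not the exact code from the earlier examples.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

import time

def handle_command(text):
    """Pure function: map recognized text to a spoken response."""
    text = text.lower()
    if "hello" in text:
        return "Hello! How can I help you today?"
    if "time" in text:
        return "The current time is: " + time.strftime("%H:%M:%S")
    return "I'm sorry, I didn't understand what you said."

# Because handle_command has no audio dependencies, it can be tested directly,
# for example with pytest:
def test_hello_command():
    assert handle_command("Hello there") == "Hello! How can I help you today?"

def test_unknown_command():
    assert "didn't understand" in handle_command("gibberish")

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

The voice loop then shrinks to recognizing speech and calling engine.say(handle_command(text)), which keeps the hard-to-test audio code as thin as possible.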

To make your speaking AI assistant into a computer software, you'll need to package it in a way that makes it easy for users to install and use on their computers. Here are some steps you can follow to package your AI assistant as a computer software:

Choose a packaging format: There are several formats for packaging software, including installer packages (e.g. .exe or .dmg), compressed archives (e.g. .zip), and containers (e.g. Docker). Choose the format that best fits your needs and the type of software you are creating (a PyInstaller sketch appears after these steps).

Create a user-friendly installation process: Make the installation process as simple and straightforward as possible. Provide clear instructions and guidance for users, and include any necessary dependencies or prerequisites.

Test the software: Thoroughly test the software on a variety of different computer configurations and operating systems to ensure that it works as expected.

Include documentation: Provide clear and concise documentation for the software, including instructions for installation, usage, and troubleshooting.

Obtain necessary licenses: If your AI assistant uses any third-party libraries, tools, or APIs, make sure to obtain the necessary licenses and include them with the software.

Choose a distribution channel: Decide on the best way to distribute the software, such as through a website, an app store, or a download service.

By following these steps, you can package your speaking AI assistant as a computer software that is easy to install, use, and maintain. Additionally, by providing clear and concise documentation and choosing a suitable distribution channel, you can make your software more accessible and user-friendly.
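As one concrete option for the packaging step, PyInstaller can bundle the assistant into a single executable. The sketch below drives PyInstaller from Python; running the pyinstaller command line with the same flags works equally well. The script name assistant.py is an assumption, and you should verify on your platform that the audio and text-to-speech dependencies are picked up correctly.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

# Packaging sketch using PyInstaller's programmatic interface.
# Equivalent to running: pyinstaller --onefile --name voice-assistant assistant.py
import PyInstaller.__main__

PyInstaller.__main__.run([
    "assistant.py",          # the script containing voice_control()
    "--onefile",             # bundle everything into a single executable
    "--name", "voice-assistant",
])

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -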


