ESP32 and a Gravity Voice Control Sensor

  1. Why Offline Language Recognition?
  2. Wiring the Gravity Voice Recognition Sensor
  3. Upgrading the Speaker
  4. Installing The Gravity Library
  5. First Example Voice Recognition Program
  6. List of Starter Voice Commands
  7. Example Voice Controlled Servos
  8. Still need help?

Why Offline Language Recognition?

Language recognition in 2024 has come a long way from products you could buy back in the 90’s. The first dictation software I bought ran on a Windows 95 PC and would get at minimum 20% of all words wrong. I remember spending hours uselessly trying to calibrate it. The software would always write “I can speak spinach” because “Spanish” is apparently a vegetable.

I trashed the software and didn’t try voice recognition again for 15 years.

Then, in 2010 you could buy these voice recognition modules that were small and low power enough for miniature robots, and they could do about 10 one-word commands. They’d work 70% of the time in a quiet room, if you spoke very clearly, but much less reliably in noisy rooms with people talking. Hey, good enough! Interestingly, I would find myself angrily raising my voice whenever my robot didn’t listen to me – something that happened often. It was just a machine, right?


Fast forward to 2024, and AI voice recognition is rather impressive (albeit still not perfect). Now you have many options for robot voice control. You could put a cellphone on your robot, link that to your ESP32, and it could run ChatGPT to perform rather complex tasks. However, that would require your robot to have an available WiFi/cell connection, and software-wise it’s definitely NOT for beginners!

So for this tutorial, we’re going to use a small low power voice recognition module, one that’s offline and very beginner friendly. The ones available today are definitely way better than what I used back in 2010, with a vocabulary of many dozens of words – and can even do sounds like dog barks, cat meows, whistling, clicks, etc.

Think of all the things you can make with that capability!

Getting Started on Hardware

For this project, I chose to use the…

Gravity: Offline Language Learning Voice Recognition Sensor (Amazon)

Gravity: Offline Language Learning Voice Recognition Sensor (website)

It has a built-in microphone, and a speaker to verbally confirm commands back at you. I find it accurate about 90% of the time in a quiet room, and 70% of the time in a noisy room. The speaker is hard to hear in a noisy room, but it has the ability to upgrade to a louder external speaker (keep reading). I found it extremely user friendly, and got it working on my robot in no time.

Note: If you plan to have it operate in a room with moderate noise, definitely add an external speaker. I could barely hear it at all.

I encourage you to look at their tutorial for the Arduino, but it won’t be necessary for our following ESP32 projects.

Image7

Wiring the Gravity Voice Recognition Sensor

Wiring is straightforward. You have 4 wires going to your ESP32: 3.3V and ground, plus two wires for I2C. Flip the switch to SPK1. Finished!

Note: I2C is pronounced ‘eye squared C.’ It’s actually I²C, but few people can find that little ² on their keyboard. I’m still wondering how I typed it, myself.

Note: The datasheet doesn’t have schematics or even list power requirements for the sensor, so I experimentally determined the power requirement using a power supply. It MUST be given 3.3V as it doesn’t have a built-in voltage regulator. It uses up to 325mA of current. As your ESP32 already uses 175mA (or more with WiFi), this means you cannot share the regulated 3.3V directly from your ESP32 for the Gravity sensor. The built in regulator on your ESP32 board likely cannot handle the 500mA total required. My solution was to use this step-down voltage regulator to power the Gravity sensor via the same 4.8V 2000mAh NiMH battery that powers my ESP32.
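To make the arithmetic in that note concrete, here is a minimal sketch of the power budget in plain C++ (it runs on a PC, no ESP32 needed). The current figures and battery capacity are the ones quoted above. The 400mA regulator limit is a made-up placeholder, so look up the real number for your board. The runtime figure is only a rough estimate that assumes both loads draw their full current straight from the battery the whole time.

```cpp
#include <cassert>
#include <cstdint>

// Figures quoted in the note above.
constexpr uint32_t kEsp32CurrentmA  = 175;   // ESP32 without WiFi
constexpr uint32_t kSensorCurrentmA = 325;   // Gravity sensor, measured peak
constexpr uint32_t kBatterymAh      = 2000;  // 4.8V NiMH pack capacity

// Hypothetical regulator limit, for illustration only. Check your board.
constexpr uint32_t kAssumedRegulatorLimitmA = 400;

// Total current if both loads shared a single 3.3V regulator.
constexpr uint32_t totalCurrentmA() {
  return kEsp32CurrentmA + kSensorCurrentmA;
}

// True if a regulator with the given limit could carry both loads.
constexpr bool regulatorCanSupply(uint32_t limitmA) {
  return totalCurrentmA() <= limitmA;
}

// Rough runtime in minutes, ignoring regulator losses and battery derating.
constexpr uint32_t idealRuntimeMinutes() {
  return kBatterymAh * 60 / totalCurrentmA();
}

void powerBudgetExample() {
  assert(totalCurrentmA() == 500);                         // 175 + 325
  assert(!regulatorCanSupply(kAssumedRegulatorLimitmA));   // 500 > 400
  assert(idealRuntimeMinutes() == 240);                    // 2000mAh / 500mA = 4 hours
}
```

If your board's regulator is rated comfortably above the total, sharing it may work, but a separate regulator for the sensor is the safer bet.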

Wiring chart + diagram as follows:

Image8
Sensor pin    Connection
GND           Goes to (-) negative of your battery.
SCL, C/R      Goes to SCL pin of ESP32.
SDA, D/T      Goes to SDA pin of ESP32.
VCC           Goes to the 3.3V output of the step-down voltage regulator (not the 3.3V pin of the ESP32 board, see the power note above).

Below is how to wire the ESP32. Be sure to give it appropriate power as per the manual for the ESP32 you choose to use.

Image1

Upgrading the Speaker

As I mentioned earlier, the speaker was difficult to hear in a noisy room. There are two ways to upgrade your speaker: a big speaker + amplifier (outside the scope of this tutorial), or a small/cheap passive speaker. I went with the latter. Volume subjectively improved by about 30%, and sound quality is far better. Definitely worth getting an external speaker.


I got this set of speakers from Amazon. An unrelated DFRobot tutorial suggests 4Ω 3W speakers. I tried both 8Ω (left speaker in the image below) and 4Ω (right speaker in the image below), and both subjectively performed the same. Use a bit of double-sided sticky tape or velcro tape to attach them to your project.

Image12

I then directly plugged it into the SPK2 port on the device. Flip the switch to SPK2.

Image8
Image3

Installing The Gravity Library

Now that you have your hardware wired up, let’s get onto the software side of things.

Load up your Arduino IDE, and go to:

Tools => Manage Libraries

and type in DFRobot_DF2301Q into search.

Download and install the library as below:

Image2

Note: check out the DFRobot Gravity Voice Recognition Library on Github for more info.

First Example Voice Recognition Program

Let’s try out the first built in example program that comes with the library. This will confirm your wiring and let you try it out, and then later we’ll do a more advanced example.

Go to:

File => Examples => DFRobot_DF2301Q => i2c

Image5

Their example program doesn’t really do anything so we’re going to make it better by controlling an LED.

At the very top of setup(), you’ll need to declare the I2C and LED pins. Add in these two lines:

  Wire.begin(26, 27);
  pinMode(2,OUTPUT);

Change the pin numbers to match your setup. Next, look for this piece of code:

 

  uint8_t CMDID = 0;
  CMDID = DF2301Q.getCMDID();
  if(0 != CMDID) {
    Serial.print("CMDID = ");
    Serial.println(CMDID);
  }
  delay(3000);

…and replace it with this:

  uint8_t CMDID = DF2301Q.getCMDID();
  switch (CMDID) {
    case 103: //If the command is “Turn on the light”
      Serial.println("received'Turn on the light',command flag'103'");
      //change 2 to be your LED pin
      //LOW lights the LED only if it is wired active-low; swap LOW and HIGH if yours is active-high
      digitalWrite(2, LOW);
      break;
    case 104: //If the command is “Turn off the light”
      Serial.println("received'Turn off the light',command flag'104'");
      //change 2 to be your LED pin
      digitalWrite(2, HIGH);
      break;
    default:
      if (CMDID != 0) {
        Serial.print("\nCMDID = ");  //Printing command ID
        Serial.print(CMDID);
      }
  }

In the above code, set the LED pin to the correct pin on your ESP32 setup.
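Whether LOW or HIGH lights your LED depends on how it is wired, which is easy to get backwards. Below is a small sketch that separates "what did the user ask for" from "what level goes on the pin". It is plain C++ so you can test it on a PC. The names actionForCommand and pinLevelFor are my own hypothetical helpers, not part of the DFRobot library. Only the IDs 103 and 104 come from the code above.

```cpp
#include <cassert>
#include <cstdint>

// Command IDs used in the switch statement above.
constexpr uint8_t kCmdLightOn  = 103;
constexpr uint8_t kCmdLightOff = 104;

enum class LedAction { None, On, Off };

// Translate a command ID into what the LED should do.
constexpr LedAction actionForCommand(uint8_t cmdId) {
  return cmdId == kCmdLightOn  ? LedAction::On
       : cmdId == kCmdLightOff ? LedAction::Off
                               : LedAction::None;
}

// Pin level (true = HIGH, false = LOW) needed for the wanted LED state.
// activeLow = true means the LED lights when the pin is LOW.
constexpr bool pinLevelFor(bool ledOn, bool activeLow) {
  return activeLow ? !ledOn : ledOn;
}

void ledMappingExample() {
  assert(actionForCommand(103) == LedAction::On);
  assert(actionForCommand(104) == LedAction::Off);
  assert(actionForCommand(0) == LedAction::None);  // 0 means nothing was heard
  assert(pinLevelFor(true, true) == false);        // active-low LED: LOW turns it on
  assert(pinLevelFor(true, false) == true);        // active-high LED: HIGH turns it on
}
```

On the ESP32 you would feed the result into digitalWrite, for example digitalWrite(2, pinLevelFor(true, true) ? HIGH : LOW).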

Then, compile and run it on your ESP32.

Now, say the phrase, “Hello robot. Turn on the light…. Turn off the light”. These are pre-programmed into the Gravity device.

Below is a screenshot of what you should see in Serial in your Arduino IDE when running the demo.

Image6

Debugging Help

If you get this below message:

[E][Wire.cpp:499] requestFrom(): i2cWriteReadNonStop returned Error 263

Power cycle your ESP32 and the Gravity device, and it should go away. Also, check your wiring and the battery voltage. The error indicates a communication issue with the Gravity device.

Depending on your system design, you may temporarily get this error during startup while the power is still settling. It should go away on its own within a few seconds.
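If the error only shows up while power is settling, one option is to retry the sensor's startup call a limited number of times instead of giving up on the first failure. Here is a minimal sketch of that idea in plain C++. retryUntilReady is a hypothetical helper of my own, and the two callbacks stand in for whatever "try to talk to the sensor" and "wait a bit" mean in your project.

```cpp
#include <cassert>
#include <functional>

// Calls tryOnce up to maxAttempts times, calling waitBetween after each
// failed attempt except the last. Returns the number of attempts used on
// success, or 0 if every attempt failed.
int retryUntilReady(const std::function<bool()>& tryOnce,
                    int maxAttempts,
                    const std::function<void()>& waitBetween) {
  for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
    if (tryOnce()) return attempt;
    if (attempt < maxAttempts) waitBetween();
  }
  return 0;
}

void retryExample() {
  // Fake device that only answers on the third try.
  int calls = 0;
  int waits = 0;
  int used = retryUntilReady([&] { return ++calls >= 3; }, 5, [&] { ++waits; });
  assert(used == 3);
  assert(waits == 2);
}
```

On the ESP32, tryOnce could wrap the library's begin() call and waitBetween could call delay(1000). A return value of 0 tells you it is time to check wiring and battery voltage rather than wait any longer.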

List of Starter Voice Commands

After running the same program from above, let’s configure your Gravity Voice Recognition device to recognize your unique voice commands.

Below is my quick reference list of the most important commands you’ll need to get started. Power up your unit and try out a few commands. After giving a verbal command, listen to what she says before continuing. It should all be self-explanatory.

See the DFRobot tutorial for a full list of command words. 140+ commands available!

Command                  Description
Hello Robot              Default system wake word to get started.
Learning wake word       Change the wake word.
Learning command word    Teaches it a new command.
Re-learn                 Replaces a command with another.
…Exit learning           Say this to exit learning.
I want to delete         Enters the delete function.
…Delete wake word        Erase the learned awakening words from the system.
…Delete command word     Removes a previously acquired command phrase.
…Delete all              Deletes all commands and phrases from memory.
…Exit deleting           Exits the delete function.

Check out the demo video below for adding commands.

Example Voice Controlled Servos

Now that you have all the individual parts working, let’s make it do something useful. This project will control just two servos using the external servo controller from our previous tutorial (link here).

If you have never used servos before, definitely check out that tutorial before continuing. If you wish to do something other than move servos, modify the servos.h file as you like (keep reading).

I am using pins #5 and #9 for the two servos (see servos.h below). Below is the wiring of the servo controller (from the servo tutorial) for reference.

Image13

Add in new voice commands

Go back to the previous step above (starter voice commands), delete all your previous commands, and add in these 6 new commands in this exact order:

Right Claw Open

Right Claw Closed

Left Claw Open

Left Claw Closed

Claw Attack

System Off

The CMDID (command ID) will end up being 5 through 10. This is necessary to match your command to the ESP32 software. You can also change the CMDIDs in voice_gravity.h. Continue on to the next section for more info.
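Here is a quick sketch of how the learning order turns into command IDs, assuming (as described above) that the first learned command gets ID 5 and each one after it gets the next number. The phrase order in the array follows the case labels in voice_gravity.h. The helper names are hypothetical and the code is plain C++, so it runs on a PC.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// First ID handed out to a learned command; 0 to 4 are reserved.
constexpr uint8_t kFirstLearnedId = 5;

// Phrases in the order they are taught to the sensor.
const char* const kLearnedPhrases[] = {
  "right claw open", "right claw closed",
  "left claw open",  "left claw closed",
  "claw attack",     "system off",
};
constexpr uint8_t kLearnedCount =
    sizeof(kLearnedPhrases) / sizeof(kLearnedPhrases[0]);

// ID expected for the command taught at position index (0 = first).
constexpr uint8_t idForLearnedIndex(uint8_t index) {
  return static_cast<uint8_t>(kFirstLearnedId + index);
}

// Phrase for a given ID, or nullptr if the ID is not a learned command.
const char* phraseForId(uint8_t cmdId) {
  if (cmdId < kFirstLearnedId || cmdId >= kFirstLearnedId + kLearnedCount) {
    return nullptr;
  }
  return kLearnedPhrases[cmdId - kFirstLearnedId];
}

void commandIdExample() {
  assert(idForLearnedIndex(0) == 5);
  assert(idForLearnedIndex(kLearnedCount - 1) == 10);
  assert(std::strcmp(phraseForId(9), "claw attack") == 0);
  assert(phraseForId(4) == nullptr);
  assert(phraseForId(11) == nullptr);
}
```

If you teach the commands in a different order, the IDs shift with them, so either re-teach in order or update the case numbers to match.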

Software for your ESP32

Let’s program your ESP32.

Create a new project in your Arduino IDE called voice_recog_demo.ino. Then create two header files, voice_gravity.h and servos.h.

Image4

Next, copy/paste the following code into each file.

voice_recog_demo.ino

/*
 * Code written by John Palmisano
 * July 2024
 * various online libraries additionally credited in header files
*/

#include <string>
#include <Wire.h>//I2C, used by Wire.begin() below

//states to put the robot into
uint8_t STATE=0;
#define left_claw_open    1
#define left_claw_close   2
#define right_claw_open   3
#define right_claw_close  4
#define claw_attack       5
#define system_OFF        6

#include "servos.h"//servo (action) code
#include "voice_gravity.h"//voice recognition code

void setup() {
  //small delay for proper power up
  delay(150);

  Serial.begin(115200);
  Serial.print("\nStarting up...");

  Wire.begin(26, 27);

  servos_setup();
  delay(10);

  voice_setup();
  delay(10);

  servo_goto_init_pos();
  delay(500);

  Serial.print("\nStartup Complete");
}

void loop() {
  voice_loop();//get voice command

  //make sure this can run at least every 20ms or so, or servos will get sluggish
  servo_loop();//do something
}

voice_gravity.h

/*
https://wiki.dfrobot.com/SKU_SEN0539-EN_Gravity_Voice_Recognition_Module_I2C_UART
*/

#include "DFRobot_DF2301Q.h"

DFRobot_DF2301Q_I2C asr;

void voice_setup(void)
  {
  // Init the sensor
  while (!(asr.begin())) {
    Serial.println("Communication with device failed, please check connection");
    delay(1000);
  }
  Serial.print("\nvoice recognition initiated");

  /**
   * @brief Set voice volume
   * @param voc - Volume value(1~7)
   */
  asr.setVolume(7);

  /**
     @brief Set mute mode
     @param mode - Mute mode; set value 1: mute, 0: unmute
  */
  asr.setMuteMode(0);

  /**
     @brief Set wake-up duration
     @param wakeTime - Wake-up duration (0-255)
  */
  asr.setWakeTime(20);

  /**
     @brief Get wake-up duration
     @return The currently-set wake-up period
  */
  uint8_t wakeTime = 0;
  wakeTime = asr.getWakeTime();
  Serial.print("\nwakeTime = ");
  Serial.println(wakeTime);

  // asr.playByCMDID(1);   // Wake-up command

  /**
     @brief Play the corresponding reply audio according to the ID
     @param CMDID - command word ID
  */
  //asr.playByCMDID(23);  // Command word ID
  }

void voice_loop(void)
  {
  uint8_t CMDID = asr.getCMDID();

  switch (CMDID) {
    //note:
    //case 0 to 4 is hard coded, can't be changed
    //change case # to match your CMDID
    case 5:
      STATE=right_claw_open;
      Serial.print(CMDID);
      Serial.print(" right claw open\n");
      break;
    case 6:
      STATE=right_claw_close;
      Serial.print(CMDID);
      Serial.print(" right claw closed\n");
      break;
    case 7:
      STATE=left_claw_open;
      Serial.print(CMDID);
      Serial.print(" left claw open\n");
      break;
    case 8:
      STATE=left_claw_close;
      Serial.print(CMDID);
      Serial.print(" left claw closed\n");
      break;
    case 9:
      STATE=claw_attack;
      Serial.print(CMDID);
      Serial.print(" claw attack\n");
      break;
    case 10:
      STATE=system_OFF;
      Serial.print(CMDID);
      Serial.print(" all servos off\n");
      break;
    default:
      if (CMDID != 0) {
        Serial.print("\nCMDID = ");  //Printing command ID
        Serial.print(CMDID);
        Serial.print("\n");
      }
    }
  }

servos.h

// get commands from:
// https://github.com/janelia-arduino/PCA9685/blob/master/src/PCA9685/PCA9685.cpp
// https://github.com/peterpolidoro/PCA9685/blob/master/src/PCA9685.h

#include <PCA9685.h>

PCA9685 pca9685;

const uint16_t servo_time =40;//time in ms between servo angle changes
uint64_t  servo_last_time =0;//last time servo was changed

//servo channel
#define right_hand          9
#define left_hand           5

//current servo angle
uint8_t right_hand_pos    = 90;
uint8_t left_hand_pos     = 90;

//initialize routine
void servos_setup(void)
{
  pca9685.setupSingleDevice(Wire,0x40);
  pca9685.setToServoFrequency();
  servo_last_time=millis();
}

//scale an angle of 0 - 180 to a pulse width of 800 - 2800 microseconds
//(despite the _ms in the name, the result is in microseconds)
uint16_t angle_2_ms(uint8_t angle)
  {
  return 2000*angle/180+800;
  }

//turn everything off
void all_servos_off(void)
  {
  pca9685.setChannelServoPulseDuration(right_hand,0);
  pca9685.setChannelServoPulseDuration(left_hand,0);
  }

//command all servos at once
void servo_pulse_all(void)
  {
  pca9685.setChannelServoPulseDuration(right_hand,angle_2_ms(right_hand_pos));
  pca9685.setChannelServoPulseDuration(left_hand,angle_2_ms(left_hand_pos));
  }

//print servo positions to screen
void servo_print_all(void)
  {
  Serial.printf("\ns: %u",right_hand_pos);
  Serial.printf("\ns: %u",left_hand_pos);
  }

//command servos to go to some initial position
void servo_goto_init_pos(void)
  {
  //convert the 90 degree start angle to a pulse width; a raw 90 is not a valid pulse
  pca9685.setChannelServoPulseDuration(right_hand,angle_2_ms(90));
  pca9685.setChannelServoPulseDuration(left_hand,angle_2_ms(90));
  }

Note that this code doesn’t command a servo directly to a specific position. It will smoothly transition the servo over time – much better!

See our servo tutorial for more information. The rest of servos.h continues below:

void servo_loop()
  {
  if(STATE==system_OFF)
    all_servos_off();
  else if(millis()-servo_last_time > servo_time) //change angle slowly using a timer
    {
    servo_last_time=millis();//restart the timer so the next step waits servo_time again

    if(STATE==right_claw_open)
      {
      if(right_hand_pos<130)
        right_hand_pos++;
      }
    if(STATE==right_claw_close)
      {
      if(right_hand_pos>50)
        right_hand_pos--;
      }
    if(STATE==left_claw_open)
      {
      if(left_hand_pos>50)
        left_hand_pos--;
      }
    if(STATE==left_claw_close)
      {
      if(left_hand_pos<130)
        left_hand_pos++;
      }
    if(STATE==claw_attack)
      {
      right_hand_pos++;
      left_hand_pos++;
      if(right_hand_pos>130)
        right_hand_pos=40;
      if(left_hand_pos>130)
        left_hand_pos=40;
      delay(1);
      }
    servo_pulse_all();
    }
  }
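The two ideas in servos.h, scaling an angle to a pulse width and nudging the angle one step at a time on a timer, are easier to play with off the robot. Below is a minimal standalone sketch of both in plain C++, with the current time passed in so you can test it on a PC. It is not part of servos.h. angleToPulseUs, SteppedServo and stepToward are hypothetical names of my own, and the 800 and 2000 microsecond endpoints are only example values, not figures from any servo datasheet.

```cpp
#include <cassert>
#include <cstdint>

// Linear map from an angle (0 to 180 degrees) to a pulse width in
// microseconds, where 0 degrees gives minUs and 180 degrees gives maxUs.
constexpr uint16_t angleToPulseUs(uint8_t angle, uint16_t minUs, uint16_t maxUs) {
  return static_cast<uint16_t>(
      minUs + (static_cast<uint32_t>(maxUs - minUs) * angle) / 180);
}

struct SteppedServo {
  uint8_t  pos;         // current angle in degrees
  uint32_t lastStepMs;  // time of the last step, in milliseconds
};

// Moves pos one degree toward target, but only if intervalMs has passed
// since the last step. Returns true when a step was taken.
bool stepToward(SteppedServo& s, uint8_t target, uint32_t nowMs, uint32_t intervalMs) {
  if (s.pos == target) return false;                    // already there
  if (nowMs - s.lastStepMs < intervalMs) return false;  // too soon
  s.pos += (s.pos < target) ? 1 : -1;
  s.lastStepMs = nowMs;
  return true;
}

void steppingExample() {
  assert(angleToPulseUs(0, 800, 2000) == 800);
  assert(angleToPulseUs(90, 800, 2000) == 1400);
  assert(angleToPulseUs(180, 800, 2000) == 2000);

  SteppedServo claw{90, 0};
  assert(!stepToward(claw, 130, 10, 40));  // only 10ms elapsed, no move
  assert(stepToward(claw, 130, 40, 40));   // 40ms elapsed, one step
  assert(claw.pos == 91);
}
```

On the ESP32 you would pass millis() as nowMs. Because nothing blocks while waiting, the main loop stays free to keep polling the voice sensor.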

Compile and upload it to your ESP32. Power cycle your setup. Then say, “Hello Robot… right claw open…” etc. Try each of the commands. You should see output like the following:

Image9

Note: As I mentioned at the beginning of this tutorial, the Gravity voice recognition system isn’t perfect. When I tested this code, “right claw closed” was misunderstood as “right claw open” several times in a row. If this becomes a problem for you, choose commands that sound nothing like each other. I also found pronouncing “closed” as “ka…low…sed” worked much better. Perhaps it struggles with consonant clusters?

Voice Controlled Servos Demo

Two wiggling servos didn’t sound like an exciting demo, so I put the above code (without modification) onto an old robot of mine.


Each claw consists of a single servo and its servo horn screwed into some plastic (see below image).

Image11

The below video should be self-explanatory.

Still need help?

If you want to voice control something other than servos and need a tutorial to get you started, write in to us and maybe we’ll expand this tutorial with your suggestion. =)

