2021-11-06 07:11

Application of Mobile Robots by Using Speech Recognition in Engineering

Prof. Dr. Eng. PVL Narayana Rao1, Er. Pothireddy Siva Abhilash2

1Professor of Computer Science Engineering, Dept. of Information System, College of Computing & Informatics, Wolkite University, P.O.Box.No.7, SNNPR, Wolkite, Ethiopia, East Africa

2Software Engineer, Staffordshire University, Staffordshire, United Kingdom

Abstract--This chapter presents a proposed speech recognition technique for voice control of electromechanical applications, especially voice-controlled mobile robots and intelligent wheelchairs for people with disabilities. Our aim is to interact with the robot using natural and direct communication. The chapter examines how voice can be processed to obtain correct and safe wheelchair movement with a high recognition rate. For voice to be an effective communication tool between humans and robots, a high speech recognition rate must be achieved, but a one hundred percent recognition rate in a general environment is very difficult to achieve. In this chapter, a proposed technique called the Multiridgelet transform is used for isolated-word recognition. The outputs of a neural network (NNT) are then used to control the wheelchair through a notebook computer and special interface hardware. A successful recognition rate of 98% was achieved.

Keywords-- Artificial Neural Network, Multiridgelet Transform, Multiwavelet Transform, and Interfacing



Since humans usually communicate with each other by voice, it is very convenient to use voice to command robots. A wheelchair is an important vehicle for people with physical disabilities. However, for injured people who suffer from spasms and paralysis of the extremities, a joystick is useless as a control device.


The following five voice commands have been identified for operating the wheelchair: FORWARD, REVERSE, LEFT, RIGHT, and STOP. The chair moves in the corresponding direction when a command is spoken: it moves forward on the command FORWARD, stops on the command STOP, and so on.
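The link between a recognized word and wheelchair motion can be pictured as a small lookup table. The sketch below assumes the five commands are forward, reverse, left, right, and stop. The per-motor states and the `dispatch` helper are hypothetical illustrations, not the authors' interface design.

```python
from typing import Tuple

# Hypothetical mapping from a recognized command to (left motor, right motor)
# states. These drive states are illustrative assumptions only.
COMMAND_ACTIONS = {
    "FORWARD": ("forward", "forward"),
    "REVERSE": ("backward", "backward"),
    "LEFT": ("stop", "forward"),
    "RIGHT": ("forward", "stop"),
    "STOP": ("stop", "stop"),
}


def dispatch(command: str) -> Tuple[str, str]:
    """Return the (left, right) motor states for a recognized word.

    Any word that is not one of the five commands falls back to STOP,
    which is the safe default for a wheelchair.
    """
    key = command.strip().upper()
    return COMMAND_ACTIONS.get(key, COMMAND_ACTIONS["STOP"])
```

Falling back to STOP for unrecognized input is a design choice made here for safety; the source does not say how misrecognized words are handled.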

2-1 Data Base of Speech

Every speaker recognition system depends mainly on its input data. The data used in this system is speech, uttered by 15 speakers (8 males and 7 females). Ten of them (5 males and 5 females) are used for training, and each speaker utters the same word 5 times.

2-2 Multiridgelet Transform

To improve performance and overcome the weak points of the Ridgelet transform, a technique named the Multiridgelet transform is proposed. The main idea of the Ridgelet transform is to map a line sampling scheme into a point sampling scheme using the Radon transform; the Wavelet transform can then handle the point sampling scheme effectively in the Radon domain [Minh, et al., 2003]. The Multiridgelet transform builds on the Ridgelet transform but replaces its second stage, the Wavelet transform, with the Multiwavelet transform in order to improve performance and output quality.
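For reference, the standard continuous ridgelet definition makes this two-stage structure explicit. The notation below (ψ, a, b, θ, R_f, CRT_f) follows common usage in the ridgelet literature and is an assumption here, since the source gives no formulas.

```latex
% Ridgelet: a 1-D wavelet \psi that is constant along lines x_1\cos\theta + x_2\sin\theta = const
\psi_{a,b,\theta}(x_1, x_2) = a^{-1/2}\,
  \psi\!\left(\frac{x_1\cos\theta + x_2\sin\theta - b}{a}\right)

% Continuous ridgelet transform of f
\mathrm{CRT}_f(a, b, \theta) = \int_{\mathbb{R}^2} \psi_{a,b,\theta}(x)\, f(x)\, dx

% Radon transform: line integrals of f
R_f(\theta, t) = \int_{\mathbb{R}^2} f(x)\,
  \delta(x_1\cos\theta + x_2\sin\theta - t)\, dx

% Ridgelet transform = 1-D wavelet transform applied to the Radon slices
\mathrm{CRT}_f(a, b, \theta) = \int_{\mathbb{R}} \psi_{a,b}(t)\, R_f(\theta, t)\, dt,
\qquad \psi_{a,b}(t) = a^{-1/2}\, \psi\!\left(\frac{t - b}{a}\right)
```

In these terms, the Multiridgelet transform described here would keep the Radon stage and swap the scalar wavelet in the last line for a multiwavelet.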


Artificial Neural Networks (ANN) are computing systems whose central theme is borrowed from the analogy of 'biological neural networks'. Many tasks involving intelligence or pattern recognition are extremely difficult to automate [Ram Kumar, et al., 2005].

3-1 The Model of Neural Network

We used random numbers around zero to initialize weights and biases in the network. The training process requires a set of proper inputs and targets as outputs. During training, the weights and biases of the network are iteratively adjusted to minimize the network performance function.
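A minimal sketch of this training loop follows, assuming one hidden layer, sigmoid activations, mean squared error as the performance function, and plain gradient descent. The layer size, learning rate, and toy data are illustrative assumptions; the source does not state the authors' actual network configuration.

```python
import numpy as np


def train_mlp(inputs, targets, hidden=8, lr=0.5, epochs=2000, seed=0):
    """Train a one-hidden-layer network with gradient-descent backpropagation."""
    rng = np.random.default_rng(seed)
    n_in, n_out = inputs.shape[1], targets.shape[1]
    # Weights and biases start as small random numbers around zero.
    w1 = rng.uniform(-0.5, 0.5, (n_in, hidden))
    b1 = rng.uniform(-0.5, 0.5, hidden)
    w2 = rng.uniform(-0.5, 0.5, (hidden, n_out))
    b2 = rng.uniform(-0.5, 0.5, n_out)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    history = []
    for _ in range(epochs):
        # Forward pass through both layers.
        h = sigmoid(inputs @ w1 + b1)
        y = sigmoid(h @ w2 + b2)
        err = y - targets
        history.append(float(np.mean(err ** 2)))  # performance function (MSE)
        # Backward pass: error terms for each layer (constant factors of the
        # MSE gradient are absorbed into the learning rate).
        d_out = err * y * (1 - y)
        d_hid = (d_out @ w2.T) * h * (1 - h)
        # Iteratively adjust weights and biases to reduce the error.
        w2 -= lr * h.T @ d_out / len(inputs)
        b2 -= lr * d_out.mean(axis=0)
        w1 -= lr * inputs.T @ d_hid / len(inputs)
        b1 -= lr * d_hid.mean(axis=0)
    return (w1, b1, w2, b2), history


# Toy usage: five one-hot patterns standing in for five command classes.
patterns = np.eye(5)
params, history = train_mlp(patterns, patterns)
print("initial MSE:", history[0], "final MSE:", history[-1])
```

In the chapter's setting, the rows of `inputs` would be transform coefficients for each utterance and the targets would encode the five commands.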


This chapter consists of two parts: the first covers the theoretical work (computer simulation with the aid of MATLAB 7), and the second covers the interface between the computer and the wheelchair.

    1. The Preprocessing: In this step, the isolated spoken word is segmented into frames of equal length (128 samples). The resulting frames of each word are then arranged into a single two-dimensional matrix whose dimensions must be powers of two. The proposed length for every word is therefore 16384 samples (one-dimensional), which is a power of two and can be divided into a 128×128 matrix (two-dimensional, with power-of-two dimensions).
    2. Classification: This step begins once the 2-D discrete Multiridgelet transform coefficients are obtained. The coefficients are split into two parts: the first is used as reference data, and the second as test data to be classified. A robust and simple way to recognize the signal is a neural network trained with the back-propagation algorithm, used as a classifier after training on the reference coefficients produced by the 2-D discrete Multiridgelet transform.
    3. Computation of FDMWT for 1-D Signal: With an over-sampled preprocessing scheme (repeated row), the discrete multiwavelet transform (DMWT) matrix is doubled in dimension compared with the input, which should be a square N×N matrix where N is a power of two. The transformation matrix dimensions equal the input signal dimensions after preprocessing.
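The framing and repeated-row steps can be sketched as follows. The arithmetic is 128 frames of 128 samples, giving 16384 samples per word. Zero-padding short recordings, truncating long ones, and the scale factor `alpha` for the repeated rows are assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

FRAME_LEN = 128   # samples per frame, as stated in the text
N_FRAMES = 128    # 128 frames x 128 samples = 16384 samples per word


def word_to_matrix(signal):
    """Fit a 1-D word recording to 16384 samples and stack its
    128-sample frames as the rows of a 128x128 matrix."""
    target_len = FRAME_LEN * N_FRAMES
    x = np.asarray(signal, dtype=float)[:target_len]   # truncate if too long
    x = np.pad(x, (0, target_len - len(x)))            # zero-pad (assumption)
    return x.reshape(N_FRAMES, FRAME_LEN)


def repeat_rows(matrix, alpha=1 / np.sqrt(2)):
    """Repeated-row oversampling: each row is followed by a scaled copy,
    so an N-row input becomes a 2N-row output. alpha is an assumed value."""
    m = np.asarray(matrix, dtype=float)
    out = np.empty((2 * m.shape[0], m.shape[1]))
    out[0::2] = m          # original rows in the even positions
    out[1::2] = alpha * m  # scaled copies in the odd positions
    return out


# Toy usage with a synthetic recording of arbitrary length.
word = np.sin(np.linspace(0.0, 50.0, 12000))
frames = word_to_matrix(word)
oversampled = repeat_rows(frames)
print(frames.shape, oversampled.shape)
```

The row repetition shown here doubles only the row count. Whether the authors apply the same step along the columns is not stated in the excerpt.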

The wheelchair used in this work has three connecting rods (one at the front and two at the rear) that connect its two sides; each rod has a joint in the middle, which makes the wheelchair portable. The wheelchair is 65 cm (25.5 inches)





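1 Introduction: Research Purpose and Significance


2 Design of the Voice Control System


2-1 Speech Database


2-2 Multiridgelet Transform


3 Neural Network


3-1 The Model of Neural Network


4 General Procedure of the Proposed System


4-1 Preprocessing of the Speech Signal


4-2 Classification


4-3 Computation of FDMWT for 1-D Signal


5 Experimental Work




5-3 Wheels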




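An electric motor is a device that converts electrical energy into mechanical energy. Energized coils produce a rotating magnetic field that acts on the rotor to generate rotational torque. Motors are arguably one of the most important parts of a mobile robot platform. An overpowered motor is inefficient and wastes the already limited supply of power from the on-board batteries, while an undersized motor may fall short on torque at critical times. The optimal rotation speed and the available speed range of the motor must also be considered. An output speed at the motor shaft that is too high will make the robot run at a fast, uncontrollable speed; an output that is too low will prevent it from reaching a speed that meets the user's needs. The motor's torque output also affects performance, because with insufficient torque, movement may not occur at all in some situations. The motor for this platform therefore needs to be chosen with care.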
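The speed and torque trade-off can be made concrete with two standard relations, v = ωr for ground speed and T = Fr for torque at the wheel. The helper names and the numbers in the usage lines are hypothetical; the source gives no wheel size, target speed, or load figures.

```python
import math


def required_wheel_rpm(speed_m_s, wheel_diameter_m):
    """Wheel speed in rpm needed for a given ground speed (v = omega * r)."""
    circumference = math.pi * wheel_diameter_m
    return 60.0 * speed_m_s / circumference


def required_wheel_torque(force_n, wheel_diameter_m):
    """Torque at the wheel, in newton-metres, needed to deliver a given
    traction force at the wheel rim (T = F * r)."""
    return force_n * wheel_diameter_m / 2.0


# Hypothetical figures: 1.0 m/s target speed, 0.30 m wheels, 100 N traction.
print(required_wheel_rpm(1.0, 0.30))
print(required_wheel_torque(100.0, 0.30))
```

Comparing these figures with a motor's rated speed and torque, after any gearbox ratio, indicates whether a candidate motor is oversized or undersized for the platform.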

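6 Hardware Components Added to the Original Wheelchair

The modifications added to the original wheelchair, beyond the previously designed joystick, adapt the wheelchair's functions to the user's disability (particularly for people who suffer from spasms and paralysis of the extremities) and make its physical design more practical. The result is a combination of physical hardware and computational software that ties the wheelchair's subsystems together so that they work as one unit. In terms of hardware, the main components added to the wheelchair are the interface circuit, the microphone (headset microphone), and the notebook computer (host).

6-1 Headset

A high-quality microphone is essential when using automatic speech recognition (ASR). In most cases a desktop microphone will not do the job, because it picks up more ambient noise, which makes recognition harder. Hand-held microphones are not the best choice either, because they are cumbersome. Although they do reduce ambient noise, they are most useful in applications where the speaker changes often or where the user speaks to the recognizer only occasionally (when wearing a headset is not an option). By far the best choice is the headset microphone. It keeps ambient noise to a minimum while keeping the microphone close to the mouth at all times. Headset microphones with or without earphones are both suitable; in this design we use the Fancong FC-340 headset.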

6-2 Relay Driver Interface Circuit




7 Simulation of Experimental Results

7-2 Experimental Results


7-2-1 Straight-Line Path



7-2-2 Curved Path



8 Conclusions



  1. Cook S., 2002, 'Speech Recognition How To', Revision v2.0, April 19, 2002.
  2. Hosseini E., Amini J., Saradjian M.R., 1996, 'Back Propagation Neural Network for Classification of IRS-1D Satellite Images', Tehran University, 1996.
  3. Hrnčár M., 2007, 'Voice Command Control For Mobile Robots', Department of Control and Information Systems Faculty of Electrical Engineering, University