Our system architecture consists of three layers. The first, written in C++, runs on MS Windows and is called the VOXAR Brain. This part of the robot system processes all the visual and depth data coming from the Kinect camera and its depth sensor. Based on this information, the VOXAR Brain sends packets to the second layer of the system containing data for the robot's movements and activities. Fig. 2 details the architecture.
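The paper does not specify the wire format of these packets, so the following is only a minimal sketch of what a movement-command packet between the VOXAR Brain and the second layer might look like; the opcode values and the pan/tilt fields are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical packet layout (field names and sizes are assumed,
// not taken from the actual VOXAR Brain protocol).
struct CommandPacket {
    uint8_t cmd;   // assumed opcodes, e.g. 0x01 = move, 0x02 = stop
    int16_t pan;   // target pan angle in degrees
    int16_t tilt;  // target tilt angle in degrees
};

// Serialize into a little-endian byte buffer for transmission
// from the Windows/C++ layer to the ROS machine.
std::vector<uint8_t> serialize(const CommandPacket& p) {
    std::vector<uint8_t> buf;
    buf.push_back(p.cmd);
    buf.push_back(static_cast<uint8_t>(p.pan & 0xFF));
    buf.push_back(static_cast<uint8_t>((p.pan >> 8) & 0xFF));
    buf.push_back(static_cast<uint8_t>(p.tilt & 0xFF));
    buf.push_back(static_cast<uint8_t>((p.tilt >> 8) & 0xFF));
    return buf;
}

// The inverse operation, as the receiving layer would perform it.
CommandPacket deserialize(const std::vector<uint8_t>& buf) {
    CommandPacket p;
    p.cmd  = buf[0];
    p.pan  = static_cast<int16_t>(buf[1] | (buf[2] << 8));
    p.tilt = static_cast<int16_t>(buf[3] | (buf[4] << 8));
    return p;
}
```

A fixed-size binary layout like this keeps per-frame overhead low, which matters when commands are issued at the Kinect's capture rate.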
In the second layer of the robot system we have a Linux machine running ROS (Robot Operating System). This layer hosts the voice recognition software, which, along with gestures, is the most important data source for changing the robot's activities. It also exchanges data packets with the VOXAR Brain to monitor the robot's status, helping the first layer of the system choose exactly what needs to be processed. Through the vision processing, ROS receives data packets with commands to control the movements. Finally, ROS must pass the movements to the actuators, and it is here that the third layer enters the system. The electromechanical system is based on several controllers, including an Arduino UNO and an Arbotix, which communicate with ROS via USB serial. Each controller receives a packet and from it computes the information necessary to drive its actuators. Note that communication is bidirectional both between the VOXAR Brain and ROS and between ROS and the hardware controllers.
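As a sketch of the ROS-to-controller link, the code below frames a single actuator command for transmission over USB serial and shows the validation a controller such as the Arduino UNO could perform on receipt. The 0xFF header byte, the one-byte payload fields, and the additive checksum are assumptions; the paper does not describe the actual serial protocol.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Build a 4-byte frame: header, actuator id, command value, checksum.
// All protocol details here are hypothetical.
std::vector<uint8_t> frame(uint8_t actuatorId, uint8_t value) {
    std::vector<uint8_t> pkt{0xFF, actuatorId, value};
    // Low byte of the sum lets the microcontroller reject
    // packets corrupted on the serial line.
    pkt.push_back(static_cast<uint8_t>((actuatorId + value) & 0xFF));
    return pkt;
}

// Validation as the receiving controller would perform it before
// driving the actuator.
bool valid(const std::vector<uint8_t>& pkt) {
    return pkt.size() == 4 && pkt[0] == 0xFF &&
           pkt[3] == static_cast<uint8_t>((pkt[1] + pkt[2]) & 0xFF);
}
```

A checksum of this kind is a common safeguard on hobby-grade serial links, where a dropped or flipped byte could otherwise command an unintended actuator motion.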