Our system architecture consists of three layers. The first, called VOXAR Brain, is written in C++ and runs on MS Windows. This part of the robot system processes all the visual and depth data that come from the Kinect camera and its depth sensor. Based on this information, the VOXAR Brain sends packets to the second layer of the system containing data for the robot's movements and activities. Fig. 2 details the architecture.
The second layer of the robot system is a Linux machine running ROS (Robot Operating System). This part of our architecture hosts the voice recognition software, which, along with gestures, is the most important data source for changing the robot's activities. It also exchanges data packets with the VOXAR Brain to monitor the robot status, helping the first layer choose exactly what it needs to process. From the vision processing, ROS receives data packets with commands to control the movements. Finally, ROS must pass the movements on to the actuators, and this is where our third layer enters the system. The electromechanical system is based on several controllers, including an Arduino UNO and an ArbotiX, which communicate with ROS via USB serial. Each controller receives a packet and computes from it the information needed to control the actuator drives. Note that communication is bidirectional between the VOXAR Brain and ROS, and between ROS and the hardware controllers.
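To make the packet exchange between ROS and the hardware controllers more concrete, the following is a minimal sketch of how a command packet could be framed for a USB serial link. The frame layout (start byte, command id, length, payload, checksum), the constant `START_BYTE`, the command id `CMD_SET_WHEEL_SPEEDS`, and the function names are all assumptions made for illustration; the actual packet format used on the robot is not specified here.

```python
import struct
from typing import Tuple

START_BYTE = 0xAA            # hypothetical frame marker
CMD_SET_WHEEL_SPEEDS = 0x01  # hypothetical command id


def _checksum(command_id: int, payload: bytes) -> int:
    """Sum of command id, payload length and payload bytes, modulo 256."""
    return (command_id + len(payload) + sum(payload)) & 0xFF


def encode_packet(command_id: int, payload: bytes) -> bytes:
    """Build a frame: start byte, command id, payload length, payload, checksum."""
    if not 0 <= command_id <= 255:
        raise ValueError("command id must fit in one byte")
    if len(payload) > 255:
        raise ValueError("payload too long for a one-byte length field")
    header = struct.pack("<BBB", START_BYTE, command_id, len(payload))
    return header + payload + bytes([_checksum(command_id, payload)])


def decode_packet(frame: bytes) -> Tuple[int, bytes]:
    """Validate a frame and return (command id, payload)."""
    if len(frame) < 4 or frame[0] != START_BYTE:
        raise ValueError("malformed frame")
    command_id, length = frame[1], frame[2]
    if len(frame) != 4 + length:
        raise ValueError("length mismatch")
    payload = frame[3:3 + length]
    if frame[-1] != _checksum(command_id, payload):
        raise ValueError("checksum mismatch")
    return command_id, payload


# Example: two signed 16-bit wheel speeds (arbitrary units, assumed encoding).
frame = encode_packet(CMD_SET_WHEEL_SPEEDS, struct.pack("<hh", 120, -120))
command_id, payload = decode_packet(frame)
left_speed, right_speed = struct.unpack("<hh", payload)
```

Because the links are bidirectional, the same framing could carry status packets from a controller back to ROS; the checksum lets either side discard frames corrupted on the serial line.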
Our robot, I-Zak, is composed of the following hardware parts: a base platform with two passive (unpowered) VEX 2.75" Omni Directional Wheels and two driving 13 cm Colson wheels; an aluminum and acrylic body made of reused materials, which houses a laptop running the Robot Operating System (ROS); and a head system, composed of an Align G800 gimbal carrying a Microsoft Surface tablet that displays its face, a Microsoft Kinect for its vision, a RODE VideoMic GO directional microphone for voice recognition, and a small speaker for human interaction.
At the bottom of the robot, an Arduino board controls two Mabuchi DC motors and the robot's neck. Two PCBs have been developed. The first has the terminals to which the batteries are connected and interconnects them; it also carries the oscillators responsible for the heartbeat, the emergency button circuit, the motor drivers, and some secondary circuits. The second holds the encoder circuits, whose infrared barrier sensors read a coded disc made on a 3D printer and coupled to the wheels of the robot. The Arduino receives data from the ROS application through an implemented arduinoserial library. The same notebook that runs ROS is also responsible for the robot's audio output, delivered by a small speaker, and for audio input, which comes from the RODE VideoMic GO directional microphone and passes through several DSP techniques implemented in the middleware layer on the PC. At the top of the robot there is an Align G800 gimbal with three degrees of freedom (3 DOF), which emulates head movements and is controlled by the Arduino, which sends angular orientations to the servos.
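As an illustration of how the encoder readings can be used, the sketch below converts tick counts from the coded disc into wheel travel and speed. The wheel diameter is the 13 cm of the driving Colson wheels; the number of slots per revolution (`SLOTS_PER_REV`) and the function names are assumptions, since the disc geometry is not given. A single infrared barrier cannot sense direction, so the sketch assumes the caller supplies signed tick counts (for example, signed according to the commanded motor direction).

```python
import math

WHEEL_DIAMETER_M = 0.13  # 13 cm Colson driving wheel
SLOTS_PER_REV = 20       # assumption: slots in the 3D-printed coded disc


def ticks_to_distance(ticks: int) -> float:
    """Distance travelled by the wheel rim, in metres, for a signed tick count."""
    revolutions = ticks / SLOTS_PER_REV
    return revolutions * math.pi * WHEEL_DIAMETER_M


def wheel_speed(ticks: int, dt_s: float) -> float:
    """Average wheel speed in m/s over a sampling interval of dt_s seconds."""
    if dt_s <= 0:
        raise ValueError("sampling interval must be positive")
    return ticks_to_distance(ticks) / dt_s


# Example: 10 ticks counted during a 0.5 s sampling window.
speed = wheel_speed(10, 0.5)
```

With these assumed values, one tick corresponds to roughly 2 cm of travel, so a finer disc would be needed for more precise low-speed control.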
An MS Surface runs the VOXAR Brain, which is responsible for the entire vision system and processes visual data from the MS Kinect camera and depth sensor. The MS Surface is connected to the notebook by an Ethernet cable in order to send commands based on the processed visual data and to receive feedback about the robot's position and motion, obtained from an encoder on each powered wheel for steering control. For gripping and manipulation, we use a WidowX Robot Arm Mark II, manufactured by Trossen Robotics. It has 5 degrees of freedom (DOF) and a horizontal reach of 41 cm at full stretch, and it holds up to 400 g with a vertical grip orientation at 29 cm reach. It has a boxed ABS frame, two MX-64 Dynamixel servos for shoulder movement and rotation, two MX-28 Dynamixel servos for elbow and wrist movement, and two AX-12 Dynamixel servos for wrist rotation and parallel gripping, with the gripper opening up to 32 mm, all controlled by an ArbotiX-M Robocontroller.
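The position and motion feedback obtained from the encoders on the two powered wheels can be turned into a pose estimate with standard differential-drive odometry. The following is a minimal sketch of that textbook update, not necessarily the method used on I-Zak; the track width `TRACK_WIDTH_M` and the function name `update_pose` are assumptions introduced for illustration.

```python
import math
from typing import Tuple

TRACK_WIDTH_M = 0.30  # assumption: distance between the two driving wheels


def update_pose(x: float, y: float, theta: float,
                d_left: float, d_right: float,
                track_width: float = TRACK_WIDTH_M) -> Tuple[float, float, float]:
    """Advance the pose (x, y in metres, theta in radians) given the distance
    travelled by the left and right wheels since the last update."""
    d_center = (d_left + d_right) / 2.0
    d_theta = (d_right - d_left) / track_width
    # Integrate along the heading at the midpoint of the step.
    x += d_center * math.cos(theta + d_theta / 2.0)
    y += d_center * math.sin(theta + d_theta / 2.0)
    # Wrap the new heading to the interval (-pi, pi].
    new_theta = theta + d_theta
    theta = math.atan2(math.sin(new_theta), math.cos(new_theta))
    return x, y, theta


# Example: both wheels advance 10 cm, so the robot moves straight ahead.
pose = update_pose(0.0, 0.0, 0.0, 0.10, 0.10)
```

Equal wheel displacements leave the heading unchanged, while opposite displacements rotate the robot in place; such a pose estimate is the kind of feedback that could be sent back over the Ethernet link.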