Audio

Source:

Report:

Introduction

The Cyborg audio module gathers functionality previously provided by the cyborg_music and cyborg_text_to_speech modules, while decoupling audio output from the Cyborg states. The module executes playback and text to speech on the Cyborg speakers, makes itself available to other modules through ROS topics, and provides execution feedback. A context diagram for the Cyborg audio module is presented in Figure 7.1 (AB = Areg Babayan's master's thesis, 2019).

Design

The audio node is designed as a single ROS node. As per the stated specifications, playback and text to speech are separate instances, each with its own channels for commands and feedback. A conceptual class diagram is shown in Figure 7.2 (AB). Both playback and text to speech handle preemption requests through messages, and both reply with a feedback message once execution is finished or preempted. A sequence diagram for the playback module is shown in Figure 7.3 (AB).

Implementation

The implementation is done in Python, as a single ROS node called cyborg_audio. Playback and text to speech are implemented as separate Python classes, instantiated in the cyborg_audio ROS node. Individual ROS topics with message type std_msgs.msg.String are used for commands and feedback. A class diagram for the audio module is presented in 7.4 (section in AB). ROS topics were chosen as they require fewer lines of code than actions.

Playback

The playback module is reimplemented using ROS topics instead of the actionlib protocol. The main body of the module is implemented as a threaded function using the threading library, and the Python library vlc is used to play the audio files. The module is made up of two functions:

playback - Main function of the module. Signaled by callback_playback when a message is received, it loads the requested file into vlc and plays it. While active, it checks for preemption requests, and it publishes a result message on the feedback topic upon completion or preemption.

callback_playback - Callback function for the ROS topic subscriber. Stores the message content and signals playback through a shared variable.

The following commands are available:

<filename> - Replace with your filename, written without the file type extension.
PreemptPlayback - Preempts playback.

The following feedback is provided:

playback finished - Published when playback is finished.
playback preempted - Published when playback has been preempted.
playback timeout - Published if playback has timed out.
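
The interaction between callback_playback and playback described above can be sketched as follows. This is a minimal illustration, not the module's actual code: it assumes a polling worker thread, replaces the ROS publisher with a plain publish callable, and uses a hypothetical FakePlayer in place of vlc so the example runs without ROS or audio hardware. The timeout and poll parameters and the shutdown helper are also assumptions.

```python
import threading
import time

PREEMPT_COMMAND = "PreemptPlayback"


class FakePlayer:
    """Hypothetical stand-in for a vlc player: 'plays' for a fixed duration."""

    def __init__(self, duration):
        self.duration = duration
        self._started = None

    def play(self, filename):
        self._started = time.monotonic()

    def is_playing(self):
        if self._started is None:
            return False
        return time.monotonic() - self._started < self.duration

    def stop(self):
        self._started = None


class Playback:
    def __init__(self, player, publish, timeout=60.0, poll=0.01):
        self._player = player
        self._publish = publish      # assumed: wraps the feedback topic publisher
        self._timeout = timeout      # assumed value, not taken from the source
        self._poll = poll
        self._lock = threading.Lock()
        self._request = None         # shared variable written by the callback
        self._wake = threading.Event()
        self._preempt = threading.Event()
        self._shutdown = threading.Event()
        self._thread = threading.Thread(target=self.playback, daemon=True)
        self._thread.start()

    def callback_playback(self, data):
        """Store the command and signal the worker thread."""
        if data == PREEMPT_COMMAND:
            self._preempt.set()
            return
        self._preempt.clear()
        with self._lock:
            self._request = data
        self._wake.set()

    def playback(self):
        """Worker loop: play the requested file and publish one result."""
        while not self._shutdown.is_set():
            if not self._wake.wait(timeout=self._poll):
                continue
            self._wake.clear()
            with self._lock:
                filename, self._request = self._request, None
            if filename is None:
                continue
            self._player.play(filename)
            deadline = time.monotonic() + self._timeout
            result = "playback finished"
            while self._player.is_playing():
                if self._preempt.is_set():
                    result = "playback preempted"
                    break
                if time.monotonic() > deadline:
                    result = "playback timeout"
                    break
                time.sleep(self._poll)
            self._player.stop()
            self._publish(result)

    def shutdown(self):
        self._shutdown.set()
        self._thread.join()
```

In a ROS node, publish would typically wrap the feedback publisher and callback_playback would receive the data field of the incoming std_msgs String message. Those bindings are left out here to keep the sketch self-contained.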

The requested audio file must be located in the homedir folder. Only mp3 files are supported.
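
As an illustration of the file naming rule, a command such as intro would map to the file intro.mp3 in the audio folder. The helper below is a hypothetical sketch: the function name resolve_audio_path and the audio_dir parameter are not from the module, and the rejection of names containing path separators is an added assumption.

```python
import os


def resolve_audio_path(command, audio_dir):
    """Map a playback command (file name without extension) to an mp3 path.

    audio_dir is a hypothetical parameter standing in for the folder that
    holds the audio files.
    """
    name = command.strip()
    # Assumption: commands are bare file names, never paths.
    if not name or os.path.basename(name) != name:
        raise ValueError("expected a bare file name, got %r" % command)
    return os.path.join(audio_dir, name + ".mp3")
```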

Text to Speech

The text to speech module is reimplemented with a new text to speech engine, pyttsx3. In addition, preemption and execution feedback have been added. The module is made up of three functions:

text_to_speech - Main function of the module. Starts the text to speech engine and connects to the on_end_tts function.
on_end_tts - Executed upon completion, publishes feedback on the feedback ROS topic.
callback_text_to_speech - Callback function for the ROS topic subscriber. Stores the message content and starts or preempts execution. Publishes feedback on the feedback ROS topic upon preemption.
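
The three functions above could fit together roughly as sketched below. This is an assumption-laden illustration rather than the module's code: the engine is injected so the example runs without pyttsx3, and FakeEngine is a hypothetical stand-in exposing only the connect, say, runAndWait and stop calls the sketch relies on. The preempt command PreemptTextToSpeech and the two feedback strings are invented for the example, since the text does not list them, and wait is a helper added for the sketch.

```python
import threading

# Hypothetical names: the command and feedback strings are not given in the text.
PREEMPT_COMMAND = "PreemptTextToSpeech"
FINISHED = "text to speech finished"
PREEMPTED = "text to speech preempted"


class FakeEngine:
    """Hypothetical stand-in for a text to speech engine."""

    def __init__(self, duration):
        self.duration = duration
        self._callbacks = {}
        self._queue = []
        self._stopped = threading.Event()

    def connect(self, topic, callback):
        self._callbacks[topic] = callback

    def say(self, text, name=None):
        self._queue.append((text, name))

    def runAndWait(self):
        queue, self._queue = self._queue, []
        for _text, name in queue:
            # 'Speak' for self.duration seconds unless stop() is called.
            completed = not self._stopped.wait(self.duration)
            self._callbacks["finished-utterance"](name, completed)
        self._stopped.clear()

    def stop(self):
        self._stopped.set()


class TextToSpeech:
    def __init__(self, engine, publish):
        self._engine = engine
        self._publish = publish  # assumed: wraps the feedback topic publisher
        self._thread = None
        # The listener is called with (name, completed) when an utterance ends.
        engine.connect("finished-utterance", self.on_end_tts)

    def text_to_speech(self, text):
        """Queue the text and block until the engine is done."""
        self._engine.say(text)
        self._engine.runAndWait()

    def on_end_tts(self, name, completed):
        """Publish feedback only when the utterance ran to completion."""
        if completed:
            self._publish(FINISHED)

    def callback_text_to_speech(self, data):
        """Start speaking in a worker thread, or preempt ongoing speech."""
        if data == PREEMPT_COMMAND:
            self._engine.stop()
            self._publish(PREEMPTED)
            return
        self._thread = threading.Thread(
            target=self.text_to_speech, args=(data,), daemon=True)
        self._thread.start()

    def wait(self):
        """Helper for the sketch: wait for the current utterance to end."""
        if self._thread is not None:
            self._thread.join()
```

Running the blocking engine call in a worker thread keeps the subscriber callback free to receive a preemption request while speech is in progress. With a real engine, whether the end-of-utterance listener also fires after a stop may depend on the driver, which is why the sketch publishes the preemption feedback from the callback itself, as the description above states.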