From f04945fef03db6af8b7dbeefe23b4265b0672a8b Mon Sep 17 00:00:00 2001
From: Tursiae
Date: Thu, 6 Feb 2025 01:24:08 +1100
Subject: [PATCH 1/3] Briefly document the TTS module.

---
 src/tangara/tts/README.md | 47 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)
 create mode 100644 src/tangara/tts/README.md

diff --git a/src/tangara/tts/README.md b/src/tangara/tts/README.md
new file mode 100644
index 00000000..27c6801b
--- /dev/null
+++ b/src/tangara/tts/README.md
@@ -0,0 +1,47 @@
+# Text-to-speech on Tangara
+
+The `tangara/tts/` module implements an audio accessibility layer for the
+UI, providing the ability to play back text-to-speech recordings for each
+UI element focused when using Tangara.
+
+The code is structured in three pieces:
+
+- `events.hpp`, providing the on-selection-changed and on-TTS-enabled events
+  for the UI bindings.
+- `player.cpp`, which supports TTS playback via low-memory audio decoders
+  (currently, only WAV files), and
+- `provider.cpp`, which is responsible for finding the TTS sample on the SD
+  card for the focused UI element.
+
+## End-user Configuration
+
+Text-to-speech can be enabled under the Display settings on Tangara, by
+enabling the _"Spoken Interface"_ setting. Please note that this will not be
+a user-visible change unless TTS phrases are loaded onto the SD card under
+`/.tangara-tts/`.
+
+## Supported Codecs
+
+Currently, the TTS library only supports a WAV decoder. Natively, the player
+expects 48 kHz audio, mono or stereo, and will (if required) resample the
+audio to 48kHz for playback.
+
+## Creating and enabling TTS Samples
+
+TTS samples should be stored on your SD card, under `/.tangara-tts/`. The
+`provider` expects that the TTS samples are stored in this directory as WAV
+files, with a `.wav` extension, named as the hexadecimal version of the
+[KOMIHASH](https://github.com/avaneev/komihash)ed TTS string.
+
+For example, `Settings` hashes to `1e3e816187453bf8`. If you recorded a
+short sample as a 48kHz (mono or stereo) WAV file, and stored it on the SD
+card as `/.tangara-tts/1e3e816187453bf8.wav`, it would be played back when the
+settings icon is highlighted.
+
+## Finding the KOMIHASH of UI strings
+
+If you connect to your Tangara via the serial console, the `provider` module
+logs a `WARN`ing each time it cannot find a TTS sample. You can enable these
+log messages on the console by using the command `loglevel warn`, and then
+manipulating the click wheel to move through the UI to discover other missing
+TTS samples.

From 2d8fdbf67f5623ec47a578f31059323ab8bb7d8f Mon Sep 17 00:00:00 2001
From: Tursiae
Date: Thu, 6 Feb 2025 00:13:40 +1100
Subject: [PATCH 2/3] Make the TTS playback work by assuming the file is a .wav.

The extension is needed to trigger format detection in the tag reader.
---
 src/tangara/tts/provider.cpp | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/tangara/tts/provider.cpp b/src/tangara/tts/provider.cpp
index d19500e0..eedfe959 100644
--- a/src/tangara/tts/provider.cpp
+++ b/src/tangara/tts/provider.cpp
@@ -28,7 +28,11 @@ static const char* kTtsPath = "/.tangara-tts/";
 static auto textToFile(const std::string& text) -> std::optional<std::string> {
   uint64_t hash = komihash(text.data(), text.size(), 0);
   std::stringstream stream;
-  stream << kTtsPath << std::hex << hash;
+  // Assume the TTS sample is a .wav file; since we only support one low-RAM
+  // overhead codec, we can presume the suffix. The suffix is needed, else we
+  // fail to open the stream when it fails to autodetect the format when looking
+  // up tags.
+  stream << kTtsPath << std::hex << hash << ".wav";
   return stream.str();
 }
 

From ffc62ee5e0b5d7f81c07aeb430a1b2566466f717 Mon Sep 17 00:00:00 2001
From: Tursiae
Date: Thu, 6 Feb 2025 17:00:32 +1100
Subject: [PATCH 3/3] Update the docs to eliminate the mention of the Spoken Interface setting until re-added.

---
 src/tangara/tts/README.md | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/src/tangara/tts/README.md b/src/tangara/tts/README.md
index 27c6801b..63d587da 100644
--- a/src/tangara/tts/README.md
+++ b/src/tangara/tts/README.md
@@ -15,10 +15,12 @@ The code is structured in three pieces:
 
 ## End-user Configuration
 
-Text-to-speech can be enabled under the Display settings on Tangara, by
-enabling the _"Spoken Interface"_ setting. Please note that this will not be
-a user-visible change unless TTS phrases are loaded onto the SD card under
-`/.tangara-tts/`.
+Text-to-speech will automatically be enabled if you have loaded TTS phrases
+onto the SD card, under `/.tangara-tts/`. These samples must be formatted
+and named as per the instructions below.
+
+To disable TTS, rename or delete the `/.tangara-tts/` directory on your SD
+card.
 
 ## Supported Codecs
 
@@ -35,13 +37,13 @@ files, with a `.wav` extension, named as the hexadecimal version of the
 
 For example, `Settings` hashes to `1e3e816187453bf8`. If you recorded a
 short sample as a 48kHz (mono or stereo) WAV file, and stored it on the SD
-card as `/.tangara-tts/1e3e816187453bf8.wav`, it would be played back when the
-settings icon is highlighted.
+card as `/.tangara-tts/1e3e816187453bf8.wav`, it would be played back when
+the settings icon is highlighted.
 
 ## Finding the KOMIHASH of UI strings
 
-If you connect to your Tangara via the serial console, the `provider` module
-logs a `WARN`ing each time it cannot find a TTS sample. You can enable these
-log messages on the console by using the command `loglevel warn`, and then
-manipulating the click wheel to move through the UI to discover other missing
-TTS samples.
+If you connect to your Tangara via the serial console, the TTS provider
+logs a `WARN`ing each time it cannot find a TTS sample. You can enable
+these log messages on the console by using the command `loglevel warn`,
+and then manipulating the click wheel to move through the UI to discover
+other missing TTS samples.
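
Note on the file naming used in this series: the README describes sample names as the hexadecimal KOMIHASH of the UI string plus `.wav`, and patch 2/3 builds that name with `std::hex`. The sketch below isolates only the formatting step so the naming rule can be checked off-device. It is a minimal illustration, not code from the patch: `ttsSamplePath` is a hypothetical helper name, and it takes an already-computed 64-bit hash rather than calling `komihash`, so that it builds with the standard library alone.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the patch): builds the sample path for an
// already-computed 64-bit hash, mirroring the stream formatting in patch 2/3.
inline std::string ttsSamplePath(std::uint64_t hash) {
  std::stringstream stream;
  // std::hex emits lowercase digits and does not zero-pad, so a hash with
  // leading zero nibbles produces a name shorter than 16 characters.
  stream << "/.tangara-tts/" << std::hex << hash << ".wav";
  return stream.str();
}
```

With the value quoted in the README, `ttsSamplePath(0x1e3e816187453bf8ULL)` yields `/.tangara-tts/1e3e816187453bf8.wav`. Assuming the formatting stays as shown in the hunk, any script that generates sample names would need to produce lowercase, unpadded hex to match.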