09-28-2024, 05:13 AM | #1 |
Connoisseur
Posts: 74
Karma: 6698
Join Date: Sep 2022
Location: South Africa
Device: kindle pw10
|
Read aloud problem
Hi,
I have just tried the read aloud feature. With the flite and speech dispatcher engines I get a voice (though a terrible one). With the Piper engine, a word or phrase is highlighted but there is no sound; the same word stays highlighted and nothing else happens. I tried a different voice and still got nothing. Any ideas? Thanks, Phil |
09-28-2024, 05:15 AM | #2 | |
Resident Curmudgeon
Posts: 76,259
Karma: 136006010
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Quote:
|
|
09-28-2024, 06:22 AM | #3 |
Connoisseur
Posts: 74
Karma: 6698
Join Date: Sep 2022
Location: South Africa
Device: kindle pw10
|
|
09-28-2024, 06:35 AM | #4 |
Resident Curmudgeon
Posts: 76,259
Karma: 136006010
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
|
09-28-2024, 07:11 AM | #5 |
Connoisseur
Posts: 74
Karma: 6698
Join Date: Sep 2022
Location: South Africa
Device: kindle pw10
|
No
Display: x11 server: X.Org v: 1.21.1.4 driver: X: loaded: modesetting unloaded: fbdev,vesa gpu: i915 display-ID: :0 screens: 1 |
09-28-2024, 07:12 AM | #6 |
Resident Curmudgeon
Posts: 76,259
Karma: 136006010
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
|
09-28-2024, 07:14 AM | #7 |
Connoisseur
Posts: 74
Karma: 6698
Join Date: Sep 2022
Location: South Africa
Device: kindle pw10
|
|
09-28-2024, 10:21 PM | #8 |
creator of calibre
Posts: 44,468
Karma: 24044628
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
If you are not getting sound, it's because the Qt Multimedia module is unable to connect to the audio device. Check that you have a working pipewire/pulseaudio/alsa setup and try changing the default audio device in the read aloud configuration.
|
09-29-2024, 12:39 AM | #9 | |
Connoisseur
Posts: 74
Karma: 6698
Join Date: Sep 2022
Location: South Africa
Device: kindle pw10
|
Quote:
I do have pulseaudio working in other apps. Will fiddle some more. Thanks for the suggestion. Phil |
|
10-18-2024, 02:03 PM | #10 | |
Member
Posts: 10
Karma: 10
Join Date: Feb 2023
Device: none
|
Quote:
1. If I start read aloud, it highlights a sentence but does nothing - no sound and does not advance. I see a piper process max out 12/24 cores and allocate up to 1.5GB of memory.
2. However if I open the read aloud settings from the toolbar and click "cancel", it will read aloud the current sentence, then highlight the next, but not proceed further.
3. It will read a sentence each time I open and cancel the read aloud settings. If I click "ok" on the read aloud settings, it does not read the sentence.
4. I get no relevant messages / errors in the terminal or syslog
Hope that helps! |
|
10-18-2024, 02:06 PM | #11 |
creator of calibre
Posts: 44,468
Karma: 24044628
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
piper maxing out your CPUs is normal: it proceeds to synthesize audio for the entire current chapter regardless of how slowly the actual speaking goes.
Run the viewer as calibre-debug -w /path/to/ebook and you will get plenty of messages in the terminal. |
10-18-2024, 02:51 PM | #12 |
Member
Posts: 10
Karma: 10
Join Date: Feb 2023
Device: none
|
It looks like an audio clip is only getting sent to the device when the audio state is toggled.
These are the logs at the moment I cancel the settings dialog after 10s without audio, which causes one sentence to be read out loud:
Code:
[10.45] Audio state: State.IdleState
[10.45] Utterance 1 audio output finished
[10.47] Audio sent to output: maxlen=16384 len(ans)=16384
[10.47] Audio state: State.ActiveState
Code:
[1.22] Utterance 3 synthesis started
[1.22] Synthesized data read: 36864 bytes
[1.22] [piper-debug] Phonemizing text: “I hold at your neck the gom jabbar,” she said.
[1.22] [piper-debug] Converting 50 phoneme(s) to ids: aɪ hˈoʊld æt jʊɹ nˈɛk ðə ɡˈɑːm dʒˈæbɑːɹ, ʃiː sˈɛd.
[1.22] [piper-debug] Converted 50 phoneme(s) to 103 phoneme id(s): xxx
[1.22] [piper-debug] Synthesizing audio for 103 phoneme id(s)
[1.50] [piper-debug] Synthesized 2.2639455782312927 second(s) of audio in 0.280647179 second(s)
[1.50] Synthesized data read: 65536 bytes
[1.50] [piper-info] Waiting for audio to finish playing...
[1.50] [piper-info] Real-time factor: 0.13556893212154525 (infer=0.9396494800000001 sec, audio=6.931156462585034 sec)
[1.50] Utterance 3 got 102400 bytes of audio data from piper
[1.50] Utterance 4 synthesis started
[1.50] Synthesized data read: 34304 bytes
[1.50] [piper-debug] Phonemizing text: “The gom jabbar, the highhanded enemy.
[1.50] [piper-debug] Converting 40 phoneme(s) to ids: ðə ɡˈɑːm dʒˈæbɑːɹ, ðə hˈaɪhændᵻd ˈɛnəmi.
[1.50] [piper-debug] Converted 40 phoneme(s) to 83 phoneme id(s): xxx
[1.50] [piper-debug] Synthesizing audio for 83 phoneme id(s)
[1.79] [piper-debug] Synthesized 2.345215419501134 second(s) of audio in 0.28609584 second(s)
[1.79] Synthesized data read: 65536 bytes
[1.79] [piper-info] Waiting for audio to finish playing...
[1.79] [piper-info] Real-time factor: 0.13213628513180542 (infer=1.2257453200000001 sec, audio=9.276371882086167 sec)
[1.79] Utterance 4 got 99840 bytes of audio data from piper
[1.79] Utterance 5 synthesis started |
10-18-2024, 11:39 PM | #13 |
creator of calibre
Posts: 44,468
Karma: 24044628
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Yes, as I said, it's an issue with your audio device. For whatever reason it's not reading the audio data that is available to it. Sadly audio on Linux is such an absolute cluster fuck that it could be anything; I haven't the first clue where you would go to debug it. The calibre piper code will emit the readyRead() signal when synthesized data is available. It is then up to the audio device to read that data, which it isn't doing on your system.
|
10-19-2024, 11:46 AM | #14 |
Member
Posts: 10
Karma: 10
Join Date: Feb 2023
Device: none
|
I've taken a look at the code and found a fix by removing the line that sets a large buffer on the QAudioSink in piper.py:
Code:
self._audio_sink.setBufferSize(2 * 1024 * 1024)
- the default buffer size on Ubuntu 22.04 is just 3794 bytes rather than 2097152 bytes
- piper tts now plays fine
- no stutter or glitches (sounds great btw, looking forward to using it!)
It feels like a Nagle's algorithm type of issue, i.e. Qt/Linux is waiting for enough data to be written into the buffer before processing it, to avoid underruns, but calibre only writes one utterance at a time and waits for it to be processed to sync the highlight. A single utterance is tiny relative to a 2MB buffer size, so processing stalls. Might count as a Qt bug? I guess the buffer size might need to be elevated to a setting, given that different platforms seem quite sensitive to it.
Last edited by noodler; 10-19-2024 at 11:51 AM. |
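To put the sizes from this post and the earlier debug log side by side, here is a small arithmetic sketch. The three numbers are taken from the thread (the 2 * 1024 * 1024 buffer, the 3794 byte default reported on Ubuntu 22.04, and the 102400 bytes logged for Utterance 3); the helper name fill_fraction is made up for illustration, and the sketch says nothing about how Qt or the audio backend actually decides when to start playback.

```python
# Sizes quoted in the thread, all in bytes.
LARGE_BUFFER = 2 * 1024 * 1024   # value passed to setBufferSize()
DEFAULT_BUFFER = 3794            # default buffer size reported on Ubuntu 22.04
UTTERANCE = 102400               # "Utterance 3 got 102400 bytes" in the log


def fill_fraction(chunk_bytes, buffer_bytes):
    """Fraction of the buffer occupied by one chunk (hypothetical helper)."""
    return chunk_bytes / buffer_bytes


# One utterance fills only a small part of the 2 MiB buffer...
print(f"large buffer:   {fill_fraction(UTTERANCE, LARGE_BUFFER):.1%} full")
# ...but is many times larger than the default buffer.
print(f"default buffer: {fill_fraction(UTTERANCE, DEFAULT_BUFFER):.1f}x its size")
```

If the backend really does wait for a mostly full buffer before it starts playing, a single utterance would never get there with the large buffer, which would match the stall described above.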
10-19-2024, 11:00 PM | #15 |
creator of calibre
Posts: 44,468
Karma: 24044628
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Definitely a bug in either Qt or the underlying audio driver. It's pretty ridiculous for an audio driver to refuse to output audio until its buffer is full. What if the user is playing short, intermittent audio sounds? I will note it works fine on all the Linux, Windows and macOS systems I have. If you can reproduce it easily, I suggest writing a small PySide based script to reproduce it (just create an audio device with a large buffer and write some random raw audio data into it) and opening a bug report at Qt with your reproducer script.
And in the next release there will be a tool to generate Piper based audio overlays, which means you can pre-create the TTS audio files and play them directly using the calibre viewer to work around the Qt bug. |
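A minimal sketch of what such a reproducer might look like, assuming PySide6 is installed. This is not calibre's code and has not been verified against the affected system: the 22050 Hz mono 16-bit format is an assumption, a sine tone stands in for random data so the result is easy to hear, and make_tone, main and the --play flag are names invented for this example. It writes one short clip into a QAudioSink in push mode after setting the 2 MiB buffer size discussed in the thread.

```python
"""Hypothetical reproducer: large QAudioSink buffer, one short write."""
import math
import struct
import sys

SAMPLE_RATE = 22050            # assumption: mono 16-bit PCM at 22.05 kHz
BUFFER_SIZE = 2 * 1024 * 1024  # the 2 MiB value discussed in the thread


def make_tone(seconds, freq=440.0, rate=SAMPLE_RATE):
    """Return `seconds` of a sine tone as little-endian 16-bit mono PCM."""
    count = int(seconds * rate)
    samples = (int(12000 * math.sin(2 * math.pi * freq * i / rate))
               for i in range(count))
    return struct.pack(f"<{count}h", *samples)


def main():
    # Imported here so make_tone() stays usable without PySide6 installed.
    from PySide6.QtCore import QByteArray, QCoreApplication, QTimer
    from PySide6.QtMultimedia import QAudioFormat, QAudioSink, QMediaDevices

    app = QCoreApplication(sys.argv)
    fmt = QAudioFormat()
    fmt.setSampleRate(SAMPLE_RATE)
    fmt.setChannelCount(1)
    fmt.setSampleFormat(QAudioFormat.SampleFormat.Int16)

    sink = QAudioSink(QMediaDevices.defaultAudioOutput(), fmt)
    sink.setBufferSize(BUFFER_SIZE)  # comment out to compare with the default
    sink.stateChanged.connect(lambda state: print("state:", state))

    device = sink.start()  # push mode: returns a QIODevice to write into
    written = device.write(QByteArray(make_tone(2.0)))
    print("buffer size:", sink.bufferSize(), "bytes written:", written)

    QTimer.singleShot(5000, app.quit)  # give the short clip time to play
    return app.exec()


# Only touch the audio device when explicitly asked to.
if __name__ == "__main__" and "--play" in sys.argv:
    sys.exit(main())
```

Running it with --play once as written and once with the setBufferSize() line commented out should show whether the short clip is only heard with the default buffer size. The printed state changes and buffer size are the kind of detail a Qt bug report would probably want.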
|