Piper and MacOS problem

MDMullins · 10-22-2024, 12:51 AM

I would love to use the new voices, but on an M2 MacBook, the extreme delay between phrases is currently making it unworkable. Is there a setting which could fix this? I just updated to Calibre 7.20.0, so I don't think anything needs to be updated.

kovidgoyal · 10-22-2024, 12:58 AM

There is no delay on my M2 MacBook, so presumably the audio synthesis is very slow on your machine for some reason. The next calibre release has a tool that allows you to pre-synthesize and embed the audio in the book; you can use that as a workaround. Also, in the Read aloud preferences, make sure the pause between sentences setting is at zero. Maybe you accidentally increased that.

MDMullins · 10-22-2024, 01:20 AM

OK. Just to be sure: the delay is extreme. It will take 30 seconds just to speak the first word, and then delay for 15 seconds, and speak another phrase.

Do you think a new MacBook Pro would be able to handle this task? This one is extremely important to me and is something I would use for hours a day, so I'm anxious to get it working.

kovidgoyal · 10-22-2024, 01:24 AM

I checked, and my MacBook is actually an M1 Air from 2020, and it works fine on that. So it should definitely work on any reasonably up-to-date machine.

noodler · 10-22-2024, 05:32 PM

I just tried my M1 Air and I see the same as MDMullins. I see a 20+s gap between each utterance.

Piper has generated the data, but Qt isn't reading it. It looks related to my problem on Linux, because it plays without any delay if I remove the line that sets the large buffer.

With the large buffer, Qt takes about 20s after it has been signalled with readyRead to actually read the data:

Code:

[0.10] ### atEnd: True
[0.10] ### readData: Audio sent to output: maxlen=2097152 len(ans)=0
[0.10] ### atEnd: True
[0.13] ### Audio state: State.ActiveState
[0.16] ### Audio state: State.IdleState
[1.48] ### start_utterance readyRead emitted for 1
[23.65] ### readData: Audio sent to output: maxlen=2097152 len(ans)=128512
[23.65] ### atEnd: True
[23.65] ### readData: Audio sent to output: maxlen=1968640 len(ans)=0
[23.65] ### atEnd: True
[23.68] ### Audio state: State.ActiveState
[26.63] ### Audio state: State.IdleState
[26.63] ### start_utterance readyRead emitted for 2
[47.20] ### readData: Audio sent to output: maxlen=1968640 len(ans)=54272
[47.20] ### atEnd: True
[47.20] ### readData: Audio sent to output: maxlen=1914368 len(ans)=0
[47.20] ### atEnd: True
[47.23] ### Audio state: State.ActiveState
[48.49] ### Audio state: State.IdleState
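
To put a number on that gap, here is a rough sketch of a script that pulls the readyRead-to-first-read latency out of a log like the one above. It assumes the exact line format shown in my debug output; the function name `read_latencies` and the regexes are just made up for this illustration and are not part of calibre.

```python
import re

# Matches lines such as "[23.65] ### readData: ..." (leading "[" optional).
LINE = re.compile(r'^\[?(?P<t>\d+\.\d+)\]\s+###\s+(?P<msg>.*)$')
READY = re.compile(r'readyRead emitted for (\d+)')
READ = re.compile(r'len\(ans\)=(\d+)')


def read_latencies(log_text):
    """Map utterance number -> seconds from readyRead to the first non-empty readData."""
    latencies = {}
    pending = None  # (utterance number, time readyRead was emitted)
    for raw in log_text.splitlines():
        m = LINE.match(raw.strip())
        if m is None:
            continue
        t, msg = float(m['t']), m['msg']
        ready = READY.search(msg)
        if ready:
            pending = (int(ready.group(1)), t)
            continue
        read = READ.search(msg)
        # Only a read that actually returned bytes counts as "Qt read the data".
        if pending and read and int(read.group(1)) > 0:
            latencies[pending[0]] = round(t - pending[1], 2)
            pending = None
    return latencies


SAMPLE = """\
[1.48] ### start_utterance readyRead emitted for 1
[23.65] ### readData: Audio sent to output: maxlen=2097152 len(ans)=128512
[26.63] ### start_utterance readyRead emitted for 2
[47.20] ### readData: Audio sent to output: maxlen=1968640 len(ans)=54272
"""

print(read_latencies(SAMPLE))  # {1: 22.17, 2: 20.57}
```

So with the large buffer, both utterances sit for over 20 seconds between the signal and the first read that returns any audio.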

And here is the log with the line setting the large buffer removed. It's a lot stranger than I expected, with a lot of totally spurious calls to readData. Although it successfully reads the data and plays it immediately, that might be unrelated to the signal, given how frequently it tries to read data without being asked:

Code:

[0.10] ### atEnd: True
[0.10] ### readData: Audio sent to output: maxlen=16384 len(ans)=0
[0.10] ### atEnd: True
[0.13] ### Audio state: State.ActiveState
[0.17] ### Audio state: State.IdleState
[0.20] ### readData: Audio sent to output: maxlen=16384 len(ans)=0
[0.20] ### atEnd: True
[0.29] ### readData: Audio sent to output: maxlen=16384 len(ans)=0
[0.29] ### atEnd: True
[0.38] ### readData: Audio sent to output: maxlen=16384 len(ans)=0
[0.38] ### atEnd: True
[0.47] ### readData: Audio sent to output: maxlen=16384 len(ans)=0
[0.47] ### atEnd: True
[0.56] ### readData: Audio sent to output: maxlen=16384 len(ans)=0
[0.56] ### atEnd: True
[0.66] ### readData: Audio sent to output: maxlen=16384 len(ans)=0
[0.66] ### atEnd: True
[0.75] ### readData: Audio sent to output: maxlen=16384 len(ans)=0
[0.75] ### atEnd: True
[0.84] ### readData: Audio sent to output: maxlen=16384 len(ans)=0
[0.84] ### atEnd: True
[0.93] ### readData: Audio sent to output: maxlen=16384 len(ans)=0
[0.93] ### atEnd: True
[1.03] ### readData: Audio sent to output: maxlen=16384 len(ans)=0
[1.03] ### atEnd: True
[1.12] ### readData: Audio sent to output: maxlen=16384 len(ans)=0
[1.12] ### atEnd: True
[1.21] ### readData: Audio sent to output: maxlen=16384 len(ans)=0
[1.21] ### atEnd: True
[1.30] ### readData: Audio sent to output: maxlen=16384 len(ans)=0
[1.30] ### atEnd: True
[1.39] ### readData: Audio sent to output: maxlen=16384 len(ans)=0
[1.39] ### atEnd: True
[1.48] ### readData: Audio sent to output: maxlen=16384 len(ans)=0
[1.48] ### atEnd: True
[1.50] ### start_utterance readyRead emitted for 1
[1.58] ### readData: Audio sent to output: maxlen=16384 len(ans)=16384
[1.58] ### atEnd: False
[1.58] ### Audio state: State.ActiveState
[1.67] ### atEnd: False
[1.76] ### atEnd: False
[1.85] ### readData: Audio data sent to output: maxlen=0
[1.85] ### atEnd: False
[1.85] ### readData: Audio sent to output: maxlen=16384 len(ans)=16384
[1.85] ### atEnd: False
[1.95] ### atEnd: False
[2.04] ### atEnd: False
[2.04] ### atEnd: False
[2.13] ### atEnd: False
[2.22] ### readData: Audio data sent to output: maxlen=0
[2.22] ### atEnd: False
[2.22] ### readData: Audio sent to output: maxlen=16384 len(ans)=16384
[2.22] ### atEnd: False
[2.31] ### atEnd: False
[2.41] ### atEnd: False
[2.41] ### atEnd: False
[2.50] ### atEnd: False
[2.59] ### readData: Audio data sent to output: maxlen=0
[2.59] ### atEnd: False
[2.59] ### readData: Audio sent to output: maxlen=16384 len(ans)=16384
[2.59] ### atEnd: False
[2.68] ### atEnd: False
[2.77] ### atEnd: False
[2.77] ### atEnd: False
[2.87] ### atEnd: False
[2.96] ### readData: Audio data sent to output: maxlen=0
[2.96] ### atEnd: False
[2.96] ### readData: Audio sent to output: maxlen=16384 len(ans)=16384
[2.96] ### atEnd: False
[3.05] ### atEnd: False
[3.14] ### atEnd: False
[3.14] ### atEnd: False
[3.23] ### atEnd: False
[3.33] ###

kovidgoyal · 10-23-2024, 12:12 AM

Interesting. I think I should maybe just bypass Qt, since it seems to be pretty buggy, and use ffmpeg directly in the Piper backend. But that's going to be a larger project. In the meantime, I guess I can restrict the buffer size change to Windows only, where it's needed.
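
Roughly, the Windows-only restriction could look like the sketch below. This is only an illustration, not the actual calibre code: the names `buffer_size_for` and `configure` are hypothetical, `sink` stands in for whatever audio output object the backend uses (Qt 6's QAudioSink does have a `setBufferSize` method), and the 2 MiB value is an assumption taken from the maxlen=2097152 in the log above.

```python
import sys

# Assumption: 2 MiB, matching maxlen=2097152 in the log posted earlier.
LARGE_BUFFER_BYTES = 2 * 1024 * 1024


def buffer_size_for(platform=None):
    """Return the buffer size to request, or None to keep the default."""
    platform = sys.platform if platform is None else platform
    # sys.platform is 'win32' on Windows, 'darwin' on macOS, 'linux' on Linux.
    if platform == 'win32':
        return LARGE_BUFFER_BYTES
    return None


def configure(sink, platform=None):
    """Apply the large buffer only where it is needed (hypothetical helper)."""
    size = buffer_size_for(platform)
    if size is not None:
        sink.setBufferSize(size)


print(buffer_size_for('darwin'))  # None
print(buffer_size_for('win32'))   # 2097152
```

On macOS and Linux the buffer size is then left untouched, which is the configuration that played without delay in the second log.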

MDMullins · Today, 02:34 AM

Fixed now with the latest update. Thank you.
