<oembed><type>rich</type><version>1.0</version><title>Subema wrote</title><author_name>Subema (npub1tg…vj8mh)</author_name><author_url>https://yabu.me/npub1tgnp5cf3r9rdjvn2c4geney59tsdzyggddwzr2y9wpfugt4agjqqlvj8mh</author_url><provider_name>njump</provider_name><provider_url>https://yabu.me</provider_url><html>So it&#39;s audio? I&#39;d be surprised if they didn&#39;t have better tech at launch, then. Voice generation is pretty far along, at least as far as consistent style and balanced output go. I mean, it&#39;s accessible to any end user willing to pay pennies. The bleeding-edge models with scriptable intonation/speed/volume/idontevenknow are circulating in papers and Hugging Face demos. So, if I were to bet, voice quality wouldn&#39;t be the reason for failure.</html></oembed>