{"id":15298,"date":"2024-09-20T13:04:45","date_gmt":"2024-09-20T11:04:45","guid":{"rendered":"https:\/\/www.beseit.net\/?p=15298"},"modified":"2024-09-22T19:07:32","modified_gmt":"2024-09-22T17:07:32","slug":"tts-actualment-qualsevol-llibre-en-format-digital-es-un-audio-book","status":"publish","type":"post","link":"https:\/\/www.beseit.net\/?p=15298","title":{"rendered":"TTS. Today any book in digital format is an audiobook."},"content":{"rendered":"\n<p>The acronym <strong>TTS<\/strong> stands for <strong>Text-to-Speech<\/strong>. It is a technology that converts written text into spoken voice. It is very useful for people with visual impairments or reading difficulties, and it can also improve efficiency by letting users do other tasks while they listen. Listening also makes it possible to get through a text faster, and reading along while you listen can improve your comprehension.<\/p>\n\n\n\n<p>Artificial intelligence has significantly improved Text-to-Speech (TTS) technology. Some of the most notable improvements include:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>More natural voices<\/strong>: Current TTS models can generate audio that sounds much more human thanks to deep learning algorithms. 
<a href=\"https:\/\/blog.unrealspeech.com\/exploring-the-2023-speech-engine-evolution-in-tts-and-ai\/\" target=\"_blank\" rel=\"noreferrer noopener\">This allows for more natural prosody and cadence<\/a><a href=\"https:\/\/blog.unrealspeech.com\/exploring-the-2023-speech-engine-evolution-in-tts-and-ai\/\" target=\"_blank\" rel=\"noreferrer noopener\"><sup>1<\/sup><\/a>.<\/li>\n\n\n\n<li><a href=\"https:\/\/blog.unrealspeech.com\/exploring-the-2023-speech-engine-evolution-in-tts-and-ai\/\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Neural TTS<\/strong>: This technology uses neural networks to create more realistic and expressive voices<\/a><a href=\"https:\/\/www.naturalreaders.com\/online\/\" target=\"_blank\" rel=\"noreferrer noopener\"><sup>2<\/sup><\/a>.<\/li>\n\n\n\n<li><a href=\"https:\/\/blog.unrealspeech.com\/exploring-the-2023-speech-engine-evolution-in-tts-and-ai\/\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Translation and transcription<\/strong>: Modern speech systems that include TTS can transcribe accurately and translate between languages, improving global communication<\/a><a href=\"https:\/\/blog.unrealspeech.com\/exploring-the-2023-speech-engine-evolution-in-tts-and-ai\/\" target=\"_blank\" rel=\"noreferrer noopener\"><sup>1<\/sup><\/a>.<\/li>\n\n\n\n<li><a href=\"https:\/\/openai.com\/index\/chatgpt-can-now-see-hear-and-speak\/\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>More human interaction<\/strong>: Improvements in TTS allow more intuitive, human-like interactions, making virtual assistants more effective and accessible<\/a><a href=\"https:\/\/openai.com\/index\/chatgpt-can-now-see-hear-and-speak\/\" target=\"_blank\" rel=\"noreferrer noopener\"><sup>3<\/sup><\/a>.<\/li>\n<\/ol>\n\n\n\n<p>These improvements have broadened the applications of TTS in areas such as customer service, digital content creation, and the 
inclusion of people with visual impairments or reading difficulties.<\/p>\n\n\n\n<p>Personally, I have noticed that applications I was already using, such as the e-book manager <strong>Calibre<\/strong>, have considerably expanded their selection of voices, and with notable quality. Languages that until recently were hard to include now appear as a matter of course; I am talking about our beloved <strong><mark style=\"background-color:rgba(0, 0, 0, 0);color:#f10606\" class=\"has-inline-color\">Catalan<\/mark><\/strong>. For now we have to settle for medium quality, but the change is enormous.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"564\" height=\"811\" src=\"https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-5.png\" alt=\"\" class=\"wp-image-15301\" srcset=\"https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-5.png 564w, https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-5-209x300.png 209w\" sizes=\"auto, (max-width: 564px) 100vw, 564px\" \/><\/figure>\n\n\n\n<p>Among the TTS voices available in Catalan is \u201cupc_ona(Spain)\u201d, of medium quality. 
It is likely that the Universitat Polit\u00e8cnica de Catalunya (UPC) worked on developing these voices, since \u201cupc\u201d may refer to that institution.<\/p>\n\n\n\n<p>The number of voices available in a given language for Text-to-Speech (TTS) systems can depend on several factors:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><a href=\"https:\/\/dl.acm.org\/doi\/fullHtml\/10.1145\/3313831.3376789\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Market demand<\/strong>: Languages with more speakers tend to have more voices available because demand for those voices is higher<\/a><a href=\"https:\/\/dl.acm.org\/doi\/fullHtml\/10.1145\/3313831.3376789\" target=\"_blank\" rel=\"noreferrer noopener\"><sup>1<\/sup><\/a>.<\/li>\n\n\n\n<li><strong>Linguistic resources<\/strong>: The availability of good-quality voice data for training TTS models is crucial. <a href=\"https:\/\/link.springer.com\/article\/10.1007\/s11042-022-13943-4\" target=\"_blank\" rel=\"noreferrer noopener\">Languages with more linguistic resources tend to have more voices available<\/a><a href=\"https:\/\/link.springer.com\/article\/10.1007\/s11042-022-13943-4\" target=\"_blank\" rel=\"noreferrer noopener\"><sup>2<\/sup><\/a>.<\/li>\n\n\n\n<li><a href=\"https:\/\/arxiv.org\/pdf\/2106.15561\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Investment in research and development<\/strong>: Academic institutions and companies that invest in research and development of TTS technologies can create more voices for a given language<\/a><a href=\"https:\/\/arxiv.org\/pdf\/2106.15561\" target=\"_blank\" rel=\"noreferrer noopener\"><sup>3<\/sup><\/a>.<\/li>\n\n\n\n<li><a href=\"https:\/\/link.springer.com\/article\/10.1007\/s11042-022-13943-4\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Linguistic 
complexity<\/strong>: Some languages can be harder to model because of their phonetic and grammatical complexity, which can limit the number of voices available<\/a><a href=\"https:\/\/link.springer.com\/article\/10.1007\/s11042-022-13943-4\" target=\"_blank\" rel=\"noreferrer noopener\"><sup>2<\/sup><\/a>.<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\">2 Edge has an excellent EpubReader plugin<\/h2>\n\n\n\n<p>As we will see, this has many advantages.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"577\" src=\"https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-6-1024x577.png\" alt=\"\" class=\"wp-image-15306\" srcset=\"https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-6-1024x577.png 1024w, https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-6-300x169.png 300w, https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-6-768x433.png 768w, https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-6-1536x865.png 1536w, https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-6-500x282.png 500w, https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-6.png 1916w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Moving the mouse to the bottom of the screen brings up the options for changing the settings:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"840\" src=\"https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-7-1024x840.png\" alt=\"\" class=\"wp-image-15309\" srcset=\"https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-7-1024x840.png 1024w, https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-7-300x246.png 300w, https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-7-768x630.png 768w, 
https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-7-366x300.png 366w, https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-7.png 1076w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Text size 18; color 249-232-180<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"706\" height=\"867\" src=\"https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-8.png\" alt=\"\" class=\"wp-image-15311\" srcset=\"https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-8.png 706w, https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-8-244x300.png 244w\" sizes=\"auto, (max-width: 706px) 100vw, 706px\" \/><\/figure>\n\n\n\n<p>Preferred voices:<\/p>\n\n\n\n<p>Catalan:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"270\" height=\"199\" src=\"https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-10.png\" alt=\"\" class=\"wp-image-15317\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"464\" height=\"482\" src=\"https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-12.png\" alt=\"\" class=\"wp-image-15324\" srcset=\"https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-12.png 464w, https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-12-289x300.png 289w\" sizes=\"auto, (max-width: 464px) 100vw, 464px\" \/><\/figure>\n\n\n\n<p>English UK:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"267\" height=\"202\" src=\"https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-9.png\" alt=\"\" class=\"wp-image-15314\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"451\" height=\"459\" src=\"https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-11.png\" alt=\"\" 
class=\"wp-image-15322\" srcset=\"https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-11.png 451w, https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-11-295x300.png 295w\" sizes=\"auto, (max-width: 451px) 100vw, 451px\" \/><\/figure>\n\n\n\n<p>Spanish:<\/p>\n\n\n\n<p>The &#8216;Read Well&#8217; plugin for Edge uses these same voices. It is very useful for web pages.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"647\" src=\"https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-13-1024x647.png\" alt=\"\" class=\"wp-image-15327\" srcset=\"https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-13-1024x647.png 1024w, https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-13-300x189.png 300w, https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-13-768x485.png 768w, https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-13-475x300.png 475w, https:\/\/www.beseit.net\/wp-content\/uploads\/2024\/09\/image-13.png 1297w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>The acronym TTS stands for Text-to-Speech. It is a technology that converts written text into spoken voice. 
It is very useful for people with visual impairments or reading difficulties, and it can also improve efficiency by letting &hellip; <a href=\"https:\/\/www.beseit.net\/?p=15298\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":8179,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-15298","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bloc-de-notes"],"_links":{"self":[{"href":"https:\/\/www.beseit.net\/index.php?rest_route=\/wp\/v2\/posts\/15298","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.beseit.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.beseit.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.beseit.net\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.beseit.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=15298"}],"version-history":[{"count":13,"href":"https:\/\/www.beseit.net\/index.php?rest_route=\/wp\/v2\/posts\/15298\/revisions"}],"predecessor-version":[{"id":15329,"href":"https:\/\/www.beseit.net\/index.php?rest_route=\/wp\/v2\/posts\/15298\/revisions\/15329"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.beseit.net\/index.php?rest_route=\/wp\/v2\/media\/8179"}],"wp:attachment":[{"href":"https:\/\/www.beseit.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=15298"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.beseit.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=15298"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.beseit.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=15298"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":tru
e}]}}