Old 04-11-2019, 06:48 AM   #1
tage fredheim
xml:lang

Hello, I am converting an EPUB file to docx using calibre. The text contains several description lists (glossaries) that pair the English term with its Norwegian translation:

<dl><dt>mind: </dt><dd lang="no" xml:lang="no">sinn</dd>

My question is: will this information (xml:lang) be transferred to the docx file, so that the synthetic speech changes language accordingly?

/Tage
Old 04-11-2019, 10:47 AM   #2
kovidgoyal
creator of calibre
Try it and see. You can convert to docx and convert back to see if the lang information was preserved.
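As a complement to the round trip, the docx file can also be inspected directly. The sketch below is only an illustration: it assumes the language ends up in `w:lang` elements (the standard WordprocessingML place for it) in `word/document.xml` or `word/styles.xml`, and the function name `docx_languages` is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_languages(docx_path, parts=("word/document.xml", "word/styles.xml")):
    """Collect the w:val values of all w:lang elements in the given parts."""
    langs = set()
    with zipfile.ZipFile(docx_path) as zf:
        names = set(zf.namelist())
        for part in parts:
            if part not in names:
                continue  # a part may be absent in a minimal docx
            root = ET.fromstring(zf.read(part))
            for el in root.iter(f"{{{W_NS}}}lang"):
                val = el.get(f"{{{W_NS}}}val")
                if val:
                    langs.add(val)
    return langs
```

Calling `docx_languages("book.docx")` on the converted file shows which language codes are present; an empty set would suggest that no language information was written to those parts.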
Old 04-25-2019, 04:46 AM   #3
tage fredheim
Hi, I converted the EPUB to docx and then back to EPUB to check whether the language metadata is preserved.

My original EPUB file has the following language metadata:

<dc:language xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dtb="http://www.daisy.org/z3986/2005/dtbook/" xmlns:d="http://www.daisy.org/ns/pipeline/data" id="language_1">en</dc:language>

After the round trip (EPUB to docx and back to EPUB), the value has changed from "en" to "nb":

<dc:language>nb</dc:language>

This is a book for learning English, so it is essential that the spoken language in the docx is English.

In the glossary text the language is preserved, e.g.:

<p>Page 221:</p><dl><dt>completely: </dt><dd lang="no" xml:lang="no">fullstendig</dd><dt>out of fashion: </dt><dd lang="no" xml:lang="no">ikke på moten</dd>

Converted back to EPUB from docx:

<p class="block_6"><span class="text_">Page 221:</span></p>
<ul class="list_">
<li class="block_9"><span class="text_">completely: </span><span lang="no" class="text_">fullstendig</span></li>

So it seems that the main language "en" is lost when converting to docx.
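The before/after comparison of `dc:language` can be scripted. This is a minimal sketch, not part of calibre: it assumes a standard EPUB container (`META-INF/container.xml` pointing at the OPF file), and the helper name `epub_languages` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"
DC_NS = "http://purl.org/dc/elements/1.1/"


def epub_languages(epub_path):
    """Return the text of every dc:language element in the EPUB's OPF file."""
    with zipfile.ZipFile(epub_path) as zf:
        # container.xml tells us where the OPF package file lives
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find(f".//{{{CONTAINER_NS}}}rootfile")
        if rootfile is None:
            raise ValueError("no rootfile entry in META-INF/container.xml")
        opf = ET.fromstring(zf.read(rootfile.get("full-path")))
    return [el.text.strip() for el in opf.iter(f"{{{DC_NS}}}language") if el.text]
```

Running it on the original file and on the round-tripped file makes the change visible at a glance, e.g. comparing `epub_languages("original.epub")` with `epub_languages("roundtrip.epub")` (both file names are placeholders).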
Old 04-25-2019, 04:53 AM   #4
kovidgoyal
Set the language in the book metadata and it will be used during conversion.
Old 04-25-2019, 05:47 AM   #5
tage fredheim
OK, how can that be achieved programmatically? Currently my programmatic interface is:

try:
    # Log message (Norwegian): "Converting from XHTML to DOCX..."
    self.utils.report.info("Konverterer fra XHTML til DOCX...")
    process = self.utils.filesystem.run([
        "/usr/bin/ebook-convert",
        html_file,
        os.path.join(temp_docxdir, epub.identifier() + ".docx"),
        "--no-chapters-in-toc",
        "--toc-threshold=0",
        "--docx-page-size=a4",
        # "--linearize-tables",
        "--extra-css=/home/statped/Dokumenter/produksjonssystem/produksjonssystem/extra.css",
        # Microsoft fonts must be installed (sudo apt-get install ttf-mscorefonts-installer)
        "--embed-font-family=Verdana",
        "--docx-page-margin-top=42",
        "--docx-page-margin-bottom=42",
        "--docx-page-margin-left=70",
        "--docx-page-margin-right=56",
        "--base-font-size=13",
        "--font-size-mapping=13,13,13,13,13,13,13,13",
    ])
Old 04-25-2019, 06:45 AM   #6
kovidgoyal
--language
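For illustration, here is one way the `--language` metadata option might be added to a scripted `ebook-convert` call. This is a sketch under assumptions: the function names are invented for the example, plain `subprocess` stands in for the poster's own `self.utils.filesystem.run` wrapper, and it assumes a short code such as `en` is an acceptable value.

```python
import subprocess


def build_convert_command(src, dst, language, extra_args=()):
    """Build the ebook-convert argument list with an explicit book language."""
    return [
        "/usr/bin/ebook-convert",  # path as used in the post above
        src,
        dst,
        f"--language={language}",  # sets the language metadata of the output
        *extra_args,
    ]


def convert(src, dst, language, extra_args=()):
    """Run the conversion; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_convert_command(src, dst, language, extra_args), check=True)
```

A call such as `convert(html_file, docx_file, "en", ["--docx-page-size=a4"])` would then pass the main language explicitly instead of relying on whatever default the conversion picks up.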