07-31-2018, 04:43 AM | #1 |
Junior Member
Posts: 5
Karma: 10
Join Date: Jul 2018
Device: edge
|
DOCX Indentation - Ebook-Convert
According to the documentation for ebook-convert, the option --remove-paragraph-spacing-indent-size=10 should add an indentation of 10 em. However, this does not work, and neither does the --remove-paragraph-spacing option. The same applies to --insert-blank-line and --insert-blank-line-size. My source file contains p and div tags, and each paragraph should be either indented or separated by a blank line. I have also tried to insert two non-breaking spaces (&nbsp;) at the start of each paragraph via XSLT, but these spaces are not shown (no indentation appears) in the DOCX. However, when I replace the non-breaking spaces with **, these characters are displayed in the DOCX file. Any suggestions on how to solve these problems? |
07-31-2018, 05:16 AM | #2 |
creator of calibre
Posts: 44,542
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
|
07-31-2018, 10:01 AM | #3 |
Junior Member
|
Files
Hi, thanks for the quick response. Here are the files I am working with, including a calibre log file. We usually run the conversion from a Python script, but here I have used the graphical interface (app). The result looks identical, i.e. no indentation.
Regards, Tage Fredheim |
07-31-2018, 10:04 AM | #4 |
creator of calibre
|
I need the source file, which as per your first post is supposed to be an HTML file?
|
08-01-2018, 04:16 AM | #5 |
Junior Member
|
Yes, an XHTML file. I could not upload the file with that extension, so I copied it to a DOCX file which I called sourcefile.docx. As you can see, I have indented some paragraphs with two blank spaces (in fact &nbsp;), i.e. non-breaking spaces, which should not be collapsed.
<section epub:type="chapter" id="level2_1">
<p>**Sosiologi og sosialantropologi er to av en rekke ulike samfunnsfag. Andre samfunnsfag er for eksempel statsvitenskap, samfunnsøkonomi, psykologi, samfunnsgeografi, pedagogikk og historie. Alle disse fagene handler om den menneskeskapte verden og konsentrerer seg på ulike måter om menneskelig aktivitet. Men likevel er de så forskjellige at de har ulike navn.</p>
<p>**Noen enkle skiller mellom fagene finnes: Mens psykologene er opptatt av menneskenes atferd og tankeprosesser, studerer sosiologene og sosialantropologene først og fremst de sosiale og kulturelle sammenhengene vi inngår i. Pedagogene er opptatt av spørsmål knyttet til læring, skole og utdanning, mens statsvitere er opptatt av staten, offentlig aktivitet og hvordan det internasjonale systemet fungerer politisk og økonomisk.</p>
<p>**Sosiologi og sosialantropologi er to fag som har mye til felles, både når det gjelder temaer som studeres, og teorier og begreper som brukes. I begge fagene stiller vi for eksempel slike spørsmål:</p>
<div epub:type="pagebreak" class="page-normal" id="page-11" title="11">--- 11 til 298</div><div>
I also need to indent various lists and other text, so blank spaces and tabs must be kept. The DOCX files are used by blind pupils with a braille display, so special formatting and indentation are mandatory. |
08-01-2018, 04:29 AM | #6 |
creator of calibre
|
Don't use blank spaces to indent; use the text-indent CSS property and you will be fine. Put the following in Extra CSS in the conversion settings:
Code:
p { text-indent: 2em !important } |
08-01-2018, 04:30 AM | #7 |
creator of calibre
|
And if you need to upload files that mobileread does not support, you can zip them up and attach the zip file.
|
08-01-2018, 05:43 AM | #8 |
Junior Member
|
OK, as mentioned, we run the conversion from a Python program. Where is "extra css" located?
try:
    # Log message: "Converting from XHTML to DOCX..."
    self.utils.report.info("Konverterer fra XHTML til DOCX...")
    process = self.utils.filesystem.run(["/usr/bin/ebook-convert",
                                         html_file,
                                         os.path.join(temp_docxdir, epub.identifier() + ".docx"),
                                         "--no-chapters-in-toc",
                                         "--toc-threshold=0",
                                         "--docx-page-size=a4",
                                         "--linearize-tables",
                                         "--embed-font-family=Verdana",  # microsoft fonts must be installed (sudo apt-get install ttf-mscorefonts-installer)
                                         "--docx-page-margin-top=42",
                                         "--docx-page-margin-bottom=42",
                                         "--docx-page-margin-left=70",
                                         "--docx-page-margin-right=56",
                                         "--base-font-size=13"]) |
08-01-2018, 06:33 AM | #9 |
creator of calibre
|
--extra-css
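For the Python call shown earlier in the thread, a minimal sketch of where the option might go: the CSS from post #6 is passed as one more entry in the argument list. The helper name build_convert_command, the EXTRA_CSS constant and the example file names are illustrative assumptions, not part of calibre or of the original script. As far as I know, --extra-css accepts either raw CSS or the path to a stylesheet, and because the arguments are passed as a list, no shell quoting should be needed around the CSS.

```python
# CSS rule suggested earlier in the thread (post #6).
# EXTRA_CSS is an illustrative name, not something calibre defines.
EXTRA_CSS = "p { text-indent: 2em !important }"


def build_convert_command(html_file, docx_file, extra_css=EXTRA_CSS):
    """Return the ebook-convert argument list with --extra-css appended.

    Hypothetical helper: it only builds the list, it does not run calibre.
    """
    return [
        "/usr/bin/ebook-convert",
        html_file,
        docx_file,
        "--no-chapters-in-toc",
        "--toc-threshold=0",
        "--docx-page-size=a4",
        "--linearize-tables",
        "--embed-font-family=Verdana",
        "--docx-page-margin-top=42",
        "--docx-page-margin-bottom=42",
        "--docx-page-margin-left=70",
        "--docx-page-margin-right=56",
        "--base-font-size=13",
        # Raw CSS is passed as a single argument; a path to a .css file
        # could be given here instead.
        "--extra-css=" + extra_css,
    ]


# Example file names are placeholders.
command = build_convert_command("book.xhtml", "book.docx")
```

The resulting list can be handed to whatever runs the process in your script (self.utils.filesystem.run in the snippet above, or subprocess.run). If lists and other elements also need indenting, further rules can be appended to the same CSS string or kept in a separate stylesheet file.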
|