11-05-2015, 01:21 PM | #1 |
Junior Member
Posts: 5
Karma: 10
Join Date: Nov 2015
Device: none
|
Docx to ePub conversion keeps failing
I am trying to convert a MS Word file to ePub, and it keeps failing. The log is below.
The text is in Tibetan, UTF-8 encoded. I converted a different Tibetan text this evening without any errors, and I have done so numerous times in the past. I used the same settings for this one, but it fails almost immediately (17 seconds). I tried with the Word file saved as a Word 2015 file as well as a 2013 file. I am using the newest version of Calibre 64 bit on Windows 10 Home 64 bit. I uninstalled Calibre, deleted the Calibre folder in %User%/AppData/Roaming, and then reinstalled, but nothing changed. Running as Administrator made no difference either. The Word file is larger than the others I have converted (3.9 MB), but I don't think it is too large to cause a problem. (?)
I spent numerous hours this evening readying this file for ePub conversion, stripping away a lot of the formatting and so forth. And now it isn't working! I cannot attach the file here because it is a .docx file, not a .doc, and the file uploader says it is an invalid file type for upload. I can convert and upload a .doc version, if requested.
It is written in Unicode Tibetan fonts, two of them specifically, but if you don't have them it should be legible using MS Himalaya or whatever Apple's default Tibetan font is. (In my ePub I have chosen to embed all the fonts used. These are the exact same fonts I used in my previous conversions, which did work.) Any thoughts or ideas? Thanks!
The error log:
Code:
calibre, version 2.42.0 (win32, isfrozen: True)
Conversion Error: Failed: Convert book 1 of 1 (འདུལ་བ་མདོ་རྩ་བའི་མཆན་འགྲེལ།)

Convert book 1 of 1 (འདུལ་བ་མདོ་རྩ་བའི་མཆན་འགྲེལ།)
Resolved conversion options
calibre version: 2.42.0
{'asciiize': False, 'author_sort': None, 'authors': None, 'base_font_size': 0.0, 'book_producer': None, 'change_justification': u'original', 'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., '\\s*((chapter|book|section|part)\\s+)|((prolog|prologue|epilogue)(\\s+|$))', 'i')) or @class = 'chapter']", 'chapter_mark': u'both', 'comments': None, 'cover': u'C:\\Users\\Gyalten\\AppData\\Local\\Temp\\calibre_c4ydw1\\wiykiq.jpeg', 'debug_pipeline': None, 'dehyphenate': True, 'delete_blank_paragraphs': True, 'disable_font_rescaling': False, 'docx_no_cover': False, 'docx_no_pagebreaks_between_notes': True, 'dont_split_on_page_breaks': False, 'duplicate_links_in_toc': False, 'embed_all_fonts': True, 'embed_font_family': u'Monlam Uni OuChan2', 'enable_heuristics': False, 'epub_flatten': False, 'epub_inline_toc': False, 'epub_toc_at_end': False, 'expand_css': False, 'extra_css': None, 'extract_to': None, 'filter_css': u'', 'fix_indents': True, 'flow_size': 260, 'font_size_mapping': None, 'format_scene_breaks': True, 'html_unwrap_factor': 0.4, 'input_encoding': u'utf-8', 'input_profile': <calibre.customize.profiles.InputProfile object at 0x000000000256ADD8>, 'insert_blank_line': False, 'insert_blank_line_size': 0.5, 'insert_metadata': False, 'isbn': None, 'italicize_common_cases': True, 'keep_ligatures': False, 'language': None, 'level1_toc': None, 'level2_toc': None, 'level3_toc': None, 'line_height': 0.0, 'linearize_tables': True, 'margin_bottom': 4.0, 'margin_left': 4.0, 'margin_right': 4.0, 'margin_top': 4.0, 'markup_chapter_headings': True, 'max_toc_links': 50, 'minimum_line_height': 120.0, 'no_chapters_in_toc': False, 'no_default_epub_cover': False, 'no_inline_navbars': False, 'no_svg_cover': False, 'output_profile': <calibre.customize.profiles.GenericEink object at 0x0000000002579198>, 'page_breaks_before': u'/', 'prefer_metadata_cover': False, 'preserve_cover_aspect_ratio': True, 'pretty_print': True, 'pubdate': None, 'publisher': None, 'rating': None, 'read_metadata_from_opf': u'C:\\Users\\Gyalten\\AppData\\Local\\Temp\\calibre_c4ydw1\\skigqi.opf', 'remove_fake_margins': True, 'remove_first_image': False, 'remove_paragraph_spacing': True, 'remove_paragraph_spacing_indent_size': 1.5, 'renumber_headings': True, 'replace_scene_breaks': u'', 'search_replace': '[]', 'series': None, 'series_index': None, 'smarten_punctuation': False, 'sr1_replace': None, 'sr1_search': None, 'sr2_replace': None, 'sr2_search': None, 'sr3_replace': None, 'sr3_search': None, 'start_reading_at': None, 'subset_embedded_fonts': True, 'tags': None, 'timestamp': None, 'title': None, 'title_sort': None, 'toc_filter': None, 'toc_threshold': 6, 'toc_title': None, 'unsmarten_punctuation': False, 'unwrap_lines': True, 'use_auto_toc': True, 'verbose': 2}
InputFormatPlugin: DOCX Input running on C:\Users\Gyalten\AppData\Local\Temp\calibre_c4ydw1\83jqv2.docx
Python function terminated unexpectedly
  Error in xpath expression (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 132, in main
  File "site.py", line 109, in run_entry_point
  File "site-packages\calibre\utils\ipc\worker.py", line 193, in main
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 31, in gui_convert_override
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 25, in gui_convert
  File "site-packages\calibre\ebooks\conversion\plumber.py", line 1051, in run
  File "site-packages\calibre\customize\conversion.py", line 241, in __call__
  File "site-packages\calibre\ebooks\conversion\plugins\docx_input.py", line 31, in convert
  File "site-packages\calibre\ebooks\docx\to_html.py", line 97, in __call__
  File "site-packages\calibre\ebooks\docx\fields.py", line 104, in __call__
  File "xpath.pxi", line 456, in lxml.etree.XPath.__call__ (src\lxml\lxml.etree.c:147594)
  File "xpath.pxi", line 238, in lxml.etree._XPathEvaluatorBase._handle_result (src\lxml\lxml.etree.c:144977)
  File "xpath.pxi", line 224, in lxml.etree._XPathEvaluatorBase._raise_eval_error (src\lxml\lxml.etree.c:144832)
lxml.etree.XPathEvalError: Error in xpath expression
Last edited by BetterRed; 11-05-2015 at 03:33 PM. |
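The last calibre frame in the traceback is in docx\fields.py, so one thing that may be worth checking is which Word field codes the document contains. The sketch below is only a diagnostic aid under that assumption: it reads word/document.xml straight out of the .docx (which is a zip archive) and lists the field instructions it finds. The function name list_field_instructions is made up for this example, and nothing in the log confirms that a particular field is the actual cause.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_field_instructions(docx_file):
    """Return the field instruction strings found in a .docx file.

    Hypothetical helper for illustration only. It collects the text of
    complex-field instructions (w:instrText) followed by the w:instr
    attribute of simple fields (w:fldSimple).
    """
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    complex_fields = [el.text or "" for el in root.iter("{%s}instrText" % W_NS)]
    simple_fields = [
        el.get("{%s}instr" % W_NS, "") for el in root.iter("{%s}fldSimple" % W_NS)
    ]
    return complex_fields + simple_fields
```

Calling list_field_instructions("mybook.docx") (the file name is a placeholder) returns a plain list of strings, so an unusual or malformed instruction should be easy to spot by eye.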
11-05-2015, 01:44 PM | #2 |
creator of calibre
Posts: 44,509
Karma: 24495778
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Simply zip up the docx file and attach that. If the file is copyrighted then use the calibre bug tracker, instead of attaching it here.
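For anyone who prefers to script the zipping step, here is a minimal sketch using Python's standard zipfile module. The function name zip_for_upload and the default output name are assumptions made for this example; right-clicking the file in Windows and choosing "Send to > Compressed (zipped) folder" achieves the same thing.

```python
import zipfile
from pathlib import Path


def zip_for_upload(src, dest=None):
    """Wrap a single file in a .zip archive and return the archive path.

    Hypothetical helper: by default the archive is written next to the
    source file, with the extension replaced by .zip.
    """
    src = Path(src)
    dest = Path(dest) if dest else src.with_suffix(".zip")
    with zipfile.ZipFile(dest, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # arcname keeps only the file name, not the full directory path
        zf.write(src, arcname=src.name)
    return dest
```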
11-06-2015, 08:45 AM | #3 |
Junior Member
Posts: 5
Karma: 10
Join Date: Nov 2015
Device: none
|
File...
Thanks, here is the file zipped...
I am going to assume that part of the problem is that the file has a whole mess of styles layered one upon another. I saved the file as an HTML file and tried to convert that; it got to 67% (instead of 17%) but then also failed. So I spent the day rebuilding the file from scratch, scrapping all of the overlapping styles and doing my best to clean it up. The resulting file is about 1 MB smaller. I am trying to convert it now and will post an update on how it goes. But if you can look at this first file and see whether anything else is amiss, other than the formatting and style nightmare, please let me know. Thanks again!
[Attachment not approved until confirmation given that it only contains public domain text and images (or the OP owns the copyright).]
Last edited by pdurrant; 11-06-2015 at 10:43 AM. |
11-06-2015, 09:33 AM | #4 | |
Junior Member
Posts: 5
Karma: 10
Join Date: Nov 2015
Device: none
|
Update
I tried again, as mentioned in my previous post, with a new version of the file that was much cleaner.
Again it failed, but the error message was much more useful this time: Quote:
The problem is that Tibetan is a complex script that does not use spaces between words. In MS Word you can set the justification to "Thai Justification" to work around this, or you can insert zero-width spaces after every dot (every syllable is separated by a dot). I had run a macro to do this, but maybe Calibre did not pick up the zero-width spaces? In any case, I can probably just add paragraph breaks at random points throughout the text, and that might be all that is needed... |
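The Word macro mentioned above is not shown in the thread. As a rough illustration of the same idea, the sketch below inserts a zero-width space (U+200B) after each Tibetan syllable dot, assuming the dot is the tsheg character U+0F0B. The function name add_break_opportunities is made up for this example, and whether calibre keeps these characters through conversion is not confirmed here.

```python
import re

TSHEG = "\u0f0b"  # Tibetan intersyllabic mark (the "dot" between syllables)
ZWSP = "\u200b"   # zero-width space, an invisible line-break opportunity

# Match a tsheg that is not already followed by a zero-width space,
# by whitespace, or by the end of the text.
_TSHEG_NEEDING_BREAK = re.compile(TSHEG + r"(?![\u200b\s]|$)")


def add_break_opportunities(text):
    """Return text with a zero-width space after each tsheg (illustrative).

    The function is idempotent: running it on its own output adds nothing.
    """
    return _TSHEG_NEEDING_BREAK.sub(TSHEG + ZWSP, text)
```

Because the pattern skips a tsheg that already has a zero-width space after it, the function can be run safely on text that a macro has already processed.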