09-23-2024, 09:04 AM | #1 |
Junior Member
Posts: 4
Karma: 10
Join Date: Sep 2024
Device: none
|
docx cross-reference hyperlinks not preserved
I have an 80 MB docx (made with Word for Mac 16.66.1) with footnotes and endnotes. When I convert to epub, cross-references (internal hyperlinks made with Insert: Cross-reference) in the body are preserved (live, working) and in footnotes and endnotes, external links are also preserved, but internal ones, aka, cross-references, are not. They come out as plain text.
Since the book has hundreds of cross-references, I need their hyperlinks preserved, as they are when I produce a PDF for electronic distribution from within Word. I've attached files that demonstrate the problem: an excerpt from the original file named "xref test.docx", the input and output files from Calibre, "xref test - author.docx" and "xref test - author.epub", and the conversion log, "xref test conversion log.txt". Note: In the original, I added a superscripted "i" in front of footnote reference marks so that they will be distinguishable from endnotes in the e-book. Thank you! |
09-24-2024, 06:36 AM | #2 |
creator of calibre
Posts: 44,535
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
I'm afraid the docx engine doesn't support the use of PAGEREF fields, which is what your cross-references in the footnotes are. Use actual hyperlinks to elements in the document and they will work. As far as I recall, PAGEREF fields are not supported in body text either.
|
09-24-2024, 09:34 AM | #3 |
Junior Member
Posts: 4
Karma: 10
Join Date: Sep 2024
Device: none
|
Thanks. Indeed, only some of the cross-references in the body of the EPUB are working, the ones that are not PAGEREFs. However, there are fields in the footnotes that are not PAGEREFs and that appear to be the same kind of fields as are in the body, yet they are not preserved either.
For example, in the body, “About the Book” is a cross-reference (I've underlined it to indicate that). When I toggle (view) code, it changes to "{ REF _Ref171804793 \h \* MERGEFORMAT }". This one works in the EPUB. Also in the body, "see page 92" is a cross-reference. The code is "{ PAGEREF pizza_party \h }". It is not preserved. However, in the footnotes, "See 'Is it really necessary to cook mushrooms?,' page 869" is actually two cross-references. I can't right-click to toggle codes in a footnote area, so I copied and pasted it into the body. Then when I view the code, it is "See '{ REF _Ref7362940 \h },' page { PAGEREF _Ref7362940 \h }". In the EPUB, neither cross-reference is preserved (they are no longer hyperlinked). |
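For anyone who wants to see how many REF and PAGEREF fields a file has, and whether they sit in the body or in the notes, here is a minimal Python sketch. It is not part of calibre. It assumes the .docx uses Word's usual part names (word/document.xml, word/footnotes.xml, word/endnotes.xml), and the function names are made up for illustration. It only reads the field codes; it does not change the file.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML namespace used by the w: prefix in .docx parts.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
# Assumed part names: these are what Word normally writes.
PARTS = ("word/document.xml", "word/footnotes.xml", "word/endnotes.xml")


def field_instructions(xml_bytes):
    """Yield the instruction text of each field in one XML part."""
    root = ET.fromstring(xml_bytes)
    open_fields = []  # text pieces for each complex field that is still open
    for el in root.iter():
        if el.tag == f"{{{W}}}fldSimple":
            # Simple fields carry the whole code in one attribute.
            yield el.get(f"{{{W}}}instr", "").strip()
        elif el.tag == f"{{{W}}}fldChar":
            kind = el.get(f"{{{W}}}fldCharType")
            if kind == "begin":
                open_fields.append([])
            elif kind == "end" and open_fields:
                yield "".join(open_fields.pop()).strip()
        elif el.tag == f"{{{W}}}instrText" and open_fields:
            # Word may split one field code across several runs.
            open_fields[-1].append(el.text or "")


def count_fields(docx_path):
    """Return {part name: Counter of field types} for the parts that exist."""
    counts = {}
    with zipfile.ZipFile(docx_path) as zf:
        present = set(zf.namelist())
        for part in PARTS:
            if part not in present:
                continue
            tally = Counter()
            for instr in field_instructions(zf.read(part)):
                if instr:
                    tally[instr.split()[0].upper()] += 1
            counts[part] = tally
    return counts


# Example call (the file name is the attachment mentioned in the first post):
# print(count_fields("xref test.docx"))
```

If the footnotes part reports REF or PAGEREF entries, those are the cross-references that come out as plain text in the EPUB according to this thread.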
09-24-2024, 11:34 AM | #4 |
creator of calibre
Posts: 44,535
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Yeah I haven't bothered adding support for fields in footnotes. It's not a common use case. Hyperlinks in footnotes do work however. And IIRC you can create hyperlinks to document elements in Word.
|
09-24-2024, 11:48 AM | #5 |
Junior Member
Posts: 4
Karma: 10
Join Date: Sep 2024
Device: none
|
ok, thank you for getting back to me. This app was otherwise working great. It's too much trouble to convert to hyperlinks right now but maybe in the future.
|
09-25-2024, 09:30 PM | #6 |
JCL Punch-Card Collector
Posts: 58
Karma: 10
Join Date: Jun 2014
Location: Antarctica
Device: Aggressively Device Independent
|
Plantalia, have you tried saving the .docx in .rtf and importing the rtf? If the document doesn't contain text boxes and some other "advanced Word" features, that might be more successful.
RTF from Word documents has been better in the past for me (last time I did this was with a Calibre 6 series release for Windows) with academic journal articles, and it's always more reliable when crossing the Great Religious Divide (Mac to/from Windows; it's a character-coding thingy that affects even Roman-standard text). One caution: If there's any complex math or triangle-type page formatting (two columns of main text but footnotes and endnotes in a single column), RTF will munge it. |
09-26-2024, 12:49 AM | #7 |
Junior Member
Posts: 4
Karma: 10
Join Date: Sep 2024
Device: none
|
Thank you Jaws. I tried RTF but it loses images, and I have over 1000 of them. It also lost footnotes and endnotes, and I have thousands of those. I'm just going to export to PDF instead.
|
09-27-2024, 05:21 AM | #8 |
JCL Punch-Card Collector
Posts: 58
Karma: 10
Join Date: Jun 2014
Location: Antarctica
Device: Aggressively Device Independent
|
Plantalia, it sounds like you're running into image-format problems. It's another area where RTF can be a bit glitchy, although I've not seen the "losing footnotes/endnotes" issue (believe me, with some of the stuff I've worked with I would have noticed that; I've been the technical editor for several academic journals in both STEM and non-STEM fields). RTF doesn't handle vector image formats, or rather Word's interface with RTF files glitches on vector image formats (LibreOffice does better but has its own, actually worse, problems with cross-references). Oddly enough, Word's use of RTFs also has trouble with some BMPs (which tend to blow up and cross memory-sector boundaries for no apparent reason)... and with embedded HTML, which may be the real source of your problems.*
A PDF may turn out to be your best end result, but don't use any internal PDF converter or an Adobe product. (For whatever reason, Adobe's own products tend to glitch or fully choke on pages that have both multiple images and either multiple footnotes or a continued footnote.) Instead, use one of the many, many "print to PDF" utilities out there; I use Bullzip for Windows, but there are several others worth considering, and several for both the Mac and Linux-native systems.
* Short version: In Microsoft's infinite wisdom, and to maintain backward compatibility with its own prior decisions back to the 90s, HTML-coded hyperlinks in Office files are not fully W3C compliant (for example, internal link names often begin with numerals, which is a no-no); they'll continue to work inside the program, but results when sent to another program are unpredictable. Which is not at all to say that W3C is perfect, just that epub readers tend to be snooty about it. |
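To make the "begins with a numeral" point concrete, here is a small Python sketch that flags link or bookmark names a strict reader might dislike. The pattern is an assumption on my part: a rough, ASCII-only approximation of the XML rule that a name starts with a letter or underscore. It is not what calibre or any particular epub reader actually enforces, and the function name is made up for illustration.

```python
import re

# Conservative pattern (assumed, ASCII-only): a letter or underscore first,
# then letters, digits, underscores, hyphens or periods.
SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def risky_link_names(names):
    """Return the names that do not match the conservative pattern."""
    return [name for name in names if not SAFE_NAME.fullmatch(name)]


# Example call with two names from this thread and two made-up ones:
# risky_link_names(["_Ref7362940", "pizza_party", "92_page", "2nd-note"])
```

Names such as "_Ref7362940" pass because they start with an underscore; the made-up names that start with a digit are the kind the footnote above warns about.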
|