04-11-2013, 11:35 PM | #1 |
Junior Member
Posts: 3
Karma: 10
Join Date: Apr 2013
Device: none
|
XSLT vs InDesign - best conversion process?
Hello,
Let me begin by apologizing if this sounds like a naive question, but I wanted to ask whether there are any inherent advantages to using a transform to export XML to EPUB, rather than using InDesign. It's not a simple DocBook-to-EPUB conversion, because the source XML uses a specific vocabulary (based on the NLM [National Library of Medicine] tag library), and thus we would need someone to develop the XSLT specifically for us. I suggested pouring the source XML into InDesign (since it's already tagged and ready to go) and exporting it that way. It seems that it would be cheaper, easier, faster, and wouldn't require anyone to take evening classes in XProc. Am I missing something? (I VERY well could be.) Thanks so much for any help provided. |
04-12-2013, 04:00 AM | #2 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
Just keep in mind that an ePUB exported from InDesign typically needs some touching up. The export is not always correct.
|
04-12-2013, 07:28 AM | #3 | |
Wizard
Posts: 2,304
Karma: 12587727
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
I would say the best bet is working directly from the XML -> (Transformation) -> XHTML. From there, you could insert the XHTML into an EPUB using Sigil, and do any other needed tweaks there (hopefully the Transformation can be written to minimize or eliminate that work). As long as these XML files all use the same classes, you can create one master CSS file which can then be used in all books. As long as all these documents are tagged in a very consistent way, this should be the most reliable route. Would it be possible to give samples of the XML files you want to convert? |
|
04-12-2013, 09:38 PM | #4 |
Junior Member
Posts: 3
Karma: 10
Join Date: Apr 2013
Device: none
|
Thank you very much, you are wonderful people. I can't provide actual examples of the XML, given the proprietary nature of the content; however, I don't think any are necessary. You have both answered my question. I suppose I'm just less than comfortable working with XSLT at the moment, and didn't want to have to outsource the actual creation of the transformation, but the advice you've provided is in keeping with the existing workflow, and I have no reason to suspect I know better.
In the meantime, I will invest more time in getting comfortable with XSLT, and try and avoid interfering until I know what I'm talking about. Thanks again. |
04-13-2013, 07:22 PM | #5 |
Curmudgeon
Posts: 629
Karma: 1623086
Join Date: Jan 2012
Device: iPad, iPhone, Nook Simple Touch
|
If you are familiar with pretty much any programming language, you might find it easier to use a tree-based (DOM) XML parser, then walk the DOM tree and manipulate it in one or more passes, and write the result out to disk. It's pretty much just like manipulating HTML elements with JavaScript code, if you've ever done that.
If you don't know any programming languages, then XSLT is probably the better of those two choices. Then again, PHP, Perl, Python, and Ruby are all relatively easy to learn (with PHP being probably the most straightforward in terms of having a fairly lightweight and consistent syntax), so maybe it's time to pick up a new hobby. |
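To make the "parse, walk the tree, write it back out" idea above a bit more concrete, here is a minimal sketch in Python using only the standard library's xml.etree.ElementTree. The tag names in TAG_MAP and the sample input are purely illustrative assumptions (loosely NLM-flavoured), not taken from the original poster's files, and a real mapping would be much longer.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from source tags to (XHTML tag, CSS class).
# These entries are illustrative only, not a real NLM-to-XHTML mapping.
TAG_MAP = {
    "article-title": ("h1", "article-title"),
    "sec": ("div", "sec"),
    "italic": ("em", None),
}

def rename_elements(root, tag_map):
    """Walk every element and rename the ones listed in tag_map, in place."""
    for elt in root.iter():
        if elt.tag in tag_map:
            new_tag, css_class = tag_map[elt.tag]
            elt.tag = new_tag
            if css_class:
                elt.set("class", css_class)
    return root

# Made-up sample input; a real run would use ET.parse("file.xml").getroot().
source = "<sec><article-title>On <italic>E. coli</italic></article-title></sec>"
root = rename_elements(ET.fromstring(source), TAG_MAP)
result = ET.tostring(root, encoding="unicode")
print(result)
```

Because the source tag is kept as the class name, one shared stylesheet can style every converted book, provided the tagging is consistent.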
04-16-2013, 01:51 PM | #6 | |
Junior Member
Posts: 3
Karma: 10
Join Date: Apr 2013
Device: none
|
Quote:
Thanks so much for your help. |
|
04-16-2013, 06:45 PM | #7 | |
Wizard
Posts: 2,304
Karma: 12587727
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
You would then use this library to go through the XML (perhaps convert it into an array of strings), and from there, step through the array and make any changes as needed. For example, in one pass, you might change all <title> into <span class="title">:
Original:
Code:
<title>TitleofSong</title>
<artist>ArtistofSong</artist>
After "Title" pass:
Code:
<span class="title">TitleofSong</span>
<artist>ArtistofSong</artist>
After "Artist" pass:
Code:
<p><span class="artist">ArtistofSong</span>: <span class="title">TitleofSong</span></p>
Code:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title></title>
  <link href="../Styles/stylesheet.css" rel="stylesheet" type="text/css" />
</head>
<body>
  <p><span class="artist">ArtistofSong</span>: <span class="title">TitleofSong</span></p>
</body>
</html>
Without having a sample of what XML you are working with, I can only give general (not very helpful in my view) overviews. (Perhaps someone else might be more insightful?) |
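The "Title" pass and "Artist" pass described in this post could be sketched roughly as follows in Python with the standard library's xml.etree.ElementTree. The <song> wrapper element and the helper names rename_pass and song_to_paragraph are assumptions made up for this example (XML needs a single root element to parse), so treat it as one possible way to do the passes rather than the only one.

```python
import xml.etree.ElementTree as ET

def rename_pass(root, old_tag, new_tag, css_class):
    """One pass: turn every <old_tag> into <new_tag class="css_class">."""
    # Copy the matches into a list first so renaming cannot disturb iteration.
    for elt in list(root.iter(old_tag)):
        elt.tag = new_tag
        elt.set("class", css_class)

def song_to_paragraph(song):
    """Final pass: build <p>artist: title</p> from the renamed spans."""
    artist = song.find("span[@class='artist']")
    title = song.find("span[@class='title']")
    p = ET.Element("p")
    artist.tail = ": "   # text that follows the artist span
    title.tail = None    # drop the whitespace that followed <title>
    p.append(artist)
    p.append(title)
    return p

# <song> is a hypothetical wrapper added so the fragment has one root element.
song = ET.fromstring(
    "<song><title>TitleofSong</title> <artist>ArtistofSong</artist></song>"
)
rename_pass(song, "title", "span", "title")    # "Title" pass
rename_pass(song, "artist", "span", "artist")  # "Artist" pass
paragraph = ET.tostring(song_to_paragraph(song), encoding="unicode")
print(paragraph)
```

Keeping each pass as its own small function makes it easy to check the intermediate output after every step, which is the main attraction of the multi-pass approach.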
|
04-18-2013, 04:58 PM | #8 | |
Digital Amanuensis
Posts: 727
Karma: 1446357
Join Date: Dec 2011
Location: Turin, Italy
Device: Several eReaders and tablets
|
Quote:
Plus, there are DocBook -> EPUB (via XSLT) tools around, which you might find useful to start from. For example: http://sourceforge.net/projects/docbook/files/epub3/ http://www.ibm.com/developerworks/xm.../section5.html http://en.wikibooks.org/wiki/XQuery/DocBook_to_ePub |
|
04-18-2013, 05:03 PM | #9 |
Connoisseur
Posts: 53
Karma: 10
Join Date: Aug 2012
Location: Nashville, Tn
Device: ipad, Kindle Fire
|
Code:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title></title>
  <link href="../Styles/stylesheet.css" rel="stylesheet" type="text/css" />
</head>
<body>
  <p><span class="artist">ArtistofSong</span>: <span class="title">TitleofSong</span></p>
</body>
</html>
Code:
standalone="no"
Code:
standalone="yes" |
04-18-2013, 05:50 PM | #10 | ||
Wizard
Posts: 2,304
Karma: 12587727
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
Quote:
I am very interested in the long-term storage of ebooks, and storing them in a way which will make it (relatively) easy/automated to convert into other formats in the future. I just quickly pulled it out of an EPUB I was working on as an example to clarify the "multiple passes"... I wasn't trying to make the code as minimal as possible. |
||
04-19-2013, 11:06 AM | #11 |
Connoisseur
Posts: 53
Karma: 10
Join Date: Aug 2012
Location: Nashville, Tn
Device: ipad, Kindle Fire
|
No, that's cool. I just see it as a common practice, and I think people don't understand what they are doing. I just happened to see you did it, and I wanted to know the logic behind it. All good!!
|
04-19-2013, 11:03 PM | #12 | |
Curmudgeon
Posts: 629
Karma: 1623086
Join Date: Jan 2012
Device: iPad, iPhone, Nook Simple Touch
|
Quote:
For example, in PHP, you can do something like this:
Code:
<?php
// Plain DOM calls are enough for this example, so no extra library is loaded.
$doc = new DOMDocument();
if (!$doc->loadXML(file_get_contents("/path/to/file.xml"))) {
    // handle error
    exit(1);
}

/* Useful function: replaces $elt with a copy that has a new tag name
   and returns the new element. */
function changeElementTagName($elt, $newTagName) {
    $newelt = $elt->ownerDocument->createElement($newTagName);
    // Clone the element's attributes
    foreach ($elt->attributes as $attribute) {
        $newelt->setAttribute($attribute->name, $attribute->value);
    }
    // Clone the element's content
    foreach ($elt->childNodes as $child) {
        $newelt->appendChild($child->cloneNode(true));
    }
    // Replace the node in the tree
    $elt->parentNode->replaceChild($newelt, $elt);
    return $newelt;
}

// getElementsByTagName() returns a live list, so copy it before changing the tree
foreach (iterator_to_array($doc->getElementsByTagName("title")) as $titleelt) {
    $newelt = changeElementTagName($titleelt, "span");
    // Set the class on the new element; the old one is no longer in the tree
    $newelt->setAttribute("class", "title");
}

$outputstring = $doc->saveXML();
print $outputstring;
?> |
|
Tags |
conversion, indesign, journal articles, xslt |
|