Old 03-20-2010, 02:17 PM   #1
pedz
Nameless Being
 
Load of html debugging

I was able to load and convert a small HTML file with calibre, but I could not even load this HTML file: HTML 5

So far, no reader really likes that file. Is there any way to get some kind of debug output from calibre so I can find the section it does not like and remove it?

Thank you,
Perry
Old 03-20-2010, 10:45 PM   #2
kovidgoyal
creator of calibre
Conversion logs are available by clicking the rotating spinner in the bottom right corner.
Old 03-23-2010, 12:17 PM   #3
pedz
Nameless Being
 
Either you misunderstood me or I am misunderstanding you.

The error is not when I am doing the conversion. It is when I am "adding" the book. e.g. I go to Add Book and click the top menu item. Then pick the html source and hit Open. I get a dialog box that says "Adding..." and it is blank except for "Add...". After a few minutes, it stops with an error message that reads: ERROR: Adding failed: The add books process seems to have hung. Try restarting calibre and adding the books in smaller increments, until you find the problem book.

The HTML I am trying to load is the one linked in my original post. I'm happy to try to help debug this.
Old 03-23-2010, 02:04 PM   #4
Starson17
Wizard
Quote:
Originally Posted by pedz View Post
The error is not when I am doing the conversion. It is when I am "adding" the book.
I looked at this, and like you, I had trouble adding it. My first thought was that Calibre was hanging when reading metadata from the file, so I turned on the option to get metadata from the filename. That didn't help. I don't know if Calibre still opens the file even if that option is set.

My second thought is that Calibre will zip up the file and all linked images, etc. it needs. I suspect that's where it's hanging - trying to find additional files it needs. Why don't you try to zip it up manually, then add the zip file to Calibre. If that works, try to view it.

If all that works, you could try splitting the html up into pieces to find the piece causing the problem. If there's a log that shows errors found during the add process, I don't know where it's located.
Old 03-23-2010, 02:23 PM   #5
pedz
Nameless Being
 
Progress...

I wrote a script to remove all the comments. The comments being defined as:

<!-- xxxxx -->
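
For anyone hitting the same hang, a comment-stripping filter along these lines is one way to do it. This is only a sketch, not the script actually used in this post: the function name `strip_comments` is made up for illustration, it assumes a POSIX awk, and it treats every `<!-- ... -->` span as a comment, including any that happen to sit inside script or CDATA sections.

```sh
#!/bin/sh

# strip_comments: read HTML on stdin, write it to stdout with every
# <!-- ... --> span removed. Comments may span several lines.
# (Hypothetical helper; naive about comment-like text inside scripts.)
strip_comments() {
  awk '
    {
      line = $0
      out = ""
      while (length(line) > 0) {
        if (in_comment) {
          # Look for the end of the comment on this line.
          close_at = index(line, "-->")
          if (close_at == 0) {
            line = ""                      # whole rest of line is comment
          } else {
            line = substr(line, close_at + 3)
            in_comment = 0
          }
        } else {
          # Look for the start of the next comment.
          open_at = index(line, "<!--")
          if (open_at == 0) {
            out = out line                 # no comment: keep the rest
            line = ""
          } else {
            out = out substr(line, 1, open_at - 1)
            line = substr(line, open_at + 4)
            in_comment = 1
          }
        }
      }
      print out
    }
  '
}

# Usage (file names are placeholders):
#   strip_comments < original.html > out.html
```

Lines that consisted only of a comment are left behind as empty lines, which keeps line numbers in the output aligned with the original file.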

I can now Add the file. When I try and generate the eBook, I get this error message
Old 03-23-2010, 02:33 PM   #6
Starson17
Wizard
Quote:
Originally Posted by pedz View Post
Progress...

I wrote a script to remove all the comments. The comments being defined as:

<!-- xxxxx -->

I can now Add the file. When I try and generate the eBook, I get this error message
Take a look at the EPUB conversion output options. There's a split size parameter there that IIRC was in the 280 range. Your log indicates a split failure that's just a bit larger than that in the 300 range. I have no idea if this is even the same parameter, but try increasing it to 400 and see if that helps. It will only take a moment to test.
Old 03-23-2010, 02:42 PM   #7
pedz
Nameless Being
 
Ok thanks.

Can you help me understand the debug output? e.g. Split point: {http://www.w3.org/1999/xhtml}div /*/*[2]/*[4]

I'm thinking maybe I can modify the source as an alternative, but I don't understand where that is pointing me.

Thanks again.
Old 03-23-2010, 02:58 PM   #8
pedz
Nameless Being
 
Quote:
Originally Posted by Starson17 View Post
Take a look at the EPUB conversion output options. There's a split size parameter there that IIRC was in the 280 range. Your log indicates a split failure that's just a bit larger than that in the 300 range. I have no idea if this is even the same parameter, but try increasing it to 400 and see if that helps. It will only take a moment to test.
Ok. The input file is 4Meg so I moved this up to 5Meg and it got through the processing and I'm able to view it in the viewer on my computer (Mac). I forgot my "nook" at home so I can not see if the nook can open it but from the comments of that parameter, I fear that it will not be able to.

Perhaps Kovid can comment on how best to set this option for huge HTML files and what exactly the "splitting" does.
Old 03-23-2010, 11:58 PM   #9
DoctorOhh
US Navy, Retired
Quote:
Originally Posted by pedz View Post
Ok. The input file is 4Meg so I moved this up to 5Meg and it got through the processing and I'm able to view it in the viewer on my computer (Mac). I forgot my "nook" at home so I can not see if the nook can open it but from the comments of that parameter, I fear that it will not be able to.
This setting is reader specific. Wallcraft talks about Sony's limit for the PRS-505.

Quote:
Originally Posted by wallcraft View Post
For most authors, Adobe DE's requirement (for the Sony and, eventually, other handheld devices) of < 300 KB per ePub XML file is easily met by using one file per chapter.
Inside each EPUB (which is just a zip file at heart) are the XML files that make up the book. If any one segment is larger than 300 KB, the PRS-505 will choke on the book. Handling the book in chunks makes it easier for low-powered devices to process.

Let us know if this works OK on your Nook.
Old 03-24-2010, 12:18 AM   #10
pedz
Nameless Being
 
I'm still having trouble. You are correct. If I set the value too big, the nook can not open the book.

I've switched to the command line interface so I can repeat my tests. My script currently looks like:

Code:
#!/bin/sh

# XPath expressions that contain single quotes are wrapped in double quotes;
# nesting single quotes would make the shell strip the inner ones.
/usr/bin/ebook-convert \
  out.html \
  test.epub \
  -v -v \
  --output-profile nook \
  --max-levels 0 \
  --flow-size 300 \
  --chapter "//*[name()='h2' or name()='h3']" \
  --chapter-mark pagebreak \
  --page-breaks-before "//*[name()='h2' or name()='h3']" \
  --level1-toc '//h:h2' \
  --level2-toc '//h:h3' \
  --level3-toc '//h:h4' \
  --language en \
  --authors "Ian Hickson, David Hyatt, et al." \
  --pubdate "$( date '+%b %d, %Y' )" \
  --publisher WhatWg.org
I run the above script and then I unzip the test.epub file into its own directory. The files range up to 343153 bytes. 300 * 1024 is only 307200.
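
The size check described here (343153 bytes against a limit of 300 * 1024 = 307200) can be repeated after each run with a small helper. This is a rough sketch under stated assumptions: `list_oversized` is a made-up name, the EPUB is assumed to have already been unzipped into a directory, and file names are assumed not to contain newlines.

```sh
#!/bin/sh

# list_oversized DIR LIMIT_KB
# Print "<bytes> <path>" for every regular file under DIR whose size
# exceeds LIMIT_KB * 1024 bytes. (Hypothetical helper for illustration.)
list_oversized() {
  dir=$1
  limit_bytes=$(( $2 * 1024 ))
  find "$dir" -type f | while IFS= read -r f; do
    # The arithmetic expansion strips the padding some wc versions print.
    size=$(( $(wc -c < "$f") ))
    if [ "$size" -gt "$limit_bytes" ]; then
      echo "$size $f"
    fi
  done
}

# Usage (directory name is a placeholder):
#   list_oversized test_epub_unzipped 300
```

An empty result means every extracted file is within the limit that was passed in.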

The other problem is I am only getting a total of 40 files but if I grep the source for h2 and h3 tags, I hit about 126 of them. So, I'm not understanding how to force things into smaller pieces.
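
The heading count mentioned above can be reproduced with something like the following. It is a sketch rather than the exact grep used in this post: `count_headings` is a made-up name, it relies on `grep -o` (available in GNU, BSD and BusyBox grep but not required by POSIX), and it only counts opening tags that are followed by whitespace or `>` on the same line.

```sh
#!/bin/sh

# count_headings FILE
# Print the number of opening <h2> and <h3> tags in FILE, case-insensitively.
# Closing tags such as </h2> are not counted because they start with "</".
count_headings() {
  echo $(( $(grep -o -i '<h[23][[:space:]>]' "$1" | wc -l) ))
}

# Usage (file name is a placeholder):
#   count_headings out.html
```

Comparing this number with the number of files in the unzipped EPUB shows how many headings did not produce a split.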

If I remove the flow-size option, the converter dies with a "tree" that is too big.
Old 03-24-2010, 12:30 AM   #11
kovidgoyal
creator of calibre
You need a flow size limit of 260. A "tree too big" error means that somewhere in your HTML files there is one with a lot of unstructured text in which calibre cannot find a decent point to split.
Old 03-24-2010, 12:30 AM   #12
DoctorOhh
US Navy, Retired
I am at a loss. The only time I ran into this, I was fortunate enough to be able to use a different source file.

You might want to review this thread over in the Sigil forum. Sigil is an epub editor and it sounds like they are talking about the same subject.

Good Luck!
Old 03-24-2010, 02:20 PM   #13
pedz
Nameless Being
 
Quote:
Originally Posted by kovidgoyal View Post
You need a flow size limit of 260. A "tree too big" error means that somewhere in your HTML files there is one with a lot of unstructured text in which calibre cannot find a decent point to split.
Does the debug give me a clue as to where this point in the source is?

I'm trying to make each h2 or h3 start a new file but I can't seem to do that.
Old 03-24-2010, 11:28 PM   #14
kovidgoyal
creator of calibre
Look at the conversion log. It contains plenty of information about which file is currently being split.
Old 03-27-2010, 01:30 AM   #15
pedz
Nameless Being
 
So, I finally have all my parts under 260 KB, but the book still will not open on the nook. When I hit the open button, it flashes a few times and then goes back to the list of books.

Any suggestions of how to debug this at this point?

Thank you