Old 01-18-2014, 12:44 PM   #76
unboggling
Wizard
Posts: 1,065
Karma: 858115
Join Date: Jan 2011
Device: Kobo Clara, Kindle Paperwhite 10
Quote:
Originally Posted by theducks View Post
<br> (and others) is no close needed

To help parsing and to avoid confusion, all tags must close. Some can self-close, like:
Code:
<br />
<hr />
Thanks for the clarification. Ah, "self-closing tag". Nice to learn the correct term.

Yeah, I did know about most html tags needing a closing tag. To my mind self-closing tags mixed with non-self-closing tags in search/replace scenarios are quite confusing.

Last edited by unboggling; 01-18-2014 at 01:06 PM.
Old 01-18-2014, 01:15 PM   #77
theducks
Well trained by Cats
Posts: 30,454
Karma: 58055868
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by unboggling View Post
Thanks for the clarification. Ah, "self-closing tag". Nice to learn the correct term.

Yeah, I did know about most html tags needing a closing tag. To my mind self-closing tags mixed with non-self-closing tags in search/replace scenarios are quite confusing.

Why does it matter to a S&R if they are mixed in?

BTW, I have seen a non-pretty version, so a S&R regex for a BR with no class is:
<br\s*/>
Calibre assigns a class (and includes the space), so for those:

<br class="calibre\d+" />
Old 01-18-2014, 01:18 PM   #78
unboggling
Quote:
Originally Posted by theducks View Post

Why does it matter to a S&R if they are mixed in

BTW I have seen a non-pretty version, so a S&R REGEX for a BR
<br\s*/> no class
Calibre assigns a class (and includes the space)

<br class="calibre\d+" />
It's that simple? You just replace <br\s*/> with <br class="calibre\d+" />

Did you mean that either of those is the Search regex, depending on whether the book was previously converted by calibre or not?

What is the Replace regex to end up with
</p> closing prior paragraph and <p> opening next paragraph? Or whatever else is the best way to do it?

(I'm unskilled with regex too.)

Last edited by unboggling; 01-20-2014 at 06:03 AM. Reason: undeleted a previously deleted previous strike-out edit. (Because it was quoted below.)
Old 01-18-2014, 01:51 PM   #79
theducks
Quote:
Originally Posted by unboggling View Post
It's that simple? You just replace <br\s*/> with <br class="calibre\d+" />

Did you mean either of those are the Search regex, depending on whether book was previously converted by calibre or not?

What is the Replace regex to end up with
</p> closing prior paragraph and <p> opening next paragraph? Or whatever else is the best way to do it?
Here is one that finds either (and the group does not count as a capture):

Code:
<br(?:\sclass="calibre\d+")*\s*/>
NB: it took me a while; I forgot to include the space before the optional calibre class.
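A quick way to check which tag forms the pattern catches is to run it against a few samples. This is only a sketch, assuming Python-style regular expressions; the sample tags are made up for illustration.

```python
import re

# Pattern from the post above: a self-closed <br> with or without a calibre class.
# (?:...) groups without capturing; \s* allows "<br/>" as well as "<br />".
BR_PATTERN = re.compile(r'<br(?:\sclass="calibre\d+")*\s*/>')

# Hypothetical sample tags, not taken from any real book.
samples = [
    '<br/>',
    '<br />',
    '<br class="calibre3" />',
    '<br class="calibre12"/>',
    '<br>',  # unclosed HTML form: this pattern requires the "/>"
]

matches = {tag: bool(BR_PATTERN.fullmatch(tag)) for tag in samples}
for tag, hit in matches.items():
    print(f'{tag!r}: {hit}')
```

Only the last sample fails to match. Since the class attribute appears at most once, `?` in place of `*` after the group would probably be a slightly tighter choice, but both behave the same on these inputs.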
Old 01-18-2014, 01:55 PM   #80
theducks
UB

Code:
</p> <p class="current">
is the replace. You need to customize the opening P class ("current", shown in orange) to match the current P usage in the book.
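Put together, the search and the replace could look something like this outside calibre. It is a minimal sketch, assuming Python's `re` module; the function name `breaks_to_paragraphs` and the class `calibre1` are placeholders for illustration, not anything calibre defines.

```python
import re

# Search pattern from earlier in the thread.
BR_PATTERN = re.compile(r'<br(?:\sclass="calibre\d+")*\s*/>')

def breaks_to_paragraphs(html, p_class="calibre1"):
    """Replace each matched <br /> with a paragraph close/open pair.

    p_class is a placeholder: use whatever class the book's <p> tags carry.
    """
    replacement = f'</p> <p class="{p_class}">'
    # A function replacement avoids any backslash processing of the string.
    return BR_PATTERN.sub(lambda m: replacement, html)

before = ('<p class="calibre1">First line.<br class="calibre2" />'
          'Second line.<br/>Third.</p>')
after = breaks_to_paragraphs(before)
print(after)
```

Because the original paragraph already has an opening `<p>` and a closing `</p>`, swapping each break for `</p> <p ...>` leaves every paragraph properly opened and closed.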
Old 01-18-2014, 02:18 PM   #81
unboggling
Quote:
Originally Posted by theducks View Post
UB

Code:
</p> <p class="current">
is the replace You need to customize the opening P class (orange) to match current P usage
theducks, thank you!!! I finally understand. My confusion about that was one of the stumbling blocks to fixing books with tools like Sigil, Edit Book, or a simple text editor.



(And it was so easy to fix that in an RTF opened in MS Word: in "Advanced Find & Replace", usually just replace ^l with ^p^t (replace linefeed with paragraph tab), then save as DOCX.)

Last edited by unboggling; 02-12-2014 at 12:09 PM. Reason: clarify
Old 01-20-2014, 04:57 AM   #82
unboggling
Quote:
Originally Posted by LadyKate View Post
...One of the first steps after clearing up the excess spans and font settings in an html file is to check for paragraph markings.

A book that has <br> or <br/> as a means of separating paragraphs of text is not going to allow you to use calibre to indent the first line of paragraphs....
I see what LadyKate meant about excess span and font tags in some old formats.

I've been looking at some raw book formats: unfixed original downloaded files that I kept separately outside of calibre. Specifically these were original format files that I copied into calibre way back when, then had fixed the calibre copy with the method RTF -> Word (advanced find & replace) -> DOCX.

I added the raw unfixed originals into calibre again as separate duplicate records, converted them to EPUB, and looked at them in Edit Book.

So I saw what LadyKate was talking about. For the most part these formats were extravagantly riddled with excess span and font tags. (That was boggling. I didn't even try to fix them in Edit Book; I hadn't a clue where to start. There seemed to be more html tags than content text.) So, like LadyKate said, that usage of spans is another common problem, in addition to break tags being used instead of paragraph tags. Because I had habitually fixed things in RTF in Word, the specific nature of the html problems had been invisible to me.

In the conversion of original to EPUB, calibre had added its own classes to that span mishmash as best it could. Which seemed to make the span multitude harder to deal with.

But I'm just starting to learn about this stuff on the html side. And don't really know what I'm doing there yet.

Meanwhile, I was really looking for an old raw file with a lot of break tags so I could play with theducks' search/replace regex in an html editor or Edit Book. Didn't find any of those; got distracted by the formats with the span problem.

Last edited by unboggling; 01-20-2014 at 12:01 PM. Reason: minor clarification.
Old 01-25-2014, 11:52 AM   #83
LadyKate
Fanatic
Posts: 515
Karma: 1470724
Join Date: Jul 2013
Location: Quebec CA
Device: android 4 (samsung tablet and asus tablet)
Quote:
Originally Posted by unboggling View Post
I'm confused.

I'm not much good with html and usually don't fix books by messing with html tags, but I would've thought I'd want most of those <br> tags replaced with </p><p>

I thought <br> didn't have a closing tag? Or is <br/> an alternate form of <br> ?

First, in XHTML all tags need to be closed. So <br> becomes <br />

Now, the search string I use finds a break <br> followed by a lowercase letter. That indicates a break that is not a paragraph marker, just someone putting in a hard return because they want the line to "look pretty" lol.

I often use regex to search for patterns that indicate a line end is not a paragraph end, before I put in the paragraphs.
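The exact search string isn't given here, so the following is only a guess at the idea, assuming Python's `re` module: match a break that is followed by a lowercase letter and join the lines with a space, leaving every other break alone for the paragraph pass. The names `SOFT_BREAK` and `join_soft_breaks` are made up for this sketch.

```python
import re

# A <br>, <br/>, <br /> or calibre-classed break, then optional whitespace,
# then a lowercase ASCII letter. The lookahead (?=[a-z]) tests the letter
# without consuming it, so the letter itself is left untouched.
SOFT_BREAK = re.compile(r'<br(?:\sclass="calibre\d+")?\s*/?>\s*(?=[a-z])')

def join_soft_breaks(html):
    """Replace mid-sentence breaks with a single space."""
    return SOFT_BREAK.sub(' ', html)

text = ('The rain fell<br />on the roof.<br />'
        'She closed the window<br/>and sat down.')
joined = join_soft_breaks(text)
print(joined)
```

The break before "She" survives because the next letter is uppercase. The test is a heuristic: a paragraph that starts with a lowercase word, a quotation mark, or an accented letter would need a different check.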
Old 01-25-2014, 12:05 PM   #84
LadyKate
Quote:
Originally Posted by unboggling View Post
I see what LadyKate meant about excess span and font tags in some old formats.

I've been looking at some raw book formats: unfixed original downloaded files that I kept separately outside of calibre. Specifically these were original format files that I copied into calibre way back when, then had fixed the calibre copy with the method RTF -> Word (advanced find & replace) -> DOCX.

I added the raw unfixed originals into calibre again as separate duplicate records, converted them to EPUB, and looked at them in Edit Book.

So I saw what LadyKate was talking about. For the most part these formats were extravagantly riddled with excess span and font tags. (That was boggling. I didn't even try to fix them in Edit Book, hadn't a clue where to start. There seemed to be more html tags than content text.) So, like LadyKate said, that usage of spans is another common thing, in addition to the break tag instead of paragraph tags thing. In the past habitually fixing things in RTF in Word, the specific nature of the html problems had been invisible to me.

In the conversion of original to EPUB, calibre had added its own classes to that span mishmash as best it could. Which seemed to make the span multitude harder to deal with.

But I'm just starting to learn about this stuff on the html side. And don't really know what I'm doing there yet.

Meanwhile, I was really looking for an old raw file with a lot of break tags so I could play with theduck's search/replace regex in html editor or Edit Book. Didn't find any of those, got distracted by the formats with span problem.
The problem is with the way word processors (and RTF editors are just another form of word processor) work.

I have not seen a word processor since the days of the old DOS versions of WordPerfect that shows the codes inserted to change the look and feel of the document created.

Every time you make a change, even if you don't complete it, a code is inserted. You change to italic, change your mind, remove the two characters typed, change the color, etc., and it leaves more font changes, spans, and color changes than text.

While you only see the result of all these changes in a WYSIWYG word processor or web page creator, the documents they produce are only as good as the underlying code.
Old 01-25-2014, 12:06 PM   #85
theducks
Quote:
Originally Posted by LadyKate View Post
Quote:
Originally Posted by unboggling View Post
I'm confused.

I'm not much good with html and usually don't fix books by messing with html tags, but I would've thought I'd want most of those <br> tags replaced with </p><p>

I thought <br> didn't have a closing tag? Or is <br/> an alternate form of <br> ?

First, in XHTML all tags need to be closed. So <br> becomes <br />

Now the search string I use is to find a break <br> followed by lowercase alphabetic. That will indicate a break that is not a paragraph marker but just someone putting the hard return because they want the line to "look pretty" lol.

I use regex often in searching for patterns that indicate the line end is not a paragraph end before I put in the paragraphs,
<br> (like <hr>) has assumed closure (there was no </br>), which probably made parsing a headache as html evolved.

<br /> is proper in XHTML.
Old 01-25-2014, 02:55 PM   #86
unboggling
This recent discussion highlights a fundamental difference in approaches to fixing formatting problems in ebooks. These approaches seem both skill-driven and assumption-driven.

One assumption is that fixing formatting problems with regular expressions at code level is the best approach. I've noticed this is an assumption common to programmers, web designers, ebook designers, advanced calibre users. Particularly when technical knowledge and skills (regular expressions, HTML/XHTML, CSS) are at the higher end of the learning curve, it is easy to make the assumption. I guess that demographically this is a small (but vocal) minority of the total calibre user population.

A different assumption is that fixing formats above code level in a word processor works well enough. Particularly when technical knowledge and skills (regular expressions, HTML/XHTML, CSS) are at the lower end of the learning curve, it is easy to make this assumption. I made this assumption. I extrapolate that some other calibre users share this assumption, and guess that demographically this is a larger (but quieter) minority of the total calibre user population.

Consider those span-riddled original formats I looked at the other day. I had eliminated annoying formatting problems from copies of them a couple years ago with the method: EPUB -> RTF -> fix in Word or Open Office Writer -> DOCX or ODT -> EPUB. About 3 minutes time each. Two years later, having learned a lot since then, looking at the morass of HTML and XHTML and CSS tags in those span-plagued original formats, it seems that in Edit Book now it would take much longer to fix each format at code level, even if I knew how. Same with fixing them at code level outside calibre in a programmer-oriented editor.

So the first conversion to RTF blew away the ToC links — so what? — that's quickly fixable in calibre ToC Editor after conversion to EPUB, if not fixed already by the conversion-applied XPath expression. So the "fixed" EPUB contains unnecessary tags I didn't see while editing with word processor, and is larger in filesize than if it had been fixed cleanly at code level — so what? — I don't see those unnecessary tags when reading the book, and sufficient cheap storage is available to accommodate larger files.

Assumptions aside, approach and method to fix formatting problems depend on need, constrained by knowledge/skill level. From the point of view of an ebook consumer reading for enjoyment, I would ignore the technical aspect of ebooks, except for the need to fix formatting problems that annoy me, the quickest way possible at my current knowledge/skill level. From the point of view of an ebook designer, maybe I would want the underlying code to be clean.

But I'm not an ebook designer. I'm an ebook consumer, who likes reading books more than fixing books.

Last edited by unboggling; 02-01-2014 at 02:29 AM. Reason: clarify, change to more precise or correct technical terms, fix typos.
Old 01-25-2014, 09:37 PM   #87
unboggling
@LadyKate and theducks, thank you for the clarifications.

I'd completely forgotten about HTML vs XHTML.


Last edited by unboggling; 01-25-2014 at 09:42 PM.
Old 01-31-2014, 11:08 PM   #88
LadyKate
Quote:
Originally Posted by unboggling View Post
This recent discussion highlights a fundamental difference in approaches to fixing formatting problems in ebooks. These approaches seem both skill-driven and assumption-driven.

One assumption is that fixing formatting problems with regular expressions at code level is the best approach. I've noticed this is an assumption common to programmers, web designers, ebook designers, advanced calibre users. Particularly when technical knowledge and skills (regular expressions, HTML/XHTML, CSS) are at the higher end of the learning curve, it is easy to make the assumption. I guess that demographically this is a small (but vocal) minority of the total calibre user population.

A different assumption is that fixing formats above code level in a word processor works well enough. Particularly when technical knowledge and skills (regular expressions, HTML/XHTML, CSS) are at the lower end of the learning curve, it is easy to make this assumption. I made this assumption. I extrapolate that some other calibre users share this assumption, and guess that demographically this is a larger (but quieter) minority of the total calibre user population.

Consider those span-riddled original formats I looked at the other day. I had eliminated annoying formatting problems from copies of them a couple years ago with the method: EPUB -> RTF -> fix in Word or Open Office Writer -> DOCX or ODT -> EPUB. About 3 minutes time each. Two years later, having learned a lot since then, looking at the morass of HTML and XHTML and CSS tags in those span-plagued original formats, it seems that in Edit Book now it would take much longer to fix each format at code level, even if I knew how. Same with fixing them at code level outside calibre in a programmer-oriented editor.

So the first conversion to RTF blew away the ToC links — so what? — that's quickly fixable in calibre ToC Editor after conversion to EPUB, if not fixed already by the conversion-applied XPath expression. So the "fixed" EPUB is riddled with unnecessary tags I didn't see while editing with word processor, and is larger in filesize than if it had been fixed cleanly at code level — so what? — I don't see those unnecessary tags when reading the book, and sufficient cheap storage is available to accommodate larger files.

Assumptions aside, approach and method to fix formatting problems depend on need, constrained by knowledge/skill level. From the point of view of an ebook consumer reading for enjoyment, I would ignore the technical aspect of ebooks, except for the need to fix formatting problems that annoy me, the quickest way possible at my current knowledge/skill level. From the point of view of an ebook designer, maybe I would want the underlying code to be clean.

But I'm not an ebook designer. I'm an ebook consumer, who likes reading books more than fixing books.
I find the spans and "hard page breaks" totally irritating.

I usually do a total cleanup for favorite authors lol but seem to have quite a problem doing a conversion from pdf or whatever without touching on correcting them at least a little.

One disappointing thing is that HTML BOOK FIXER, while it removes all the spans etc., also removes the formatting of <span class="italic"> or bold or whatever lol. Sometimes I wonder how important those italics are versus an unreadable book.
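For anyone who wants to keep the italics, a selective cleanup is possible in principle. The sketch below is not how HTML BOOK FIXER works; it is a hypothetical approach using Python's standard `html.parser`, and the class names in `KEEP_CLASSES` are assumptions based on the example above. It unwraps every span except those carrying a class worth keeping, and is meant for body fragments only (comments and doctypes are not copied through).

```python
from html.parser import HTMLParser

KEEP_CLASSES = {"italic", "bold"}  # assumed names; adjust to the book's CSS

class SpanStripper(HTMLParser):
    """Drop <span> wrappers unless their class is in KEEP_CLASSES."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.span_kept = []  # one bool per open <span>: was it kept?

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            classes = set((dict(attrs).get("class") or "").split())
            keep = bool(classes & KEEP_CLASSES)
            self.span_kept.append(keep)
            if not keep:
                return  # unwrap: emit nothing for this opening tag
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closed tags such as <br /> pass through unchanged.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "span" and self.span_kept and not self.span_kept.pop():
            return  # closing tag of a span that was unwrapped
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

def strip_spans(html):
    parser = SpanStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)

sample = ('<p class="calibre1"><span class="calibre5"><span class="font2">He said '
          '<span class="italic">never</span> again.</span></span><br /></p>')
cleaned = strip_spans(sample)
print(cleaned)
```

A parser with a stack is used instead of a regex because spans nest, and a regex cannot reliably pair each closing tag with its own opening tag.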
Old 02-01-2014, 07:44 AM   #89
unboggling
Quote:
Originally Posted by LadyKate View Post
I find the spans and "hard page breaks" totally irritating.

I usually do a total cleanup for favorite authors lol but seem to have quite a problem doing a conversion from pdf or whatever without touching on correcting them at least a little.

One disappointing thing is that HTML BOOK FIXER while it removes all the spans etc also removes the formatting of <span class="italic"> or bold or whatever lol. Sometimes I wonder how important those italics are versus an unreadable book.
I only clean up formatting problems that will affect readability or annoy me more than mildly. That is the minimum standard of formatting quality for books allowed to remain in my library. That includes books by all authors, even favorite authors. I usually ignore other problems, which reduces clean up time considerably. Like I said, I'd rather read than fix formatting.

I fix hard page breaks if they affect readability, such as a page break after each ToC item, or a page break between "Chapter n" and an associated chapter title. I get rid of the offending page breaks in RTF or DOCX in Word by replacing "^m" with nothing, and let calibre insert page breaks automatically during conversion later. Is fixing page breaks more complicated in HTML/XHTML? They should be handled in CSS, yes?

I have PDFs only if I could find no better format, all nonfiction used infrequently for reference rather than reading start-to-finish. I prefer to be annoyed at their headers/footers than spend time eliminating them and other problems after conversion from PDF, so I don't bother to convert nonfiction PDFs. They are the only exception to my "no more than mildly annoying" rule. I have no fiction PDFs anymore. I gradually replaced them with better formats instead of converting and fixing them, except for a few I fixed that were unavailable in different format. At present PDFs are an infrequent annoyance.

I just checked statistics in my library. Less than 0.5% of the book formats are rated "mildly annoying", and 75% of those are advance reader copies with no specific annoying formatting, rated "mildly annoying" on general principle. "Mildly annoying" is the worst rating currently in the library, excluding placeholders with no format. Every other book format is rated "no annoyance". Less than 0.1% of the book formats are PDF; I rate them on relative annoyance of specific formatting problems, with a little slack due to unavailability of better formats, and ignore my strong annoyance at the mere existence of PDFs in my library. All other formats are EPUB.

What is mildly annoying to me may not annoy someone else. Or what doesn't annoy me may annoy someone else. Or any of various possible formatting quirks or problems may spark (in any of different people) anger, rage, or despair that drives a fix-formatting frenzy or, almost inconceivably, a retreat to paper books. But paper books may have formatting problems too, such as folded mis-cut corners or pages with blurred ink. An alternative is audio books, but what if the narrator conveys inappropriate emotion at inopportune moments, or inadvertently skips or misreads a few important words or sentences, or the volume level fluctuates between barely audible and loud?

Apparently there is no reprieve from either (1) suffering negative emotion in response to perceived/judged problems in formatting, or (2) fixing formatting to reduce the frequency and intensity of formatting-instigated negative emotion.

Last edited by unboggling; 02-02-2014 at 05:27 PM. Reason: clarify; statistics; page breaks; rambling.
Old 02-07-2014, 10:48 PM   #90
LadyKate
Quote:
Originally Posted by copyrite View Post
I bet, theducks, your pretty kitty was not active on FidoNet. Woof!

(I am showing my age.)
Had to comment. I remember setting up a fidonet on my first ibm clone with dual floppies and no hard drive. (and the first hard drive I bought for more than I spent recently on my 4TB usb was only 10meg lol).

Sorry all, just feeling my age here