MobileRead Forums > E-Book Formats > Kindle Formats

Old 04-01-2009, 02:13 PM   #16
Nate the great
Sir Penguin of Edinburgh
Posts: 12,375
Karma: 23555235
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
Quote:
Originally Posted by =X=
Yes I would be very interested.

=X=
If you will post a before and after, I'll take a look.
Old 04-02-2009, 12:44 AM   #17
brewt
Boo-Frickety-Hoo-Erizer
Posts: 251
Karma: 686
Join Date: Oct 2007
Device: Kobo Glo HD!
Did someone say Word?

(Bahoo-hoohoo-haha-haha).

"Clean" html from Word isn't all that possible. Now, this isn't to say one can't use Word to produce "viable" files that can (and do) convert well into ebook formats. But "clean"? Noo, not in my experience.

Personally, I got over being clean. Most of the time I am happy to let Word mangle the styles it embeds as "css" in the html file all it wants. There's just waaay too much other usefulness in Word for my fear of evil to win out.

Saving the file as [Web Page, Filtered] goes a long way toward stripping out the extra junk Word implants by default. That junk is all well and fine if you need to reconstruct an actual Word document, with all of Word's formatting tricks intact, from the html file. But that isn't the usual goal here. MobiCreator, if you import a real Word document, converts it to a filtered html file before it converts it to a mobi file.

If you just have to use CSS in Word, remember that it's CSS 1 only, and there are oddities in CSS you can't construct properly using Word (see my thread in the epub forum about using css in Word to make a drop cap work in an epub - it doesn't look like a drop cap in Word, but it works out ok in the epub). And Word is all too eager to impose its own changes on the styles embedded in the html file - just TRY to redefine "Normal" in css without forcing it into normal.dot and see what happens in your html file.

Unless you intend to hand-re-edit the htm file after you've made it in Word, what do you care if it's "clean"? Is it oh-so-much smaller? Is it really worth your time? Wouldn't better metatags be more useful in the long run when the formats change (again)? Or more care toward managing your stylesets and where to use them?
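
To give a feel for the kind of junk in question: full Word html typically carries conditional comments (`<!--[if gte mso 9]>`), `<o:p>` tags and `mso-` style properties. Below is a rough Python sketch that strips those three things with regular expressions. It is not what Word's filtered save or MobiCreator does internally, the function name `strip_word_residue` is made up for illustration, and regexes are a blunt tool for html, so treat it as a demonstration rather than a cleaner to rely on.

```python
import re


def strip_word_residue(html: str) -> str:
    """Remove a few kinds of Word-only markup from an HTML string.

    Illustrative only: covers three common patterns, not everything
    Word can emit.
    """
    # Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(
        r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->",
        "",
        html,
        flags=re.DOTALL | re.IGNORECASE,
    )
    # Office-namespace tags such as <o:p> and </o:p> (tags only, text is kept)
    html = re.sub(r"</?o:[^>]*>", "", html, flags=re.IGNORECASE)
    # mso-* declarations inside style attributes or style blocks
    html = re.sub(
        r"mso-[a-z\-]+\s*:\s*[^;\"'}]+;?\s*",
        "",
        html,
        flags=re.IGNORECASE,
    )
    return html
```

Running it over a [Full Web Page] export and comparing the length before and after gives a quick estimate of how much of the file is Word-only markup.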

pennypenny.

-bjc

p.s. be sure to check the Word document properties - you might be surprised to find that Word is embedding your work computer name, company name, logon name, things you maybe really don't want in the html meta-info. If, you know, you use work machines to do any of this.
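
A quick way to check an exported file for this, sketched in Python. It assumes the properties show up either as ordinary `<meta name=... content=...>` tags or as `<o:Author>`, `<o:LastAuthor>` and `<o:Company>` elements in the header, which is how Word exports tend to carry them. The exact layout depends on the Word version and save option, and the names `find_meta` and `find_word_properties` are hypothetical.

```python
import re
from html.parser import HTMLParser

# Property elements to look for; an assumed, non-exhaustive list.
PROPERTY_NAMES = ("Author", "LastAuthor", "Company")


class MetaCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        if name:
            self.meta[name] = attr_map.get("content") or ""


def find_meta(html: str) -> dict:
    """Return {name: content} for every named <meta> tag."""
    collector = MetaCollector()
    collector.feed(html)
    collector.close()
    return collector.meta


def find_word_properties(html: str) -> dict:
    """Return any <o:Author>-style values found anywhere in the text,
    including inside conditional comments."""
    found = {}
    for name in PROPERTY_NAMES:
        match = re.search(
            rf"<o:{name}>(.*?)</o:{name}>", html, flags=re.DOTALL | re.IGNORECASE
        )
        if match:
            found[name] = match.group(1).strip()
    return found
```

If either function returns a name you would rather not publish, clear it in the document properties and export again.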
Old 04-02-2009, 03:22 AM   #18
Sweetpea
Grand Sorcerer
Posts: 9,707
Karma: 32763414
Join Date: Dec 2008
Location: Krewerd
Device: Pocketbook Inkpad 4 Color; Samsung Galaxy Tab S6
Quote:
Originally Posted by brewt
Unless you intend to hand-re-edit the htm file after you've made it in Word, what do you care if it's "clean"? Is it oh-so-much smaller? Is it really worth your time? Wouldn't better metatags be more useful in the long run when the formats change (again)? Or more care toward managing your stylesets and where to use them?
Actually, I can sometimes halve the size of HTML files generated by Word. So, personally, I wouldn't touch Word with a stick if I have to end up with HTML files.
Old 04-02-2009, 11:00 AM   #19
brewt
Boo-Frickety-Hoo-Erizer
Posts: 251
Karma: 686
Join Date: Oct 2007
Device: Kobo Glo HD!
Not to pick on Sweetpea, but let's try something.

In the attached test.zip are html files of Sweetpea's post conjured by copying and pasting into Word, and the resultant mobi files.

I saved them as [Full Web Page], and [Web Page, Filtered] out of Word. Sure enough, the html file for [Full Web Page] is twice as big as the [Web Page, Filtered] file.
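
For anyone who wants to repeat the comparison on their own exports, here is a minimal Python sketch. The helper names `size_report` and `size_ratio` are made up for illustration, and you would pass in whatever file names your own exports have.

```python
from pathlib import Path


def size_report(paths):
    """Return {file name: size in bytes} for each path given."""
    return {Path(p).name: Path(p).stat().st_size for p in paths}


def size_ratio(a, b):
    """Return how many times larger file a is than file b."""
    return Path(a).stat().st_size / Path(b).stat().st_size
```

A ratio of about 2 between the full and filtered html files would match the "twice as big" result described here.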

Funny thing: When I try to open the files in a browser, in the [Full Web Page] file I can't see the picture. Same thing in the mobi files - that's why the mobi file for [Full Web Page] is smaller.

But when I look at the html code in the filtered file, it's not too bad. The style names are longer than [h1] etc., and since the styles straight off the web site are being expressed as on-the-fly modifications of existing styles, sure, we could trim some file size by defining the styles better. How much time do I have to do that? (zilch)

But give up Word because we hate evil so bad? Not a chance.....in Notepad (or vi, or textpad, ted, whatever) I get to miss out on Selecting by Style, search and replace on invisible characters like hard vs soft carriage returns (do I remember the code? is this western or unicode?), grammar check, spell check, multi-columns, tables, picture embedding by drag & drop, MACROS, automated TOCs, just to scratch the surface.

To make a toc in Notepad, I get to hand code it.
To make a table in Notepad, I get to hand code it.
To embed a picture in Notepad, I get to hand code it.
To change all instances of a style in Notepad, I get to search and replace.
I get to be the spell checker and the grammar checker (semicolon rules, anyone?)

I can go on all day. I'd rather have the machine assist me through my own ineptitudes than say "oh, I don't need any help here" and do it the hard way.....being as lazy as I am......

-bjc
Attached Files
File Type: zip Test.zip (34.2 KB, 245 views)
All times are GMT -4.

