Old 03-16-2013, 01:41 PM   #1
nickredding
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
Proposed changes for news download image compression

News websites are increasingly using high-resolution images on their sites and this is causing calibre downloads to become very large. I have experimented with re-dimensioning images to fit inside device screen resolution and increasing jpeg compression to reduce the total download size.

Re-dimensioning and then compressing images by progressively reducing the quality setting until the image size falls below a threshold has a dramatic effect on download size--reductions of 60% are typical. There is generally no perceptible reduction in image quality. The threshold I used is (w*h)/16 bytes where w x h are the image dimensions in pixels.
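The quality-reduction loop described above can be sketched roughly as follows. This is an illustration, not the code in the attached files: `encode` is a hypothetical stand-in for a real JPEG encoder, the step of 5 is an assumption, and `fake_encode` exists only so the sketch runs without an imaging library.

```python
from typing import Callable, Tuple


def size_budget(width: int, height: int, factor: int = 16) -> float:
    """Byte budget for an image of the given pixel dimensions: (w * h) / factor."""
    return (width * height) / factor


def compress_to_budget(
    encode: Callable[[int], bytes],
    width: int,
    height: int,
    factor: int = 16,
    start_quality: int = 95,
    min_quality: int = 5,
    step: int = 5,  # assumed step size, for illustration only
) -> Tuple[bytes, int]:
    """Lower the quality step by step until the output fits the budget.

    `encode(quality)` is a hypothetical callable returning the image encoded
    at that quality. Returns (data, quality). If the quality floor is reached
    first, the budget may not be met.
    """
    budget = size_budget(width, height, factor)
    quality = start_quality
    data = encode(quality)
    while len(data) > budget and quality - step >= min_quality:
        quality -= step
        data = encode(quality)
    return data, quality


def fake_encode(quality: int) -> bytes:
    """Stand-in encoder for the demo: output size grows linearly with quality."""
    return b"\0" * (quality * 1000)


data, quality = compress_to_budget(fake_encode, 800, 600)
print(len(data), quality)
```

With a real encoder, the loop stops at the highest quality that fits the budget, so the image is compressed no more than needed.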

I have made the modifications required to support re-dimensioning and image compression in news.py and simple.py, and these files are attached (based on the most recent calibre release). I'm hoping these changes will be incorporated. The compression works for all output formats (MOBI, EPUB, etc.).

The following five parameters have been added to BasicNewsRecipe, so they can be overridden as desired in custom recipes. My recommendation is to have compress_news_images default to True so that all existing recipes will benefit from image compression without requiring modification.
Code:
    
    '''
    The following parameters control how the recipe attempts to minimize image sizes
    '''
    
    compress_news_images = True
    '''
    Set this to False to ignore all scaling and compression parameters and
    pass images through unmodified. If True and the other compression
    parameters are left at their default values, images will be scaled to fit
    in the screen dimensions set by the output profile and compressed to size at
    most (w * h)/16 where w x h are the scaled image dimensions.
    '''
    
    compress_news_images_auto_size = 16
    '''
    The factor used when auto compressing jpeg images. If set to None,
    auto compression is disabled. Otherwise, the images will be reduced in size to
    (w * h)/compress_news_images_auto_size bytes if possible by reducing
    the quality level, where w x h are the image dimensions in pixels.
    The minimum jpeg quality will be 5/100 so it is possible this constraint
    will not be met.  This parameter can be overridden by the parameter
    compress_news_images_max_size which provides a fixed maximum size for images.
    '''
    
    compress_news_images_max_size = None
    '''
    Set jpeg quality so images do not exceed the size given (in KBytes).
    If set, this parameter overrides auto compression via compress_news_images_auto_size.
    The minimum jpeg quality will be 5/100 so it is possible this constraint
    will not be met.
    '''

    scale_news_images_to_device = True
    '''
    Rescale images to fit in the device screen dimensions set by the output profile.
    Ignored if no output profile is set.
    '''

    scale_news_images = None
    '''
    Maximum dimensions (w,h) to scale images to. If scale_news_images_to_device is True
    this is set to the device screen dimensions set by the output profile unless
    there is no profile set, in which case it is left at whatever value it has been
    assigned (default None).
    '''
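To make the precedence between the two size parameters concrete, here is a minimal sketch of how the per-image limit could be resolved. The function name `max_image_bytes` is hypothetical, and treating 1 KByte as 1024 bytes is an assumption; the precedence itself follows the parameter descriptions above.

```python
from typing import Optional


def max_image_bytes(
    width: int,
    height: int,
    auto_size: Optional[int] = 16,
    max_size_kb: Optional[int] = None,
) -> Optional[float]:
    """Return the byte limit for one image, or None if no limit applies.

    A fixed maximum (in KBytes) overrides the automatic (w * h) / auto_size
    rule. Assumes 1 KByte = 1024 bytes.
    """
    if max_size_kb is not None:
        return max_size_kb * 1024
    if auto_size is not None:
        return (width * height) / auto_size
    return None


print(max_image_bytes(600, 800))                  # automatic rule
print(max_image_bytes(600, 800, max_size_kb=20))  # fixed maximum wins
print(max_image_bytes(600, 800, auto_size=None))  # no size limit
```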
The modified code is as follows in the attached files:

simple.py lines 145-150, 347-385 and 432-461
news.py lines 413-459 and 900-913
Attached Files
File Type: zip news_simple_py.zip (26.7 KB, 181 views)

Last edited by nickredding; 03-16-2013 at 01:44 PM.
Old 03-16-2013, 02:38 PM   #2
kovidgoyal
creator of calibre
Posts: 43,966
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Sounds useful. I don't think I want to turn it on by default, at least for now, as that would be a rather large change. I suspect that whether there is a perceptual loss of quality depends on lots of factors, like the screen of the device the image is being viewed on, the person doing the perceiving, the content of the image and so on. However, that may well change in the future.

I'll review and merge the code in a bit.
Old 03-16-2013, 04:30 PM   #3
nickredding
onlinenewsreader.net
OK, sounds good.
Old 03-17-2013, 02:29 AM   #4
kovidgoyal
creator of calibre
Merged, removing the check for .jpg in the URL since if imghdr fails to identify the image type, it is coerced to jpeg anyway.
Old 03-17-2013, 02:29 AM   #5
kovidgoyal
creator of calibre
Incidentally, how did you arrive at the factor of 16?
Old 03-17-2013, 10:56 AM   #6
nickredding
onlinenewsreader.net
Quote:
Originally Posted by kovidgoyal View Post
Merged, removing the check for .jpg in the URL since if imghdr fails to identify the image type, it is coerced to jpeg anyway.
Not true--in my testing imghdr returned 'None' for some large jpegs. Look at the flow: None goes right past the tests.
Code:
                    if itype not in {'png', 'jpg', 'jpeg'}:
                        itype = 'png' if itype == 'gif' else 'jpg'
                        im = Image()
                        im.load(data)
                        data = im.export(itype)
                    if self.compress_news_images and itype in {'jpg','jpeg'}:
                        try:
                            data = self.rescale_image(data)
                        except:
                            self.log.exception('failed to compress image '+iurl)
                            identify_data(data)
                    else:
                        identify_data(data)
EDIT: I take that back--None should drop into the first branch, but in my testing it did not (??). Does None automatically fail the 'in' test?

2ND EDIT: OK I checked using the Python interpreter and itype being None will get snagged and coerced to jpeg, so I'll assume I mistook what was happening in the fog of debugging!
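For anyone wanting to repeat that interpreter check, here is a small sketch that isolates the first branch of the flow quoted above. The function name `coerce_type` is made up for illustration; only the membership test and the conditional expression come from the quoted code.

```python
def coerce_type(itype):
    """Mimic the first branch of the quoted flow for a detected image type."""
    if itype not in {'png', 'jpg', 'jpeg'}:
        # None is simply "not in" the set, so it lands here like any
        # other unrecognised type and is coerced to jpg.
        itype = 'png' if itype == 'gif' else 'jpg'
    return itype


for detected in (None, 'gif', 'bmp', 'jpeg', 'png'):
    print(detected, '->', coerce_type(detected))
```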

Last edited by nickredding; 03-17-2013 at 07:21 PM.
Old 03-17-2013, 10:57 AM   #7
nickredding
onlinenewsreader.net
Quote:
Originally Posted by kovidgoyal View Post
Incidentally, how did you arrive at the factor of 16?
It seems to give good results on 7" readers/tablets. I think for iPad-size 8 is a better factor because of the high screen resolution.
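As a worked example of what the two factors mean in bytes, the sketch below computes the budget for a full-screen image. The screen resolutions are illustrative assumptions (600x1024 for a 7" tablet, 1536x2048 for an iPad-size screen), not values read from calibre's output profiles.

```python
def budget_bytes(width, height, factor):
    """Size budget in bytes for an image of width x height pixels."""
    return (width * height) // factor


# (width, height, factor); the resolutions are assumed, for illustration only
screens = {
    '7-inch tablet (assumed 600x1024)': (600, 1024, 16),
    'iPad-size (assumed 1536x2048)': (1536, 2048, 8),
}

for name, (w, h, factor) in screens.items():
    kb = budget_bytes(w, h, factor) / 1024
    print(f'{name}: factor {factor} -> {kb:.1f} KB per full-screen image')
```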
Old 03-18-2013, 01:51 AM   #8
kovidgoyal
creator of calibre
Can you post an image that imghdr fails with? I'd like to patch it to no longer fail with it if possible.

Hmm, it might be worth making the factor dynamic based on the output profile's screen size.
Old 03-18-2013, 10:46 AM   #9
nickredding
onlinenewsreader.net
Quote:
Originally Posted by kovidgoyal View Post
Can you post an image that imghdr fails with? I'd like to patch it to no longer fail with it if possible.
I'll put a trap in my development environment to find one.

Quote:
Hmm, it might be worth making the factor dynamic based on the output profile's screen size.
Good point.
Old 03-20-2013, 02:30 AM   #10
kovidgoyal
creator of calibre
I've been thinking about this patch a bit. Currently, if an output profile is specified, it both scales to the output profile size and lowers the image quality. Is that really necessary? Wouldn't a better algorithm be:

1) Scale to the output profile size
2) Lower the image quality only enough to ensure that the image size is < original_area/factor rather than scaled_area/factor
Old 03-20-2013, 10:55 AM   #11
nickredding
onlinenewsreader.net
Quote:
Originally Posted by kovidgoyal View Post
I've been thinking about this patch a bit. Currently, if an output profile is specified, it both scales to the output profile size and lowers the image quality. Is that really necessary? Wouldn't a better algorithm be:

1) Scale to the output profile size
2) Lower the image quality only enough to ensure that the image size is < original_area/factor rather than scaled_area/factor
The issue is that when rescaling the image, you have to pick a compression quality (because rescaling involves decompressing, rescaling, then compressing again). So what I did was rescale at quality 95 and then compress. If we could figure out what the image's original compression setting is, we could rescale directly to that level and then decide whether to compress further.

Not knowing the original factor, and recognizing that jpeg is lossy, I figured compressing as little as possible while meeting the goal would be the best approach.

Also, I picked a size factor that would yield a combination of reasonable quality and image size relative to image dimensions, and in my testing this is achieved (for 7" readers and tablets) with w*h/16. Therefore it wouldn't make sense to compress an image to original_area/factor if it has been rescaled because we want to acknowledge the rescaled size in setting the threshold. We would only use original_area/factor if we didn't rescale the image.
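To put numbers on the difference between the two thresholds, here is a minimal sketch. `fit_within` is a hypothetical helper written for this example (it is not calibre's own scaling routine), and the 2400x1600 source image and 600x800 screen are assumed values.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) down to fit the box, keeping the aspect ratio.

    Hypothetical helper for illustration; images already inside the box are
    returned unchanged (never scaled up).
    """
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, int(width * scale)), max(1, int(height * scale))


def thresholds(width, height, screen, factor=16):
    """Return (scaled_area / factor, original_area / factor) in bytes."""
    new_w, new_h = fit_within(width, height, *screen)
    return (new_w * new_h) / factor, (width * height) / factor


scaled, original = thresholds(2400, 1600, screen=(600, 800))
print(scaled, original)
```

In this example the original-area budget is 16 times the scaled-area budget, so a threshold based on the original area would rarely trigger any further compression of the rescaled image.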
Old 03-20-2013, 02:12 PM   #12
kovidgoyal
creator of calibre
OK, makes sense.