03-16-2013, 01:41 PM | #1 |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
Proposed changes for news download image compression
News websites are increasingly using high-resolution images on their sites and this is causing calibre downloads to become very large. I have experimented with re-dimensioning images to fit inside device screen resolution and increasing jpeg compression to reduce the total download size.
Re-dimensioning images, then compressing them by progressively reducing the JPEG quality setting until the image size falls below a threshold, has a dramatic effect on download size: reductions of 60% are typical. There is generally no perceptible reduction in image quality. The threshold I used is (w*h)/16 bytes, where w x h are the image dimensions in pixels.
I have made the modifications required to support re-dimensioning and image compression in news.py and simple.py, and these files are attached (based on the most recent calibre release). I'm hoping these changes will be incorporated. The compression will work for all output formats (MOBI, EPUB, etc.).
The following five parameters have been added to BasicNewsRecipe, so they can be overridden as desired in custom recipes. My recommendation is to have compress_news_images default to True so all existing recipes will benefit from image compression without requiring modification.
Code:
'''
The following parameters control how the recipe attempts to minimize image sizes
'''

compress_news_images = True
'''
Set this to False to ignore all scaling and compression parameters and pass images through unmodified. If True and the other compression parameters are left at their default values, images will be scaled to fit in the screen dimensions set by the output profile and compressed to size at most (w * h)/16 where w x h are the scaled image dimensions.
'''

compress_news_images_auto_size = 16
'''
The factor used when auto compressing jpeg images. If set to None, auto compression is disabled. Otherwise, the images will be reduced in size to (w * h)/compress_news_images_auto_size bytes if possible by reducing the quality level, where w x h are the image dimensions in pixels. The minimum jpeg quality will be 5/100 so it is possible this constraint will not be met. This parameter can be overridden by the parameter compress_news_images_max_size which provides a fixed maximum size for images.
'''

compress_news_images_max_size = None
'''
Set jpeg quality so images do not exceed the size given (in KBytes). If set, this parameter overrides auto compression via compress_news_images_auto_size. The minimum jpeg quality will be 5/100 so it is possible this constraint will not be met.
'''

scale_news_images_to_device = True
'''
Rescale images to fit in the device screen dimensions set by the output profile. Ignored if no output profile is set.
'''

scale_news_images = None
'''
Maximum dimensions (w,h) to scale images to. If scale_news_images_to_device is True this is set to the device screen dimensions set by the output profile unless there is no profile set, in which case it is left at whatever value it has been assigned (default None).
'''

simple.py lines 145-150, 347-385 and 432-461
news.py lines 413-459 and 900-913
Last edited by nickredding; 03-16-2013 at 01:44 PM. |
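The approach described in this post (work out a byte budget from the image dimensions, then step the JPEG quality down until the encoded image fits) can be sketched roughly as follows. This is only an illustration, not the code from the attached news.py and simple.py: the function names, the starting quality of 95, the step of 5, and the treatment of a KByte as 1024 bytes are all assumptions, and `encode` stands in for whatever JPEG encoder is actually used.

```python
def size_budget(width, height, auto_size=16, max_size_kb=None):
    """Return the target size in bytes, or None if compression is disabled.

    Follows the documented precedence: a fixed maximum size (in KBytes)
    overrides the automatic (w * h) / auto_size budget.
    """
    if max_size_kb is not None:
        return max_size_kb * 1024  # assumption: 1 KByte = 1024 bytes
    if auto_size is None:
        return None
    return (width * height) // auto_size


def compress_to_budget(encode, budget, start_quality=95, step=5, min_quality=5):
    """Lower the quality until encode(quality) fits within the budget.

    `encode` is a hypothetical callable: quality (int) -> encoded bytes.
    Returns (data, quality). Quality never drops below min_quality, so the
    result may still exceed the budget.
    """
    quality = start_quality
    data = encode(quality)
    while len(data) > budget and quality > min_quality:
        quality = max(min_quality, quality - step)
        data = encode(quality)
    return data, quality
```

For example, a 600 x 800 image gets a default budget of 600*800/16 = 30000 bytes, and the loop stops at the first quality level whose output fits, so the image is compressed no more than needed.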
03-16-2013, 02:38 PM | #2 |
creator of calibre
Posts: 43,966
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Sounds useful. I don't think I want to turn it on by default, at least for now, as that would be a rather large change, and I suspect that whether there is a perceptual loss of quality depends on lots of factors, such as the screen of the device the image is being viewed on, the person doing the perceiving, the content of the image, and so on. However, that may well change in the future.
I'll review and merge the code in a bit. |
03-16-2013, 04:30 PM | #3 |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
OK, sounds good.
|
03-17-2013, 02:29 AM | #4 |
creator of calibre
Posts: 43,966
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Merged, removing the check for .jpg in the URL since if imghdr fails to identify the image type, it is coerced to jpeg anyway.
|
03-17-2013, 02:29 AM | #5 |
creator of calibre
Posts: 43,966
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Incidentally, how did you arrive at the factor of 16?
|
03-17-2013, 10:56 AM | #6 | |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
Quote:
Code:
if itype not in {'png', 'jpg', 'jpeg'}:
    itype = 'png' if itype == 'gif' else 'jpg'
    im = Image()
    im.load(data)
    data = im.export(itype)
if self.compress_news_images and itype in {'jpg','jpeg'}:
    try:
        data = self.rescale_image(data)
    except:
        self.log.exception('failed to compress image '+iurl)
        identify_data(data)
else:
    identify_data(data)

2ND EDIT: OK I checked using the Python interpreter and itype being None will get snagged and coerced to jpeg, so I'll assume I mistook what was happening in the fog of debugging!
Last edited by nickredding; 03-17-2013 at 07:21 PM. |
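The coercion mentioned in the 2ND EDIT can be reproduced in isolation. The helper name below is hypothetical; the logic is just the first `if` from the snippet in this post, without the image re-encoding step.

```python
def coerce_image_type(itype):
    """Map a detected image type to one of the types kept by the snippet.

    `itype` is whatever the type detector returned, including None when
    detection failed. Anything other than png/jpg/jpeg is coerced:
    gif becomes png, everything else (None included) becomes jpg.
    """
    if itype not in {'png', 'jpg', 'jpeg'}:
        return 'png' if itype == 'gif' else 'jpg'
    return itype


for detected in (None, 'gif', 'bmp', 'jpeg', 'png'):
    print(detected, '->', coerce_image_type(detected))
```

Because `None` is not a member of the set, a failed detection falls into the `if` branch and ends up as `'jpg'`, which is why a separate check for .jpg in the URL is not needed.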
|
03-17-2013, 10:57 AM | #7 |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
|
03-18-2013, 01:51 AM | #8 |
creator of calibre
Posts: 43,966
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Can you post an image that imghdr fails with? I'd like to patch it so that it no longer fails, if possible.
Hmm, it might be worth making the factor dynamic, based on the output profile's screen size. |
03-18-2013, 10:46 AM | #9 | ||
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
Quote:
Quote:
|
||
03-20-2013, 02:30 AM | #10 |
creator of calibre
Posts: 43,966
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
I've been thinking about this patch a bit. Currently, if an output profile is specified, it both scales to the output profile size and lowers the image quality. Is that really necessary? Wouldn't a better algorithm be:
1) Scale to the output profile size
2) Lower the image quality only enough to ensure that the image size is < original_area/factor rather than scaled_area/factor |
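To make the difference between the two candidate thresholds concrete, here is a small worked example. The image size, the screen size, and the helper name are made up for illustration; only the factor of 16 comes from this thread.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) down to fit the box, keeping the aspect ratio.

    Images that already fit are returned unchanged (no upscaling).
    Integer arithmetic is used throughout to avoid rounding surprises.
    """
    if width <= max_width and height <= max_height:
        return width, height
    if width * max_height >= height * max_width:
        # Width is the limiting dimension.
        return max_width, max(1, height * max_width // width)
    # Height is the limiting dimension.
    return max(1, width * max_height // height), max_height


factor = 16
original = (2000, 1500)   # hypothetical source image, in pixels
screen = (600, 800)       # hypothetical output profile screen size

scaled = fit_within(*original, *screen)
scaled_budget = scaled[0] * scaled[1] // factor        # scaled_area/factor
original_budget = original[0] * original[1] // factor  # original_area/factor

print(scaled, scaled_budget, original_budget)
```

With these numbers the image is scaled to 600 x 450, so scaled_area/factor is 16875 bytes while original_area/factor is 187500 bytes, roughly eleven times larger.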
03-20-2013, 10:55 AM | #11 | |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
Quote:
Not knowing the original factor, and recognizing that jpeg is lossy, I figured compressing as little as possible while meeting the goal would be the best approach. Also, I picked a size factor that would yield a combination of reasonable quality and image size relative to image dimensions, and in my testing this is achieved (for 7" readers and tablets) with w*h/16. Therefore it wouldn't make sense to compress an image to original_area/factor if it has been rescaled because we want to acknowledge the rescaled size in setting the threshold. We would only use original_area/factor if we didn't rescale the image. |
|
03-20-2013, 02:12 PM | #12 |
creator of calibre
Posts: 43,966
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
OK, makes sense.
|
|