03-21-2013, 08:45 PM | #1 |
Junior Member
Posts: 5
Karma: 10
Join Date: Mar 2013
Device: Kobo Glo
|
Dutch Weekly Newspaper "De Groene Amsterdammer" Recipe
This is the recipe I built for "De Groene Amsterdammer" (for readers with a subscription only). They publish their new edition each Wednesday evening, so I scheduled mine for Thursday morning. Have fun with it!
Code:
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
#Based on veezh's original recipe and Kovid Goyal's New York Times recipe and Snaab's NRC-epub recipe
__license__ = 'GPL v3'
__copyright__ = '2013, RealBase'
'''
www.groene.nl
'''
import os, zipfile
import time
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ptempfile import PersistentTemporaryFile
from calibre.ebooks.conversion.cli import main

class GroeneAmsterdammer(BasicNewsRecipe):
    title = u'De Groene Amsterdammer'
    description = u'De ePub-versie van de Groene Amsterdammer'
    language = 'nl'
    lang = 'nl-NL'
    needs_subscription = True
    __author__ = 'Realbase'
    conversion_options = {
        'no_default_epub_cover' : True
    }

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            br.open('https://www.groene.nl/sessie/new')
            print [form for form in br.forms()][1]
            br.select_form(nr=1)
            br['user_session[login]'] = self.username
            br['user_session[password]'] = self.password
            br.submit()
        return br

    def build_index(self):
        domain = "http://www.groene.nl"
        url = domain + "/deze-week.epub"
        #print url
        try:
            br = self.get_browser()
            f = br.open(url)
        except:
            self.report_progress(0,_('Kan niet inloggen om editie te downloaden'))
            raise ValueError('Groene van deze week nog niet beschikbaar')
        tmp = PersistentTemporaryFile(suffix='.epub')
        self.report_progress(0,_('downloading epub'))
        tmp.write(f.read())
        f.close()
        br.close()
        tmp.close()
        # convert
        self.report_progress(0.2,_('Converting to OEB'))
        oebdir = self.output_dir + '/INPUT/'
        main(['ebook-convert', tmp.name, oebdir])
        index = os.path.join(oebdir, 'content.opf')
        self.report_progress(1,_('epub downloaded and extracted'))
        return index
|
03-22-2013, 12:28 AM | #2 |
creator of calibre
Posts: 44,658
Karma: 24966646
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You should not call ebook-convert like that; it won't work if ebook-convert is not in the path. If you want to run a conversion, do it like this:
from calibre.ebooks.conversion.cli import main
main(['ebook-convert', input_file, output_file])
|
03-22-2013, 08:44 AM | #3 |
Junior Member
Posts: 5
Karma: 10
Join Date: Mar 2013
Device: Kobo Glo
|
The recipe above works in Calibre.
My version is just the result of trial and error, combining bits of code. But how do you suggest implementing your part of the code to make it cleaner? I tried a few different ways, but none of them worked. Errors all the way. |
03-22-2013, 10:19 AM | #4 |
creator of calibre
Posts: 44,658
Karma: 24966646
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
What error do you get?
|
03-22-2013, 01:34 PM | #5 |
Junior Member
Posts: 5
Karma: 10
Join Date: Mar 2013
Device: Kobo Glo
|
I just tried to remove the 'convert' part, since I don't understand why I use it.
(I'm a newbie, that's why.) This is the code I tried to use:
Code:
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
#Based on veezh's original recipe and Kovid Goyal's New York Times recipe and Snaab's NRC-epub recipe
__license__ = 'GPL v3'
__copyright__ = '2013, RealBase'
'''
www.groene.nl
'''
import os, zipfile
import time
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ptempfile import PersistentTemporaryFile
from calibre.ebooks.conversion.cli import main

class GroeneAmsterdammer(BasicNewsRecipe):
    title = u'De Groene Amsterdammer'
    description = u'De ePub-versie van de Groene Amsterdammer'
    language = 'nl'
    lang = 'nl-NL'
    needs_subscription = True
    __author__ = 'Realbase'
    conversion_options = {
        'no_default_epub_cover' : True
    }

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            br.open('https://www.groene.nl/sessie/new')
            print [form for form in br.forms()][1]
            br.select_form(nr=1)
            br['user_session[login]'] = self.username
            br['user_session[password]'] = self.password
            br.submit()
        return br

    def build_index(self):
        domain = "http://www.groene.nl"
        url = domain + "/deze-week.epub"
        #print url
        try:
            br = self.get_browser()
            f = br.open(url)
        except:
            self.report_progress(0,_('Kan niet inloggen om editie te downloaden'))
            raise ValueError('Groene van deze week nog niet beschikbaar')
        tmp = PersistentTemporaryFile(suffix='.epub')
        self.report_progress(0,_('downloading epub'))
        tmp.write(f.read())
        f.close()
        br.close()
        tmp.close()
        index = os.path.join(self.output_dir, 'metadata.opf')
        return index

Then I get this error:
Code:
Download nieuws van De Groene Amsterdammer Resolved conversion options calibre version: 0.9.23 {'asciiize': False, 'author_sort': None, 'authors': None, 'base_font_size': 0, 'book_producer': None, 'change_justification': 'original', 'chapter': None, 'chapter_mark': 'pagebreak', 'comments': None, 'cover': None, 'debug_pipeline': None, 'dehyphenate': True, 'delete_blank_paragraphs': True, 'disable_font_rescaling': False, 'dont_download_recipe': False, 'dont_split_on_page_breaks': True, 'duplicate_links_in_toc': False, 'embed_font_family': None, 'enable_heuristics': False, 'epub_flatten': False, 'extra_css': None, 'extract_to': None, 'filter_css': None, 'fix_indents': True, 'flow_size': 260, 'font_size_mapping': None, 'format_scene_breaks': True, 'html_unwrap_factor': 0.4, 'input_encoding': None, 'input_profile': <calibre.customize.profiles.InputProfile object at 0x108249590>, 'insert_blank_line': False, 'insert_blank_line_size': 0.5, 'insert_metadata': False, 'isbn': None, 'italicize_common_cases': True, 'keep_ligatures': False, 'language': None, 'level1_toc': None, 'level2_toc': None, 'level3_toc': None, 'line_height': 0, 'linearize_tables': False, 'lrf': False, 'margin_bottom': 5.0, 'margin_left': 5.0, 'margin_right': 5.0, 'margin_top': 5.0, 'markup_chapter_headings': True, 'max_toc_links': 50, 'minimum_line_height': 120.0, 'no_chapters_in_toc': False, 'no_default_epub_cover': False, 'no_inline_navbars': False, 'no_svg_cover': False, 'output_profile': <calibre.customize.profiles.KoboReaderOutput object at 0x108249d10>, 'page_breaks_before': None, 'prefer_metadata_cover': False, 'preserve_cover_aspect_ratio': False, 'pretty_print': True, 'pubdate': None, 'publisher': None, 'rating': None, 'read_metadata_from_opf': None, 'remove_fake_margins': True, 'remove_first_image': False, 'remove_paragraph_spacing': False, 'remove_paragraph_spacing_indent_size': 1.5, 'renumber_headings': True, 'replace_scene_breaks': '', 'search_replace': None, 'series': None, 'series_index': 
None, 'smarten_punctuation': False, 'sr1_replace': '', 'sr1_search': '', 'sr2_replace': '', 'sr2_search': '', 'sr3_replace': '', 'sr3_search': '', 'start_reading_at': None, 'subset_embedded_fonts': False, 'tags': None, 'test': False, 'timestamp': None, 'title': None, 'title_sort': None, 'toc_filter': None, 'toc_threshold': 6, 'unsmarten_punctuation': False, 'unwrap_lines': True, 'use_auto_toc': False, 'verbose': 2}
Python function terminated unexpectedly: 'NoneType' object has no attribute 'rfind'
InputFormatPlugin: Recipe Input running
Using custom recipe
<POST https://www.groene.nl/sessie application/x-www-form-urlencoded
  <HiddenControl(authenticity_token=*****=) (readonly)>
  <TextControl(user_session[login]=)>
  <PasswordControl(user_session[password]=)>
  <SubmitControl(commit=Login) (readonly)>>
<POST https://www.groene.nl/sessie application/x-www-form-urlencoded
  <HiddenControl(authenticity_token=*****=) (readonly)>
  <TextControl(user_session[login]=)>
  <PasswordControl(user_session[password]=)>
  <SubmitControl(commit=Login) (readonly)>>
Parsing all content...
Traceback (most recent call last):
  File "/Applications/calibre.app/Contents/Resources/Python/lib/python2.7/site.py", line 147, in main
    return run_entry_point()
  File "/Applications/calibre.app/Contents/Resources/Python/lib/python2.7/site.py", line 116, in run_entry_point
    return getattr(pmod, func)()
  File "site-packages/calibre/utils/ipc/worker.py", line 189, in main
  File "site-packages/calibre/gui2/convert/gui_conversion.py", line 25, in gui_convert
  File "site-packages/calibre/ebooks/conversion/plumber.py", line 1018, in run
  File "site-packages/calibre/ebooks/conversion/plumber.py", line 1183, in create_oebbook
  File "site-packages/calibre/ebooks/oeb/reader.py", line 67, in __call__
  File "site-packages/calibre/ebooks/oeb/base.py", line 458, in __init__
  File "lib/python2.7/posixpath.py", line 96, in splitext
  File "lib/python2.7/genericpath.py", line 91, in _splitext
AttributeError: 'NoneType' object has no attribute 'rfind'
|
03-22-2013, 01:56 PM | #6 |
creator of calibre
Posts: 44,658
Karma: 24966646
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You need to put the convert code back. What you want to do is unzip the epub and return the path to the opf inside the epub from build_index(). The way to do that is either to unzip it, or convert it to oeb.
|
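For readers who want to try the unzip route mentioned above instead of converting to OEB, here is a minimal sketch using only the Python standard library. It relies on the fact that an EPUB is a zip archive whose META-INF/container.xml names the OPF file. The function name extract_epub_and_find_opf and its arguments are hypothetical, not part of calibre; in a recipe you would presumably call it from build_index() with the downloaded temporary file and a directory under self.output_dir, and return its result. It has not been tested against the groene.nl download.

```python
import os
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/container.xml in every EPUB container.
CONTAINER_NS = {'c': 'urn:oasis:names:tc:opendocument:xmlns:container'}


def extract_epub_and_find_opf(epub_path, dest_dir):
    """Unzip epub_path into dest_dir and return the path of its OPF file.

    Hypothetical helper for illustration; not a calibre API.
    """
    # An EPUB is an ordinary zip archive, so zipfile can unpack it.
    with zipfile.ZipFile(epub_path) as zf:
        zf.extractall(dest_dir)

    # container.xml points at the package (OPF) file via full-path.
    container = os.path.join(dest_dir, 'META-INF', 'container.xml')
    root = ET.parse(container).getroot()
    rootfile = root.find('c:rootfiles/c:rootfile', CONTAINER_NS)
    if rootfile is None:
        raise ValueError('No rootfile entry in META-INF/container.xml')

    # full-path always uses forward slashes, relative to the archive root.
    return os.path.join(dest_dir, *rootfile.get('full-path').split('/'))
```

Reading container.xml avoids hard-coding a name such as content.opf, which may differ between publishers.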
04-19-2013, 05:07 PM | #7 |
Junior Member
Posts: 5
Karma: 10
Join Date: Mar 2013
Device: Kobo Glo
|
I tried again, this time taking your feedback into account.
(I followed the same approach as this recipe: http://l_uka.pentax.org.pl/calibre/biweekly.recipe ) Now the code looks like this, and it really works:
Code:
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
#Based on veezh's original recipe and Kovid Goyal's New York Times recipe and Snaab's NRC-epub recipe
__license__ = 'GPL v3'
__copyright__ = '2013, RealBase'
'''
www.groene.nl
'''
import os, zipfile
import time
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ptempfile import PersistentTemporaryFile
from calibre.ebooks.conversion.cli import main

class GroeneAmsterdammer(BasicNewsRecipe):
    title = u'De Groene Amsterdammer'
    description = u'De ePub-versie van de Groene Amsterdammer'
    language = 'nl'
    lang = 'nl-NL'
    needs_subscription = True
    __author__ = 'Realbase'
    conversion_options = {
        'no_default_epub_cover' : True
    }

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            br.open('https://www.groene.nl/sessie/new')
            print [form for form in br.forms()][1]
            br.select_form(nr=1)
            br['user_session[login]'] = self.username
            br['user_session[password]'] = self.password
            br.submit()
        return br

    def build_index(self):
        domain = "http://www.groene.nl"
        url = domain + "/deze-week.epub"
        #print url
        try:
            br = self.get_browser()
            f = br.open(url)
        except:
            self.report_progress(0,_('Kan niet inloggen om editie te downloaden'))
            raise ValueError('Groene van deze week nog niet beschikbaar')
        self.report_progress(0,_('downloading epub'))
        book_file = PersistentTemporaryFile(suffix='.epub')
        book_file.write(f.read())
        f.close()
        br.close()
        book_file.close()
        # convert
        self.report_progress(0.2,_('Converting to OEB'))
        oebdir = self.output_dir + '/INPUT/'
        main(['ebook-convert', book_file.name, oebdir])
        #feed calibre
        index = os.path.join(oebdir, 'content.opf')
        return index
|
11-16-2015, 07:20 AM | #8 | |
Junior Member
Posts: 1
Karma: 10
Join Date: Nov 2015
Device: Kobo Glo HD
|
Groene Amsterdammer
Hi there,
This is an old post, but I hope you can help me out with the recipe. It currently doesn't work with Calibre! Thanks, Niels
|
|
08-12-2024, 05:38 PM | #9 |
Junior Member
Posts: 8
Karma: 10
Join Date: Aug 2021
Device: Kobo Forma
|
This thread is indeed quite old and the suggested recipe no longer works with the current Calibre. Can someone take a gander at it and update it?
|
08-13-2024, 02:39 PM | #10 |
Fanatic
Posts: 562
Karma: 82944
Join Date: May 2021
Device: kindle
|
https://www.groene.nl/weekblad
I see a list of issues, but where are the articles? |
|