10-13-2023, 02:00 AM | #1 |
Fanatic
Posts: 540
Karma: 82944
Join Date: May 2021
Device: kindle
HTML fetched by calibre is different from what I see in the browser (JS disabled)
The HTML I get in a browser (with JS disabled) is different from what the server sends to calibre when it fetches the recipe.
How do I get the same HTML in calibre as I see in a JS-disabled browser? How is their server able to detect calibre and send it completely different content? This is the recipe. Spoiler:
An example of the HTML that I get in calibre. Spoiler:
I don't even need to set auto_cleanup = True. I also set the browser user_agent = 'common_words/based' and still get this simplified HTML. How do I set up get_browser to look like Firefox?
10-13-2023, 06:38 AM | #2 |
creator of calibre
Posts: 44,423
Karma: 24044628
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
A site can use many things to tell clients apart, from the SSL handshake algorithms to the HTTP request headers. You can visit the site in a browser with the developer tools open, see exactly which request headers are sent, and mimic them in the recipe. That might work.
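As a rough sketch of the header-mimicking idea above: a recipe can override get_browser and replace the default request headers with the ones copied from the browser's developer tools. The header values, the recipe title, and the merge_headers helper below are illustrative assumptions, not taken from the thread; copy the real values from your own browser. This only changes HTTP headers, so it will not help if the site keys on the SSL handshake.

```python
# Sketch only: header values and names below are illustrative assumptions.
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Fallback so the helper can be exercised outside calibre.
    BasicNewsRecipe = object

# Hypothetical values; replace with what the developer tools show.
BROWSER_HEADERS = [
    ('User-Agent',
     'Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/118.0'),
    ('Accept',
     'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
    ('Accept-Language', 'en-US,en;q=0.5'),
]


def merge_headers(existing, overrides):
    """Return a new header list in which each override replaces any
    existing header of the same name (compared case-insensitively)."""
    override_names = {name.lower() for name, _ in overrides}
    kept = [(n, v) for n, v in existing if n.lower() not in override_names]
    return kept + list(overrides)


class FirefoxLikeRecipe(BasicNewsRecipe):
    title = 'Example recipe (hypothetical)'

    def get_browser(self, *args, **kwargs):
        br = BasicNewsRecipe.get_browser(self, *args, **kwargs)
        # addheaders is the list of (name, value) pairs sent with each request.
        br.addheaders = merge_headers(br.addheaders, BROWSER_HEADERS)
        return br
```

Replacing same-named headers instead of appending avoids sending two User-Agent values, which can itself look unusual to a server.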