Old 01-01-2024, 08:28 AM   #1
wurbl
The Times and Sunday Times UK

There are several issues with the recipe for scraping this newspaper, which I think stem from changes to the way the website works and is structured.

(1) the full article is not included; (2) random bold text saying 'Sponsored' appears; (3) the related articles section should be removed or reformatted; (4) the byline and date are duplicated and wrongly formatted; (5) the byline should be separated from the article summary; (6) captions should be set apart and distinguished in italics.

The main issue is the missing article text, which I think is because they have changed their login page from 'login.thetimes.co.uk' to 'account.thetimes.co.uk', which makes the site harder to scrape. The other issues are formatting problems that can probably be solved by updating the recipe, but I am not familiar with how to do this. Has anyone made a fix for any of these problems?
Old 01-02-2024, 06:04 AM   #2
unkn0wn
I can try; PM me your login details.
Old 06-23-2024, 11:57 AM   #3
willdixon
I'm still having the same problem. I've PMed you my login details.
Old 06-29-2024, 04:31 AM   #4
unkn0wn
https://github.com/kovidgoyal/calibr...00a69ec2350442

I'm getting the content from an archive for now.

Logging in is a lot more complex now: it requires visiting the login page, copying a link from the response header, visiting that link with the login details added in cookies, and so on. That is too much testing for me, but once the cookies are set we can access the content without JS. Maybe someone else could try.
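
For anyone who wants to pick this up, the steps above could be roughly sketched with the Python standard library. This is only a sketch under assumptions, not the recipe's code: I am assuming the link arrives in a `Location` header, and the `/login` path, the query string, and the cookie names below are hypothetical placeholders. The helper names (`link_from_headers`, `cookie_header`, `build_request`) are made up for illustration.

```python
import urllib.request
from urllib.parse import urljoin


def link_from_headers(headers, base_url, name="Location"):
    """Return the absolute link carried in a response header, or None.

    Assumption: the link is in the 'Location' header; pass a different
    `name` if the site uses another header.
    """
    value = headers.get(name)
    return urljoin(base_url, value) if value else None


def cookie_header(cookies):
    """Serialise a dict of cookies into a Cookie request-header value."""
    return "; ".join(f"{key}={val}" for key, val in cookies.items())


def build_request(url, cookies):
    """Build (but do not send) a request that carries the given cookies."""
    req = urllib.request.Request(url)
    req.add_header("Cookie", cookie_header(cookies))
    return req


if __name__ == "__main__":
    # Hypothetical header and cookie values, for illustration only.
    headers = {"Location": "/authorize?state=abc"}
    link = link_from_headers(headers, "https://account.thetimes.co.uk/login")
    req = build_request(link, {"session": "PLACEHOLDER"})
    print(req.full_url)
    print(req.get_header("Cookie"))
```

Nothing here touches the network; the real work is finding out which header and which cookies the site actually expects, and then sending the request with `urllib.request.urlopen(req)` or with the browser object calibre gives a recipe.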