08-23-2011, 06:43 PM | #106 |
creator of calibre
Posts: 44,565
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The original error is a UnicodeEncodeError, which implies that the string being encoded/decoded most definitely does not have only ascii chars.
Assuming the url template itself does not contain non ascii chars, the code I last posted will result in a URL with only ascii chars. |
08-23-2011, 07:57 PM | #107 |
Calibre Plugins Developer
Posts: 4,688
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
The URL template contains no non-ascii characters. But my reading of what you posted shows there is no encoding being done (such as to latin-1), so while the result does indeed have no non-ascii characters (to allow QUrl.fromEncoded to not blow up), it also doesn't actually give me a usable result.
Frigging encodings, hate 'em, grumble. I don't understand why this is all breaking now when it used to work fine (the reason the plugin comes with its own right-click test menu with 4 sets of data on the configuration screen is because I used it to test exactly this scenario with exactly that title in the past). |
08-23-2011, 08:18 PM | #108 |
creator of calibre
Posts: 44,565
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
final_url=template.format(title=quote(titlevar), author=quote(authorvar)).encode('ascii')
|
08-23-2011, 08:44 PM | #109 |
Calibre Plugins Developer
Posts: 4,688
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Nope... lol
It's like playing whack-a-mole. Adding .encode('ascii', 'ignore') will prevent the error from happening when supplied a name like "Nieznany wyjątek", and allow QUrl.fromEncoded to not blow up. However, it destroys the effect of .encode('latin-1') or similar that I need to do to cater for titles like "De l'inconvénient d'être né", which becomes something like: "De l'inconvnient d'tre n". Which means I have to avoid converting to ascii. Which means I can't use QUrl.fromEncoded(). Which means I can't construct a QUrl, to give to open_url(). Argggghhhhhhhhh.... (And by the way your patience is clearly better than mine - my lack of Python knowledge slays me at times). |
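To make the trade-off above concrete, here is a minimal sketch of what the 'ignore' error handler does to the two example names. It is written in Python 3 syntax, which is an assumption (the plugin code in this thread is Python 2), and `encode_lossy` is a hypothetical helper name used only for illustration.

```python
def encode_lossy(text, encoding):
    """Encode text, silently dropping characters the encoding cannot represent."""
    return text.encode(encoding, "ignore")


for name in ("Nieznany wyjątek", "De l'inconvénient d'être né"):
    # ascii drops every accented character; latin-1 keeps the French
    # accents but still drops the Polish "ą", which latin-1 lacks.
    print(repr(encode_lossy(name, "ascii")), repr(encode_lossy(name, "latin-1")))
```

Neither encoding on its own preserves both names, which is why the bytes need percent-encoding afterwards rather than a further conversion to ascii.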
08-23-2011, 08:49 PM | #110 |
creator of calibre
Posts: 44,565
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
I'm confused. Here is what needs to happen:
You encode the title and author in utf-8 or latin1 and then use quote on the result. This gives you an ascii title and author. You then run format on the template with the quoted title and author. This should give you an ascii url, so why is fromEncoded blowing up? |
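The sequence described above (encode, then quote, then format) can be sketched as follows. This is only an illustration under stated assumptions: it uses Python 3's `urllib.parse.quote_plus` rather than the Python 2 `urllib` functions the plugin uses, and `build_search_url` and the example.com template are hypothetical names, not part of calibre or the plugin.

```python
from urllib.parse import quote_plus


def build_search_url(template, title, author, encoding="utf-8"):
    """Percent-encode title and author, then substitute them into template."""
    # Step 1: encode to bytes in the encoding the target website expects.
    # Step 2: percent-encode those bytes, which yields a pure-ASCII string.
    quoted_title = quote_plus(title.encode(encoding, "ignore"))
    quoted_author = quote_plus(author.encode(encoding, "ignore"))
    # Step 3: substitute the already-quoted values into the template.
    url = template.format(title=quoted_title, author=quoted_author)
    # Raises UnicodeEncodeError if the template itself held non-ASCII text.
    url.encode("ascii")
    return url


print(build_search_url("http://example.com/search?q={title}+{author}",
                       "De l'inconvénient d'être né", "Nieznany wyjątek"))
```

If the result of these three steps is not pure ASCII, something after the quoting step must be undoing the percent-encoding.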
08-24-2011, 04:19 AM | #111 |
Calibre Plugins Developer
Posts: 4,688
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
It only blows up on certain combinations. This is more of the code I originally posted (slightly adapted to try to make sense as a snippet) - it is unchanged from the zip downloadable from this thread:
Code:
# encoding is 'utf-8' or 'latin-1', based on the configured website URL
vals = mi.all_non_none_fields()
fixed_vals = {}
for k in vals:
    fixed_vals[k] = unicode(vals[k])  # convert non-string types
    fixed_vals[k] = self.convert_to_search_text(fixed_vals[k], encoding)
    # self.convert_to_search_text() effectively does this:
    # text = quote_plus(text.encode(encoding, 'ignore'))
# Substitute our quoted, encoded values into our tokenised_url (which "might" contain quoted chars)
url = template_formatter.safe_format(tokenised_url, fixed_vals, 'STI template error', mi)
# Next line blows up if Author was Nieznany wyjątek
# url at this point is: http://www.google.com/#sclient=psy&q=%22Nieznany wyjÄ…tek%22+%22Unknown%22
open_url(QUrl.fromEncoded(url))

So in answer to your question - no, it is not pure ascii characters by the time it reaches QUrl.fromEncoded, and it cannot be or else the original content is corrupted.

Note that if I avoid QUrl completely and replace the last line above with webbrowser.open(url), then it works perfectly fine on EVERY case of inputs. The URL is already exactly how I want it to be passed to the webbrowser object. It is QUrl which is "mangling it". I can't use QUrl(url) because QUrl tries to be too clever and does its own quote substitutions on characters it spots - I can't give it the "raw" URL, and .fromEncoded() is useless because of the latin-1 decoding it does internally, which blows up for non-ascii. I also note that in some cases even though it does not blow up, it also doesn't give the right result in the web browser. QUrl sucks.

So unless you have any bright ideas I am going back to webbrowser.open(). That won't please the Linux users (not that I care personally, not being one of them, but there are obviously a few out there). The only thing I could do is have an if is_linux statement to invoke the existing open_url() with the understanding that it doesn't work for any non-ascii names. Which is a crappy workaround.
Last edited by kiwidude; 08-24-2011 at 05:14 AM. |
08-24-2011, 01:09 PM | #112 |
creator of calibre
Posts: 44,565
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The thing I'm confused by is this: if you are doing quote_plus() on the title and author and there are no non ascii characters in the template, where are the non ascii characters in the URL you pass to QUrl coming from?
quote_plus should be replacing all non ascii chars with percent encoded equivalents. |
08-24-2011, 01:31 PM | #113 |
Calibre Plugins Developer
Posts: 4,688
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
@Kovid - nowhere I saw in the documentation does it say quote or quote_plus produce an ascii only output. Just that they percent encode "special characters" which I take to mean things like dashes, spaces, quotes etc. It does not appear to percent encode every non-ascii character.
|
08-24-2011, 01:46 PM | #114 |
creator of calibre
Posts: 44,565
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
URLs are supposed to have all non ascii chars encoded as UTF-8 and then percent encoded, see RFC 3986.
For example:
urllib.quote(u'\xe1'.encode('utf-8'))
'%C3%A1'
repr(u'\xe1'.encode('utf-8'))
'\xc3\xa1' |
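For readers on Python 3, a rough equivalent of the example above is sketched below. The move to `urllib.parse` is an assumption about the reader's environment, not something from this thread. It also shows that the same character percent-encodes differently under latin-1, so the receiving site has to agree on the encoding.

```python
from urllib.parse import quote, unquote

ch = "\xe1"  # the character "á"

utf8_quoted = quote(ch.encode("utf-8"))      # two bytes, so two escapes
latin1_quoted = quote(ch.encode("latin-1"))  # one byte, so one escape

print(utf8_quoted)
print(latin1_quoted)
# Decoding only round-trips when the same encoding is used on both sides.
print(unquote(utf8_quoted, encoding="utf-8") == ch)
print(unquote(latin1_quoted, encoding="latin-1") == ch)
```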
08-24-2011, 04:52 PM | #115 |
Calibre Plugins Developer
Posts: 4,688
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Ok, I shouldn't have written my last reply from work; I should have waited until I got home. What I wrote was rubbish, quote_plus is percent encoding the values.
Prior to calling the safe_format function, the title/author are sitting there in the "fixed_vals" dict with values like this: Nieznany+wyj%C4%85tek
However, after calling safe_format, the value when substituted into the URL now looks like this:
Code:
http://www.google.com/#sclient=psy&q=%22Nieznany wyjÄ…tek%22+%22Unknown%22
So perhaps we are back to pointing our finger at the safe_format function, which after all was the one thing that has changed between the allegedly working and currently not working releases. Where was the old safe_format code located? If I can find it on an old branch on my machine I could try it to see if that is the problem. |
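The garbled value above is consistent with the quoted string being percent-decoded and its UTF-8 bytes then being read as cp1252. The sketch below only reproduces that symptom; it is an assumption about the kind of transformation involved, not a claim about what safe_format actually does internally. It uses Python 3's `urllib.parse`, and `decode_quoted` is a hypothetical helper name.

```python
from urllib.parse import unquote_plus


def decode_quoted(value, encoding):
    """Undo quote_plus, interpreting the resulting bytes in the given encoding."""
    return unquote_plus(value, encoding=encoding)


quoted = "Nieznany+wyj%C4%85tek"  # the value seen before safe_format

intended = decode_quoted(quoted, "utf-8")   # bytes C4 85 read as one character
garbled = decode_quoted(quoted, "cp1252")   # bytes C4 85 read as two characters

print(intended)
print(garbled)
```

If something in the formatting step unquotes its arguments like this, the URL would no longer be pure ascii by the time it reaches QUrl.fromEncoded.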
08-24-2011, 05:25 PM | #116 |
Calibre Plugins Developer
Posts: 4,688
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Hmmm... I copied the old SafeFormat class from save_to_disk into my plugin - and everything works as you would expect it to. So it is indeed the SafeFormat class that is the cause of all the new problems for this plugin.
Now I know you made changes for thread safety - but for my plugin purposes can I just run with the "old" SafeFormat function embedded? Or is that going to cause problems with the database etc... Or alternatively can you "fix" the "new" SafeFormat function... Thanks for your help in getting to this point. |
08-24-2011, 05:53 PM | #117 |
creator of calibre
Posts: 44,565
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You'll have to wait for charles to comment on SafeFormat, that's his bailiwick.
|
08-26-2011, 11:48 AM | #118 |
Junior Member
Posts: 1
Karma: 10
Join Date: May 2011
Device: none
|
Problem when installing / updating plugin
calibre, version 0.8.15
ERROR: Plugin installation (translated from Dutch): A problem occurred while installing this plugin. This plugin will now be uninstalled. Please post the error message from the details below in the forum thread for this plugin and restart calibre.
Traceback (most recent call last):
  File "site-packages\calibre\gui2\dialogs\plugin_updater.py", line 731, in _install_clicked
  File "site-packages\calibre\gui2\preferences\plugins.py", line 385, in check_for_add_to_toolbars
  File "site-packages\calibre\customize\__init__.py", line 539, in load_actual_plugin
AttributeError: 'module' object has no attribute 'SearchTheInternetAction' |
08-27-2011, 02:02 PM | #119 |
Member
Posts: 18
Karma: 34
Join Date: Mar 2011
Device: android
|
calibre, version 0.8.16
ERROR: Install Plugin Failed (translated from Czech): A problem occurred during installation of the plugin. This plugin will now be uninstalled. Please post the error message from the details below in the forum thread for this plugin and restart calibre.
Traceback (most recent call last):
  File "site-packages/calibre/gui2/dialogs/plugin_updater.py", line 731, in _install_clicked
  File "site-packages/calibre/gui2/preferences/plugins.py", line 385, in check_for_add_to_toolbars
  File "site-packages/calibre/customize/__init__.py", line 539, in load_actual_plugin
AttributeError: 'module' object has no attribute 'SearchTheInternetAction' |
08-27-2011, 02:42 PM | #120 |
Calibre Plugins Developer
Posts: 4,688
Karma: 2162246
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
@Bojoko/vladacr - absolutely no idea what that error means, perhaps Kovid may have a suggestion. With 3,500 downloads and only you two reporting an issue I'm thinking the problem is not particularly widespread. That you are both using international versions based on the error messages above is perhaps relevant.
- What is your operating system?
- Does it happen on other plugins from these forums that you try to install? |