09-13-2015, 03:50 PM | #1 |
Sigil Developer
Posts: 7,912
Karma: 5449552
Join Date: Nov 2009
Device: many
|
New Plugin Features in the upcoming Sigil 0.8.900
There will be new features available in the upcoming release of Sigil 0.8.900 for plugin developers. These include:

- launcher version returning 20150909
- spell checking using Sigil's Hunspell for both Python 2.7 and 3.4 plugins
- auto correction of text/html media-types to 'application/xhtml+xml'
- our own version of BeautifulSoup4 called sigil_bs4 that works on both Python 2.7 and 3.4
- support via sigil_bs4 for our version of Google's gumbo xhtml parser
- all users will have access to the embedded Python 3.4 interpreter via the python 3 plugin interface without having to install anything extra
- access to a list of resources that were "selected" in the Book Browser window when the plugin was launched. In this manner you can write plugins that only work on the resources selected by the user before launching the plugin.

Python 3.4 plugins will have automatic access to the following pre-installed site-packages:

- PIL for image manipulation
- regex for better/easier to use regular expressions
- sigil_bs4 - our specially modified version of BeautifulSoup 4.4.0
- lxml - an element tree based interface to libxml2 for python users
- html5lib - a pure python html5 parser
- sigil_gumbo - our html5 parser based on Google's gumbo parser (via sigil_bs4)

Python 2.7 plugins will have access to the following "if" a Python 2.7 interpreter exists on the user's system and is selected:

- sigil_bs4 - our specially modified version of BeautifulSoup 4.4
- sigil_gumbo - our html5 parser based on Google's gumbo parser (via sigil_bs4)

Here is some sample plugin.py code for using the Hunspell Spell Checker in a plugin:

Code:
import os

def run(bk):
    # Example of using the Hunspell spell checker

    # get a list of all of the locations that Sigil knows about
    # where Hunspell dictionaries are installed
    dic_dirs = bk.get_dictionary_dirs()

    # check each location to find a pre-installed dictionary in your desired language
    afffile = None
    dicfile = None
    for adir in dic_dirs:
        afile = os.path.join(adir, "en_US.aff")
        dfile = os.path.join(adir, "en_US.dic")
        if os.path.exists(afile) and os.path.exists(dfile):
            afffile = afile
            dicfile = dfile
            break

    # Load the bk.hspell class with the specified dictionary and use it
    # to spell-check words and make suggestions if needed
    if bk.hspell is not None and afffile is not None and dicfile is not None:
        bk.hspell.loadDictionary(afffile, dicfile)
        checklist = ["hello", "goodbye", "don't", "junkj", "misteak"]
        for word in checklist:
            res = bk.hspell.check(word)
            if res != 1:
                print(word, "incorrect", bk.hspell.suggest(word))
            else:
                print(word, "correct")
        # clean up after yourself so that you can reuse the bk.hspell class
        # with a different dictionary
        bk.hspell.cleanUp()

    # 0 tells the launcher that the plugin succeeded
    return 0

Here is a sample of how to get information on the list of resources that were selected in Sigil's BookBrowser when the plugin was launched:

Code:
def run(bk):
    # employ the selected_iter (iterator) to get a list of all files
    # selected in the BookBrowser by the user before the plugin was launched
    for (id_type, id) in bk.selected_iter():
        if id_type == "manifest":
            href = bk.id_to_href(id, ow=None)
            mime = bk.id_to_mime(id, ow=None)
            print(id_type, id, href, mime)
        else:
            # not a file in the manifest (other file): its id is its href
            # relative to the ebook root
            print(id_type, id)

    # 0 tells the launcher that the plugin succeeded
    return 0

The last sample uses sigil_bs4, which is a Sigil enhanced version of BeautifulSoup4. Note, the Sigil enhanced version of BS4 includes the ability for a single code base to work with Python 2.7 and Python 3.4, fixes bugs in how namespaces and namespace attributes are handled in builder/_lxml.py, and extends the interface to include two new methods: serialize_xhtml() and prettyprint_xhtml(). These understand a lot more about serializing and pretty printing xhtml without mistakenly adding spaces in preformatted or inline tags, etc. serialize_xhtml() will try to serialize your soup tree with the minimum of changes, whereas prettyprint_xhtml() will fully prettyprint your xhtml code.

sigil_bs4 includes full BS4 support, but because of our changes we did not want to have collisions with the stock "bs4" python module and so we renamed it.

Code:
def run(bk):
    # examples for using the bs4/gumbo parser to process xhtml
    import sigil_bs4
    import sigil_gumbo_bs4_adapter as gumbo_bs4

    # sample markup (note the mis-nested <i>/<b> tags)
    samp = """
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml/" xml:lang="en" lang="en-US">
<head><title>testing & entities</title></head>
<body>
<p class="first second">this is*the*<i><b>copyright</i></b> symbol "©"</p>
<p xmlns:xlink="http://www.w3.org/xlink" class="second" xlink:href="http://www.ggogle.com">this used to test atribute namespaces</p>
</body>
</html>
"""

    soup = gumbo_bs4.parse(samp)
    for node in soup.find_all(attrs={'class':'second'}):
        print(node)

    print(soup.serialize_xhtml())
    print(soup.prettyprint_xhtml(indent_level=0, eventual_encoding="utf-8", formatter="minimal", indent_chars=" "))

Please let me know if you have any questions and I will be happy to try and help.

Thanks,

KevinH

Last edited by KevinH; 09-13-2015 at 04:08 PM. |
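Since the features listed in this post depend on the launcher reporting version 20150909 or later, a plugin may want to check for that before using them. The following is only a rough sketch: it assumes the launcher version can be read with bk.launcher_version() and that the spell checker is exposed as bk.hspell as in the sample above; has_new_plugin_features() is a made-up helper name, not part of the plugin interface.

```python
MIN_LAUNCHER_VERSION = 20150909  # launcher version listed for Sigil 0.8.900


def has_new_plugin_features(bk):
    """Return True if the launcher looks new enough for the 0.8.900 features."""
    # Assumption: the launcher version is available as bk.launcher_version().
    # If the container object does not provide it, treat the version as 0.
    get_version = getattr(bk, "launcher_version", None)
    version = get_version() if callable(get_version) else 0
    return version >= MIN_LAUNCHER_VERSION


def run(bk):
    if not has_new_plugin_features(bk):
        print("This plugin needs Sigil 0.8.900 or later.")
        return 1  # non-zero return value reports a failure
    # bk.hspell may be None, as the spell checking sample also allows for
    hspell = getattr(bk, "hspell", None)
    print("Hunspell available:", hspell is not None)
    return 0
```

Checking the version once at the top of run() keeps the rest of the plugin free of per-feature guards.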
09-14-2015, 01:38 AM | #2 |
Connoisseur
Posts: 81
Karma: 10
Join Date: Nov 2013
Device: Kobo Aura HD
|
|
09-14-2015, 08:08 AM | #3 | |
Sigil Developer
Posts: 7,912
Karma: 5449552
Join Date: Nov 2009
Device: many
|
Hi,
Quote:
The plugin is provided with those paths by the launcher. Thanks, Kevin |
|
09-14-2015, 01:25 PM | #4 |
Connoisseur
Posts: 81
Karma: 10
Join Date: Nov 2013
Device: Kobo Aura HD
|
Hi,
I saw in your sample code that it loads only the .dic and .aff files. (Maybe I'm not correct; my programming skills are below zero.) Is it possible for plugins to somehow get those two files plus the user dictionaries that are selected in the preferences? Something like a variable that contains the dictionary and the selected user dictionaries. Thanks |
09-14-2015, 02:14 PM | #5 |
Sigil Developer
Posts: 7,912
Karma: 5449552
Join Date: Nov 2009
Device: many
|
Hi gipsy,
No, those user dictionaries are simply word lists with no structure. They are not actually part of Hunspell and no special software is needed to parse them. If you really want user-specified words that are not in any dictionary, you will have to parse the user's default word list yourself. That is easily doable in Python. In fact, if you have lots of user-specified word lists, you would be better off creating real Hunspell dictionaries (.dic and .aff files) from them so that they are usable by others. See the Hunspell website for details on how to create your own official Hunspell dictionaries. Hope this helps, Kevin |
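To illustrate how little is needed to parse such a word list, here is a minimal sketch. It assumes the list is a UTF-8 text file with one word per line; the function names and the way the path is obtained are hypothetical and would have to be adapted to wherever the user's word list actually lives.

```python
import io
import os


def load_user_words(path):
    """Read a plain word list (assumed: one word per line, UTF-8) into a set."""
    words = set()
    if not os.path.exists(path):
        return words
    with io.open(path, "r", encoding="utf-8") as f:
        for line in f:
            word = line.strip()
            if word:  # skip blank lines
                words.add(word)
    return words


def is_known(word, user_words, hunspell_check=None):
    """True if the word is in the user's list or accepted by the checker."""
    if word in user_words:
        return True
    # hunspell_check is any callable returning 1 for a correctly spelled
    # word, for example bk.hspell.check from the sample in the first post.
    return hunspell_check is not None and hunspell_check(word) == 1
```

A plugin could then call is_known(word, user_words, bk.hspell.check) so that words from the user's list are accepted before Hunspell is consulted.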
09-14-2015, 02:34 PM | #6 |
Addict
Posts: 202
Karma: 62362
Join Date: Jul 2015
Device: Sony
|
I look forward to these new features. It's great to see Sigil continuing to be developed as this is my favourite ePub editor.
I note that the next version of Sigil will use PIL. However, since PIL has been superseded by Pillow, I am using Pillow to develop a new feature for my plugin for tidying ePub files (at https://www.mobileread.com/forums/sho...d.php?t=264378). I was wondering whether it may be possible to include Pillow instead of PIL in the next version of Sigil; if not, will it be possible for me to use Pillow with my updated plugin if users install Pillow separately on a PC? Thanks. BTW: The new feature for my plugin will allow the cover image for an ePub to be resized. |
09-14-2015, 02:48 PM | #7 | |
Sigil Developer
Posts: 7,912
Karma: 5449552
Join Date: Nov 2009
Device: many
|
Hi CalibUser,
On Python 3, Pillow actually installs as PIL in site-packages (I think so that it can replace the old Python 2 only PIL version). So yes, when I said PIL, it is actually Pillow that is being included in the next release of Sigil. Note, this is for Python 3 plugins only. As far as I know, the old PIL will not work on Python 3 at all. KevinH
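For plugin code this means the import stays "PIL" even though Pillow is what gets loaded. Below is a rough sketch of how a plugin might confirm that and resize a cover image; pillow_version(), scaled_size() and resize_cover() are made-up helper names, and the file paths passed in are up to the plugin. As far as I know the old PIL package defines neither of the two version attributes checked here, so None is treated as "not Pillow".

```python
def pillow_version():
    """Return Pillow's version string, or None if it cannot be determined."""
    try:
        import PIL  # Pillow installs under the PIL package name
    except ImportError:
        return None
    # Newer Pillow releases set __version__, older ones PILLOW_VERSION
    return getattr(PIL, "__version__", None) or getattr(PIL, "PILLOW_VERSION", None)


def scaled_size(width, height, target_width):
    """Size for the given target width that keeps the aspect ratio."""
    new_height = int(round(height * target_width / float(width)))
    return target_width, max(1, new_height)


def resize_cover(src_path, dst_path, target_width):
    """Resize the image at src_path to target_width and save it to dst_path."""
    from PIL import Image
    img = Image.open(src_path)
    width, height = img.size
    img = img.resize(scaled_size(width, height, target_width))
    img.save(dst_path)
```

For a 1200 x 1600 cover and a target width of 600, scaled_size() gives (600, 800).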
|
|
|