10-27-2009, 02:01 PM | #1 |
Grand Sorcerer
Posts: 5,185
Karma: 25133758
Join Date: Nov 2008
Location: SF Bay Area, California, USA
Device: Pocketbook Touch HD3 (Past: Kobo Mini, PEZ, PRS-505, Clié)
|
Change single quotes to double quotes
Does anyone know any batch methods for converting single quotes, but not apostrophes, to double quotes?
I'm OCR'ing a scanned book, and I want a final version I can read without twitching. I'm considering using Word's find-and-replace to change space-apostrophe into space-dblquote, and apostrophe-space into dblquote-space. Especially since this book seems to favor apostrophe-s even for words that end with s. (Yeats's. Moss's. Democritus's.) |
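For anyone who would rather script it than use Word, the two find-and-replace passes described above can be sketched in Python. This is only a minimal sketch: the function name `naive_single_to_double` and the sample sentence are made up for illustration, and it deliberately does nothing smarter than the space-apostrophe and apostrophe-space swaps, so quotes at the very start of a line or right before punctuation are left alone.

```python
def naive_single_to_double(text: str) -> str:
    """Hypothetical helper: mimic the two Word find-and-replace passes."""
    # Opening quote: an apostrophe right after a space becomes a double quote.
    text = text.replace(" '", ' "')
    # Closing quote: an apostrophe right before a space becomes a double quote.
    text = text.replace("' ", '" ')
    return text


sample = "He said, 'Yeats's poems are Moss's favourites' and left."
print(naive_single_to_double(sample))
```

Possessives like Yeats's survive because the apostrophe has a letter on both sides. A plural possessive written with a bare trailing apostrophe (followed by a space) would be converted by mistake, so a manual check afterwards is still needed.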
10-27-2009, 03:06 PM | #2 |
frumious Bandersnatch
Posts: 7,536
Karma: 19000001
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
|
10-27-2009, 03:18 PM | #3 |
Grand Sorcerer
Posts: 5,185
Karma: 25133758
Join Date: Nov 2008
Location: SF Bay Area, California, USA
Device: Pocketbook Touch HD3 (Past: Kobo Mini, PEZ, PRS-505, Clié)
|
Presumably not all; I suppose there are probably single quotes within the double quotes that are supposed to stay that way. But I might have an easier time replacing them all, and then manually fixing, than just trying to manually change them.
|
10-27-2009, 05:11 PM | #4 | |
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
|
Quote:
Or, if you are feeling brave, I could send you the WIP version of Pacify.py... but you'll need Python 3 installed. - Ahi |
|
10-27-2009, 05:17 PM | #5 | |
Grand Sorcerer
Posts: 5,185
Karma: 25133758
Join Date: Nov 2008
Location: SF Bay Area, California, USA
Device: Pocketbook Touch HD3 (Past: Kobo Mini, PEZ, PRS-505, Clié)
|
Quote:
It's not heavily formatted, and when I get the basic OCR correction done, I'll send it to you. (40 pages down, 600 to go. Sigh.) |
|
10-27-2009, 05:20 PM | #6 |
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
|
|
10-27-2009, 07:52 PM | #7 |
Sigil & calibre developer
Posts: 2,487
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
Here are some Python regexes I wrote to do exactly what you want... I know Python scares you, but some type of regex (Python, Perl, sed...) is a very powerful tool to learn, especially if you are editing books regularly.
|
10-27-2009, 08:20 PM | #8 |
Reader
Posts: 11,504
Karma: 8720163
Join Date: May 2007
Location: South Wales, UK
Device: Sony PRS-500, PRS-505, Asus EEEpc 4G
|
I've just done this, in order to get the quotation marks consistent in a short story collection pieced together from various sources. This is what I did:
1. Search for double quotes (used for internal quotations). Replace with @ or # as a placeholder.
2. Replace single-quote plus space, and space plus single-quote, with double quotes and spaces.
3. Replace @ or # with single quotes.
4. Then check the lot, just in case. (Sometimes authors add punctuation immediately after closing a quotation.)
I did all this in Word, then put the edited version into Book Designer. |
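The placeholder steps above could also be scripted. Here is a rough Python sketch of the same idea, not the exact Word procedure: the function name `swap_quotes`, the choice of `@` as the placeholder, and the sample sentence are all assumptions for illustration. It refuses to run if the placeholder already appears in the text, since that would corrupt step 3.

```python
PLACEHOLDER = "@"  # assumed placeholder; it must not occur anywhere in the book text


def swap_quotes(text: str) -> str:
    """Hypothetical helper: outer single quotes become double, inner double become single."""
    if PLACEHOLDER in text:
        raise ValueError("placeholder already occurs in the text")
    # Step 1: park the existing double quotes (inner quotations) on the placeholder.
    text = text.replace('"', PLACEHOLDER)
    # Step 2: single quotes next to a space become double quotes.
    text = text.replace(" '", ' "').replace("' ", '" ')
    # Step 3: the parked inner quotes come back as single quotes.
    text = text.replace(PLACEHOLDER, "'")
    return text


sample = "She said, 'He called it \"odd\" and it isn't mine' quietly."
print(swap_quotes(sample))
```

Step 4 still applies: a closing quote followed directly by punctuation is not caught by step 2, so the result needs a read-through.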
10-27-2009, 08:56 PM | #9 | |
Grand Sorcerer
Posts: 5,185
Karma: 25133758
Join Date: Nov 2008
Location: SF Bay Area, California, USA
Device: Pocketbook Touch HD3 (Past: Kobo Mini, PEZ, PRS-505, Clié)
|
Quote:
FWIW, my programming skills: I can write very very simple .bat files for dos. VERY simple. With Google to help me figure out the commands I need. And backing up all the files first, in case I accidentally erase them all with the wrong command. |
|
10-27-2009, 08:59 PM | #10 | |
Grand Sorcerer
Posts: 5,185
Karma: 25133758
Join Date: Nov 2008
Location: SF Bay Area, California, USA
Device: Pocketbook Touch HD3 (Past: Kobo Mini, PEZ, PRS-505, Clié)
|
Quote:
I'm just annoyed that it's disrupting the spellcheck; FineReader stops on all the words with an apostrophe at the end to ask me if it's misspelled. And FR's search-and-replace is much more limited than Word's; I don't want to do the replacing there. |
|
10-27-2009, 11:08 PM | #11 | |
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
|
Quote:
Just kidding. Python's re module should run under Python 2.6.4, on any OS, I would think. (I'm just learning myself, though.) User_none hasn't given you a full script, though. I think the intent was just to provide the actual regular expressions, which you could use inside another script, or with a text editor that does regex search-and-replace in a way compatible with Python's, or in a very similar way. I imagine there are such editors for Windows. I don't know what they are, mind you... (Do keep in mind, however, that you can run Ubuntu from a live CD or in a virtual box if you're so enamored of your precious Windows....) |
|
10-28-2009, 12:11 AM | #12 | |
Grand Sorcerer
Posts: 5,185
Karma: 25133758
Join Date: Nov 2008
Location: SF Bay Area, California, USA
Device: Pocketbook Touch HD3 (Past: Kobo Mini, PEZ, PRS-505, Clié)
|
Quote:
Switching to Unix of any sort would mean learning a whole new OS *and* a whole new set of programs, from scratch. It's on the list of things I'd like to do if I ever have six months when I won't mind the frustration of not being able to do what I *know* the computer is supposed to do. (It's slightly above "learn Dvorak keyboard" on my personal list of projects. And "learn perl" is right behind "learn Sindarin" on my list of language skills to acquire.) I've wanted to switch to Linux since, oh, just before Win-NT came out? I'm just not geek enough to manage it on my own, and the geekfriend who introduced me to Unix went off to Colorado to attend grad school. (Any real switch, as opposed to playing with a bootable disc, is probably waiting for a good friend who can visit my house at least twice a week, often on short notice, to remind me how to do complex procedures like "move files from one place to another" and "play this movie so it fills up the screen.") |
|
10-28-2009, 03:13 AM | #13 |
Guru
Posts: 610
Karma: 4150
Join Date: Mar 2008
Device: Sony Reader PRS-T3, Kobo Libra H2O
|
I like to use this regular expression:
Code:
Search = ([>_])'(.*?[^a-z_])'([<_])
Replace = $1{opening_quote}$2{closing_quote}$3
It is intended for HTML source with reasonable typography (e.g. proper punctuation before closing quotes) and can't handle some situations (words starting with an apostrophe followed by real opening quotes), but apart from that it performs surprisingly well. |
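In case it helps anyone trying this from a script, here is one way the search/replace pair above might look with Python's re module. Several things here are assumptions rather than part of the original post: the `_` in the pattern is read as a visible stand-in for a space, the function name `curl_single_quotes` is made up, and curly double quotes are just one possible choice for `{opening_quote}` and `{closing_quote}`. Python's replacement syntax uses group references such as `\1` rather than `$1`; the sketch uses a small function instead to sidestep escaping.

```python
import re

# Assumption: "_" in the posted pattern stands for a space character.
PATTERN = re.compile(r"([> ])'(.*?[^a-z ])'([< ])")


def curl_single_quotes(html: str, opening: str = "\u201c", closing: str = "\u201d") -> str:
    """Hypothetical helper: swap quote-like single quotes for the given quote pair."""
    def repl(match: re.Match) -> str:
        # Keep the character before, the quoted text, and the character after.
        return f"{match.group(1)}{opening}{match.group(2)}{closing}{match.group(3)}"

    return PATTERN.sub(repl, html)


sample = "<p>'It's late,' she said.</p>"
print(curl_single_quotes(sample))
```

The apostrophe in "It's" is skipped because the character after it is a letter, not a tag or a space. Since `.` does not match a newline by default, a quotation that spans several source lines would not be picked up by this sketch.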
10-28-2009, 11:21 PM | #14 | |
Guru
Posts: 714
Karma: 2003751
Join Date: Oct 2008
Location: Ottawa, ON
Device: Kobo Glo HD
|
Quote:
You don't have to remove Vista or inconvenience your family. |
|
10-30-2009, 12:01 AM | #15 |
Guru
Posts: 819
Karma: 171672846
Join Date: Jan 2009
Location: Alberta, Canada
Device: PRS-350, PRS-650, iPhone 6, NVIDIA Shield K1
|
I use "eBook Tidy", which has a "change single quote to double quote" function. This function changes what the program perceives as quotes and does a pretty good job of leaving apostrophes alone. The only problem with eBook Tidy is that it only handles text: pictures get wiped out of the material as soon as it is imported. It works great for novels and text-only books, though. I'm pretty sure that if you Google eBook Tidy, it leads you to the download page.
|