Old 10-13-2023, 08:51 AM   #1
Noirtier
Member
 
Posts: 17
Karma: 10
Join Date: Jan 2021
Device: Kobo Forma
En-dash option for smarten punctuation

Hi,

I was just testing out the "Polish books" feature for the first time, and noticed that it replaced double hyphens with em-dash, even if there are spaces to either side of them. The same happens in the Edit book → Tools option for this.

The usual convention I am used to seeing is either having an en-dash with spaces, or much more rarely, an em-dash without any spaces (typically I only see this in more antiquated sources). I.e.:

"...text -- text..." → "...text – text..."
"...text--text..." → "...text—text..."

Maybe there are other conventions/preferences over this, but it would be very much appreciated if there was at least an option to use en-dashes instead of em-dashes.

Old 10-13-2023, 12:01 PM   #2
kovidgoyal
creator of calibre
Posts: 44,566
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
This comes from the smarten punctuation Python library. I'm afraid adding a configuration UI for this is not worth the effort, at least for me. Patches welcome.
Old 10-13-2023, 01:44 PM   #3
Quoth
the rook, bossing Never.
Posts: 12,379
Karma: 92073397
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
The USA and elsewhere differ in how em dash and en dash are used for an aside. The USA uses an em dash with no spaces; most other places, esp. the UK & Ireland, use an en dash with a space before and after. This applies only to asides, i.e. where the dashes are the equivalent of a pair of parentheses or commas.

If I'm fussed, I fix it afterwards: detect every non-aside em dash and replace it with ¬, then replace each remaining em dash with space + en dash + space, then change ¬ back to an em dash. Likely there is a smarter way.
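
The three-step swap described above could be sketched in Python roughly as follows. The post does not say how a "non-aside" em dash is detected, so this sketch assumes, purely for illustration, that an em dash directly before a closing curly quote is interrupted speech and should stay an em dash. The function name and the `keep_pattern` parameter are hypothetical.

```python
import re

EM_DASH = "\u2014"
EN_DASH = "\u2013"
PLACEHOLDER = "\u00ac"  # the "¬" character, used as a temporary marker

# Assumption (not from the post): an em dash right before a closing curly
# quote is interrupted speech rather than an aside, so it is kept.
DEFAULT_KEEP = EM_DASH + "(?=[\u201d\u2019])"


def em_asides_to_spaced_en(text, keep_pattern=DEFAULT_KEEP):
    """Turn aside em dashes into spaced en dashes, keeping the others."""
    if PLACEHOLDER in text:
        raise ValueError("placeholder character already occurs in the text")
    # Step 1: park the em dashes that must survive.
    text = re.sub(keep_pattern, PLACEHOLDER, text)
    # Step 2: every remaining em dash becomes space + en dash + space.
    text = text.replace(EM_DASH, " " + EN_DASH + " ")
    # Step 3: restore the parked em dashes.
    return text.replace(PLACEHOLDER, EM_DASH)
```

A different `keep_pattern` can be passed in if the book marks non-aside dashes some other way; the default is only a guess at one common case.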

Smarten Punctuation fixes straight single and double quotes and makes em dashes with no spaces.
Old 10-13-2023, 07:52 PM   #4
capink
Wizard
Posts: 1,139
Karma: 1954142
Join Date: Aug 2015
Device: Kindle
One way to get around this is to:

First, in case you want to keep the double hyphens as they are:
  • Use search/replace to replace the double hyphens with some unique text, e.g. _double hyphen_
  • Run smarten punctuation.
  • Replace the unique text with double hyphens again.

Or, if you want to convert them to anything else:
  • Use search/replace to replace the double hyphens with whatever you want.
  • Run smarten punctuation.
You can automate the process using the Editor Chains plugin, which allows you to invoke the whole sequence from a menu entry or a keyboard shortcut.
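
For anyone scripting this outside the editor, the protect / smarten / restore sequence could look roughly like the Python sketch below. The `smarten` argument is a stand-in for whatever smartening step is actually used; it is passed in as a plain function because no calibre API is assumed here, and both function names are hypothetical.

```python
PLACEHOLDER = "_double hyphen_"  # the unique text suggested above


def smarten_keeping_double_hyphens(text, smarten):
    """Hide double hyphens, run the smartening step, then put them back."""
    if PLACEHOLDER in text:
        raise ValueError("placeholder already occurs in the text")
    protected = text.replace("--", PLACEHOLDER)
    smartened = smarten(protected)
    return smartened.replace(PLACEHOLDER, "--")


def convert_then_smarten(text, replacement, smarten):
    """Replace double hyphens with a chosen string first, then smarten."""
    return smarten(text.replace("--", replacement))


# Toy stand-in for the real smartening step (assumption, for the demo only):
# it turns "--" into an em dash and "..." into an ellipsis.
def toy_smarten(text):
    return text.replace("--", "\u2014").replace("...", "\u2026")


print(smarten_keeping_double_hyphens("wait -- what...", toy_smarten))
print(convert_then_smarten("wait -- what...", "\u2013", toy_smarten))
```

The placeholder has to be text that neither occurs in the book nor gets altered by the smartening step, which is why the sketch refuses to run if it is already present.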

Last edited by capink; 10-13-2023 at 07:58 PM.
Old 10-14-2023, 12:43 PM   #5
Noirtier
Member
 
Posts: 17
Karma: 10
Join Date: Jan 2021
Device: Kobo Forma
Thank you all for the responses. Yes, I am using the Find/Replace functionality as a workaround, which is not a big deal if the smarten punctuation feature is difficult to change.
Old 10-15-2023, 10:13 AM   #6
rjwse@aol.com
Addict
Posts: 304
Karma: 2228060
Join Date: Dec 2013
Location: LaVernia, Texas
Device: kindle epub readers on android
I use en dash and em dash quite a bit. To make them I employ both the calibre snippets and also .XCompose. I found out to my horror from a prior post reply that Amazon will not recognize the em dash, so I fudge it by 'running together' 3 regular dashes with a span whose spacing has been reduced. I use the triple sometimes to surround page numbers at bottom of PDF pages. You can set up an .XCompose file in Home and use it elsewhere (not just calibre tag editor). While in calibre I prefer using snippets for en and em dashes. I break USA convention (if, indeed, those rules still apply) by separating like this: text space en dash space text. That is, I do not allow the en dash to touch text (as is the convention). I find that it is even worse looking when done as follows: text comma en dash space text. This was done frequently a century ago by many authors. Or, maybe the type setter did this. I use the triple dash like this for mid-chapter centered scene breaks: em dash space scene space em dash. Best regards, Pop
Old 10-15-2023, 11:33 AM   #7
Quoth
the rook, bossing Never.
Posts: 12,379
Karma: 92073397
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
Sure em & en both work on Amazon upload to KDP. Why do you think em dash doesn't (which is odd as Amazon is US)?

On Linux I disable the Caps Lock key: pressing both Shift keys acts as Caps Lock, either Shift cancels it, and Compose is mapped to the Caps Lock key. As standard, Compose --- gives an em dash and Compose --. gives an en dash. I do have a custom .XCompose too, but it is only needed for Greek, prime and double prime like 6′ 2″ (using Compose 0 ' and Compose 0 ", where 0 is zero) and a few other unusual things, as AltGr and Compose cover Spanish, French, German, Polish, Icelandic etc. I've no need to type Hebrew, Arabic, Thai, Hindi, "modern" Korean or Cyrillic, but these can be done by swapping layouts or in .XCompose. The advantage of .XCompose is using a transliteration rather than the official language layout.

Years ago I made my own layouts using MS Keyboard Editor thing on XP to match the AltGr of Linux. I've no idea why the regular MS UK or US layouts are so limited.

Last edited by Quoth; 10-15-2023 at 11:44 AM.
Old 10-15-2023, 11:48 AM   #8
JSWolf
Resident Curmudgeon
Posts: 76,495
Karma: 136564766
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Can the smarten punctuation be changed to replace en-dashes with em-dashes and remove any spaces?
Old 10-15-2023, 12:28 PM   #9
rjwse@aol.com
Addict
Posts: 304
Karma: 2228060
Join Date: Dec 2013
Location: LaVernia, Texas
Device: kindle epub readers on android
Quote:
Sure em & en both work on Amazon upload to KDP. Why do you think em dash doesn't (which is odd as Amazon is US)?
It's the extra long dash that won't show correctly in kindle previewer. I quit (more or less) using it from advice on mobileread. I now fudge the way-long dash with calibre's snippetology which is fun to do. It would be great if there were a way to insert various CSS to create epub outcomes viewable with calibre's reader that emulates what each and every name brand dedicated ebook reader will show without having to actually purchase that particular piece of hardware. I have no desire to buy any nook, fire, ipad, sony, kobo etc. in addition to my laptop which seems to show epub rules exactly under calibre. As of now, putting cutting edge CSS stuff on amazon runs the risk of it looking corrupt on their own reader. They only use a cut-down subset of CSS. As for little-used unicode stuff (which I like to goof with sporadically) I am perplexed it won't work in the amazon kindle previewer. After all, they have done yeoman's work getting Japanese, Hindi, every which kind of alphabet to work. Seems as though they could at least put out a linux version. I thank you all for your input and correspondence. Best regards, Pop
Old 10-15-2023, 02:22 PM   #10
DNSB
Bibliophagist
Posts: 40,604
Karma: 157444382
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Hmmm... That could be a horizontal bar ( ― ), a 2em dash ( ⸺ ) or a 3em dash ( ⸻ ) all of which I've seen used. All too often, what gets displayed is a notdef character. The actual character will depend on the font, some common examples in the attached graphic. Others may be used depending on what is set for glyph 0 in the font.
[Attached image: notdef.gif]
Old 10-16-2023, 05:32 AM   #11
Quoth
the rook, bossing Never.
Posts: 12,379
Karma: 92073397
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
Quote:
Originally Posted by rjwse@aol.com View Post
It's the extra long dash that won't show correctly in kindle previewer. I quit (more or less) using it from advice on mobileread.
An actual em dash, en dash and hyphen cover all uses for ordinary readers of ebooks. All other dashes should only be in PDFs for paper and are very specialist (maybe maths or textbooks). The regular hyphen is only for where a hyphen is actually used, not to indicate word breaking in flowing text, except on web pages. The minus is only for paper print.
Old 10-16-2023, 05:37 AM   #12
Quoth
the rook, bossing Never.
Posts: 12,379
Karma: 92073397
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
Quote:
Originally Posted by rjwse@aol.com View Post
As of now, putting cutting edge CSS stuff on amazon runs the risk of it looking corrupt on their own reader. They only use a cut-down subset of CSS. As for little-used unicode stuff (which I like to goof with sporadically) I am perplexed it won't work in the amazon kindle previewer.
An ebook isn't a web page. ALL ebook formats support only a subset of current web page HTML and CSS.

Maybe the odd app based on WebKit might render everything a browser can on a web page (and even on web pages there are idiots that use Chrome-only features). No physical ereader will do bleeding-edge CSS, or all of HTML 5, or even all of epub3.
Old 10-18-2023, 03:03 PM   #13
Tex2002ans
Wizard
Posts: 2,304
Karma: 12587727
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by Noirtier View Post
I was just testing out the "Polish books" feature for the first time, and noticed that it replaced double hyphens with em-dash, even if there are spaces to either side of them. [...]

I.e.:

"...text -- text..." → "...text – text..."
"...text--text..." → "...text—text..."

Maybe there are other conventions/preferences over this, but it would be very much appreciated if there was at least an option to use en-dashes instead of em-dashes.
DiapDealer's fantastic plugins can already do this:

Once you install it, you can get multiple options in a dropdown:
  • (em|en)-dash settings
    • Do not educate dashes
    • -- = emdash (no endash support)
    • -- = emdash | --- = endash
    • --- = emdash | -- = endash

The 4th one is exactly what you want to have enabled.
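
To make the difference between the four settings concrete, here is a minimal Python sketch of the mappings as listed above. It is not the plugin's code; the function name and the 1 to 4 mode numbering are made up for illustration.

```python
EM_DASH = "\u2014"
EN_DASH = "\u2013"


def educate_dashes(text, mode):
    """Map runs of hyphens to dashes according to one of the four settings."""
    if mode == 1:  # Do not educate dashes
        return text
    if mode == 2:  # -- = emdash (no endash support)
        return text.replace("--", EM_DASH)
    if mode == 3:  # -- = emdash | --- = endash
        # Handle the longer run first so "---" is not split into "--" + "-".
        return text.replace("---", EN_DASH).replace("--", EM_DASH)
    if mode == 4:  # --- = emdash | -- = endash
        return text.replace("---", EM_DASH).replace("--", EN_DASH)
    raise ValueError("mode must be 1, 2, 3 or 4")


print(educate_dashes("text -- text --- text", 4))
```

Note that every mode looks only at the number of hyphens; spaces around them are left exactly as they were.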

Quote:
Originally Posted by rjwse@aol.com View Post
I use en dash and em dash quite a bit. [...] I found out to my horror from a prior post reply that Amazon will not recognize the em dash, so I fudge it by 'running together' 3 regular dashes with a span whose spacing has been reduced. I use the triple sometimes to surround page numbers at bottom of PDF pages.

[...]

I use the triple dash like this for mid-chapter centered scene breaks: em dash space scene space em dash.
No. The two rare dash characters:
  • U+2E3A = ⸺ = TWO-EM DASH
  • U+2E3B = ⸻ = THREE-EM DASH

should not be used like that.

Instead, it's better to create those lines via CSS. That was one of the first questions I asked MobileRead wayyyy back in:

- - -

Side Note: If you want to know the proper use-cases for two + three em dashes, see my post in:

Long story short:
  • 2-em dash is for "missing text".
  • 3-em dash is for "same exact author" in Bibliographies.

I also strongly recommend against using those 2 dash characters, because of all the missing font + problematic rendering issues. Instead, use the equivalent amount of normal EM DASHes.
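
As a small illustration of that substitution, a Python sketch (the function name is hypothetical) that swaps the two rare characters for the equivalent number of ordinary em dashes:

```python
EM_DASH = "\u2014"
TWO_EM_DASH = "\u2e3a"
THREE_EM_DASH = "\u2e3b"


def expand_long_dashes(text):
    """Replace U+2E3A / U+2E3B with two / three normal em dashes."""
    text = text.replace(TWO_EM_DASH, EM_DASH * 2)
    return text.replace(THREE_EM_DASH, EM_DASH * 3)


print(expand_long_dashes("Mr. S\u2e3a left the room."))
```

Whether adjacent em dashes render as one unbroken line depends on the font, so the result is still worth checking on a device or previewer.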

- - -

Quote:
Originally Posted by rjwse@aol.com View Post
It's the extra long dash that won't show correctly in kindle previewer. I quit (more or less) using it from advice on mobileread.
Yep, exactly. Barely any fonts have those 2 rarer dash characters, and it will cause more trouble than it solves.

Quote:
Originally Posted by rjwse@aol.com View Post
I break USA convention (if, indeed, those rules still apply) by separating like this: text space en dash space text. That is, I do not allow the en dash to touch text (as is the convention). I find that it is even worse looking when done as follows: text comma en dash space text. This was done frequently a century ago by many authors. Or, maybe the type setter did this. I use the triple dash like this for mid-chapter centered scene breaks: em dash space scene space em dash.
You may also be interested in all the dash discussion back in:

I cover all the use-cases + spaced vs. non-spaced versions (American vs. British, Chicago vs. Other Style Guides), etc., etc.

Last edited by Tex2002ans; 10-18-2023 at 03:06 PM.
Old 10-27-2023, 07:23 AM   #14
Noirtier
Member
 
Posts: 17
Karma: 10
Join Date: Jan 2021
Device: Kobo Forma
Quote:
Originally Posted by Tex2002ans View Post
  • (em|en)-dash settings
    • Do not educate dashes
    • -- = emdash (no endash support)
    • -- = emdash | --- = endash
    • --- = emdash | -- = endash

The 4th one is exactly what you want to have enabled.
Thank you very much for the suggestion! I have installed it now. All of those four options appear to be based solely on the number of hyphens to be found, without taking into account the presence of spaces or not. In my experience, the number of hyphens people employ in place of dashes is always two, with the only variability being whether they include spaces on either side or not.
But you are right that the 4th option certainly comes closest, and would at least work in those cases where there are already spaces in place.
Old 10-27-2023, 08:16 AM   #15
Tex2002ans
Wizard
Posts: 2,304
Karma: 12587727
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by Noirtier View Post
Thank you very much for the suggestion! I have installed it now. All of those four options appear to be based solely on the number of hyphens to be found, without taking into account the presence of spaces or not. In my experience, the number of hyphens people employ in place of dashes is always two, with the only variability being whether they include spaces on either side or not.
Then you will want to create 2 Regular Expressions:

Regex #1: SPACE + 2 HYPHENS + SPACE -> EN DASH
  • Find: ( )--( )
  • Replace: \1–\2

Regex #2: LETTER + 2 HYPHENS + LETTER -> EM DASH
  • Find: (\w)--(\w)
  • Replace: \1—\2

That would take your examples:

Code:
...text -- text...
...text--text...
and convert them into:

Code:
...text – text...
...text—text...
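
The same two regexes can be tried outside the editor with Python's re module. This is only a sketch for testing the patterns; the function name is made up, and the dashes are written as escapes so they are easy to tell apart.

```python
import re

EN_DASH = "\u2013"
EM_DASH = "\u2014"


def convert_double_hyphens(text):
    """Apply Regex #1 and Regex #2 from above, in that order."""
    # Regex #1: SPACE + 2 HYPHENS + SPACE -> EN DASH (spaces kept)
    text = re.sub(r"( )--( )", r"\1" + EN_DASH + r"\2", text)
    # Regex #2: LETTER + 2 HYPHENS + LETTER -> EM DASH
    text = re.sub(r"(\w)--(\w)", r"\1" + EM_DASH + r"\2", text)
    return text


for sample in ("...text -- text...", "...text--text..."):
    print(convert_double_hyphens(sample))
```

One thing to watch for: because each match consumes the letter on both sides, a chain such as a--b--c is only partly converted in a single pass, so running the replace a second time may be needed.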
- - -

Side Note: In Sigil, I make heavy use of the "Saved Searches" feature in:
  • Tools > Saved Searches

You can even use "Groups" to run batches of search/replaces in 1 button press. See:

This allows you to save your common search/replaces, and easily run them in the future on any books.

You can do similar in Calibre's Editor, except it's in a slightly different location:
  • Search > Saved Searches
Tags
dashes, edit, polish, polishing, smarten punctuation

