Because a dash is a permitted line-break point in HTML, dialogue in Spanish ebooks can wrap badly and look terrible.
Example in one line:
Code:
—Bla, Bla, Bla, —John said—. More bla, bla, bla.
Wrong:
Code:
—Bla, Bla, Bla, —John said
—. More bla, bla, bla.
Wrong:
Code:
—Bla, Bla, Bla, —
John said—. More
bla, bla, bla.
Right:
Code:
—Bla, Bla, Bla, —John
said—. More bla, bla, bla.
The following two searches wrap each dialogue dash and its adjacent word in a <span> with a class of your choice (in my example, <span class="nw">).
Then add this CSS definition for the class:
Code:
.nw { white-space: nowrap; display: inline-block; text-indent: 0em;}
and you will have prevented the wrong wrapping in Spanish books.
Edit notes. Explanation of the workaround for RMSDK:
First S&R
Search:
Code:
\x20(—|–|&mdash;|&ndash;)([^ <]+)( |</p>|</div>)
Replace:
Code:
\x20<span class="nw">\1\2</span>\3
Second S&R
Search:
Code:
\x20([^ >]+)(—|–|&mdash;|&ndash;)(\.|\.\.\.|,|;|:|…|&hellip;)?\x20
Replace:
Code:
\x20<span class="nw">\1\2\3</span>\x20
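For anyone who wants to try the two searches outside the editor, here is a minimal sketch in Python, assuming the standard `re` module treats these patterns the same way the editor's regex engine does. The function name `add_nowrap_spans` and the sample paragraph are only illustrative. A literal space stands in for \x20 in the replacement strings, because Python does not accept that escape in a replacement template.

```python
import re

# Dash alternatives: literal em/en dashes plus their named entities.
DASH = r"(—|–|&mdash;|&ndash;)"

# First S&R: an opening dash glued to the word that follows it.
FIRST = re.compile(r"\x20" + DASH + r"([^ <]+)( |</p>|</div>)")

# Second S&R: a closing dash glued to the word before it,
# optionally followed by punctuation.
SECOND = re.compile(
    r"\x20([^ >]+)" + DASH + r"(\.|\.\.\.|,|;|:|…|&hellip;)?\x20"
)


def add_nowrap_spans(html: str) -> str:
    """Apply both searches, first then second (hypothetical helper)."""
    # Literal spaces replace \x20 here: Python rejects \x20 in templates.
    html = FIRST.sub(r' <span class="nw">\1\2</span>\3', html)
    html = SECOND.sub(r' <span class="nw">\1\2\3</span> ', html)
    return html


sample = "<p>—Bla, Bla, Bla, —John said—. More bla, bla, bla.</p>"
print(add_nowrap_spans(sample))
```

With the sample above, "—John" and "said—." each end up inside their own <span class="nw">, while the dash that opens the paragraph is left alone.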
Additional usage notes
- Yes, you need both S&Rs, run in that order.
- Do not forget to add the CSS style for the class, or the spans will have no effect.
- As you can see, they look for dashes and only dashes (as Unicode characters or as named entities). Some horribly formatted books use minus signs instead, and these searches won't catch those.
- The Case Sensitive and Dot All settings are probably irrelevant, but I have both set to OFF.
- Because of the [^ <]+ and [^ >]+ parts of the searches, they are safe to run on text that has inline tags next to the dashes. They won't catch and break code like:
Code:
—Bla, Bla, Bla, —<b>John</b> <i>said</i>—. More bla, bla, bla.
They will just ignore it, so you will never get something broken like:
Code:
—Bla, Bla, Bla, <span class="nw">—<b>John</span></b> <i><span class="nw">said</i>—.</span> More bla, bla, bla.
Since those cases are skipped, you'll have to fix them manually.
- Using them where dashes are used as sentence or word separators is also safe:
Code:
First sentence—Second sentence.
This situation, pretty common in English books, is also ignored.
- As hinted in another thread, I've used \x20 for the leading and trailing spaces in the regexes to make them clearly visible.
- Obviously there's no point in adding a <span> around the dash that opens the paragraph and its word, and these searches don't do that.
- A strange situation I remember running into once or twice: if the stylesheet has a rule that targets <span> tags directly, it will also be applied to the newly created spans. I remember suffering a
Code:
span {font-size: 1.3em;}
which I had to override with
Code:
.nw {font-size: 1em; white-space: nowrap;}
without losing the original rule where it was meant to apply.
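The "they will just ignore it" notes above can be checked with a short Python sketch, again assuming the standard `re` module is a fair stand-in for the editor's regex engine. The helper `is_untouched` and the sample strings (including the minus-sign one) are only illustrative.

```python
import re

# Same two searches as in the post, with the dash entities spelled out.
DASH = r"(—|–|&mdash;|&ndash;)"
FIRST = re.compile(r"\x20" + DASH + r"([^ <]+)( |</p>|</div>)")
SECOND = re.compile(
    r"\x20([^ >]+)" + DASH + r"(\.|\.\.\.|,|;|:|…|&hellip;)?\x20"
)


def is_untouched(html: str) -> bool:
    """Return True when neither search matches anywhere in the fragment."""
    return FIRST.search(html) is None and SECOND.search(html) is None


# Hypothetical samples, one per usage note.
samples = {
    "inline tags": "—Bla, Bla, Bla, —<b>John</b> <i>said</i>—. More bla, bla, bla.",
    "separator dash": "First sentence—Second sentence.",
    "minus signs": "-Bla, Bla, Bla, -John said-. More bla, bla, bla.",
    "opening dash only": "—Bla, Bla, Bla.",
}

for label, text in samples.items():
    print(label, "->", "ignored" if is_untouched(text) else "matched")
```

All four samples should come out as ignored. That is the safe behaviour, but it also means the inline-tag and minus-sign cases still need fixing by hand.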