Have you ever written or received a Google Docs file that is full of extra spaces, double punctuation, and similar issues? Microsoft Word makes extra spaces pretty easy to spot with the visible formatting option that shows dots where spaces are. That feature is missing from Google Docs. And Word seems to be much more consistent (though certainly not perfect) at highlighting double punctuation than Google Docs is. In this post, I’ll show you how I handle these issues in Google Docs.
While doing some extensive work in Google Docs for a client a few years ago, I developed a short list of search strings to clean up easily missed issues. I’ll walk through those next. But first, some cautions:
- If you’re an editor using suggest mode in Google Docs, consider turning that off while you go through this cleanup, because you’re likely to make a lot of very minor changes that your client probably doesn’t care about.
- You may also want to go into the version history and name the current version of the document so you can easily restore from it in the unlikely event something goes horribly wrong here.
- You may be tempted to use the Replace All option for some of these commands. I suggest taking the time to visually check each instance, just to be safe. Yes, even when trimming extra spaces, as some authors use those for formatting, and you don’t want to mess that up.
Find and Replace Dialog Box
For these searches, we’re going to use the Find and Replace dialog box. You can open it with Ctrl-H (Cmd-Shift-H on a Mac) or by choosing it from the Edit menu.
Or, you can get there from the Find dialog (Ctrl-F/Cmd-F) by clicking the three-dot button.
Either way, you’ll wind up on the Find and Replace dialog.
Double Punctuation Searches
We’ll look for repeated punctuation first, and for this example, I’ll start with exclamation points just because they’re the easiest to see in a screenshot.
In the Find box, type two exclamation points. In the Replace With box, type just one exclamation point. There’s a counter next to the Find box that shows how many double exclamation points are in the document. If there are any, the first will be highlighted, but you may need to scroll the document up or down if it’s hidden behind the dialog box. You can click Replace to correct it or Next to move on to the next one if it looks like this one is fine. Keep going until the counter in the dialog box reaches zero or the only ones that come up are those you decided to keep.
Keep two things in mind as Google Docs presents these occurrences to you:
- Google Docs sometimes skips occurrences.
- If there are cases where there are three or more exclamation points together, Google Docs may only find the first two the first time through the document.
So, just because you’ve hit the bottom of the document doesn’t mean you’re done with your search. Pay attention to the counter next to the Find box. It will tell you when there are none left. And pay attention to when the search jumps back to the top of the document. If you go from top to bottom in the document and it only turns up cases you already decided were OK, then you’re done.
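To see why a run of three or more marks needs more than one pass, here is a minimal Python sketch. It works on plain text (for example, a document downloaded as .txt), not inside Google Docs, and it shows how Python's own string replacement behaves rather than how Google Docs matches internally. The function name `collapse_repeats` and the sample text are made up for illustration.

```python
def collapse_repeats(text: str, mark: str = "!") -> tuple[str, int]:
    """Replace doubled marks with a single mark until no doubles remain.

    Returns the cleaned text and the number of passes it took.
    """
    passes = 0
    while mark * 2 in text:
        text = text.replace(mark * 2, mark)
        passes += 1
    return text, passes


sample = "Wow!!! That works!!"

# A single pass turns "!!!" into "!!", so a double is still left behind.
one_pass = sample.replace("!!", "!")
print(one_pass)

# Repeating the replacement until nothing matches finishes the job.
cleaned, passes = collapse_repeats(sample)
print(cleaned, passes)
```

Unlike the manual approach, this loop replaces everything without review, so treat it as a way to understand the behavior rather than as a substitute for checking each instance.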
In addition to double exclamation points, I look for doubles of these characters:
- spaces (this is what I find the most of)
- periods
- commas
- semicolons
- colons
- question marks
- double quote marks
- single quote marks
- left and right parentheses
- left and right brackets
- left and right braces
The idea is the same for each. For example, enter two spaces in the Find box and one space in the Replace With box, then cycle through the occurrences with the Replace and Next buttons.
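If you want a quick overview before stepping through the dialog, the list above can be sketched as a small Python check on a plain-text copy of the document. This is only an illustration under a few assumptions: the names `DOUBLED` and `count_doubles` are hypothetical, the text uses straight quotes, and a legitimate three-dot ellipsis will be counted as one doubled period.

```python
# Characters to check for doubles, following the list above.
DOUBLED = [" ", ".", ",", ";", ":", "?", "!", '"', "'",
           "(", ")", "[", "]", "{", "}"]


def count_doubles(text: str) -> dict[str, int]:
    """Count non-overlapping doubled occurrences of each character.

    Characters with no doubles are left out of the result.
    """
    counts = {}
    for ch in DOUBLED:
        n = text.count(ch * 2)
        if n:
            counts[ch] = n
    return counts


sample = 'He said,, "wait..  what??"'
for ch, n in count_doubles(sample).items():
    print(repr(ch * 2), n)
```

The counts only tell you where to look. The actual fixes are still best made one at a time in the Find and Replace dialog.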
Space before Punctuation
Next, I look for extra spaces before punctuation. Enter a space followed by a period in the Find box, and enter a period in the Replace With box. Again, the counter by the Find box will tell you if there are any occurrences in the document. You can cycle through them with the Next and Replace buttons. But we have to be careful here: if the author uses any decimal numbers without leading zeroes (like .75), they’ll get picked up by this search and we’ll need to leave them alone (or add a leading zero, depending on the style guide).
Other cases like this I look for are:
- space question mark
- space comma
- space semicolon
- space colon
- space right parenthesis
- space right bracket
- space right brace
- space double quote (this will get a lot of false positives at the beginning of quotes, but it will catch issues at the ends of quotes)
- space single quote (likewise, you’ll get false positives here)
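The same searches can be approximated in one pass over a plain-text copy with Python's `re` module. The sketch below is an assumption-laden illustration, not part of the Google Docs workflow: the names `SPACE_BEFORE`, `LEADING_DECIMAL`, and `find_space_before_punct` are hypothetical, and quotation marks are left out on purpose because of the false positives mentioned above. It also flags the leading-decimal case so those hits can be reviewed separately.

```python
import re

# A space followed by closing punctuation: . ? , ; : ) ] }
SPACE_BEFORE = re.compile(r" [.?,;:)\]}]")
# A space, a period, then a digit, as in " .75"
LEADING_DECIMAL = re.compile(r" \.\d")


def find_space_before_punct(text: str) -> list[tuple[int, str, bool]]:
    """Return (position, matched text, looks_like_decimal) for each hit."""
    hits = []
    for m in SPACE_BEFORE.finditer(text):
        looks_like_decimal = bool(LEADING_DECIMAL.match(text, m.start()))
        hits.append((m.start(), m.group(), looks_like_decimal))
    return hits


sample = "Wait , the ratio is .75 ) ."
for pos, found, looks_like_decimal in find_space_before_punct(sample):
    note = "possible decimal, check the style guide" if looks_like_decimal else "extra space"
    print(pos, repr(found), note)
```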
Punctuation around Quotation Marks
If you’re working with American English, most punctuation belongs before a closing quotation mark. The exceptions are semicolons and colons. These searches will find places where the punctuation and the quotation mark are reversed. This example looks for periods outside the quotation mark.
- Find:
".
- Replace With:
."
And similarly with commas, exclamation marks, and question marks. Then, repeat these for the single quote mark.
For semicolons and colons, we have to be a bit more careful.
- Find:
:"
If we turn up something at the end of a quote, we need to move the colon outside the quote. But if the issue is at the start of a quote, we probably need to put a space between the colon and the quotation mark. So for colons and semicolons next to quotation marks, I usually just look for the issues with the Next button and fix anything that turns up manually.
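For the period, comma, exclamation mark, and question mark cases, a rough Python sketch can list candidates in a plain-text copy along with the swapped form. The names `REVERSED` and `suggest_swaps` are hypothetical, and the pattern includes both straight and curly closing quotes as an assumption about what the text contains. It only suggests swaps, because a single quote can also be an apostrophe (as in a plural possessive at the end of a sentence), and colons and semicolons are left to manual review as described above.

```python
import re

# A straight or curly quote mark followed by . , ! or ?
REVERSED = re.compile("([\"'”’])([.,!?])")


def suggest_swaps(text: str) -> list[tuple[str, str]]:
    """Return (found, suggested) pairs for quote/punctuation pairs that look reversed."""
    return [(m.group(), m.group(2) + m.group(1)) for m in REVERSED.finditer(text)]


sample = 'She called it "done". He said "fine", then left.'
for found, suggested in suggest_swaps(sample):
    print(found, "->", suggested)
```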
Spacing and Capitalization after Periods
Now it gets more complicated. I want to look for cases where a sentence starts with a lowercase letter. In the Find and Replace dialog box, click the Match Case and Use Regular Expressions options. Then, in the Find box, enter this: \. [a-z]
When regular expressions are turned on, the period is a wildcard that matches any character. So to look for a literal period, we escape the wildcard with the backslash. The brackets define a set of characters, here the range a through z: any one character in that set will match. So this expression finds a period followed by a space followed by any lowercase letter. And indeed, it found one instance where a sentence starts with a lowercase i. I’d manually fix this by capitalizing the word.
That search will give false positives if there are any URLs following a period.
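Here is a small sketch of that expression using Python's `re` module on plain text. Python's regular expressions are case-sensitive by default, which plays the role of the Match Case option, though the two regex engines are not identical in general. The `\w*` at the end is an addition for this example so the whole word is captured, and the names `LOWERCASE_START` and `lowercase_after_period` are hypothetical.

```python
import re

# The core of the pattern is "\. [a-z]"; the group and \w* just capture the whole word.
LOWERCASE_START = re.compile(r"\. ([a-z]\w*)")


def lowercase_after_period(text: str) -> list[str]:
    """Return each lowercase word that directly follows a period and a space."""
    return [m.group(1) for m in LOWERCASE_START.finditer(text)]


sample = "It rained. it poured. See the site. example.com has more. Then it stopped."
print(lowercase_after_period(sample))
```

The sample turns up both a real error ("it") and a URL that follows a period ("example"), which is the kind of false positive to leave alone.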
Next, we’ll look for sentences that don’t have a space between them.
Find: \.[a-zA-Z]
This finds any period immediately followed by a lowercase or uppercase letter. We have to be careful with this one, because if the document has URLs in it, they’ll legitimately match this search, and we don’t want to change those.
You can run the same search, substituting a comma, exclamation point, colon, or semicolon for the period, to find other versions of missing spaces at the end of a sentence or clause.
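The period search and its variants can be combined into one character set. This Python sketch, again meant for a plain-text copy and using the hypothetical names `MISSING_SPACE` and `missing_spaces`, shows what the combined pattern picks up, including a URL that should be left alone.

```python
import re

# A period, comma, exclamation point, colon, or semicolon directly followed by a letter.
MISSING_SPACE = re.compile(r"[.,!:;][a-zA-Z]")


def missing_spaces(text: str) -> list[str]:
    """Return each punctuation-plus-letter pair with no space in between."""
    return [m.group() for m in MISSING_SPACE.finditer(text)]


sample = "First sentence.Second one,then a link: https://example.com/page."
print(missing_spaces(sample))
```

The ".c" hit comes from the domain name in the URL, so it is a false positive. The other two hits are real missing spaces.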
En Dashes That Should Be Hyphens or Em Dashes
The last one I’ll show involves en dashes. The right thing to do here varies based on the style guide being used. For example, in Chicago style, en dashes are usually used only in number ranges and some special cases of hyphenation. Most hyphenation uses a hyphen. And when used as a “strong comma,” we want an em dash instead of an en dash. In AP style, the en dash is never used, even for number ranges.
This query looks for cases where an en dash is surrounded by letters.
Find: [a-zA-Z]–[a-zA-Z]
That character between the two sets of letters is an en dash. Based on what turns up, I either change the en dash to a hyphen or an em dash, or, rarely, I’ll leave it alone.
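Because the en dash is hard to tell apart from a hyphen on screen, it can help to spell it out by its Unicode code point (U+2013). The sketch below does that in Python on plain text. The names `EN_DASH`, `EN_DASH_BETWEEN_LETTERS`, and `en_dashes_between_letters` are hypothetical, and the decision about what to change each hit to still depends on the style guide.

```python
import re

EN_DASH = "\u2013"  # the en dash character
EN_DASH_BETWEEN_LETTERS = re.compile(f"[a-zA-Z]{EN_DASH}[a-zA-Z]")


def en_dashes_between_letters(text: str) -> list[str]:
    """Return each letter, en dash, letter sequence found in the text."""
    return [m.group() for m in EN_DASH_BETWEEN_LETTERS.finditer(text)]


sample = "A well\u2013known rule\u2013it matters\u2013covers pages 10\u201320."
print(en_dashes_between_letters(sample))
```

The number range at the end is not reported because the pattern only looks at en dashes with a letter on each side.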
Wrapping Up
I run through these cleanup steps on all Google Docs manuscripts I work on. It takes a little bit of time, but it’s worth it. These issues can be easily missed when reading through a document, and it’s better not to risk them slipping through. You can run these checks before or after your main editing pass, but I have found that it’s better to run at least the double-space search beforehand, as markup can hide double spaces by separating them with deleted text. It’s also a good idea to run the double spaces and double punctuation searches again after all changes have been accepted or rejected, because Google Docs makes it very easy to insert double periods or spaces in suggest mode.
The searches here are simple, and we can get a lot more precise looking for issues with the regular expression searches. I’ll go into more detail on regular expression searches in another post.
Do you have any cleanup tips for Google Docs manuscripts? I’d love to hear them.