Why does Thunderbird add ‘\A0’ and other strange-looking strings in e-mails I send?

I use Linux and have used the Thunderbird e-mail client since 2008. I used to use DavMail to enable Thunderbird to access various company Microsoft Exchange WebMail accounts. Several years ago, however, DavMail stopped working with one particular Microsoft Exchange account, so I switched to the Thunderbird add-on ExQuilla, for which I pay an annual licence fee. I do not know whether more recent versions of DavMail would work with this account, but ExQuilla got me out of a hole so I stuck with it. Recently the corporation in question decided to stop using an in-house Microsoft Exchange server and switched to Microsoft 365.

Recently people receiving my e-mails sent using this particular account told me there were strange strings of characters in e-mails of mine that quote other e-mails. The most frequent occurrence is the three-character string ‘\A0’, although other strings are sometimes present too. The following e-mail extract illustrates the effect:

Hi Claudia,

I have had a look at your draft and agree with your assessment. Let’s sit down together and prepare a list of possible remedial measures.

Regards,
Fitzcarraldo

On 07/10/2020 13:02, Claudia wrote:
> Hi,
> \A0
> Could you please have a look at the draft I have attached.
> \A0
> There are several main issues requiring attention. The operation was basically run by one person \2013 (John) during the tests, which led to several issues.
> \A0
> He does not have the time to do everything by himself.\A0 The other staff who had assisted him during earlier tests were not present.

Notice various occurrences of ‘\A0’ and an occurrence of ‘\2013’.
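Both strings look like hexadecimal Unicode code points written with a leading backslash. The notation resembles the escapes used in CSS, although that is my own reading and not something I have seen confirmed. Under that assumption, a minimal Python sketch can turn them back into the characters they stand for. The names `ESCAPE` and `decode_escapes` are hypothetical, chosen only for illustration.

```python
import re
import unicodedata

# Assumption: each strange string is a backslash followed by 1-6 hex digits
# naming a Unicode code point (e.g. \A0 or \2013).
ESCAPE = re.compile(r"\\([0-9A-Fa-f]{1,6})")


def decode_escapes(text: str) -> str:
    """Replace each backslash-hex escape with the character it names."""

    def substitute(match: re.Match) -> str:
        code_point = int(match.group(1), 16)
        # Leave the text untouched if the number is not a valid code point.
        if code_point > 0x10FFFF:
            return match.group(0)
        return chr(code_point)

    return ESCAPE.sub(substitute, text)


if __name__ == "__main__":
    for escape in (r"\A0", r"\2013"):
        character = decode_escapes(escape)
        print(escape, "->", f"U+{ord(character):04X}", unicodedata.name(character))
```

Under this reading ‘\A0’ is U+00A0 (NO-BREAK SPACE) and ‘\2013’ is U+2013 (EN DASH). The sketch is deliberately naive: an escape immediately followed by text that happens to start with hexadecimal digits would be misread.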

I searched the Web to see if other Thunderbird users had come across this problem, and found several reports of similar, although not identical, problems. The most promising page I found was in the Mozilla support forums for Thunderbird: ‘Why do my sent messages magically add “�” at the end of my sentences?’. However, none of the fixes suggested in that thread worked in my case. My Thunderbird installation was configured to use ‘Unicode (UTF-8)’ text encoding for Outgoing Mail and ‘Western (ISO-8859-1)’ for Incoming Mail (‘Edit’ > ‘Preferences’ > ‘General’ > ‘Language & Appearance’ > ‘Advanced…’ > ‘Text Encoding’). I changed the text encoding for incoming mail to ‘Unicode (UTF-8)’ but that made no difference. I ticked ‘When possible, use the default text encoding in replies’ but that also made no difference. Anyway, I left the settings like that and hoped an update to Thunderbird would fix the issue.

I am not sure whether the problem started with an upgrade to Thunderbird or with the switch to Microsoft 365. I suspect Microsoft 365 is the culprit, because the problem does not occur when I use other e-mail accounts. Either way, it is annoying and I have still not found a fix for it. One of the replies in the above-mentioned Thunderbird support thread describes a problem that is not identical to what I’m seeing, but looks to be essentially the same:

Jorg K
2/4/18, 6:17 AM

There is NO bug in Thunderbird. Sadly some US ISP’s like AT&T and Bellsouth have started *corrupting* their customers’ e-mail.

If the customer sends in windows-1252 and includes for example special punctuation characters or a non-break space xA0, the ISP doesn’t correctly interpret the message as windows-1252 but as UTF-8. In UTF-8, xA0 is not valid and gets replaced by the so-called replacement character, � (0xEF 0xBF 0xBD).

Since the e-mail is still windows-1252 encoded, the recipient’s client displays �.

See:
https://bugzilla.mozilla.org/show_bug.cgi?id=1427636
https://bugzilla.mozilla.org/show_bug.cgi?id=1435536

Affected users should complain heavily to their mail providers. As a workaround, they need to send all messages as UTF-8.

This seems to be a possible explanation of what I am experiencing, but it is impractical for me to check what text encoding all my contacts are using, or to get them to switch to UTF-8 if they are not already using it in their e-mails. I noticed that there is actually a space in what look like blank lines in the e-mails I quote and, if I delete that space, the ‘\A0’ no longer appears on those lines when I view the contents of e-mails in the Sent Items mailbox. I think that the space is, in fact, a non-breaking space (U+00A0). The character itself can be represented in UTF-8, as the two bytes 0xC2 0xA0; it is the single byte 0xA0 used for it in windows-1252 and ISO-8859-1 that is not valid UTF-8 on its own. Somewhere along the way it ends up as the literal text ‘\A0’ in the e-mails I send from Thunderbird (I’m currently using Version 78.4.2).
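The byte-level picture behind Jorg K's explanation can be checked in a few lines of Python. This is only an illustration of how the encodings behave, not a reconstruction of what Microsoft 365, ExQuilla or Thunderbird actually do to the message; the variable names are my own.

```python
# The non-breaking space as a Unicode character.
nbsp = "\u00a0"

# In windows-1252 (Python codec name "cp1252") it is the single byte 0xA0.
as_cp1252 = nbsp.encode("cp1252")

# In UTF-8 the same character needs two bytes, 0xC2 0xA0.
as_utf8 = nbsp.encode("utf-8")

# If the single windows-1252 byte is wrongly treated as UTF-8, it cannot be
# decoded and is swapped for the replacement character U+FFFD.
misread = as_cp1252.decode("utf-8", errors="replace")

print(as_cp1252)                 # b'\xa0'
print(as_utf8)                   # b'\xc2\xa0'
print(misread == "\ufffd")       # True
print(misread.encode("utf-8"))   # b'\xef\xbf\xbd'
```

The last line matches the 0xEF 0xBF 0xBD sequence mentioned in the quoted reply.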

Trying to find and delete all the non-breaking spaces and other problem characters in a quoted e-mail by hand is impractical. However, I found a somewhat cumbersome work-around for the non-breaking spaces (and, I think, for the other non-ASCII characters too). When I click on the ‘Reply’ button in Thunderbird and a window pops up for me to compose my reply, which includes the quoted e-mail or e-mails, I use Ctrl-C to copy all the contents of the window, then Ctrl-V to paste them back into the window. This seems to get rid of the character strings such as ‘\A0’. It does add some extra blank lines to the quoted e-mail(s) in the window in which I am composing my e-mail, but those extra blank lines are normally not present when viewing the e-mail after it has been sent.

This work-around is not ideal, as it relies on me remembering to do it whenever I compose an e-mail that quotes a previous e-mail or e-mails. But at least it gets rid of the multiple occurrences of ‘\A0’ (non-breaking space). It’s a pity there is no mechanism in Thunderbird to convert or filter out problem characters such as a non-breaking space when quoting other e-mails. Even if Jorg K in the above-mentioned thread is correct and the cause of the problem does not lie in Thunderbird, I would rather Thunderbird acted differently when the user has configured it to send e-mails using UTF-8 text encoding, and cleaned up such characters rather than including strings of gobbledygook in the e-mail.
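To make the idea concrete, here is a minimal sketch of the kind of filter I have in mind, written in Python. It is not a Thunderbird feature or add-on, and the names `REPLACEMENTS` and `plain_punctuation` are hypothetical. It simply swaps a handful of typographic characters for plain ASCII equivalents in a piece of text; the choice of characters is my own assumption about what typically appears in quoted e-mails.

```python
# Typographic characters mapped to plain ASCII stand-ins (illustrative list).
REPLACEMENTS = {
    "\u00a0": " ",   # no-break space -> ordinary space
    "\u2013": "-",   # en dash -> hyphen
    "\u2018": "'",   # left single quotation mark -> apostrophe
    "\u2019": "'",   # right single quotation mark -> apostrophe
    "\u201c": '"',   # left double quotation mark -> straight quote
    "\u201d": '"',   # right double quotation mark -> straight quote
}

# str.translate needs a table keyed by code point; maketrans builds it.
TABLE = str.maketrans(REPLACEMENTS)


def plain_punctuation(text: str) -> str:
    """Return text with the characters above replaced by ASCII stand-ins."""
    return text.translate(TABLE)


if __name__ == "__main__":
    quoted = "> by one person \u2013 (John) during the tests.\u00a0 The other staff"
    print(plain_punctuation(quoted))
```

Anything not in the table is left alone, so accented letters and other legitimate non-ASCII text would survive unchanged.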