You can reliably locate non-breaking spaces with view.state.doc.textContent.replace(/\u00a0/g, "!!").
The issue appears to be that Chrome and Safari mangle the content they put on the clipboard, replacing some of the copied spaces with non-breaking spaces. By the time ProseMirror sees the HTML, it can no longer reliably tell whether a given non-breaking space is intentional or was inserted by the browser.
It seems CKEditor has a hack that replaces every span containing only a single non-breaking space, and optionally carrying an Apple-converted-space class, with a regular space (Safari seems to label these spaces, Chrome unfortunately doesn’t). This patch copies that approach.
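
As a rough illustration of the approach, here is a minimal string-based sketch. It is not the code from this patch or from CKEditor: the helper name restoreReplacedSpaces and the NBSP_SPAN pattern are hypothetical, and it assumes the browser emits the span with either no attributes at all or exactly class="Apple-converted-space", containing either a literal U+00A0 or the &nbsp; entity.

```typescript
// Hypothetical sketch, not the actual patch: matches a span whose only
// content is one non-breaking space, with either no attributes (assumed
// Chrome output) or the Apple-converted-space class (Safari output).
const NBSP_SPAN =
  /<span(?: class="Apple-converted-space")?>(?:\u00a0|&nbsp;)<\/span>/g;

// Replace each browser-inserted span with a regular space. Non-breaking
// spaces that are not wrapped in such a span are left untouched, since
// they may be intentional.
function restoreReplacedSpaces(html: string): string {
  return html.replace(NBSP_SPAN, " ");
}
```

A pattern like this only catches the wrapped spaces, so a non-breaking space that sits directly in the text is still treated as intentional. Working on the parsed DOM rather than the raw string would be more robust against attribute variations.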