Disrupting marks breaks regexp in linting example


I’ve got a problem which can be reformulated/shown with the linting example: https://prosemirror.net/examples/lint/. There, a regexp looks for problematic words in each text node, and any word it finds is marked as a problem.

If we make part of a problematic word bold, it is no longer marked as problematic, because the word is now split across multiple text nodes.

Any pointers appreciated with this! One solution I thought of was to create a temporary document without the “disrupting” nodes/marks, and then do the regexp on the resulting nodes, but it doesn’t seem completely trivial to me how to do that either.

Yes, the demo code is intentionally simplistic to avoid being too big. A proper solution would iterate over the nodes in a textblock and build up a string of adjacent text nodes, so that it can search for the words in that string.
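
A rough sketch of that idea, in case it helps anyone landing here. It is not the code from the lint example: `collectTextRuns`, `findProblems` and `lintDoc` are made-up names, and the regexp is whatever pattern you want to flag. The only ProseMirror calls it relies on are `doc.descendants`, `node.forEach`, `isTextblock`, `isText` and `text`. It assumes that adjacent text nodes sit at consecutive document positions, so a run only needs to remember where its first character is.

```javascript
// Join adjacent text nodes of one textblock into runs.
// `blockStart` is the document position just inside the block
// (the block's own position + 1).
function collectTextRuns(block, blockStart) {
  const runs = [];
  let current = null;
  block.forEach((child, offset) => {
    if (child.isText) {
      if (!current) {
        // `from` is the document position of the run's first character.
        current = { text: "", from: blockStart + offset };
        runs.push(current);
      }
      current.text += child.text;
    } else {
      // A non-text inline node (image, hard break, ...) ends the run.
      current = null;
    }
  });
  return runs;
}

// Search every run. `pattern` must have the `g` flag, since
// String.prototype.matchAll throws on a non-global regexp.
function findProblems(block, blockStart, pattern) {
  const problems = [];
  for (const run of collectTextRuns(block, blockStart)) {
    for (const match of run.text.matchAll(pattern)) {
      const from = run.from + match.index;
      problems.push({ word: match[0], from, to: from + match[0].length });
    }
  }
  return problems;
}

// Walk the document and lint each textblock as a whole.
function lintDoc(doc, pattern) {
  const problems = [];
  doc.descendants((node, pos) => {
    if (!node.isTextblock) return true; // keep descending
    problems.push(...findProblems(node, pos + 1, pattern));
    return false; // the block's children were handled above
  });
  return problems;
}
```

Breaking the run at non-text nodes is a design choice that keeps the position arithmetic trivial; the cost is that a word interrupted by, say, an inline image is not matched.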

Thanks for the reply @marijn!

I feel like it’s very easy to run into a mismatch between the indices of the regexp and the indices of the document: bolding in the middle of a word, for example, pushes the index forward in the document, while the regexp index doesn’t increase.
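
One note on the index concern, with the caveat that this is my reading of the ProseMirror guide: marks do not take up document positions, so bolding alone should keep string offsets and document positions in step. The two drift apart once the searched string skips something that does occupy a position, such as an inline leaf node. A minimal sketch of keeping both index spaces side by side for that case follows; `buildIndex` and `offsetToPos` are hypothetical helpers, not ProseMirror API, and the `pieces` array is assumed to have been collected from the document beforehand.

```javascript
// Build one searchable string from pieces that may not be contiguous in
// the document. Each piece is { text, pos }, where `pos` is the document
// position of the piece's first character, in document order.
function buildIndex(pieces) {
  let text = "";
  const segments = [];
  for (const piece of pieces) {
    // Remember where this piece starts in the string and in the document.
    segments.push({ offset: text.length, pos: piece.pos, length: piece.text.length });
    text += piece.text;
  }
  return { text, segments };
}

// Translate an offset in the joined string to a document position.
// side "start": the offset is the first character of a match.
// side "end":   the offset is just past the last character of a match.
// The distinction matters at piece boundaries, where the end of one piece
// and the start of the next share a string offset but not a position.
function offsetToPos(segments, offset, side) {
  for (const seg of segments) {
    const local = offset - seg.offset;
    const inside = side === "end"
      ? local > 0 && local <= seg.length
      : local >= 0 && local < seg.length;
    if (inside) return seg.pos + local;
  }
  return null; // offset does not fall inside any piece
}
```

A match at string offsets `index` to `index + length` would then cover the document range from `offsetToPos(segments, index, "start")` to `offsetToPos(segments, index + length, "end")`, and that range includes whatever was skipped in between.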

I think it requires a bit of thought about the details to get it right for all cases. I’ll give it a shot! :slight_smile: