Option for "corrupted node" placeholder instead of total doc failure?

We’ve been bitten several times by cases where a single node in a bad state brings down the whole document. For example, a text node was somehow inserted in an illegal empty state because of an nbsp, which caused every render of the doc to fail. In my preferred world, there would be a way to still render the doc, with a “this node is broken” placeholder standing in for the bad node.

To the best of my knowledge, ProseMirror doesn’t have any built-in error handling. (See this post, e.g.)

Does anyone have tips for handling cases like this in production? Is this kind of thing even possible with ProseMirror? At first blush it seems quite complicated, because it might break selections, offsets, etc.

If someone’s thought about this I’d love to hear from them.

How did this happen? The DOMParser shouldn’t be capable of producing invalid documents.

In any case, the approach of this library is to try and keep the document rigidly valid, and I don’t think a messier approach where schema-violating nodes are tolerated sounds very promising.

> How did this happen? The DOMParser shouldn’t be capable of producing invalid documents.

I’ve yet to figure that out. Probably a side effect of some monstrous way we’re using the library.

We semi-regularly find edge cases where an illegal document has somehow been persisted in a way that breaks ProseMirror: a null node attribute that I expected not to be null, e.g. Or, more commonly, someone has made a backwards-incompatible schema change, so that a node was valid at write time and then invalid when loaded three months later.

This is programmer error, obviously, but we haven’t come up with a good way to prevent this. We’ve considered eslint rules or perhaps JSONSchema applied to the doc, but we haven’t implemented anything yet.

Calling Node.check on your document in some strategic locations might help isolate these kinds of things.
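
One such strategic location is load time. The following is a minimal sketch, assuming the document is persisted as JSON. `loadDocSafely`, `CheckableNode`, and `LoadResult` are hypothetical names, and the `parse` callback stands in for something like `Node.fromJSON(schema, json)` from prosemirror-model, so the sketch itself has no dependency on the library.

```typescript
// Structural stand-in for a ProseMirror Node; only check() is needed here.
interface CheckableNode {
  check(): void;
}

// Hypothetical result type: either a validated doc or the error that was thrown.
type LoadResult<N> =
  | { ok: true; doc: N }
  | { ok: false; error: Error };

// Parse and validate in one place, so that a bad document surfaces as a value
// the caller can handle instead of an exception during rendering.
function loadDocSafely<N extends CheckableNode>(
  json: unknown,
  parse: (json: unknown) => N,
): LoadResult<N> {
  try {
    const doc = parse(json); // may throw, e.g. on an unknown node type
    doc.check(); // throws if the node or its descendants violate the schema
    return { ok: true, doc };
  } catch (err) {
    const error = err instanceof Error ? err : new Error(String(err));
    return { ok: false, error };
  }
}
```

With prosemirror-model, the call might look like `loadDocSafely(json, (j) => Node.fromJSON(schema, j))`. On failure, the app could log the error and show a fallback view rather than crash.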

If it helps, Remirror has a utility function that validates JSON using the schema. Probably only useful if you persist your state as JSON though…

It supports transformers for invalid nodes. The default is to remove them, but you could write your own transformer for other behaviour.

This looks interesting! Can you tell me a bit more about how it works?

As I’m trying to reason through it, it seems like the doc node itself would fail a validity check. Does it try and remove the smallest number of nodes or anything like that? (Sorry if this is self-evident in the code–as I’m poking around I’m not seeing anything.)

On perusing that function: it’s recursive, and it doesn’t simply return the top-level node but an array of invalid child nodes (which may include the top-level node).

Another utility fn can then optionally remove every invalid node (or, if you supply your own replacement function, swap each one for an ‘invalid node’ placeholder or something like that).

See the test here: https://github.com/remirror/remirror/blob/68773ba81b9bb9aad08a18b9b3a9f9e735e96123/packages/remirror__core-utils/__tests__/core-utils.spec.ts#L898

Thanks for the pointer. I poked around but failed to draw any conclusions. I also had a hard time seeing whether any of those utilities are usable without Remirror. It looks like they might be generalizable enough, but they return a lot of Remirror types, so I’m not sure.

That is true. From what I saw, you could rewrite something similar pretty quickly. Either way, let us know how it goes. That code is just a guide, and I only linked it because it was mentioned above. In principle, this feature is really about custom recursive iteration: checking validity and swapping or replacing invalid nodes when the document is first loaded into ProseMirror state.
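
The recursive idea could be sketched roughly as follows, working on the persisted JSON before it reaches ProseMirror. This is not Remirror's implementation. `NodeJSON`, `validators`, `isValid`, `sanitize`, `placeholder`, and the `corrupted` node type are all hypothetical, and the rules are hand-written here, whereas a real app would derive them from its schema. Each node is judged only by its own type, text, and attrs, so one bad leaf does not condemn the whole doc. Content expressions are not checked, so it would still be worth calling `Node.check` on the result.

```typescript
// Shape of a node in ProseMirror's JSON serialization (marks omitted).
interface NodeJSON {
  type: string;
  attrs?: Record<string, unknown>;
  content?: NodeJSON[];
  text?: string;
}

type Validator = (node: NodeJSON) => boolean;

// Hypothetical node-local rules, keyed by node type name.
const validators = new Map<string, Validator>([
  ["doc", () => true],
  ["paragraph", () => true],
  // ProseMirror does not allow empty text nodes.
  ["text", (n) => typeof n.text === "string" && n.text.length > 0],
  // A required attribute that must not be null.
  ["image", (n) => typeof n.attrs?.src === "string"],
]);

// Unknown node types count as invalid.
function isValid(n: NodeJSON): boolean {
  const validate = validators.get(n.type);
  return validate !== undefined && validate(n);
}

// Walk the tree top-down. An invalid node is handed to `replace`, which returns
// a substitute node or null to drop it; dropping is the default.
function sanitize(
  node: NodeJSON,
  valid: Validator,
  replace: (bad: NodeJSON) => NodeJSON | null = () => null,
): NodeJSON | null {
  if (!valid(node)) return replace(node);
  if (!node.content) return node;
  const content = node.content
    .map((child) => sanitize(child, valid, replace))
    .filter((child): child is NodeJSON => child !== null);
  return { ...node, content };
}

// Hypothetical placeholder; the schema would need a matching node type that is
// allowed wherever a bad node can occur (inline as well as block positions).
const placeholder = (bad: NodeJSON): NodeJSON => ({
  type: "corrupted",
  attrs: { original: JSON.stringify(bad) },
});

// A doc with an empty text node and an image whose src is null.
const sampleDoc: NodeJSON = {
  type: "doc",
  content: [
    {
      type: "paragraph",
      content: [
        { type: "text", text: "ok" },
        { type: "text", text: "" },
      ],
    },
    { type: "image", attrs: { src: null } },
  ],
};

const withBadNodesRemoved = sanitize(sampleDoc, isValid);
const withPlaceholders = sanitize(sampleDoc, isValid, placeholder);
```

Keeping the original JSON inside the placeholder's attrs means nothing is lost on save, which leaves room to repair the node later.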