[Resolved] Rapid updates while using prosemirror-collab cause unexpected document state

dennmat · April 23, 2024, 4:58am

(Resolved before submitting, but submitting anyway for posterity’s sake, and to hopefully help anyone who stumbles on this)

First, thank you to Marijn for this outstanding piece of OSS.

To anyone with experience in the collaborative features of ProseMirror:

I currently have it mostly working, though my setup may be slightly out of the norm.

I am using an unmodified version of the prosemirror-collab plugin to manage gathering steps to send to my authority server and then “rebasing”.

My authority server is a Python app utilizing a Python implementation of ProseMirror. I recognize there will be concerns about this being a potential failure point; however, I only use it to apply the steps to transform the doc. I also know it is not the failure point in this case, as I’ve tested this issue on an app that I know for certain uses it in the same fashion, and it works there.

So to my current problem.

I have it working fairly well. Everything works as expected, until a connected client sends many requests. Size does not matter, but quantity does.

What happens is that another “client”, simply “watching” the document and making no changes of its own, will suddenly have “extra” content. The authority ends up with the extra content as well, so client 1 is “in the wrong” but is also the only one with the correct state.

Say client 1 submits ‘abcdefghijklmnopqrstuvwxyz’ (at a speed that equates to mashing the keyboard). Client 2, having done nothing, might then see ‘abcdefghijklmnopqrstuvwxyz abcdhij qrst xy’, and so might the authority server.

I believe it has to do with client 1 sending more “steps” before it has “corrected” its state from a previous request, so that it is frequently rebasing on its own changes.

I’ve read the ProseMirror Guide many times over and believe I’ve implemented the authority server to spec. It will reject version mismatches, and the client must then catch up and re-submit, which is largely handled by the prosemirror-collab module.
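
A minimal sketch of that rule (reject a submission whose version does not match, so the client has to catch up and re-submit). This is a toy stand-in, not the Python authority from this post: the document is a plain string, a “step” is just text to append, and the names ToyAuthority, receiveSteps and stepsSince are assumptions made for illustration.

```javascript
// Toy central authority: the doc is a string and each "step" is text to append.
// In a real setup each step would be a ProseMirror Step applied to a document Node.
class ToyAuthority {
  constructor(doc = "") {
    this.doc = doc;
    this.steps = []; // the version is simply the number of accepted steps
  }

  get version() {
    return this.steps.length;
  }

  // Returns true if the steps were accepted, false if the client is behind.
  receiveSteps(clientVersion, steps) {
    if (clientVersion !== this.version) return false; // stale: client must catch up
    for (const step of steps) {
      this.doc += step;
      this.steps.push(step);
    }
    return true;
  }

  // Steps that a client at `clientVersion` has not seen yet.
  stepsSince(clientVersion) {
    return this.steps.slice(clientVersion);
  }
}

const authority = new ToyAuthority();
authority.receiveSteps(0, ["a", "b"]);      // accepted, version is now 2
authority.receiveSteps(0, ["a", "b", "c"]); // same base version sent again: rejected
authority.receiveSteps(2, ["c"]);           // accepted after catching up
console.log(authority.doc); // prints "abc"
```

Without the version check, the second call would append “abc” a second time, which resembles the duplicated fragments in the example above. On a server that handles requests concurrently, the check and the apply presumably need to happen as one atomic operation; that is an assumption about the server and is not shown here.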

After much googling I’ve seen people with various solutions to similar problems, but nothing that matches this exactly, which makes me wonder if I’m simply missing something obvious or simple, or if I need to go the route of a more complex relationship between the client and backend.

Unfortunately, I’m a bit out of my depth here, and would appreciate a nudge in the right direction. Simply debouncing on the client could solve this, rather than one of the more robust commit-confirmation solutions I’ve seen.

However I’m struggling to conceptualize what adding a debounce to the collab module would even look like.

I rubber ducked myself typing this out.

But I’ll keep the post to help out anyone else who hits this. The debouncing was as simple as it seems. I was overthinking it, concerned that calling sendableSteps would “consume” the steps and leave the plugin’s state expecting a subsequent “receive” call. This is not the case: sendableSteps only reads the state, so you can hold off on the receive indefinitely, regardless of how many times you call sendableSteps.

Turning:

// sendableSteps returns null when there is nothing unconfirmed to send
let sendable = collab.sendableSteps(newState);
if (sendable) {
    this.options.sendStepsToAuthority(sendable);
}

into:

let sendable = collab.sendableSteps(newState);
this.debouncedSendableSteps(sendable);

Where debouncedSendableSteps is debounced by lodash’s debounce and contains:

if (sendable) {
    this.options.sendStepsToAuthority(sendable);
}

And the function sendStepsToAuthority, in my setup, simply sends the serialized steps to the authority. I debounced with a 60ms delay.
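
For anyone who wants to see the whole shape in one place, here is a dependency-free sketch of the same idea. It is a variation, not the exact code from my editor: the small debounce helper stands in for lodash’s debounce, createDebouncedSender and getSendable are hypothetical names, and getSendable is assumed to wrap something like collab.sendableSteps on the current editor state. The one design difference is that the steps are read when the timer fires rather than when the debounced function is called.

```javascript
// Minimal trailing-edge debounce with a flush() method, standing in for
// lodash's debounce so that this sketch has no dependencies.
function debounce(fn, waitMs) {
  let timer = null;
  function run() {
    timer = null;
    fn();
  }
  function debounced() {
    clearTimeout(timer); // restart the delay on every call
    timer = setTimeout(run, waitMs);
  }
  // Run a pending call immediately (handy in tests or before closing the editor).
  debounced.flush = function () {
    if (timer !== null) {
      clearTimeout(timer);
      run();
    }
  };
  return debounced;
}

// Hypothetical wiring: getSendable stands in for a function that returns
// collab.sendableSteps(currentState); sendStepsToAuthority is the transport.
function createDebouncedSender(getSendable, sendStepsToAuthority, waitMs = 60) {
  return debounce(function () {
    const sendable = getSendable(); // read at fire time, so it covers the whole burst
    if (sendable) sendStepsToAuthority(sendable);
  }, waitMs);
}

// Usage: call notify() on every state update; one request goes out per burst.
const unconfirmed = [];
const sent = [];
const notify = createDebouncedSender(
  () => (unconfirmed.length ? { version: 0, steps: unconfirmed.slice() } : null),
  (sendable) => sent.push(sendable)
);
for (const ch of "abc") {
  unconfirmed.push(ch);
  notify();
}
notify.flush(); // skip the 60ms wait for the sake of the example
console.log(sent.length, sent[0].steps.join("")); // prints "1 abc"
```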

This seems to have solved it, at least for now: I can no longer cause the clients or the authority to mismatch.

In my searching I stumbled on the thread “Prosemirror Collab Performance : prosemirror-collab-commit” (post #4, by marijn). If you do debounce and still experience this, you may want to investigate that route. As for my use case, I’ll never have more than a few editors at a time, so I don’t think I’ll need to explore it.

This does seem to imply that, though the current documentation is correct under optimal conditions, it does not highlight what might be obvious to smarter or more experienced devs in this domain: multiple steps being sent to an authority before the originals are resolved can lead to discrepancies in state.
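
One way to rule that condition out, rather than only making it less likely with a delay, would be to allow a single submission in flight at a time. The following is only a sketch of that idea under stated assumptions: createSingleFlightSender, getSendable and the done callback are hypothetical names, getSendable is assumed to wrap something like collab.sendableSteps on the current state, and done is assumed to be called only after the authority’s response has been applied to the editor state.

```javascript
// Allow at most one outstanding request to the authority.
function createSingleFlightSender(getSendable, sendStepsToAuthority) {
  let inFlight = false;

  // Returns true if a request was started, false otherwise.
  function trySend() {
    if (inFlight) return false; // wait for the outstanding request
    const sendable = getSendable();
    if (!sendable) return false; // nothing unconfirmed
    inFlight = true;
    sendStepsToAuthority(sendable, function done() {
      inFlight = false;
      trySend(); // pick up steps that accumulated while waiting
    });
    return true;
  }
  return trySend;
}

// Usage with a fake transport that records requests instead of sending them.
let pending = { version: 0, steps: ["a"] };
const requests = [];
const trySend = createSingleFlightSender(
  () => pending,
  (sendable, done) => requests.push({ sendable, done })
);

trySend();                                   // starts request 1
pending = { version: 0, steps: ["a", "b"] }; // user keeps typing
trySend();                                   // ignored: request 1 is still in flight
pending = { version: 1, steps: ["b"] };      // "a" was confirmed and applied
requests[0].done();                          // starts request 2 with the remainder
console.log(requests.length, requests[1].sendable.version); // prints "2 1"
```

This could be combined with the debounce, using the delay to batch keystrokes and the in-flight flag to prevent overlapping submissions.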

Thanks again Marijn. If I can ever get something off the ground I will gladly sponsor this project – looking forward to when I’m able to.