Collab editing — step grouping

I’m using ProseMirror for collaborative editing, but I’ve made a mistake in my implementation that I’d like to discuss.

The problem is quite simple — instead of broadcasting confirmed steps to collaborators as a batch, they are broadcast one-by-one.

This means that each step is received and applied as a separate transaction, as opposed to them all being applied in a single transaction. This has only recently become a serious problem after I tried incorporating prosemirror-tables into my editor.

What’s significant about prosemirror-tables is that it uses appendTransaction to check that all the tables in the document are well-formed (e.g. rectangular), and to apply any steps necessary to fix them. It also uses multiple steps for insert/delete row/column operations (specifically, one step per cell that’s being inserted or deleted).

When these three behaviours combine (steps broadcast separately, multiple steps per table operation, and appendTransaction to fix tables), collaborative editing of tables in my editor triggers an endless cycle of new transactions.

The simplest solution here might just be to broadcast steps in batches, to ensure they’re all applied in a single transaction. (Of course, this is only a problem in my implementation; the reference implementation doesn’t have this issue, as it broadcasts in batches.)
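As a rough illustration of what batched broadcasting could look like on the server, here is a minimal in-memory sketch. It is not the reference implementation: `Authority`, `StepBatch`, `subscribe` and `receiveSteps` are hypothetical names, steps are treated as opaque JSON, and persistence and networking are left out entirely.

```typescript
// Hypothetical type: a step is treated as opaque JSON in this sketch.
type StepJSON = Record<string, unknown>;

// One accepted client request, forwarded to collaborators as a unit.
interface StepBatch {
  version: number;   // document version before the batch was applied
  steps: StepJSON[]; // every step accepted from a single client request
  clientID: string;
}

type Listener = (batch: StepBatch) => void;

class Authority {
  private steps: StepJSON[] = [];
  private listeners: Listener[] = [];

  // The version is simply the number of steps stored so far.
  get version(): number {
    return this.steps.length;
  }

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  // Accept a batch only if the client was up to date, then broadcast the
  // whole batch once instead of once per step.
  receiveSteps(version: number, steps: StepJSON[], clientID: string): boolean {
    if (version !== this.version) return false;
    this.steps.push(...steps);
    const batch: StepBatch = { version, steps, clientID };
    for (const listener of this.listeners) listener(batch);
    return true;
  }
}
```

On the receiving side, the whole batch would then be handed to a single `receiveTransaction` call from prosemirror-collab, so that appendTransaction only runs after all of the steps in the batch have been applied.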

However, even with that solution, there are some issues that leave me feeling uncomfortable:

Broken history versions

I use the storage approach suggested by the simple reference implementation, where all steps are stored, and the version of the document is simply the number of steps that have been applied. However, with this model some versions of the document are broken (e.g. halfway through inserting a column into a table).
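One possible way to tell intact versions from broken ones, sketched below, is to record where each accepted batch ended alongside the step list. `StepLog`, `append` and `nearestSafeVersion` are hypothetical names, and the sketch assumes that every batch ends at a point where the document is consistent, which would not hold if a single operation were split across batches.

```typescript
// A step log that also remembers where each accepted batch ended.
class StepLog<S> {
  private steps: S[] = [];
  // Versions at which a complete batch ended; 0 is the starting document.
  private boundaries: number[] = [0];

  // The version is the number of steps applied so far.
  get version(): number {
    return this.steps.length;
  }

  // Store a whole batch and mark its end as a boundary.
  append(batch: S[]): number {
    if (batch.length === 0) return this.version;
    this.steps.push(...batch);
    this.boundaries.push(this.steps.length);
    return this.version;
  }

  // Largest batch boundary that is less than or equal to the given version.
  nearestSafeVersion(version: number): number {
    let safe = 0;
    for (const boundary of this.boundaries) {
      if (boundary > version) break;
      safe = boundary;
    }
    return safe;
  }

  // Steps needed to rebuild the document at the given version.
  stepsUpTo(version: number): S[] {
    return this.steps.slice(0, version);
  }
}
```

With this, a request for a version in the middle of a multi-step table operation could be rounded down to the last batch boundary before rebuilding the document.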

I think there are some conceptual similarities with undo history, and with deciding which points are appropriate “save points” in history that a user could undo back to. (As far as I know this is just a 500ms rolling buffer?)

Chunked transmission

I think it’s a good idea to allow clients to send their changes to the server in multiple batches, rather than requiring all of their changes to be sent together. The scenarios I had in mind were:

  • the server has a maximum request size to ensure predictable performance, and
  • a client has collected a large volume of steps (e.g. multiple large copy/paste in a short time, or has been offline for some time), and has a slow network connection and is fighting against other collaborators to get their steps confirmed

Allowing a client to send a subset of their changes at a time seemed like a desirable property of the system.


I’m curious what approach other people have taken for these issues.


Right, a collab implementation should batch transactions at least on the same granularity as applyTransaction—the approach taken in the example, where you call sendableSteps after fully applying a transaction (which may cause appended transactions to come into existence) should address this.

Where are you sending the steps, currently? I can’t really think of an easy way to do this wrong, except by having side effects in plugin apply methods or some similarly scary thing. Or do you get them all at a time, and then send them out in separate messages?

The algorithm I have at the moment is basically:

Client:

Call sendableSteps and take as many steps as will fit within the server’s maximum HTTP request size. (In theory, this can split up steps that shouldn’t be split.) The steps that didn’t fit are sent in a subsequent, separate request.
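The client-side selection could be sketched roughly as follows, with one change from the description above: steps carry a group key, and a chunk only ever ends on a group boundary, so steps that belong together are not split. `PendingStep`, `group`, `approxSize` and `takeChunk` are hypothetical names, how the group key would be derived is left open, and the size estimate counts characters of the JSON rather than bytes on the wire.

```typescript
interface PendingStep {
  group: number; // hypothetical: steps sharing a value must travel together
  json: unknown; // the serialised step
}

// Approximate size in UTF-16 code units; a real request limit is in bytes.
function approxSize(step: PendingStep): number {
  return JSON.stringify(step.json).length;
}

// Take the longest prefix of whole groups that fits within maxSize.
function takeChunk(pending: PendingStep[], maxSize: number): PendingStep[] {
  const chunk: PendingStep[] = [];
  let used = 0;
  let i = 0;
  while (i < pending.length) {
    // Measure the next complete group.
    const group = pending[i].group;
    let j = i;
    let groupSize = 0;
    while (j < pending.length && pending[j].group === group) {
      groupSize += approxSize(pending[j]);
      j++;
    }
    // Stop before a group that would push the request over the limit.
    if (used + groupSize > maxSize) break;
    for (let k = i; k < j; k++) chunk.push(pending[k]);
    used += groupSize;
    i = j;
  }
  return chunk;
}
```

Note that if the first group alone exceeds the limit, this returns an empty chunk, so a real client would need some fallback for that case.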

Server:

Each time a step is persisted, broadcast it out to all clients.


So the immediate problem is not so much in sendableSteps, but instead in the server broadcasting steps individually, rather than in the batch that they came in from the client.

One more thing I was thinking about is that it’s important to avoid sending steps between a transaction and a subsequent one added via appendTransaction.

In the case of prosemirror-tables, if an irregular table is transmitted to the authority, followed by a separate transmission of the “fix” transaction, it seems like every collaborator in the session would attempt to send their own “fix” transaction (which would actually cause the table to become irregular), and the runaway cycle of transactions would occur again.

I’ll need to do a bit of investigation to see exactly when I’m transmitting steps to see if it could occur between creating an irregular table and the subsequent “fix” transaction.