I’m using ProseMirror for collaborative editing, and I’ve made a mistake in my implementation that I’d like to discuss.
The problem is quite simple — instead of broadcasting confirmed steps to collaborators as a batch, my server broadcasts them one by one.
This means that each step is received and applied as a separate transaction, as opposed to them all being applied in a single transaction. This has only recently become a serious problem after I tried incorporating prosemirror-tables into my editor.
What’s significant about prosemirror-tables is that it uses appendTransaction to check that all the tables in the document are well-formed (e.g. rectangular), and to apply any steps necessary to fix them. It also uses multiple steps for insert/delete row/column operations (specifically, one step per cell being inserted or deleted).
When these behaviours combine (steps broadcast separately, multiple steps per table operation, and appendTransaction fixing tables after every transaction), collaborative editing of tables in my editor triggers an endless cycle of new transactions: each individually applied step can leave a table momentarily malformed, so the fixer generates fix-up steps, which are themselves broadcast and applied one by one.
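To make the failure mode concrete, here’s a toy simulation (my own model, not the ProseMirror API): a “document” is an array of per-row cell counts, a column insert is one step per row, and a fixer pads rows to rectangular after every transaction, mimicking what an appendTransaction-style hook does.

```javascript
// Returns the fix-up steps needed to make the table rectangular
// (one "pad row i" step per missing cell).
function fixTable(rows) {
  const max = Math.max(...rows);
  const fixes = [];
  rows.forEach((count, i) => {
    for (let n = count; n < max; n++) fixes.push(i);
  });
  return fixes;
}

function apply(rows, steps) {
  for (const row of steps) rows[row] += 1;
}

// A remote column insert arrives as one step per row.
const insertColumn = [0, 1, 2];

// Applying the steps one transaction at a time: after each step the
// fixer sees a non-rectangular table and emits spurious fix-up steps.
let doc = [2, 2, 2];
let spuriousFixes = 0;
for (const step of insertColumn) {
  apply(doc, [step]);
  const fixes = fixTable(doc);
  spuriousFixes += fixes.length;
  apply(doc, fixes); // in the real system these would also be broadcast
}

// Applying them as a single transaction: the table is already
// rectangular when the fixer runs, so no fix-up steps are generated.
let doc2 = [2, 2, 2];
apply(doc2, insertColumn);
const batchFixes = fixTable(doc2).length;

console.log(spuriousFixes, batchFixes); // → 6 0
```

Note the one-by-one run doesn’t just generate extra steps — it leaves the document as `[5, 5, 5]` instead of the intended `[3, 3, 3]`, and in a real collaborative setup those fix-up steps would be broadcast to peers whose own fixers react in turn, which is the cycle.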
The simplest solution might just be to broadcast steps in batches, ensuring they’re all applied in a single transaction. (Of course, this is only a problem in my implementation; the reference implementation batches its broadcasts and doesn’t have this issue.)
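A batched design could look something like this sketch of a central authority (the names `Authority`, `receive`, and `stepsSince` are my own, not ProseMirror’s):

```javascript
// Central authority: stores every confirmed step in order and hands
// back everything since a given version in one response, so the
// receiving client can apply it all in a single transaction.
class Authority {
  constructor() {
    this.steps = [];      // every confirmed step, in order
    this.clientIDs = [];  // who produced each step
  }

  get version() { return this.steps.length; }

  // Accept steps only if the sender is up to date; otherwise it must
  // pull, rebase, and retry (as prosemirror-collab expects).
  receive(version, steps, clientID) {
    if (version !== this.version) return false;
    for (const step of steps) {
      this.steps.push(step);
      this.clientIDs.push(clientID);
    }
    return true;
  }

  // Everything after `version`, as one batch.
  stepsSince(version) {
    return {
      steps: this.steps.slice(version),
      clientIDs: this.clientIDs.slice(version),
    };
  }
}

const authority = new Authority();
const accepted = authority.receive(0, ["s1", "s2", "s3"], "alice"); // true
const stale = authority.receive(0, ["s4"], "bob");  // false: bob is behind
const batch = authority.stepsSince(0);              // all 3 steps at once
```

On the client, a batch like this maps onto a single `receiveTransaction(state, steps, clientIDs)` call from prosemirror-collab, which applies all the remote steps in one transaction — so appendTransaction only runs once, after the table is whole again.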
However, even with that solution, there are some issues that leave me feeling uncomfortable:
Broken history versions
I use the storage approach suggested by the simple reference implementation, where all steps are stored and the version of the document is simply the number of steps that have been applied. However, with this model some versions of the document are broken (e.g. halfway through inserting a column into a table).
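One refinement I can imagine (my own sketch, not something from the reference implementation) is to record, alongside the step log, which version numbers fall on transaction boundaries. Versions on a boundary are well-formed documents; versions inside a batch (e.g. mid column-insert) are treated as unaddressable:

```javascript
// Step log where versions are step counts, but only versions on a
// transaction boundary count as valid document snapshots.
class StepLog {
  constructor() {
    this.steps = [];
    this.boundaries = [0]; // version 0 (the empty log) is always valid
  }

  get version() { return this.steps.length; }

  // Append a whole transaction's worth of steps atomically, and record
  // the resulting version as a valid snapshot.
  append(steps) {
    this.steps.push(...steps);
    this.boundaries.push(this.version);
  }

  // Snap an arbitrary version back to the nearest valid snapshot at or
  // before it.
  nearestSnapshot(version) {
    let best = 0;
    for (const b of this.boundaries) if (b <= version) best = b;
    return best;
  }
}

const log = new StepLog();
log.append(["insert cell r0", "insert cell r1", "insert cell r2"]); // one column insert
log.append(["type 'a'"]);
const snapMid = log.nearestSnapshot(2); // → 0: version 2 is mid column-insert
const snapEnd = log.nearestSnapshot(3); // → 3: the insert is complete here
```

The step-count versioning is unchanged (so the collab protocol still works), but anything that renders or restores a historical version only ever lands on a boundary.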
I think there are some conceptual similarities with undo history, and with deciding what the appropriate “save points” are that a user can undo back to. (As far as I know, prosemirror-history just starts a new undo group after a pause in editing: the newGroupDelay option, which defaults to 500ms?)
Chunked transmission
I think it’s a good idea to allow clients to send their changes to the server in multiple batches, rather than requiring all of their changes to be sent together. The scenarios I had in mind were:
- the server has a maximum request size to ensure predictable performance, and
- a client has collected a large volume of steps (e.g. several large copy/pastes in a short time, or after being offline for a while), has a slow network connection, and is fighting against other collaborators to get its steps confirmed
Allowing a client to send a subset of their changes at a time seemed like a desirable property of the system.
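A sketch of what client-side chunking might look like (helper names are my own; in real code the pending steps would come from prosemirror-collab’s `sendableSteps(state)`):

```javascript
// Split pending steps into batches whose serialized size stays under
// the server's request limit; each batch would be sent sequentially,
// bumping the client's version as the server confirms it.
function chunkSteps(steps, maxBytes) {
  const chunks = [];
  let current = [];
  let size = 0;
  for (const step of steps) {
    const stepSize = JSON.stringify(step).length;
    if (current.length > 0 && size + stepSize > maxBytes) {
      chunks.push(current);
      current = [];
      size = 0;
    }
    current.push(step);
    size += stepSize;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

// Hypothetical serialized steps, each 24 bytes as JSON.
const pending = [
  { t: "replace", len: 40 },
  { t: "replace", len: 40 },
  { t: "replace", len: 40 },
];
const chunks = chunkSteps(pending, 50); // → two chunks: 2 steps, then 1
```

One caveat that ties back to the broken-versions issue: each chunk should end on a transaction boundary, otherwise collaborators will see (and the history will record) exactly the broken intermediate versions described above, so a real implementation would only split between transactions, not inside one.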
I’m curious what approach other people have taken for these issues.