Theoretical saving question


Hoping to get some opinions from the experienced creators in the crowd.

Is it better to save a user’s ProseMirror document as a whole file in the database, or as separate individual changes?

Currently, a user creates a written document here in the editor. We save the entire doc locally every third of a second, then save a backup copy on the server when the doc or the page is closed. The editor always runs on the local copy.

But I realized we can’t reconstitute an earlier version of a document later, because the copy on the server can be overwritten from the local one.

I realize this is a bit off-topic, but I wanted to draw on people’s experiences.



I don’t quite follow this. When the user loads the page, you could check whether the server timestamp is greater than the local one, in which case you’d use the server’s version.
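A minimal sketch of that check, in case it helps. The `SavedDoc` shape and the `pickNewest` name are made up for illustration, and it assumes the local and server timestamps come from comparable clocks:

```typescript
// Hypothetical shape of a saved copy; the field names are assumptions.
interface SavedDoc {
  html: string;
  updatedAt: number; // milliseconds since the epoch
}

// Return whichever copy was saved most recently.
// On a tie (or if only one copy exists) the local copy wins.
function pickNewest(local: SavedDoc | null, server: SavedDoc | null): SavedDoc | null {
  if (!local) return server;
  if (!server) return local;
  return server.updatedAt > local.updatedAt ? server : local;
}
```

Note this only decides which copy to load. It does not help with recovering an older version once both copies have been overwritten.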

Anyway, to answer your question: I currently store the full HTML document in the database. I haven’t yet tackled change history, which will probably entail also storing changes in the database.

Hi! Thanks!

Understood, I’m saving the full HTML from the editor too.

Not sure I understand. Their local copy in browser storage will always be the newest one, with all their changes.

But let’s say they do something really bad locally. If I had saved all the document changes, I could reconstitute to a time stamp from the db.

The only other backup option would be to save multiple full copies in the db?


This very much depends on your use case and the features you wish to support. For my use case, I store a single record for the document, so I’m writing over the previous document with the newly saved one (and then Rails automatically updates the updated_at timestamp for me).

If you want to support change history (which I do, but am not tackling yet), you’ll need to either A. store every version of the document or B. store the diffs (I believe ProseMirror can output some JSON representation of the changes). Again, I haven’t tackled this part yet, so my understanding is a little vague.
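For option A, here is a rough sketch of what "store every version" could look like, kept in memory so it runs on its own. `SnapshotHistory`, `save`, and `restoreAt` are hypothetical names, and a real version would write rows to the database instead of an array:

```typescript
interface Snapshot {
  savedAt: number; // milliseconds since the epoch
  html: string;
}

// Append-only list of full document copies.
// Assumes save() is called with non-decreasing timestamps.
class SnapshotHistory {
  private snapshots: Snapshot[] = [];

  save(html: string, savedAt: number): void {
    const last = this.snapshots[this.snapshots.length - 1];
    // Skip the write when nothing changed since the previous snapshot.
    if (last && last.html === html) return;
    this.snapshots.push({ savedAt, html });
  }

  // Latest snapshot saved at or before `time`, or null if there is none.
  restoreAt(time: number): Snapshot | null {
    let found: Snapshot | null = null;
    for (const snapshot of this.snapshots) {
      if (snapshot.savedAt > time) break;
      found = snapshot;
    }
    return found;
  }

  get length(): number {
    return this.snapshots.length;
  }
}
```

The trade-off is storage: every save keeps a full copy, which is why option B (storing the changes) gets attractive for long documents.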

Understood. Whoops…=))) I was editing. =) Sorry.

So you may have the same problem as me: once the copy is written over, the previous version is gone.

Yeap :(, which is why change history is on the road map!

We may be tackling the same issues. I need to make it possible to help if user badness occurs. =)

Both are valid approaches – I know several users are saving all steps, in order to be able to later display or revert changes, and to be able to go back to earlier versions. Whether you need that complexity depends on what you want to do.

Thanks! I definitely appreciate the info. I’m curious how saving all the changes will affect server bandwidth with a large, high-volume user group, but it’s such a new thing that we may not find out until later.

Thanks for letting me diverge a bit.

What I’m trying to achieve is saving all steps on the server, but compacting old steps using Step.merge and replacing large chunks of very old steps with checkpoints (docs). This is quite difficult in an implementation using a NoSQL DB where a client does the compacting: it needs to find a safe version to compact to, one that all current clients have already used, and there is the issue of offline clients coming back online.

Once done, though, the advantage is huge. A client can log in and then travel back in time to the beginning. It gets even better in a collab situation, where another user could have messed up your document. As far as I know, no editor available today can do this (at best, users can save/mark versions, etc.).

I’m trying to do the same thing right now. For my use case, it seems like an operation that replaces an arbitrary number of old steps, up to and including everything except the unconfirmed change list, with a compacted document would be the simplest solution.
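Roughly, such an operation might look like the sketch below. Everything here (`StepLog`, `compactTo`, the toy `InsertStep`) is a hypothetical stand-in so the example runs without ProseMirror. With real ProseMirror steps, the `apply` callback would presumably wrap `step.apply(doc)` and check the result for failure:

```typescript
// A log is a checkpoint document plus the steps recorded after it.
// steps[i] moves the document from version (checkpoint.version + i)
// to version (checkpoint.version + i + 1).
interface StepLog<Doc, Step> {
  checkpoint: { version: number; doc: Doc };
  steps: Step[];
}

// Fold every step up to `targetVersion` into a new checkpoint and
// keep only the steps after it. The input log is not mutated.
function compactTo<Doc, Step>(
  log: StepLog<Doc, Step>,
  targetVersion: number,
  apply: (doc: Doc, step: Step) => Doc
): StepLog<Doc, Step> {
  const count = targetVersion - log.checkpoint.version;
  if (count < 0 || count > log.steps.length) {
    throw new RangeError("target version is outside the log");
  }
  let doc = log.checkpoint.doc;
  for (const step of log.steps.slice(0, count)) {
    doc = apply(doc, step);
  }
  return {
    checkpoint: { version: targetVersion, doc },
    steps: log.steps.slice(count),
  };
}

// Toy document and step types: a string and a text insertion.
interface InsertStep {
  pos: number;
  text: string;
}

const applyInsert = (doc: string, step: InsertStep): string =>
  doc.slice(0, step.pos) + step.text + doc.slice(step.pos);

const exampleLog: StepLog<string, InsertStep> = {
  checkpoint: { version: 0, doc: "" },
  steps: [
    { pos: 0, text: "a" },
    { pos: 1, text: "b" },
    { pos: 2, text: "c" },
  ],
};

// Compact the first two steps; the third stays in the log.
const compacted = compactTo(exampleLog, 2, applyInsert);
```

Choosing `targetVersion` is the hard part, since anything before it can no longer be replayed step by step.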

You can use prosemirror-compress to merge consecutive steps and then compress their JSON representations before sending them to the server.
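To illustrate just the merging half of that idea (this is not prosemirror-compress’s API), here is a small sketch. The `Mergeable` interface is my own name; it mirrors the shape of `merge(other)` on ProseMirror steps, which returns a merged step or null when the two can’t be combined. `InsertText` is a toy step so the example runs standalone:

```typescript
// Anything with a ProseMirror-style merge method: returns the combined
// step, or null when the two steps cannot be merged.
interface Mergeable<T> {
  merge(other: T): T | null;
}

// Walk the list once, folding each step into the previous one when possible.
function mergeAdjacent<T extends Mergeable<T>>(steps: readonly T[]): T[] {
  const out: T[] = [];
  for (const step of steps) {
    const last = out[out.length - 1];
    const merged = last ? last.merge(step) : null;
    if (merged) {
      out[out.length - 1] = merged;
    } else {
      out.push(step);
    }
  }
  return out;
}

// Toy step: insert `text` at `pos`. Two inserts merge only when the
// second one starts exactly where the first one ended.
class InsertText implements Mergeable<InsertText> {
  constructor(readonly pos: number, readonly text: string) {}

  merge(other: InsertText): InsertText | null {
    return other.pos === this.pos + this.text.length
      ? new InsertText(this.pos, this.text + other.text)
      : null;
  }
}

const typed = [new InsertText(0, "a"), new InsertText(1, "b"), new InsertText(5, "x")];
const mergedSteps = mergeAdjacent(typed);
```

Here the first two inserts collapse into one and the third stays separate, which is the typical shape for a user typing a run of characters.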

mikeb - what I think may be an issue is a very slow client still catching up on steps while a fast client has decided to compress them … this is hard to reason about …

xylk - the steps merge function in prosemirror-compress is great (much more elegant than what I came up with :smile:), but I’m not sure about shortening property names. Any API change to property names can break this and be a nightmare to fix if you have already compressed data in the db …

The steps can be merged and compressed before they are sent to the server. If not, maybe merging can be done after a checkpoint is created?

Thanks! Names are shortened using string constants. What change do you think could unavoidably break this?

Yes, but that’s not good enough. Consider the simple case of two clients, one only viewing and the other just typing, and suppose every second we check whether there are new steps and send them. Every time we check and find new steps, we can merge them (say 3-4 steps into 1) and send them. But the next time, since the reading client has probably already read the old steps, we need to (merge and) write the new steps to a new ‘revision’.

So we will still end up with a lot of steps in the DB, when in fact, in this common case, all steps could probably be merged into one.

“merging can be done after a checkpoint is created”

Yes, definitely, but as I wrote above, it is hard to reason about which version is safe to compact to. You need to know where all other connected clients are “at” (reading). Then there is the even more difficult case of disconnected clients reconnecting …
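One possible policy, sketched below, is to compact only up to the lowest version any recently seen client has confirmed, and to treat clients that have been silent past some cutoff as needing a full reload from the checkpoint. That cutoff rule is an assumption of mine rather than something settled in this thread, and `ClientState` and `safeCompactionVersion` are hypothetical names:

```typescript
interface ClientState {
  id: string;
  version: number;  // last version this client confirmed reading
  lastSeen: number; // ms timestamp of the last contact with this client
}

// Highest version that is safe to compact to: no client seen within
// `maxOfflineMs` still needs the steps before it. With no recent
// clients, everything up to `headVersion` can be compacted.
function safeCompactionVersion(
  clients: readonly ClientState[],
  headVersion: number,
  now: number,
  maxOfflineMs: number
): number {
  let safe = headVersion;
  for (const client of clients) {
    // Stale clients are ignored here; on reconnect they would have to
    // load the checkpoint document instead of replaying old steps.
    if (now - client.lastSeen > maxOfflineMs) continue;
    safe = Math.min(safe, client.version);
  }
  return safe;
}
```

This still leaves the race where a client reports its version just after the compaction decision was made, so the server (or whoever compacts) would need to make the read and the write atomic.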

“what change do you think could unavoidably break this”

PM changing the name of ‘strong’ to ‘bold’ for example, or adding a new mark named ‘s’ … (both unlikely I agree) …
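To make the second case concrete, here is a small sketch of how a collision could be detected before compressing. The short-name table is invented for the example and is not taken from prosemirror-compress, and `findCollisions` is a hypothetical helper:

```typescript
// Hypothetical long-name -> short-code table used when compressing.
const SHORT_NAMES: Record<string, string> = {
  strong: "s",
  em: "e",
};

// Return schema names that are identical to a short code already
// assigned to a different name. After compression those two names
// would be indistinguishable.
function findCollisions(
  shortNames: Record<string, string>,
  schemaNames: readonly string[]
): string[] {
  const ownerOfCode: Record<string, string> = {};
  for (const longName of Object.keys(shortNames)) {
    const code = shortNames[longName];
    if (code !== undefined) ownerOfCode[code] = longName;
  }
  return schemaNames.filter((name) => {
    const owner = ownerOfCode[name];
    return owner !== undefined && owner !== name;
  });
}

// A schema that later gains a mark literally named "s".
const collisions = findCollisions(SHORT_NAMES, ["strong", "em", "s"]);
```

A check like this could run at startup. It does not help with the rename case, though: data already compressed under the old table would need the old table (or a version tag stored with it) to be read back.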