Ways to serialize/deserialize schema spec

Hey, I am wondering how to serialize/deserialize the schema spec. I want to save it separately in a database as JSON. There seem to be two things stopping me from doing that right now:

let spec = view.state.schema.spec
let specCopy = JSON.parse(JSON.stringify(spec))
  1. specCopy is missing the toDOM methods. (I think I can live with this as I need to use it on the server.)
  2. spec.nodes and spec.marks are OrderedMaps, whereas specCopy.nodes and specCopy.marks are arrays.

I see that new Schema({nodes, marks}) accepts Objects instead of OrderedMaps for nodes and marks. I wonder if it could not also be made to accept these types of arrays that OrderedMaps are being serialized into, or if there is some other way of storing a spec in pure JSON data that I have not thought about?

Schema specs contain function values, and thus can’t be serialized as JSON. And no, making new Schema accept the gook you get when you JSON-serialize an OrderedMap is not something I’m interested in.
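To illustrate the point about function values, here is a minimal sketch. It uses a plain object that only resembles a node spec (the object and its field values are made up for illustration, not taken from a real schema) and shows that a JSON round trip silently drops the function-valued property:

```javascript
// A plain object shaped like a small node spec (hypothetical example).
const paragraphSpec = {
  content: "inline*",
  group: "block",
  parseDOM: [{ tag: "p" }],
  toDOM() { return ["p", 0]; }
};

// JSON.stringify skips object properties whose value is a function,
// so the round-tripped copy has no toDOM at all.
const roundTripped = JSON.parse(JSON.stringify(paragraphSpec));

console.log(Object.keys(roundTripped));
console.log(typeof roundTripped.toDOM);
```

The plain data fields (content, group, parseDOM with simple tag rules) survive; anything that is a function does not.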

well, there is always

JSON.stringify(spec, (k,v) => typeof v === "function" ? "" + v : v);

which does serialize it.

For others who might want to save the spec or move it around, this seems to be the solution to reload it:

    // reviver for JSON.parse(json, reviver); assumes OrderedMap is imported from "orderedmap"
    (k,v) =>
       typeof v === "string" && v.startsWith('function ') ?
           eval(`(${v})`) :
       ['nodes', 'marks'].includes(k) && v && v.content ?
           new OrderedMap(v.content) :
           v

Hey @johanneswilm - we’re trying to do the same thing and were excited to see your post. Regrettably, the above stringify method seems to be producing invalid JSON when passed the basic schema spec. Just want to double check that this is still working for you?

Hey @mtejera, I believe I was using this for prosemirror-python, but it’s not a package I am actually using as of today (the Python version is comparatively slow, but it’s working OK, may use it in the future). I didn’t need to use the parseDOM and toDOM methods as I was just looking for a way to apply steps in Python.

If it’s not working for you, I’d still think it’s something you can resolve by investigating the issue a bit more. I am not the original inventor of the above method and it’s a fairly common use case in JavaScript land.

Okay - thanks for the update @johanneswilm!

@marijn Can you please clarify? In the collab demo, the server part and the client part use the same schema from the same file.

In my case the client editors have different plugins and options on different project pages, so schema initialization is quite complex and I’m not sure that I can duplicate its initialization on the collab server. But I need the schema to be able to apply steps on the server. Is there no official way to serialize the schema structure and pass it to the collab server?

Not really. But since the schema was created by your code, shouldn’t it be possible to make that code produce the same schema again?

Not trying to necropost, just trying to save some time for people that would stumble upon the same question…

That won’t work as long as your serialized functions access variables outside of their own scope (and probably in some other cases). That is, the deserialized schema would not be functional even for the basic schema from prosemirror.
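A minimal sketch of that failure mode, assuming the stringify/eval approach from earlier in the thread. The spec here is a hypothetical plain object, and `buildSpec` and `paragraphDOM` are names invented for this example; the only point is that the function's source text travels, while the variable it closes over does not:

```javascript
// Hypothetical spec whose toDOM relies on a variable from an enclosing scope.
function buildSpec() {
  const paragraphDOM = ["p", 0]; // only visible inside buildSpec
  return {
    nodes: {
      paragraph: {
        content: "inline*",
        toDOM: function () { return paragraphDOM; }
      }
    }
  };
}

// Serialize functions as their source text, as suggested earlier in the thread.
const json = JSON.stringify(buildSpec(), (k, v) =>
  typeof v === "function" ? "" + v : v);

// Revive them with eval. The source text comes back, the closure does not.
const revived = JSON.parse(json, (k, v) =>
  typeof v === "string" && v.startsWith("function ") ? eval(`(${v})`) : v);

let error = null;
try {
  revived.nodes.paragraph.toDOM();
} catch (e) {
  error = e; // paragraphDOM does not exist where the function was revived
}
console.log(error && error.name);
```

The original function works because it still has access to `paragraphDOM`; the revived copy is a new function with the same text but none of the surrounding scope.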

I wanted to use this approach for a collab scenario where documents could be created with an arbitrary schema and the server would accept any schema, but it turned out that wasn’t possible. What I ended up going for was having the schema code (identified by id) shared by both client and server, with the client passing a schema id whenever it wants to create a new document.
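A rough sketch of that arrangement, under the assumption that one module is imported by both client and server. The registry, the ids, the message shape and `getSchemaSpec` are all hypothetical names made up for this example, and the specs are deliberately tiny:

```javascript
// Hypothetical shared module: schema specs keyed by id, imported on both sides.
const schemaSpecs = new Map([
  ["article-v1", {
    nodes: {
      doc: { content: "paragraph+" },
      paragraph: { content: "text*", toDOM() { return ["p", 0]; } },
      text: {}
    },
    marks: {}
  }],
  ["title-only-v1", {
    nodes: {
      doc: { content: "text*" },
      text: {}
    },
    marks: {}
  }]
]);

function getSchemaSpec(schemaId) {
  const spec = schemaSpecs.get(schemaId);
  if (!spec) throw new Error(`Unknown schema id: ${schemaId}`);
  return spec;
}

// Client side: only the id travels over the wire, never the spec itself.
const createRequest = JSON.stringify({ type: "createDocument", schemaId: "article-v1" });

// Server side: look the spec up from the same shared code.
const { schemaId } = JSON.parse(createRequest);
const spec = getSchemaSpec(schemaId);
// In a real setup the spec would now go to `new Schema(spec)` from prosemirror-model.
console.log(schemaId, typeof spec.nodes.paragraph.toDOM);
```

Because the functions never leave the code they were defined in, nothing has to be serialized except a short string.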

That is, the mentioned way of serializing/deserializing a schema might work in some cases, but generally it will not.

What won’t work?

Actually, since posting last time, we have switched to doing exactly this. We are using prosemirror-python on the backend to apply steps and export the relevant parts of the schema from JavaScript to Python using this code:

fiduswriter/export.js at 96fcde7aabaa66eef7c811c6d40ca19940c80f32 · fiduswriter/fiduswriter · GitHub (JavaScript)

fiduswriter/export_schema.py at 96fcde7aabaa66eef7c811c6d40ca19940c80f32 · fiduswriter/fiduswriter · GitHub (Python)

The parts of serializing into/reading from DOM are not part of the bits that are converted to Python code as they are not needed by the server. This has been running and working in production environments for about half a year now.
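As a rough illustration of the idea (not the code from the files linked above), exporting only the data parts of a spec could look like the sketch below. It assumes plain objects for nodes and marks; with an actual schema.spec these are OrderedMaps, which would be walked with their forEach method instead. `stripDOMParts` and the sample spec are hypothetical:

```javascript
// Keys that only matter for reading from or writing to the DOM.
const DOM_KEYS = new Set(["toDOM", "parseDOM"]);

// Return a copy of each node/mark spec without DOM-related or function-valued fields.
function stripDOMParts(specs) {
  const result = {};
  for (const [name, spec] of Object.entries(specs)) {
    result[name] = Object.fromEntries(
      Object.entries(spec).filter(
        ([key, value]) => !DOM_KEYS.has(key) && typeof value !== "function"
      )
    );
  }
  return result;
}

// Hypothetical spec written with plain objects.
const spec = {
  nodes: {
    doc: { content: "block+" },
    paragraph: {
      content: "inline*",
      group: "block",
      parseDOM: [{ tag: "p" }],
      toDOM() { return ["p", 0]; }
    },
    text: { group: "inline" }
  },
  marks: {
    strong: {
      parseDOM: [{ tag: "strong" }],
      toDOM() { return ["strong", 0]; }
    }
  }
};

// What remains is plain data and can be stored or sent as JSON.
const exported = JSON.stringify({
  nodes: stripDOMParts(spec.nodes),
  marks: stripDOMParts(spec.marks)
});
console.log(exported);
```

The content expressions, groups and attributes are what step application depends on, so they are kept, and the object key order (which matters for nodes and marks) is preserved by JSON.stringify for ordinary string keys.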

We do access outside variables when serializing to DOM nodes, but those parts do not matter when applying steps.

Could you give an example of a case where this would not be working?

Once you pass that serialized schema to the client, deserializing it client-side will be inconsistent - this is where functions start to matter.

My idea was that the client would connect to the server and get the schema alongside the document.

Oh, I think you misunderstood. The point of the above code is to transfer the relevant parts of a spec/schema to the server so that it can apply steps. Part of doing this is to leave out the DOM-related methods, as these are not needed on the server and it’s not easy to transfer them into a format that can be read in Python as well.

You were trying to go the other way - turn those server-related parts of the schema/spec back into JavaScript code. Yes, you are absolutely right, that cannot work.