Good morning.
We found ourselves looking for a way to store data on PM nodes that is not part of their attributes.
Nodes, as immutable value objects, hold their information in attrs; however, as our nodes and node views became more complex, we wanted to persist more information.
We have developed a well-defined JSON document structure, which is used to validate nodes, marks and their respective attrs. There are cases when we have run-time information that we don’t want to serialize, but that we need to store with the node instance.
Example use case
For example, imagine an image node spec:
image: NodeSpec = {
  group: 'inline',
  inline: true,
  attrs: {
    src: { default: '' },
    alt: { default: null },
    title: { default: null }
  },
  draggable: true,
  parseDOM: [{
    tag: 'img[src]',
    getAttrs(dom: HTMLElement) {
      return {
        src: dom.getAttribute('src'),
        alt: dom.getAttribute('alt'),
        title: dom.getAttribute('title')
      };
    }
  }],
  toDOM(node: Node) { return ['img', node.attrs]; }
};
Let’s say we want to add source file information: the file name and the file size. We could store them as properties on node and extend toDOM() this way:
image: NodeSpec = {
  // ...
  toDOM(node: Node) {
    return [
      'img',
      {
        ...node.attrs,
        'data-file-size': node.fileSize,
        'data-file-name': node.fileName
      }
    ];
  }
};
That takes care of encoding to DOM, but the opposite operation is not as trivial. Because parseDOM does not allow invokables, we must return a ParseRule structure, which is then interpreted internally by DOMParser.
Solution 1 - Allow invokables for parseDOM, making parsing more flexible
We already use Node Views to their full extent, and also use toDOM(), which is allowed to return the final DOM element. We’ve noticed, however, that the same is not allowed for parseDOM:
parseDOM: ?[ParseRule] Associates DOM parser information with this node, which can be used by DOMParser.fromSchema to automatically derive a parser. The node field in the rules is implied (the name of this node will be filled in automatically). If you supply your own parser, you do not need to also specify parsing rules in your schema.
It means that the only way to implement custom parsing for a particular node/mark type is to subclass the whole DOMParser (and ParseContext, NodeContext, MarkContext to go with them).
Proposed architecture
I believe it’d be beneficial to allow parseDOM to be a function with the following signature:
type InvokableParseRule = (el: dom.Node, context: ParseContext) => Node | Mark | undefined;

interface NodeSpec {
  // ...
  // Either the existing array of ParseRule objects, or a single invokable.
  parseDOM?: ParseRule[] | InvokableParseRule;
}
Here’s an example implementation:
image: NodeSpec = {
  parseDOM: (dom, parseContext) => {
    // short-circuit if it's not parseable here
    if (!dom.matches('img')) {
      return;
    }
    // let's construct the node as it would be normally by DOMParser
    const nodeType = parseContext.parser.schema.nodes.image;
    const node = nodeType.create({
      src: dom.getAttribute('src'),
      title: dom.getAttribute('title'),
      alt: dom.getAttribute('alt') || ''
    });
    // handle aux properties that are not attrs
    if (dom.hasAttribute('data-file-name')) {
      node.fileName = dom.getAttribute('data-file-name');
    }
    if (dom.hasAttribute('data-file-size')) {
      node.fileSize = dom.getAttribute('data-file-size');
    }
    return node;
  },
};
I’ve created a prototype DOMParser subclass that allows invokable parseDOM: PM DOMParser allowing invokables for parseDOM · GitHub
This solution:
- Is backwards compatible (BC).
- If implemented correctly by the consumer, will be as performant as, or faster than, a normal ParseRule (which internally also calls Element.matches()).
- Will implicitly be used for copy-paste.
- Doesn’t require extra wiring; node-specific data stays with the node instance.
- Requires modifying or sub-classing DOMParser.
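
To make the dispatch idea behind Solution 1 concrete, here is a minimal, self-contained sketch of a parser loop that accepts both declarative rules and invokable rules. Every name in it (ElementLike, SketchNode, DeclarativeRule, InvokableRule, parseElement) is a hypothetical stand-in invented for this illustration; none of it is ProseMirror's actual API, and the linked prototype may well be structured differently.

```typescript
// Stand-in types for illustration only; NOT ProseMirror's real classes.
interface ElementLike {
  tagName: string;
  attributes: Record<string, string>;
}

interface SketchNode {
  type: string;
  attrs: Record<string, string | null>;
  // Run-time-only properties that are never serialized as attrs.
  fileName?: string;
  fileSize?: string;
}

// Declarative rule, loosely modelled on a ParseRule: a tag match plus getAttrs.
interface DeclarativeRule {
  tag: string;
  node: string;
  getAttrs(el: ElementLike): Record<string, string | null>;
}

// Invokable rule: returns a finished node, or undefined to decline the element.
type InvokableRule = (el: ElementLike) => SketchNode | undefined;

type Rule = DeclarativeRule | InvokableRule;

// Try each rule in order; the first one that produces a node wins.
function parseElement(el: ElementLike, rules: Rule[]): SketchNode | undefined {
  for (const rule of rules) {
    if (typeof rule === "function") {
      const node = rule(el);
      if (node !== undefined) return node;
    } else if (el.tagName.toLowerCase() === rule.tag) {
      return { type: rule.node, attrs: rule.getAttrs(el) };
    }
  }
  return undefined;
}

// Invokable rule for images: builds the node and attaches the aux properties.
const imageRule: InvokableRule = (el) => {
  if (el.tagName.toLowerCase() !== "img") return undefined;
  const node: SketchNode = {
    type: "image",
    attrs: {
      src: el.attributes["src"] ?? "",
      alt: el.attributes["alt"] ?? null,
      title: el.attributes["title"] ?? null,
    },
  };
  const fileName = el.attributes["data-file-name"];
  if (fileName !== undefined) node.fileName = fileName;
  const fileSize = el.attributes["data-file-size"];
  if (fileSize !== undefined) node.fileSize = fileSize;
  return node;
};

// An ordinary declarative rule keeps working next to the invokable one.
const paragraphRule: DeclarativeRule = {
  tag: "p",
  node: "paragraph",
  getAttrs: () => ({}),
};

const rules: Rule[] = [imageRule, paragraphRule];

const parsedImage = parseElement(
  {
    tagName: "IMG",
    attributes: { src: "cat.png", "data-file-name": "cat.png", "data-file-size": "2048" },
  },
  rules,
);
const parsedParagraph = parseElement({ tagName: "P", attributes: {} }, rules);
```

Because the invokable rule returns the finished node, the aux properties travel with it and no second pass is needed; existing declarative rules are untouched, which is what makes the approach backwards compatible.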
Solution 2 - parseDOM.getAttrs(), a map, handlePaste and scanning.
Another workaround to the problem that we’ve found is as follows:
- For parsing, have the NodeSpec.parseDOM[].getAttrs() method read the extra attributes:
  - For private properties, read and store them in a shared Map.
  - For everything else, return them as attrs.
- For pasting, implement EditorProps.handlePaste() on each Editor that uses this node:
  - Go through the Slice that has been created from clipboard data.
  - Find image nodes.
  - Use the aforementioned Map to re-apply private properties on node instances.
- For modifying the document:
  - Every modification which might involve inserting image nodes will require another scan.
  - When constructing the document, we need to run another pass to re-apply private properties.
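
The steps above can be sketched roughly as follows. This is a simplified model under stated assumptions, not ProseMirror code: DocNode, PrivateProps, privatePropsBySrc, getImageAttrs and reapplyPrivateProps are hypothetical names, a plain object tree stands in for a Slice, and keying the shared Map by the image src is an assumption made only for this illustration (the key is not specified above).

```typescript
// Stand-in node shape for illustration; NOT ProseMirror's Node class.
interface DocNode {
  type: string;
  attrs: Record<string, string | null>;
  children: DocNode[];
  // Private, run-time-only properties.
  fileName?: string;
  fileSize?: string;
}

interface PrivateProps {
  fileName?: string;
  fileSize?: string;
}

// The shared Map. Assumption: the image src is unique enough to serve as key.
const privatePropsBySrc = new Map<string, PrivateProps>();

// Plays the role of getAttrs(): returns the real attrs and stashes the
// private properties in the shared Map as a side effect.
function getImageAttrs(domAttrs: Record<string, string>): Record<string, string | null> {
  const src = domAttrs["src"] ?? "";
  const props: PrivateProps = {};
  const fileName = domAttrs["data-file-name"];
  if (fileName !== undefined) props.fileName = fileName;
  const fileSize = domAttrs["data-file-size"];
  if (fileSize !== undefined) props.fileSize = fileSize;
  privatePropsBySrc.set(src, props);
  return {
    src,
    alt: domAttrs["alt"] ?? null,
    title: domAttrs["title"] ?? null,
  };
}

// Plays the role of the scan done in handlePaste() and after document
// construction: walk the tree, find image nodes, re-apply private properties.
// Returns how many image nodes were updated.
function reapplyPrivateProps(node: DocNode): number {
  let updated = 0;
  if (node.type === "image") {
    const props = privatePropsBySrc.get(node.attrs["src"] ?? "");
    if (props !== undefined) {
      Object.assign(node, props);
      updated += 1;
    }
  }
  for (const child of node.children) {
    updated += reapplyPrivateProps(child);
  }
  return updated;
}

// Simulated paste: parsing yields a tree whose image node carries attrs only.
const pastedDoc: DocNode = {
  type: "doc",
  attrs: {},
  children: [
    {
      type: "paragraph",
      attrs: {},
      children: [
        {
          type: "image",
          attrs: getImageAttrs({
            src: "cat.png",
            "data-file-name": "cat.png",
            "data-file-size": "2048",
          }),
          children: [],
        },
      ],
    },
  ],
};

// The extra pass that every paste or document construction has to run.
const restoredCount = reapplyPrivateProps(pastedDoc);
```

Even in this toy form, the private data takes a detour through module-level state, and every code path that creates image nodes has to remember to call the scan.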
This solution:
- Doesn’t require sub-classing DOMParser.
- Is backwards compatible.
- Breaks separation of concerns (SoC), because handlePaste() is on EditorProps, while NodeSpec is part of the schema, which itself (and any of its nodes) could be shared across different editors.
- Is hard to test, because we can’t test the DOMParser directly (as we do for all other nodes).
- Is less performant: requires scanning the Slice for nodes on each paste.
- Requires injecting/sharing the Map in various places.
- Is prone to paste-related bugs.
- Might be hard to maintain, because there are multiple places where we mutate nodes.