Node created by `ParseRule.getContent` is empty when received by `NodeView`

See the minimal (not-)working example repo here, which is a pruned-down version of my prosemirror-math project, which uses KaTeX to render math inside a NodeView. For the MWE, please open in Chrome, as a different issue prevents the paste behavior from working correctly in Firefox.

I’m working on allowing users to paste external math content into the editor. For instance, Wikipedia (depending on your math render preferences) displays math in image elements of the form

<img src="..."
     class="mwe-math-fallback-image-inline"
     alt="{\displaystyle \sigma (h)={\sqrt {\mathbb {E} {\big (}X(t+h)-X(t){\big )}^{2}}}}">

Code Overview

  • I’m using a ParseRule defined here to detect math in pasted content
  • When math is found, I define a ParseRule.getContent function to create a node of the appropriate math_display type.
  • The schema can be found here.
  • The math_display node type has a corresponding NodeView. Since the editing interface is rather complex (in the full version), I’m not using contentDOM, but instead managing the editing of this node manually.

Expected Behavior:

  • When pasting external HTML of the specified format, a new math_display block should appear, whose contents are a single Text node containing the alt=".." text of the image tag.

Actual Behavior

  • The alt attributes are properly detected, and their contents are printed to the console.
  • If I print the newly-created node before returning from getContent, its contents contain a TextNode with the desired string
  • However, the contents of the math_display node appear empty in the document. If I print the node when it is received by the NodeView constructor, its contents are empty.
  • Conclusion: The node’s contents are removed somewhere after getContent is called but before the node is passed to the NodeView constructor. Unfortunately I haven’t been able to determine where exactly this happens.

Questions

  • Is the Fragment I return from getContent constructed correctly?
  • If so, why might the contents of the node be deleted before being passed to the NodeView constructor?

Code Detail

Here is an abbreviated version of getContent, defined here:

getContent<S extends Schema<any, any>>(p: Node, schema: S): Fragment<S> {
	// ...
	let texString: string = "<DETECTED MATH STRING>"
	
	// create block math node
	let nodeType = schema.nodes["math_display"] as NodeType<S>;
	let textNode = schema.text(texString) as ProseNode<S>;
	let mathNode = nodeType.createAndFill( undefined, textNode );

	if(mathNode == null || mathNode == undefined){
		throw new Error("nodeType.createAndFill failed");
	}

	let frag = Fragment.from<S>(mathNode);

	// BUG: The Text node is present here, but disappears when added to the document.
	console.log("returning from getContent with the following fragment", frag);

	return frag;
}

Here is the schema:

export const mathSchemaSpec = createSchemaSpec({
	nodes: {
		// :: NodeSpec top-level document node
		doc: {
			content: "block+"
		},
		paragraph: {
			content: "inline*",
			group: "block",
			parseDOM: [{ tag: "p" }],
			toDOM() { return ["p", 0]; }
		},
		math_display: {
			group: "block math",
			content: "text*",
			atom: true,
			code: true,
			toDOM: () => ["math-display", { class: "math-node" }, 0],
			parseDOM: [
				{ tag: "math-display" },
				...defaultBlockMathParseRules
			]
		},
		text: {
			group: "inline"
		}
	}
});

Thanks in advance for any help you can provide!

Thanks for setting up a reproduction case, but this one seems a bit big—it should be possible to reproduce the problem you describe with just a schema, a DOMParser, and a call to the parser’s parse method, without any views, or even webpack/browser code. Simplifying the setup would be the first thing I’d do when diagnosing this, so I’m asking you to do that work for me instead.

Thanks for the reply, I just pushed a “minimaler” version to the same repo. It’s small enough that I was able to paste it below. What I learned:

  • I had assumed it was a NodeView issue, but I removed all the NodeView code and the issue persists, interesting!

  • Next, I thought it might be an issue with atom: true in the schema, but the issue persists even with atom: false.

With any luck this will be some sort of subtle configuration problem that I’ll feel silly about as soon as it’s pointed out to me :). Thanks for your help!

index.js

function require(name) {
  let id = /^prosemirror-(.*)/.exec(name), mod = id && PM[id[1].replace(/-/g, "_")]
  if (!mod) throw new Error(`Library ${name} isn't loaded`)
  return mod
}

// ProseMirror imports
const { DOMParser, Fragment, Schema } = require("prosemirror-model");
const { EditorView } = require("prosemirror-view");
const { EditorState } = require("prosemirror-state");
const { baseKeymap } = require("prosemirror-commands");
const { keymap } = require("prosemirror-keymap");

////////////////////////////////////////////////////////////////////////////////

window.onload = function(){
	// get editor element
	let editorElt = document.getElementById("editor");
	if(!editorElt){ throw Error("missing #editor element"); }

	// example document
	let dl = document.createElement("dl");
	dl.textContent = "example text";
	
	let example = document.createElement("div");
	example.appendChild(dl);

	// create parser from schema
	let domParser = DOMParser.fromSchema(editorSchema)
	window.domParser = domParser;

	// create ProseMirror state
	let state = EditorState.create({
		schema: editorSchema,
		doc: domParser.parse(example),
		plugins: [ keymap(baseKeymap) ]
	})

	// create ProseMirror view
	let view = new EditorView(editorElt, {
		state,
		// nodeViews: { math_display(node){ return new MathNodeView(node); } } 
	});
	window.view = view;

	// determine if parse was successful
	console.log("the parsed document is:", view.state.doc);
	console.log("the first node has type:", view.state.doc.content.content[0].type.name);

	let firstNodeContent = view.state.doc.content.content[0].content.content;
	console.log("the first node's content is:", firstNodeContent);
	if(firstNodeContent.length === 0){ console.error("the first node has no content!"); }
}

////////////////////////////////////////////////////////////////////////////////

const parseRule = { 
	tag: "dl",
	getContent(p, schema) {
		console.log("parseRule :: getContent");

		// create block math node
		let nodeType = schema.nodes["math_display"];
		let textNode = schema.text(p.textContent || "empty content");
		let mathNode = nodeType.createAndFill( undefined, textNode );

		if(mathNode == null || mathNode == undefined){
			throw new Error("nodeType.createAndFill failed");
		}

		let frag = Fragment.from(mathNode);

		// BUG: The Text node is present here, but disappears when added to the document.
		console.log("returning from getContent with the following fragment", frag);

		return frag;
	}
};

////////////////////////////////////////////////////////////////////////////////

// bare minimum ProseMirror schema for working with math nodes
const editorSchema = new Schema({
	nodes: {
		// :: NodeSpec top-level document node
		doc: {
			content: "block+"
		},
		paragraph: {
			content: "inline*",
			group: "block",
			parseDOM: [{ tag: "p" }],
			toDOM() { return ["p", 0]; }
		},
		math_display: {
			group: "block math",
			content: "text*",
			atom: true, /* this can be false or true and the error still occurs */
			code: true,
			toDOM: () => ["math-display", { class: "math-node" }, 0],
			parseDOM: [
				parseRule
			]
		},
		text: {
			group: "inline"
		}
	}
});

index.html

<!doctype html>

<html>
	<head>
		<title>ProseMirror Math</title>
		<meta charset="utf-8">
		<style>
		math-display {
			background-color: yellow;
		}
		</style>
		<link rel=stylesheet href="https://prosemirror.net/css/editor.css">
		<script src="https://prosemirror.net/examples/prosemirror.js"></script>
		<script src="require-pm.js"></script>
		<script src="index.js" defer></script>
	</head>
	<body>
		<div class="content"><div class="center">
			<div id="editor" spellcheck="false"></div>
		</div></div>
	</body>
</html>

It seems that the problem is that you’re returning a math_display node from getContent, but that is not a valid child of a math_display node. You’ll want to return a fragment containing just the text.

That was it: changing it to let frag = Fragment.from(textNode); fixed the issue. For some reason I thought I needed to return the entire node from getContent, but actually I only need to return the node’s children.

Thanks for your help!