Discussion: What are marks

Summary: Marks don’t generalize. Is this a problem, and if so, is there a solution?

What are ProseMirror marks?

ProseMirror’s document model is somewhat unorthodox in that it mixes flat and tree-shaped content: stretches of inline content (text plus the elements that appear within it) are treated as flat, whereas the ‘block’ structure of the document is represented as a tree. Other systems represent everything as a tree (XML, strictly HTML-based editors, etc.), and still others treat the document as one long string of text with extra metadata overlaid on it.

I feel that inline content is conceptually flat, and trying to make it tree-shaped makes working with it very awkward, whereas block structure is conceptually a tree, so trying to make that flat is just as awkward. Several early everything-is-a-tree implementations of ProseMirror had a lot of really painful code for handling inline content, which kind of melted away into much much nicer code when I moved to a flat inline representation.

So I still think that this is a good idea. The way people think about a sentence with some strong and emphasized text in it is not as a tree of wrapper nodes, but as a sequence, parts of which are annotated with extra properties.

So marks are these extra properties projected onto a sequence of inline content. In a data structure like ProseMirror’s documents, I can think of two sane ways to represent them. One is to add them to the leaf nodes, like we currently do. The other is to store them in the parent textblock node, with offsets pointing into the content. I opted for the first because it’s much easier to keep consistent—you can perform operations on the content of a node without taking the parent into account, whereas if the parent has separate data pointing into its content, you always have to transform that along with the content. Still, the other approach might have worked too.
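To make the trade-off concrete, here is a rough sketch of the two representations in plain JavaScript. These are not ProseMirror's actual data structures; the object shapes and the `deleteFromStartA` / `deleteFromStartB` helpers are made up for illustration. The point is only that an edit in the first form touches the children alone, while the second form also has to shift the ranges held by the parent.

```javascript
// Representation A: marks stored on the leaf nodes themselves.
const leafMarked = {
  type: "paragraph",
  content: [
    { text: "This is ", marks: [] },
    { text: "strong", marks: ["strong"] },
    { text: " text", marks: [] },
  ],
};

// Representation B: marks stored on the parent as offset ranges.
const parentMarked = {
  type: "paragraph",
  text: "This is strong text",
  marks: [{ type: "strong", from: 8, to: 14 }],
};

// In A, deleting the first `count` characters only touches the children.
function deleteFromStartA(node, count) {
  const content = [];
  let remaining = count;
  for (const leaf of node.content) {
    const cut = Math.min(remaining, leaf.text.length);
    remaining -= cut;
    const text = leaf.text.slice(cut);
    if (text.length > 0) content.push({ ...leaf, text });
  }
  return { ...node, content };
}

// In B, the same edit must also shift (and clip) every range on the parent.
function deleteFromStartB(node, count) {
  const marks = node.marks
    .map((m) => ({
      ...m,
      from: Math.max(0, m.from - count),
      to: Math.max(0, m.to - count),
    }))
    .filter((m) => m.to > m.from); // drop ranges that were deleted entirely
  return { ...node, text: node.text.slice(count), marks };
}
```

Forgetting the range adjustment in the second form silently leaves the mark pointing at the wrong characters, which is the consistency problem described above.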

For doing what they were designed to do, modeling strong text and links inside of a textblock, I think marks are working out well the way they are.

Cross-node marking

Given that marks act as a way to add information to a piece of a document, it is natural to consider them as a way to do that for pieces of document that aren’t constrained to the inside of a textblock. That currently does not work at all.

Marks are stored on nodes, and when you try to apply them to a document range that’s not a flat sequence of leaf nodes, it becomes unclear where they should be stored. Only on the leaves? If so, an empty paragraph has no way to indicate that it’s part of the marked range. Or on all nodes, parents and children, that fall entirely in the range? This is a pain to maintain coherently, and currently not respected by the various document-manipulation functions that ProseMirror provides (which I guess could be improved, but feels like it’d be error-prone and messy).

Note that this already comes up in the ‘regular’ inline use of marks, if you have inline nodes with content—do the marks on such a node apply to the content? What does it mean for a parent and child node to have the same mark?

The HTML substrate

ProseMirror, though it is an attempt to break free of HTML-based editing, has a very intimate relation with the HTML DOM, because that’s what its editable surface looks like. It absolutely needs to be able to serialize documents to clean HTML, and to parse DOM content (edited or pasted) into its own format. As such, it is important for the concepts in the document model to map rather directly to HTML constructs.

For marks on flat content, it is relatively straightforward to map the marks to wrapping inline nodes, and to parse such nodes by applying marks to the content inside of them. But if marks were able to span across arbitrary ranges of the document tree, it becomes rather non-obvious what the HTML would look like. A tree structure doesn’t naturally lend itself to expressing ranges that don’t follow the tree shape.
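As an illustration of why the flat case is easy, the following sketch turns a flat sequence of marked leaves into nested inline HTML. It is not ProseMirror's serializer: the `TAGS` table, `escapeText`, and `serializeInline` are hypothetical names, and the sketch assumes each leaf lists its marks in one consistent order.

```javascript
// Hypothetical mark-to-tag table for this sketch.
const TAGS = { strong: "strong", em: "em" };

function escapeText(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Walk the flat sequence, keeping a stack of currently open wrapper tags.
function serializeInline(leaves) {
  let html = "";
  let open = []; // marks whose tags are open, outermost first
  for (const leaf of leaves) {
    // Keep the longest prefix of open marks that this leaf shares.
    let keep = 0;
    while (
      keep < open.length &&
      keep < leaf.marks.length &&
      open[keep] === leaf.marks[keep]
    ) {
      keep++;
    }
    // Close everything past that prefix, innermost first.
    for (let i = open.length - 1; i >= keep; i--) html += `</${TAGS[open[i]]}>`;
    open = open.slice(0, keep);
    // Open the marks this leaf adds.
    for (const mark of leaf.marks.slice(keep)) {
      html += `<${TAGS[mark]}>`;
      open.push(mark);
    }
    html += escapeText(leaf.text);
  }
  // Close whatever is still open at the end of the textblock.
  for (let i = open.length - 1; i >= 0; i--) html += `</${TAGS[open[i]]}>`;
  return html;
}
```

This only works because every marked range lives inside a single parent. A range that started in one paragraph and ended in another would have no single place to put its wrapper element.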

You could certainly come up with some schema based on custom attributes using some kind of counting system to refer to offsets inside of the nodes (for example the parent node + offset system used by DOM ranges and selections), but that gets very awkward—do you use marker nodes at the start and end of the range? Or wrapping nodes? If wrapping nodes, the question becomes again at which level(s) they should appear. This should also work well with copying and pasting fragments of the DOM structure, and not get in the way of incremental display updates. I haven’t been able to come up with a good approach. This kind of data is simply not something that HTML authors tend to need to express.

Use cases

I’ve seen two use cases where the limited applicability of marks is causing problems:

  • Trying to make range-shaped information (such as comments added to a piece of the document) part of the document, so that clipboard and undo-history information works on it.

  • Adding some kind of baseline information (say font family/size) to all content, like in classical WYSIWYG editors. (I don’t care a huge lot about this use case, since I think it’s generally a bad idea for an editor to work that way, but it’d be nice if it could be cleanly encoded.)

Should anything change?

Currently, every node has a marks field, because I figured that filling that field with a shared empty array is cheap, and JS engines like coherently-shaped objects. But as the system currently is, putting marks on block nodes doesn’t really work—there are too many cases where it’s not clear how they should be handled, so the library code ends up not handling them in any coherent way at all. This is not a good situation, so I’d like to get to a point where there either is a coherent story, or these are simply forbidden.

(I do know one user is currently using marked blocks as a kludge to implement the first use case above, but it’s not working all that well.)

I’m basically opening this thread as a call for feedback and ideas with regards to marks. I can imagine the following options:

  • Forbid marks on block nodes and declare them to always be constrained to a stretch of inline content.

  • Find some coherent way to model marks as arbitrary ranges, and make the whole library consistently apply that.

  • Reinvent marks in some wonderfully creative way that I haven’t thought of yet.

(Note that ‘make the whole document tree-shaped’ is not an option. It wouldn’t help with cross-node ranges, and though I know you can build an editor with such a content model, ProseMirror is too far along to make such a radical change, and as mentioned, I found the conceptual mismatch in such a model a real problem.)


Ah, and one thing I forgot to mention in the initial post – content specifications for marks are currently kind of poorly designed. The type<marks> syntax suggests that marks are a per-child-node thing, but conceptually, they are more of a per-parent-node thing, in that the parent determines what kind of mark ranges are allowed in its content. I should probably change the notation so that you can only specify the set of marks allowed for a node’s content once.

I am fine with option one (Forbid marks on block nodes).

I’m also fine with forbidding marks on block nodes.

Regarding the use case of adding baseline information to all content of a block node: Wouldn’t it be possible to simply set an attribute on the block node to encode something like font family/size? I think this would fit the HTML model quite well where you usually do the same thing by setting a class for baseline styling on a parent node. Whenever I think in terms of block nodes I find the attribute model to be more intuitive compared to the marks.
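A minimal sketch of that attribute-based idea, written as a plain node spec object in the shape ProseMirror schemas use (`attrs` plus a `toDOM` function returning an output-spec array). The `fontFamily` attribute name is an assumption made for illustration, not something from an existing schema.

```javascript
// Hypothetical paragraph spec that carries baseline styling as an attribute.
const paragraphSpec = {
  group: "block",
  content: "inline*",
  attrs: { fontFamily: { default: null } },
  // Render the attribute onto the block element; the trailing 0 marks the
  // place where the node's content goes.
  toDOM(node) {
    const domAttrs = node.attrs.fontFamily
      ? { style: `font-family: ${node.attrs.fontFamily}` }
      : {};
    return ["p", domAttrs, 0];
  },
};
```

Calling `paragraphSpec.toDOM({ attrs: { fontFamily: "serif" } })` describes a `<p>` element with an inline `font-family` style, while a node without the attribute renders as a bare `<p>`.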

Also fine with forbidding it. I’d love to hear more about the first use case (comments). Another possible use case is suggesting edits. I haven’t implemented these features yet, but my guess is that I’ll store some marked ranges plus other metadata alongside my document, and then keep the marked ranges up to date on every change/save.

My team (at the New York Times) is working on a project, in consultation with Marijn, to implement the comments-as-marks approach described above.

Though it’s awkward at points, it’s not unworkable — we came up with a deterministic ruleset for deciding which nodes receive which kind of mark. It’s something like:

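// Nodes that opt out of comments are left untouched.
if (!isCommentable(node)) {
        return;
}
// Text gets an inline mark; non-textblock nodes carry the mark themselves.
if (node.isText) {
        tr.addMark(from, to, commentMark);
} else if (!node.isTextblock) {
        tr.setNodeType(from, node.type, node.attrs, commentMark.addToSet(node.marks));
}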

…where isCommentable allows some block-level nodes with block-level children (which themselves are textblocks) to opt out of block-level comment marks.

Needless to say, we think that ProseMirror would stand to lose a potentially interesting set of behaviors if it did away with block-level marks entirely :slight_smile: . As for alternatives, I’m curious to hear more about a couple of other directions:

“The other is to store them in the parent textblock node, with offsets pointing into the content”

You mention this, noting that it might be harder to keep consistent; I wonder what this might practically look like, and whether or not encoding mark information on the parent would resolve some of the ambiguity around “where they should be stored”?

“content specifications for marks are currently kind of poorly designed”

I’d also wonder if a redesign of the content expression’s annotation for a mark might be able to do some of the heavy lifting for resolving ambiguity around which kinds of nodes get which kinds of marks?

I might be missing something, but taking our comments example, might a future version of mark content expressions allow finer-grained notation of “allow marks on block-node here”, “forbid marks on block-node here”, “allow marks on block node’s text children here”, etc? I’ll admit that I don’t have a specific vision for how this might look, but it seems like another possible alternative?

Hi Jeff, glad to get your feedback here.

I guess that it would mean that parent nodes, when being updated, would need information about the changes to their child nodes (and grandchild nodes, etc), to know how to adjust their marks. Right now, changing a node is a matter of creating a new set of children and then a copy of the node with those children. This’d no longer work, which I expect would make a lot of code more awkward.

(That’s a nice thing about out-of-document metadata – it can look at the steps and position mappings after the fact, and stay consistent without introducing complexity into the actual document-modification code.)

I was thinking more along the lines of simplifying it to ‘inside this node, these marks may occur’.

For your use case, would it work to replace block-level marks with an attribute that appears only on commentable block nodes (which I think are only leaf block nodes) and contains an array of comment ids?

I’m not sure that replacing block-level marks with comment attributes would work well for our use case:

  • it would push the onus of “how to render a given node’s block-level mark styles” onto individual nodes, whereas currently that’s wholly owned by the marks themselves
  • it would add complexity to the code we use to derive/persist nodes from our at-rest data format (right now, we have some serialization logic that’s able to ignore marks it doesn’t recognize, precisely because the marks array is well-formed across all nodes as you’ve noted)
  • I also don’t like the complexity of maintaining two different representations of comment ranges: one for text ranges (which, presumably, would remain marks in this model) and one for block-level ranges (which would become bespoke attributes). Though it may not seem so from a data modeling standpoint, right now it feels pretty straightforward to make changes to comment marks because their MarkSpec “owns” all of their styles

I guess I’m still curious to tease out a model where an updated content expression clarifies the fundamental ambiguity around inline-vs-block marks: if in the new model, the specification determines ‘inside this node, these marks may occur’, couldn’t it instead/also specify (inside this node | on this node) these marks may occur?

I’d also be curious to hear a little further what this could look like — I appreciate from your sketch here that it’s complex, but I’m not sure that I have a firm grasp on why an approach like this wouldn’t ultimately be workable?

My point is I don’t know what it’d look like – I can’t come up with an approach that isn’t extremely awkward. If you have suggestions, feel free to sketch them out.

The recent changes to the content expression system move mark specifications into a per-parent rather than per-child position. I.e. you no longer annotate children in the parent’s content expression, but provide a separate property on the node spec, marks, which lists the set of marks that are allowed for any children.
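For readers who want to see the shape of this, here is a small sketch of node specs using a per-parent `marks` property. The values follow the convention documented for prosemirror-model's node specs (`"_"` for all marks, `""` for none, otherwise space-separated mark names), though the exact behavior in the unreleased version discussed here may differ. The schema itself and the `allowedMarks` helper are hypothetical, and the helper ignores mark groups and the library's defaults.

```javascript
// Plain spec objects; nothing here calls into the library.
const nodes = {
  doc: { content: "block+" },
  paragraph: { group: "block", content: "inline*", marks: "_" },
  heading: { group: "block", content: "inline*", marks: "em" },
  code_block: { group: "block", content: "text*", marks: "" },
  text: { group: "inline" },
};

const marks = {
  strong: {},
  em: {},
  comment: { attrs: { id: {} } },
};

// Hypothetical helper: which mark names may appear inside a given node?
function allowedMarks(nodeName) {
  const spec = nodes[nodeName].marks;
  if (spec === undefined) return null; // library default applies; not modelled
  if (spec === "_") return Object.keys(marks); // every mark in the schema
  return spec.split(" ").filter(Boolean); // "" yields an empty list
}
```

The set of allowed marks is stated once per parent, rather than being repeated on each child in the content expression.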

Have you started updating the documentation to reflect these changes?


Scratch that, I just realized that those changes haven’t been published yet. I’m not really familiar with the release process.

The docs on the website tend to get updated on releases. The doc comments that they are generated from have been updated by the commits I linked.

We have used the comments-as-marks approach the entire time and created serializers (docx, odt, latex, etc.) that simply ignore these. This has worked well for inline content. I was not aware one could add marks to block level nodes as well, but if that is the case, and it’s working for the New York Times, why not just keep this functionality in some way? Or maybe it could be postponed to after version 1.0?

For the near future, the library’s stance on marks is that you probably only want to use them on inline nodes, and that’s what most of the mark-related library code is written for. Putting them on block nodes is allowed, though. You won’t be able to use utilities like addMark and removeMark directly to manage such marks, but the lower-level code should handle them properly. It may not always do so yet, since this isn’t well-tested territory, so file bugs when you run into problems.

I do reserve the option to change my mind on this in the future, but for now it seems like allowing block marks isn’t really in the way, and can be useful to some people.


It’s interesting to hear your experience with comments-as-marks! I’m currently doing something similar but using the alternative approach of tracking ranges (from, to) in the document, and storing the set adjacent to the content. When changes are made to the document, I update the ranges by mapping them through the step maps. (I use decorations to apply styling)
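A simplified sketch of that range-tracking approach, in plain JavaScript. It is a stand-in for the library's position mapping rather than the real thing: the `change` object shape and the `mapPos` / `mapRange` helpers are assumptions for illustration, and edge cases are handled more crudely than in ProseMirror's own step maps.

```javascript
// A change replaces positions [from, to) with `insertedSize` new positions.
// `assoc` decides which side a position sticks to when content is inserted
// exactly at it, or when it falls inside a replaced range.
function mapPos(pos, change, assoc) {
  const { from, to, insertedSize } = change;
  if (pos < from) return pos; // before the change: unaffected
  if (pos > to) return pos - (to - from) + insertedSize; // after: shifted
  return assoc < 0 ? from : from + insertedSize; // at or inside the change
}

// Map a stored comment range; returns null when its content was deleted.
function mapRange(range, change) {
  const from = mapPos(range.from, change, 1); // start sticks to the right
  const to = mapPos(range.to, change, -1); // end sticks to the left
  return from < to ? { ...range, from, to } : null;
}

// Update the whole out-of-document set after a change.
function mapRanges(ranges, change) {
  return ranges.map((r) => mapRange(r, change)).filter((r) => r !== null);
}
```

Once a range has been dropped because its content was deleted, mapping forward through a later undo does not bring it back on its own, so restoring it needs separate bookkeeping.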

However this approach has the pitfall of not “just working” with regard to undo.

So I’m faced with two options — switch to using marks, or implement support for “undo” of ranges. I’ll probably go the route of trying to support “undo” for ranges.

This is definitely the more straightforward approach, but as you found, it’s tricky to integrate with undo, and possibly even more tricky to integrate with copy/paste.