Summary: Marks don't generalize. Is this a problem, and if so, is there a solution?
What are ProseMirror marks?
ProseMirror's document model is somewhat unorthodox in that it mixes flat and tree-shaped content: stretches of inline content (text plus the elements that appear within it) are treated as flat, whereas the 'block' structure of the document is represented as a tree. Other systems represent everything as a tree (XML, strictly HTML-based editors, etc.), and still others treat the document as one long string of text with extra metadata overlaid on it.
I feel that inline content is conceptually flat, and trying to make it tree-shaped makes working with it very awkward, whereas block structure is conceptually a tree, so trying to make that flat is just as awkward. Several early everything-is-a-tree implementations of ProseMirror had a lot of really painful code for handling inline content, which kind of melted away into much much nicer code when I moved to a flat inline representation.
So I still think that this is a good idea. The way people think about a sentence with some strong and emphasized text in it is not as a tree of wrapper nodes, but as a sequence, parts of which are annotated with extra properties.
So marks are these extra properties projected onto a sequence of inline content. In a data structure like ProseMirror's documents, I can think of two sane ways to represent them. One is to add them to the leaf nodes, like we currently do. The other is to store them in the parent textblock node, with offsets pointing into the content. I opted for the first because it's much easier to keep consistent—you can perform operations on the content of a node without taking the parent into account, whereas if the parent has separate data pointing into its content, you always have to transform that along with the content. Still, the other approach might have worked too.
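To make the trade-off between the two representations concrete, here is a minimal sketch. All the type and function names are hypothetical and much simpler than ProseMirror's real data structures; the point is only that an edit to leaf-stored marks touches nothing but the content, while parent-stored ranges have to be remapped on every edit.

```typescript
// Hypothetical, simplified types. These are NOT ProseMirror's actual classes.
type Mark = { type: string };

// Option 1: marks live on the leaf nodes themselves.
type Leaf = { text: string; marks: Mark[] };
type LeafMarkedBlock = { content: Leaf[] };

// Option 2: marks live on the parent, as offset ranges into its text.
type MarkRange = { from: number; to: number; mark: Mark };
type RangeMarkedBlock = { text: string; ranges: MarkRange[] };

// With option 1, inserting text only changes the affected leaf.
function insertIntoLeaves(
  block: LeafMarkedBlock,
  leafIndex: number,
  offset: number,
  text: string
): LeafMarkedBlock {
  const content = block.content.map((leaf, i) =>
    i === leafIndex
      ? { ...leaf, text: leaf.text.slice(0, offset) + text + leaf.text.slice(offset) }
      : leaf
  );
  return { content };
}

// With option 2, every stored range must be remapped along with the content.
// Shifting positions at or after the insertion point is an arbitrary choice
// made for this sketch; a real system would have to pick a bias deliberately.
function insertIntoRanges(block: RangeMarkedBlock, pos: number, text: string): RangeMarkedBlock {
  const shift = (p: number): number => (p >= pos ? p + text.length : p);
  return {
    text: block.text.slice(0, pos) + text + block.text.slice(pos),
    ranges: block.ranges.map(r => ({ ...r, from: shift(r.from), to: shift(r.to) })),
  };
}
```

Forgetting the remapping step in the second variant silently leaves the ranges pointing at the wrong text, which is the consistency hazard described above.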
For doing what they were designed to do, modeling strong text and links inside of a textblock, I think marks are working out well the way they are.
Given that marks act as a way to add information to a piece of a document, it is natural to consider them as a way to do that for pieces of document that aren't constrained to the inside of a textblock. That currently does not work at all.
Marks are stored on nodes, and when you try to apply them to a document range that's not a flat sequence of leaf nodes, it becomes unclear where they should be stored. Only on the leaves? If so, an empty paragraph has no way to indicate that it's part of the marked range. Or on all nodes, parents and children, that fall entirely in the range? This is a pain to maintain coherently, and currently not respected by the various document-manipulation functions that ProseMirror provides (which I guess could be improved, but feels like it'd be error-prone and messy).
Note that this already comes up in the 'regular' inline use of marks, if you have inline nodes with content—do the marks on such a node apply to the content? What does it mean for a parent and child node to have the same mark?
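The "only on the leaves" option and its empty-paragraph problem can be shown with a small sketch. The node shapes and the helper names below are hypothetical and are not taken from ProseMirror; they only model a tree of blocks with text leaves.

```typescript
// Hypothetical, simplified document tree. Not ProseMirror's actual node classes.
type Mark = { type: string };
type TextNode = { kind: "text"; text: string; marks: Mark[] };
type BlockNode = { kind: "block"; name: string; content: DocNode[] };
type DocNode = TextNode | BlockNode;

// Apply a mark to every leaf below `node`, leaving block nodes unmarked.
function markLeaves(node: DocNode, mark: Mark): DocNode {
  if (node.kind === "text") return { ...node, marks: [...node.marks, mark] };
  return { ...node, content: node.content.map(child => markLeaves(child, mark)) };
}

// Does any leaf at or below `node` carry a mark of the given type?
function carriesMark(node: DocNode, type: string): boolean {
  if (node.kind === "text") return node.marks.some(m => m.type === type);
  return node.content.some(child => carriesMark(child, type));
}

// Three paragraphs, the middle one empty.
const exampleDoc: BlockNode = {
  kind: "block",
  name: "doc",
  content: [
    { kind: "block", name: "paragraph", content: [{ kind: "text", text: "one", marks: [] }] },
    { kind: "block", name: "paragraph", content: [] },
    { kind: "block", name: "paragraph", content: [{ kind: "text", text: "three", marks: [] }] },
  ],
};

const markedDoc = markLeaves(exampleDoc, { type: "comment" }) as BlockNode;
// The empty paragraph has no leaf to hold the mark, so nothing records
// that it was inside the marked range.
const perParagraph = markedDoc.content.map(p => carriesMark(p, "comment"));
// perParagraph is [true, false, true]
```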
The HTML substrate
ProseMirror, though it is an attempt to break free of HTML-based editing, has a very intimate relation with the HTML DOM, because that's what its editable surface looks like. It absolutely needs to be able to serialize documents to clean HTML, and to parse DOM content (edited or pasted) into its own format. As such, it is important for the concepts in the document model to map rather directly to HTML constructs.
For marks on flat content, it is relatively straightforward to map the marks to wrapping inline nodes, and to parse such nodes by applying marks to the content inside of them. But if marks were able to span arbitrary ranges of the document tree, it would become rather non-obvious what the HTML should look like. A tree structure doesn't naturally lend itself to expressing ranges that don't follow the tree shape.
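As a rough illustration of the flat case, the sketch below turns a sequence of marked leaves into nested wrapper elements. It is not ProseMirror's serializer: the `Leaf` shape and both function names are made up for this example, marks are represented directly by tag names, and each leaf is assumed to list its marks in a consistent order.

```typescript
// Hypothetical leaf shape: marks are given as HTML tag names, outermost first.
type Leaf = { text: string; marks: string[] };

// Escape the characters that are significant in HTML text content.
function escapeText(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Serialize flat, marked inline content as nested wrapper elements.
function inlineToHTML(leaves: Leaf[]): string {
  let html = "";
  let open: string[] = []; // wrappers currently open, outermost first
  for (const leaf of leaves) {
    // Keep the wrappers that this leaf shares with the previous one.
    let keep = 0;
    while (keep < open.length && keep < leaf.marks.length && open[keep] === leaf.marks[keep]) {
      keep++;
    }
    // Close the wrappers that no longer apply, innermost first.
    for (let i = open.length - 1; i >= keep; i--) html += `</${open[i]}>`;
    // Open the wrappers that are new for this leaf.
    for (let i = keep; i < leaf.marks.length; i++) html += `<${leaf.marks[i]}>`;
    open = leaf.marks.slice();
    html += escapeText(leaf.text);
  }
  // Close whatever is still open at the end of the content.
  for (let i = open.length - 1; i >= 0; i--) html += `</${open[i]}>`;
  return html;
}

const sampleHTML = inlineToHTML([
  { text: "a ", marks: [] },
  { text: "b", marks: ["strong"] },
  { text: "c", marks: ["strong", "em"] },
  { text: " d", marks: [] },
]);
// sampleHTML is "a <strong>b<em>c</em></strong> d"
```

This works because the marked stretches all sit inside one parent. Once a range crosses block boundaries there is no single place to open and close the wrapper.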
You could certainly come up with some scheme based on custom attributes, using some kind of counting system to refer to offsets inside of the nodes (for example the parent node + offset system used by DOM ranges and selections), but that gets very awkward. Do you use marker nodes at the start and end of the range? Or wrapping nodes? If wrapping nodes, the question again becomes at which level(s) they should appear. This should also work well with copying and pasting fragments of the DOM structure, and not get in the way of incremental display updates. I haven't been able to come up with a good approach. This kind of data is simply not something that HTML authors tend to need to express.
I've seen two use cases where the limited applicability of marks is causing problems:
Trying to make range-shaped information (such as comments added to a piece of the document) part of the document, so that clipboard and undo-history information works on it.
Adding some kind of baseline information (say font family/size) to all content, like in classical WYSIWYG editors. (I don't care a great deal about this use case, since I think it's generally a bad idea for an editor to work that way, but it'd be nice if it could be cleanly encoded.)
Should anything change?
Currently, every node has a marks field, because I figured that filling that field with a shared empty array is cheap, and JS engines like coherently-shaped objects. But as the system currently is, putting marks on block nodes doesn't really work: there are too many cases where it's not clear how they should be handled, so the library code ends up not handling them in any coherent way at all. This is not a good situation, so I'd like to get to a point where there either is a coherent story, or these are simply forbidden.
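The shared-empty-array idea looks roughly like the sketch below. `SketchNode` and `NO_MARKS` are hypothetical names used only for illustration, and the class is far simpler than ProseMirror's real node type.

```typescript
// Hypothetical, simplified node class. Not ProseMirror's actual implementation.
type Mark = { type: string };

// One frozen empty array shared by every node that has no marks.
const NO_MARKS: readonly Mark[] = Object.freeze([]);

class SketchNode {
  constructor(
    readonly type: string,
    readonly content: readonly SketchNode[] = [],
    // Every node gets the same three fields, so all instances share one
    // object shape; unmarked nodes just point at the shared array.
    readonly marks: readonly Mark[] = NO_MARKS
  ) {}
}

const plainText = new SketchNode("text");
const strongText = new SketchNode("text", [], [{ type: "strong" }]);
const paragraph = new SketchNode("paragraph", [plainText, strongText]);
// plainText.marks and paragraph.marks are the same array object,
// so unmarked nodes allocate nothing extra for the field.
```

Note that nothing in this shape stops a block node like `paragraph` from being given marks, which is exactly the gap discussed here.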
(I do know one user is currently using marked blocks as a kludge to implement the first use case above, but it's not working all that well.)
I'm basically opening this thread as a call for feedback and ideas with regards to marks. I can imagine the following options:
Forbid marks on block nodes and declare them to always be constrained to a stretch of inline content.
Find some coherent way to model marks as arbitrary ranges, and make the whole library consistently apply that.
Reinvent marks in some wonderfully creative way that I haven't thought of yet.
(Note that 'make the whole document tree-shaped' is not an option. It wouldn't help with cross-node ranges, and though I know you can build an editor with such a content model, ProseMirror is too far along to make such a radical change, and as mentioned, I found the conceptual mismatch in such a model a real problem.)