Rendering custom controls to update node attributes

I have a use case where I would like to render control elements for a node which should update some of the node’s attributes when clicked. An example would be an image that has three different view modes which control the size. A size attribute could then be set to small, medium, or large via a dropdown that is rendered in the top right corner of the image when hovered.

My first approach would be to render these control elements inside the node’s serializeDOM method and to set up click handlers for the different controls. The click handlers would then get the position of the node using posAtCoords and update the attributes using setNodeType.

Unfortunately, as far as I know the serializeDOM method does not have access to the ProseMirror instance, which makes it impossible to use posAtCoords or setNodeType.
Now my question is if there’s a better way to get this working or if it would be possible to pass a ProseMirror instance to the serializeDOM method. Of course it should only be passed if the node being rendered is actually part of a document that is loaded by a ProseMirror instance.
An additional question would be if the posAtCoords method still works if the control elements aren’t visually positioned inside the node’s container div (e.g. to the right of the node using some CSS). They would still be inside the container div in the DOM, just positioned differently on a visual level. I think it would be important to handle this use case.

I just saw this related issue, which seems to strive for a specialized solution for use cases like the one I described. I think that passing a ProseMirror instance to the serializeDOM method could still be beneficial as it would provide maximum flexibility in terms of handling DOM events inside a node. A specialized solution could then probably be built on top of this flexible API.
I think that the ProseMirror instance would need to be optional so that serializeDOM still works in environments where no ProseMirror instance is available. This could happen, for example, when batch converting documents to HTML.

Any thoughts on this?

You could register the mousedown handler on the whole editor and inspect the event target to see if it is hitting your control.
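
A minimal sketch of that delegated approach, not a definitive implementation. It assumes the view exposes posAtCoords with the {pos, inside} result quoted from the docs further down this thread. The .size-control class, the data-size attribute, and all type and function names here are hypothetical, and the small interfaces are structural stand-ins so the snippet does not depend on DOM or ProseMirror typings.

```typescript
// Structural stand-ins for the DOM element, mouse event, and editor view.
interface ElementLike {
  closest(selector: string): ElementLike | null;
  getAttribute(name: string): string | null;
}
interface MouseEventLike {
  target: ElementLike | null;
  clientX: number;
  clientY: number;
}
interface ViewLike {
  posAtCoords(coords: { left: number; top: number }): { pos: number; inside: number } | null;
}

interface SizeChange {
  nodePos: number; // position of the node the click landed in
  size: string; // value read from the clicked control
}

// Hypothetical markup: <button class="size-control" data-size="large">.
function sizeChangeFromMouseDown(view: ViewLike, event: MouseEventLike): SizeChange | null {
  const control = event.target ? event.target.closest(".size-control") : null;
  if (!control) return null; // not one of our controls, so leave the event to the editor
  const size = control.getAttribute("data-size");
  if (size === null) return null;
  const hit = view.posAtCoords({ left: event.clientX, top: event.clientY });
  if (!hit || hit.inside < 0) return null; // outside the visible editor, or not inside any node
  return { nodePos: hit.inside, size };
}
```

The function would be called from a mousedown handler registered on the whole editor, and a non-null result would then be turned into an attribute update on the node at nodePos. Whether inside points at the intended node when the control is visually moved outside the node's box is something to verify in the browser.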

Only if the element is actually on the screen. For mouse events, it should be fine.

Thanks, this sounds like a workable solution for now!

This is really surprising to me if I understand this correctly. So to clarify, if I have an element widget within paragraph A that’s positioned (position: absolute) over paragraph B, when I call view.posAtCoords(elementWidget.getBoundingClientRect()) it will actually return a position close to paragraph A?

And this working is predicated on the DOM hierarchy (widget being a descendant of paragraph A)?

Has anything changed in the last 2 years with regards to how best to find a document position given a reference to a DOM node?

Looking at the docs for posAtCoords it’s not clear to me exactly what pos and inside mean:

Given a pair of viewport coordinates, return the document position that corresponds to them. May return null if the given coordinates aren’t inside of the visible editor. When an object is returned, its pos property is the position nearest to the coordinates, and its inside property holds the position of the inner node that the position falls inside of, or -1 if it is at the top level, not in any node.

I don’t understand what “inner node” means in this context — “inner” with respect to what exactly?

Here’s my attempt at rephrasing (is this correct?):

  • pos is the document position found by drawing a straight line from (left, top) to the closest pixel (as in shortest line) on the screen that has a document position (not all pixels have document positions)
  • inside is the document position found by traversing up the DOM from the element rendered under (left, top) on the screen, until a document position is found

Yes, getting a position from a DOM node will walk down the DOM hierarchy until it finds a node with a prosemirror view description attached to it.

The coordinates will (usually) fall within some node. That node has parent nodes that the coordinates are also technically inside of, but you get the innermost node that is around the coordinates.

So no, it is the position before the first node that the given DOM position falls inside of.

That helps my understanding, thank you, but I still feel like I’m missing something.

Yes, getting a position from a DOM node will walk down the DOM hierarchy until it finds a node with a prosemirror view description attached to it.

Ah! I assumed there was a DOM API that returned an element at a given (top, left), and ProseMirror would traverse up from there to search for an element with a view description. I think that assumption definitely contributed to my confusion.

Do I understand correctly that inside is the position before the node containing pos (i.e. doc.resolve(pos).before(-1))? That would mean that inside < pos is always true?

If my understanding is correct, it seems like a primitive version of a ResolvedPos, and that makes me wonder why posAtCoords doesn’t return a ResolvedPos?

I think I’m probably still misunderstanding/missing something. I appreciate the help!

Well, that’s also true—I described the process of going from a DOM position to a ProseMirror position. If all you have are window coordinates, you first need to get a DOM position from that, for which we use caretRangeFromPoint or caretPositionFromPoint, depending on browser support.
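
A rough sketch of that first step, going from window coordinates to a DOM position with whichever of the two browser APIs is available. The DocumentLike and DomPoint types and the function name are hypothetical stand-ins, and the order in which the two methods are tried is an assumption rather than a statement about ProseMirror's internals.

```typescript
// A DOM position: a node plus an offset into it.
interface DomPoint {
  node: unknown;
  offset: number;
}

// Structural stand-in for the parts of `document` used here; both methods are optional
// because browser support differs.
interface DocumentLike {
  caretPositionFromPoint?: (x: number, y: number) => { offsetNode: unknown; offset: number } | null;
  caretRangeFromPoint?: (x: number, y: number) => { startContainer: unknown; startOffset: number } | null;
}

function domPointFromCoords(doc: DocumentLike, x: number, y: number): DomPoint | null {
  if (doc.caretPositionFromPoint) {
    const caret = doc.caretPositionFromPoint(x, y);
    return caret ? { node: caret.offsetNode, offset: caret.offset } : null;
  }
  if (doc.caretRangeFromPoint) {
    const range = doc.caretRangeFromPoint(x, y);
    return range ? { node: range.startContainer, offset: range.startOffset } : null;
  }
  return null; // neither API is available
}
```

In a browser, the global document object could be passed as doc. The resulting node and offset are what then get mapped to a document position.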

They might be equal, if the click was in a leaf node and the heuristics for disambiguating the side say that it was closer to the start of the leaf node.

The information returned isn’t encodable in a single position. When you click on a leaf node you’ll get a position next to the node, but that doesn’t yet tell you that the leaf node was clicked, which is relevant when you want to implement interaction with the node.
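
To illustrate why both values are useful, here is a small sketch that uses inside to find the node that was actually hit. It assumes the document offers a nodeAt(pos) lookup that returns the node directly after a position. The DocLike and CoordsResult types and the function name are hypothetical stand-ins, not ProseMirror's own typings.

```typescript
// Shape of a non-null posAtCoords result, as described in the quoted docs.
interface CoordsResult {
  pos: number; // position nearest to the coordinates
  inside: number; // position of the inner node, or -1 at the top level
}

// Structural stand-in for a document with a nodeAt lookup.
interface DocLike<N> {
  nodeAt(pos: number): N | null;
}

function nodeUnderCoords<N>(doc: DocLike<N>, hit: CoordsResult | null): N | null {
  if (!hit || hit.inside < 0) return null; // outside the editor, or not inside any node
  return doc.nodeAt(hit.inside); // node that starts at `inside`
}
```

With only pos, a click on a leaf node such as an image would yield a position next to the node; inside is what identifies the image itself.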
