Discussion: Inline nodes with content

Thanks for taking a look into this, the summary is excellent!

Cursor boundaries Confluence adds a zero-width no-break space on the boundaries of @mentions to add another cursor position at the same location, to disambiguate the semantics. Here’s an example where I step through the characters at a constant rate using the arrow keys.

The HTML for this is:

<p>I am &#65279;<a userkey="ff808081560b912f01560c40d04b000b" href="/wiki/display/~bayers" class="confluence-link">Bradley</a>&#65279; from Australia.</p>

Another example is a date:

<p>Today is &#65279;<time datetime="2016-11-15" contenteditable="false" class="non-editable" onselectstart="return false;">15 Nov 2016</time>&#65279;.<br></p>

I was thinking about this yesterday, and I suspect this might be satisfactory. In all the use cases I have in mind, the inline node is plain text, but has some special presentation and semantics applied to it. If it was possible to extend Text in order to restrict the marks that can be applied (in my cases so far I would disable any marks), and hook into selection events to present hover mentions, that would get me a long way.

The situation today seems to be that I could make do with marks, but wouldn’t get any help from the schema to achieve it. I’d need to intercept all commands and prevent any that would violate my constraints. This means that plugins that add support for adding other marks would have no way to tell (by looking at the schema) that they should prohibit it.

1 Like

That’s an interesting trick. It’s not ideal, since it requires a kludge in the rendering of such nodes, and another kludge when parsing them to make sure we don’t treat those characters as actual content, but it might be an option. (I tried doing something similar with :after/:before styles, which would be less invasive, but that didn’t work.)

Thanks for the feedback so far. I’m putting this aside for a bit as I work on other things, but further comments are definitely welcome.

Have there been any new thoughts on this issue?

I’ve been revisiting content storage recently, and we’re evaluating whether just storing the ProseMirror structure (probably just via Node.toJSON) is viable. On the whole it seems okay, but there are some issues that make it inconvenient.

For example when expressing hyperlinks as marks, the text may be split (e.g. if half the text is bold). This means that the “unflatten” logic (i.e. the order of the marks in the schema) needs to be encoded into every tool that operates on the stored content.

Using a tool like jq to “find the text for each hyperlink” isn’t going to go well, since it doesn’t know that it should be merging together sibling text nodes that have a “link” mark with the same values in attrs. For these types of operations, having a true tree structure makes it much simpler.

A similar challenge will occur for server-side rendering code that needs to turn a document in another format (e.g. HTML or PDF) as it’ll need to understand how to merge text nodes together, based on the ordering of marks and equality of their attributes.

Being able to render and search quickly make representing a document as a tree very attractive. But it would nice to avoid an addition layer of translation and introduce distinct concepts of “editor format” and “storage format”.

I’m interested in anyone’s thoughts on this topic :slight_smile:.

1 Like

Sorry, I hadn’t seen this post before.

From the perspective of Fidus Writer: We were the ones needing the footnotes. The idea of having them be inline came mainly from experience with LaTeX. Havign footnotes be defined inline solves some issues (for example, copying them in a copy-paste operation works automatically) and creates others (such as the ones you mention).

Back then I followed Marijn’s advice and created a second editor for the footnotes and added a lot fo extra logic to make sure they are always placed in relation to their reference while not overlapping, etc. . This code is working, but the overall complexity of it all (two editors, collaboration provisions when steps don’t arrive, etc.) means that it is quite a lot of work to maintain it. We are still on a slightly hardened 0.10, because it’s simply too much work to update it all.

So in this sense I would vote for either keeping it all as it is in order to not create even more breaking changes, or to switch to a radically simpler model that will be able to handle editable inline nodes out-of-the-box and won’t require individual editors to maintain a lot of code on top. What I would be against would be to switch to another model that is theoretically cleaner, but ends up requiring the same amount or more code in the editor and adds a lot of new breaking changes.

From the perspective of the W3C editing taskforce: The nornalization of caret positions/selecctions is a problem we have been looking at for a while. The argument by several browser manufatcurers has often been that it’s difficult to change any of this, because various JS editors likely depend on the current, broken state. There have been several proposals to this dilemma, but the most recent one is that one creates a new element that is the successor of contenteditable, that will be versioned. So basically you can do something like:

 <editable version="1">...</editable>

There should be a new version fairly frequently with new fixes in every new version. If someone relies on the bugs still present in version 657, they simple set that version. If one doesn’t set a version, one always gets the newest.

If you think that sounds like a good or bad idea, please let us know!

Thanks for the insight to the W3C editing taskforce direction — I’d like to follow along with that more closely.

On face value I’m skeptical that versioning an element is going to end up solving more problems than it creates, but I’d need to read through the exact proposal before having a strong opinion. I’m wary on the typical scenario where a version number is more of an aspiration of functionality, than of true conformance. (put plainly — I don’t think a version number will mean the same thing on all browsers).

The whole issue reminds me a lot of CSS and the proposal of Houdini, and I wonder whether an approach inspired by that wouldn’t work better here. An <editable> or contenteditable element doesn’t seem like the direction we really want to go, but it’s just something we exploit to get access to rich user-input APIs that feed into a JavaScript model that is then able to update the DOM. Having document integrity enforced by a model in JavaScript seems like the direction everyone wants to go. I wonder whether looking at what problems contenteditable solves regarding user input (e.g. pasting, cursor navigation, …) and building first-class APIs to address each specific problem wouldn’t be a better approach.

I haven’t published any drafts yet. So your ideas are very much appreciated both now and once there is a draft.

The general problem is that contenteditable is very very broken and has been for a long time. All attempts at solving it “once and for all” seem doomed because every time we seem to find a solution to one aspect, we discovered that there are a thousand more issues. So instead of saying we cannot fix anything before everything is fixed, we came up with this which does seem reasonable. The editing taskforce has been operation now for about 3 years and we have only barely scraped the surface of some of the main painpoints that have to be solved on the browser side. So if we do everything in one go, we may have to wait another 30-40 years before we can ship. That doens’t seem like a reasonable timetable.

The wish for a special element came from the side of browser developers. This came up in connection with us talking about the need for a way to disable certain operations on a specific contenteditable container. Having a way to disable certain edit operations is needed mainly because the Safari team only wants to disable items in their native browser UI for editable if the edit action is disabled entirely in that contenteditable element.

You may say that this makes no sense, as no JS editor cares about whether or not the contenteditable element has an attribute that disables bold and italic as the JS will just do whatever it wants anyway. Possibly yes, but this is not how everyone sees the world, and having this global way of disabling features so that they will disappear from the browser native menus is the lowest common denominator.

Why does this trigger a switch from atribute to an element? Apparently it’s complicated for browsers to treat an element completely differently due to an attribute. This sems to be especially true if you have to look at two attributes to determine what is going on. This of this:

<div contenteditable="true" disablededitingactions="underline fontcolorchange lists">...</div>

I was there at one of the first meetings in Sydney. I agree that it would be much better to just expose primitives to be engaged with with JS. But this is not how all the browser people involved with this look at the situation. In the view of some of them, the browser provided UI should be the principal way of interacting with a text editor on the web, and JS editors that implement their own logic are destroying the user experience…

Anyway. drop in to the email discussions or to the next phone call we will have if you feel like getting invovled in years of discussions that may or may not lead to tangible improvements for JS develoeprs some day. :slight_smile:

Yes, though no definite decision yet. My current thinking is that I will

  • Add some way for marks to have more control over which marks they can coexist with (to allow multiple instances of a given mark type at a single spot, or forbid certain marks from occurring together) to address some simple use cases.

  • Require that inline nodes with content are rendered with a custom node view, as a way to put the responsibility of hacking something together that happens to work for a given use case toward the client code. So if you want a nested structured node in your inline content, sure, but you have to kludge together the editing interface

  • Make it easier to use nested ProseMirror instances in custom node views (primarily by providing a way to map steps in a given local node that is the top level node of the nested editor to steps in a wider document)

This isn’t going to solve everything, but I think solving everything may be out of reach, and it’d already be great to have a working approach for these kinds of situations.

Doesn’t a tree-based representation suffer from similar issues? Unless your link mark is the lowest-precedence mark in your schema, it risks being split by other, lower-precedence marks.

1 Like

We would have some more use cases:

  • Keywords: Each keyword is displayed in an uneditable “pill” but after all the keywords in one line there needs to be the possibility to add a new keywords by typing text after the last such pill. Ideally, when the caret comes close to it, the pill turns into editable text. Think of a field such as:

Keywords: Home, Kindergarten, Fish

This could just be a simple text node where the keywords are extracted by separating at the position fo the commas, but if one does that, one can almost be certain that some users will use spaces or semicolons instead. Some will write “and” in-between the words, etc. So the oills seem a safer bet. But as far as I get, there is no way of making a schema-declaration for this currently, right?

  • A line with an arbitrary number of First/last name “pills”. Think of an article that has a title and an authors line:

Heading

By Hans Bild (University of Washington), Line Weingaard (US State Dpartment), Dr. George Kufy (VIetnam-US friendship society)

The editor needs to make sure that the by lines follows the same format in all documents and the program needs to be able to extract all the names and institutions from that line. There may be several ways of doing it. The one that comes to my mind is to have an arbeitrary number of inlinw “pills” where each pill represents one person and placeholder textss are used to guide the user in typing the names correctly. Alternatively, The entire line is non-editable and a separate UI is used to insert structured data.

I had come to the conclusion that being able to have inline nodes with content would be sufficient to solve this (assuming that the inline node content had the same content spec expressiveness as text blocks). However I do have a question about how how you see inheritance of marks in the content specs working.

Consider:

{
  nodes: {
    paragraph: { content: 'inline<italic>' },
    text: { group: 'inline' },
    link: { content: 'text<strong>', group: 'inline' }
  }
}

I’d expect the following:

  • Does a text within a paragraph allow the italic mark? — yes
  • Does a text within a paragraph allow the strong mark? — no
  • Does a link within a paragraph allow the italic mark? — no
  • Does a link within a paragraph allow the strong mark? — yes

Is there a scenario where supporting different sets of marks without employing an inline node with content is a scenario worth supporting?

This sounds very reasonable to me.

I’ve been working under the principle that nesting ProseMirrors should be avoided in favour of a single ProseMirror. Can you point me to the background for why nested ProseMirrors is an approach that should be embraced?

1 Like

Having the same mark on the same text twice with different attribute values would be essential to us. We are currently waiting for this from upstream. But should you decide not to do this anyway, let us know, then I’ll have to hack something on top of the existing code so that we can put several reference IDs into the same mark. Then we just need to figure out abotu a way of removing all references to one particular ID in all the marks within a certain range, or to add a new reference ID to all marks within a specific range. This will be a bit messy, so having direct support for multiple marks would be preferable.

I don’t know yet – I haven’t gotten around to working on that.

When the editor for a given node doesn’t look like inline editable content (say, a tooltip that allows one to edit a footnote after clicking on a placeholder that indicates its position), using a separate ProseMirror instance is preferable. If transactions from that instance can be mapped onto the outer editor, that makes propagating changes and supporting proper collaborative editing inside such content really easy.

1 Like

Interesting! Could you share what your use-case is for this?

We need to reference comments by several different people on the same piece of text. Marked ranges/decorations don’t work for us for several reasons and the comments really are part of the document – we even need to convert them when converting the document to other formats.

I’m curious about the current feasibility of this, since this is exactly what I am trying to do!

My use case is wysiwig math editing. I’m trying to hook up http://mathquill.com/ as a custom node view, on top of an inline node with content of a single text node. The content node in this case contains the latex representation of the math.

I’ve run into a number of issues while doing this, and I’m wondering if I should report the issues I am seeing and try to get it working, or whether it’s not reasonable to attempt at this time. In that case, I’ll probably convert it into a leaf node and use some other mechanism (maybe attrs?), rather than a child text node to store the latex.

2 Likes

This is something that’s still shaky. I wrote an example of editing inline footnotes by showing an editor in a tooltip when they are selected (which only works with the current master branches, not the latest release), but your use case is different again (the editor is always visible, and I guess you want to move the cursor into the inner editor when ‘arrowing’ through). I’ll try to allocate some time to create an editor like that soon, and see how it goes.

I have the same use case and thought about the same approach. Unfortunately I didn’t have the time to try this yet, but it would be really nice to be able to handle such a use case using a custom node view like you describe.

That’s great to here, I’m very interested in seeing how this works!

Cool! Are you thinking about targeting the math editing use case specifically? Or something similar?

I’d like to be of assistance in any way I can! For now I’ll just keep reporting issues as I find them :slight_smile:

I have a use case for supporting @mentions like the following:

Does anybody have an example of that use case that I could look at? Or maybe some direction on how I could accomplish this?

Thanks for any and all help!

// Nick

6 Likes

hi denis! How did you programmatically place/replace the text node inside your custom node view?? I’m a newbie and the documentation doesn’t describe this in a way I can easily understand :frowning:

The examples only got me so far as getting the custom node and node view to show up but like you I want to store my info in a text node.

https://tiptap.dev/suggestions.