Multiple text nodes in a paragraph

bolerio · January 7, 2017, 6:17pm

Hi,

It seems like PM insists on having a single text node as the content of a paragraph (probably unless there are different marks). For example, the following:

let twonodes = [pm.schema.nodes['text'].create({}, "bit 1", []),
                pm.schema.nodes['text'].create({}, "bit 2", [])]
let p = pm.schema.nodes['paragraph'].create({}, twonodes, [])
pm.tr.insert(1, p).apply()

The result is:

{"type":"paragraph","attrs":{},"content":[{"type":"text","attrs":{},"text":"bit 1bit 2"}]}

Is there a way to enforce a desired structure without having PM decide what the tree looks like? Even when I create the text nodes with different attributes (I have modified the schema to allow some extra attributes in paragraph and text nodes), it still collapses them and keeps the attributes of only one of the nodes. I would expect it not to automatically concatenate/merge nodes when they have different attributes and/or marks.

Incidentally, I have this general problem with PM unpredictably deciding to restructure the model during transformations. It appears unpredictable because the rules are not documented, they are just heuristics expressed in code that is not always trivial to follow. It would help if the rules were spelled out somewhere. I understand that most people might not care about the details and just expect PM to do the sensible thing. But I care because I have a separate document model that I need to keep in sync with PM’s model.

Thanks much for your help! Boris

PS I’m not using the latest PM version (I’ve been holding off porting until APIs and behavior stabilize). I’m using 0.9.1. Let me know if there are recent changes that affect the above behavior and might fix my problem.

marijn · January 7, 2017, 9:32pm

Text nodes are conceptually an optimization to avoid allocating a separate node object for every character – they represent a range of uniformly-styled characters. Were your attributes on text nodes (something that I hadn’t considered and which should probably not be allowed) something meaningful, or just an attempt to prevent the merging? If the second, what is the reason why you care about the extent of the text nodes in the first place?
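
To illustrate the merging described above, here is a minimal sketch of how adjacent text runs with identical marks could be joined. It is a toy model, not ProseMirror's actual implementation: the `{text, marks}` object shape and the names `sameMarks` and `joinTextRuns` are assumptions made up for this example.

```javascript
// Toy model only. A text run is { text, marks }, where marks is an array
// of { type, attrs } objects. These shapes are hypothetical.

// Compare two mark arrays structurally. JSON.stringify is order-sensitive,
// which is acceptable for a sketch but too naive for real use.
function sameMarks(a, b) {
  return JSON.stringify(a) === JSON.stringify(b);
}

// Join neighbouring runs whose marks are identical; keep the rest apart.
function joinTextRuns(runs) {
  const out = [];
  for (const run of runs) {
    const last = out[out.length - 1];
    if (last && sameMarks(last.marks, run.marks)) {
      // Same styling: extend the previous run instead of adding a new one.
      out[out.length - 1] = { text: last.text + run.text, marks: last.marks };
    } else {
      out.push({ text: run.text, marks: run.marks });
    }
  }
  return out;
}
```

Under this model, the two unmarked nodes from the first post collapse into one run, while runs whose marks differ in any way stay separate.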

bolerio · January 8, 2017, 12:10am

Hi,

The attributes are meaningful. For example, I want each sentence in my paragraph to be independently identifiable by an ID, so I can attach information to it. That’s how I have it set up in my model, and I was hoping to avoid the messiness of maintaining a mapping between those identifiers and PM position ranges.

Thanks! Boris

bolerio · January 8, 2017, 7:18am

Hi again,

So, if I create a custom mark type with some non-standard HTML tag to render it (e.g. ), I end up with two text nodes. Which is what I want. Not sure if the custom HTML tag is a problem or a bad practice? The browser seems to ignore it and deal well with it. PM doesn’t seem to have a problem with it either.

Do you “bless” this approach?

Cheers, Boris

marijn · January 8, 2017, 8:18pm

It’s what I would have gone with, yes. I agree the extra DOM node is awkward, but I’m not sure marks that have zero effect on the rendered representation are common enough to add a feature for (non-rendered marks).
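
For readers who want to try the mark-based approach, a rough sketch of such a mark might look like the following. It assumes the mark spec shape used by current versions of prosemirror-model (`attrs`, `toDOM`, `parseDOM`), not the 0.9.1 API discussed in this thread. The name `sentenceMarkSpec` and the `data-sentence-id` attribute are hypothetical, and the sketch renders a `span` with a data attribute rather than the non-standard tag mentioned above.

```javascript
// Hypothetical mark that tags a run of text with a sentence ID.
// Shape follows the mark spec of current prosemirror-model; names are made up.
const sentenceMarkSpec = {
  // One required attribute (no default), so every mark instance carries an id.
  attrs: { id: {} },

  // Render as a span; the trailing 0 marks where the marked content goes.
  toDOM(mark) {
    return ["span", { "data-sentence-id": mark.attrs.id }, 0];
  },

  // Read the id back when parsing pasted or loaded HTML.
  parseDOM: [{
    tag: "span[data-sentence-id]",
    getAttrs(dom) {
      return { id: dom.getAttribute("data-sentence-id") };
    }
  }]
};
```

A spec like this would be passed under `marks` when constructing the schema, for example as `marks: { sentence: sentenceMarkSpec }`. Text carrying marks with different `id` values is not uniformly marked, so it ends up in separate text nodes.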

I’ve pushed a patch that causes the schema to raise an error if text nodes have attributes, so that the next time someone goes down this road they are told right away that it isn’t going to work.