How to make a block node contain either inline children or block children (NOT a mix of block and inline nodes)

According to the HTML specification, a block element can contain either block elements or inline elements. ProseMirror, however, does not support mixing the two. Is there a way to define a node that allows either only inline nodes or only block nodes (the two do NOT mix with each other)?

E.g.

<ul>
    <li>plaintext</li>
</ul>
<ul>
    <li><p>plaintext</p></li>
</ul>

Both of them should be valid.

I’ve tried to define the node’s content as (inline* | block*). However, it threw an error: SyntaxError: Mixing inline and block content.

No, you’ll have to define two different parent node types to do something like that.

Thanks for your kind reply. I don’t think defining two different nodes is a good workaround. In particular, you would need to treat two different node types as one.

For now, I can only define a block node and filter the output when exporting HTML, removing the embedded p tag (if there is only one p tag).
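
The export-time workaround described above could look roughly like the following. This is only a sketch: the function name unwrapSingleParagraphs and the default 'li' selector are made up for illustration, and it assumes the exported HTML is already available as a DOM element.

```javascript
// Hypothetical helper, not part of ProseMirror: post-process exported DOM so
// that <li><p>text</p></li> becomes <li>text</li>.
function unwrapSingleParagraphs(root, selector = 'li') {
  root.querySelectorAll(selector).forEach((item) => {
    // Only unwrap when the <p> is the sole child node of the item.
    const only = item.childNodes.length === 1 ? item.firstElementChild : null;
    if (only && only.tagName === 'P') {
      // Move the paragraph's children up into the item, dropping the <p>.
      only.replaceWith(...Array.from(only.childNodes));
    }
  });
  return root;
}
```

Items that hold more than one paragraph, or a paragraph next to other blocks, are left untouched.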

Additionally, would it be feasible to support this in the library itself? After all, it matches real-world needs.

Hi Marijn

I have a similar problem to this with

Can you please give an example how to have 2 parents work together? At the moment I can only get one or other to work.

        // I want to be able to parse both block content and text within a td
        // e.g. <td><table>...Block content....</table></td> and <td>Text</td>
        // currently <td>Text</td> is transformed into <td><p>Text</p></td>
        nodes = nodes.update('table_cell', {
          content: 'block+',
          tableRole: 'cell',
          isolating: true,
          attrs: { style: { default: null } },
          parseDOM: [{ tag: 'td', getAttrs: getAttributes }],
          toDOM: node => ['td', node.attrs, 0]
        });

        // new rule to try to cater to the text-only case. It works in isolation.
        nodes = nodes.update('table_cell_text', {
          content: 'text*',
          tableRole: 'cell',
          isolating: true,
          attrs: { style: { default: null } },
          parseDOM: [{ tag: 'td', getAttrs: getAttributes }],
          toDOM: node => ['td', node.attrs, 0]
        });

Here is my attempt in a code fiddle.

Many thanks

Hi @marijn - any insight on the above question would be really appreciated. Thanks

Is the problem that the correct parse rule isn’t picked when parsing such an element? You can use getAttrs as a dynamic predicate for rules, by making it return false when the rule doesn’t apply to the given element.
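
As a rough illustration of that suggestion, two parse rules for the same tag can each return false from getAttrs when the element is not theirs. The helper hasElementChildren and the rule names are hypothetical, and checking dom.children is only a crude heuristic: a cell like <td>a <b>b</b></td> has an element child but no block content.

```javascript
// Hypothetical helper: true when the element has at least one element child.
const hasElementChildren = (dom) => dom.children.length > 0;

// Rule for cells that hold only text: declines elements with element children.
const inlineCellRule = {
  tag: 'td',
  getAttrs: (dom) => (hasElementChildren(dom) ? false : {}),
};

// Rule for cells that hold nested elements: declines text-only cells.
const blockCellRule = {
  tag: 'td',
  getAttrs: (dom) => (hasElementChildren(dom) ? {} : false),
};
```

Returning an object (here an empty one) means the rule matches, while returning false tells the parser to try the next rule.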

@marijn, thanks for pointing me in the right direction with getAttrs. I have updated it to conditionally check dom.children for nodes or text; however, this still doesn’t seem to apply any rule with a new name.

e.g.

// overwriting the existing block rule with a text rule applies the text rule fine.
        nodes = nodes.update('table_cell', {
          content: 'text*',
          tableRole: 'cell',
          isolating: true,
          attrs: { style: { default: null } },
          parseDOM: [{ tag: 'td', getAttrs: getAttributes }],
          toDOM: node => ['td', node.attrs, 0]
        });
// a new rule never seems to be applied
        nodes = nodes.update('table_cell_block', {
          content: 'block+',
          tableRole: 'cell',
          isolating: true,
          attrs: { style: { default: null } },
          parseDOM: [{ tag: 'td', getAttrs: getAttributes }],
          toDOM: node => ['td', node.attrs, 0]
        });

Is there anything else that is needed to register a new rule?

Ignore me - the parent of the table cells needs to include the new node as well as the existing one in its content expression.

Thanks again for the assistance.

@graham Hey, how did you solve this? Do you have your latest stackblitz link?

I’m looking for the parsing to keep my deeply nested div structure while allowing the deepest div to hold inline content. For some reason, though, a new div gets injected, which I’m later removing. I’d rather not have that hack.

{
doc: {
  content: 'block+',
},
div: {
  attrs: {
    class: { default: null },
    data: { default: {} },
  },
  content: 'block*',
  group: 'block',
  parseDOM: [
    {
      tag: 'div',
      getAttrs(dom) {
        const attrs = { data: {}, class: dom.getAttribute('class') }
        for (let d in dom.dataset) {
          attrs.data[d] = dom.dataset[d]
        }
        return attrs
      },
    },
  ],
  toDOM(node) {
    const attrs = { class: node.attrs.class }
    Object.keys(node.attrs.data).forEach(key => {
      attrs[`data-${key}`] = node.attrs.data[key]
    })
    return ['div', attrs, 0]
  },
},

// a paragraph-like div is injected (and later removed) if there are no ps in a deeply nested div structure
pdiv: {
  attrs: {
    class: { default: null },
    data: { default: {} },
  },
  content: 'text*',
  group: 'block',
  parseDOM: [
    {
      tag: 'div',
      getAttrs(dom) {
        const attrs = { data: {}, class: dom.getAttribute('class') }
        for (let d in dom.dataset) {
          attrs.data[d] = dom.dataset[d]
        }
        return attrs
      },
    },
  ],
  toDOM(node) {
    const attrs = { class: node.attrs.class, 'data-inline-wrapper': 'true' }
    Object.keys(node.attrs.data).forEach(key => {
      attrs[`data-${key}`] = node.attrs.data[key]
    })
    return ['div', attrs, 0]
  },
},

EmailEditor.js

import React, { Component } from 'react';
import {
  Editor,
  EditorUtils,
  EditorTools,
  ProseMirror
} from '@progress/kendo-react-editor';
import { extendTableNodes } from './schemaNodes';

const {
  Bold,
  Italic,
  Underline,
  AlignLeft,
  AlignRight,
  AlignCenter,
  Indent,
  Outdent,
  OrderedList,
  UnorderedList,
  Undo,
  Redo,
  Link,
  Unlink,
  InsertImage,
  InsertTable,
  AddRowBefore,
  AddRowAfter,
  AddColumnBefore,
  AddColumnAfter,
  DeleteRow,
  DeleteColumn,
  DeleteTable,
  FormatBlock,
  FontName,
  FontSize,
  ViewHtml
} = EditorTools;

const { Schema, EditorView, EditorState } = ProseMirror;

class EmailEditor extends Component {
  constructor(props) {
    super(props);

    this.editorRef = null;
  }

  onExecute = ({ transaction, state }) => {
    const { doc, selection } = transaction;

    if (doc.eq(state.doc)) return;

    if (this.props.onChange) {
      const nextState = EditorState.create({
        doc,
        selection
      });

      const editorValue = EditorUtils.getHtml(nextState);

      this.props.onChange.call(undefined, editorValue);
    }
  };

  onMount = (event) => {
    const { viewProps } = event;
    const schema = viewProps.state.schema;
    const plugins = viewProps.state.plugins.filter(
      (p) => p.key.indexOf('selectingCells') !== 0
    );
    const tableNodes = extendTableNodes();
    const marks = schema.spec.marks;

    // update built-in schema nodes
    let nodes = schema.spec.nodes;
    for (const nodeName in tableNodes) {
      if (nodeName) {
        nodes = nodes.update(nodeName, tableNodes[nodeName]);
      }
    }
    const mySchema = new Schema({ nodes, marks });

    // Create an empty document to load the schema.
    const doc = EditorUtils.createDocument(mySchema, '');

    // Return the custom EditorView object that will be used by Editor.
    return new EditorView(
      { mount: event.dom },
      {
        ...event.viewProps,
        state: EditorState.create({ doc, plugins })
      }
    );
  };

  setHtml = (content) => {
    if (!this.editorRef.view) return;

    const view = this.editorRef.view;
    EditorUtils.setHtml(view, content);
  };

  getHtml = () => {
    if (!this.editorRef.view) return '';

    const view = this.editorRef.view;
    const content = EditorUtils.getHtml(view.state);
    return content;
  };

  render() {
    return (
      <Editor
        ref={(editor) => (this.editorRef = editor)}
        onExecute={this.onExecute}
        contentStyle={{ height: 600 }}
        tools={[
          [Bold, Italic, Underline],
          [Undo, Redo],
          [Link, Unlink],
          [AlignLeft, AlignCenter, AlignRight],
          [OrderedList, UnorderedList, Indent, Outdent],
          [Link, Unlink, InsertImage],
          [InsertTable],
          [AddRowBefore, AddRowAfter, AddColumnBefore, AddColumnAfter],
          [DeleteRow, DeleteColumn, DeleteTable],
          [ViewHtml],
          FontSize,
          FontName,
          FormatBlock
        ]}
        defaultEditMode="div"
        onMount={this.onMount}
      />
    );
  }
}

export { EmailEditor };

schemaNodes.js

const getAttributes = (dom, isBlock) => {
  if (isBlock && dom.children.length === 0) return false;
  if (!isBlock && dom.children.length > 0) return false;

  const result = {};
  const attributes = dom.attributes;
  let attr;
  for (let i = 0; i < attributes.length; i++) {
    attr = attributes[i];
    result[attr.name] = attr.value;
  }

  if (!isBlock && dom.innerHTML.trim() === '&nbsp;') {
    result['width'] = 0;
  }

  return result;
};

const hole = 0;

const tableAttrs = {
  align: { default: null },
  border: { default: null },
  cellpadding: { default: null },
  cellspacing: { default: null },
  style: { default: null },
  width: { default: null },
  height: { default: null },
  bgcolor: { default: null }
};

const styleAttrs = {
  style: { default: null }
};

const cellAttrs = {
  colspan: { default: null },
  colwidth: { default: null },
  rowspan: { default: null },
  bgcolor: { default: null },
  style: { default: null },
  align: { default: null },
  width: { default: null },
  height: { default: null }
};

export const extendTableNodes = () => {
  return {
    table: {
      content: '(table_colgroup | table_tbody)+',
      tableRole: 'table',
      isolating: true,
      group: 'block',
      attrs: { ...tableAttrs },
      parseDOM: [{ tag: 'table', getAttrs: dom => getAttributes(dom, true) }],
      toDOM: node => ['table', node.attrs, hole]
    },
    table_tbody: {
      content: 'table_row+',
      tableRole: 'tbody',
      group: 'block',
      parseDOM: [{ tag: 'tbody' }],
      toDOM: function toDOM() {
        return ['tbody', 0];
      }
    },
    table_colgroup: {
      content: 'table_col+',
      tableRole: 'colgroup',
      parseDOM: [{ tag: 'colgroup' }],
      toDOM: function toDOM() {
        return ['colgroup', 0];
      }
    },
    table_col: {
      tableRole: 'col',
      attrs: { ...styleAttrs },
      parseDOM: [{ tag: 'col', getAttrs: dom => getAttributes(dom, true) }],
      toDOM: node => ['col', node.attrs]
    },
    table_row: {
      content:
        '(table_cell | table_cell_block | table_header | table_header_block)*',
      tableRole: 'row',
      attrs: { ...styleAttrs },
      parseDOM: [{ tag: 'tr', getAttrs: dom => getAttributes(dom, true) }],
      toDOM: node => ['tr', node.attrs, hole]
    },
    table_header: {
      content: 'text*',
      tableRole: 'cell',
      group: 'block',
      isolating: true,
      marks: '',
      attrs: { ...cellAttrs },
      parseDOM: [{ tag: 'th', getAttrs: dom => getAttributes(dom, false) }],
      toDOM: node => ['th', node.attrs, hole]
    },
    table_cell: {
      content: 'text*',
      tableRole: 'cell',
      group: 'block',
      isolating: true,
      marks: '',
      attrs: { ...cellAttrs },
      parseDOM: [{ tag: 'td', getAttrs: dom => getAttributes(dom, false) }],
      toDOM: node => ['td', node.attrs, hole]
    },
    // need duplicate definitions for the original block versions
    table_header_block: {
      content: 'block+',
      tableRole: 'cell',
      group: 'block',
      isolating: true,
      attrs: { ...cellAttrs },
      parseDOM: [{ tag: 'th', getAttrs: dom => getAttributes(dom, true) }],
      toDOM: node => ['th', node.attrs, hole]
    },
    table_cell_block: {
      content: 'block+',
      tableRole: 'cell',
      group: 'block',
      isolating: true,
      attrs: { ...cellAttrs },
      parseDOM: [{ tag: 'td', getAttrs: dom => getAttributes(dom, true) }],
      toDOM: node => ['td', node.attrs, hole]
    }
  };
};

@canvaspixels hope this helps

@graham thank you so much. Such a quick response. Legend!

Interestingly, even if I straight up return false in the pdiv, I still see the div appear with the data-inline-wrapper attribute on it.

pdiv: {
  attrs: {
    class: { default: null },
    data: { default: {} },
  },
  content: 'text*',
  group: 'block',
  parseDOM: [
    {
      tag: 'div',
      getAttrs(dom) {
        return false;
        const attrs = { data: {}, class: dom.getAttribute('class') }
        for (let d in dom.dataset) {
          attrs.data[d] = dom.dataset[d]
        }
        return attrs
      },
    },
  ],

@marijn

Thanks for the support. I need help with the above issue. As @canvaspixels mentioned, even if I return false directly, I still see the extra <p> tag inside <td>.

Here is a quick intro to the problem statement:

Input value to the editor:

<table>
    <tr>
      <td>Td without p tag</td>
      <td><p>Td with p tag</p></td>
    </tr>
  </table>

Expectation: (Wherever I have a text node, I want to see the same in the output)

<table>
    <tr>
      <td>Td without p tag</td> //<- here
      <td><p>Td with p tag</p></td>
    </tr>
  </table>

Actual Output: (Wherever I have a text node, it is enclosed by a <p> tag)

<table>
    <tr>
      <td><p>Td without p tag</p></td> //<- extra <p> tag
      <td><p>Td with p tag</p></td>
    </tr>
  </table>

I have modified the library demo code to add a new node called table_inline_cell in addition to the internal table_cell node.

  • table_inline_cell will parse td with inline child node. If it doesn’t match, I return false as suggested.
  • table_cell will parse td with block child nodes. If it doesn’t match, I return false as suggested.

Here is the stackblitz link to the modified code: StackBlitz

The parsing happens as expected (Image 1). But in the DOM and in the output, I always get the extra p tag in all cases (Image 2).

Kindly guide me where I miss the point. Thanks in advance.

That’s not something the library supports. A given node must have either all-inline or all-block content.

Okay, so even though it is semantically correct and possible to have both inline and block elements inside a td, the ProseMirror library does not support both types of nodes in a td. We can only use one type for all td elements throughout the editor.

In the past, I’d tried to have different table cells for block and inline content, but I realized it does not work well with, e.g., prosemirror-tables.

I think Pandoc has a clever approach: it defines a Plain block element, which is put around inline content in table cells and also in list items of any kind: bullet (<ul> in HTML), ordered (<ol>) and description (<dl>).

Your parser should recognize cells and items that have inline contents and embed them in a Plain.

A Plain could be rendered as a p with a certain class, or a div.

Its group property would be "block", while its content property would be "inline*" or "text*".
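
A minimal sketch of what such a Plain node spec might look like in a ProseMirror schema. The name plainNodeSpec, the div.plain selector and the class name are assumptions for illustration; the group and content values are the ones described above.

```javascript
// Hypothetical "plain" node spec: a block that holds only inline content,
// intended to sit inside table cells and list items in place of a paragraph.
const plainNodeSpec = {
  group: 'block',
  content: 'inline*',
  // Match only divs carrying the marker class, so ordinary divs are unaffected.
  parseDOM: [{ tag: 'div.plain' }],
  // Render as a div with the same marker class; 0 is the content hole.
  toDOM: () => ['div', { class: 'plain' }, 0],
};
```

It would be registered in the schema's nodes under a name such as plain. Recognizing bare inline content inside td or li on import still has to be handled by your own parsing rules.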