@mtcute/markdown-parser

📖 API Reference

Markdown entities parser for mtcute

NOTE: The syntax implemented here is compatible with neither Bot API Markdown nor MarkdownV2.

Please read Syntax below for a detailed explanation.

Note

It is generally recommended to use @mtcute/html-parser instead, as it is easier to use and more readable in most cases.

Usage

import { TelegramClient } from '@mtcute/client'
import { MarkdownMessageEntityParser, md } from '@mtcute/markdown-parser'

const tg = new TelegramClient({ ... })
tg.registerParseMode(new MarkdownMessageEntityParser())

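// `md` is a tagged template: the literal parts are parsed as markup,
// interpolated values are escaped.
// `getUpdatesFromFeed` is a placeholder for your own function.
await tg.sendText(
    'me',
    md`Hello, **me**! Updates from the feed:\n${await getUpdatesFromFeed()}`
)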

Syntax

Inline entities

Inline entities are defined by some tag surrounding some text, and processing them simply strips their tag.

Supported entities:

  • Bold, tag is **
  • Italic, tag is __
  • Underline, tag is -- (NON-STANDARD)
  • Strikethrough, tag is ~~
  • Spoiler, tag is || (NON-STANDARD)
  • Code (monospaced font), tag is `
    • Note that escaping text works differently inside code, see below.

Unlike CommonMark, we use the symbol itself and not its count. Thus, ** (double asterisk) always produces bold and __ (double underscore) always produces italic, while a single * or _ is left as plain text.

This eliminates a lot of confusion, like: _bold_ → bold, **italic** → italic

Code                 Result (visual)    Result (as HTML)
**bold**             bold               <b>bold</b>
__italic__           italic             <i>italic</i>
--underline--        underline          <u>underline</u>
~~strikethrough~~    strikethrough      <s>strikethrough</s>
||spoiler||          N/A                <spoiler>spoiler</spoiler>
*whatever*           *whatever*         *whatever*
_whatever_           _whatever_         _whatever_
`hello world`        hello world        <code>hello world</code>
`__text__`           __text__           <code>__text__</code>
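
The tag rules above can be illustrated with a small standalone sketch. This is not the parser shipped in this package: `parseInline`, `Entity` and `TAGS` are hypothetical names, and escaping, links and pre blocks are deliberately left out. It only shows the idea that each tag toggles an entity on and off, and that nothing but the closing backtick is recognized inside code.

```typescript
type EntityType = 'bold' | 'italic' | 'underline' | 'strikethrough' | 'spoiler' | 'code'

interface Entity {
    type: EntityType
    offset: number // position in the text with tags stripped
    length: number
}

// Tag -> entity type, as listed in the table above.
const TAGS: { [tag: string]: EntityType } = {
    '**': 'bold',
    '__': 'italic',
    '--': 'underline',
    '~~': 'strikethrough',
    '||': 'spoiler',
    '`': 'code',
}

function isTag(candidate: string): boolean {
    return Object.prototype.hasOwnProperty.call(TAGS, candidate)
}

// Hypothetical helper: strips tags and reports the entities they delimit.
function parseInline(input: string): { text: string; entities: Entity[] } {
    let text = ''
    const entities: Entity[] = []
    // For every currently open tag: the offset in `text` where it was opened.
    const open: { [tag: string]: number } = {}
    let pos = 0

    while (pos < input.length) {
        const insideCode = open['`'] !== undefined
        const two = input.slice(pos, pos + 2)

        let tag: string | null = null
        if (input.charAt(pos) === '`') tag = '`'
        // Inside code, only the closing backtick is treated as a tag.
        else if (!insideCode && two.length === 2 && isTag(two)) tag = two

        if (tag === null) {
            // Plain character (this includes a single * or _).
            text += input.charAt(pos)
            pos += 1
            continue
        }

        const start = open[tag]
        if (start !== undefined) {
            // Closing tag: emit the entity unless it is empty.
            delete open[tag]
            if (text.length > start) {
                entities.push({ type: TAGS[tag], offset: start, length: text.length - start })
            }
        } else {
            // Opening tag: remember where the entity starts.
            open[tag] = text.length
        }
        pos += tag.length
    }

    return { text, entities }
}
```

For example, `parseInline('**Hello**, __world__')` yields the text `Hello, world` with a bold entity at offset 0 and an italic entity at offset 7, both of length 5. Tags that are never closed are simply dropped in this sketch; the real parser may behave differently.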

Pre

Pre represents a single block of code, optionally with a language.

This entity starts with ``` (triple backtick), optionally followed by a language name, and must be followed by a line break. It ends with ``` (triple backtick), optionally preceded by a line break.

Code             Result (visual)    Result (as HTML)
```              hello              <pre>hello</pre>
hello
```

```              hello              <pre>hello</pre>
hello```

```javascript    const a = ``       <pre language="javascript">
const a = ``                        const a = ``
```                                 </pre>
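
To make the pre rules concrete, here is a minimal sketch that recognizes a string consisting of exactly one pre block. The regular expression and the names `PRE_RE`, `PreBlock` and `parsePre` are assumptions made for illustration; the actual parser handles pre blocks as part of a larger scan and may differ in edge cases.

```typescript
interface PreBlock {
    language: string | null
    code: string
}

// Opening fence, optional language, mandatory line break, lazily matched
// content, optional line break, closing fence at the very end of the input.
const PRE_RE = /^```([^\n`]*)\n([\s\S]*?)\n?```$/

// Hypothetical helper: returns null when the input is not a single pre block.
function parsePre(source: string): PreBlock | null {
    const m = PRE_RE.exec(source)
    if (m === null) return null

    const language = m[1].trim()
    return {
        language: language.length > 0 ? language : null,
        code: m[2],
    }
}
```

With this sketch, `parsePre('```javascript\nconst a = ``\n```')` gives the language `javascript` and the code ``const a = `` ``, while `parsePre('```hello```')` returns `null` because the opening fence is not followed by a line break.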

Links

Links are parsed exactly the same way as in standard Markdown (except that reference-style links are not supported).

Defined like this: [Link text](https://example.com).

  • Link text may contain any formatting, but a link cannot contain other links inside (obviously).
  • [ (opening square bracket) inside link text will be treated like a normal character.

A markdown-style link can also be used to define a name mention like this: [Name](tg://user?id=1234567), where 1234567 is the ID of the user you want to mention.

Additionally, a markdown-style link can be used to define a custom emoji like this: [😄](tg://emoji?id=123456), where 123456 is the ID of the emoji.

Note

In most cases, you can only use IDs of users that were seen by the client while using given storage.

Alternatively, you can explicitly provide the access hash like this: [Name](tg://user?id=1234567&hash=abc), where abc is the user's access hash written as a base-16 unsigned integer. The order of the parameters matters, i.e. tg://user?hash=abc&id=1234567 will not be processed as expected.

Code                                Result (visual)    Result (as HTML)
[Google](https://google.com)        Google             <a href="https://google.com">Google</a>
[__Google__](https://google.com)    Google             <a href="https://google.com"><i>Google</i></a>
[empty link]()                      empty link         empty link
[empty link]                        [empty link]       [empty link]
[User](tg://user?id=1234567)        N/A                N/A
[😄](tg://emoji?id=123456)          N/A                N/A
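
The order sensitivity of mention links is easy to see with an anchored pattern. The sketch below is an assumption about how such a URL could be matched, not the library's own code; `MENTION_RE`, `Mention` and `parseMentionUrl` are hypothetical names. The access hash is kept as a hexadecimal string rather than converted to a number.

```typescript
interface Mention {
    userId: number
    // Access hash as written in the link (base-16), or null if absent.
    accessHash: string | null
}

// `id` must come first; `hash` is optional and may only follow `id`.
const MENTION_RE = /^tg:\/\/user\?id=(\d+)(?:&hash=([0-9a-fA-F]+))?$/

// Hypothetical helper: returns null for anything that is not a mention URL
// in the expected parameter order.
function parseMentionUrl(url: string): Mention | null {
    const m = MENTION_RE.exec(url)
    if (m === null) return null

    return {
        userId: parseInt(m[1], 10),
        accessHash: m[2] ? m[2] : null,
    }
}
```

Here `parseMentionUrl('tg://user?id=1234567&hash=abc')` returns the user ID 1234567 with the hash `abc`, whereas `parseMentionUrl('tg://user?hash=abc&id=1234567')` returns `null`, mirroring the note above about parameter order.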

Nested and overlapping entities

Quite a powerful feature of this parser is the ability to process overlapping entities. Only inline entities (except code) can overlap.

Since inline entities are only defined by their tag, and nesting the same entity inside itself doesn't make sense, you can think of the tags just as start/end markers, and not in terms of nesting.

Code                           Result (visual)        Result (as HTML)
**Welcome back, __User__!**    Welcome back, User!    <b>Welcome back, <i>User</i>!</b>
**bold __and** italic__        bold and italic        <b>bold <i>and</i></b><i> italic</i>
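
The HTML column above shows how overlapping start/end markers have to be split to form well-nested tags. The following sketch demonstrates one way to do that conversion. It is only an illustration: `Span` and `renderHtml` are hypothetical names, the input is assumed to be already-parsed ranges over the stripped text, and HTML escaping of the text is omitted.

```typescript
interface Span {
    tag: string // HTML tag name, e.g. 'b' or 'i'
    start: number // inclusive offset in the text
    end: number // exclusive offset in the text
}

// Hypothetical helper: renders possibly overlapping spans as well-nested HTML.
function renderHtml(text: string, spans: Span[]): string {
    // Every position where some span starts or ends is a boundary.
    const points: number[] = [0, text.length]
    for (const s of spans) points.push(s.start, s.end)
    points.sort((a, b) => a - b)
    const bounds = points.filter((p, i) => i === 0 || p !== points[i - 1])

    let html = ''
    const stack: Span[] = [] // currently open spans, outermost first

    for (let i = 0; i < bounds.length; i++) {
        const pos = bounds[i]

        // Find the outermost open span that ends here.
        let lowest = -1
        for (let j = 0; j < stack.length; j++) {
            if (stack[j].end === pos) {
                lowest = j
                break
            }
        }

        if (lowest !== -1) {
            // Close everything from the top of the stack down to that span,
            // then reopen the spans that have not actually ended yet.
            const reopen: Span[] = []
            while (stack.length > lowest) {
                const top = stack.pop() as Span
                html += '</' + top.tag + '>'
                if (top.end !== pos) reopen.unshift(top)
            }
            for (const s of reopen) {
                html += '<' + s.tag + '>'
                stack.push(s)
            }
        }

        // Open spans starting here, longest first so they nest properly.
        const starting = spans
            .filter((s) => s.start === pos && s.end > pos)
            .sort((a, b) => b.end - a.end)
        for (const s of starting) {
            html += '<' + s.tag + '>'
            stack.push(s)
        }

        // Emit the text up to the next boundary.
        if (i + 1 < bounds.length) html += text.slice(pos, bounds[i + 1])
    }

    return html
}
```

Calling `renderHtml('bold and italic', [{ tag: 'b', start: 0, end: 8 }, { tag: 'i', start: 5, end: 15 }])` reproduces the second row of the table: the italic span is closed together with the bold one and reopened right after it.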

Escaping

Often, you may want to escape the text so that it is not processed as an entity.

To escape any character, prepend it with \ (backslash). Escaped characters are added to output as-is.

Inline entities and links inside code entities (both inline and pre) are not processed, so you only need to escape closing tags.

Note

The backslash itself must be escaped like this: \\ (double backslash).

This will look pretty bad in real code, so escape by hand only when really needed. Prefer MarkdownMessageEntityParser.escape, the md tag, or another parse mode (like the HTML one provided by @mtcute/html-parser) instead.

In theory, you could escape every single non-markup character, but why would you want to do that 😜

Code                     Result (visual)    Result (as HTML)
\_\_not italic\_\_       __not italic__     __not italic__
__italic \_ text__       italic _ text      <i>italic _ text</i>
`__not italic__`         __not italic__     <code>__not italic__</code>
C:\\Users\\Guest         C:\Users\Guest     C:\Users\Guest
`var a = \`hello\``      var a = `hello`    <code>var a = `hello`</code>
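
As a rough illustration of what escaping user-provided text involves, the sketch below prepends a backslash to every character used by the tags described in this document. It is not the implementation of MarkdownMessageEntityParser.escape; `escapeMarkdown` is a hypothetical helper and the exact character set is an assumption derived from the syntax above.

```typescript
// Characters that can start or end an entity in the syntax above,
// plus the backslash itself.
const MARKUP_CHARS = /[*_\-~|`[\]\\]/g

// Hypothetical helper: makes arbitrary text safe to embed in markup.
function escapeMarkdown(text: string): string {
    // `$&` inserts the matched character after the added backslash.
    return text.replace(MARKUP_CHARS, '\\$&')
}
```

Applied to the examples in the table, `escapeMarkdown('__not italic__')` produces `\_\_not italic\_\_`, and a Windows path such as `C:\Users\Guest` becomes `C:\\Users\\Guest`.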