# @mtcute/markdown-parser

> Markdown entities parser for MTCute

This package implements formatting syntax similar to Markdown (CommonMark), but slightly adjusted and simplified.

> **NOTE**: The syntax implemented here is **not** compatible with Bot API _Markdown_, nor _MarkdownV2_.
>
> Please read [Syntax](#syntax) below for a detailed explanation.

## Usage

```typescript
import { TelegramClient } from '@mtcute/client'
import { MarkdownMessageEntityParser, md } from '@mtcute/markdown-parser'

const tg = new TelegramClient({ ... })
// Register the parser so the client can handle Markdown-formatted text
tg.registerParseMode(new MarkdownMessageEntityParser())

// getUpdatesFromFeed() stands in for your own code
tg.sendText(
    'me',
    md`Hello, **me**! Updates from the feed:\n${await getUpdatesFromFeed()}`
)
```

## Syntax

### Inline entities

Inline entities are defined by a _tag_ surrounding some text; processing them simply strips the tag.

Supported entities:

- Bold, tag is `**`
- Italic, tag is `__`
- Underline, tag is `--` (_NON-STANDARD_)
- Strikethrough, tag is `~~`
- Code (monospaced font), tag is <code>`</code>
  - Note that escaping text works differently inside code, see [Escaping](#escaping) below.

> Unlike CommonMark, we use the symbol itself and not its count.
> Thus, `*` (asterisk) is **always** used for bold,
> and `_` (underscore) is **always** used for italic.
> The tag must still be doubled: a single `*` or `_` is left as plain text.
>
> This eliminates a lot of confusion, like: `_bold_` → _bold_, `**italic**` → **italic**

| Code | Result (visual) | Result (as HTML)
|---|---|---|
| `**bold**` | **bold** | `<b>bold</b>`
| `__italic__` | _italic_ | `<i>italic</i>`
| `--underline--` | <u>underline</u> | `<u>underline</u>`
| `~~strikethrough~~` | ~~strikethrough~~ | `<s>strikethrough</s>`
| `*whatever*` | \*whatever\* | `*whatever*`
| `_whatever_` | \_whatever\_ | `_whatever_`
| <code>\`hello world\`</code> | `hello world` | `<code>hello world</code>`
| <code>\`__text__\`</code> | `__text__` | `<code>__text__</code>`

### Pre

Pre represents a single block of code, optionally with a language.

This entity starts with <code>\`\`\`</code> (triple backtick), optionally followed by a language name, and must then be
followed by a line break. It ends with <code>\`\`\`</code> (triple backtick), optionally preceded by a line break.

| Code | Result (visual) | Result (as HTML)
|---|---|---|
| <pre><code>\`\`\`<br>hello<br>\`\`\`</code></pre> | `hello` | `<pre>hello</pre>`
| <pre><code>\`\`\`<br>hello\`\`\`</code></pre> | `hello` | `<pre>hello</pre>`
| <pre><code>\`\`\`javascript<br>const a = ``<br>\`\`\`</code></pre> | <code>const a = ``</code> | <pre><code>&lt;pre language="javascript"&gt;<br> const a = ``<br>&lt;/pre&gt;</code></pre>
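
The delimiter rules above can be sketched as a single regular expression. This is only an illustration under simplifying assumptions, not the parser's actual implementation: `parsePre` and `PreBlock` are hypothetical names, escaping is ignored, and the block is assumed to start at the beginning of the input. The backtick is written as `\x60` so that the example does not contain a literal triple backtick.

```typescript
interface PreBlock {
    language: string // empty string when no language is given
    code: string
}

// Hypothetical helper, not part of @mtcute/markdown-parser.
// Opening fence, optional language name, mandatory line break,
// then the code, an optional line break, and the closing fence.
// \x60 is the backtick character.
const PRE_RE = /^\x60{3}([^\n\x60]*)\n([\s\S]*?)\n?\x60{3}/

function parsePre(input: string): PreBlock | null {
    const match = PRE_RE.exec(input)
    if (match === null) return null
    return { language: match[1], code: match[2] }
}
```

Because the line break after the opening fence is mandatory in this sketch, input without one is not recognized as a pre block, while the line break before the closing fence may be omitted.
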
### Links

Links are parsed exactly as in standard Markdown, except that reference-style links are not supported.

Defined like this: `[Link text](https://example.com)`.

- Link text may contain any formatting, but a link cannot contain other links.
- `[` (opening square bracket) inside link text is treated as a normal character.

A Markdown-style link can also be used to define a name mention like this: `[Name](tg://user?id=1234567)`,
where `1234567` is the ID of the user you want to mention.

> **Note**: It is up to the client to look up the user's input entity by ID.
> In most cases, you can only use IDs of users that were seen by the client while using the given storage.
>
> Alternatively, you can explicitly provide the access hash like this: `[Name](tg://user?id=1234567&hash=abc)`,
> where `abc` is the user's access hash written as a base-16 *unsigned* integer.
> The order of the parameters matters, i.e. `tg://user?hash=abc&id=1234567` will not be processed as expected.

| Code | Result (visual) | Result (as HTML)
|---|---|---|
| `[Google](https://google.com)` | [Google](https://google.com) | `<a href="https://google.com">Google</a>`
| `[__Google__](https://google.com)` | [_Google_](https://google.com) | `<a href="https://google.com"><i>Google</i></a>`
| `[empty link]()` | empty link | `empty link`
| `[empty link]` | [empty link] | `[empty link]`
| `[User](tg://user?id=1234567)` | N/A | N/A
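
As an illustration of the mention syntax, the sketch below builds such a link by hand. Both functions are hypothetical helpers and not part of this package. It assumes the 64-bit access hash is available as two signed 32-bit halves (a common representation in 64-bit integer libraries), and that `name` contains no characters that need escaping.

```typescript
// Hypothetical helper, not part of @mtcute/markdown-parser.
// Converts a 64-bit access hash, given as two 32-bit halves,
// into its unsigned base-16 representation.
function accessHashToHex(high: number, low: number): string {
    // `>>> 0` reinterprets a 32-bit value as unsigned.
    const lowHex = (low >>> 0).toString(16)
    const highHex = (high >>> 0).toString(16)
    if (highHex === '0') return lowHex
    // Left-pad the low half to 8 hex digits so the halves concatenate correctly.
    return highHex + ('00000000' + lowHex).slice(-8)
}

// Hypothetical helper: builds a name mention link.
function mentionLink(name: string, userId: number, hashHex?: string): string {
    // `id` must come before `hash`, as noted above.
    const query = hashHex === undefined ? 'id=' + userId : 'id=' + userId + '&hash=' + hashHex
    return '[' + name + '](tg://user?' + query + ')'
}
```

For example, `mentionLink('Name', 1234567, accessHashToHex(0, 0xabc))` produces the same link as the one shown in the note above.
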
### Nested and overlapping entities

A powerful feature of this parser is its ability to process overlapping entities. Only inline entities (except
code) can overlap.

Since inline entities are defined only by their tag, and nesting an entity inside itself doesn't make sense, you can think of the
tags simply as start/end markers rather than in terms of nesting.

| Code | Result (visual) | Result (as HTML)
|---|---|---|
| `**Welcome back, __User__!**` | **Welcome back, _User_!** | `<b>Welcome back, <i>User</i>!</b>`
| `**bold __and** italic__` | **bold _and_** _italic_ | `<b>bold <i>and</i></b><i> italic</i>`
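
The "start/end markers" view can be sketched as follows. This is a simplified illustration, not the parser's actual code: `parseInline`, `Entity` and `TAGS` are hypothetical names, and the sketch ignores escaping, code, pre and links. Each tag toggles its entity on or off, so overlapping ranges need no special handling. A tag that is never closed is stripped without producing an entity, which is a choice made for this sketch only.

```typescript
type EntityType = 'bold' | 'italic' | 'underline' | 'strike'

interface Entity {
    type: EntityType
    offset: number // position in the output text
    length: number
}

// Tag → entity type, taken from the list of inline entities.
const TAGS: { [tag: string]: EntityType } = {
    '**': 'bold',
    '__': 'italic',
    '--': 'underline',
    '~~': 'strike',
}

// Hypothetical helper, not part of @mtcute/markdown-parser.
function parseInline(input: string): { text: string; entities: Entity[] } {
    let text = ''
    const entities: Entity[] = []
    // Output offset at which each currently open tag started.
    const open: { [tag: string]: number } = {}

    let i = 0
    while (i < input.length) {
        const tag = input.slice(i, i + 2)
        if (Object.prototype.hasOwnProperty.call(TAGS, tag)) {
            if (Object.prototype.hasOwnProperty.call(open, tag)) {
                // Second occurrence of the tag: close the entity.
                entities.push({ type: TAGS[tag], offset: open[tag], length: text.length - open[tag] })
                delete open[tag]
            } else {
                // First occurrence: remember where the entity starts.
                open[tag] = text.length
            }
            i += 2
        } else {
            text += input.charAt(i)
            i += 1
        }
    }
    return { text, entities }
}
```

For `**bold __and** italic__` this yields the text `bold and italic` with a bold entity covering `bold and` and an italic entity covering `and italic`, matching the second row of the table.
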
## Escaping

Often, you may want to escape text so that it is not processed as an entity.

To escape any character, prepend it with ` \ ` (backslash). Escaped characters are added to output as-is.

Inline entities and links inside code entities (both inline and pre) are not processed, so you only need to escape
closing tags.

Also, in JavaScript (and many other languages) ` \ ` itself must be escaped, so you'll end up with strings
like `"\\_\\_not italic\\_\\_"`.

> **Note**: backslash itself must be escaped like this: ` \\ ` (double backslash).
>
> This will look pretty bad in real code, so use manual escaping only when really needed, and otherwise use
> [`MarkdownMessageEntityParser.escape`](./classes/markdownmessageentityparser.html#escape), `md`, or
> another parse mode (like the HTML one provided by [`@mtcute/html-parser`](../html-parser/index.html)) instead.

> In theory, you could escape every single non-markup character, but why would you want to do that 😜

| Code | Result (visual) | Result (as HTML)
|---|---|---|
| `\_\_not italic\_\_` | \_\_not italic\_\_ | `__not italic__`
| `__italic \_ text__` | _italic \_ text_ | `<i>italic _ text</i>`
| <code>\`__not italic__\`</code> | `__not italic__` | `<code>__not italic__</code>`
| <code>C:\\\\Users\\\\Guest</code> | C:\Users\Guest | `C:\Users\Guest`
| <code>\`var a = \\\`hello\\\`\`</code> | <code>var a = \`hello\`</code> | <code>&lt;code&gt;var a = \`hello\`&lt;/code&gt;</code>
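
To make the escaping rule concrete, here is a minimal sketch of an escaping function. `escapeMarkdown` is a hypothetical helper written for this example and may behave differently from `MarkdownMessageEntityParser.escape`, which should be preferred in real code. It assumes that escaping the backslash and every character used by a tag or a link is sufficient, and it escapes them even where that is not strictly needed, which is harmless because escaped characters are added to the output as-is.

```typescript
// Hypothetical helper for illustration only.
function escapeMarkdown(text: string): string {
    // Prefix the backslash and every character used by a tag or a link
    // with a backslash. `$&` inserts the matched character.
    return text.replace(/[\\*_\-~`\[\]]/g, '\\$&')
}
```

Note how the Windows path from the table needs four backslashes per separator once it is written as a JavaScript string literal: `escapeMarkdown('C:\\Users\\Guest')` returns the string that would be written as `'C:\\\\Users\\\\Guest'`.
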