support \(...\) and \[...\] style math formula #4186

Merged
merged 1 commit into from
Mar 25, 2024

Conversation

MrrDrr
Contributor

@MrrDrr MrrDrr commented Mar 1, 2024

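Fixes the rendering problem that occurs when GPT replies write formulas in the \(...\) and \[...\] formats: #3436 (comment)

This was discussed in remark-math, but the conclusion was that this format will not be supported: remarkjs/remark-math#39
I tried replacing rehype-katex with rehype-mathjax, but that did not render correctly either.

In the end I followed the implementation here and replaced the \(...\) and \[...\] delimiters with dollar signs:
danny-avila/LibreChat#1585
https://github.com/danny-avila/LibreChat/blob/v0.6.10/client/src/utils/latex.ts#L36

Result with this PR:
(image: mean squared error formula)

Result before this PR:
(image: model evaluation)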

@H0llyW00dzZ
Contributor

H0llyW00dzZ commented Mar 1, 2024

That does nothing to help with or fix the issue related to this:

@MrrDrr
Contributor Author

MrrDrr commented Mar 2, 2024

Hi,
The intention of this PR is to address #3436.

The issues you mentioned were caused by a dollar sign immediately followed by a digit, which triggered the escapeDollarNumber function and caused the formulas to be treated as plain text. However, OpenAI has recently made adjustments: the model now uses \(...\) and \[...\] to denote formulas instead of dollar signs, which avoids the issues you mentioned. This new \(...\) and \[...\] style introduces problems of its own, which this PR aims to address.
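
To make the dollar-sign problem concrete, here is a minimal sketch that runs the escapeDollarNumber function from markdown.tsx (as listed later in this thread) on two sample strings. The sample inputs are made up for illustration and do not come from the linked issue.

```typescript
// escapeDollarNumber as it appears in the markdown.tsx listing in this thread.
function escapeDollarNumber(text: string): string {
  let escapedText = "";

  for (let i = 0; i < text.length; i += 1) {
    let char = text[i];
    const nextChar = text[i + 1] || " ";

    // A "$" directly followed by a digit is treated as a currency amount.
    if (char === "$" && nextChar >= "0" && nextChar <= "9") {
      char = "\\$";
    }

    escapedText += char;
  }

  return escapedText;
}

// Illustrative inputs (assumptions, not taken from the issue).
const price = escapeDollarNumber("It costs $5.");
const formula = escapeDollarNumber("$2x + 1 = 7$");

console.log(price);   // It costs \$5.
console.log(formula); // \$2x + 1 = 7$
```

The second result shows the failure mode: the opening delimiter of a formula that starts with a digit is escaped, so the math plugin no longer sees a matching pair of dollar signs.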

This PR will not introduce additional bugs. Content between \(...\) and \[...\] can be confidently identified as a formula: if the content between the brackets were not a formula, OpenAI would not prepend "\" to the brackets, and the replacement in this PR would not be triggered. This PR also skips content in code blocks to avoid mistakenly replacing brackets there.

Here are the results of my attempts to reproduce the issues you mentioned; everything renders fine. OpenAI now occasionally writes formulas as plain text, in which case no LaTeX rendering is involved.
(image: mass-energy equivalence (1))
(image: physics formulas)
(image: converting an integer to an unsigned integer)
(image: series converging to zero)
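
The behaviour described in this comment can be checked with a small sketch. It uses the escapeBrackets function quoted later in this thread, with one cosmetic change: the triple backtick in the pattern is written as `{3}, which matches the same text. The sample strings are illustrative assumptions.

```typescript
// escapeBrackets as quoted in the review thread; `{3} is equivalent to a
// literal triple backtick and is used here only for readability.
function escapeBrackets(text: string): string {
  const pattern =
    /(`{3}[\s\S]*?`{3}|`.*?`)|\\\[([\s\S]*?[^\\])\\\]|\\\((.*?)\\\)/g;
  return text.replace(
    pattern,
    (match, codeBlock, squareBracket, roundBracket) => {
      if (codeBlock) {
        // Code spans and fenced blocks are returned untouched.
        return codeBlock;
      } else if (squareBracket) {
        // \[...\] becomes display math.
        return `$$${squareBracket}$$`;
      } else if (roundBracket) {
        // \(...\) becomes inline math.
        return `$${roundBracket}$`;
      }
      return match;
    },
  );
}

// Illustrative inputs (assumptions).
const inline = escapeBrackets("Energy: \\(E = mc^2\\)");
const display = escapeBrackets("\\[x^2\\]");
const inCode = escapeBrackets("Use `\\(a\\)` literally");
const plain = escapeBrackets("f(x) and a[0]");

console.log(inline);  // Energy: $E = mc^2$
console.log(display); // $$x^2$$
console.log(inCode);  // Use `\(a\)` literally
console.log(plain);   // f(x) and a[0]
```

Only backslash-prefixed brackets outside code are rewritten; ordinary parentheses and square brackets pass through unchanged.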

@H0llyW00dzZ
Contributor

It's really bad if OpenAI literally uses \(...\) and \[...\] for LaTeX regardless of the prompt, lmao. They have no idea how hard that pattern is to handle in Markdown.

@H0llyW00dzZ
Contributor

As an example of how hard that pattern is to handle:
if you use a regex just to handle it, you can end up hurting performance.

@H0llyW00dzZ
Contributor

Note

I am not going to cherry-pick this into my forks because it seems unreasonable in this scenario. Trying to handle such a flawed pattern could lead to bugs or degrade performance.

@MrrDrr
Contributor Author

MrrDrr commented Mar 2, 2024

Haha, Markdown and OpenAI make life very difficult. However, there doesn't seem to be a performance bottleneck: I tested it on both a computer and a mobile phone without noticing any additional latency. The previous issue did not occur either, so this may already be the most acceptable solution.

@H0llyW00dzZ
Contributor

It's literally a model issue, and they don't seem to realize how bad their model is at handling LaTeX in Markdown.

@MrrDrr
Contributor Author

MrrDrr commented Mar 5, 2024

This change will not affect the rendering of ordinary brackets.

(image: QQ image 20240305150233)

@daiaji

daiaji commented Mar 9, 2024

Interestingly, the code of markdown.tsx itself also seems to run into a Markdown rendering issue.

import ReactMarkdown from "react-markdown";
import "katex/dist/katex.min.css";
import RemarkMath from "remark-math";
import RemarkBreaks from "remark-breaks";
import RehypeKatex from "rehype-katex";
import RemarkGfm from "remark-gfm";
import RehypeHighlight from "rehype-highlight";
import { useRef, useState, RefObject, useEffect, useMemo } from "react";
import { copyToClipboard } from "../utils";
import mermaid from "mermaid";

import LoadingIcon from "../icons/three-dots.svg";
import React from "react";
import { useDebouncedCallback } from "use-debounce";
import { showImageModal } from "./ui-lib";

export function Mermaid(props: { code: string }) {
  const ref = useRef<HTMLDivElement>(null);
  const [hasError, setHasError] = useState(false);

  useEffect(() => {
    if (props.code && ref.current) {
      mermaid
        .run({
          nodes: [ref.current],
          suppressErrors: true,
        })
        .catch((e) => {
          setHasError(true);
          console.error("[Mermaid] ", e.message);
        });
    }
    // eslint-disable-next-line react-hooks/exhaustive-deps
  }, [props.code]);

  function viewSvgInNewWindow() {
    const svg = ref.current?.querySelector("svg");
    if (!svg) return;
    const text = new XMLSerializer().serializeToString(svg);
    const blob = new Blob([text], { type: "image/svg+xml" });
    showImageModal(URL.createObjectURL(blob));
  }

  if (hasError) {
    return null;
  }

  return (
    <div
      className="no-dark mermaid"
      style={{
        cursor: "pointer",
        overflow: "auto",
      }}
      ref={ref}
      onClick={() => viewSvgInNewWindow()}
    >
      {props.code}
    </div>
  );
}

export function PreCode(props: { children: any }) {
  const ref = useRef<HTMLPreElement>(null);
  const refText = ref.current?.innerText;
  const [mermaidCode, setMermaidCode] = useState("");

  const renderMermaid = useDebouncedCallback(() => {
    if (!ref.current) return;
    const mermaidDom = ref.current.querySelector("code.language-mermaid");
    if (mermaidDom) {
      setMermaidCode((mermaidDom as HTMLElement).innerText);
    }
  }, 600);

  useEffect(() => {
    setTimeout(renderMermaid, 1);
    // eslint-disable-next-line react-hooks/exhaustive-deps
  }, [refText]);

  return (
    <>
      {mermaidCode.length > 0 && (
        <Mermaid code={mermaidCode} key={mermaidCode} />
      )}
      <pre ref={ref}>
        <span
          className="copy-code-button"
          onClick={() => {
            if (ref.current) {
              const code = ref.current.innerText;
              copyToClipboard(code);
            }
          }}
        ></span>
        {props.children}
      </pre>
    </>
  );
}

function escapeDollarNumber(text: string) {
  let escapedText = "";

  for (let i = 0; i < text.length; i += 1) {
    let char = text[i];
    const nextChar = text[i + 1] || " ";

    if (char === "$" && nextChar >= "0" && nextChar <= "9") {
      char = "\\$";
    }

    escapedText += char;
  }

  return escapedText;
}

function escapeBrackets(text: string) {
  const pattern =
    /(```[\s\S]*?```|`.*?`)|\\\[([\s\S]*?[^\\])\\\]|\\\((.*?)\\\)/g;
  return text.replace(
    pattern,
    (match, codeBlock, squareBracket, roundBracket) => {
      if (codeBlock) {
        return codeBlock;
      } else if (squareBracket) {
        // \[...\] becomes display math: $$...$$
        return `$$${squareBracket}$$`;
      } else if (roundBracket) {
        // \(...\) becomes inline math: $...$
        return `$${roundBracket}$`;
      }
      return match;
    },
  );
}

function _MarkDownContent(props: { content: string }) {
  const escapedContent = useMemo(
    () => escapeBrackets(escapeDollarNumber(props.content)),
    [props.content],
  );

  return (
    <ReactMarkdown
      remarkPlugins={[RemarkMath, RemarkGfm, RemarkBreaks]}
      rehypePlugins={[
        RehypeKatex,
        [
          RehypeHighlight,
          {
            detect: false,
            ignoreMissing: true,
          },
        ],
      ]}
      components={{
        pre: PreCode,
        p: (pProps) => <p {...pProps} dir="auto" />,
        a: (aProps) => {
          const href = aProps.href || "";
          const isInternal = /^\/#/i.test(href);
          const target = isInternal ? "_self" : aProps.target ?? "_blank";
          return <a {...aProps} target={target} />;
        },
      }}
    >
      {escapedContent}
    </ReactMarkdown>
  );
}

export const MarkdownContent = React.memo(_MarkDownContent);

export function Markdown(
  props: {
    content: string;
    loading?: boolean;
    fontSize?: number;
    parentRef?: RefObject<HTMLDivElement>;
    defaultShow?: boolean;
  } & React.DOMAttributes<HTMLDivElement>,
) {
  const mdRef = useRef<HTMLDivElement>(null);

  return (
    <div
      className="markdown-body"
      style={{
        fontSize: `${props.fontSize ?? 14}px`,
      }}
      ref={mdRef}
      onContextMenu={props.onContextMenu}
      onDoubleClickCapture={props.onDoubleClickCapture}
      dir="auto"
    >
      {props.loading ? (
        <LoadingIcon />
      ) : (
        <MarkdownContent content={props.content} />
      )}
    </div>
  );
}

@daiaji

daiaji commented Mar 10, 2024

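#4230
This is the same thing discussed here. What do you think?
There probably isn't an off-the-shelf Node.js package that fixes the current LaTeX rendering failures, is there?
quantizor/markdown-to-jsx#485 (comment)
The solution proposed there is to define the syntax under which LaTeX rendering takes effect. I think it is a little better than my solution, but not by much: at least it avoids blind string replacement and defines the LaTeX syntax directly. Still, it seems no solution can get around complex regexes and nested logic.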

@MrrDrr
Contributor Author

MrrDrr commented Mar 10, 2024

Hello,
I saw your PR before submitting mine.
We are not solving the same problem. The current logic escapes every "dollar sign + digit" pattern; your PR escapes "dollar sign + digit" only outside code blocks.

My PR has nothing to do with the "dollar sign + digit" pattern. It fixes formula rendering for the \(...\) and \[...\] styles.
Using dollar signs to delimit formulas is actually quite problematic: the dollar sign is so common that it easily conflicts with other uses. OpenAI has recognized this too, so GPT now uses the \(...\) and \[...\] styles by default.

If GPT uses this style for all formulas from now on, there will be far fewer problems, because backslash + bracket sequences are rare and almost never appear in other contexts, so conflicts are unlikely. However, I have not found a formula renderer that supports this style. OpenAI's own client also renders with rehype-katex, but they have presumably modified the code somewhere. So I had to resort to the direct replacement used in this PR. The most elegant fix would of course be a formula renderer that natively supports the \(...\) and \[...\] styles.

I have been using this PR myself for a while and have not found any problems.
This issue is probably caused by the prompt insisting on LaTeX format, which made GPT output dollar signs and triggered the old problem. Without that emphasis there is no problem.

(image: WeChat screenshot_20240310105124)
(image: WeChat screenshot_20240310105143)
(image: WeChat screenshot_20240310105319)

@daiaji

daiaji commented Mar 10, 2024

Dropping $ altogether is also a solution. Using $ really is problematic, and dropping it would be friendlier to English-speaking users and programmers alike, but giving up current writing habits will probably be hard too.

Working on $ compatibility nearly fried my brain.

@Dean-YZG Dean-YZG closed this Mar 25, 2024
@Dean-YZG Dean-YZG reopened this Mar 25, 2024
@Dean-YZG Dean-YZG closed this Mar 25, 2024
@Dean-YZG Dean-YZG reopened this Mar 25, 2024
 function _MarkDownContent(props: { content: string }) {
   const escapedContent = useMemo(
-    () => escapeDollarNumber(props.content),
+    () => escapeBrackets(escapeDollarNumber(props.content)),
Contributor

The escapeDollarNumber function was originally meant to resolve LaTeX syntax conflicts, but unfortunately it failed to do so. It can therefore be modified directly, rather than wrapped by another function.

@@ -116,9 +116,27 @@ function escapeDollarNumber(text: string) {
   return escapedText;
 }

+function escapeBrackets(text: string) {
+  const pattern =
+    /(```[\s\S]*?```|`.*?`)|\\\[([\s\S]*?[^\\])\\\]|\\\((.*?)\\\)/g;
Contributor

The execution performance of this block of code is a bit poor.

Contributor

That's no wonder: regular expressions often perform poorly, especially in interpreted languages, unlike in compiled languages (e.g., regex in Go, which performs better because it is compiled).

Contributor Author

At first I was also worried about the performance, but later I found that this worry was unnecessary.
I used the following code to log the time consumed for each call, and the results showed that the function is fast enough.

function escapeBrackets(text: string) {
  let begin_time = performance.now();
  const pattern =
    /(```[\s\S]*?```|`.*?`)|\\\[([\s\S]*?[^\\])\\\]|\\\((.*?)\\\)/g;
  let res = text.replace(
    pattern,
    (match, codeBlock, squareBracket, roundBracket) => {
      if (codeBlock) {
        return codeBlock;
      } else if (squareBracket) {
        return `$$${squareBracket}$$`;
      } else if (roundBracket) {
        return `$${roundBracket}$`;
      }
      return match;
    },
  );
  let endTime = performance.now();
  console.log(`escapeBrackets, string length=${text.length}, time consumed=${endTime - begin_time} ms`);
  return res;
}

(image: WeChat screenshot_20240326163806)
(image: linear regression model and evaluation metrics)

Contributor Author

Modern js engines compile and cache regexp at load time, so this function will not recompile it every time
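
Whether a given engine caches a regex literal is an implementation detail. If recompilation is a concern, one option is to hoist the pattern to module scope so that the RegExp object is created once. The sketch below is not part of this PR; the names BRACKET_PATTERN and escapeBracketsHoisted are hypothetical, and the triple backtick is written as `{3}, which matches the same text.

```typescript
// Hypothetical module-level constant (not in the PR): created once at load time.
const BRACKET_PATTERN =
  /(`{3}[\s\S]*?`{3}|`.*?`)|\\\[([\s\S]*?[^\\])\\\]|\\\((.*?)\\\)/g;

// Hypothetical variant of escapeBrackets that reuses the shared pattern.
function escapeBracketsHoisted(text: string): string {
  // String.prototype.replace resets lastIndex to 0 before scanning with a
  // global regex, so sharing one RegExp object across calls is safe here.
  return text.replace(
    BRACKET_PATTERN,
    (match, codeBlock, squareBracket, roundBracket) => {
      if (codeBlock) return codeBlock;
      if (squareBracket) return `$$${squareBracket}$$`;
      if (roundBracket) return `$${roundBracket}$`;
      return match;
    },
  );
}

// Repeated calls give the same result because no state leaks between them.
const first = escapeBracketsHoisted("\\(a\\) and \\[b\\]");
const second = escapeBracketsHoisted("\\(a\\) and \\[b\\]");
console.log(first === second, first); // true $a$ and $$b$$
```

The shared pattern is only safe with methods such as replace that reset lastIndex themselves; calling test or exec on the same global RegExp would carry state between calls.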

Contributor

At first I was also worried about the performance, but later I found that this worry was unnecessary. I used the following code to log the time consumed for each call, and the results showed that the function is fast enough.

It only looks fast because the call is wrapped in React's useMemo.

image

Contributor

Modern js engines compile and cache regexp at load time, so this function will not recompile it every time

That more likely depends on the front-end web framework (e.g., React).

@Dean-YZG Dean-YZG merged commit a4e4286 into ChatGPTNextWeb:main Mar 25, 2024
5 of 6 checks passed
@MrrDrr MrrDrr deleted the formula_rendering branch April 10, 2024 12:51
gaogao1030 pushed a commit to gaogao1030/ChatGPT-Next-Web that referenced this pull request May 16, 2024
support \(...\) and \[...\] style math formula
5 participants