
javascript - Regex/JavaScript: Split string to separate lines by max characters per line with looking n chars backwards for a possible whitespace?

Reposted. Author: 行者123. Updated: 2023-12-01 21:47:02

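This question is similar to "How to split a string at every n characters or to nearest previous space". However, contrary to what I expected from its title, the solution given there does not work if there is a single long word without any whitespace.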

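So I need a regex that splits a string into separate lines (multiple times if needed) by a maximum number of characters per line, looking n characters backwards for a possible whitespace (break there if one is found, otherwise break at the maximum length).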

Edit 1: For example, with a maximum line length of 30 characters and a backwards whitespace lookup of 15 characters:

Loremipsumissimplydummytextofthe printing and typesetting industry.

The first word of that sentence is 32 characters long, so the output should be:

Loremipsumissimplydummytextoft  # Line has length of 30 char
he printing and typesetting # Cut before the word at otherwise 30 char

So the first word should be force-cut after the 30th character, because it contains no whitespace.

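The remaining string has a length of 28 (or 29 with the dash) before the word "industry", so the 30th character falls inside a word. The solution therefore looks for the previous whitespace within the 15-character range, and the line is broken before the word "industry".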
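
To make the rule from Edit 1 concrete, here is a minimal non-regex sketch of it. It is not part of the original question or answer: the function name wrapWithLookback and its parameters are hypothetical, and it assumes that leading and trailing whitespace on each line can be dropped.

```javascript
// Hypothetical helper (not from the original post): wraps `text` at `maxLen`
// characters, looking back at most `lookback` characters for a whitespace.
function wrapWithLookback(text, maxLen, lookback) {
  const lines = [];
  let rest = text.trim();
  while (rest.length > maxLen) {
    let cut = maxLen; // default: forced cut at the maximum length
    // Scan backwards from the character just past the limit,
    // but no further back than `lookback` characters.
    for (let i = maxLen; i >= maxLen - lookback && i > 0; i--) {
      if (/\s/.test(rest[i])) {
        cut = i; // break at this whitespace instead
        break;
      }
    }
    lines.push(rest.slice(0, cut).trimEnd());
    rest = rest.slice(cut).trimStart();
  }
  if (rest.length > 0) lines.push(rest);
  return lines;
}

const example = "Loremipsumissimplydummytextofthe printing and typesetting industry.";
console.log(wrapWithLookback(example, 30, 15));
// → [ 'Loremipsumissimplydummytextoft', 'he printing and typesetting', 'industry.' ]
```

Unlike a pure regex approach, the loop enforces the lookback window explicitly: a whitespace that lies more than `lookback` characters before the limit is ignored and the line is cut at `maxLen`.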

Edit 2: A second text example:

Loremipsumissimplydummytextofthe printing and typesetting industry. Loremipsumis simply dummytext ofthe printing and typesetting industry. Loremipsumissimplydummytextofthe printing and typesetting industry. Loremipsumis simply dummytext ofthe printing and typesetting industry.


Loremipsumissimplydummytextoft
he printing and typesetting
industry. Loremipsumis simply
dummytext ofthe printing and
typesetting industry.
Loremipsumissimplydummytextoft
he printing and typesetting
industry. Loremipsumis simply
dummytext ofthe printing and
typesetting industry.


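Optional requirement: after the initial posting, when I added the example in Edit 1, I also added an optional requirement to add a dash "-" character where a word is cut at the maximum line length. I have now removed it from that example and added it here as a separate optional requirement.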


Loremipsumissimplydummytextoft-  # Line length 30+1 char with an appended dash
he printing and typesetting # Cut before the word at otherwise 30 char



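var s = "Loremipsumissimplydummytextofthe printing and typesetting industry. Loremipsumis simply dummytext ofthe printing and typesetting industry. Loremipsumissimplydummytextofthe printing and typesetting industry. Loremipsumis simply dummytext ofthe printing and typesetting industry.";
// Match either 30 non-whitespace chars (a long word to cut) or up to 30 chars not followed by a non-whitespace char.
var regex = /\s*(?:(\S{30})|([\s\S]{1,30})(?!\S))/g;
// Append "-\n" after a cut word and "\n" after an ordinary line.
var result = s.replace(regex, function($0,$1,$2) { return $1 ? $1 + "-\n" : $2 + "\n"; });
console.log(result);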


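  • \s* - zero or more whitespace characters.
  • (?: - start of a non-capturing group:
    • (\S{30}) - Group 1 (referenced as $1 in the callback): thirty (n) non-whitespace characters
    • | - or
    • ([\s\S]{1,30})(?!\S)) - Group 2 (referenced as $2 in the callback): any one to thirty (n) characters, as many as possible, that are not immediately followed by a non-whitespace character. The final ) closes the non-capturing group.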

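The function($0,$1,$2) { return $1 ? $1 + "-\n" : $2 + "\n"; } part means that if Group 1 matched (that is, we matched a very long word that is being split in two), the match is replaced with the Group 1 value followed by a hyphen and a newline. Otherwise, if Group 2 matched, the match is replaced with the Group 2 value followed by a newline.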
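
To see which alternative fires for each piece of text, the matches can be inspected directly. The following is only an illustrative sketch using the standard String.prototype.matchAll; the variable names sample, pattern, chunks and wrapped are made up for this example and do not appear in the original answer.

```javascript
const sample = "Loremipsumissimplydummytextofthe printing and typesetting industry.";
const pattern = /\s*(?:(\S{30})|([\s\S]{1,30})(?!\S))/g;

// One entry per match, recording which capture group participated.
const chunks = [...sample.matchAll(pattern)].map((m) => ({
  group: m[1] !== undefined ? 1 : 2,
  text: m[1] !== undefined ? m[1] : m[2],
}));
console.log(chunks);
// → group 1: "Loremipsumissimplydummytextoft"  (forced cut of a long word)
// → group 2: "he printing and typesetting"     (break at whitespace)
// → group 2: "industry."                       (remainder)

// Same decision as the replace callback: a dash only after a Group 1 match.
const wrapped = chunks
  .map((c) => (c.group === 1 ? c.text + "-" : c.text))
  .join("\n");
console.log(wrapped);
```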

An ES6+ compliant snippet:

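const text = "Loremipsumissimplydummytextofthe printing and typesetting industry. Loremipsumis simply dummytext ofthe printing and typesetting industry. Loremipsumissimplydummytextofthe printing and typesetting industry. Loremipsumis simply dummytext ofthe printing and typesetting industry.";
const lineMaxLen = 30; // Maximum number of characters per line
const wsLookup = 15; // Look backwards n characters for a whitespace
// Group 1: lineMaxLen non-whitespace chars; group 2: (lineMaxLen - wsLookup) to lineMaxLen chars not followed by a non-whitespace char.
const regex = new RegExp(String.raw`\s*(?:(\S{${lineMaxLen}})|([\s\S]{${lineMaxLen - wsLookup},${lineMaxLen}})(?!\S))`, 'g');
// Note: a trailing remainder shorter than (lineMaxLen - wsLookup) chars is not matched and is left unchanged.
const result = text.replace(regex, (_, x, y) => x ? `${x}-\n` : `${y}\n`);
console.log(result);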
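
If an array of lines is more convenient than a single string, the same pattern can be wrapped in a small function. This is a sketch under stated assumptions rather than part of the original answer: the name splitToLines and the dash option are hypothetical, it uses the {1,n} quantifier from the first snippet (so it does not enforce the lookback window), and it adds the dash only when the cut word really continues on the next line.

```javascript
// Hypothetical wrapper (name and options are illustrative, not from the post).
function splitToLines(text, lineMaxLen, { dash = true } = {}) {
  const pattern = new RegExp(
    String.raw`\s*(?:(\S{${lineMaxLen}})|([\s\S]{1,${lineMaxLen}})(?!\S))`,
    "g"
  );
  const lines = [];
  for (const m of text.matchAll(pattern)) {
    const [whole, cutWord, chunk] = m;
    if (cutWord !== undefined) {
      // Add the dash only if the word continues right after this match.
      const next = text[m.index + whole.length];
      const continues = next !== undefined && /\S/.test(next);
      lines.push(dash && continues ? `${cutWord}-` : cutWord);
    } else {
      const line = chunk.trim();
      if (line) lines.push(line); // skip whitespace-only trailing matches
    }
  }
  return lines;
}

const input = "Loremipsumissimplydummytextofthe printing and typesetting industry.";
console.log(splitToLines(input, 30));
// → [ 'Loremipsumissimplydummytextoft-', 'he printing and typesetting', 'industry.' ]
console.log(splitToLines(input, 30, { dash: false }));
// → [ 'Loremipsumissimplydummytextoft', 'he printing and typesetting', 'industry.' ]
```

Checking the character after the match avoids appending a dash to a word that is exactly lineMaxLen characters long and is therefore not actually cut.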
