
javascript - Sentence-case a passage of text while ignoring the HTML elements inside it

Reposted. Author: 行者123. Updated: 2023-11-29 16:46:24

At the moment I pass a passage of text to the following function to make sure the first letter of every sentence is capitalised.

function sentenceCase(string) {
  var n = string.split(".");
  var vfinal = "";
  for (var i = 0; i < n.length; i++) {
    var spaceput = "";
    // count the leading whitespace so it can be restored afterwards
    var spaceCount = n[i].replace(/^(\s*).*$/, "$1").length;
    n[i] = n[i].replace(/^\s+/, "");
    // upper-case the first character of the sentence
    var newstring = n[i].charAt(0).toUpperCase() + n[i].slice(1);
    for (var j = 0; j < spaceCount; j++) spaceput = spaceput + " ";
    vfinal = vfinal + spaceput + newstring + ".";
  }
  // drop the extra "." appended after the last segment
  vfinal = vfinal.substring(0, vfinal.length - 1);
  return vfinal;
}

This works well when the text contains no elements, so every sentence start is a plain letter that can be capitalised.

var str1 = 'he always has a positive contribution to make to the class. in class, he behaves well, but he should aim to complete his homework a little more regularly.';
console.log(sentenceCase(str1));

Returns >>> He always has a positive contribution to make to the class. In class, he behaves well, but he should aim to complete his homework a little more regularly.

However, if a <span> element wraps the first word of a sentence, the function fails, as shown below.

var str2 = '<span class="pronoun subjective">he</span> always has a positive contribution to make to the class. in class, <span class="pronoun subjective">he</span> behaves well, but <span class="pronoun subjective">he</span> should aim to complete <span class="pronoun possessive">his</span> homework a little more regularly.'; 
console.log(sentenceCase(str2));

Returns >>> <span class="pronoun subjective">he</span> always has a positive contribution to make to the class. In class, <span class="pronoun subjective">he</span> behaves well, but <span class="pronoun subjective">he</span> should aim to complete <span class="pronoun possessive">his</span> homework a little more regularly.
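
The first sentence stays lowercase because the first character of the string is the "<" of the opening tag, not a letter. A minimal sketch of that failure, using a shortened sentence and variable names chosen here purely for illustration:

```javascript
// The markup starts with "<", which has no uppercase form,
// so upper-casing the first character leaves the string unchanged.
var sentence = '<span class="pronoun subjective">he</span> always has a positive contribution.';
var firstChar = sentence.charAt(0);
var unchanged = firstChar.toUpperCase() + sentence.slice(1);

console.log(firstChar);              // "<"
console.log(unchanged === sentence); // true
```

The second sentence in the output above is capitalised only because it happens to start with plain text ("in class") rather than with an element.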

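My regex skills are far from first-rate, so I am not sure where to go from here. Any advice on how to ignore the elements in the text while converting it to sentence case would be much appreciated.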

Edit: to clarify, the output should still contain the elements; they only need to be ignored when working out where to capitalise.

Best answer

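This is not a trivial problem. Doing it purely with regular expressions is a bad idea, because you can get lost in hairy edge cases and break the markup: JS regexps are simply not powerful enough to handle the full HTML syntax.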
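
To make the edge-case risk concrete, here is a rough sketch of a regex-only attempt. The function name `naiveSentenceCase` and the test strings are made up for this illustration and are not part of the original question or answer. It skips anything that looks like a tag at the start of a sentence, which works for simple markup but breaks as soon as an attribute value contains ">" or sentence punctuation:

```javascript
// Hypothetical regex-only approach: skip "tags" at a sentence start,
// then upper-case the first lowercase letter that follows.
function naiveSentenceCase(html) {
  return html.replace(/(^|[.?!]\s+)((?:<[^>]*>\s*)*)([a-z])/g, function (_, lead, tags, letter) {
    return lead + tags + letter.toUpperCase();
  });
}

// Simple markup: works as hoped.
var simple = naiveSentenceCase('<span class="x">he</span> went home. <b>she</b> stayed.');
console.log(simple); // <span class="x">He</span> went home. <b>She</b> stayed.

// A ">" inside a quoted attribute is valid HTML, but the regex treats it as the end of the tag.
var brokenTag = naiveSentenceCase('<span title="a > b">he</span> left.');
console.log(brokenTag); // <span title="a > B">he</span> left.

// Sentence punctuation inside an attribute value gets "capitalised" too.
var brokenAttr = naiveSentenceCase('go <a title="see. below">here</a>.');
console.log(brokenAttr); // Go <a title="see. Below">here</a>.
```

Each such case could be patched with a more elaborate pattern, but the list of special cases keeps growing, which is the point being made above.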

However, the browser already has a way to handle HTML: its own parser and the DOM.

var str2 = '<span class="pronoun subjective">he</span> always has a positive contribution to make to the class. in class, <span class="pronoun subjective">he</span> behaves well, but <span class="pronoun subjective">he</span> should aim to complete <span class="pronoun possessive">his</span> homework a little more regularly.';

function capitalise(html) {
  // let the browser's HTML parser turn the string into DOM nodes
  var div = document.createElement('div');
  div.innerHTML = html;

  // assume the start of the string is also the start of a sentence
  var boundary = true;

  // go through every text node
  var walker = document.createTreeWalker(div, NodeFilter.SHOW_TEXT);
  while (walker.nextNode()) {
    var node = walker.currentNode;
    var text = node.textContent;

    // if we are between sentences, capitalise the first letter
    if (boundary) {
      text = text.replace(/[a-z]/, function(letter) {
        return letter.toUpperCase();
      });
    }

    // capitalise after any sentence-ending punctuation inside the node
    text = text.replace(/([.?!]\s+)([a-z])/g, function(_, punct, letter) {
      return punct + letter.toUpperCase();
    });

    // if the current node ends in punctuation, we're back at a sentence boundary
    boundary = /[.?!]\s*$/.test(text);

    // change the current node's text
    node.textContent = text;
  }
  return div.innerHTML;
}

console.log(capitalise(str2));
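
The key idea is the boundary flag that carries over from one text node to the next. The sketch below isolates that logic so it can be tried without a DOM, for example in Node. It assumes the text nodes have already been extracted into an array of strings; `capitaliseSegments` and the sample segments are hypothetical names and data used only for this illustration:

```javascript
// Hypothetical helper: applies the same boundary rule to a list of
// strings, each standing in for the content of one text node.
function capitaliseSegments(segments) {
  var boundary = true; // the start of the text counts as a sentence start
  return segments.map(function (text) {
    if (boundary) {
      // first lowercase letter of a node that begins a sentence
      text = text.replace(/[a-z]/, function (letter) {
        return letter.toUpperCase();
      });
    }
    // letters that follow sentence-ending punctuation inside the node
    text = text.replace(/([.?!]\s+)([a-z])/g, function (_, punct, letter) {
      return punct + letter.toUpperCase();
    });
    // remember whether this node ends a sentence
    boundary = /[.?!]\s*$/.test(text);
    return text;
  });
}

var segments = ['he', ' went home. in class, ', 'he', ' behaves well. ', 'he', ' is kind.'];
var result = capitaliseSegments(segments);
console.log(result.join('|'));
// He| went home. In class, |he| behaves well. |He| is kind.
```

The second "he" stays lowercase because the preceding segment ends with a comma, while the third is capitalised because the segment before it ends with a full stop.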

Regarding "javascript - Sentence-case a passage of text while ignoring the HTML elements inside it", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40798194/
