
javascript - Tokenizing a string into words that include non-word characters

Reposted. Author: 行者123. Updated: 2023-11-30 18:03:55

I want to tokenize Twitter messages so that hashtags and cashtags are kept intact. A correct tokenization would look like this:

"Bought $AAPL today,because of the new #iphone".match(...);
>>>> ['Bought', '$AAPL', 'today', 'because', 'of', 'the', 'new', '#iphone']

I have tried several regular expressions for this task, namely:

"Bought $AAPL today,because of the new #iphone".match(/\b([\w]+?)\b/g);
>>>> ['Bought', 'AAPL', 'today', 'because', 'of', 'the', 'new', 'iphone']

"Bought $AAPL today,because of the new #iphone".match(/\b([\$#\w]+?)\b/g);
>>>> ['Bought', 'AAPL', 'today', 'because', 'of', 'the', 'new', 'iphone']

"Bought $AAPL today,because of the new #iphone".match(/[\b^#\$]([\w]+?)\b/g);
>>>> ['$AAPL', '#iphone']

Which regular expression can I use so that the leading hash or dollar sign is included in the token?

Best answer

How about the obvious

"Bought $AAPL today,because of the new #iphone".match(/[$#]*\w+/g)
// ["Bought", "$AAPL", "today", "because", "of", "the", "new", "#iphone"]

?
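For context on why the attempts in the question drop the symbol, here is a small sketch of how \b and character classes behave. The variable names are only illustrative and are not from the original post.

```javascript
const text = "Bought $AAPL today";

// \b matches only between a word character (\w) and a non-word character.
// The space and "$" are both non-word characters, so no boundary precedes "$".
const boundaryBeforeDollar = /\b\$/.test(text); // false

// "$" followed by "A" is a non-word/word transition, so a boundary follows "$".
const boundaryAfterDollar = /\$\b/.test(text); // true

// Inside a character class, \b means the backspace character (U+0008),
// and a "^" that is not the first character is a literal caret.
const classMatchesBackspace = /[\b^#$]/.test("\u0008"); // true
const classMatchesCaret = /[\b^#$]/.test("^"); // true
```

This is why /\b([\$#\w]+?)\b/g can only start matching at "A", and why /[\b^#\$]([\w]+?)\b/g finds only the tokens that begin with "$" or "#". Dropping \b and making the prefix optional, as in the pattern above, avoids both problems.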

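PS: [$#]* accepts any number of leading symbols. You may want [$#]? instead, which accepts at most one; the exact requirements are unclear.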
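A minimal sketch of the difference between the two quantifiers, using a made-up input with doubled symbols (the sample string and variable names are assumptions for illustration only):

```javascript
// Hypothetical input with repeated prefix symbols.
const tweet = "Bought $$AAPL today,because of the new ##iphone";

// [$#]* keeps the whole run of leading symbols on the token.
const anyPrefix = tweet.match(/[$#]*\w+/g);
// ["Bought", "$$AAPL", "today", "because", "of", "the", "new", "##iphone"]

// [$#]? keeps at most one symbol; the extra leading symbol is skipped.
const singlePrefix = tweet.match(/[$#]?\w+/g);
// ["Bought", "$AAPL", "today", "because", "of", "the", "new", "#iphone"]
```

For ordinary input such as the tweet in the question, both patterns give the same result, so the choice only matters if repeated symbols can occur.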

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/16268337/
