
ruby-on-rails - Regex for stripping non-alphabetic and non-numeric characters

Reposted. Author: 太空宇宙. Updated: 2023-11-03 17:34:33

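Programming newbie here. In Ruby, how would I strip the non-alphabetic and non-numeric characters from the following string, and then split it on whitespace into an array?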

Example:

string = "Honey - a sweet, sticky, yellow fluid made by bees and other insects from nectar collected from flowers."

into this:

tokenized_string = ["Honey", "a", "sweet", "sticky", "yellow", "fluid", "made", "by", "bees", "and", "other", "insects", "from", "nectar", "collected", "from", "flowers"]

Any help would be greatly appreciated!

Best answer

I would use:

string = "Honey - a sweet, sticky, yellow fluid made by bees and other insects from nectar collected from flowers."
string.delete('^A-Za-z0-9 ').split
# => ["Honey",
# "a",
# "sweet",
# "sticky",
# "yellow",
# "fluid",
# "made",
# "by",
# "bees",
# "and",
# "other",
# "insects",
# "from",
# "nectar",
# "collected",
# "from",
# "flowers"]
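To unpack why that one-liner works, here is a minimal sketch on a shortened version of the sentence. The `scan` variant at the end is an alternative added for comparison, not something the answer proposes, and the variable names are illustrative.

```ruby
string = "Honey - a sweet, sticky fluid made by bees."

# String#delete takes tr-style selectors; a leading ^ negates the set,
# so this removes every character that is NOT a letter, digit, or space.
cleaned = string.delete('^A-Za-z0-9 ')

# split with no argument splits on runs of whitespace and drops empty
# fields, so the double space left behind by " - " yields no empty token.
tokens = cleaned.split

# Alternative (an assumption, not from the answer): collect the runs of
# wanted characters directly instead of deleting the unwanted ones.
scanned = string.scan(/[A-Za-z0-9]+/)

p tokens
p scanned
```

The two approaches agree on this input but not in general: `delete` joins the pieces around a removed character ("don't" becomes "dont"), while `scan` splits there ("don" and "t"). Which behavior is preferable depends on the text being tokenized.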

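If you are trying to remove everything except alphanumerics, you can't use the \w character class, because it is defined as [A-Za-z0-9_], which lets _ slip through. Here is an example: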

'foo_BAR12'[/\w+/] # => "foo_BAR12"

This matches the entire string, including the _.

'foo_BAR12'[/[A-Za-z0-9]+/] # => "foo"

This stops at the _ because the class [A-Za-z0-9] does not include it.
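The same difference shows up when tokenizing a whole string rather than taking the first match. A small sketch, using a made-up input string:

```ruby
text = "snake_case names, like foo_BAR12"

# \w includes the underscore, so identifiers stay glued together.
with_w = text.scan(/\w+/)

# The explicit class excludes the underscore, so tokens break at each _.
explicit = text.scan(/[A-Za-z0-9]+/)

p with_w    # => ["snake_case", "names", "like", "foo_BAR12"]
p explicit  # => ["snake", "case", "names", "like", "foo", "BAR12"]
```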

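\w is best thought of as a pattern for matching variable names, not alphanumerics. If you want a character class for alphanumerics, look at the POSIX [[:alnum:]] class: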

'foo_BAR12'[/[[:alnum:]]+/] # => "foo"
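One practical difference between [[:alnum:]] and [A-Za-z0-9] is worth a quick check: in Ruby, POSIX bracket expressions are Unicode-aware when matched against a UTF-8 string, while the explicit range (and \w) covers ASCII only. The sketch below uses an illustrative string that is not from the original question.

```ruby
# "\u00e9" is the letter é; the escape yields a UTF-8 string.
accented = "caf\u00e9 au lait"

posix    = accented[/[[:alnum:]]+/]   # includes the accented letter
explicit = accented[/[A-Za-z0-9]+/]   # stops before it
word     = accented[/\w+/]            # \w is ASCII-only as well

p posix     # => "café"
p explicit  # => "caf"
p word      # => "caf"
```

So for English-only input the two classes behave the same, but for text with accented or non-Latin letters [[:alnum:]] is likely the one you want.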

Regarding "ruby-on-rails - Regex for stripping non-alphabetic and non-numeric characters", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/21266620/
