
r - How to extract all numbers from a string as a vector

Reposted. Author: 行者123. Updated: 2023-12-05 09:36:54

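Is there a way to extract all the numbers in a string as a vector? I have a large dataset that does not follow any particular pattern, so using extract with a single regex pattern will not necessarily pull out every number. For example, for each row of a data frame like the following: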

c("3.2% 1ST $100000 AND 1.1% BALANCE", "3.3% 1ST $100000 AND 1.2% BALANCE AND $3000 BONUS FULL PRICE ONLY", 
"$4000", "3.3% 1ST $100000 AND 1.2% BALANCE", "3.3% 1ST $100000 AND 1.2% BALANCE",
"3.2 - $100000")

[1] "3.2% 1ST $100000 AND 1.1% BALANCE"
[2] "3.3% 1ST $100000 AND 1.2% BALANCE AND $3000 BONUS FULL PRICE ONLY"
[3] "$4000"
[4] "3.3% 1ST $100000 AND 1.2% BALANCE"
[5] "3.3% 1ST $100000 AND 1.2% BALANCE"
[6] "3.2 - $100000"

I would like output like this:

[1] "3.2 100000 1.1"                                
[2] "3.3 100000 1.2 3000"
[3] "4000"
[4] "3.3 100000 1.2 "
[5] "3.3 100000 1.2 "
[6] "3.2 100000 "

I looked through some resources and found this link: https://statisticsglobe.com/extract-numbers-from-character-string-vector-in-r

regmatches(x, gregexpr("[[:digit:]]+", x))

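The function above seems to work, but it does not handle all kinds of numbers at once. I know that "[[:digit:]]+" only finds integers, but how can we change it so that it covers all kinds of numbers?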
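To make the limitation concrete, here is a minimal sketch that runs the integer-only pattern on two of the sample strings. The names `x_demo` and `int_only` are illustrative and not part of the original question.

```r
# Two of the sample strings from the question (x_demo is an illustrative name)
x_demo <- c("3.2% 1ST $100000 AND 1.1% BALANCE", "$4000")

# "[[:digit:]]+" matches runs of digits only, so the decimal point splits
# "3.2" into "3" and "2", and the "1" in "1ST" is picked up as well
int_only <- regmatches(x_demo, gregexpr("[[:digit:]]+", x_demo))

int_only[[1]]
# elements: "3" "2" "1" "100000" "1" "1"
int_only[[2]]
# elements: "4000"
```

The decimals come back in pieces and the ordinal "1ST" contributes a stray "1", which is why the pattern needs to change.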

Best answer

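We need to add . to the character class in the matching pattern. The \\b word boundaries on either side keep digits that are attached to letters, such as the 1 in 1ST, from being matched: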

sapply(regmatches(x, gregexpr("\\b[[:digit:].]+\\b", x)), paste, collapse= ' ')
#[1] "3.2 100000 1.1"
#[2] "3.3 100000 1.2 3000"
#[3] "4000"
#[4] "3.3 100000 1.2"
#[5] "3.3 100000 1.2"
#[6] "3.2 100000"
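Since the question asks for vectors, one possible variation is to skip the paste step and convert each set of matches with as.numeric. This is only a sketch that reuses the pattern from the answer; the names `matches` and `nums` are illustrative.

```r
x <- c("3.2% 1ST $100000 AND 1.1% BALANCE",
       "3.3% 1ST $100000 AND 1.2% BALANCE AND $3000 BONUS FULL PRICE ONLY",
       "$4000",
       "3.3% 1ST $100000 AND 1.2% BALANCE",
       "3.3% 1ST $100000 AND 1.2% BALANCE",
       "3.2 - $100000")

# Same pattern as in the answer: digits and dots between word boundaries
matches <- regmatches(x, gregexpr("\\b[[:digit:].]+\\b", x))

# One numeric vector per input string (a list, because lengths differ)
nums <- lapply(matches, as.numeric)

# Number of values found in each string
lengths(nums)
# values: 3 4 1 3 3 2
```

Note that the character class also accepts strings such as 1.2.3, which as.numeric would turn into NA with a warning, so data containing version-like tokens may need a stricter pattern.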

Regarding r - how to extract all numbers from a string as a vector, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64812421/
