
r - Custom function returns the same value for all rows in dplyr's mutate

Reposted. Author: 行者123. Last updated: 2023-12-04 16:47:16

I have the following data:

                                                 Name
1 Braund, Mr. Owen Harris
2 Cumings, Mrs. John Bradley (Florence Briggs Thayer)
3 Heikkinen, Miss. Laina
4 Futrelle, Mrs. Jacques Heath (Lily May Peel)
5 Allen, Mr. William Henry

The data can be loaded like this:

data <- structure(list(Name = c("Braund, Mr. Owen Harris", "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
"Heikkinen, Miss. Laina", "Futrelle, Mrs. Jacques Heath (Lily May Peel)",
"Allen, Mr. William Henry")), .Names = "Name", row.names = c(NA,
-5L), class = c("tbl_df", "tbl", "data.frame"))

My expected output is:

  Name                                                 Title
1 Braund, Mr. Owen Harris                              Mr
2 Cumings, Mrs. John Bradley (Florence Briggs Thayer)  Mrs
3 Heikkinen, Miss. Laina                               Miss
4 Futrelle, Mrs. Jacques Heath (Lily May Peel)         Mrs
5 Allen, Mr. William Henry                             Mr

The problem is that the code below sets Title to "Mr" for every row. I am using a custom function inside dplyr's mutate.

library('stringr')
library('dplyr')

extractTitle <- function(name) {
  str_match(name, '(\\b[a-zA-z]+)\\.')[2]
}

data <- data %>%
  mutate(Title = extractTitle(Name))

Strangely, if I change extractTitle to return its argument unchanged, it works as expected. For example:

extractTitle <- function(name) {
  name
}

data <- data %>%
  mutate(Title = extractTitle(Name))

The code above returns:

  Name                                                 Title
1 Braund, Mr. Owen Harris                              Braund, Mr. Owen Harris
2 Cumings, Mrs. John Bradley (Florence Briggs Thayer)  Cumings, Mrs. John Bradley (Florence Briggs Thayer)
3 Heikkinen, Miss. Laina                               Heikkinen, Miss. Laina
4 Futrelle, Mrs. Jacques Heath (Lily May Peel)         Futrelle, Mrs. Jacques Heath (Lily May Peel)
5 Allen, Mr. William Henry                             Allen, Mr. William Henry

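This is the behavior I expected, and it differs from the behavior of the code I am having trouble with.

Am I missing something here, or is this a bug?

P.S. I am using dplyr version 0.5.0.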

Best answer

library(dplyr)
library(stringr)
data %>%
  mutate(title = str_extract(string = Name, pattern = "(Mr|Miss|Mrs)\\.")) %>%
  select(Name, title)

Returns:

# A tibble: 6 x 2
  Name                                                 title
  <chr>                                                <chr>
1 Braund, Mr. Owen Harris                              Mr.
2 Cumings, Mrs. John Bradley (Florence Briggs Thayer)  Mrs.
3 Heikkinen, Miss. Laina                               Miss.
4 Futrelle, Mrs. Jacques Heath (Lily May Peel)         Mrs.
5 Allen, Mr. William Henry                             Mr.
6 Moran, Mr. James                                     Mr.
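
The answer sidesteps the custom function and keeps the trailing period in the title. As a hedged note on the likely cause: mutate passes the whole Name column to the function in one call, str_match returns a character matrix with one row per name (full match in column 1, capture group in column 2), and `[2]` picks a single element out of that matrix, which mutate then recycles to every row. The sketch below is one possible fix, not the answerer's method. It assumes the goal is the title without the period, rebuilds the five sample rows as a plain data.frame, and introduces the column names `Title` and `Title2` for illustration only.

```r
library(dplyr)
library(stringr)

# The five sample rows from the question, rebuilt as a plain data.frame.
data <- data.frame(
  Name = c(
    "Braund, Mr. Owen Harris",
    "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
    "Heikkinen, Miss. Laina",
    "Futrelle, Mrs. Jacques Heath (Lily May Peel)",
    "Allen, Mr. William Henry"
  ),
  stringsAsFactors = FALSE
)

# `[, 2]` selects the whole capture-group column, so the function
# returns one title per input name instead of a single element.
# The character class is also corrected from [a-zA-z] to [a-zA-Z].
extractTitle <- function(name) {
  str_match(name, "\\b([a-zA-Z]+)\\.")[, 2]
}

result <- data %>%
  mutate(
    Title = extractTitle(Name),
    # Alternative based on the answer's pattern: the lookahead (?=\\.)
    # requires the period but leaves it out of the extracted text.
    Title2 = str_extract(Name, "(Mr|Miss|Mrs)(?=\\.)")
  )

print(result)
```

The first variant accepts any word followed by a period, while the second only recognises the three titles listed in the pattern, so which one fits depends on how varied the titles in the full data are.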

Regarding "r - Custom function returns the same value for all rows in dplyr's mutate", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/38410648/
