
r - How to remove the NAs of a data frame by changing the layout of the data?

Reposted. Author: 行者123. Updated: 2023-12-02 19:05:29

So I have a data frame like this, with the names of the species present in three countries and some NAs for the countries where a species is not present:

     Country_A      |      Country_B      |    Country_C   
-----------------------------------------------------
Tilapia guineensis | NA | Tilapia guineensisi
Tilapia zillii | Tilapia zillii | Tilapia zillii
NA | Fundulus rubrifrons | Fundulus rubrifrons
Eutrigla gurnardus | Eutrigla gurnardus | NA
Sprattus sprattus | NA | NA

What I want to do is basically check whether each species is present in one, two, or all three countries, and build a data frame like this:

     Species name   |      Country_A     |    Country_B    | Country_C 
---------------------------------------------------------------------
Tilapia guineensis | present | not_present | present
Tilapia zillii | present | present | present
Fundulus rubrifrons | not_present | present | present
Eutrigla gurnardus | present | present | not_present
Sprattus sprattus | present | not_present | not_present

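I thought that using the spread function or the ifelse function might be a way to do it, but I really don't know how to implement it. Thank you very much for your answers.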

Best answer

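Here is an option with is.na. Loop over the columns of the dataset with lapply, create a logical vector with is.na, convert it to a numeric index (1 + is.na(x) is 1 for a value and 2 for an NA), use that index to pick from a vector of strings, and bind the resulting columns to the 'Species_name' column, which is created with coalesce inside transmute.
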
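library(dplyr)

# For each column: is.na(x) is FALSE/TRUE, so 1 + is.na(x) is 1/2,
# which indexes into c("present", "not_present")
lst1 <- lapply(df1, function(x) c("present", "not_present")[1 + is.na(x)])

df1 %>%
  # first non-NA value in each row, taken across all columns
  transmute(Species_name = coalesce(!!! .)) %>%
  bind_cols(lst1)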

-output

#         Species_name   Country_A   Country_B   Country_C
#1 Tilapia guineensis present not_present present
#2 Tilapia zillii present present present
#3 Fundulus rubrifrons not_present present present
#4 Eutrigla gurnardus present present not_present
#5 Sprattus sprattus present not_present not_present
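
The indexing step in the answer can be traced on a single column. The following is a minimal base R sketch; the intermediate names (`x`, `na_flag`, `idx`, `labels`, `status`) are illustrative only and do not appear in the original answer.

```r
# One column of the example data (Country_A)
x <- c("Tilapia guineensis", "Tilapia zillii", NA,
       "Eutrigla gurnardus", "Sprattus sprattus")

na_flag <- is.na(x)   # logical vector: TRUE where the species is missing
idx <- 1 + na_flag    # logical is coerced to 0/1, so idx is 1 (value) or 2 (NA)

labels <- c("present", "not_present")
status <- labels[idx] # pick label 1 or label 2 for every element

print(status)
```

Because `lapply` applies the same function to every column, the answer's `lst1` is simply this `status` vector computed once per country.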

Or, if we want to stay within the tidyverse, a more compact option uses dplyr alone (note that mutate appends Species_name as the last column)

df1 %>%
  mutate(Species_name = coalesce(!!! .),
         across(starts_with('Country'),
                ~ c("present", "not_present")[1 + is.na(.)]))
# Country_A Country_B Country_C Species_name
#1 present not_present present Tilapia guineensis
#2 present present present Tilapia zillii
#3 not_present present present Fundulus rubrifrons
#4 present present not_present Eutrigla gurnardus
#5 present not_present not_present Sprattus sprattus
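
The least obvious part of both snippets is probably `coalesce(!!! .)`. As a hedged sketch on a reduced, made-up three-row version of the data: `!!!` splices the columns of the data frame into separate arguments, so the call should behave like passing each column to `coalesce` by hand. The names `explicit` and `spliced` are illustrative.

```r
library(dplyr)

# Reduced illustration data (not the full df1 from the question)
df_small <- data.frame(
  Country_A = c("Tilapia guineensis", NA, "Sprattus sprattus"),
  Country_B = c(NA, "Fundulus rubrifrons", NA),
  Country_C = c("Tilapia guineensis", "Fundulus rubrifrons", NA),
  stringsAsFactors = FALSE
)

# Columns passed one by one: the first non-NA value in each row wins
explicit <- coalesce(df_small$Country_A, df_small$Country_B, df_small$Country_C)

# Columns spliced in with !!!, as in the answer (there `.` is the piped data frame)
spliced <- coalesce(!!!df_small)

print(identical(explicit, spliced))
```

Splicing avoids hard-coding the column names, so the same call keeps working if more country columns are added.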

data

df1 <- structure(list(Country_A = c("Tilapia guineensis", "Tilapia zillii", 
NA, "Eutrigla gurnardus", "Sprattus sprattus"), Country_B = c(NA,
"Tilapia zillii", "Fundulus rubrifrons", "Eutrigla gurnardus",
NA), Country_C = c("Tilapia guineensisi", "Tilapia zillii",
"Fundulus rubrifrons",
NA, NA)), class = "data.frame", row.names = c(NA, -5L))
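
For comparison, here is a base R sketch of the same idea that uses `ifelse`, which the question mentions. This is not part of the original answer: `Reduce` with `ifelse` is a stand-in for `coalesce`, and the names `species`, `status` and `result` are illustrative.

```r
# Same data as above, rebuilt with data.frame() so the snippet is self-contained
df1 <- data.frame(
  Country_A = c("Tilapia guineensis", "Tilapia zillii", NA,
                "Eutrigla gurnardus", "Sprattus sprattus"),
  Country_B = c(NA, "Tilapia zillii", "Fundulus rubrifrons",
                "Eutrigla gurnardus", NA),
  Country_C = c("Tilapia guineensisi", "Tilapia zillii",
                "Fundulus rubrifrons", NA, NA),
  stringsAsFactors = FALSE
)

# First non-NA value per row, folding over the columns from left to right
species <- Reduce(function(x, y) ifelse(is.na(x), y, x), df1)

# Label every cell of the original columns
status <- lapply(df1, function(x) ifelse(is.na(x), "not_present", "present"))

# The named list `status` expands into one column per country
result <- data.frame(Species_name = species, status, stringsAsFactors = FALSE)
print(result)
```

Like `coalesce`, this takes the species name from the first country that has one, so the spelling variant in Country_C for the first row does not reach `Species_name`.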

Regarding "r - How to remove the NAs of a data frame by changing the layout of the data?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/65116197/
