
r - Why does rbind throw a warning

Reposted. Author: 行者123. Updated: 2023-12-04 10:39:34

This is related to "Are there more elegant ways to transform ragged data into a tidy dataframe".

Why does the following code not work:

events = structure(list(date = structure(c(-714974, -714579, -717835), class = "Date"), 
days = c(1, 6, 0.5), name = c("Intro to stats", "Stats Winter school",
"TidyR tools"), topics = c("probability|R", "R|regression|ggplot",
"tidyR|dplyr")), .Names = c("date", "days", "name", "topics"
), row.names = c(NA, -3L), class = "data.frame")

> newdf <- data.frame(topic=character(), days=character())
> for(i in 1:length(events$topics)){
+ xx = unlist(strsplit(events$topics[i],'\\|'))
+ for(j in 1:length(xx)){
+ yy = c(xx[j], events$days[i]/length(xx))
+ print(yy)
+ newdf=rbind(newdf, yy)
+ }
+ }
[1] "probability" "0.5"
[1] "R" "0.5"
[1] "R" "2"
[1] "regression" "2"
[1] "ggplot" "2"
[1] "tidyR" "0.25"
[1] "dplyr" "0.25"
There were 11 warnings (use warnings() to see them)
> newdf
X.probability. X.0.5.
1 probability 0.5
2 <NA> 0.5
3 <NA> <NA>
4 <NA> <NA>
5 <NA> <NA>
6 <NA> <NA>
7 <NA> <NA>
>
> warnings()
Warning messages:
1: In `[<-.factor`(`*tmp*`, ri, value = structure(c(1L, NA ... :
invalid factor level, NAs generated
2: In `[<-.factor`(`*tmp*`, ri, value = structure(c(1L, NA, ... :
invalid factor level, NAs generated
3: In `[<-.factor`(`*tmp*`, ri, value = structure(c(1L, 1L, ... :
invalid factor level, NAs generated
4: In `[<-.factor`(`*tmp*`, ri, value = structure(c(1L, NA, ... :
invalid factor level, NAs generated
5: In `[<-.factor`(`*tmp*`, ri, value = structure(c(1L, 1L, ... :
invalid factor level, NAs generated
6: In `[<-.factor`(`*tmp*`, ri, value = structure(c(1L, NA, ... :
invalid factor level, NAs generated
7: In `[<-.factor`(`*tmp*`, ri, value = structure(c(1L, 1L, ... :
invalid factor level, NAs generated
8: In `[<-.factor`(`*tmp*`, ri, value = structure(c(1L, NA, ... :
invalid factor level, NAs generated
9: In `[<-.factor`(`*tmp*`, ri, value = structure(c(1L, 1L, ... :
invalid factor level, NAs generated
10: In `[<-.factor`(`*tmp*`, ri, value = structure(c(1L, NA, ... :
invalid factor level, NAs generated
11: In `[<-.factor`(`*tmp*`, ri, value = structure(c(1L, 1L, ... :
invalid factor level, NAs generated
>

yy is fine, but rbind does not work. Where is the error, and how can I fix it? Thanks for your help.

Best answer

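Have you tried debugging your for loop? For example, by adding print(class(yy)) and print(str(newdf)) inside it, you will see that after the first iteration the columns of newdf have become factors.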

# [1] "probability" "0.5"        
# [1] "character"
# 'data.frame': 0 obs. of 2 variables:
# $ topic: Factor w/ 0 levels:
# $ days : Factor w/ 0 levels:
# NULL
# [1] "R" "0.5"
# [1] "character"
# 'data.frame': 1 obs. of 2 variables:
# $ X.probability.: Factor w/ 1 level "probability": 1
# $ X.0.5. : Factor w/ 1 level "0.5": 1
# NULL
# [1] "R" "2"
# [1] "character"
# 'data.frame': 2 obs. of 2 variables:
# $ X.probability.: Factor w/ 1 level "probability": 1 NA
# $ X.0.5. : Factor w/ 1 level "0.5": 1 1

...

You might say "but I defined them as character". Yes, but if you read the rbind documentation, you will see

For cbind (rbind), vectors of zero length (including NULL) are ignored unless the result would have zero rows (columns), for S compatibility. (Zero-extent matrices do not occur in S3 and are not ignored in R.)
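A minimal sketch of what that passage means for the loop in the question: the zero-row newdf contributes nothing to the first rbind call, so the result is built from the character vector alone. The names empty, out and kept are illustrative and do not appear in the original, and the exact column names printed for out may depend on the R version.

```r
# Zero-row data frame with the column names used in the question.
empty <- data.frame(topic = character(), days = character(),
                    stringsAsFactors = FALSE)

# Binding a bare character vector: the zero-row data frame is dropped,
# so compare the resulting names with c("topic", "days").
out <- rbind(empty, c("probability", "0.5"))
print(names(out))

# Binding a one-row data frame instead supplies the column names itself.
kept <- rbind(empty,
              data.frame(topic = "probability", days = "0.5",
                         stringsAsFactors = FALSE))
print(names(kept))
```

Passing a named one-row data frame rather than a bare vector is one way to keep the intended column names in the result.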


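Another property of rbind is that it inherits its behaviour from data.frame, including the default stringsAsFactors == TRUE (this was the default before R 4.0.0).
What happens here can be illustrated easily with a dummy example. Consider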
temp <- data.frame(A = letters[1:3])
str(temp)
## 'data.frame': 3 obs. of 1 variable:
## $ A: Factor w/ 3 levels "a","b","c": 1 2 3

temp$A[3] <- "d"
## Warning message:
## In `[<-.factor`(`*tmp*`, 3, value = c(1L, 2L, NA)) :
## invalid factor level, NA generated

temp$A
## [1] a b <NA>
## Levels: a b c

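You can see two things here:
  • data.frame automatically converts character columns to factors
  • When you try to assign a value that is not an existing level to a factor vector, R replaces it with NA and throws the exact warning you received

  • As @akrun mentioned, setting options(stringsAsFactors=F) will solve your problem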
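As an alternative to changing the global option, here is a hedged sketch of the loop from the question rewritten so that no factor is ever created. It is not the method from the original answer: it passes stringsAsFactors = FALSE explicitly, builds one small data frame per event, and binds them once at the end. The name rows is illustrative, and events is reduced to the two columns the loop actually uses.

```r
# Only the columns needed by the loop, copied from the question's data.
events <- data.frame(
  days   = c(1, 6, 0.5),
  topics = c("probability|R", "R|regression|ggplot", "tidyR|dplyr"),
  stringsAsFactors = FALSE
)

# One data frame per event: each topic gets an equal share of the days.
rows <- lapply(seq_len(nrow(events)), function(i) {
  xx <- strsplit(events$topics[i], "|", fixed = TRUE)[[1]]
  data.frame(topic = xx,
             days  = events$days[i] / length(xx),  # stays numeric
             stringsAsFactors = FALSE)
})

# Bind all pieces in a single call instead of growing newdf row by row.
newdf <- do.call(rbind, rows)
str(newdf)
```

Because every piece is a data frame with the same named columns, the names topic and days survive, and days remains numeric instead of being coerced to a string by c().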

    Regarding "r - Why does rbind throw a warning", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/25102966/
