
r - How do I use dcast() to sum the values of a column?

Reposted. Author: 行者123. Updated: 2023-12-04 04:51:11

I am stuck with the dcast function. I am trying to create a table of summed individuals for many species, one column per survey year.

I have a data frame with 3 columns: (1) the year (factor), (2) the species name (factor), and (3) the counts (numeric).

Year    Species Counts
2002 SP1 2
2002 SP1 3
2004 SP1 2
2002 SP2 8
2002 SP2 2
2002 SP3 1
2002 SP3 1
2003 SP3 2
2004 SP3 1

I am trying to get a table of sums like this:
    2002    2003    2004
SP1 5 0 2
SP2 10 0 0
SP3 2 2 1

aggregate() does not do what I want. I am using the dcast function like this:
dcast( DF, Species ~ Year , sum)

Whatever I try, sum does not work on factors. I get this error message:
Error in Summary.factor(integer(0), na.rm = FALSE): sum not meaningful for factors

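When I use the default function (length), I get the number of rows rather than the sum of individuals. When I try to make sum apply only to my Counts column and not to the factor columns, it either does not work or I get the same error message.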

How can I get that kind of table, with the sums of the counts?

Edit:

The function computes wrong sums and invents individuals. Here is a comparison of the sums from a pivot table in Excel and from dcast in R:
EXCEL                   2003    2004    2005    2006    2007    2008    2009
Anthocharis_cardamines 1 0 2 2 0 1 0
Apatura_ilia 0 0 0 0 1 0 0
Aporia_crataegi 2 3 29 26 6 4 3
Brintesia_circe 126 217 199 303 64 99 55


DCAST 2003 2004 2005 2006 2007 2008 2009
Anthocharis_cardamines 2 0 4 4 0 2 0
Apatura_ilia 0 0 0 0 2 0 0
Aporia_crataegi 4 6 258 205 25 8 6
Brintesia_circe 883 1334 1050 1770 490 848 354

The computed numbers do not even match the row counts I got yesterday.
How are these sums being computed?

Edit 2:
> dput(head(counts, 10))
structure(list(year = structure(c(16L, 16L, 16L, 16L, 16L, 16L,
16L, 16L, 16L, 15L), .Label = c("1994", "1995", "1996", "1997",
"1998", "1999", "2000", "2001", "2002", "2003", "2004", "2005",
"2006", "2007", "2008", "2009"), class = "factor"), species = structure(c(146L,
146L, 146L, 146L, 146L, 146L, 146L, 146L, 146L, 146L), .Label = c("Aglais_urticae",
"Anthocharis_cardamines", "Anthocharis_euphenoides", "Apatura_ilia",
"Apatura_iris", "Aphantopus_hyperantus", "Aporia_crataegi", "Araschnia_levana",
"Arethusana_arethusa", "Argynnis_adippe", "Argynnis_aglaja",
"Argynnis_paphia", "Aricia_agestis", "Boloria_dia", "Boloria_euphrosyne",
"Boloria_selene", "Brenthis_daphne", "Brenthis_ino", "Brintesia_circe",
"Callophrys_rubi", "Carcharodus_alceae", "Carcharodus_floccifera",
"Carcharodus_lavatherae", "Carterocephalus_palaemon", "Celastrina_argiolus",
"Charaxes_jasius", "Chazara_briseis", "Clossiana_dia", "Coenonympha_arcania",
"Coenonympha_dorus", "Coenonympha_glycerion", "Coenonympha_oedippus",
"Coenonympha_pamphilus", "Coenonympha_tullia", "Colias_alfacariensis",
"Colias_croceus", "Colias_hyale", "Colias_palaeno", "Cupido_alcetas",
"Cupido_argiades", "Cupido_minimus", "Cupido_osiris", "Diacrisia_sannio",
"Erebia_aethiops", "Erebia_euryale", "Erebia_ligea", "Erebia_medusa",
"Erebia_meolans", "Erynnis_tages", "Euchloe_crameri", "Euclidia_glyphica",
"Euphydryas_aurinia", "Euplagia_quadripunctaria", "Everes_argiades",
"Fabriciana_adippe", "Glaucopsyche_alcon", "Glaucopsyche_alexis",
"Glaucopsyche_arion", "Glaucopsyche_melanops", "Glaucopsyche_nausithous",
"Glaucopsyche_teleius", "Gonepteryx_cleopatra", "Gonepteryx_rhamni",
"Hamearis_lucina", "Hesperia_comma", "Heteropterus_morpheus",
"Hipparchia_fidia", "Hipparchia_semele", "Hyles_euphorbiae",
"Hyponephele_lupinus", "Inachis_io", "Iphiclides_podalirius",
"Issoria_lathonia", "Lampides_boeticus", "Lasiommata_maera",
"Lasiommata_megera", "Leptidea_sinapis", "Leptotes_pirithous",
"Libelloides_coccajus", "Libelloides_longicornis", "Limenitis_camilla",
"Limenitis_populi", "Limenitis_reducta", "Lopinga_achine", "Lycaena_alciphron",
"Lycaena_dispar", "Lycaena_helle", "Lycaena_phlaeas", "Lycaena_tityrus",
"Macroglossum_stellatarum", "Maculinea_arion", "Maniola_jurtina",
"Melanargia_arge", "Melanargia_galathea", "Melanargia_lachesis",
"Melanargia_occitanica", "Melitaea_cinxia", "Melitaea_diamina",
"Melitaea_didyma", "Melitaea_phoebe", "Mesoacidalia_aglaja",
"Minois_dryas", "Neohipparchia_statilinus", "Neozephyrus_quercus",
"Nymphalis_antiopa", "Nymphalis_polychloros", "Ochlodes_sylvanus",
"Ochlodes_venatus", "Palaeochrysophanus_hippothoe", "Papilio_machaon",
"Pararge_aegeria", "Pieris_napi", "Plebeius_agestis", "Plebeius_argyrognomon",
"Polygonia_c-album", "Polyommatus_bellargus", "Polyommatus_coridon",
"Polyommatus_escheri", "Polyommatus_icarus", "Polyommatus_semiargus",
"Polyommatus_thersites", "Pontia_daplidice", "Pseudopanthera_macularia",
"Pseudophilotes_baton", "Pseudotergumia_fidia", "Pyrgus_malvae",
"Pyronia_bathseba", "Pyronia_cecilia", "Pyronia_tithonus", "Quercusia_quercus",
"Satyrium_acaciae", "Satyrium_esculi", "Satyrium_ilicis", "Satyrium_pruni",
"Satyrium_spini", "Satyrium_w-album", "Smerinthus_ocellatus",
"Speyeria_aglaja", "Spialia_sertorius", "Thecla_betulae", "Thymelicus_acteon",
"Thymelicus_lineola", "Thymelicus_sylvestris", "Vanessa_atalanta",
"Vanessa_cardui", "Zerynthia_polyxena", "Zygaena_carniolica",
"Zygaena_ephialtes", "Zygaena_erythrus", "Zygaena_fausta", "Zygaena_filipendulae",
"Zygaena_hilaris", "Zygaena_loti", "Zygaena_occitanica", "Zygaena_purpuralis",
"Zygaena_sarpedon", "Zygaena_transalpina"), class = "factor"),
Counts = c(2, 2, 2, 2, 2, 17, 52, 2, 2, 17)), .Names = c("year",
"species", "Counts"), row.names = 5479:5488, class = "data.frame")

> str(counts)
'data.frame': 3161 obs. of 3 variables:
$ year : Factor w/ 16 levels "1994","1995",..: 16 16 16 16 16 16 16 16 16 15 ...
$ species: Factor w/ 157 levels "Aglais_urticae",..: 146 146 146 146 146 146 146 146 146 146 ...
$ Counts : num 2 2 2 2 2 17 52 2 2 17 ...

I hope this helps...

Best answer

A dcast() version

This works for me:

require("reshape2")
dcast(counts, Year ~ Species, value.var = "Counts", fun.aggregate = sum)

> dcast(counts, Year ~ Species, value.var = "Counts", fun.aggregate = sum)
Year SP1 SP2 SP3
1 2002 5 10 2
2 2003 0 0 2
3 2004 2 0 1
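
The question asks for species in rows and years in columns. Swapping the two sides of the formula should give that layout. A minimal sketch, assuming the same small `counts` data frame used in this answer (rebuilt here so the block runs on its own) and that the reshape2 package is installed; the name `wide` is only illustrative:

```r
library(reshape2)

# Same example data as in this answer
counts <- read.table(text = "Year Species Counts
2002 SP1 2
2002 SP1 3
2004 SP1 2
2002 SP2 8
2002 SP2 2
2002 SP3 1
2002 SP3 1
2003 SP3 2
2004 SP3 1", header = TRUE)

# Species become rows, years become columns; value.var names the column
# to aggregate, so sum is never applied to the Year or Species columns
wide <- dcast(counts, Species ~ Year, value.var = "Counts",
              fun.aggregate = sum)
print(wide)
```

Naming `value.var` explicitly avoids relying on dcast guessing which column holds the values.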

Check that counts$Counts is numeric; look at the output of str(counts), where counts is your DF. I created counts via:
counts <- read.table(text = "Year    Species Counts
2002 SP1 2
2002 SP1 3
2004 SP1 2
2002 SP2 8
2002 SP2 2
2002 SP3 1
2002 SP3 1
2003 SP3 2
2004 SP3 1", header = TRUE)
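
If a counts column turns out to be a factor rather than numeric, sum() fails with the "not meaningful for factors" error quoted in the question. A minimal base R sketch of the check and the conversion; the data frame `bad` is made up for illustration and is not the asker's data:

```r
# Hypothetical data frame in which Counts was read in as a factor
bad <- data.frame(
  Year    = factor(c(2002, 2002, 2004)),
  Species = factor(c("SP1", "SP1", "SP1")),
  Counts  = factor(c("2", "3", "2"))
)

# Inspect the class of every column; Counts should be numeric or integer
col_classes <- sapply(bad, class)
print(col_classes)

# Convert via character first: as.numeric() applied directly to a factor
# returns the internal level codes, not the original values
bad$Counts <- as.numeric(as.character(bad$Counts))

total <- sum(bad$Counts)
print(total)
```

Going through as.character() matters because the level codes of a factor generally differ from the numbers its labels spell out.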

This is using
> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_CA.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_CA.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] reshape2_1.2.2

loaded via a namespace (and not attached):
[1] plyr_1.8 stringr_0.6.2 tools_3.0.1

An alternative base R version using xtabs()

You may also want to try the xtabs() function from base R:
xtabs(Counts ~ Year + Species, data = counts)

> xtabs(Counts ~ Year + Species, data = counts)
Species
Year SP1 SP2 SP3
2002 5 10 2
2003 0 0 2
2004 2 0 1
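
To put species in rows, as requested in the question, the two terms on the right-hand side can be swapped. The result can then be cross-checked against an independently computed per-group sum. A base R sketch, again rebuilding the example `counts` so it runs standalone; tapply() is used here only as a second opinion and is not part of the original answer:

```r
counts <- read.table(text = "Year Species Counts
2002 SP1 2
2002 SP1 3
2004 SP1 2
2002 SP2 8
2002 SP2 2
2002 SP3 1
2002 SP3 1
2003 SP3 2
2004 SP3 1", header = TRUE)

# Species in rows, years in columns
tab <- xtabs(Counts ~ Species + Year, data = counts)
print(tab)

# Independent check: tapply() leaves NA for empty Species/Year cells,
# so replace those with 0 before comparing
check <- tapply(counts$Counts, list(counts$Species, counts$Year), sum)
check[is.na(check)] <- 0

# TRUE if both methods agree cell by cell
agree <- all(as.vector(tab) == as.vector(check))
print(agree)
```

If the totals still disagree with a pivot table built in another tool, counting repeated rows with sum(duplicated(counts)) is one place to start looking.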

This question comes from Stack Overflow: https://stackoverflow.com/questions/17428960/
