
R - Cut by breaks and count occurrences by group

Reposted. Author: 行者123. Last updated: 2023-12-02 03:44:33

I have a data frame that looks like this:

dat <- structure(list(Geocode = c("1100015", "1100023", "1100031", "1100049", 
"1100056", "1100064", "1100072", "1100080", "1100098", "1100106",
"1100114", "1100122", "1100130", "1100148", "1100155", "1100189",
"1100205", "1100254", "1100262", "1100288", "1100296", "1100304",
"1100320", "1100338", "1100346", "1100379", "1100403", "1100452",
"1100502", "1100601"), Location = c("Alta Floresta D'oeste, RO",
"Ariquemes, RO", "Cabixi, RO", "Cacoal, RO", "Cerejeiras, RO",
"Colorado Do Oeste, RO", "Corumbiara, RO", "Costa Marques, RO",
"Espigo D'oeste, RO", "Guajar-Mirim, RO", "Jaru, RO", "Ji-Paran, RO",
"Machadinho D'oeste, RO", "Nova Brasilndia D'oeste, RO", "Ouro Preto Do Oeste, RO",
"Pimenta Bueno, RO", "Porto Velho, RO", "Presidente Mdici, RO",
"Rio Crespo, RO", "Rolim De Moura, RO", "Santa Luzia D'oeste, RO",
"Vilhena, RO", "So Miguel Do Guapor, RO", "Nova Mamor, RO", "Alvorada D'oeste, RO",
"Alto Alegre Dos Parecis, RO", "Alto Paraso, RO", "Buritis, RO",
"Novo Horizonte Do Oeste, RO", "Cacaulandia, RO"), Region = c("Norte",
"Norte", "Norte", "Norte", "Norte", "Norte", "Norte", "Norte",
"Norte", "Norte", "Sul", "Sul", "Sul", "Sul", "Sul",
"Sul", "Sul", "Sul", "Sul", "Sul", "Nordeste", "Nordeste",
"Nordeste", "Nordeste", "Nordeste", "Nordeste", "Nordeste", "Nordeste", "Nordeste",
"Nordeste"), Population = c(25578L, 104401L, 6355L, 87226L, 17986L,
18817L, 8842L, 16651L, 32385L, 46632L, 55738L, 130419L, 37167L,
21592L, 39924L, 37512L, 502748L, 22557L, 3750L, 56242L, 8532L,
91801L, 23933L, 27600L, 17063L, 13940L, 20210L, 37838L, 10276L,
6367L)), .Names = c("Geocode", "Location", "Region", "Population"
), row.names = c(NA, 30L), class = "data.frame")

It shows the population of some cities and the region each city belongs to.

I need to bin the population using breaks (breaks = c(0, 50000, 100000)) and then count the cities in each bin, both overall (all regions) and separately for each region.
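As a rough illustration of the binning step, here is a minimal sketch of how base R's cut() assigns values to intervals. The vector pop is made up for illustration (it is not the question's data), and an Inf upper break is assumed so that values above 100000 get their own class.

pop <- c(0L, 6355L, 50000L, 50001L, 100000L, 502748L)  # illustrative values only

# By default cut() uses right-closed intervals:
# (0, 50000], (50000, 100000], (100000, Inf]
cls <- cut(pop, breaks = c(0, 50000, 100000, Inf),
           labels = c("0-50000", "50000-100000", ">100000"))

# A value exactly on a break falls into the lower class,
# and a value equal to the lowest break (0) becomes NA.
data.frame(pop, cls)

# include.lowest = TRUE closes the first interval on the left: [0, 50000]
cls_incl <- cut(pop, breaks = c(0, 50000, 100000, Inf),
                labels = c("0-50000", "50000-100000", ">100000"),
                include.lowest = TRUE)

Whether a population of exactly 50000 should count as "0-50000" or "50000-100000" is a choice the question leaves open; passing right = FALSE to cut() would flip it.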

The resulting data frame should look like this (random, hypothetical values):

Class            Region    Count
[0-50000]        Norte     7
[50000-100000]   Norte     3
[>100000]        Norte     0
[0-50000]        Sul       5
[50000-100000]   Sul       4
[>100000]        Sul       1
[0-50000]        Nordeste  4
[50000-100000]   Nordeste  5
[>100000]        Nordeste  1
[0-50000]        All       16
[50000-100000]   All       12
[>100000]        All       2

Any help is appreciated.

Best answer

Using cut and dplyr:

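library(dplyr)

# Bin Population into right-closed intervals: (0, 50000], (50000, 100000], (100000, Inf]
dat$Class <- cut(dat$Population, breaks = c(0, 50000, 100000, Inf),
                 labels = c("0-50000", "50000-100000", ">100000"))

# Count cities per class within each region, then per class across all regions
d1 <- dat %>% group_by(Class, Region) %>% summarise(count = n())
d2 <- dat %>% group_by(Class) %>% summarise(count = n(), Region = "All")
bind_rows(d1, d2)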

          Class   Region count
         <fctr>    <chr> <int>
 1      0-50000 Nordeste     9
 2      0-50000    Norte     8
 3      0-50000      Sul     6
 4 50000-100000 Nordeste     1
 5 50000-100000    Norte     1
 6 50000-100000      Sul     2
 7      >100000    Norte     1
 8      >100000      Sul     2
 9      0-50000      All    23
10 50000-100000      All     4
11      >100000      All     3
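
Note that the output above has no row for empty combinations (for example Nordeste with >100000), while the example in the question lists zero counts. If those rows are wanted, one possible alternative (not part of the original answer) is a base R sketch built on table(), which keeps every factor level. To stay self-contained it rebuilds only the Region and Population columns from the question's data; the names tab and res are arbitrary.

# Region and Population columns as given in the question
dat <- data.frame(
  Region = rep(c("Norte", "Sul", "Nordeste"), each = 10),
  Population = c(25578L, 104401L, 6355L, 87226L, 17986L, 18817L, 8842L,
                 16651L, 32385L, 46632L, 55738L, 130419L, 37167L, 21592L,
                 39924L, 37512L, 502748L, 22557L, 3750L, 56242L, 8532L,
                 91801L, 23933L, 27600L, 17063L, 13940L, 20210L, 37838L,
                 10276L, 6367L),
  stringsAsFactors = FALSE
)

dat$Class <- cut(dat$Population, breaks = c(0, 50000, 100000, Inf),
                 labels = c("0-50000", "50000-100000", ">100000"))

# Cross-tabulate Class by Region; empty cells are kept as 0
tab <- table(dat$Class, dat$Region)

# Append an "All" column holding the per-class totals
tab <- cbind(tab, All = rowSums(tab))

# Reshape to long format: one row per Class/Region pair
res <- as.data.frame(as.table(tab))
names(res) <- c("Class", "Region", "Count")
res

This yields 12 rows (3 classes by 3 regions plus "All"), with regions in alphabetical order followed by "All".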

Regarding "R - Cut by breaks and count occurrences by group", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47233841/
