
Rearranging data: convert from water year to calendar year

Reposted. Author: 行者123. Updated: 2023-12-02 09:23:10

I have a table of data from a stream gauge, arranged as follows:

  Water.Year   May   Jun   Jul   Aug    Sep    Oct    Nov   Dec   Jan   Feb   Mar   Apr
1  1953-1954 55.55 43.62 30.46 26.17  26.76  41.74  19.92 41.25 28.77 20.96 12.47 10.51
2  1954-1955 23.49 81.35 46.71 29.33  67.83 133.30  37.62 30.16 21.07 19.38 13.87 10.63
3  1955-1956  9.87 51.59 55.36 63.03 154.08  98.15 104.06 32.85 22.89 17.30 15.68 10.88

> data <- structure(list(Water.Year = structure(1:6, .Label = c("1953-1954", "1954-1955", "1955-1956", "1956-1957", "1957-1958", "1958-1959", "1959-1960", "1960-1961", "1961-1962", "1962-1963", "1963-1964", "1964-1965", "1965-1966", "1966-1967", "1967-1968", "1968-1969", "1969-1970", "1970-1971", "1971-1972", "1972-1973", "1973-1974", "1974-1975", "1975-1976", "1976-1977", "1977-1978", "1978-1979", "1979-1980", "1980-1981", "1981-1982", "1982-1983", "1983-1984", "1984-1985", "1985-1986", "1986-1987", "1987-1988", "1988-1989", "1989-1990", "1990-1991", "1991-1992", "1992-1993", "1993-1994", "1994-1995", "1995-1996", "1996-1997", "1997-1998", "1998-1999", "1999-2000", "2000-2001"), class = "factor"), May = c(55.55, 23.49, 9.87, 18.03, 17.46, 11.37), Jun = c(43.62, 81.35, 51.59, 28.61, 15.14, 29.48), Jul = c(30.46, 46.71, 55.36, 24.36, 20.09, 19.48), Ago = c(26.17, 29.33, 63.03, 22.01, 16.97, 16.86), Set = c(26.76, 67.83, 154.08, 28.51, 27.24, 21.01), Oct = c(41.74, 133.3, 98.15, 53.72, 35.78, 19.78), Nov = c(19.92, 37.62, 104.06, 115.78, 20.35, 18.69), Dic = c(41.25, 30.16, 32.85, 32.04, 22, 18.86), Ene = c(28.77, 21.07, 22.89, 25.44, 13.27, 14.89), Feb = c(20.96, 19.38, 17.3, 14.53, 10.37, 10.4), Mar = c(12.47, 13.87, 15.68, 10.78, 8.77, 8.79), Abr = c(10.51, 10.63, 10.88, 9.33, 7.69, 8.99)), .Names = c("Water.Year", "May", "Jun", "Jul", "Ago", "Set", "Oct", "Nov", "Dic", "Ene", "Feb", "Mar", "Abr"), row.names = c(NA, 6L), class = "data.frame")

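The data are arranged by "water year": each year starts in May and ends in April of the following year (as the first column shows). I want to convert this into a data frame with three columns: Calendar.Year, Month, Flow.Measurement.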

I have already used "separate" from tidyr to split the Water.Year column into two columns:

> df = separate(data, Water.Year, c("year1","year2"))

  year1 year2   May   Jun   Jul   Aug   Sep    Oct   Nov   Dec   Jan   Feb   Mar   Apr
1  1953  1954 55.55 43.62 30.46 26.17 26.76  41.74 19.92 41.25 28.77 20.96 12.47 10.51
2  1954  1955 23.49 81.35 46.71 29.33 67.83 133.30 37.62 30.16 21.07 19.38 13.87 10.63

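Now I plan to use "gather" from tidyr to do the rest of the conversion, but I am stuck on how to create the Calendar.Year column, using year1 for the May to Dec columns and year2 for the Jan to Apr columns.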

Any help would be appreciated.

Best answer

Another idea (using @useR's data with English month names):

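library(dplyr)
library(tidyr)

# df: the original table (Water.Year plus one column per month, English names)
df %>%
  separate(Water.Year, c("Year1", "Year2")) %>%   # "1953-1954" -> "1953", "1954"
  gather(Month, Value, -(Year1:Year2)) %>%        # reshape wide -> long
  group_by(Year1, Year2) %>%
  # May to Dec (month number >= 5) take Year1; Jan to Apr take Year2
  mutate(Year = if_else(match(Month, month.abb) >= 5, Year1, Year2),
         Month = factor(Month, levels = month.abb)) %>%
  ungroup() %>%
  select(Year, Month, Value) %>%
  arrange(Year, Month)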
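
One practical caveat: the `match(Month, month.abb)` lookup only works when the month columns carry the English abbreviations stored in `month.abb`. The `dput()` output in the question uses Spanish abbreviations for some columns (Ago, Set, Dic, Ene, Abr), for which `match()` returns NA, so Year and Month would come out as NA for those columns. A minimal sketch of renaming the columns first is shown here; the helper name `to_english_months` and the vector `spanish_abb` are made up for illustration and are not part of the original answer.

```r
# Spanish abbreviations as they appear in the question's dput() output,
# listed in calendar order so they line up with month.abb (Jan..Dec).
spanish_abb <- c("Ene", "Feb", "Mar", "Abr", "May", "Jun",
                 "Jul", "Ago", "Set", "Oct", "Nov", "Dic")

# Hypothetical helper: translate Spanish month column names to English,
# leaving any other name (e.g. Water.Year) untouched.
to_english_months <- function(col_names) {
  idx <- match(col_names, spanish_abb)   # NA for non-month columns
  ifelse(is.na(idx), col_names, month.abb[idx])
}

# First two rows of the question's data, with the Spanish column names.
data <- data.frame(
  Water.Year = c("1953-1954", "1954-1955"),
  May = c(55.55, 23.49), Jun = c(43.62, 81.35), Jul = c(30.46, 46.71),
  Ago = c(26.17, 29.33), Set = c(26.76, 67.83), Oct = c(41.74, 133.30),
  Nov = c(19.92, 37.62), Dic = c(41.25, 30.16), Ene = c(28.77, 21.07),
  Feb = c(20.96, 19.38), Mar = c(12.47, 13.87), Abr = c(10.51, 10.63)
)

names(data) <- to_english_months(names(data))
names(data)
```

After this step the column names match `month.abb`, and the pipeline can be applied to `data` unchanged.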

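We split the Water.Year column into Year1 and Year2 and use gather() to reshape the data into long format. Then, for each group, we use match() with month.abb to check whether the month number is greater than or equal to 5 (May) and assign the corresponding year with if_else(). Finally, we drop the columns that are no longer needed and arrange() by Year and Month.

## A tibble: 36 × 3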
# Year Month Value
# <chr> <fctr> <dbl>
#1 1953 May 55.55
#2 1953 Jun 43.62
#3 1953 Jul 30.46
#4 1953 Aug 26.17
#5 1953 Sep 26.76
#6 1953 Oct 41.74
#7 1953 Nov 19.92
#8 1953 Dec 41.25
#9 1954 Jan 28.77
#10 1954 Feb 20.96
## ... with 26 more rows
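
For newer versions of tidyr, where the documentation recommends `pivot_longer()` over `gather()` for new code, the same idea can be sketched as follows. This is a variant under stated assumptions rather than the answer's own code: the data frame name `flow` and the result name `calendar` are illustrative, the input is the first three water years from the question with English month names, and `convert = TRUE` is added so that the year columns come out as integers instead of character strings. The `group_by()` step is left out because `if_else()` and `match()` are already vectorised.

```r
library(dplyr)
library(tidyr)

# First three water years from the question, with English month names.
flow <- data.frame(
  Water.Year = c("1953-1954", "1954-1955", "1955-1956"),
  May = c(55.55, 23.49, 9.87),    Jun = c(43.62, 81.35, 51.59),
  Jul = c(30.46, 46.71, 55.36),   Aug = c(26.17, 29.33, 63.03),
  Sep = c(26.76, 67.83, 154.08),  Oct = c(41.74, 133.30, 98.15),
  Nov = c(19.92, 37.62, 104.06),  Dec = c(41.25, 30.16, 32.85),
  Jan = c(28.77, 21.07, 22.89),   Feb = c(20.96, 19.38, 17.30),
  Mar = c(12.47, 13.87, 15.68),   Apr = c(10.51, 10.63, 10.88)
)

calendar <- flow %>%
  # Split "1953-1954" into two integer columns.
  separate(Water.Year, c("Year1", "Year2"), convert = TRUE) %>%
  # Reshape wide -> long: one row per (water year, month).
  pivot_longer(-c(Year1, Year2), names_to = "Month", values_to = "Value") %>%
  mutate(
    # May to Dec (month number >= 5) belong to the first calendar year,
    # Jan to Apr to the second.
    Year  = if_else(match(Month, month.abb) >= 5, Year1, Year2),
    Month = factor(Month, levels = month.abb)
  ) %>%
  select(Year, Month, Value) %>%
  arrange(Year, Month)

calendar
```

Turning Month into a factor with `month.abb` as its levels is what makes `arrange()` sort the months in calendar order rather than alphabetically.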

Regarding "Rearranging data: convert from water year to calendar year", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40210235/
