r - Convert one long row in a data frame into separate records

Reposted. Author: 行者123. Updated: 2023-12-03 09:27:47

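I have a variable-length list of people stored as one long row in a data frame, and I'd like to reorganize these records into a more meaningful format.

My raw data looks like this: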

df <- data.frame(name1 = "John Doe", email1 = "John@Doe.com", phone1 = "(444) 444-4444",
                 name2 = "Jane Doe", email2 = "Jane@Doe.com", phone2 = "(444) 444-4445",
                 name3 = "John Smith", email3 = "John@Smith.com", phone3 = "(444) 444-4446",
                 name4 = NA, email4 = "Jane@Smith.com", phone4 = NA,
                 name5 = NA, email5 = NA, phone5 = NA)
df
#      name1       email1         phone1    name2       email2         phone2
# 1 John Doe John@Doe.com (444) 444-4444 Jane Doe Jane@Doe.com (444) 444-4445
#        name3         email3         phone3 name4         email4 phone4 name5
# 1 John Smith John@Smith.com (444) 444-4446    NA Jane@Smith.com     NA    NA
#   email5 phone5
# 1     NA     NA

I'm trying to bend it into a format like this:

df_transform <- structure(list(name = structure(c(2L, 1L, 3L, NA, NA), .Label = c("Jane Doe",
"John Doe", "John Smith"), class = "factor"), email = structure(c(3L,
1L, 4L, 2L, NA), .Label = c("Jane@Doe.com", "Jane@Smith.com",
"John@Doe.com", "John@Smith.com"), class = "factor"), phone = structure(c(1L,
2L, 3L, NA, NA), .Label = c("(444) 444-4444", "(444) 444-4445",
"(444) 444-4446"), class = "factor")), .Names = c("name", "email",
"phone"), class = "data.frame", row.names = c(NA, -5L))
df_transform
#         name          email          phone
# 1   John Doe   John@Doe.com (444) 444-4444
# 2   Jane Doe   Jane@Doe.com (444) 444-4445
# 3 John Smith John@Smith.com (444) 444-4446
# 4       <NA> Jane@Smith.com           <NA>
# 5       <NA>           <NA>           <NA>

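I should add that it won't always be five records; it can be any number from 1 to 99. I tried using melt from reshape2 and t(), but it got complicated. I imagine there is some well-known method that I simply don't know about.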

Best answer

You're on the right track. Try the following:

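library(reshape2)

# melt it down: t(df) is a 15 x 1 matrix here, and melt() turns it into one
# row per cell with columns Var1 (original column name), Var2 and value
df.melted = melt(t(df))
# get rid of the numbers at the end, so name1, name2, ... all become name
df.melted$Var1 = sub('[0-9]+$', '', df.melted$Var1)

# cast it back: (i - 1) %/% 3 assigns rows 1-3 to record 0, rows 4-6 to
# record 1, and so on (3 is the number of fields per person); [,-1] drops
# that helper column from the result
dcast(df.melted, (seq_len(nrow(df.melted)) - 1) %/% 3 ~ Var1)[,-1]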
#           email       name          phone
#1   John@Doe.com   John Doe (444) 444-4444
#2   Jane@Doe.com   Jane Doe (444) 444-4445
#3 John@Smith.com John Smith (444) 444-4446
#4 Jane@Smith.com       <NA>           <NA>
#5           <NA>       <NA>           <NA>
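
The answer above hard-codes 3 fields per person and returns the columns in alphabetical order. As a possible alternative, here is a minimal base R sketch that derives the field names from the column names and keeps them in their original order. The helper name row_to_records and the sample data frame df_small are hypothetical and not part of the original answer, and the sketch assumes the columns are grouped by person with the same fields in the same order for everyone.

```r
# Hypothetical helper (not from the original answer): reshape a one-row data
# frame with columns like name1, email1, phone1, name2, ... into records.
# Assumption: columns are grouped by person, and every person has the same
# fields in the same order.
row_to_records <- function(df) {
  # Field names with the trailing record number removed: name, email, phone
  fields <- unique(sub("[0-9]+$", "", names(df)))
  # Take the single cell of each column as character, so that factors and
  # logical NA columns combine into one character vector
  values <- vapply(df, function(x) as.character(x)[1], character(1))
  # Fill row by row: each run of length(fields) values becomes one record
  records <- matrix(values, ncol = length(fields), byrow = TRUE,
                    dimnames = list(NULL, fields))
  as.data.frame(records, stringsAsFactors = FALSE)
}

# Small made-up input with two people, the second one partly missing
df_small <- data.frame(
  name1 = "John Doe", email1 = "John@Doe.com", phone1 = "(444) 444-4444",
  name2 = NA, email2 = "Jane@Smith.com", phone2 = NA
)
records <- row_to_records(df_small)
print(records)
```

Because the number of records is simply the number of columns divided by the number of distinct fields, the same call should work for any count from 1 to 99, as long as the grouping assumption holds. If the column count is not a multiple of the field count, matrix() will warn, which is a useful signal that the input does not match the assumed layout.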

Regarding "r - Convert one long row in a data frame into separate records", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/16739357/
