
r - Convert a single column of multi-row data into multiple columns and rows

Reposted. Author: 行者123. Updated: 2023-12-02 09:21:35

I have the output of web-scraped data in R, which looks like this:

Name1
Email: email1@xyz.com
City/Town: Location1
Name2
Email: email2@abc.com
City/Town: Location2
Name3
Email: email3@pqr.com
City/Town: Location3

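Some names may have no email or no location. I want to convert the data above into a tabular format. The output should look something like this: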

Name   Email           City/Town
Name1  email1@xyz.com  Location1
Name2  email2@abc.com  Location2
Name3  email3@pqr.com  Location3
Name4                  Location4
Name5  email5@abc.com

Best answer

Using:

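# read the lines from the textConnection defined at the end of this answer
txt <- readLines(txt)

library(data.table)
library(zoo)

dt <- data.table(txt = txt)

# 1. lines without a colon are treated as names
# 2. na.locf() carries each name forward over the lines that follow it
# 3. and 4. strip the 'Email: ' and 'City/Town: ' prefixes into their own columns
# 5. drop the name lines and collapse the remaining values per name
dt[!grepl(':', txt), name := txt
][, name := na.locf(name)
][grepl('^Email:', txt), email := sub('Email: ','',txt)
][grepl('^City/Town:', txt), city_town := sub('City/Town: ','',txt)
][txt != name, lapply(.SD, function(x) toString(na.omit(x))), by = name, .SDcols = c('email','city_town')]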

gives:

    name          email city_town
1: Name1 email1@xyz.com Location1
2: Name2 email2@abc.com Location2
3: Name3 email3@pqr.com Location3
4: Name4                Location4
5: Name5 email5@abc.com
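
For readers who cannot install data.table or zoo, the same grouping idea can be sketched in base R. This is not the answer's method, only an illustration of the logic: `parse_records` is a hypothetical helper name, and the sketch assumes, like the answer, that a name line never contains a colon.

```r
# Base-R sketch: number the records with cumsum(), then collect the
# "Key: value" lines that belong to each record.
parse_records <- function(lines) {            # hypothetical helper name
  is_name   <- !grepl(":", lines, fixed = TRUE)  # assumption: names have no colon
  record_id <- cumsum(is_name)                # record number for every line
  names_vec <- lines[is_name]

  # values of one key, comma-separated per record ("" when the key is absent)
  collect <- function(prefix) {
    hit  <- startsWith(lines, prefix)
    vals <- trimws(substring(lines, nchar(prefix) + 1))
    vapply(seq_along(names_vec),
           function(i) toString(vals[hit & record_id == i]),
           character(1))
  }

  data.frame(name      = names_vec,
             email     = collect("Email:"),
             city_town = collect("City/Town:"),
             stringsAsFactors = FALSE)
}

lines <- c("Name1", "Email: email1@xyz.com", "City/Town: Location1",
           "Name2", "Email: email2@abc.com", "City/Town: Location2",
           "Name3", "Email: email3@pqr.com", "City/Town: Location3",
           "Name4", "City/Town: Location4",
           "Name5", "Email: email5@abc.com")

res <- parse_records(lines)
print(res)
```

One difference in behaviour: this sketch keeps a name that has neither an email nor a city line, whereas the `txt != name` filter in the data.table chain drops such a name from the result.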

This also works with real names. With @UweBlock's data you will get:

                  name          email city_town
1:            John Doe email1@xyz.com Location1
2: Save the World Fund email2@abc.com Location2
3:     Best Shoes Ltd. email3@pqr.com Location3
4:              Mother                Location4
5:                Jane email5@abc.com

With multiple keys per section (again using @UweBlock's data):

                  name                          email             city_town
1:            John Doe email1@xyz.com, email1@abc.com             Location1
2: Save the World Fund                 email2@abc.com             Location2
3:     Best Shoes Ltd.                 email3@pqr.com             Location3
4:              Mother                                Location4, everywhere
5:                Jane                 email5@abc.com
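
@UweBlock's input is not reproduced in this post. The sketch below uses an assumed input, reconstructed from the output above (the order of lines within each record is a guess), and runs it through the same chain to show how repeated keys are joined by `toString()`.

```r
library(data.table)
library(zoo)

# Assumed input, reconstructed from the output shown above;
# this is not @UweBlock's actual data.
txt <- c("John Doe",
         "Email: email1@xyz.com",
         "Email: email1@abc.com",
         "City/Town: Location1",
         "Save the World Fund",
         "Email: email2@abc.com",
         "City/Town: Location2",
         "Best Shoes Ltd.",
         "Email: email3@pqr.com",
         "City/Town: Location3",
         "Mother",
         "City/Town: Location4",
         "City/Town: everywhere",
         "Jane",
         "Email: email5@abc.com")

dt <- data.table(txt = txt)

# Same chain as in the answer; toString() joins repeated keys with ", "
res <- dt[!grepl(':', txt), name := txt
          ][, name := na.locf(name)
            ][grepl('^Email:', txt), email := sub('Email: ', '', txt)
              ][grepl('^City/Town:', txt), city_town := sub('City/Town: ', '', txt)
                ][txt != name, lapply(.SD, function(x) toString(na.omit(x))),
                  by = name, .SDcols = c('email', 'city_town')]
print(res)
```

Because `na.omit()` removes the missing entries before `toString()` is applied, a record without a given key ends up with an empty string in that column rather than `NA`.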

Data used:

txt <- textConnection("Name1
Email: email1@xyz.com
City/Town: Location1
Name2
Email: email2@abc.com
City/Town: Location2
Name3
Email: email3@pqr.com
City/Town: Location3
Name4
City/Town: Location4
Name5
Email: email5@abc.com")

Regarding "r - Convert a single column of multi-row data into multiple columns and rows", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/44795372/
