xml - How do I create an R data frame from an XML file?

Reposted. Author: 数据小太阳. Updated: 2023-10-29 01:43:22

I have an XML document file. Part of the file looks like this:

<attr>
  <attrlabl>COUNTY</attrlabl>
  <attrdef>County abbreviation</attrdef>
  <attrtype>Text</attrtype>
  <attwidth>1</attwidth>
  <atnumdec>0</atnumdec>
  <attrdomv>
    <edom>
      <edomv>C</edomv>
      <edomvd>Clackamas County</edomvd>
      <edomvds/>
    </edom>
    <edom>
      <edomv>M</edomv>
      <edomvd>Multnomah County</edomvd>
      <edomvds/>
    </edom>
    <edom>
      <edomv>W</edomv>
      <edomvd>Washington County</edomvd>
      <edomvds/>
    </edom>
  </attrdomv>
</attr>

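From this XML file, I want to create an R data frame with the columns attrlabl, attrdef, attrtype, and attrdomv. Note that the attrdomv column should contain all levels of the categorical variable. The data frame should look like this: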

attrlabl  attrdef              attrtype  attrdomv
COUNTY    County abbreviation  Text      C Clackamas County; M Multnomah County; W Washington County

This is the incomplete code I have so far:

doc <- xmlParse("taxlots.shp.xml")  
dataDictionary <- xmlToDataFrame(getNodeSet(doc,"//attrlabl"))

Can you complete my R code? Thanks for your help!

Best answer

Assuming this is the correct taxlots.shp.xml file:

<attr>  
<attrlabl>COUNTY</attrlabl>
<attrdef>County abbreviation</attrdef>
<attrtype>Text</attrtype>
<attwidth>1</attwidth>
<atnumdec>0</atnumdec>
<attrdomv>
<edom>
<edomv>C</edomv>
<edomvd>Clackamas County</edomvd>
<edomvds/>
</edom>
<edom>
<edomv>M</edomv>
<edomvd>Multnomah County</edomvd>
<edomvds/>
</edom>
<edom>
<edomv>W</edomv>
<edomvd>Washington County</edomvd>
<edomvds/>
</edom>
</attrdomv>
</attr>

You are almost there:

library(XML)

# Parse the file, then turn each <attr> node into one data frame row
doc <- xmlParse("taxlots.shp.xml")
xmlToDataFrame(nodes = getNodeSet(doc, "//attr"))[c("attrlabl", "attrdef", "attrtype", "attrdomv")]
  attrlabl             attrdef attrtype                                             attrdomv
1   COUNTY County abbreviation     Text CClackamas CountyMMultnomah CountyWWashington County

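But the last field is not in the format you want, because the text of every element nested under attrdomv is concatenated without separators. Getting the desired format takes a few extra steps: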

# One row per <edom> entry in the value domain
step1 <- xmlToDataFrame(nodes = getNodeSet(doc, "//attrdomv/edom"))
step1
  edomv            edomvd edomvds
1     C  Clackamas County
2     M  Multnomah County
3     W Washington County

step2 <- paste(paste(step1$edomv, step1$edomvd, sep=" "), collapse="; ")
step2
[1] "C Clackamas County; M Multnomah County; W Washington County"

# Combine the three plain columns with the collapsed domain string
cbind(xmlToDataFrame(nodes = getNodeSet(doc, "//attr"))[c("attrlabl", "attrdef", "attrtype")],
      attrdomv = step2)
  attrlabl             attrdef attrtype                                                    attrdomv
1   COUNTY County abbreviation     Text C Clackamas County; M Multnomah County; W Washington County
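
The steps above collect every edom entry in the document, which works when the file holds a single attr element. If the real file contains several attr elements, the domain string probably needs to be built per node. The following is only a sketch under assumptions: the `<detailed>` wrapper, the second `TLID` attribute, and the helper names `child_text` and `domain_text` are made up for illustration and do not come from the original file or answer.

```r
library(XML)

# Inline sample so the sketch is self-contained. The <detailed> wrapper and
# the TLID attribute are hypothetical; only COUNTY comes from the question.
xml_text <- '<detailed>
  <attr>
    <attrlabl>COUNTY</attrlabl>
    <attrdef>County abbreviation</attrdef>
    <attrtype>Text</attrtype>
    <attrdomv>
      <edom><edomv>C</edomv><edomvd>Clackamas County</edomvd><edomvds/></edom>
      <edom><edomv>M</edomv><edomvd>Multnomah County</edomvd><edomvds/></edom>
      <edom><edomv>W</edomv><edomvd>Washington County</edomvd><edomvds/></edom>
    </attrdomv>
  </attr>
  <attr>
    <attrlabl>TLID</attrlabl>
    <attrdef>Taxlot identifier</attrdef>
    <attrtype>Text</attrtype>
  </attr>
</detailed>'

doc <- xmlParse(xml_text, asText = TRUE)
attrs <- getNodeSet(doc, "//attr")

# Text of the first matching child element, or NA when it is absent.
child_text <- function(node, path) {
  hits <- xpathSApply(node, path, xmlValue)
  if (length(hits) == 0) NA_character_ else hits[[1]]
}

# "code description" pairs for one <attr> node, joined with "; ".
# Returns NA when the node has no enumerated domain.
domain_text <- function(node) {
  edoms <- getNodeSet(node, "./attrdomv/edom")
  if (length(edoms) == 0) return(NA_character_)
  pairs <- vapply(edoms, function(e) {
    paste(child_text(e, "./edomv"), child_text(e, "./edomvd"))
  }, character(1))
  paste(pairs, collapse = "; ")
}

# One row per <attr> node.
dataDictionary <- data.frame(
  attrlabl = vapply(attrs, child_text, character(1), path = "./attrlabl"),
  attrdef  = vapply(attrs, child_text, character(1), path = "./attrdef"),
  attrtype = vapply(attrs, child_text, character(1), path = "./attrtype"),
  attrdomv = vapply(attrs, domain_text, character(1)),
  stringsAsFactors = FALSE
)
```

With a file on disk, `xmlParse("taxlots.shp.xml")` would replace the inline string. Attributes without an enumerated domain end up with `NA` in attrdomv rather than an empty string, which is a design choice that can be changed in `domain_text`.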

This question, "xml - How do I create an R data frame from an XML file?", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/13579996/
