
python - Select specific lines and cells from a text file and put them into a data frame: Python or R

Reposted. Author: 太空宇宙. Updated: 2023-11-04 07:33:18

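Either Python or R can be used for this, but can someone suggest how to select the "Basic Stats" rows from a text file that looks like the one below? I would like to put this information, together with the name of each ROI, into a pandas DataFrame or a data table in R.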

ROI: mrc_ranch_house [Red] 195 points

Basic Stats Min Max Mean Stdev
Band 1 -20.208261 6.025762 -8.866403 5.289712

Histogram DN Npts Total Percent Acc Pct
Band 1 -20.208261 1 1 0.5128 0.5128
Bin=0.10287 -20.105383 0 1 0.0000 0.5128
-20.002504 1 2 0.5128 1.0256
-19.899626 0 2 0.0000 1.0256
-19.796747 0 2 0.0000 1.0256
-19.693869 0 2 0.0000 1.0256
-19.590990 0 2 0.0000 1.0256
-19.488112 0 2 0.0000 1.0256

Stats for ROI: river_1 [Blue] 90 points
Basic Stats Min Max Mean Stdev
Band 1 -20.187374 -6.694543 -12.227586 2.66464

Histogram DN Npts Total Percent Acc Pct
Band 1 -20.187374 1 1 1.1111 1.1111
Bin=0.05291 -20.134461 0 1 0 1.1111
-20.081548 0 1 0 1.1111
-20.028635 0 1 0 1.1111
-19.975722 0 1 0 1.1111


Stats for ROI: river_2 [Blue] 96 points
Basic Stats Min Max Mean Stdev
Band 1 -18.365091 -5.820825 -13.164463 2.851231

Histogram DN Npts Total Percent Acc Pct
Band 1 -18.365091 1 1 1.0417 1.0417
Bin=0.04919 -18.315898 0 1 0 1.0417
-18.266705 0 1 0 1.0417
-18.217512 0 1 0 1.0417

The final output should look like this:

ROI              Min         Max        Mean        Stdev
mrc_ranch_house  -20.208261   6.025762   -8.866403  5.289712
river_1          -20.187374  -6.694543  -12.227586  2.66464
river_2          -18.365091  -5.820825  -13.164463  2.851231

...and so on.

Thanks!

Best answer

In R, you can use:

# read the text file
txt <- readLines('https://dl.dropboxusercontent.com/u/45095175/rois_all.txt')

# create an index for the lines that are needed
ti <- rep(which(grepl('ROI:', txt)), each = 3) + 1:3
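# create a grouping vector of the same length (one group per ROI);
# derived from 'ti' so the number of ROIs is not hard-coded
grp <- rep(seq_len(length(ti) / 3), each = 3)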

# filter the text with the index 'ti'
# and split into a list with grouping variable 'grp'
lst <- split(txt[ti], grp)
# loop over the list and read the text parts in as data frames
lst <- lapply(lst, function(x) read.table(text = x, sep = '\t', header = TRUE,
                                          blank.lines.skip = TRUE))

# bind the dataframes in the list together in one data.frame
DF <- do.call(rbind, lst)
# change the name of the first column
names(DF)[1] <- 'ROI'

# extract the ROI names for the ROI column
DF$ROI <- sub('.*: (\\w+).*$', '\\1', txt[grepl('ROI: ', txt)])

which gives:

> DF
ROI Min Max Mean Stdev
1 mrc_ranch_house -20.208261 6.025762 -8.866403 5.289712
2 river_1 -20.187374 -6.694543 -12.227586 2.664640
3 river_2 -18.365091 -5.820825 -13.164463 2.851231
4 river_3 -18.291010 -4.583666 -12.092995 3.479293
5 river_4 -17.074295 -4.926921 -9.970926 2.897855
6 river_5 -16.849176 -8.622208 -12.387085 2.168462
7 adjacent_river_2 -18.987597 -7.957749 -13.392523 1.962263
8 adjacent_river_3 -19.426531 -8.640042 -13.467425 1.888105
9 adjacent_river_4 -20.452566 -6.830183 -12.833450 2.124761
10 bcs_1_ -23.612043 -8.221417 -16.032305 2.080695
11 bcs_2_ -24.018219 -10.648975 -16.814048 1.948863
12 bcs_3_ -23.011086 -9.106754 -15.404174 1.867498
13 red_1_ -22.313442 -7.839107 -14.768196 2.134152
14 red_2_ -22.551537 -7.236300 -14.613618 2.204253
15 red_3_ -22.057703 -7.746992 -14.483161 2.123497
16 bcs_4 -22.705107 -8.972753 -15.201623 1.817122
17 bcs_5 -24.109459 -10.113716 -15.776537 1.849163
18 glade_1_ -19.913187 -6.189866 -12.695884 3.303929
19 glade_2_ -19.812855 -4.672865 -11.995191 4.840168
20 glade_3_ -10.078033 -2.828722 -5.877417 1.941401
21 mwea_b -13.979379 -4.977155 -11.392434 2.019037
22 kaga -13.114172 -8.889531 -10.649324 1.290551
23 huku -14.206743 -7.853305 -10.608210 1.441250
24 ruai -18.643108 -12.645180 -14.540123 1.224183
25 tumaini -19.543234 -13.164941 -15.899968 1.812876
26 nkando -19.973492 -7.040238 -11.716987 2.617544
27 jikaze -16.408030 -9.001065 -12.323898 1.942196
28 miarage_b -15.126486 -6.661448 -10.391111 1.764279
29 batian -15.269146 -9.603316 -11.962470 1.168859
30 gitaraga -17.037708 -7.495215 -10.886802 2.561877
31 wiumiririe -9.578024 -6.225223 -7.688715 1.059796
32 chumvi -14.883148 -10.327570 -12.819469 1.231636
33 next_to_airstrip -17.242777 -5.207252 -10.601750 1.987712
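
The `sub()` call that fills the ROI column keeps only the first run of word characters (letters, digits, underscores) after the colon and space, and discards the rest of the header line. As a small illustration only, the same pattern can be tried in Python with `re.sub`; the header strings below are copied from the sample in the question, and the variable names `headers` and `names` are just placeholders chosen for this sketch.

```python
import re

# Header lines copied from the sample file shown in the question.
headers = [
    "ROI: mrc_ranch_house [Red] 195 points",
    "Stats for ROI: river_1 [Blue] 90 points",
    "Stats for ROI: river_2 [Blue] 96 points",
]

# Same pattern as the R sub() call: skip everything up to ": ",
# capture the following word characters, and drop the remainder.
names = [re.sub(r".*: (\w+).*$", r"\1", h) for h in headers]
print(names)
```

Because `\w` also matches underscores, names with a trailing underscore such as `bcs_1_` are kept whole, while the colour tag in square brackets and the point count are dropped.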

The last part of the R code (from binding the list into a single data frame onward) can also be done with the rbindlist function from the data.table package:

# load the 'data.table' package for the 'rbindlist' function
library(data.table)
# bind the dataframes in the list together to a data.table (enhanced version of a data.frame)
DT <- rbindlist(lst)
# change the name of the first column
setnames(DT, 1, 'ROI')

# extract the ROI names for the ROI column
DT[, ROI := sub('.*: (\\w+).*$', '\\1', txt[grepl('ROI: ', txt)])]
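
Since the question also allows Python, here is a minimal sketch of one possible approach there. It is not part of the original answer. It assumes each ROI block has exactly one band row directly under its "Basic Stats" header, as in the sample, and the names `SAMPLE`, `COLUMNS` and `parse_basic_stats` are hypothetical, introduced only for this illustration. It uses only the standard library and splits on any whitespace, so it should work whether the real file is separated by tabs or spaces.

```python
import re

# A shortened copy of the sample text from the question (an assumption:
# the real file follows the same layout).
SAMPLE = """\
ROI: mrc_ranch_house [Red] 195 points

Basic Stats Min Max Mean Stdev
Band 1 -20.208261 6.025762 -8.866403 5.289712

Histogram DN Npts Total Percent Acc Pct
Band 1 -20.208261 1 1 0.5128 0.5128
Bin=0.10287 -20.105383 0 1 0.0000 0.5128

Stats for ROI: river_1 [Blue] 90 points
Basic Stats Min Max Mean Stdev
Band 1 -20.187374 -6.694543 -12.227586 2.66464

Histogram DN Npts Total Percent Acc Pct
Band 1 -20.187374 1 1 1.1111 1.1111
"""

COLUMNS = ("Min", "Max", "Mean", "Stdev")


def parse_basic_stats(text):
    """Return one dict per ROI with its name and Basic Stats values."""
    rows = []
    roi = None
    expect_band = False  # True once a "Basic Stats" header has been seen
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        match = re.search(r"ROI:\s*(\w+)", line)
        if match:
            roi = match.group(1)
            expect_band = False
        elif fields[:2] == ["Basic", "Stats"]:
            expect_band = True
        elif expect_band and roi is not None:
            # The last four fields of the band row are Min, Max, Mean, Stdev.
            values = [float(v) for v in fields[-4:]]
            rows.append({"ROI": roi, **dict(zip(COLUMNS, values))})
            expect_band = False  # ignore the histogram rows that follow
    return rows


rows = parse_basic_stats(SAMPLE)
for row in rows:
    print(row)
```

If pandas is installed, passing the result to `pandas.DataFrame(rows)` should give a data frame with the columns ROI, Min, Max, Mean and Stdev, which is the shape the question asks for. To run it on the real file, read its contents into a string and pass that to the function in place of `SAMPLE`.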

Regarding "python - Select specific lines and cells from a text file and put them into a data frame: Python or R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42513614/
