
linux - Parsing a complex text file into one line of field names and a second line of values

Reposted. Author: 太空宇宙. Updated: 2023-11-04 04:17:11

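I am trying to parse a text file whose first part consists of lines that mostly contain text and a single number (each of these lines starts with "#"). The second part of the file consists of lines with multiple numbers, all relating to a single structure. Since I need to combine these output files for hundreds of cases, it would help a great deal if I could process each file into a single row of data. I am struggling with a combination of bash/perl/awk. Can anyone suggest a way to do this? (Sample file below.)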

Thanks for your consideration.

Best wishes,

-S

# Title Segmentation Statistics
#
# generating_program mri_segstats
# cvs_version $Id: mri_segstats.c,v 1.75.2.9 2013/02/16 00:09:33 greve Exp $
# cmdline mri_segstats --seg mri/aseg.mgz --sum stats/aseg.stats --pv mri/norm.mgz --empty --brainmask mri/brainmask.mgz --brain-vol-from-seg --excludeid 0 --excl-ctxgmwm --supratent --subcortgray --in mri/norm.mgz --in-intensity-name norm --in-intensity-units MR --etiv --surf-wm-vol --surf-ctx-vol --totalgray --euler --ctab /mnt/glusterfs/salsoman/freesurfer/ASegStatsLUT.txt --subject WCA_0162_T1_FS
# sysname Linux
# hostname barley15.stanford.edu
# machine x86_64
# user salsoman
# anatomy_type volume
#
# SUBJECTS_DIR /mnt/glusterfs/salsoman/output/FS
# subjectname WCA_0162_T1_FS
# Measure BrainSeg, BrainSegVol, Brain Segmentation Volume, 1089921.000000, mm^3
# Measure BrainSegNotVent, BrainSegVolNotVent, Brain Segmentation Volume Without Ventricles, 993734.000000, mm^3
# Measure BrainSegNotVentSurf, BrainSegVolNotVentSurf, Brain Segmentation Volume Without Ventricles from Surf, 993214.631437, mm^3
# Measure lhCortex, lhCortexVol, Left hemisphere cortical gray matter volume, 240339.518738, mm^3
# Measure rhCortex, rhCortexVol, Right hemisphere cortical gray matter volume, 236468.599276, mm^3
# Measure Cortex, CortexVol, Total cortical gray matter volume, 476808.118013, mm^3
# Measure lhCorticalWhiteMatter, lhCorticalWhiteMatterVol, Left hemisphere cortical white matter volume, 191135.667925, mm^3
# Measure rhCorticalWhiteMatter, rhCorticalWhiteMatterVol, Right hemisphere cortical white matter volume, 180013.845498, mm^3
# Measure CorticalWhiteMatter, CorticalWhiteMatterVol, Total cortical white matter volume, 371149.513423, mm^3
# Measure SubCortGray, SubCortGrayVol, Subcortical gray matter volume, 52383.000000, mm^3
# Measure TotalGray, TotalGrayVol, Total gray matter volume, 604954.118013, mm^3
# Measure SupraTentorial, SupraTentorialVol, Supratentorial volume, 991108.631437, mm^3
# Measure SupraTentorialNotVent, SupraTentorialVolNotVent, Supratentorial volume, 902611.631437, mm^3
# Measure SupraTentorialNotVentVox, SupraTentorialVolNotVentVox, Supratentorial volume voxel count, 900542.000000, mm^3
# Measure Mask, MaskVol, Mask Volume, 1694747.000000, mm^3
# Measure BrainSegVol-to-eTIV, BrainSegVol-to-eTIV, Ratio of BrainSegVol to eTIV, 0.624390, unitless
# Measure MaskVol-to-eTIV, MaskVol-to-eTIV, Ratio of MaskVol to eTIV, 0.970881, unitless
# Measure lhSurfaceHoles, lhSurfaceHoles, Number of defect holes in lh surfaces prior to fixing, 239, unitless
# Measure rhSurfaceHoles, rhSurfaceHoles, Number of defect holes in rh surfaces prior to fixing, 227, unitless
# Measure SurfaceHoles, SurfaceHoles, Total number of defect holes in surfaces prior to fixing, 466, unitless
# Measure EstimatedTotalIntraCranialVol, eTIV, Estimated Total Intracranial Volume, 1745576.756023, mm^3
# SegVolFile mri/aseg.mgz
# SegVolFileTimeStamp 2013/03/27 19:34:08
# ColorTable /mnt/glusterfs/salsoman/freesurfer/ASegStatsLUT.txt
# ColorTableTimeStamp 2013/02/25 22:23:16
# InVolFile mri/norm.mgz
# InVolFileTimeStamp 2013/03/27 14:00:28
# InVolFrame 0
# PVVolFile mri/norm.mgz
# PVVolFileTimeStamp 2013/03/27 14:00:28
# Excluding Cortical Gray and White Matter
# ExcludeSegId 0 2 3 41 42
# VoxelVolume_mm3 1
# TableCol 1 ColHeader Index
# TableCol 1 FieldName Index
# TableCol 1 Units NA
# TableCol 2 ColHeader SegId
# TableCol 2 FieldName Segmentation Id
# TableCol 2 Units NA
# TableCol 3 ColHeader NVoxels
# TableCol 3 FieldName Number of Voxels
# TableCol 3 Units unitless
# TableCol 4 ColHeader Volume_mm3
# TableCol 4 FieldName Volume
# TableCol 4 Units mm^3
# TableCol 5 ColHeader StructName
# TableCol 5 FieldName Structure Name
# TableCol 5 Units NA
# TableCol 6 ColHeader normMean
# TableCol 6 FieldName Intensity normMean
# TableCol 6 Units MR
# TableCol 7 ColHeader normStdDev
# TableCol 7 FieldName Itensity normStdDev
# TableCol 7 Units MR
# TableCol 8 ColHeader normMin
# TableCol 8 FieldName Intensity normMin
# TableCol 8 Units MR
# TableCol 9 ColHeader normMax
# TableCol 9 FieldName Intensity normMax
# TableCol 9 Units MR
# TableCol 10 ColHeader normRange
# TableCol 10 FieldName Intensity normRange
# TableCol 10 Units MR
# NRows 45
# NTableCols 10
# ColHeaders Index SegId NVoxels Volume_mm3 StructName normMean normStdDev normMin normMax normRange
1 4 41962 41962.4 Left-Lateral-Ventricle 22.0753 10.2057 3.0000 94.0000 91.0000
2 5 2150 2149.7 Left-Inf-Lat-Vent 37.5636 16.3886 5.0000 89.0000 84.0000
3 7 8273 8273.3 Left-Cerebellum-White-Matter 88.0903 11.6908 21.0000 123.0000 102.0000
4 8 35427 35427.4 Left-Cerebellum-Cortex 56.4255 12.5475 2.0000 92.0000 90.0000
5 10 6087 6086.7 Left-Thalamus-Proper 92.2098 11.7928 50.0000 124.0000 74.0000
6 11 5101 5100.7 Left-Caudate 75.0335 9.9708 29.0000 100.0000 71.0000
7 12 4773 4773.0 Left-Putamen 75.7113 6.2195 48.0000 95.0000 47.0000
8 13 1178 1177.6 Left-Pallidum 86.3354 6.2568 59.0000 104.0000 45.0000
9 14 2973 2973.1 3rd-Ventricle 27.5508 11.3394 9.0000 77.0000 68.0000
10 15 2403 2403.0 4th-Ventricle 26.8237 11.9581 6.0000 79.0000 73.0000
11 16 18347 18347.2 Brain-Stem 82.1731 12.0144 15.0000 116.0000 101.0000
12 17 3824 3824.2 Left-Hippocampus 66.7333 8.6661 26.0000 100.0000 74.0000
13 18 2087 2087.1 Left-Amygdala 63.9856 7.2932 37.0000 91.0000 54.0000
14 24 2094 2094.0 CSF 36.2929 14.6972 12.0000 90.0000 78.0000
15 26 340 340.0 Left-Accumbens-area 69.8967 8.7139 37.0000 87.0000 50.0000
16 28 2969 2969.5 Left-VentralDC 94.9737 13.6527 44.0000 122.0000 78.0000
17 30 76 75.9 Left-vessel 58.3205 11.6736 27.0000 80.0000 53.0000
18 31 1103 1102.6 Left-choroid-plexus 51.7182 16.3692 12.0000 100.0000 88.0000
19 43 38108 38108.2 Right-Lateral-Ventricle 20.2269 10.2570 0.0000 92.0000 92.0000
20 44 2165 2165.0 Right-Inf-Lat-Vent 30.2048 13.6808 0.0000 80.0000 80.0000
21 46 9715 9715.4 Right-Cerebellum-White-Matter 86.9395 8.3909 25.0000 115.0000 90.0000
22 47 41688 41688.2 Right-Cerebellum-Cortex 57.5291 10.3208 9.0000 91.0000 82.0000
23 49 4769 4769.3 Right-Thalamus-Proper 82.0576 12.2446 18.0000 106.0000 88.0000
24 50 4587 4587.1 Right-Caudate 69.9613 12.7863 11.0000 103.0000 92.0000
25 51 4694 4694.4 Right-Putamen 69.9372 7.9141 48.0000 91.0000 43.0000
26 52 1407 1406.8 Right-Pallidum 88.0501 5.7841 57.0000 105.0000 48.0000
27 53 3160 3159.6 Right-Hippocampus 63.3511 8.9283 17.0000 95.0000 78.0000
28 54 1877 1877.4 Right-Amygdala 57.3686 8.5163 20.0000 83.0000 63.0000
29 58 376 376.0 Right-Accumbens-area 70.4901 9.9104 41.0000 96.0000 55.0000
30 60 2973 2972.7 Right-VentralDC 89.6143 14.1755 29.0000 120.0000 91.0000
31 62 105 105.1 Right-vessel 50.1458 12.1126 21.0000 78.0000 57.0000
32 63 2843 2842.7 Right-choroid-plexus 46.3759 13.8319 6.0000 115.0000 109.0000
33 72 68 67.9 5th-Ventricle 42.4444 11.2861 26.0000 83.0000 57.0000
34 77 25325 25325.0 WM-hypointensities 71.8650 16.2379 5.0000 112.0000 107.0000
35 78 0 0.0 Left-WM-hypointensities 0.0000 0.0000 0.0000 0.0000 0.0000
36 79 0 0.0 Right-WM-hypointensities 0.0000 0.0000 0.0000 0.0000 0.0000
37 80 153 153.1 non-WM-hypointensities 50.4551 16.1478 18.0000 88.0000 70.0000
38 81 0 0.0 Left-non-WM-hypointensities 0.0000 0.0000 0.0000 0.0000 0.0000
39 82 0 0.0 Right-non-WM-hypointensities 0.0000 0.0000 0.0000 0.0000 0.0000
40 85 350 349.6 Optic-Chiasm 66.0833 15.7641 24.0000 102.0000 78.0000
41 251 806 805.6 CC_Posterior 119.2646 18.1322 57.0000 150.0000 93.0000
42 252 252 251.7 CC_Mid_Posterior 109.1685 16.3862 51.0000 150.0000 99.0000
43 253 295 295.4 CC_Central 113.3418 16.2739 77.0000 140.0000 63.0000
44 254 294 293.7 CC_Mid_Anterior 115.1645 17.9396 72.0000 149.0000 77.0000
45 255 657 657.4 CC_Anterior 124.1047 22.5045 54.0000 166.0000 112.0000

Best answer

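Have you tried Talend Open Studio / Data Integration? TOS can automate this kind of complex transformation. The final executable of a data transformation job is a jar file, which you can easily call from a shell script. TOS takes some time to get started with, but it is very powerful. The product is licensed under GPL v2 and has a fairly active community.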

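Of course, you could write some awk/sed/perl/bash script and you would get a result, but with a transformation as complex as yours it could become very unreadable and unmaintainable.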
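For comparison, here is a minimal sketch of what the script route might look like in Python. It is not part of the original answer. It assumes the layout of the sample file in the question: `# Measure` lines carry a short name in the second comma-separated field and the value in the second-to-last field, and the table rows follow the `# ColHeaders` line. The function name `parse_stats` and the `<StructName>_<ColHeader>` naming of table fields are hypothetical choices made for illustration.

```python
import csv
import sys

# Table columns that identify a row rather than carry a measurement.
ID_COLUMNS = ("Index", "SegId", "StructName")


def parse_stats(text):
    """Flatten one stats report into parallel lists: field names and values."""
    names, values = [], []
    col_headers = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# subjectname"):
            # "# subjectname <id>": keep the id so merged rows stay identifiable.
            names.append("subjectname")
            values.append(line.split()[2])
        elif line.startswith("# Measure"):
            # "# Measure <name>, <short name>, <description>, <value>, <units>"
            parts = [p.strip() for p in line[len("# Measure"):].split(",")]
            names.append(parts[1])
            values.append(parts[-2])
        elif line.startswith("# ColHeaders"):
            col_headers = line.split()[2:]
        elif line.startswith("#"):
            continue  # other header lines are ignored in this sketch
        else:
            if not col_headers:
                raise ValueError("table row found before '# ColHeaders' line")
            row = dict(zip(col_headers, line.split()))
            for col, val in row.items():
                if col not in ID_COLUMNS:
                    names.append(f"{row['StructName']}_{col}")
                    values.append(val)
    return names, values


# A shortened excerpt of the sample file from the question.
SAMPLE = """\
# Title Segmentation Statistics
# subjectname WCA_0162_T1_FS
# Measure BrainSeg, BrainSegVol, Brain Segmentation Volume, 1089921.000000, mm^3
# Measure EstimatedTotalIntraCranialVol, eTIV, Estimated Total Intracranial Volume, 1745576.756023, mm^3
# NRows 45
# ColHeaders Index SegId NVoxels Volume_mm3 StructName normMean normStdDev normMin normMax normRange
  1   4  41962  41962.4  Left-Lateral-Ventricle  22.0753 10.2057 3.0000 94.0000 91.0000
"""

names, values = parse_stats(SAMPLE)
writer = csv.writer(sys.stdout)
writer.writerow(names)   # first line: field names
writer.writerow(values)  # second line: values
```

To merge hundreds of cases, the same function could be called once per file, writing the names row only for the first file and one values row for each file after that, provided every file lists the same measures and structures in the same order.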

HTH, Michael

For "linux - Parsing a complex text file into one line of field names and a second line of values", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/15733176/
