gpt4 book ai didi

python - 在python中读取二进制数据文件进行分析

转载 作者:行者123 更新时间:2023-12-04 02:38:54 25 4
gpt4 key购买 nike

我用Fortran将数据写入二进制文件,格式如下

open(unit=99,form='unformatted',status='unknown')
do i=1,N
write(99) (i),(A(i)),(B(i))
enddo
close(99)

这里,AB 是 double 组。python如何读取这个二进制数据文件?

PS:到目前为止,我尝试了以下步骤来读取数据。

with open('fort.99', 'rb') as binfile:
data=binfile.read()

虽然文件读取成功,但我无法对这些数据使用 numpy 函数,这是我大多数分析的首选。

最佳答案

更新的答案

我在 Numpy 中尝试了更多,您可以像这样更清晰地阅读文件 - 下面的信息仍然适用并解释了它是如何工作的:

import numpy as np

# Read file and reshape as "records" of 28 bytes each
n = np.fromfile('fort.99',dtype=np.uint8).reshape(-1,28)
I = n[:,4:8].copy().view(np.int32) # pick bytes 4-8, make contiguous and view as integer
A = n[:,8:16].copy().view(np.float64) # pick bytes 8-16, make contiguous and view as float64
B = n[:,16:24].copy().view(np.float64) # pick bytes 16-24, make contiguous and view as float64

原始答案

我更改了您的程序以包含一些可识别的数据:

      program thing
REAL*8, DIMENSION(128) :: A,B
INTEGER N
N=128

open(unit=99,form='unformatted',status='unknown')
do i=1,N
A(i)=100.0 * i
B(i)=-1000.0 * i
write(99) (2*i),(A(i)),(B(i))
enddo
close(99)
end program

然后我查看了文件大小,它是 3,584 字节,所以我将其除以 128 得到每个 fortran WRITE 的字节数为 28。

所以,用xxd检查数据如下:

xxd -c 28 fort.99

00000000: 1400 0000 0200 0000 0000 0000 0000 5940 0000 0000 0040 8fc0 1400 0000 ..............Y@.....@......
0000001c: 1400 0000 0400 0000 0000 0000 0000 6940 0000 0000 0040 9fc0 1400 0000 ..............i@.....@......
00000038: 1400 0000 0600 0000 0000 0000 00c0 7240 0000 0000 0070 a7c0 1400 0000 ..............r@.....p......
00000054: 1400 0000 0800 0000 0000 0000 0000 7940 0000 0000 0040 afc0 1400 0000 ..............y@.....@......
00000070: 1400 0000 0a00 0000 0000 0000 0040 7f40 0000 0000 0088 b3c0 1400 0000 .............@.@............
0000008c: 1400 0000 0c00 0000 0000 0000 00c0 8240 0000 0000 0070 b7c0 1400 0000 ...............@.....p......
000000a8: 1400 0000 0e00 0000 0000 0000 00e0 8540 0000 0000 0058 bbc0 1400 0000 ...............@.....X......
000000c4: 1400 0000 1000 0000 0000 0000 0000 8940 0000 0000 0040 bfc0 1400 0000 ...............@.....@......
000000e0: 1400 0000 1200 0000 0000 0000 0020 8c40 0000 0000 0094 c1c0 1400 0000 ............. .@............
000000fc: 1400 0000 1400 0000 0000 0000 0040 8f40 0000 0000 0088 c3c0 1400 0000 .............@.@............

因此每个 fortran WRITE 有 28 个字节,每个记录的开头和结尾有 4 个字节的索引。

然后我像这样解码它们:

#!/usr/bin/env python3

import struct

# Important to open the file in binary mode, 'b'
with open('fort.99','rb') as f:
# You should really read until error in case there are more or fewer than 128 records one day - but you know there are 128

for i in range(128):
# Read one record, i.e. output of one fortran WRITE
record = f.read(28)
# Unpack 2 INTEGERs, 2 DOUBLEs and an INTEGER
_, I, A, B, _ = struct.unpack('2i2di',record)
print(I,A,B)

输出

2 100.0 -1000.0
4 200.0 -2000.0
6 300.0 -3000.0
8 400.0 -4000.0
10 500.0 -5000.0
12 600.0 -6000.0
14 700.0 -7000.0
16 800.0 -8000.0
18 900.0 -9000.0
20 1000.0 -10000.0
22 1100.0 -11000.0
24 1200.0 -12000.0
26 1300.0 -13000.0
28 1400.0 -14000.0
30 1500.0 -15000.0
32 1600.0 -16000.0
34 1700.0 -17000.0
36 1800.0 -18000.0
38 1900.0 -19000.0
40 2000.0 -20000.0
42 2100.0 -21000.0
44 2200.0 -22000.0
46 2300.0 -23000.0
48 2400.0 -24000.0
50 2500.0 -25000.0
52 2600.0 -26000.0
54 2700.0 -27000.0
56 2800.0 -28000.0
58 2900.0 -29000.0
60 3000.0 -30000.0
62 3100.0 -31000.0
64 3200.0 -32000.0
66 3300.0 -33000.0
68 3400.0 -34000.0
70 3500.0 -35000.0
72 3600.0 -36000.0
74 3700.0 -37000.0
76 3800.0 -38000.0
78 3900.0 -39000.0
80 4000.0 -40000.0
82 4100.0 -41000.0
84 4200.0 -42000.0
86 4300.0 -43000.0
88 4400.0 -44000.0
90 4500.0 -45000.0
92 4600.0 -46000.0
94 4700.0 -47000.0
96 4800.0 -48000.0
98 4900.0 -49000.0
100 5000.0 -50000.0
102 5100.0 -51000.0
104 5200.0 -52000.0
106 5300.0 -53000.0
108 5400.0 -54000.0
110 5500.0 -55000.0
112 5600.0 -56000.0
114 5700.0 -57000.0
116 5800.0 -58000.0
118 5900.0 -59000.0
120 6000.0 -60000.0
122 6100.0 -61000.0
124 6200.0 -62000.0
126 6300.0 -63000.0
128 6400.0 -64000.0
130 6500.0 -65000.0
132 6600.0 -66000.0
134 6700.0 -67000.0
136 6800.0 -68000.0
138 6900.0 -69000.0
140 7000.0 -70000.0
142 7100.0 -71000.0
144 7200.0 -72000.0
146 7300.0 -73000.0
148 7400.0 -74000.0
150 7500.0 -75000.0
152 7600.0 -76000.0
154 7700.0 -77000.0
156 7800.0 -78000.0
158 7900.0 -79000.0
160 8000.0 -80000.0
162 8100.0 -81000.0
164 8200.0 -82000.0
166 8300.0 -83000.0
168 8400.0 -84000.0
170 8500.0 -85000.0
172 8600.0 -86000.0
174 8700.0 -87000.0
176 8800.0 -88000.0
178 8900.0 -89000.0
180 9000.0 -90000.0
182 9100.0 -91000.0
184 9200.0 -92000.0
186 9300.0 -93000.0
188 9400.0 -94000.0
190 9500.0 -95000.0
192 9600.0 -96000.0
194 9700.0 -97000.0
196 9800.0 -98000.0
198 9900.0 -99000.0
200 10000.0 -100000.0
202 10100.0 -101000.0
204 10200.0 -102000.0
206 10300.0 -103000.0
208 10400.0 -104000.0
210 10500.0 -105000.0
212 10600.0 -106000.0
214 10700.0 -107000.0
216 10800.0 -108000.0
218 10900.0 -109000.0
220 11000.0 -110000.0
222 11100.0 -111000.0
224 11200.0 -112000.0
226 11300.0 -113000.0
228 11400.0 -114000.0
230 11500.0 -115000.0
232 11600.0 -116000.0
234 11700.0 -117000.0
236 11800.0 -118000.0
238 11900.0 -119000.0
240 12000.0 -120000.0
242 12100.0 -121000.0
244 12200.0 -122000.0
246 12300.0 -123000.0
248 12400.0 -124000.0
250 12500.0 -125000.0
252 12600.0 -126000.0
254 12700.0 -127000.0
256 12800.0 -128000.0

如您所见,每条记录上都有记录标记 1400 0000,这意味着记录有 20 个字节长(1 fortran INTEGER@4、2 fortran DOUBLE@8)和 4 个字节的记录标记在每一端 - 我读入一个名为 _ 的变量并在 Python 中丢弃。如果你不想这样,你需要在 fortran 中使用 DIRECT/STREAM 输出 - 但我不知道你是否可以在你的另一端控制它处理链。参见 here以获得描述。

关键词:fortran, Python, binary, unformatted, xxd, dump, records, view as integer, view as float, view as float64, view as double

关于python - 在python中读取二进制数据文件进行分析,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/60284470/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com