gpt4 book ai didi

python - 对非时间数据进行上采样

转载 作者:行者123 更新时间:2023-12-01 03:09:07 25 4
gpt4 key购买 nike

是否有更好的方法根据每行中包含的开始和停止条件对一系列 float 进行上采样?这是我开始的示例:

borehole   top  bottom  lithology
0 AP-2 94.6 95.1 dolomite
1 AP-2 95.1 96.7 limestone
2 AP-2 96.7 97.0 dolomite
3 AP-2 97.0 97.5 limestone
4 AP-2 97.5 97.8 limestone
5 AP-3 87.4 87.7 limestone
6 AP-3 87.7 88.1 limestone
7 AP-3 88.1 88.5 dolomite
8 AP-3 88.5 89.1 limestone

对于每一行,我想使用选定的增量在顶部值和底部值之间添加行,并向前填充钻孔和岩性值。数据帧中的增量是恒定的。这是我想要增量为 0.1 的填充数据框。 (这仅适用于上述数据框中的前两行:)

   borehole   Top       Lith
0 AP-2 94.6 dolomite
1 AP-2 94.7 dolomite
2 AP-2 94.8 dolomite
3 AP-2 94.9 dolomite
4 AP-2 95.0 dolomite
5 AP-2 95.1 limestone
6 AP-2 95.2 limestone
7 AP-2 95.3 limestone
8 AP-2 95.4 limestone
9 AP-2 95.5 limestone
10 AP-2 95.6 limestone
11 AP-2 95.7 limestone
12 AP-2 95.8 limestone
13 AP-2 95.9 limestone
14 AP-2 96.0 limestone
15 AP-2 96.1 limestone
16 AP-2 96.2 limestone
17 AP-2 96.3 limestone
18 AP-2 96.4 limestone
19 AP-2 96.5 limestone
20 AP-2 96.6 limestone
21 AP-2 96.7 limestone

这是我使用过的代码,它可以工作,但是当我在 pandas 中进行循环时,我想知道我是否遗漏了一些明显的东西。 pd.DataFrame.resample() 很诱人,但我不知道如何让它处理非时间数据。

import pandas as pd
import numpy as np

liths = pd.DataFrame(
{'borehole': {0: 'AP-2',
1: 'AP-2',
2: 'AP-2',
3: 'AP-2',
4: 'AP-2',
5: 'AP-3',
6: 'AP-3',
7: 'AP-3',
8: 'AP-3'},
'bottom': {0: 95.099999999999994,
1: 96.700000000000003,
2: 97.0,
3: 97.5,
4: 97.799999999999997,
5: 87.700000000000003,
6: 88.099999999999994,
7: 88.5,
8: 89.099999999999994},
'lithology': {0: 'dolomite',
1: 'limestone',
2: 'dolomite',
3: 'limestone',
4: 'limestone',
5: 'limestone',
6: 'limestone',
7: 'dolomite',
8: 'limestone'},
'top': {0: 94.599999999999994,
1: 95.099999999999994,
2: 96.700000000000003,
3: 97.0,
4: 97.5,
5: 87.400000000000006,
6: 87.700000000000003,
7: 88.099999999999994,
8: 88.5}}
)

filled = []
increment = 0.1
for row in liths.itertuples():
start = row.top
end = row.bottom
for i in np.arange(start, end, increment):
filled.append([row.borehole, i, row.lithology])
filled = pd.DataFrame(filled, columns=['borehole', 'Top', 'Lith']); filled

最佳答案

我将从每一行构造一个新的数据框并将它们全部连接起来

def expand(r):
a = np.arange(r.top, r.bottom, .1)
n = len(a)
return pd.DataFrame(dict(
borehole=[r.borehole] * n,
lithology=[r.lithology] * n,
top=a
))

pd.concat([expand(r) for r in df.itertuples()], ignore_index=True)
<小时/>
   borehole  lithology   top
0 AP-2 dolomite 94.6
1 AP-2 dolomite 94.7
2 AP-2 dolomite 94.8
3 AP-2 dolomite 94.9
4 AP-2 dolomite 95.0
5 AP-2 limestone 95.1
6 AP-2 limestone 95.2
7 AP-2 limestone 95.3
8 AP-2 limestone 95.4
9 AP-2 limestone 95.5
10 AP-2 limestone 95.6
11 AP-2 limestone 95.7
12 AP-2 limestone 95.8
13 AP-2 limestone 95.9
14 AP-2 limestone 96.0
15 AP-2 limestone 96.1
16 AP-2 limestone 96.2
17 AP-2 limestone 96.3
18 AP-2 limestone 96.4
19 AP-2 limestone 96.5
20 AP-2 limestone 96.6
21 AP-2 limestone 96.7
22 AP-2 dolomite 96.7
23 AP-2 dolomite 96.8
24 AP-2 dolomite 96.9
25 AP-2 limestone 97.0
26 AP-2 limestone 97.1
27 AP-2 limestone 97.2
28 AP-2 limestone 97.3
29 AP-2 limestone 97.4
30 AP-2 limestone 97.5
31 AP-2 limestone 97.6
32 AP-2 limestone 97.7
33 AP-3 limestone 87.4
34 AP-3 limestone 87.5
35 AP-3 limestone 87.6
36 AP-3 limestone 87.7
37 AP-3 limestone 87.8
38 AP-3 limestone 87.9
39 AP-3 limestone 88.0
40 AP-3 dolomite 88.1
41 AP-3 dolomite 88.2
42 AP-3 dolomite 88.3
43 AP-3 dolomite 88.4
44 AP-3 dolomite 88.5
45 AP-3 limestone 88.5
46 AP-3 limestone 88.6
47 AP-3 limestone 88.7
48 AP-3 limestone 88.8
49 AP-3 limestone 88.9
50 AP-3 limestone 89.0

关于python - 对非时间数据进行上采样,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/43049355/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com