
python - How to add arrays of coordinates as rows in a Pandas DataFrame

Reposted. Author: 太空宇宙. Last updated: 2023-11-04 05:23:37

I have a text file that looks like this:

,A,B
0,"[[-81.03443909 29.22855949]
[-81.09729767 29.27094078]
[-80.9937973 29.19698906]
[-81.03072357 29.27445984]
[-81.00499725 29.22187805]]","[[-81.42427063 28.30874634]
[-81.42427063 28.30874634]
[-81.42427063 28.30874634]
[-81.36068726 28.29172897]
[-81.42297363 28.30497551]
[-81.48571777 28.24975777]
[-81.35914612 28.29036331]]"

This is what the data I am using looks like once it is loaded into a Pandas DataFrame:

[[-78.70117188  33.80754852]
[-78.9934082 33.61843491]
[-80.81887817 28.60919952]
...,
[-76.62332916 35.54064941]
[-79.04235077 33.81600952]
[-79.03309631 33.55596161]]

I would like it to look like this:

                    lat      long
cluster point
0       a      0.445900 -1.286198
        b     -0.574496 -0.407154
        c      0.872979  0.068084
        d      0.297255 -2.157051

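Before I created the .txt file, the data was in a NumPy ndarray, and I used pandas to write the text file. So perhaps there is a way to skip the text file altogether and use pandas to split or format the arrays into a tidy DataFrame. I have been at this for a while, but I cannot figure out how to do it.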

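This is how I generate the data. I kept it simple by copying only 2 columns, but in the future I would like to pass along a unique point identifier as well.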

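# Generate sample data
import pandas as pd

col_1 = "RL15_LONGITUDE"
col_2 = "RL15_LATITUDE"

data = pd.read_csv("input_data.csv")
# DataFrame.as_matrix() was removed in pandas 1.0; to_numpy() replaces it
coords = data[[col_1, col_2]].to_numpy()
# Drop rows with a missing coordinate before converting to an array
data = data[[col_1, col_2]].dropna()
# Note: float16 keeps only about 3 significant decimal digits
data = data.to_numpy().astype('float16', copy=False)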

This is the output of print clusters:

[array([[-81.03443909,  29.22855949],
[-81.09729767, 29.27094078],
[-81.42297363, 28.30497551],
[-81.48571777, 28.24975777],
[-81.35914612, 28.29036331]], dtype=float32), array([[-81.49134064, 27.58896065],
[-81.5194931 , 27.63422012],
[-81.5096283 , 27.55581093],
[-82.05444336, 26.93555069]], dtype=float32), array([[-82.18956757, 26.52433586],
[-82.18956757, 26.52433586],
[-82.18956757, 26.52433586],
[-82.19439697, 26.53297997]], dtype=float32)]

This is how I create the DataFrame and write the .txt file:

clusters = pd.DataFrame({'A':[clusters]})
clusters.to_csv('output.txt')

Best answer

Here is a starting point:

In [72]: (pd.concat([pd.DataFrame(c, columns=['lat','lon']).assign(cluster=i)
   ....:             for i,c in enumerate(clusters)])
   ....:    .reset_index()
   ....:    .rename(columns={'index':'point'})
   ....: )
Out[72]:
    point         lat        lon  cluster
0       0  -81.034439  29.228559        0
1       1  -81.097298  29.270941        0
2       2  -81.422974  28.304976        0
3       3  -81.485718  28.249758        0
4       4  -81.359146  28.290363        0
5       0  -81.491341  27.588961        1
6       1  -81.519493  27.634220        1
7       2  -81.509628  27.555811        1
8       3  -82.054443  26.935551        1
9       0  -82.189568  26.524336        2
10      1  -82.189568  26.524336        2
11      2  -82.189568  26.524336        2
12      3  -82.194397  26.532980        2
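
The snippet above assumes that `clusters` already exists in the session. A self-contained sketch of the same idea, using a shortened copy of the sample arrays from the question, might look like the following. The helper name `clusters_to_frame` is hypothetical, and the default column labels follow the answer as written. Since the question reads `RL15_LONGITUDE` as the first column, the first value in each pair is probably the longitude, in which case the labels can be swapped through the `columns` argument.

```python
import numpy as np
import pandas as pd

# Shortened copy of the sample clusters printed in the question
clusters = [
    np.array([[-81.03443909, 29.22855949],
              [-81.09729767, 29.27094078]], dtype=np.float32),
    np.array([[-81.49134064, 27.58896065],
              [-81.5194931, 27.63422012],
              [-81.5096283, 27.55581093]], dtype=np.float32),
]


def clusters_to_frame(clusters, columns=("lat", "lon")):
    """Stack a list of (n_i, 2) arrays into one flat DataFrame.

    Hypothetical helper; `columns` defaults to the labels used in the answer.
    """
    frames = [
        # One small frame per cluster, tagged with its position in the list
        pd.DataFrame(c, columns=list(columns)).assign(cluster=i)
        for i, c in enumerate(clusters)
    ]
    # reset_index() moves each frame's 0..n_i-1 row index into a column,
    # which is then renamed to 'point'
    return (pd.concat(frames)
              .reset_index()
              .rename(columns={"index": "point"}))


flat = clusters_to_frame(clusters)
print(flat)
```

Calling `clusters_to_frame(clusters, columns=("lon", "lat"))` would label the columns longitude-first instead.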

Or, using a MultiIndex:

In [73]: (pd.concat([pd.DataFrame(c, columns=['lat','lon']).assign(cluster=i)
   ....:             for i,c in enumerate(clusters)])
   ....:    .reset_index()
   ....:    .rename(columns={'index':'point'})
   ....:    .set_index(['cluster','point'])
   ....: )
Out[73]:
                      lat        lon
cluster point
0       0      -81.034439  29.228559
        1      -81.097298  29.270941
        2      -81.422974  28.304976
        3      -81.485718  28.249758
        4      -81.359146  28.290363
1       0      -81.491341  27.588961
        1      -81.519493  27.634220
        2      -81.509628  27.555811
        3      -82.054443  26.935551
2       0      -82.189568  26.524336
        1      -82.189568  26.524336
        2      -82.189568  26.524336
        3      -82.194397  26.532980
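
As a possible alternative, `pd.concat` can build the two-level index directly through its `keys` and `names` arguments, which avoids the `assign`, `reset_index`, and `rename` steps. This is only a sketch and not the method used in the answer. The sample arrays are shortened from the question, and the columns are labeled `lon`, `lat` on the assumption that the arrays are longitude-first (the question reads `RL15_LONGITUDE` as the first column).

```python
import numpy as np
import pandas as pd

# Shortened copy of the sample clusters printed in the question
clusters = [
    np.array([[-81.03443909, 29.22855949],
              [-81.09729767, 29.27094078]], dtype=np.float32),
    np.array([[-81.49134064, 27.58896065],
              [-81.5194931, 27.63422012],
              [-81.5096283, 27.55581093]], dtype=np.float32),
]

# Assumption: each row is (longitude, latitude)
frames = [pd.DataFrame(c, columns=["lon", "lat"]) for c in clusters]

# keys labels each frame with its cluster number (outer index level);
# names gives the outer and inner index levels their names
tidy = pd.concat(frames,
                 keys=list(range(len(frames))),
                 names=["cluster", "point"])

print(tidy)

# Select every point that belongs to cluster 1
print(tidy.loc[1])
```

Because every point now has its own row, `tidy.to_csv('output.txt')` would write one line per point, with `cluster` and `point` as the leading columns, rather than whole arrays packed into single cells.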

Regarding "python - How to add arrays of coordinates as rows in a Pandas DataFrame", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39518141/
