
Python: Simplifying code by writing it in a more Pandas-specific way

Reposted · Author: 太空宇宙 · Updated: 2023-11-03 16:47:12

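I wrote some code that finds the distance between GPS coordinates for pings from machines that share the same serial number.

I believe it would be more efficient if it could be simplified to use iterrows or df.apply; however, I can't seem to figure out how.

Because I only need to run the function when ser_no[i] == ser_no[i+1], and put a NaN value wherever ser_no changes, I can't see how to apply Pandas methods to make the code more efficient. I have looked at several related posts.

Unfortunately, even after going through those posts, I can't readily see the leap I need to make.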

What I have:

import numpy as np

def haversine(lat1, long1, lat2, long2):
    r = 6371  # radius of Earth in km
    # convert decimal degrees to radians
    lat1, long1, lat2, long2 = map(np.radians, [lat1, long1, lat2, long2])
    # haversine formula
    lat = lat2 - lat1
    lon = long2 - long1
    a = np.sin(lat/2)**2 + np.cos(lat1)*np.cos(lat2)*np.sin(lon/2)**2
    c = 2*np.arcsin(np.sqrt(a))
    d = r*c
    return d

# pre-allocate vector
hdist = np.zeros(len(mttt_pings.index), dtype=float)
# haversine loop calculation
for i in range(0, len(mttt_pings.index) - 1):
    # when the ser_no from i and i + 1 are the same, calculate the distance
    # between them using the haversine formula and put the distance in the
    # i + 1 location
    if mttt_pings.ser_no.loc[i] == mttt_pings.ser_no.loc[i + 1]:
        hdist[i + 1] = haversine(mttt_pings.EQP_GPS_SPEC_LAT_CORD.loc[i],
                                 mttt_pings.EQP_GPS_SPEC_LONG_CORD.loc[i],
                                 mttt_pings.EQP_GPS_SPEC_LAT_CORD.loc[i + 1],
                                 mttt_pings.EQP_GPS_SPEC_LONG_CORD.loc[i + 1])
    else:
        # when ser_no i and i + 1 are not the same, mark the i + 1 location
        # as NaN; assigning in place keeps one entry per row
        hdist[i + 1] = np.nan

Best answer

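The main idea is to use shift to check consecutive rows. I also wrote a get_dist function that simply wraps your existing distance function, to keep things more readable when I use apply to compute the distances.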
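To see what shift does before applying it to the real data, here is a minimal sketch on a made-up frame. The `pings` DataFrame, its values, and the `lat` column are hypothetical and only serve the illustration; only `ser_no` mirrors the question.

```python
import pandas as pd

# Hypothetical toy data: three pings from machine 101, two from machine 202.
pings = pd.DataFrame({
    "ser_no": [101, 101, 101, 202, 202],
    "lat": [10.0, 10.1, 10.2, 50.0, 50.1],
})

# shift(1) moves every value down one row, so row i sees the value of row i-1.
# The first row has no predecessor and becomes NaN.
prev_ser = pings["ser_no"].shift(1)

# True where the current row and the previous row belong to the same machine.
same_machine = pings["ser_no"] == prev_ser

print(same_machine.tolist())  # [False, True, True, False, True]
```

The False entries mark the first ping of each machine, which is exactly where the distance should end up as NaN.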

def get_dist(row):
    lat1 = row['EQP_GPS_SPEC_LAT_CORD']
    long1 = row['EQP_GPS_SPEC_LONG_CORD']
    lat2 = row['EQP_GPS_SPEC_LAT_CORD_2']
    long2 = row['EQP_GPS_SPEC_LONG_CORD_2']
    return haversine(lat1, long1, lat2, long2)

# Find consecutive rows with matching ser_no, and get coordinates.
coord_cols = ['EQP_GPS_SPEC_LAT_CORD', 'EQP_GPS_SPEC_LONG_CORD']
matching_ser = mttt_pings['ser_no'] == mttt_pings['ser_no'].shift(1)
shift_coords = mttt_pings.shift(1).loc[matching_ser, coord_cols]

# Join shifted coordinates and compute distances.
mttt_pings_shift = mttt_pings.join(shift_coords, how='inner', rsuffix='_2')
mttt_pings['hdist'] = mttt_pings_shift.apply(get_dist, axis=1)
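Since the haversine function only uses NumPy ufuncs, it may also be possible to skip apply altogether and pass whole columns at once. The following is a sketch of that variant, not the method from the answer above: it swaps the consecutive-row comparison for a per-group shift via groupby, which assumes the rows of each machine are already in ping order. The sample values are made up; the column names come from the question.

```python
import numpy as np
import pandas as pd


def haversine(lat1, long1, lat2, long2):
    """Great-circle distance in km; accepts scalars or whole columns."""
    r = 6371  # radius of Earth in km
    lat1, long1, lat2, long2 = map(np.radians, [lat1, long1, lat2, long2])
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((long2 - long1) / 2) ** 2)
    return 2 * r * np.arcsin(np.sqrt(a))


# Hypothetical sample data using the column names from the question.
mttt_pings = pd.DataFrame({
    "ser_no": [101, 101, 101, 202, 202],
    "EQP_GPS_SPEC_LAT_CORD": [0.0, 0.0, 0.0, 45.0, 46.0],
    "EQP_GPS_SPEC_LONG_CORD": [0.0, 1.0, 3.0, 7.0, 7.0],
})

coord_cols = ["EQP_GPS_SPEC_LAT_CORD", "EQP_GPS_SPEC_LONG_CORD"]

# Shift within each ser_no group: the first row of every machine gets NaN,
# so no explicit "same serial number" mask is needed.
prev = mttt_pings.groupby("ser_no")[coord_cols].shift(1)

# One vectorized call; NaN inputs propagate to NaN distances.
mttt_pings["hdist"] = haversine(
    prev["EQP_GPS_SPEC_LAT_CORD"], prev["EQP_GPS_SPEC_LONG_CORD"],
    mttt_pings["EQP_GPS_SPEC_LAT_CORD"], mttt_pings["EQP_GPS_SPEC_LONG_CORD"],
)

print(mttt_pings[["ser_no", "hdist"]])
```

With this data, pings one degree apart come out at roughly 111 km, and the first ping of each machine is NaN. On large frames this version should avoid the per-row Python overhead of apply, though it is worth timing on your own data.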

In the code above, I've added the distances to your DataFrame. If you want the result as a numpy array, you can do:

hdist = mttt_pings['hdist'].values
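Once the column is a NumPy array, the NaN placeholders need a little care in downstream arithmetic. A small sketch, using made-up distances and the `to_numpy()` method (which returns the same data as `.values` for a float column):

```python
import numpy as np
import pandas as pd

# Hypothetical distances; NaN marks the first ping of each machine.
hdist_col = pd.Series([np.nan, 111.2, 222.4, np.nan, 111.2], name="hdist")

hdist = hdist_col.to_numpy()

# Drop the NaN placeholders, or aggregate while ignoring them.
valid = hdist[~np.isnan(hdist)]
total_km = np.nansum(hdist)

print(valid.size, round(total_km, 1))
```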

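As a side note, you may want to consider using geopy.distance.vincenty to compute distances between latitude/longitude coordinates. In general, vincenty is more accurate than haversine, although it may take longer to compute. Using vincenty requires only a very small change to the get_dist function.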

# vincenty was removed in geopy 2.0; on newer versions,
# geopy.distance.geodesic is the replacement.
from geopy.distance import vincenty

def get_dist(row):
    lat1 = row['EQP_GPS_SPEC_LAT_CORD']
    long1 = row['EQP_GPS_SPEC_LONG_CORD']
    lat2 = row['EQP_GPS_SPEC_LAT_CORD_2']
    long2 = row['EQP_GPS_SPEC_LONG_CORD_2']
    return vincenty((lat1, long1), (lat2, long2)).km

Regarding "Python: Simplifying code by writing it in a more Pandas-specific way", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36207540/
