Python有效地创建密度图-6ren

Python有效地创建密度图

转载作者：塔克拉玛干更新时间：2023-11-03 03:11:14

我希望能得到一些帮助，让我的代码运行得更快。

基本上，我在列表 insideoceanlist 中有一个由纬度和经度点组成的正方形网格。然后是一个目录，其中包含 lat, long 坐标的数据文件，代表特定日期的雷击。这个想法是针对每一天，我们想知道方形网格上每个点周围有多少次雷击。目前它只有两个 for 循环，所以对于方形网格上的每个点，您检查当天每次雷击的距离。如果它在 40 公里以内，我会在该点上加一个以制作密度图。

起始网格的整体形状为矩形，由宽度为 0.11、长度为 0.11 的正方形组成。整个矩形约为 50x30。最后我有一个 shapefile 概述了澳大利亚的“预测区域”，如果网格中的任何点在该区域之外，那么我们将忽略它。所以剩下的所有点(insideoceanlist)都是澳大利亚的点。

方形网格上有大约 100000 个点，即使是缓慢的一天，也有大约 1000 次闪电，因此需要很长时间来处理。有没有办法更有效地做到这一点？我非常感谢任何建议。

顺便说一句，我将 list2 更改为 list3，因为我听说在 Python 中迭代列表比数组快。

for i in range(len(list1)): #list1 is a list of data files containing lat,long coords for lightning strikes for each day
    dict_density = {}
    for k in insideoceanlist: #insideoceanlist is a grid of ~100000 lat,long points
        dict_density[k] = 0
    list2 = np.loadtxt(list1[i],delimiter = ",") #this open one of the files containing lat,long coords and puts it into an array
    list3 = map(list,list2) #converts the array into a list
    # the following part is what I wanted to improve
    for j in insideoceanlist:
        for l in list3:
            if great_circle(l,j).meters < 40000: #great_circle is a function which measures distance between points the two lat,long points
                dict_density[j] += 1
    #
    filename = 'example' +str(i) + '.txt'
        with open(filename, 'w') as f:
            for m in range(len(insideoceanlist)):
                f.write('%s\n' % (dict_density[insideoceanlist[m]])) #writes each point in the same order as the insideoceanlist
    f.close()

最佳答案

为了详细说明@DanGetz 的回答，这里有一些代码使用击打数据作为驱动程序，而不是为每个击打点迭代整个网格。我假设您以澳大利亚的中点为中心，具有 0.11 度的网格正方形，即使度数的大小因纬度而异!

通过快速引用维基百科进行的粗略计算告诉我，您的 40 公里距离是从北到南的 ±4 个方格范围，从东到西的 ±5 个方格范围。 (它在低纬度下降到 4 个正方形，但是......嗯!)

如前所述，此处的技巧是以直接、公式化的方式从击打位置(纬度/经度)转换为方格。计算出网格一个角的位置，从 strike 中减去该位置，然后除以网格的大小 - 0.11 度，截断，然后得到行/列索引。现在访问所有周围的方 block ，直到距离变得太大，最多 1 + (2 * 2 * 4 * 5) = 81 个方 block 检查距离。增加范围内的方 block 。

结果是我最多进行 81 次访问乘以 1000 次攻击(或者无论您有多少次)，而不是访问 100,000 个网格方 block 乘以 1000 次攻击。这是一个显着的性能提升。

请注意，您没有描述传入数据的格式，所以我只是随机生成数字。你会想要解决这个问题。 ;-)

#!python3

"""
Per WikiPedia (https://en.wikipedia.org/wiki/Centre_points_of_Australia)

Median point
============

The median point was calculated as the midpoint between the extremes of
latitude and longitude of the continent.

    24 degrees 15 minutes south latitude, 133 degrees 25 minutes east
    longitude (24°15′S 133°25′E); position on SG53-01 Henbury 1:250 000
    and 5549 James 1:100 000 scale maps.

"""
MEDIAN_LAT = -(24.00 + 15.00/60.00)
MEDIAN_LON = (133 + 25.00/60.00)

"""
From the OP:

The starting grid has the overall shape of a rectangle, made up of
squares with width of 0.11 and length 0.11. The entire rectange is about
50x30. Lastly I have a shapefile which outlines the 'forecast zones' in
Australia, and if any point in the grid is outside this zone then we
omit it. So all the leftover points (insideoceanlist) are the ones in
Australia.
"""

DELTA_LAT = 0.11
DELTA_LON = 0.11

GRID_WIDTH = 50.0 # degrees
GRID_HEIGHT = 30.0 # degrees

GRID_ROWS = int(GRID_HEIGHT / DELTA_LAT) + 1
GRID_COLS = int(GRID_WIDTH / DELTA_LON) + 1

LAT_SIGN = 1.0 if MEDIAN_LAT >= 0 else -1.0
LON_SIGN = 1.0 if MEDIAN_LON >= 0 else -1.0

GRID_LOW_LAT = MEDIAN_LAT - (LAT_SIGN * GRID_HEIGHT / 2.0)
GRID_HIGH_LAT = MEDIAN_LAT + (LAT_SIGN * GRID_HEIGHT / 2.0)
GRID_MIN_LAT = min(GRID_LOW_LAT, GRID_HIGH_LAT)
GRID_MAX_LAT = max(GRID_LOW_LAT, GRID_HIGH_LAT)

GRID_LOW_LON = MEDIAN_LON - (LON_SIGN * GRID_WIDTH / 2.0)
GRID_HIGH_LON = MEDIAN_LON + (LON_SIGN * GRID_WIDTH / 2.0)
GRID_MIN_LON = min(GRID_LOW_LON, GRID_HIGH_LON)
GRID_MAX_LON = max(GRID_LOW_LON, GRID_HIGH_LON)

GRID_PROXIMITY_KM = 40.0

"""https://en.wikipedia.org/wiki/Longitude#Length_of_a_degree_of_longitude"""
_Degree_sizes_km = (
    (0,  110.574, 111.320),
    (15, 110.649, 107.551),
    (30, 110.852, 96.486),
    (45, 111.132, 78.847),
    (60, 111.412, 55.800),
    (75, 111.618, 28.902),
    (90, 111.694, 0.000),
)

# For the Australia situation, +/- 15 degrees means that our worst
# case scenario is about 40 degrees south. At that point, a single
# degree of longitude is smallest, with a size about 80 km. That
# in turn means a 40 km distance window will span half a degree or so.
# Since grid squares a 0.11 degree across, we have to check +/- 5
# cols.

GRID_SEARCH_COLS = 5

# Latitude degrees are nice and constant-like at about 110km. That means
# a .11 degree grid square is 12km or so, making our search range +/- 4
# rows.

GRID_SEARCH_ROWS = 4

def make_grid(rows, cols):
    return [[0 for col in range(cols)] for row in range(rows)]

Grid = make_grid(GRID_ROWS, GRID_COLS)

def _col_to_lon(col):
    return GRID_LOW_LON + (LON_SIGN * DELTA_LON * col)

Col_to_lon = [_col_to_lon(c) for c in range(GRID_COLS)]

def _row_to_lat(row):
    return GRID_LOW_LAT + (LAT_SIGN * DELTA_LAT * row)

Row_to_lat = [_row_to_lat(r) for r in range(GRID_ROWS)]

def pos_to_grid(pos):
    lat, lon = pos

    if lat < GRID_MIN_LAT or lat >= GRID_MAX_LAT:
        print("Lat limits:", GRID_MIN_LAT, GRID_MAX_LAT)
        print("Position {} is outside grid.".format(pos))
        return None

    if lon < GRID_MIN_LON or lon >= GRID_MAX_LON:
        print("Lon limits:", GRID_MIN_LON, GRID_MAX_LON)
        print("Position {} is outside grid.".format(pos))
        return None

    row = int((lat - GRID_LOW_LAT) / DELTA_LAT)
    col = int((lon - GRID_LOW_LON) / DELTA_LON)

    return (row, col)


def visit_nearby_grid_points(pos, dist_km):
    row, col = pos_to_grid(pos)

    # +0, +0 is not symmetric - don't increment twice
    Grid[row][col] += 1

    for dr in range(1, GRID_SEARCH_ROWS):
        for dc in range(1, GRID_SEARCH_COLS):
            misses = 0
            gridpos = Row_to_lat[row+dr], Col_to_lon[col+dc]
            if great_circle(pos, gridpos).meters <= dist_km:
                Grid[row+dr][col+dc] += 1
            else:
                misses += 1
            gridpos = Row_to_lat[row+dr], Col_to_lon[col-dc]
            if great_circle(pos, gridpos).meters <= dist_km:
                Grid[row+dr][col-dc] += 1
            else:
                misses += 1
            gridpos = Row_to_lat[row-dr], Col_to_lon[col+dc]
            if great_circle(pos, gridpos).meters <= dist_km:
                Grid[row-dr][col+dc] += 1
            else:
                misses += 1
            gridpos = Row_to_lat[row-dr], Col_to_lon[col-dc]
            if great_circle(pos, gridpos).meters <= dist_km:
                Grid[row-dr][col-dc] += 1
            else:
                misses += 1
            if misses == 4:
                break

def get_pos_from_line(line):
    """
    FIXME: Don't know the format of your data, just random numbers
    """
    import random
    return (random.uniform(GRID_LOW_LAT, GRID_HIGH_LAT),
            random.uniform(GRID_LOW_LON, GRID_HIGH_LON))

with open("strikes.data", "r") as strikes:
    for line in strikes:
        pos = get_pos_from_line(line)
        visit_nearby_grid_points(pos, GRID_PROXIMITY_KM)

关于Python有效地创建密度图，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/36732513/

文章推荐： java - 如何将 Jersey ExceptionMapper 与 Google Guice 一起使用？

文章推荐： java - JDBI 是否接受 UUID 参数？

文章推荐：上下文无关文法的算法

r - 确定和分组完全相关的变量(有效地)
我收到未知数据，我想以编程方式查看相关性，并将所有完全相关的变量组合在一起(忽略方向)。在下面的数据集中，我可以手动查看相关性并说 a, f, g, h一起去吧b, d, e .我怎样才能以编程方
有效地 reshape 总结的结果
这个问题在这里已经有了答案: use dplyr's summarise_each to return one row per function? (3 个答案) 关闭 4 年前。作为探索性工作的
c++ - (有效地)动态存储多项式
我想要完成的是使用数组存储未知大小的多项式。我在互联网上看到的是使用一个数组，每个单元格都包含系数，度数是单元格编号，但这不是有效的，因为如果我们有一个多项式，如:6x^14+x+5。这意味着我们将从
java - 一个正则表达式来统治它们(有效地)？
嘿伙计们，我一直在尝试解析 HTML 文件以从中抓取文本，但时不时地，我会得到一些非常奇怪的字符，例如 à€œ。我确定是“智能引号”或弯头标点符号导致了我的所有问题，因此我的临时修复是搜索所有这些字符
r - 如何从访问过的地方列表(有效地)构建边缘列表？
我原来的 data.table 由三列组成。 site、observation_number 和 id。例如以下是 id = z 的所有观察结果 |site|observation_number|i
python - pandas *有效地*按组将有效值复制到其他行
"Premature optimisation is the root of all evil (but it's nice to have once you have an ugly solutio
python - 有效地 reshape 这个数组
给定这个数组 X: [1 2 3 2 3 1 4 5 7 1] 和行长度数组R: [3 2 5] 表示转换后每行的长度。我正在寻找一个计算效率高的函数来将 X reshape 为数组 Y: [[ 1
r - 如何从数据框列表中(有效地)创建箱线图？
我有一些 data.frame s: # Sample data a <- data.frame(c(1:10), c(11:20)) names(a) <- c("A", "B") b <- dat
c++ - 列表如何知道在哪里插入(有效地)但不能进行随机访问？
我有点困惑。列表擅长任意位置插入，但不善于随机访问？ (怎么可能)如果你不能随机访问，你怎么知道在哪里插入？同样，如果你可以在任何位置插入，为什么你不能从那个位置高效地读取？最佳答案如果您已经有
matlab - 我如何(有效地)计算向量的移动平均值？
我有一个向量，我想计算它的移动平均值(使用宽度为 5 的窗口)。例如，如果有问题的向量是[1,2,3,4,5,6,7,8]，那么结果向量的第一个条目应该是 [1,2,3,4,5] 中所有条目的总和
java - 如何从连续运行的随机整数生成器中(有效地)找到整数簇的数量？
有一个随机整数生成器，它生成随机整数并在后台运行。需求设计一个API，调用时返回当时的簇数。簇:簇是连续整数的字典顺序。例如，在这种情况下，10,7,1,2,8,5,9 簇是 3 (1,2--5--
algorithm - 如何在仅使用一次元素对的同时(有效地)生成不相交的集合？
我想做的是将一组 (n) 项分成大小相等的组(大小为 m 的组，并且为简单起见，假设没有剩余，即 n 可以被 m 整除)。这样做多次，我想确保同一组中的任何项目都不会出现两次。为了使这稍微更具体一些
c++ - 我怎样才能*有效地*从嵌套表达式生成所有类型的元组？
假设我有一些包含类型排列的模板表达式，在本例中它们来自 Abstract Syntax Tree : template
python - 有效地 merge python中的两个大字符串
我已经在这方面工作了几天，似乎没有我需要的答案。由于担心这个被标记为重复，我将解释为什么其他问题对我不起作用。使用 DIFFLIB for Python 的任何答案都无助于我的需求。 (我在下面描
python - 有效地 reshape numpy 数组
我正在使用 NumPy 数组。我有一个 2N 长度向量 D，并希望将其一部分 reshape 为 N x N 数组 C. 现在这段代码可以满足我的要求，但对于较大的 N 来说是一个瓶颈: ``` i
python - 有效地 reshape pandas 数据框列中的数组
我有一个问题: 让我们考虑这样的 pandas 数据框: Width Height Bitmap 67 56 59 71 61 73 ...
c - 有效地 "Wrapping"C 中的常量字符串
我目前正在用 C 语言编写一个解析器，设计它时我需要的东西之一是一个可变字符串“类”(一组对表示实例的不透明结构进行操作的函数)，我将其称为 my_string。 string 类的实例只不过是包装
python - 如何(有效地)检查任何两个元素是否相差 10
假设我在 --pandas-- 数据框中有以下列: x 1 589 2 354 3 692 4 474 5 739 6 731 7 259 8 723
c++ - 有效地 const_cast-ing 常量引用参数
我有一个成员函数，它接受另一个对象的常量引用参数。我想 const_cast 这个参数以便在成员函数中轻松使用它。为此，以下哪个代码更好？: void AClass::AMember(const BC
java - 有效地 "modifying"一个 ImmutableMap
我们目前正在将 Guava 用于其不可变集合，但我惊讶地发现他们的 map 没有方法可以轻松创建只需稍作修改的新 map 。最重要的是，他们的构建器不允许为键分配新值或删除键。因此，如果我只想修改一

塔克拉玛干

个人简介

我是一名优秀的程序员,十分优秀！

作者热门文章

滴滴打车优惠券免费领取

全站热门文章

首页

博学

6Ren·AI

商城

Python有效地创建密度图