pandas - Updating a block of a column faster than loc inside itertuples


Reposted by: 行者123 Updated: 2023-12-03 16:46:29


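I have the following code, which iterates over a DataFrame and updates a block of one column based on two other values per row (the row's index label and its t_end). The current solution uses loc inside itertuples.
Is it possible to make the code faster without resorting to manual parallelization or splitting the DataFrame?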

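import numpy as np
import pandas as pd

n_rows = 10000
# One row per minute; each row gets a random end time 0-59 minutes later.
ix_ = pd.date_range(start="2020-01-01 00:00", freq="min", periods=n_rows)
offsets_ = pd.to_timedelta(np.random.randint(0, 60, size=n_rows), unit="min")
# offsets_ is already a TimedeltaIndex, so it can be added directly.
df = pd.DataFrame(ix_ + offsets_, index=ix_, columns=["t_end"])
df["active"] = 0
for row in df.itertuples():
    # Increment every row from this row's timestamp through its t_end.
    df.loc[row.Index : row.t_end, "active"] += 1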
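
To make the intended result concrete, here is a minimal sketch of the same loop on a five-row frame. The frame `small` and its `t_end` offsets are made up for illustration and are not part of the original question. Note that label slices with `.loc` include both endpoints, so every row always counts itself.

```python
import pandas as pd

# Hypothetical 5-row example; the offsets are chosen for illustration only.
ix = pd.date_range(start="2020-01-01 00:00", freq="min", periods=5)
t_end = ix + pd.to_timedelta([2, 0, 1, 3, 0], unit="min")
small = pd.DataFrame({"t_end": t_end}, index=ix)
small["active"] = 0

for row in small.itertuples():
    # Label-based slice: both row.Index and row.t_end are included.
    small.loc[row.Index : row.t_end, "active"] += 1

active_counts = small["active"].tolist()
print(active_counts)  # [1, 2, 2, 2, 2]
```

Row 0 covers rows 0 to 2, row 2 covers rows 2 to 3, and row 3 runs past the end of the index, so its slice simply stops at the last row.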

Best answer

Doing the computation on a NumPy array instead of a pandas Series is about 3-4 times faster:

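# Map each timestamp label to its integer position.
df['int_index'] = range(len(df))
# Accumulate the counts in a plain NumPy array instead of a pandas column.
active = np.zeros(len(df), dtype=int)

for row in df.itertuples():
    # The label slice returns the integer positions to increment.
    active[df.int_index.loc[row.Index : row.t_end]] += 1

df['active'] = active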
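
The loop can also be dropped entirely. The sketch below is a different technique from the answer above, not the answerer's method: a difference array built with `np.searchsorted` and summed with `np.cumsum`. It assumes the index is sorted and unique and that every `t_end` is at or after its own row's timestamp, as in the question's setup. The names `starts`, `stops` and `delta` are illustrative, and no timings were measured for it here.

```python
import numpy as np
import pandas as pd

# Same setup as in the question.
n_rows = 10000
ix_ = pd.date_range(start="2020-01-01 00:00", freq="min", periods=n_rows)
offsets_ = pd.to_timedelta(np.random.randint(0, 60, size=n_rows), unit="min")
df = pd.DataFrame({"t_end": ix_ + offsets_}, index=ix_)

# Row i covers positions starts[i] .. stops[i] - 1. Label slices are
# inclusive, so side="right" finds the first position strictly after t_end.
starts = np.arange(n_rows)
stops = np.searchsorted(
    df.index.to_numpy(), df["t_end"].to_numpy(), side="right"
)

# Difference array: +1 where an interval begins, -1 just past where it ends.
# np.add.at is used because several intervals can end at the same position.
delta = np.zeros(n_rows + 1, dtype=np.int64)
np.add.at(delta, starts, 1)
np.add.at(delta, stops, -1)

# The running sum is the number of intervals covering each row.
df["active"] = np.cumsum(delta[:-1])
```

The extra slot at the end of `delta` absorbs intervals that run past the last row, so no clipping is needed.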

This post on "pandas - Updating a block of a column faster than loc inside itertuples" is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/67122668/


