python - A simpler Python equivalent of R-style grep, including multiple things to match

Reposted by: 行者123 · Updated: 2023-12-04 02:51:55


This question is nearly a duplicate of this one, with a few tweaks.

Take the following data frame and get the positions of the columns whose names contain "sch" or "oa". This is simple enough in R:

df <- data.frame(cheese = rnorm(10),
                 goats = rnorm(10), 
                 boats = rnorm(10), 
                 schmoats = rnorm(10), 
                 schlomo = rnorm(10),
                 cows = rnorm(10))

grep("oa|sch", colnames(df))

[1] 2 3 4 5

write.csv(df, file = "df.csv")

Now in Python, I can use a somewhat verbose list comprehension:

import pandas as pd
df = pd.read_csv("df.csv", index_col = 0)
matches = [i for i in range(len(df.columns)) if "oa" in df.columns[i] or "sch" in df.columns[i]]

matches
Out[10]: [1, 2, 3, 4]

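I'm wondering whether Python offers a better way than the list comprehension above. Specifically, what if I have dozens of strings to match? In R, I could do something like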

regex <- paste(vector_of_strings, collapse = "|")
grep(regex, colnames(df))

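but it isn't obvious how to do this in Python with a list comprehension. Maybe I could use string manipulation to programmatically build a string that gets executed inside the list, to handle all the repeated "or" statements?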

Best answer

Use pandas' DataFrame.filter to run the same regular expression:

df.filter(regex = "oa|sch").columns
# Index(['goats', 'boats', 'schmoats', 'schlomo'], dtype='object')

df.filter(regex = "oa|sch").columns.values
# ['goats' 'boats' 'schmoats' 'schlomo']

Data

import numpy as np
import pandas as pd

np.random.seed(21419)

df = pd.DataFrame({'cheese': np.random.randn(10),
                   'goats': np.random.randn(10), 
                   'boats': np.random.randn(10), 
                   'schmoats': np.random.randn(10), 
                   'schlomo': np.random.randn(10),
                   'cows': np.random.randn(10)})

And for multiple search strings:

rgx = "|".join(list_of_strings)

df.filter(regex = rgx)
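
One caveat with joining strings into a pattern: if any of them contains regex metacharacters (such as "." or "+"), it will be interpreted as a pattern rather than as literal text. A minimal sketch of one way to guard against that, assuming the strings are meant as literal substrings; the contents of list_of_strings and the zero-filled frame below are made up for illustration:

```python
import re

import numpy as np
import pandas as pd

# Hypothetical search terms; the original post does not define list_of_strings.
list_of_strings = ["oa", "sch"]

# Escape each term so it is matched literally, then join the terms with "|".
rgx = "|".join(re.escape(s) for s in list_of_strings)

# Stand-in frame with the same column names as the example data.
columns = ["cheese", "goats", "boats", "schmoats", "schlomo", "cows"]
df = pd.DataFrame(np.zeros((2, len(columns))), columns=columns)

# DataFrame.filter keeps the columns whose labels match the pattern.
matched = list(df.filter(regex=rgx).columns)
print(matched)
```

If the strings are intended to be regular expressions themselves, skip the re.escape step and join them directly as shown above.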

To return the indices, consider the vectorized NumPy solution from @Divakar. Note that, unlike R, Python is zero-indexed.

def column_index(df, query_cols):
    cols = df.columns.values
    sidx = np.argsort(cols)
    return sidx[np.searchsorted(cols,query_cols,sorter=sidx)]

column_index(df, df.filter(regex="oa|sch").columns)
# [1 2 3 4]
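
As an alternative that is not part of the original answer, the positions can also be obtained directly from a boolean mask over the column names, or with a plain list comprehension over re.search. The sketch below assumes string column labels and uses a zero-filled stand-in frame with the same column names:

```python
import re

import numpy as np
import pandas as pd

# Stand-in frame with the same column names as the example data.
columns = ["cheese", "goats", "boats", "schmoats", "schlomo", "cows"]
df = pd.DataFrame(np.zeros((2, len(columns))), columns=columns)

rgx = "oa|sch"

# Boolean mask over the column labels, then the zero-based positions of the True entries.
mask = df.columns.str.contains(rgx, regex=True)
positions = np.flatnonzero(mask)

# The same result without pandas string methods: one re.search per column name.
positions_py = [i for i, name in enumerate(df.columns) if re.search(rgx, name)]

# Add 1 to reproduce the one-based positions that R's grep() reports.
r_style_positions = positions + 1

print(positions, positions_py, r_style_positions)
```

The list comprehension scales to any number of search strings because the alternation lives in the pattern, not in a chain of "or" conditions.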

Regarding "python - A simpler Python equivalent of R-style grep, including multiple things to match", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/54699227/
