
python-3.x - Selecting rows of a Pandas DataFrame based on the spaCy rule-based Matcher

Reposted. Author: 行者123. Updated: 2023-12-04 09:26:26

I need to slice a Pandas DataFrame based on the results of spaCy's rule-based matcher. Here is what I have tried.

import pandas as pd
import numpy as np
import spacy
from spacy.matcher import Matcher

df = pd.DataFrame([['Eight people believed injured in serious SH1 crash involving truck and three cars at Hunterville',
'Fire and emergency responding to incident at Mataura, Southland ouvea premix site',
'Civil Defence Minister Peeni Henare heartbroken over Northland flooding',
'Far North flooding: New photos reveal damage to roads']]).T
df.columns = ['col1']

nlp = spacy.load("en_core_web_sm")

flood_pattern = [{'LOWER': 'flooding'}]

matcher = Matcher(nlp.vocab, validate=True)
matcher.add("FLOOD_DIS", None, flood_pattern)
titles = (_ for _ in df['col1'])
g = (d for d in nlp.pipe(titles) if matcher(d))
x = list(g)

df2 = df[df['col1'].isin(x)]
df2
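This produces an empty DataFrame. However, it should extract the following two rows from df:
  • Civil Defence Minister Peeni Henare heartbroken over Northland flooding
  • Far North flooding: New photos reveal damage to roads

Best answer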

    You can do the following.

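    # Collect the matching titles as plain strings: isin() compares against
    # the strings in df['col1'], so a list of Doc objects never matches.
    titles = (_ for _ in df['col1'])

    A = []
    for i in range(len(df)):
        doc = nlp(next(titles))
        # Accept one or more matches, not exactly one
        if len(matcher(doc)) >= 1:
            A.append(str(doc))
    df2 = df[df['col1'].isin(A)]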
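
The same selection can also be written as a boolean mask, which avoids the string round trip through isin(). The sketch below is one possible variant, not the original answer's method, and it makes two assumptions that differ from the question: it uses the spaCy v3 form of matcher.add (patterns passed as a list), and it uses spacy.blank("en") instead of en_core_web_sm, which should be enough here because the pattern only checks the LOWER attribute and so needs nothing beyond the tokenizer.

```python
import pandas as pd
import spacy
from spacy.matcher import Matcher

df = pd.DataFrame({
    "col1": [
        "Eight people believed injured in serious SH1 crash involving truck and three cars at Hunterville",
        "Fire and emergency responding to incident at Mataura, Southland ouvea premix site",
        "Civil Defence Minister Peeni Henare heartbroken over Northland flooding",
        "Far North flooding: New photos reveal damage to roads",
    ]
})

# Assumption: a blank English pipeline (tokenizer only) is sufficient,
# because the pattern tests only the lowercased token text.
nlp = spacy.blank("en")

matcher = Matcher(nlp.vocab, validate=True)
# Assumption: spaCy v3 signature, where patterns are given as a list of patterns.
matcher.add("FLOOD_DIS", [[{"LOWER": "flooding"}]])

# nlp.pipe yields Docs in input order, so the mask lines up with the rows.
# matcher(doc) returns a list of matches; an empty list is falsy.
mask = [bool(matcher(doc)) for doc in nlp.pipe(df["col1"])]
df2 = df[mask]
print(df2)
```

Because the mask is built per row, duplicate titles and titles with several matches are handled without any extra bookkeeping.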

    Regarding "python-3.x - Selecting rows of a Pandas DataFrame based on the spaCy rule-based Matcher", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62993303/
