
python - String splitting problem with multi-word expressions

Reposted. Author: 行者123. Updated: 2023-11-28 21:30:02

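I have a series of strings, for example:

'i would like a blood orange'

I also have a list of strings, for example:

["blood orange", "loan shark"]

From the string, I want to get the following list:

["i", "would", "like", "a", "blood orange"]

What is the best way to get the list above? I have been using re throughout my code, but I am stumped by this problem.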

Best answer

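Here is a fairly simple generator implementation: split the string into words, group together the words that form a phrase, and yield the results.

(There is probably a cleaner way to handle skip, but for some reason I'm drawing a blank.)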

def split_with_phrases(sentence, phrase_list):
    words = sentence.split(" ")
    # Store each phrase as a tuple of words so it can be compared to a slice of words.
    phrases = set(tuple(s.split(" ")) for s in phrase_list)
    # default=0 keeps max() from raising ValueError when phrase_list is empty.
    max_phrase_length = max((len(p) for p in phrases), default=0)

    # Find a phrase within words starting at the specified index. Return the
    # phrase as a tuple, or None if no phrase starts at that index.
    def find_phrase(start_idx):
        # Iterate backwards, so we'll always find longer phrases before shorter ones.
        # Otherwise, if we have a phrase set like "hello world" and "hello world two",
        # we'll never match the longer phrase because we'll always match the shorter
        # one first.
        for phrase_length in range(max_phrase_length, 0, -1):
            test_word = tuple(words[start_idx:start_idx + phrase_length])
            if test_word in phrases:
                return test_word
        return None

    skip = 0
    for idx in range(len(words)):
        if skip:
            # This word was returned as part of a previous phrase; skip it.
            skip -= 1
            continue

        phrase = find_phrase(idx)
        if phrase is not None:
            # The current iteration consumes the first word of the phrase,
            # so only the remaining len(phrase) - 1 words need to be skipped.
            skip = len(phrase) - 1
            yield " ".join(phrase)
            continue

        yield words[idx]

print(list(split_with_phrases('i would like a blood orange',
                              ["blood orange", "loan shark"])))
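
One possibly cleaner way to deal with the skip bookkeeping is to advance the index by hand in a while loop, so no counter is needed. The following is only a minimal sketch of that idea and not part of the original answer: the name split_with_phrases_while is hypothetical, it assumes Python 3 and single-space-separated input as in the example, and it returns a list rather than yielding.

```python
def split_with_phrases_while(sentence, phrase_list):
    """Hypothetical variant: split into words, keeping listed phrases whole."""
    words = sentence.split(" ")
    phrases = {tuple(p.split(" ")) for p in phrase_list}
    # default=0 covers an empty phrase list.
    max_len = max((len(p) for p in phrases), default=0)

    result = []
    idx = 0
    while idx < len(words):
        # Try the longest candidate first so longer phrases win over their prefixes.
        for length in range(max_len, 0, -1):
            candidate = tuple(words[idx:idx + length])
            if candidate in phrases:
                result.append(" ".join(candidate))
                # Jump past every word of the matched phrase.
                idx += len(candidate)
                break
        else:
            # No phrase starts at this position: emit the single word.
            result.append(words[idx])
            idx += 1
    return result


print(split_with_phrases_while("i would like a blood orange",
                               ["blood orange", "loan shark"]))
# prints ['i', 'would', 'like', 'a', 'blood orange']
```

Because the index jumps directly past a matched phrase, the words inside it are never visited again, which is what the skip counter was emulating.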

For this question (python - String splitting problem with multi-word expressions), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/3974363/
