
python - Problem converting text from first person to second person while ignoring text inside quotation marks ""

Reposted. Author: 行者123. Updated: 2023-12-01 06:36:58

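I am trying to convert stories, sentences, words, and so on from first-person grammar to second-person grammar, without converting any text inside quotation marks ("" or “”).

This runs in a Google Colab Python 3 notebook. The code reads a .txt file from my Google Drive and converts the words in it from first person to second person using the "forms =" dictionary. A second problem is that, after conversion, spaces are inserted before and after quotation marks (both " and ' are affected).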


import nltk
from google.colab import drive

drive.mount('/content/drive')

sent = open('/content/drive/My Drive/storyuno.txt', 'r')


forms = {"am" : "are", "are" : "am", 'i' : 'you', 'my' : 'yours', 'me' : 'you', 'mine' : 'yours', 'you' : 'I', 'your' : 'my', 'yours' : 'mine'} # More?
def translate(word):
    if word.lower() in forms: return forms[word.lower()]
    return word

translated = []
quote_mode = False
for word in nltk.wordpunct_tokenize(sent.read()):
    if quote_mode:
        translated.append(word)
        if word == '"': quote_mode = False;

    if not quote_mode:
        translated.append(translate(word))
        if word == '"': quote_mode = True;

result = ' '.join(translated)

print(result)
sent.close()

The story I gave it as input:

The bottom line is that if I was going to tell anyone about the frog, it would be Soy. I decided that our walk home would be the most opportune time. “Did you see anything outside today during math?” I asked Soy as we started walking. “What do you mean? Like in the sky?” he asked, jumping over cracks in the sidewalk. “I mean right outside the window. Like right up against it,” I answered. “Like a person?” he asked, still hopping. Soy sat in the row farthest from the window, so it was possible, but unlikely, for someone to walk by without him noticing.

It gets converted to:

The bottom line is that if you was going to tell anyone about the frog , it would be Soy . you decided that our walk home would be the most opportune time . “ Did I see anything outside today during math ?” you asked Soy as we started walking . “ What do I mean ? Like in the sky ?” he asked , jumping over cracks in the sidewalk . “ you mean right outside the window . Like right up against it ,” you answered . “ Like a person ?” he asked , still hopping . Soy sat in the row farthest from the window , so it was possible , but unlikely , for someone to walk by without him noticing .

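The problem is that text inside quotation marks should not be converted. For example: I told her, "You are boring". ---> You told her, "You are boring".

Please ignore any grammatical errors other than the quotation issue; I will fix those later.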

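Best answer

There are two problems with your quote handling. First, “ (the curly quote used in your text) is not equal to " (the straight quote your code compares against). Second, quotes can be bundled together with adjacent punctuation, so you get tokens like ?”. The solution is to use a regular expression to check whether a token contains any quote character: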

import re
quote_re = re.compile(r'["“”]')

Then change

if word == '"':

to

if quote_re.search(word):
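
To see the change in context, here is a minimal sketch of the whole loop with the regex check applied. It is not the answer's exact code. The tokenize() helper is an assumption that stands in for nltk.wordpunct_tokenize so the example runs without NLTK, and convert_tokens() is a hypothetical name. The sketch also uses a single if/elif/else chain instead of the two consecutive if blocks in the question, on the assumption that a closing-quote token should be handled only once.

```python
import re

# Matches straight and curly double quotes, as in the answer above.
quote_re = re.compile(r'["“”]')

# Same mapping as in the question.
forms = {"am": "are", "are": "am", "i": "you", "my": "yours", "me": "you",
         "mine": "yours", "you": "I", "your": "my", "yours": "mine"}


def translate(word):
    """Swap a first-/second-person word; leave everything else unchanged."""
    return forms.get(word.lower(), word)


def tokenize(text):
    # Assumption: stand-in for nltk.wordpunct_tokenize (runs of word
    # characters, or runs of non-space punctuation).
    return re.findall(r"\w+|[^\w\s]+", text)


def convert_tokens(text):
    """Translate tokens outside quotes; copy tokens inside quotes as-is."""
    translated = []
    quote_mode = False
    for word in tokenize(text):
        if quote_re.search(word):
            # A token containing a quote (such as '“' or '?”') flips the
            # state and is itself never translated.
            quote_mode = not quote_mode
            translated.append(word)
        elif quote_mode:
            translated.append(word)
        else:
            translated.append(translate(word))
    return translated


print(convert_tokens('“Did you see anything?” I asked.'))
```

One limitation of this sketch: a token that contains two quote characters at once (for example an empty "" pair) flips the state only once, so such input would need extra handling.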

The whitespace problem can be solved by detokenizing:

from nltk.tokenize.treebank import TreebankWordDetokenizer
detokenizer = TreebankWordDetokenizer()
result = detokenizer.detokenize(translated)
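
If detokenizing still leaves the spacing slightly different from the input, one alternative is to avoid tokenizing altogether. The sketch below is a different technique from the one in the answer: it splits the text into quoted and unquoted segments with a regular expression and substitutes words in place, so the original whitespace is never touched. The names quoted_span_re, swap and convert_text are hypothetical, and the sketch assumes every opening quote has a matching closing quote.

```python
import re

# Same mapping as in the question.
forms = {"am": "are", "are": "am", "i": "you", "my": "yours", "me": "you",
         "mine": "yours", "you": "I", "your": "my", "yours": "mine"}

# A quoted span: an opening straight or curly quote, any non-quote
# characters, then a closing straight or curly quote.
quoted_span_re = re.compile(r'(["“][^"“”]*["”])')
word_re = re.compile(r"\w+")


def swap(match):
    """Replacement callback: translate one word, or return it unchanged."""
    word = match.group(0)
    return forms.get(word.lower(), word)


def convert_text(text):
    # With one capture group, re.split returns unquoted text at even
    # indexes and the quoted spans themselves at odd indexes.
    parts = quoted_span_re.split(text)
    return "".join(
        part if index % 2 else word_re.sub(swap, part)
        for index, part in enumerate(parts)
    )


print(convert_text('“I mean right outside the window,” I answered.'))
```

Because only the matched words are replaced, punctuation and spacing around the quotes come out exactly as they went in. Capitalization at the start of a sentence is not handled here.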

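Regarding python - Problem converting text from first person to second person while ignoring text inside quotation marks "", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59622182/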
