python - How to avoid printing emojis in comments or submissions when using praw

Reposted. Author: 行者123. Updated: 2023-12-01 07:00:11
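I get an error message when I try to print comments or submissions that contain emojis. How can I ignore the emojis and print only letters and numbers?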

Web scraping with Praw:

top_posts2 = page.top(limit = 25)
for post in top_posts2:
    outputFile.write(post.title)
    outputFile.write(' ')
    outputFile.write(str(post.score))
    outputFile.write('\n')
    outputFile.write(post.selftext)
    outputFile.write('\n')

    submissions = reddit.submission(id = post.id)

    comment_page = submissions.comments
    top_comment = comment_page[0] #by default, this will be the best comment of the post

    commentBody = top_comment.body

    outputFile.write(top_comment.body)
    outputFile.write('\n')

I only want to output letters and numbers, and perhaps some special characters (or all of them).

Best answer

There are a few ways to do this. I would suggest creating a kind of "text cleaning" function:

def cleanText(text):
    new_text = ""
    for c in text: # for each character in the text
        if c.isalnum(): # check if it is either a letter or number (alphanumeric)
            new_text += c
    return new_text

Or, if you want to include specific non-alphanumeric characters:

def cleanText(text):
    valid_symbols = "!@#$%^&*()" # <-- add whatever symbols you want here
    new_text = ""
    for c in text: # for each character in the text
        if c.isalnum() or c in valid_symbols: # check if alphanumeric or a valid symbol
            new_text += c
    return new_text
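
One thing to watch for: str.isalnum() is False for spaces and newlines, and neither is in valid_symbols, so both versions above also remove the whitespace between words. A minimal sketch of a variant that keeps whitespace follows; the name clean_text_keep_spaces is made up here for illustration and is not part of the original answer.

```python
def clean_text_keep_spaces(text):
    # Hypothetical variant of cleanText: keep letters, digits, and whitespace.
    # Emojis are neither alphanumeric nor whitespace, so they are dropped.
    return "".join(c for c in text if c.isalnum() or c.isspace())


# "\U0001F44D" is the thumbs-up emoji; the space on each side of it is kept.
print(clean_text_keep_spaces("Great post \U0001F44D 100%!"))
```

Note that isalnum() is also True for letters outside ASCII (accented letters, CJK characters, and so on), so those are kept as well.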

Then in your script you can do something like:

commentBody = cleanText(top_comment.body)
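
The question does not show the actual error message. If it happens to be a UnicodeEncodeError raised because the output file uses a default encoding that cannot represent emojis (an assumption, not something confirmed by the question), two other options may be worth trying: strip everything outside ASCII with encode/decode, or keep the emojis and open the file as UTF-8. The helper name strip_non_ascii and the file name below are hypothetical.

```python
import os
import tempfile


def strip_non_ascii(text):
    # Hypothetical helper: drop every character that cannot be encoded as
    # ASCII. This removes emojis, but also accented letters such as "é".
    return text.encode("ascii", errors="ignore").decode("ascii")


sample = "Nice \U0001F600 caf\u00e9"
print(strip_non_ascii(sample))

# Alternative: keep the text unchanged and write the file as UTF-8,
# which can represent emojis. The path is only an example location.
path = os.path.join(tempfile.gettempdir(), "praw_output_example.txt")
with open(path, "w", encoding="utf-8") as output_file:
    output_file.write(sample + "\n")
```

The first option is lossy for any non-English text, so the character-by-character filter above is the better fit when non-ASCII letters should survive.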

Regarding "python - How to avoid printing emojis in comments or submissions when using praw", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58674125/
