python - PRAW/Tweepy: filtering keywords

Reposted. Author: 太空宇宙. Updated: 2023-11-03 17:03:36

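I'm having trouble filtering my PRAW results. I want to exclude posts whose titles contain keywords such as [request], [off topic], or [nsfw], so that posts like these from the PRAW results are not tweeted through Tweepy. I looked for documentation but couldn't find anything on the PRAW site.

Here is my code: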

def poster():
    conn = sqlite3.connect('jb_id.db')
    c = conn.cursor()
    toTweet = []
    for submission in reddit.subreddit(SUB).hot(limit=POST_LIMIT):
        if not submission.stickied and len(submission.title) < 255:
            url = submission.shortlink
            title = submission.title
            udate = time.strftime("%Y-%m-%d %X", time.gmtime(submission.created_utc))

            try:
                # This keeps a record of the posts in the database
                c.execute("INSERT INTO posts (id, title, udate) VALUES (?, ?, ?)",
                          (url, title, udate))
                conn.commit()

                message = title + " " + url
                print(message)
                toTweet.append(message)

            except sqlite3.IntegrityError:
                # This means the post was already tweeted and is ignored
                print("Duplicate", url)

    c.close()
    conn.close()
    tweeter(toTweet)

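As you can see, I already exclude stickied posts and titles that are too long (255 characters or more). Is there a way to also filter the Reddit posts in the PRAW results by the keywords I mentioned above? Thanks!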

Best answer

Define the keywords that should not appear in a submission title

bad_keywords = "[request]", "[off topic]", "[nsfw]"

If the submission title contains any of these keywords, skip that iteration of the loop

title_lowercase = submission.title.lower()
if any(x in title_lowercase for x in bad_keywords):
    continue
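The check above can be tried on its own, without a Reddit connection. The following is a minimal sketch: the helper name `has_bad_keyword` and the sample titles are made up for illustration, and only the keyword tuple comes from the answer.

```python
# Keywords from the answer; the helper name below is hypothetical.
BAD_KEYWORDS = ("[request]", "[off topic]", "[nsfw]")


def has_bad_keyword(title, bad_keywords=BAD_KEYWORDS):
    """Return True if any keyword occurs in the title, ignoring case."""
    title_lowercase = title.lower()
    return any(keyword in title_lowercase for keyword in bad_keywords)


# Made-up titles, only to exercise the check.
titles = [
    "[Request] Looking for a beat",
    "New track released today",
    "Live set [NSFW]",
]
kept = [t for t in titles if not has_bad_keyword(t)]
print(kept)
```

Lowercasing the title once and comparing against lowercase keywords is what makes the match case-insensitive, so the keywords themselves must be written in lowercase.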

I would combine this with your other exclusions to reduce indentation and make it more readable

bad_title = any(x in title_lowercase for x in bad_keywords)
# Skip when any one exclusion applies, so the checks are joined with "or"
skip_submission = submission.stickied or len(submission.title) >= 255 or bad_title
if skip_submission:
    continue
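The question keeps a post only when `not submission.stickied and len(submission.title) < 255`. By De Morgan's laws, the matching skip condition joins the individual checks with `or` and uses `>= 255` for the length. A minimal sketch that checks this equivalence, where `should_skip` and `keep_original` are hypothetical helpers taking plain values instead of a PRAW submission:

```python
BAD_KEYWORDS = ("[request]", "[off topic]", "[nsfw]")
MAX_TITLE_LENGTH = 255  # the question keeps titles with len(title) < 255


def should_skip(stickied, title, bad_keywords=BAD_KEYWORDS):
    """Skip when at least one exclusion applies (hypothetical helper)."""
    bad_title = any(k in title.lower() for k in bad_keywords)
    return stickied or len(title) >= MAX_TITLE_LENGTH or bad_title


def keep_original(stickied, title):
    """The question's original keep-condition, before keyword filtering."""
    return not stickied and len(title) < MAX_TITLE_LENGTH


# For titles without bad keywords, skipping is exactly "not kept".
for stickied in (True, False):
    for title in ("short title", "x" * 300):
        assert should_skip(stickied, title) == (not keep_original(stickied, title))

print(should_skip(False, "Fresh mix [NSFW]"))
print(should_skip(False, "Fresh mix"))
```

Joining the checks with `and` instead would skip a post only when it is stickied, too long, and badly titled all at once, which would let almost every unwanted post through.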

Full solution

def poster():
    conn = sqlite3.connect('jb_id.db')
    c = conn.cursor()
    toTweet = []

    bad_keywords = "[request]", "[off topic]", "[nsfw]"

    for submission in reddit.subreddit(SUB).hot(limit=POST_LIMIT):
        title = submission.title
        title_lowercase = title.lower()

        bad_title = any(x in title_lowercase for x in bad_keywords)
        # Skip when any one exclusion applies, so the checks are joined with "or"
        skip_submission = submission.stickied or len(title) >= 255 or bad_title

        if skip_submission:
            continue

        url = submission.shortlink
        udate = time.strftime("%Y-%m-%d %X", time.gmtime(submission.created_utc))

        try:
            # This keeps a record of the posts in the database
            c.execute("INSERT INTO posts (id, title, udate) VALUES (?, ?, ?)",
                      (url, title, udate))
            conn.commit()

            message = title + " " + url
            print(message)
            toTweet.append(message)

        except sqlite3.IntegrityError:
            # This means the post was already tweeted and is ignored
            print("Duplicate", url)

    c.close()
    conn.close()
    tweeter(toTweet)
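The duplicate handling in `poster()` only works if inserting the same `id` twice raises `sqlite3.IntegrityError`, which requires a uniqueness constraint on that column. The table definition is not shown in the question, so the schema below is an assumption, as are the helper name `record_post` and the placeholder URL. A minimal sketch using an in-memory database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
# Assumed schema: the question does not show it. The PRIMARY KEY is what
# makes a second insert of the same id raise sqlite3.IntegrityError.
c.execute("CREATE TABLE posts (id TEXT PRIMARY KEY, title TEXT, udate TEXT)")


def record_post(url, title, udate):
    """Return True if the post is new, False if it was already recorded."""
    try:
        c.execute("INSERT INTO posts (id, title, udate) VALUES (?, ?, ?)",
                  (url, title, udate))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        return False


# Placeholder values, not real Reddit data.
first = record_post("https://redd.it/abc123", "Some title", "2019-11-11 12:00:00")
second = record_post("https://redd.it/abc123", "Some title", "2019-11-11 12:00:00")
print(first, second)
```

Because filtered posts hit `continue` before the insert, they are never written to the table, so a post would be reconsidered on a later run if its title or stickied status changed.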

Regarding python - PRAW/Tweepy: filtering keywords, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58799355/
