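I'm trying to open a file dialog when a certain button in my tkinter GUI is pressed, but the dialog opens automatically as soon as I run the program. Also, if I press Cancel in the file dialog, my program gets stuck and I have to close it; I'm not sure why.
I tried putting all the tkinter-related code in a separate file, but when I called a method from that file the tkinter GUI opened twice, so that didn't work. I couldn't find a way around it, so I thought combining the two would be easier. I managed to stop the tkinter GUI from appearing twice, but now I'm stuck. I tried the debugger that ships with Spyder, but apart from showing me why the tkinter GUI kept appearing twice, it wasn't much help.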
import os
import PyPDF2
import pandas
import webbrowser
import tkinter as tk
from tkinter import ttk
from tkinter import filedialog
from nltk.tokenize import RegexpTokenizer
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.stem.wordnet import WordNetLemmatizer
from nltk.tokenize import word_tokenize
#Creates the GUI that will be used to select inputs#
window = tk.Tk()
window.geometry("300x300")
window.resizable(0, 0)
window.title("Word Frequency Program")
#Allows user to select PDF to use in program#
def select_PDF():
    filename = filedialog.askopenfilename(initialdir = "/", title = "Select file", filetypes = (("pdf files", "*.pdf"), ("all files", "*.*")))
    return filename
button1 = ttk.Button(window, text = "Select File", command = select_PDF)
button1.grid()
#Quits out of the program when certain button clicked#
button3 = ttk.Button(window, text = "Quit", command = window.quit)
button3.grid()
#Loads in PDF into program#
filepath = select_PDF()
PDF_file = open(filepath, 'rb')
read_pdf = PyPDF2.PdfFileReader(PDF_file)
#Determines number of pages in PDF file and sets the document content to 'null'#
number_of_pages = read_pdf.getNumPages()
doc_content = ""
#Extract text from the PDF file#
for i in range(number_of_pages):
    page = read_pdf.getPage(0)
    page_content = page.extractText()
    doc_content += page_content
#Method that a PDF read into the program goes through to eliminate any unwanted words or symbols#
def preprocess(text):
    #Filters out punctuation from paragraph which becomes tokenized to words and punctuation#
    tokenizer = RegexpTokenizer(r'\w+')
    result = tokenizer.tokenize(text)
    #Makes all words lowercase#
    words = [item.lower() for item in result]
    #Removes all remaining tokens that are not alphabetic#
    result = [word for word in words if word.isalpha()]
    #Imports stopwords to be removed from paragraph#
    stop_words = set(stopwords.words("english"))
    #Removes the stop words from the paragraph#
    filtered_sent = []
    for w in result:
        if w not in stop_words:
            filtered_sent.append(w)
    #Return word to root word/chop-off derivational affixes#
    ps = PorterStemmer()
    stemmed_words = []
    for w in filtered_sent:
        stemmed_words.append(ps.stem(w))
    #Lemmatization, which reduces words to their base form, i.e. linguistically correct lemmas#
    lem = WordNetLemmatizer()
    lemmatized_words = ' '.join([lem.lemmatize(w,'n') and lem.lemmatize(w,'v') for w in filtered_sent])
    #Re-tokenize lemmatized words string#
    tokenized_word = word_tokenize(lemmatized_words)
    return tokenized_word
#Turns the text drawn from the PDF file into data the remaining code can understand#
tokenized_words = preprocess(doc_content)
#Determine frequency of words tokenized + lemmatized text#
from nltk.probability import FreqDist
fdist = FreqDist(tokenized_words)
final_list = fdist.most_common(len(fdist))
#Organize data into two columns and export the data to an html that automatically opens#
df = pandas.DataFrame(final_list, columns = ["Word", "Frequency"])
df.to_html('word_frequency.html')
webbrowser.open('file://' + os.path.realpath('word_frequency.html'))
window.mainloop()
window.destroy()
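The tkinter GUI should come up on its own, and the file dialog should not appear until you press the button in the GUI. The program also shouldn't crash when you press Cancel in the file dialog.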
Best answer
If you want the code to run only after the button is pressed, you have to put all of it inside select_PDF:
def select_PDF():
    filename = filedialog.askopenfilename(initialdir = "/", title = "Select file", filetypes = (("pdf files", "*.pdf"), ("all files", "*.*")))
    #Loads in PDF into program#
    PDF_file = open(filename, 'rb')
    read_pdf = PyPDF2.PdfFileReader(PDF_file)
    #Determines number of pages in PDF file and sets the document content to an empty string#
    number_of_pages = read_pdf.getNumPages()
    doc_content = ""
    #Extract text from the PDF file#
    # ... rest of code ...
#The button is created after the function it refers to has been defined#
button1 = ttk.Button(window, text="Select File", command=select_PDF)
Button works differently from input(): it does not pause the code and does not wait for your click. It only defines the button, and mainloop() displays it.
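The difference between handing a function to a button and calling it yourself can be sketched without opening a window. The FakeButton class below is a hypothetical stand-in written for this illustration, not part of tkinter; it only mimics the way a button stores its command and runs it later (ttk.Button offers a comparable invoke() method). The file name example.pdf is likewise made up.

```python
class FakeButton:
    """Hypothetical stand-in for ttk.Button, used only for illustration."""

    def __init__(self, command):
        # The callback is stored here; it is NOT called at creation time.
        self.command = command

    def invoke(self):
        # Simulates a click: only now does the callback run.
        return self.command()


calls = []


def select_pdf():
    # Stands in for the real dialog; records that it was "opened".
    calls.append("dialog opened")
    return "example.pdf"


button = FakeButton(command=select_pdf)  # pass the function, no parentheses
calls_after_creation = list(calls)       # nothing has run yet
result = button.invoke()                 # the "click" runs the callback
```

By contrast, a module-level line such as filepath = select_PDF() calls the function immediately, which is why the dialog in the question opens before the window is shown.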
Your code should look like this:
import os
import PyPDF2
import pandas
import webbrowser
import tkinter as tk
from tkinter import ttk
from tkinter import filedialog
from nltk.tokenize import RegexpTokenizer
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.stem.wordnet import WordNetLemmatizer
from nltk.tokenize import word_tokenize
# --- functions ---
def preprocess(text):
    '''Clean the text read from a PDF by removing unwanted words and symbols.'''
    #Filters out punctuation from paragraph which becomes tokenized to words and punctuation#
    tokenizer = RegexpTokenizer(r'\w+')
    result = tokenizer.tokenize(text)
    #Makes all words lowercase#
    words = [item.lower() for item in result]
    #Removes all remaining tokens that are not alphabetic#
    result = [word for word in words if word.isalpha()]
    #Imports stopwords to be removed from paragraph#
    stop_words = set(stopwords.words("english"))
    #Removes the stop words from the paragraph#
    filtered_sent = []
    for w in result:
        if w not in stop_words:
            filtered_sent.append(w)
    #Return word to root word/chop-off derivational affixes (stemmed_words is not used below)#
    ps = PorterStemmer()
    stemmed_words = []
    for w in filtered_sent:
        stemmed_words.append(ps.stem(w))
    #Lemmatization; because of `and`, the verb lemma is the value that is kept#
    lem = WordNetLemmatizer()
    lemmatized_words = ' '.join([lem.lemmatize(w,'n') and lem.lemmatize(w,'v') for w in filtered_sent])
    #Re-tokenize lemmatized words string#
    tokenized_word = word_tokenize(lemmatized_words)
    return tokenized_word
def select_PDF():
    filename = filedialog.askopenfilename(initialdir = "/", title = "Select file", filetypes = (("pdf files", "*.pdf"), ("all files", "*.*")))
    #askopenfilename returns an empty value when the dialog is cancelled#
    if not filename:
        return
    PDF_file = open(filename, 'rb')
    read_pdf = PyPDF2.PdfFileReader(PDF_file)
    #Determines number of pages in PDF file and sets the document content to an empty string#
    number_of_pages = read_pdf.getNumPages()
    doc_content = ""
    #Extract text from every page of the PDF file (page i, not always page 0)#
    for i in range(number_of_pages):
        page = read_pdf.getPage(i)
        page_content = page.extractText()
        doc_content += page_content
    PDF_file.close()
    #Turns the text drawn from the PDF file into data the remaining code can understand#
    tokenized_words = preprocess(doc_content)
    #Determine frequency of words tokenized + lemmatized text#
    from nltk.probability import FreqDist
    fdist = FreqDist(tokenized_words)
    final_list = fdist.most_common(len(fdist))
    #Organize data into two columns and export the data to an html that automatically opens#
    df = pandas.DataFrame(final_list, columns = ["Word", "Frequency"])
    df.to_html('word_frequency.html')
    webbrowser.open('file://' + os.path.realpath('word_frequency.html'))
# --- main ---
#Creates the GUI that will be used to select inputs#
window = tk.Tk()
window.geometry("300x300")
window.resizable(0, 0)
window.title("Word Frequency Program")
button1 = ttk.Button(window, text = "Select File", command=select_PDF)
button1.grid()
#Quits out of the program when certain button clicked#
button3 = ttk.Button(window, text="Quit", command=window.quit)
button3.grid()
window.mainloop()
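window.destroy()
Alternatively, you can use the button only to select the filename: save it in a global variable, close the window (window.quit() ends mainloop(), and the window.destroy() that follows removes the window), and put the rest of the code after mainloop(). mainloop() waits until you close the window, so all code after mainloop() runs once you have selected a file (and the window has closed).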
import os
import PyPDF2
import pandas
import webbrowser
import tkinter as tk
from tkinter import ttk
from tkinter import filedialog
from nltk.tokenize import RegexpTokenizer
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.stem.wordnet import WordNetLemmatizer
from nltk.tokenize import word_tokenize
# --- functions ---
def preprocess(text):
    '''Clean the text read from a PDF by removing unwanted words and symbols.'''
    #Filters out punctuation from paragraph which becomes tokenized to words and punctuation#
    tokenizer = RegexpTokenizer(r'\w+')
    result = tokenizer.tokenize(text)
    #Makes all words lowercase#
    words = [item.lower() for item in result]
    #Removes all remaining tokens that are not alphabetic#
    result = [word for word in words if word.isalpha()]
    #Imports stopwords to be removed from paragraph#
    stop_words = set(stopwords.words("english"))
    #Removes the stop words from the paragraph#
    filtered_sent = []
    for w in result:
        if w not in stop_words:
            filtered_sent.append(w)
    #Return word to root word/chop-off derivational affixes (stemmed_words is not used below)#
    ps = PorterStemmer()
    stemmed_words = []
    for w in filtered_sent:
        stemmed_words.append(ps.stem(w))
    #Lemmatization; because of `and`, the verb lemma is the value that is kept#
    lem = WordNetLemmatizer()
    lemmatized_words = ' '.join([lem.lemmatize(w,'n') and lem.lemmatize(w,'v') for w in filtered_sent])
    #Re-tokenize lemmatized words string#
    tokenized_word = word_tokenize(lemmatized_words)
    return tokenized_word
def select_PDF():
    global filename # to assign to global variable
    filename = filedialog.askopenfilename(initialdir = "/", title = "Select file", filetypes = (("pdf files", "*.pdf"), ("all files", "*.*")))
    window.quit() # leave mainloop(); tk.Tk has no close() method
# --- main ---
filename = None # create global variable with default value at start
#Creates the GUI that will be used to select inputs#
window = tk.Tk()
window.geometry("300x300")
window.resizable(0, 0)
window.title("Word Frequency Program")
button1 = ttk.Button(window, text = "Select File", command=select_PDF)
button1.grid()
#Quits out of the program when certain button clicked#
button3 = ttk.Button(window, text="Quit", command=window.quit)
button3.grid()
window.mainloop()
window.destroy()
# --- executed after closing window ---
if filename: # check if filename was selected
    PDF_file = open(filename, 'rb')
    read_pdf = PyPDF2.PdfFileReader(PDF_file)
    #Determines number of pages in PDF file and sets the document content to an empty string#
    number_of_pages = read_pdf.getNumPages()
    doc_content = ""
    #Extract text from every page of the PDF file (page i, not always page 0)#
    for i in range(number_of_pages):
        page = read_pdf.getPage(i)
        page_content = page.extractText()
        doc_content += page_content
    PDF_file.close()
    #Turns the text drawn from the PDF file into data the remaining code can understand#
    tokenized_words = preprocess(doc_content)
    #Determine frequency of words tokenized + lemmatized text#
    from nltk.probability import FreqDist
    fdist = FreqDist(tokenized_words)
    final_list = fdist.most_common(len(fdist))
    #Organize data into two columns and export the data to an html that automatically opens#
    df = pandas.DataFrame(final_list, columns = ["Word", "Frequency"])
    df.to_html('word_frequency.html')
    webbrowser.open('file://' + os.path.realpath('word_frequency.html'))
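On the Cancel case from the question: askopenfilename returns an empty value when the dialog is cancelled, and passing that to open() raises an error. A plain truthiness test covers Cancel; a slightly stricter check might also confirm that the path points at an existing file. The helper below is only a sketch, and the name is_usable_selection is made up for this example rather than taken from the original answer.

```python
import os


def is_usable_selection(filename):
    """Return True only if the dialog result names an existing file.

    Hypothetical helper: an empty string (or empty tuple) means the
    user pressed Cancel, so there is nothing to open.
    """
    # bool(...) short-circuits first, so os.path.isfile never sees
    # an empty value.
    return bool(filename) and os.path.isfile(filename)
```

A callback could then start with a line like "if not is_usable_selection(filename): return" before trying to open the file.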
About python - opening the file dialog only when a specific button is pressed instead of automatically: a similar question was found on Stack Overflow: https://stackoverflow.com/questions/57225888/