
java - Need help parsing a file with Java

Reposted. Author: 太空宇宙. Updated: 2023-11-03 19:02:44

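I'm currently working on a small data structures project. I'm trying to get data on universities across the country and then do some data manipulation on it. I found the data here: http://archive.ics.uci.edu/ml/machine-learning-databases/university/university.data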

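However, the problem with this data is (quoting from the website): "It is a LISP readable file with a few relevant functions at the end of the data file." I plan to take this data and save it as a .txt file.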

The file looks something like this:

(def-instance Adelphi
(state newyork)
(control private)
(no-of-students thous:5-10)
(male:female ratio:30:70)
(student:faculty ratio:15:1)
(sat verbal 500)
(sat math 475)
(expenses thous$:7-10)
(percent-financial-aid 60)
(no-applicants thous:4-7)
(percent-admittance 70)
(percent-enrolled 40)
(academics scale:1-5 2)
(social scale:1-5 2)
(quality-of-life scale:1-5 2)
(academic-emphasis business-administration)
(academic-emphasis biology))
(def-instance Arizona-State
(state arizona)
(control state)
(no-of-students thous:20+)
(male:female ratio:50:50)
(student:faculty ratio:20:1)
(sat verbal 450)
(sat math 500)
(expenses thous$:4-7)
(percent-financial-aid 50)
(no-applicants thous:17+)
(percent-admittance 80)
(percent-enrolled 60)
(academics scale:1-5 3)
(social scale:1-5 4)
(quality-of-life scale:1-5 5)
(academic-emphasis business-education)
(academic-emphasis engineering)
(academic-emphasis accounting)
(academic-emphasis fine-arts))

......

End of the file:

(dfx def-instance (l)
(tlet (instance (car l) f-list (cdr l))
(cond ((or (null instance) (consp instance))
(msg t instance " is not a valid instance name (must be an atom)"))
(t (make:event instance)
(push instance !instances)
(:= (get instance 'features)
(tfor (f in f-list)
(when (cond ((or (atom f) (null (cdr f)))
(msg t f " is not a valid feature "
"(must be a 2 or 3 item list)") nil)
((consp (car f))
(msg t (car f) " is not a valid feature "
"name (must be an atom)") nil)
((and (cddr f) (consp (cadr f)))
(msg t (cadr f) " is not a valid feature "
"role (must be an atom)") nil)
(t t)))
(save (cond ((equal (length f) 3)
(make:feature (car f) (cadr f) (caddr f)))
(t (make:feature (car f) 'value (cadr f)))))))
instance))))

(set-if !instances nil)



(dex run-uniq-colleges (l n)
(tfor (sc in l)
(when (cond ((ge (length *events-added*) n))
((not (get sc 'duplicate))
(run-instance sc)
~ (remprop sc 'features)
nil)
(t (remprop sc 'features) nil)))
(stop)))

The data I'm most interested in is the number of students, the academic emphasis, and the school name.

Any help is greatly appreciated.

Best answer

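You can either write or use a Lisp file parser, or you can ignore the language the file is written in and focus on the data. You mentioned that you need: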

  • The school name
  • The number of students
  • The academic emphasis

You can grep for the relevant keywords (def-instance, no-of-students, academic-emphasis), which gives you (based on your example):

(def-instance Adelphi
(no-of-students thous:5-10)
(academic-emphasis business-administration)
(academic-emphasis biology))
(def-instance Arizona-State
(no-of-students thous:20+)
(academic-emphasis business-education)
(academic-emphasis engineering)
(academic-emphasis accounting)
(academic-emphasis fine-arts))
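
If grep is not available, the same filtering can be done in Java. The following is only a minimal sketch, not part of the original answer: the class name KeywordFilter is made up for illustration, the file path is taken from the first command-line argument, and the file is assumed to be plain ASCII/Latin-1 text. Each keyword is matched directly after the opening parenthesis, so the `(dfx def-instance (l)` function definition at the end of the file is not picked up.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: keeps only the lines that carry the three fields of interest.
class KeywordFilter {
    // Keywords from the answer, each anchored to the opening parenthesis.
    static final String[] KEYWORDS = {
        "(def-instance ", "(no-of-students ", "(academic-emphasis "
    };

    static List<String> filter(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            for (String keyword : KEYWORDS) {
                if (trimmed.startsWith(keyword)) {
                    kept.add(trimmed);
                    break; // one match is enough for this line
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: args[0] is the path of the saved data file.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        filter(lines).forEach(System.out::println);
    }
}
```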

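Filtering the file this way simplifies writing a dedicated parser: each def-instance is followed by a name, and every academic-emphasis and no-of-students entry that appears before the next def-instance belongs to the most recently defined name.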
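
The grouping rule described above can be sketched in Java roughly as follows. This is an illustrative sketch under assumptions, not the answerer's code: the names UniversityParser and University are hypothetical, each entry is assumed to sit on its own line as in the sample, and the no-of-students value is kept as raw text (for example thous:5-10) rather than interpreted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical parser: groups the filtered fields under the school they belong to.
class UniversityParser {
    // One school: its name, the raw no-of-students text, and all academic emphases.
    static class University {
        final String name;
        String students = "";
        final List<String> emphases = new ArrayList<>();

        University(String name) { this.name = name; }

        @Override
        public String toString() { return name + " | " + students + " | " + emphases; }
    }

    static List<University> parse(List<String> lines) {
        List<University> result = new ArrayList<>();
        University current = null; // the most recently seen def-instance
        for (String raw : lines) {
            String line = raw.trim();
            if (line.startsWith("(def-instance ")) {
                current = new University(value(line, "(def-instance "));
                result.add(current);
            } else if (current != null && line.startsWith("(no-of-students ")) {
                current.students = value(line, "(no-of-students ");
            } else if (current != null && line.startsWith("(academic-emphasis ")) {
                current.emphases.add(value(line, "(academic-emphasis "));
            }
            // Every other line, including the Lisp functions at the end, is ignored.
        }
        return result;
    }

    // Returns the text after the keyword with closing parentheses removed.
    private static String value(String line, String keyword) {
        return line.substring(keyword.length()).replace(")", "").trim();
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "(def-instance Adelphi",
            "(state newyork)",
            "(no-of-students thous:5-10)",
            "(academic-emphasis business-administration)",
            "(academic-emphasis biology))");
        // prints: Adelphi | thous:5-10 | [business-administration, biology]
        parse(sample).forEach(System.out::println);
    }
}
```

Keeping the student count as a string avoids guessing how ranges such as thous:20+ should be converted to numbers; that decision depends on what the project does with the data.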

Regarding "java - Need help parsing a file with Java", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/5568724/
