
rdf - Querying the Project Gutenberg catalog.rdf with SPARQL

Reposted. Author: 行者123. Updated: 2023-12-01 21:26:40

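I'm having trouble structuring SPARQL queries for the Project Gutenberg catalog (available from Gutenberg Feeds, at the bottom of the page). I know I lack an understanding of how SPARQL, RDF and the like actually work, and that I'm conflating them with SQL. I've worked through several tutorials, but I still can't see how to piece together a WHERE clause over what appear to be multi-dimensional data sets.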

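I've imported catalog.rdf into a TDB database (from the Jena project), and I'm using the tdbquery tool to set up my queries initially, before wrapping them in a command-line tool that allows searching by author or title.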

Here's what I have so far:

$ cat gutenquery.tq
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dcmitype: <http://purl.org/dc/dcmitype/>
PREFIX cc: <http://web.resource.org/cc/>
PREFIX pgterms: <http://www.gutenberg.org/rdfterms/>

SELECT ?title ?author
WHERE {
  ?book dc:title ?title ;
        dc:creator ?author
}
LIMIT 10

$ ./tdbquery --loc=/var/db/gutenberg/ --file=gutenquery.tq
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| title | author |
======================================================================================================================================================================
| "The Belgian Curtain\nEurope after Communism"^^rdf:XMLLiteral | "Vaknin, Samuel, 1961-"^^rdf:XMLLiteral |
| "Fairy Tales; Their Origin and Meaning\nWith Some Account of Dwellers in Fairyland"^^rdf:XMLLiteral | "Bunce, John Thackray, 1828-1899"^^rdf:XMLLiteral |
| "The World English Bible (WEB): Zephaniah"^^rdf:XMLLiteral | "Anonymous"^^rdf:XMLLiteral |
| "Lectures of Col. R. G. Ingersoll - Latest"^^rdf:XMLLiteral | "Ingersoll, Robert Green, 1833-1899"^^rdf:XMLLiteral |
| "Selections from Erasmus\nPrincipally from his Epistles"^^rdf:XMLLiteral | "Erasmus, Desiderius, 1469-1536"^^rdf:XMLLiteral |
| "East and West\nPoems"^^rdf:XMLLiteral | "Harte, Bret, 1836-1902"^^rdf:XMLLiteral |
| "The Enormous Room"^^rdf:XMLLiteral | "Cummings, E. E. (Edward Estlin), 1894-1962"^^rdf:XMLLiteral |
| "The Enormous Room"^^rdf:XMLLiteral | _:b0 |
| "Actes et Paroles, Volume 4\nDepuis l'Exil 1876-1885"^^rdf:XMLLiteral | "Hugo, Victor, 1802-1885"^^rdf:XMLLiteral |
| "L'Île Des Pingouins"^^rdf:XMLLiteral | "France, Anatole, 1844-1924"^^rdf:XMLLiteral |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------

A typical entry from PG looks like this, although not all fields appear in all records:

<pgterms:etext rdf:ID="etext7250">
<dc:publisher>&pg;</dc:publisher>
<dc:title rdf:parseType="Literal">A Connecticut Yankee in King Arthur's Court, Part 9.</dc:title>
<dc:creator rdf:parseType="Literal">Twain, Mark, 1835-1910</dc:creator>
<pgterms:friendlytitle rdf:parseType="Literal">A Connecticut Yankee in King Arthur's Court, Part </pgterms:friendlytitle>
<dc:language><dcterms:ISO639-2><rdf:value>en</rdf:value></dcterms:ISO639-2></dc:language>
<dc:subject>
<rdf:Bag>
<rdf:li><dcterms:LCSH><rdf:value>Americans -- Great Britain -- Fiction</rdf:value></dcterms:LCSH></rdf:li>
<rdf:li><dcterms:LCSH><rdf:value>Arthurian romances -- Adaptations</rdf:value></dcterms:LCSH></rdf:li>
<rdf:li><dcterms:LCSH><rdf:value>Britons -- Fiction</rdf:value></dcterms:LCSH></rdf:li>
<rdf:li><dcterms:LCSH><rdf:value>Fantasy fiction</rdf:value></dcterms:LCSH></rdf:li>
<rdf:li><dcterms:LCSH><rdf:value>Kings and rulers -- Fiction</rdf:value></dcterms:LCSH></rdf:li>
<rdf:li><dcterms:LCSH><rdf:value>Knights and knighthood -- Fiction</rdf:value></dcterms:LCSH></rdf:li>
<rdf:li><dcterms:LCSH><rdf:value>Satire</rdf:value></dcterms:LCSH></rdf:li>
<rdf:li><dcterms:LCSH><rdf:value>Time travel -- Fiction</rdf:value></dcterms:LCSH></rdf:li>
</rdf:Bag>
</dc:subject>
<dc:subject><dcterms:LCC><rdf:value>PS</rdf:value></dcterms:LCC></dc:subject>
<dc:created><dcterms:W3CDTF><rdf:value>2004-07-07</rdf:value></dcterms:W3CDTF></dc:created>
<dc:rights rdf:resource="&lic;" />
</pgterms:etext>

In addition to fields such as dc:creator and dc:title, I'd like to get the value of the rdf:ID attribute on pgterms:etext (rdf:ID="STUFF IN HERE"):

<pgterms:etext rdf:ID="etext7250">

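as well as combining the entries in the list under dc:subject, and so on. Basically, I want the command-line query to deliver all the information about a book as a single coherent entry.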

So, my questions:

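  1. How can I combine the attribute value from pgterms:etext rdf:ID with the rest of the query?
  2. How can I combine the entries under dc:subject's list into one string?
  3. Since not all fields show up for every record, should I use the OPTIONAL() clause to surround fields that don't always appear?
  4. How can I limit my query based on a user-specified string? Am I supposed to use FILTER() for that?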

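Many thanks. I've been able to build queries that fetch a single layer of information, but anything beyond that (attributes and so on) is almost incomprehensible to me. This is very different from standard SQL, and a much more complicated project than I first imagined.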

Best answer

How can I combine the attribute value from pgterms:etext rdf:ID with the rest of the query?

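The rdf:ID becomes the URI of that book in your knowledge base: rdf:ID="etext7250" is resolved against the document's base URI, giving a URI that ends in #etext7250. In your case, putting ?book in your SELECT clause will bring it back.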
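
As a rough sketch of that idea, the query below returns ?book alongside the title and also strips the URI down to the bare ID. It assumes a store with SPARQL 1.1 support (BIND and STRAFTER are not in SPARQL 1.0), and the exact base URI in front of the # depends on how catalog.rdf was loaded; the variable name ?id is just illustrative.

```sparql
PREFIX dc:      <http://purl.org/dc/elements/1.1/>
PREFIX pgterms: <http://www.gutenberg.org/rdfterms/>

SELECT ?book ?id ?title
WHERE {
  # <pgterms:etext rdf:ID="..."> is a typed node, so ?book has rdf:type pgterms:etext
  ?book a pgterms:etext ;
        dc:title ?title .
  # Assumption: the book URI has the form <base#etext7250>; keep the part after '#'
  BIND ( STRAFTER(STR(?book), "#") AS ?id )
}
LIMIT 10
```

If the store only speaks SPARQL 1.0, selecting ?book and trimming the URI in the client should give the same result.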

How can I combine the entries under dc:subject's list into one string?

I'm not too sure about this one. You can put dc:subject into your query and then iterate over the results in your client.
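
One possible alternative to client-side iteration, if the store supports SPARQL 1.1 aggregates, is GROUP_CONCAT. The sketch below assumes the subjects are stored as in the sample record: dc:subject points to an rdf:Bag whose members are dcterms:LCSH nodes carrying the text in rdf:value. Records that attach a single subject directly, without a Bag, would need an extra UNION branch that is not shown here.

```sparql
PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc:      <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT ?book (GROUP_CONCAT(?subject; separator="; ") AS ?subjects)
WHERE {
  ?book dc:subject ?bag .
  # ?slot matches the membership properties rdf:_1, rdf:_2, ...
  # (it also matches rdf:type, but the LCSH pattern below discards that row)
  ?bag  a rdf:Bag ;
        ?slot ?entry .
  ?entry a dcterms:LCSH ;
         rdf:value ?subject .
}
GROUP BY ?book
```

The order in which GROUP_CONCAT joins the values is not guaranteed, so sort in the client if the order matters.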

Since not all fields show up for every record, should I use the OPTIONAL() clause to surround fields that don't always appear?

Yes.
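
Note that in SPARQL the optional part is written with braces, OPTIONAL { ... }, rather than parentheses. A minimal sketch, using property names taken from the sample record and treating the title as the only required field (that choice is an assumption, not something the catalog guarantees):

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc:  <http://purl.org/dc/elements/1.1/>

SELECT ?book ?title ?author ?created
WHERE {
  ?book dc:title ?title .
  # Books without a creator are still returned, with ?author unbound
  OPTIONAL { ?book dc:creator ?author }
  # dc:created points to a node whose rdf:value holds the date string
  OPTIONAL { ?book dc:created ?c .
             ?c rdf:value ?created }
}
LIMIT 10
```

Each OPTIONAL block is matched independently, so a missing creator does not prevent the creation date from being returned.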

How can I limit my query based on a user-specified string? Am I supposed to use FILTER() for that?

Yes, specifically FILTER regex().
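
A hedged sketch of such a filter follows. The search string "twain" is a placeholder for whatever the user types. Because the titles and creators in the output above are typed as rdf:XMLLiteral, the value is wrapped in str() before matching, since regex() expects a plain string; the isLiteral() test skips blank-node creators such as the _:b0 row.

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>

SELECT ?book ?title ?author
WHERE {
  ?book dc:title ?title ;
        dc:creator ?author .
  # "i" makes the match case-insensitive; "twain" stands in for user input
  FILTER ( isLiteral(?author) && regex(str(?author), "twain", "i") )
}
LIMIT 10
```

To search by title instead, apply the same pattern to ?title. When the string comes from a user, regex metacharacters in it should be escaped before it is pasted into the query.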

For "rdf - Querying the Project Gutenberg catalog.rdf with SPARQL", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/3328344/
