marklogic - Using mlcp to transform a csv file into an OBI source

Reposted. Author: 行者123. Updated: 2023-12-03 06:57:29

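I am facing the following challenge. We want to use mlcp to load a csv file into a MarkLogic database. During the load we also want to turn each loaded row into an OBI source, so we built a transform function for that.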

Now I am struggling with the transform. Without the transform, the data loads as expected, one document per row.

csv example:

voornaam,achternaam
hugo,koopmans
thijs,van ulden

transform-ambulance.xqy:

xquery version "1.0-ml";
module namespace rws = "http://marklogic.com/rws";

import module namespace source = "http://marklogic.com/solutions/obi/source" at "/ext/obi/lib/source-lib.xqy";

(: If the input document is XML, create an OBI source from it, with the value
: specified in the input parameter. If the input document is not
: XML, leave it as-is.
:)
declare function rws:transform(
  $content as map:map,
  $context as map:map
) as map:map*
{
  let $attr-value :=
    (map:get($context, "transform_param"), "UNDEFINED")[1]
  let $the-doc := map:get($content, "value")
  return
    if (fn:empty($the-doc/element()))
    then $content
    else
      let $root := xdmp:unquote($the-doc/*)
      let $source-title := "ambulance source data"
      let $collection := 'ambulance'
      let $source-id := source:create-source($source-title, (), $root)
      let $_ := xdmp:document-add-collections(concat("/marklogic.solutions.obi/source/", $source-id[1], ".xml"), $collection)
      return (
        map:put($content, "value",
          $source-id[2]
        ), $content
      )
};

mlcp command:

mlcp.sh import \
-host localhost \
-port 27041 \
-username admin \
-password admin \
-input_file_path ./sampledata/so-example.csv \
-input_file_type delimited_text \
-transform_module /transforms/transform-ambulance.xqy \
-transform_namespace "http://marklogic.com/rws" \
-mode local

mlcp output:

15/09/08 21:35:08 INFO contentpump.ContentPump: Hadoop library version: 2.6.0
15/09/08 21:35:08 INFO contentpump.LocalJobRunner: Content type: XML
15/09/08 21:35:08 INFO input.FileInputFormat: Total input paths to process : 1
15/09/08 21:35:10 WARN mapreduce.ContentWriter: XDMP-DOCROOTTEXT: xdmp:unquote(document{<root><voornaam>hugo</voornaam><achternaam>koopmans</achternaam></root>}) -- Invalid root text "hugokoopmans" at line 1
15/09/08 21:35:10 WARN mapreduce.ContentWriter: XDMP-DOCROOTTEXT: xdmp:unquote(document{<root><voornaam>thijs</voornaam><achternaam>van ulden</achternaam></root>}) -- Invalid root text "thijsvan ulden" at line 1
15/09/08 21:35:11 INFO contentpump.LocalJobRunner: completed 100%
15/09/08 21:35:11 INFO contentpump.LocalJobRunner: com.marklogic.contentpump.ContentPumpStats:
15/09/08 21:35:11 INFO contentpump.LocalJobRunner: ATTEMPTED_INPUT_RECORD_COUNT: 2
15/09/08 21:35:11 INFO contentpump.LocalJobRunner: SKIPPED_INPUT_RECORD_COUNT: 0
15/09/08 21:35:11 INFO contentpump.LocalJobRunner: Total execution time: 2 sec

I also tried it without xdmp:unquote(), but then I ran into a document-node() coercion error...

Please advise...

Best answer

OK, the problem was that we needed to turn the $root variable into a document-node()...

let $root := document {$the-doc/root}

solved the problem.
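
To see why the original line failed, here is a minimal sketch that can be pasted into Query Console. It is an illustration, not part of the original answer. The $the-doc value is a hand-built stand-in for the document mlcp produces from one csv row, copied from the warning in the log above, and $as-string is a name made up for this example. xdmp:unquote() parses a string as XML, so passing it an element atomizes that element to its text content first, which is not well-formed XML.

```xquery
xquery version "1.0-ml";

(: Assumption: a stand-in for the per-row document that mlcp hands to
 : the transform, taken from the XDMP-DOCROOTTEXT warning in the log. :)
let $the-doc := document {
  <root><voornaam>hugo</voornaam><achternaam>koopmans</achternaam></root>
}

(: Atomizing the root element gives only its concatenated text.
 : This is the string xdmp:unquote() was asked to parse as XML. :)
let $as-string := fn:string($the-doc/*)

(: The fix from the answer: wrap the root element in a new document node
 : instead of trying to re-parse it. :)
let $root := document { $the-doc/root }

return (
  $as-string,
  $root instance of document-node(),
  fn:local-name($root/*)
)
```

The first item returned should be the same text that the warning reports as invalid root text, and the second shows that $root is now a document node. That is presumably what the third argument of source:create-source() requires, given the coercion error mentioned in the question.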

Regarding marklogic - Using mlcp to transform a csv file into an OBI source, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/32469719/