gpt4 book ai didi

perl - 从复杂的数据库提取 (SQL) 构建树状哈希 (YAML)?

转载 作者:行者123 更新时间:2023-12-04 12:53:21 26 4
gpt4 key购买 nike

介绍

考虑到这个问题末尾给出的表格,我想要一个算法或一个简单的解决方案,它从 YAML 返回嵌套树。描述。使用 格式是可选的需要。事实上,我需要的输出是有序哈希数组,可能包含也可能不包含嵌套有序哈希或有序哈希数组。

简而言之,我说的是树状结构。

为了更好地理解我的问题,我将处理一个涵盖我所有需求的简单示例。实际上这个例子是我用来实现这个算法的例子。

由于我对 Perl 的了解有限,我决定在我自己的调查的同时提出这个问题。我不想挖错隧道,这就是我寻求帮助的原因。

我目前专注于 DBI 模块。我尝试查看其他模块,例如 DBIx::Tree::NestedSet ,但我不认为这是我需要的。

所以,让我们来看看我的例子的细节。

例子

最初的想法是写一个采用 的程序描述并输出提取的数据。

此输入描述遵循简单的规则:

  • 查询 是我们正在寻找的数据。它可以包含以下键
  • sql 是 SQL 查询
  • 隐藏 从最终输出中隐藏列。当仅在子查询中需要列但最终不需要时使用此字段。
  • 子查询 是对父查询的每一行执行的嵌套查询
  • 绑定(bind) 将列值绑定(bind)到 查询
  • 哈希 告诉程序将结果分组,而不是散列数组,而是散列散列。其实这可以直接给DBI::selectall_hashref .如果省略此字段,则输出被列为有序哈希数组。
  • 关键 是与父级结果同级列出的键名。我们会看到
    稍后,键名可以屏蔽结果列。
  • 列表 告诉程序将结果列出到一个数组中。请注意,只能显示一列,即 array: name显示 names 的列表
  • 连接 是 DBI 连接字符串
  • 格式 是输出格式。可以是 XML , YAMLJSON .我主要关注YAML格式,因为它可以很容易地翻译。省略时,默认输出为 YAML .
  • 缩进 一个标识有多少个空格。 tabstab值也受支持。

  • 此外,我们知道在 Perl 中哈希是没有顺序的。在这里,输出键的顺序很重要,应该像它们出现在 sql 查询中一样出现。

    从这里我简单地使用 YAML模块 :(

    总之,最后我们将只执行这个命令:
    $ cat desc.yml | ./fetch > data.yml
    desc.yml描述如下:
    ---
    connect: "dbi:SQLite:dbname=einstein-puzzle.sqlite"
    ident: 4
    query:
    - sql: SELECT * from people
    hide:
    - pet_id
    - house_id
    - id
    subquery:
    - key: brevage
    bind: id
    sql: |
    SELECT name, calories, potassium FROM drink
    LEFT JOIN people_has_drink ON drink.id = people_has_drink.id_drink
    WHERE people_has_drink.id_people = 1
    hash:
    - name
    - key: house
    sql: SELECT color as paint, size, id from house WHERE id = ?
    hide: id
    bind: paint
    subquery:
    - key: color
    sql: SELECT name, ral, hex from color WHERE short LIKE ?
    bind: color
    - key: pet
    sql: SELECT name from pet WHERE id = ?
    bind: pet_id
    list: name

    预期产出

    根据上面的描述,输出数据将是这样的:
    ---
    - nationality: Norvegian
    smoke: Dunhill
    brevage:
    orange juice:
    calories: 45
    potassium: 200 mg
    water:
    calories: 0
    potassium: 3 mg
    house:
    color:
    name: Zinc yellow
    ral: RAL 1018
    hex: #F8F32B
    paint: yellow
    size: small
    pet:
    - cats
    - nationality: Brit
    smoke: Pall Mall
    brevage:
    milk:
    calories: 42
    potassium: 150 mg
    house:
    color:
    name: Vermilion
    ral: RAL 2002
    hex: #CB2821
    paint: red
    size: big
    pet:
    - birds
    - phasmatodea

    我在哪里

    我仍然没有完全实现嵌套查询。我目前的状态在这里给出:
    #!/usr/bin/env perl
    use 5.010;
    use strict;
    use warnings;
    use DBI;
    use YAML;
    use Data::Dumper;
    use Tie::IxHash;

    # Read configuration and databse connection
    my %yml = %{ Load(do { local $/; <DATA>}) };
    my $dbh = DBI->connect($yml{connect});

    # Fill the bind values of the first query with command-line information
    my %bind;
    for(@ARGV) {
    next unless /--(\w+)=(.*)/;
    $bind{$1} = $2;
    }

    my $q0 = $yml{query}[0];
    if ($q0->{bind} and keys %bind > 0) {
    $q0->{bind_values} = arrayref($q0->{bind});
    $q0->{bind_values}[$_] = $bind{$q0->{bind}[$_]} foreach (0 .. @{$q0->{bind}} - 1);
    }

    # Fetch all data from the database recursively
    my $out = fetch($q0);

    sub fetch {
    # As long we have a query, one processes it
    my $query = shift;
    return undef unless $query;

    $query->{bind_values} = [] unless ref $query->{bind_values} eq 'ARRAY';
    # Execute SQL query
    my $sth = $dbh->prepare($query->{sql});
    $sth->execute(@{$query->{bind_values}});
    my @columns = @{$sth->{NAME}};

    # Fetch all the current level's data and preserve columns order
    my @return;
    for my $row (@{$sth->fetchall_arrayref()}) {
    my %data;
    tie %data, 'Tie::IxHash';
    $data{$columns[$_]} = $row->[$_] for (0 .. $#columns);
    for my $subquery (@{ $query->{subquery} }) {
    my @bind;
    push @bind, $data{$_} for (@{ arrayref($subquery->{bind}) });
    $subquery->{bind_values} = \@bind;
    my $sub = fetch($subquery);

    # Present output as a list
    if ($subquery->{list}) {
    #if ( map ( $query->{list} eq $_ , keys $sub ) )
    my @list;
    for (@$sub) {
    push @list, $_->{$subquery->{list}};
    }
    $sub = \@list;
    }

    if ($subquery->{key}) {
    $data{$subquery->{key}} = $sub;
    } else {
    die "[Error] Key is missing for query '$subquery->{sql}'";
    }
    }

    # Remove unwanted columns from the output
    if ($query->{hide}) {
    delete $data{$_} for( @{ arrayref($query->{hide}) } );
    }

    push @return, \%data;
    }

    \@return;
    }

    DumpYaml($out);

    sub arrayref {
    my $ref = shift;
    return (ref $ref ne 'ARRAY') ? [$ref] : $ref;
    }

    sub DumpYaml {
    # I am not happy with this current dumper. I cannot specify the indent and it does
    # not preserve the extraction order
    print Dump shift;
    }

    __DATA__
    ---
    connect: "dbi:SQLite:dbname=einstein-puzzle.sqlite"
    ident: 4
    query:
    - sql: SELECT * from people
    hide:
    - pet_id
    - house_id
    - id
    subquery:
    - key: brevage
    bind: id
    sql: |
    SELECT name, calories, potassium FROM drink
    LEFT JOIN people_has_drink ON drink.id = people_has_drink.id_drink
    WHERE people_has_drink.id_people = ?
    hash:
    - name
    - key: house
    sql: SELECT color as paint, size, id from house WHERE id = ?
    hide: id
    bind: house_id
    subquery:
    - key: color
    sql: SELECT short, ral, hex from color WHERE short LIKE ?
    bind: paint
    - key: pet
    sql: SELECT name from pet WHERE id = ?
    bind: pet_id
    list: name

    这就是我得到的输出:
    ---
    - brevage:
    - calories: 0
    name: water
    potassium: 3 mg
    - calories: 45
    name: orange juice
    potassium: 200 mg
    house:
    - color:
    - hex: '#F8F32B'
    ral: RAL 1018
    short: yellow
    paint: yellow
    size: small
    nationality: Norvegian
    pet:
    - cats
    smoke: Dunhill
    - brevage:
    - calories: 42
    name: milk
    potassium: 150 mg
    house:
    - color:
    - hex: '#CB2821'
    ral: RAL 2002
    short: red
    paint: red
    size: big
    nationality: Brit
    pet:
    - birds
    - phasmatodea
    smoke: Pall Mall

    数据库

    我的测试数据库是 db 表如下:

    餐 table 人

    .----+-------------+----------+--------+-----------.
    | id | nationality | house_id | pet_id | smoke |
    +----+-------------+----------+--------+-----------+
    | 1 | Norvegian | 4 | 3 | Dunhill |
    | 2 | Brit | 1 | 2 | Pall Mall |
    '----+-------------+----------+--------+-----------'

    餐 table 饮品

    .----+--------------+----------+-----------.
    | id | name | calories | potassium |
    +----+--------------+----------+-----------+
    | 1 | tea | 1 | 18 mg |
    | 2 | coffee | 0 | 49 mg |
    | 3 | milk | 42 | 150 mg |
    | 4 | beer | 43 | 27 mg |
    | 5 | water | 0 | 3 mg |
    | 6 | orange juice | 45 | 200 mg |
    '----+--------------+----------+-----------'

    表人有饮料

    .-----------+----------.
    | id_people | id_drink |
    +-----------+----------+
    | 1 | 5 |
    | 1 | 6 |
    | 2 | 3 |
    '-----------+----------'

    table 屋

    +----+--------+--------+
    | id | color | size |
    +----+--------+--------+
    | 1 | red | big |
    | 2 | green | small |
    | 3 | white | middle |
    | 4 | yellow | small |
    | 5 | blue | huge |
    +----+--------+--------+

    表格颜色

    .--------+-------------+----------+---------.
    | short | color | ral | hex |
    +--------+-------------+----------+---------+
    | red | Vermilion | RAL 2002 | #CB2821 |
    | green | Pale green | RAL 6021 | #89AC76 |
    | white | Light grey | RAL 7035 | #D7D7D7 |
    | yellow | Zinc yellow | RAL 1018 | #F8F32B |
    | blue | Capri blue | RAL 5019 | #1B5583 |
    '--------+-------------+----------+---------'

    餐 table 宠物

    +----+-------------+
    | id | name |
    +----+-------------+
    | 1 | dogs |
    | 2 | birds |
    | 3 | cats |
    | 4 | horses |
    | 5 | fishes |
    | 2 | phasmatodea |
    +----+-------------+

    数据库数据

    如果您希望使用与我相同的数据,也可以为您提供所需的一切:

    BEGIN TRANSACTION;
    CREATE TABLE "pet" (
    `id` INTEGER,
    `name` TEXT
    );
    INSERT INTO `pet` VALUES (1,'dogs');
    INSERT INTO `pet` VALUES (2,'birds');
    INSERT INTO `pet` VALUES (3,'cats');
    INSERT INTO `pet` VALUES (4,'horses');
    INSERT INTO `pet` VALUES (5,'fishes');
    INSERT INTO `pet` VALUES (2,'phasmatodea');
    CREATE TABLE `people_has_drink` (
    `id_people` INTEGER NOT NULL,
    `id_drink` INTEGER NOT NULL,
    PRIMARY KEY(id_people,id_drink)
    );
    INSERT INTO `people_has_drink` VALUES (1,5);
    INSERT INTO `people_has_drink` VALUES (1,6);
    INSERT INTO `people_has_drink` VALUES (2,3);
    CREATE TABLE "people" (
    `id` INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    `nationality` VARCHAR(45),
    `house_id` INT,
    `pet_id` INT,
    `smoke` VARCHAR(45)
    );
    INSERT INTO `people` VALUES (1,'Norvegian',4,3,'Dunhill');
    INSERT INTO `people` VALUES (2,'Brit',1,2,'Pall Mall');
    CREATE TABLE "house" (
    `id` INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    `color` TEXT,
    `size` TEXT
    );
    INSERT INTO `house` VALUES (1,'red','big');
    INSERT INTO `house` VALUES (2,'green','small');
    INSERT INTO `house` VALUES (3,'white','middle');
    INSERT INTO `house` VALUES (4,'yellow','small');
    INSERT INTO `house` VALUES (5,'blue','huge');
    CREATE TABLE `drink` (
    `id` INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    `name` TEXT,
    `calories` INTEGER,
    `potassium` TEXT
    );
    INSERT INTO `drink` VALUES (1,'tea',1,'18 mg');
    INSERT INTO `drink` VALUES (2,'coffee',0,'49 mg');
    INSERT INTO `drink` VALUES (3,'milk',42,'150 mg');
    INSERT INTO `drink` VALUES (4,'beer',43,'27 mg');
    INSERT INTO `drink` VALUES (5,'water',0,'3 mg');
    INSERT INTO `drink` VALUES (6,'orange juice',45,'200 mg');
    CREATE TABLE `color` (
    `short` TEXT UNIQUE,
    `color` TEXT,
    `ral` TEXT,
    `hex` TEXT,
    PRIMARY KEY(short)
    );
    INSERT INTO `color` VALUES ('red','Vermilion','RAL 2002','#CB2821');
    INSERT INTO `color` VALUES ('green','Pale green','RAL 6021','#89AC76');
    INSERT INTO `color` VALUES ('white','Light grey','RAL 7035','#D7D7D7');
    INSERT INTO `color` VALUES ('yellow','Zinc yellow','RAL 1018','#F8F32B');
    INSERT INTO `color` VALUES ('blue','Capri blue','RAL 5019','#1B5583');
    COMMIT;

    最佳答案

    Is my implementation good



    这是一个相当广泛的问题,答案可能取决于您希望从代码中得到什么。例如:

    它有效吗?它是否具有您需要的所有功能?它做你想做的事吗?它是否对您想要满足的所有输入范围(以及您不想要的输入)做出适当的响应?如果你不确定,写一些 tests .

    速度够快吗?如果不是,那么慢的位是什么?使用 Devel::NYTProf找到他们。

    如果它工作正常,您可能还想将您的代码转换为 module而不仅仅是一个脚本,所以你可以再次使用它。

    and if not (I'm supposing that I am doing all wrong), what modules should I use to get the desired behavior?



    听起来很像您正在尝试执行 DBIx::Class 之类的操作。 (又名 DBIC)当您向 prefetch 询问时会这样做;它将为您构建对象的数据结构。

    如果您需要动态地执行此操作以响应任意数据库和 YAML,那并不是 DBIC 的设计初衷;这可能是可能的,但可能会涉及您动态创建包,这并不容易。

    关于perl - 从复杂的数据库提取 (SQL) 构建树状哈希 (YAML)?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/30683972/

    26 4 0
    Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
    广告合作:1813099741@qq.com 6ren.com