sql - 逆笛卡尔积-6ren

sql - 逆笛卡尔积

转载作者：塔克拉玛干更新时间：2023-11-03 03:23:09

给定以下数据集:

 a  |  b  |  c  |  d
 1  |  3  |  7  |  11
 1  |  5  |  7  |  11
 1  |  3  |  8  |  11
 1  |  5  |  8  |  11
 1  |  6  |  8  |  11

执行逆笛卡尔积得到:

 a  |  b  |  c  |  d
 1  | 3,5 | 7,8 |  11
 1  |  6  |  8  |  11

我目前正在使用 scala，我的输入/输出数据类型目前是:

ListBuffer[Array[Array[Int]]]

我想出了一个解决方案(见下文)，但感觉它可以优化。我对优化我的方法和全新的方法持开放态度。首选 scala 和 c# 解决方案。

我也很好奇这是否可以在 MS SQL 中完成。

我目前的解决方案:

def main(args: Array[String]): Unit = {

    // Input
    val data = ListBuffer(Array(Array(1), Array(3), Array(7), Array(11)),
                          Array(Array(1), Array(5), Array(7), Array(11)),
                          Array(Array(1), Array(3), Array(8), Array(11)),
                          Array(Array(1), Array(5), Array(8), Array(11)),
                          Array(Array(1), Array(6), Array(8), Array(11)))

    reverseCartesianProduct(data)
}

def reverseCartesianProduct(input: ListBuffer[Array[Array[Int]]]): ListBuffer[Array[Array[Int]]] = {
    val startIndex = input(0).size - 1

    var results:ListBuffer[Array[Array[Int]]] = input

    for (i <- startIndex to 0 by -1) {
      results = groupForward(results, i, startIndex)
    }

    results
}

def groupForward(input: ListBuffer[Array[Array[Int]]], groupingIndex: Int, startIndex: Int): ListBuffer[Array[Array[Int]]] = {

    if (startIndex < 0) {
      val reduced = input.reduce((a, b) => {
        mergeRows(a, b)
      })

      return ListBuffer(reduced)
    }

    val grouped = if (startIndex == groupingIndex) {
      Map(0 -> input)
    }
    else {
      groupOnIndex(input, startIndex)
    }

    val results = grouped.flatMap{
      case (index, values: ListBuffer[Array[Array[Int]]]) =>
        groupForward(values, groupingIndex, startIndex - 1)
    }

    results.to[ListBuffer]
  }

  def groupOnIndex(list: ListBuffer[Array[Array[Int]]], index: Int): Map[Int, ListBuffer[Array[Array[Int]]]] = {

    var results = Map[Int, ListBuffer[Array[Array[Int]]]]()

    list.foreach(a => {
      val key = a(index).toList.hashCode()

      if (!results.contains(key)) {
        results += (key -> ListBuffer[Array[Array[Int]]]())
      }

      results(key) += a
    })

    results
  }

  def mergeRows(a: Array[Array[Int]], b: Array[Array[Int]]): Array[Array[Int]] = {

    val zipped = a.zip(b)

    val merged = zipped.map{ case (array1: Array[Int], array2: Array[Int]) =>
      val m = array1 ++ array2

      quickSort(m)

      m.distinct
        .array
    }

    merged
  }

它的工作方式是:

从右到左遍历各列(groupingIndex 指定要在哪一列上运行。该列是唯一不需要具有彼此相等的值即可合并行的列。)
递归地对所有其他列(不是 groupingIndex)上的数据进行分组。
在对所有列进行分组后，假设每组中的数据在除分组列之外的每一列中都具有相等的值。
将行与匹配的列合并。为每一列取不同的值并对每一列进行排序。

如果其中的某些内容没有意义，我深表歉意，我的大脑今天无法正常工作。

最佳答案

这是我的看法。代码使用 Java，但可以轻松转换为 Scala 或 C#。

我对 n-1 的所有组合运行 groupingBy 并选择计数最少的那个，这意味着最大的合并深度，所以这是一种贪婪的方法。但是，不能保证您会找到最佳解决方案，这意味着最小化 k 的数量，这是 np-hard 要做的，请参阅链接 here寻求解释，但您会找到一个有效的解决方案，而且速度相当快。

完整示例在这里:https://github.com/jbilander/ReverseCartesianProduct/tree/master/src

Main.java

    import java.util.*;
    import java.util.stream.Collectors;

    public class Main {

        public static void main(String[] args) {

            List<List<Integer>> data = List.of(List.of(1, 3, 7, 11), List.of(1, 5, 7, 11), List.of(1, 3, 8, 11), List.of(1, 5, 8, 11), List.of(1, 6, 8, 11));
            boolean done = false;
            int rowLength = data.get(0).size(); //4
            List<Table> tables = new ArrayList<>();

            // load data into table
            for (List<Integer> integerList : data) {

                Table table = new Table(rowLength);
                tables.add(table);

                for (int i = 0; i < integerList.size(); i++) {
                    table.getMap().get(i + 1).add(integerList.get(i));
                }
            }

            // keep track of count, needed so we know when to stop iterating
            int numberOfRecords = tables.size();

            // start algorithm
            while (!done) {

                Collection<List<Table>> result = getMinimumGroupByResult(tables, rowLength);

                if (result.size() < numberOfRecords) {

                    tables.clear();

                    for (List<Table> tableList : result) {

                        Table t = new Table(rowLength);
                        tables.add(t);

                        for (Table table : tableList) {
                            for (int i = 1; i <= rowLength; i++) {
                                t.getMap().get(i).addAll(table.getMap().get(i));
                            }
                        }
                    }
                    numberOfRecords = tables.size();
                } else {
                    done = true;
                }
            }

            tables.forEach(System.out::println);
        }

        private static Collection<List<Table>> getMinimumGroupByResult(List<Table> tables, int rowLength) {

            Collection<List<Table>> result = null;
            int min = Integer.MAX_VALUE;

            for (List<Integer> keyCombination : getKeyCombinations(rowLength)) {

                switch (rowLength) {

                    case 4: {
                        Map<Tuple3<TreeSet<Integer>, TreeSet<Integer>, TreeSet<Integer>>, List<Table>> map =
                                tables.stream().collect(Collectors.groupingBy(t -> new Tuple3<>(
                                        t.getMap().get(keyCombination.get(0)),
                                        t.getMap().get(keyCombination.get(1)),
                                        t.getMap().get(keyCombination.get(2))
                                )));
                        if (map.size() < min) {
                            min = map.size();
                            result = map.values();
                        }
                    }
                    break;
                    case 5: {
                        //TODO: Handle n = 5
                    }
                    break;
                    case 6: {
                        //TODO: Handle n = 6
                    }
                    break;
                }
            }

            return result;
        }

        private static List<List<Integer>> getKeyCombinations(int rowLength) {

            switch (rowLength) {
                case 4:
                    return List.of(List.of(1, 2, 3), List.of(1, 2, 4), List.of(2, 3, 4), List.of(1, 3, 4));

                //TODO: handle n = 5, n = 6, etc...
            }

            return List.of(List.of());
        }
    }

tables.forEach(System.out::println)的输出

    Table{1=[1], 2=[3, 5, 6], 3=[8], 4=[11]}
    Table{1=[1], 2=[3, 5], 3=[7], 4=[11]}

或为了可读性重写:

     a |   b   | c | d
     --|-------|---|---
     1 | 3,5,6 | 8 | 11
     1 |  3,5  | 7 | 11

如果您要在 sql (mysql) 中执行所有这些操作，您可能会使用 group_concat()，我认为 MS SQL 在这里有类似的东西:simulating-group-concat或 STRING_AGG 如果是 SQL Server 2017，但我认为您必须使用在这种情况下有点讨厌的文本列:

例如

    create table my_table (A varchar(50) not null, B varchar(50) not null, 
                           C varchar(50) not null, D varchar(50) not null);

    insert into my_table values ('1','3,5','4,15','11'), ('1','3,5','3,10','11');

    select A, B, group_concat(C order by C) as C, D from my_table group by A, B, D;

将给出以下结果，因此您必须解析、排序和更新逗号分隔的结果，以使任何下一次合并迭代(分组依据)正确。

    ['1', '3,5', '3,10,4,15', '11']

关于sql - 逆笛卡尔积，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/46084505/

文章推荐： Java 嵌套数组转递归方法

文章推荐： Java 并发 : Paired locks with shared access

文章推荐： algorithm - 最小元素数使用 xor 生成所有其他元素

Scala:具有多个来源和异构类型的交叉(笛卡尔)积
我正在尝试构建不同(但每个同质)类型的可遍历项的多个交叉产品。所需的返回类型是元组的可遍历对象，其类型与输入可遍历对象中的类型相匹配。例如: List(1, 2, 3) cross Seq("a",
java - 求两个矩阵的 boolean 积
import java.util.Scanner; public class BooleanProduct { public static void main(String[] args) {
c++ - 数字的最大 K 积
任务 - 数字的最大 K 积时间限制:1 内存限制:64 M 给定一个整数序列 N(1 ≤ N ≤ 10 月，| A i | ≤ 2.10 9)和数量 K(1 ≤ K ≤ N)。找出乘积最大的 K
c++ - 快速单精度矩阵乘以 vector 积
考虑一个大小为 48x16 的 float 矩阵 A 和一个大小为 1x48 的 float vector b。请建议一种在常见桌面处理器 (i5/i7) 上尽可能快地计算 b×A 的方法。背景。
python - 获取元组或列表的(乘法)积？
假设我有一个 class Rectangle(object): def __init__(self, len
c++ - 矩阵 A 的 boolean 积
设 A 为 3x3 阶矩阵。判断矩阵A的 boolean 积可以组成多少个不同的矩阵。这是我想出的: #include int main() { int matri
python - 如何在 NumPy 计算中避免 Kronecker 积
背景生成随机权重列表后: sizes = [784,30,10] weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1],sizes[
python - 具有单位矩阵和正则矩阵的高效 Kronecker 积 - NumPy/Python
我正在开发一个 python 项目并使用 numpy。我经常需要通过单位矩阵计算矩阵的克罗内克积。这些是我代码中的一个相当大的瓶颈，所以我想优化它们。我必须服用两种产品。第一个是: np.kron(n
c++ - Boost uBLAS 矩阵/vector 积
有人可以提供一个例子说明如何使用 uBLAS 产品来乘法吗？或者，如果有更好的 C++ 矩阵库，您可以推荐我也欢迎。这正在变成一个令人头疼的问题。这是我的代码: vector myVec(scala
javascript - 使用 Javascript 对话框获取和、积、差和商 + 显示结果
我正在尝试开发一个Javascript程序，它会提示用户输入两个整数，然后显示这两个整数的和、乘积、差和商。现在它只显示总和。我实际上不知道乘法、减法和除法命令是否正在执行。这是 jsfiddle 的
java - 如何使用 la4j 计算 3D vector 积？
如何使用 la4j 计算 vector (叉)积？ vector 乘积为接受两个 vector 并返回 vector 。但是他们有scalar product , product of all e
c++ - 使用 dgemm/dgemv 的矩阵 vector 积
在 C++ 中使用 Lapack 让我有点头疼。我发现为 fortran 定义的函数有点古怪，所以我尝试在 C++ 上创建一些函数，以便我更容易阅读正在发生的事情。无论如何，我没有让矩阵 vecto
objective-c - Apple Metal 逐元素矩阵乘法(Hadamard 积)
是否可以使用 Apple 的 Metal Performance Shaders 执行 Hadamard 产品？我看到可以使用 this 执行普通矩阵乘法，但我特别在寻找逐元素乘法，或者一种构造乘法的
c - 使用 open mp 的慢速稀疏矩阵 vector 积 (CSR)
我正在尝试使用 open mp 加速稀疏矩阵 vector 乘积，代码如下: void zAx(double * z, double * data, long * colind, long * row
c++ - 在 OpenCv 中计算 cv::Mat 的外(张量)积
有没有一种方法可以使用 cv::Mat OpenCV 中的数据结构？我检查过 the documentation并且没有内置功能。但是我在尝试将标准矩阵乘法表达式 (*) 与 cv::Mat 类型的

塔克拉玛干

个人简介

我是一名优秀的程序员,十分优秀！

作者热门文章

滴滴打车优惠券免费领取

全站热门文章

首页

博学

6Ren·AI

商城

sql - 逆笛卡尔积