java - 我怎样才能进一步调整以下算法(Java)以使其运行得更快？

转载作者：太空宇宙更新时间：2023-11-03 21:50:52

介绍

我目前已经实现了这个 Paper (Efficiently selecting spatially distributed keypoints for visual tracking) 的算法在 java 。我没有做过论文中的以下建议(第 5 节末尾的第 3 页):

The relatively expensive cell cover operation can be substantially sped up by using a single bit to store the state of each cell of Gr. This enables using bitwise OR operations to “cover” contiguous patches at once with precomputed bitmasks that implement the cover to be applied at a given bit-offset position.

测试、基准测试和分析

算法测试通过创建 12000 个随机点并使用初始半径执行一次算法进行测试。 JMH附上测试。
我不断地使用 JProfiler 进行简介对于对象内存吞吐量(实际上创建的对象并不多)、CPU(它是 CPU 瓶颈)、GC(这里没有发生什么)，而 CPU 瓶颈目前在 bresenhamFilledCircle 方法中(这是所有操作发生的地方)。在 12.000 个点中，主算法返回了大约 1.500 个，因此 bresenhamFilledCircle 执行了大约 1.500 * 6.700 = 大约 1000 万次 pr。第二。这大约是 0.1 微秒(100 纳秒)pr 调用。相当快，但应该有空间让它走得更快....

到目前为止我做了什么

从基本的强力算法开始:行和列的两个嵌套循环，以及一个标准 Pythagorean theorem弄清楚我是否在一个圆圈内，将圆圈“绘制”为 boolean 值 [][]。
吞吐量 ~3 500 ops/sec。
切换到使用 System.arrayCopy 进行填充而不是暴力破解。
吞吐量约为 5 600 次操作/秒。
优化数组初始化(使用缓存)。
吞吐量 ~6 000 ops/sec。
在行和列上添加边距以避免在算法期间进行边界检查。
吞吐量 ~6 500 ops/sec。
切换到Bresenham's circle algorithm (略微修改以填充圆圈)以避免“复杂的”毕达哥拉斯检查。
吞吐量 ~6 500 ops/sec。 :(
从二维数组切换到一维数组..
吞吐量 ~6 700 ops/sec。

现在我没有主意了，除了将 boolean[] 转换为 byte[] 并按照建议使用位掩码进行设置/获取(如果我正确理解了论文中的建议)。

有人愿意挑战吗？

下面是 JMH 测试:

public class KeyPointFilterBenchmark {
    private static final int DEFAULT_RADIUS = 10;

    @Benchmark
    public List<OpenCVKeyPoint> benchmarkFilterByRadius(KeyPointFilterState state) {
        return state.filter.filterByRadius(DEFAULT_RADIUS, state.list);
    }

    @State(Scope.Thread)
    public static class KeyPointFilterState {
        private static final int NUMBER_OF_POINTS = 12_000;
        private static final int IMAGE_WIDTH = 640;
        private static final int IMAGE_HEIGHT = 480;
        private static final int RESPONSE_RANGE = 255;
        private List<OpenCVKeyPoint> list;
        private KeyPointFilter filter;

        @Setup(Level.Trial)
        public void doSetup() {
            this.list = new ArrayList<>();
            for (int i = 0; i < NUMBER_OF_POINTS; i++) {
                double x = Math.random() * IMAGE_WIDTH;
                double y = Math.random() * IMAGE_HEIGHT;
                float response = (float) (Math.random() * RESPONSE_RANGE);
                list.add(new OpenCVKeyPoint(x, y, response));
            }
            this.filter = new KeyPointFilter(IMAGE_WIDTH, IMAGE_HEIGHT);
        }
    }
}

当前的实现:

public class KeyPointFilter {
    private boolean[] matrix;
    private final int rowCount;
    private final int colCount;
    private int matrixColCount;
    private int matrixRowCount;
    private boolean[] ones;
    private int radiusInitialized;

    public KeyPointFilter(int colCount, int rowCount) {
        this.colCount = colCount;
        this.rowCount = rowCount;
    }

    void init(int radius) {
        if (radiusInitialized == radius) {
            // Already initialized, just reset.
            this.matrix = new boolean[matrixRowCount * matrixColCount];
            return;
        }
        this.matrixRowCount = rowCount + radius * 2;
        this.matrixColCount = colCount + radius * 2;
        this.matrix = new boolean[matrixRowCount * matrixColCount];
        // Initialize a one array, to use in the coverAround arraycopy optimization.
        this.ones = new boolean[matrixColCount];
        for (int i = 0; i < ones.length; i++) {
            ones[i] = true;
        }
        radiusInitialized = radius;
    }

    public List<OpenCVKeyPoint> filterByRadius(int radius, List<OpenCVKeyPoint> input) {
        init(radius);
        List<OpenCVKeyPoint> filtered = new ArrayList<>();
        // Eliminating by covering
        for (OpenCVKeyPoint point : input) {
            int col = (int) point.getXPos();
            int row = (int) point.getYPos();
            if (!isSet(col, row)) {
                bresenhamFilledCircle(col, row, radius);
                filtered.add(point);
            }
        }
        return filtered;
    }

    void bresenhamFilledCircle(int col, int row, int radius) {
        // CHECKSTYLE IGNORE MagicNumber FOR NEXT 1 LINES.
        int d = (5 - radius * 4) / 4;
        int x = 0;
        int y = radius;
        int rowOffset = radius + row;
        int colOffset = radius + col;
        do {
            //Since we are filling a circle, we fill using System.arraycopy, from left to right.
            int yStart = colOffset - y;
            int yLength = 2 * y;
            // Row a bottom
            System.arraycopy(ones, 0, matrix, getIndex(rowOffset - x, yStart), yLength);
            if (x != 0) {
                int xStart = colOffset - x;
                int xLength = 2 * x;
                // Row a top
                System.arraycopy(ones, 0, matrix, getIndex(rowOffset + x, yStart), yLength);
                // Row b bottom
                System.arraycopy(ones, 0, matrix, getIndex(rowOffset - y, xStart), xLength);
                // Row b top
                System.arraycopy(ones, 0, matrix, getIndex(rowOffset + y, xStart), xLength);
            }
            if (d < 0) {
                d += 2 * x + 1;
            } else {
                d += 2 * (x - y) + 1;
                y--;
            }
            x++;
        } while (x <= y);
    }

    private int getIndex(int row, int col) {
        return row * matrixColCount + col;
    }

    private void debugArray() {
        StringBuilder actualResult = new StringBuilder();
        for (int row = 0; row < getRowCount(); row++) {
            for (int col = 0; col < getColCount(); col++) {
                actualResult.append(isSet(col, row) ? '1' : '0');
            }
            actualResult.append('\n');
        }
        System.out.println(actualResult);
    }

    public boolean isSet(int col, int row) {
        return matrix[getIndex(row + radiusInitialized, col + radiusInitialized)];
    }

    int getRowCount() {
        return rowCount;
    }

    int getColCount() {
        return colCount;
    }
}

加上要使用的关键点类:

public class OpenCVKeyPoint {
    private final double xPos;
    private final double yPos;
    private final float response;

    public OpenCVKeyPoint(double xPos, double yPos, float response) {
        this.xPos = xPos;
        this.yPos = yPos;
        this.response = response;
    }

    public float getResponse() {
        return response;
    }

    public double getXPos() {
        return xPos;
    }

    public double getYPos() {
        return yPos;
    }
}

最佳答案

可以尽可能缓存更多的计算和内联函数。

尝试用这个替换 filterByRadius 看看是否有任何改进:

public List<OpenCVKeyPoint> filterByRadius(final int radius, List<OpenCVKeyPoint> input) {
    init(radius);

    // Possibly give a hint to the arraylist on how much space to allocate from the start.
    List<OpenCVKeyPoint> filtered = new ArrayList<>();

    // calculate once
    final int d_init = (5 - radius * 4) / 4;

    // Eliminating by covering
    for (OpenCVKeyPoint point : input) {

        // FIXME do the points need to be doubles, only to be cast to int?
        int col = (int) point.getXPos();
        int row = (int) point.getYPos();

        if (!isSet(col, row)) {
            final int rowOffset = (radius + row) * matrixColCount;
            final int colOffset = radius + col;

            int d = d_init;
            int x = 0;
            int y = radius;
            do {
                final int yStart = colOffset - y;
                final int yLength = 2 * y;

                final int xByMatrixColCount = x * matrixColCount;
                final int rowOffsetPlusYStart = rowOffset + yStart;

                // Since we are filling a circle, we fill using System.arraycopy, from left to right.

                // Row a bottom
                System.arraycopy(ones, 0, matrix, (rowOffsetPlusYStart - xByMatrixColCount),
                        yLength);
                if (x != 0) {
                    // Row a top
                    System.arraycopy(ones, 0, matrix, (rowOffsetPlusYStart + xByMatrixColCount),
                            yLength);

                    // -----
                    final int xLength = 2 * x;
                    final int yByMatrixColCount = y * matrixColCount;
                    final int rowOffsetPlusXStart = rowOffset + colOffset - x;

                    // Row b bottom
                    System.arraycopy(ones, 0, matrix, (rowOffsetPlusXStart - yByMatrixColCount),
                            xLength);

                    // Row b top
                    System.arraycopy(ones, 0, matrix, (rowOffsetPlusXStart + yByMatrixColCount),
                            xLength);
                }
                if (d < 0) {
                    d += 2 * x + 1;
                } else {
                    d += 2 * (x - y) + 1;
                    y--;
                }
                x++;
            } while (x <= y);

            filtered.add(point);
        }
    }
    return filtered;
}

这可能不会有太大的改进，但你要求更快，我认为这会稍微快一点，尽管我没有任何测量结果来支持我。如果您对此进行基准测试，那么我很想知道结果!

关于java - 我怎样才能进一步调整以下算法(Java)以使其运行得更快？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/49521049/

文章推荐： html - CSS:我无法将宽度设置为 "auto"始终显示 100%

文章推荐： node.js - 如何在 Mongoose 模型中实现事务概念？

文章推荐： c# - 串口通讯写入

javascript - 进一步/递归对象解构
我有这个: const {ops} = getOplogStreamInterpreter(strm); ops.del.subscribe(v => { console.log('delete
go - 使用 golang 时是否建议(进一步)限制表单的大小？
我四处搜索，据我所知，POST 表单请求已被限制为 10MB (http://golang.org/src/net/http/request.go#L721)。如果我要在我的 ServeHTTP 方

太空宇宙

个人简介

我是一名优秀的程序员,十分优秀！

作者热门文章

滴滴打车优惠券免费领取

全站热门文章

首页

博学

6Ren·AI

商城

java - 我怎样才能进一步调整以下算法(Java)以使其运行得更快？

介绍

测试、基准测试和分析

到目前为止我做了什么