gpt4 book ai didi

java - 我怎样才能进一步调整以下算法(Java)以使其运行得更快?

转载 作者:太空宇宙 更新时间:2023-11-03 21:50:52 25 4
gpt4 key购买 nike

介绍

我目前已经实现了这个 Paper (Efficiently selecting spatially distributed keypoints for visual tracking) 的算法在 java 。我没有做过论文中的以下建议(第 5 节末尾的第 3 页):

The relatively expensive cell cover operation can be substantially sped up by using a single bit to store the state of each cell of Gr. This enables using bitwise OR operations to “cover” contiguous patches at once with precomputed bitmasks that implement the cover to be applied at a given bit-offset position.

测试、基准测试和分析

  • 算法测试通过创建 12000 个随机点并使用初始半径执行一次算法进行测试。 JMH附上测试。
  • 我不断地使用 JProfiler 进行简介对于对象内存吞吐量(实际上创建的对象并不多)、CPU(它是 CPU 瓶颈)、GC(这里没有发生什么),而 CPU 瓶颈目前在 bresenhamFilledCircle 方法中(这是所有操作发生的地方)。在 12.000 个点中,主算法返回了大约 1.500 个,因此 bresenhamFilledCircle 执行了大约 1.500 * 6.700 = 大约 1000 万次 pr。第二。这大约是 0.1 微秒(100 纳秒)pr 调用。相当快,但应该有空间让它走得更快....

到目前为止我做了什么

  • 从基本的强力算法开始:行和列的两个嵌套循环,以及一个标准 Pythagorean theorem弄清楚我是否在一个圆圈内,将圆圈“绘制”为 boolean 值 [][]。
    吞吐量 ~3 500 ops/sec
  • 切换到使用 System.arrayCopy 进行填充而不是暴力破解。
    吞吐量约为 5 600 次操作/秒

  • 优化数组初始化(使用缓存)。
    吞吐量 ~6 000 ops/sec

  • 在行和列上添加边距以避免在算法期间进行边界检查。
    吞吐量 ~6 500 ops/sec
  • 切换到Bresenham's circle algorithm (略微修改以填充圆圈)以避免“复杂的”毕达哥拉斯检查。
    吞吐量 ~6 500 ops/sec。 :(
  • 从二维数组切换到一维数组..
    吞吐量 ~6 700 ops/sec

现在我没有主意了,除了将 boolean[] 转换为 byte[] 并按照建议使用位掩码进行设置/获取(如果我正确理解了论文中的建议)。

有人愿意挑战吗?

下面是 JMH 测试:

public class KeyPointFilterBenchmark {
private static final int DEFAULT_RADIUS = 10;

@Benchmark
public List<OpenCVKeyPoint> benchmarkFilterByRadius(KeyPointFilterState state) {
return state.filter.filterByRadius(DEFAULT_RADIUS, state.list);
}

@State(Scope.Thread)
public static class KeyPointFilterState {
private static final int NUMBER_OF_POINTS = 12_000;
private static final int IMAGE_WIDTH = 640;
private static final int IMAGE_HEIGHT = 480;
private static final int RESPONSE_RANGE = 255;
private List<OpenCVKeyPoint> list;
private KeyPointFilter filter;

@Setup(Level.Trial)
public void doSetup() {
this.list = new ArrayList<>();
for (int i = 0; i < NUMBER_OF_POINTS; i++) {
double x = Math.random() * IMAGE_WIDTH;
double y = Math.random() * IMAGE_HEIGHT;
float response = (float) (Math.random() * RESPONSE_RANGE);
list.add(new OpenCVKeyPoint(x, y, response));
}
this.filter = new KeyPointFilter(IMAGE_WIDTH, IMAGE_HEIGHT);
}
}
}

当前的实现:

public class KeyPointFilter {
private boolean[] matrix;
private final int rowCount;
private final int colCount;
private int matrixColCount;
private int matrixRowCount;
private boolean[] ones;
private int radiusInitialized;

public KeyPointFilter(int colCount, int rowCount) {
this.colCount = colCount;
this.rowCount = rowCount;
}

void init(int radius) {
if (radiusInitialized == radius) {
// Already initialized, just reset.
this.matrix = new boolean[matrixRowCount * matrixColCount];
return;
}
this.matrixRowCount = rowCount + radius * 2;
this.matrixColCount = colCount + radius * 2;
this.matrix = new boolean[matrixRowCount * matrixColCount];
// Initialize a one array, to use in the coverAround arraycopy optimization.
this.ones = new boolean[matrixColCount];
for (int i = 0; i < ones.length; i++) {
ones[i] = true;
}
radiusInitialized = radius;
}

public List<OpenCVKeyPoint> filterByRadius(int radius, List<OpenCVKeyPoint> input) {
init(radius);
List<OpenCVKeyPoint> filtered = new ArrayList<>();
// Eliminating by covering
for (OpenCVKeyPoint point : input) {
int col = (int) point.getXPos();
int row = (int) point.getYPos();
if (!isSet(col, row)) {
bresenhamFilledCircle(col, row, radius);
filtered.add(point);
}
}
return filtered;
}

void bresenhamFilledCircle(int col, int row, int radius) {
// CHECKSTYLE IGNORE MagicNumber FOR NEXT 1 LINES.
int d = (5 - radius * 4) / 4;
int x = 0;
int y = radius;
int rowOffset = radius + row;
int colOffset = radius + col;
do {
//Since we are filling a circle, we fill using System.arraycopy, from left to right.
int yStart = colOffset - y;
int yLength = 2 * y;
// Row a bottom
System.arraycopy(ones, 0, matrix, getIndex(rowOffset - x, yStart), yLength);
if (x != 0) {
int xStart = colOffset - x;
int xLength = 2 * x;
// Row a top
System.arraycopy(ones, 0, matrix, getIndex(rowOffset + x, yStart), yLength);
// Row b bottom
System.arraycopy(ones, 0, matrix, getIndex(rowOffset - y, xStart), xLength);
// Row b top
System.arraycopy(ones, 0, matrix, getIndex(rowOffset + y, xStart), xLength);
}
if (d < 0) {
d += 2 * x + 1;
} else {
d += 2 * (x - y) + 1;
y--;
}
x++;
} while (x <= y);
}

private int getIndex(int row, int col) {
return row * matrixColCount + col;
}

private void debugArray() {
StringBuilder actualResult = new StringBuilder();
for (int row = 0; row < getRowCount(); row++) {
for (int col = 0; col < getColCount(); col++) {
actualResult.append(isSet(col, row) ? '1' : '0');
}
actualResult.append('\n');
}
System.out.println(actualResult);
}

public boolean isSet(int col, int row) {
return matrix[getIndex(row + radiusInitialized, col + radiusInitialized)];
}

int getRowCount() {
return rowCount;
}

int getColCount() {
return colCount;
}
}

加上要使用的关键点类:

public class OpenCVKeyPoint {
private final double xPos;
private final double yPos;
private final float response;

public OpenCVKeyPoint(double xPos, double yPos, float response) {
this.xPos = xPos;
this.yPos = yPos;
this.response = response;
}

public float getResponse() {
return response;
}

public double getXPos() {
return xPos;
}

public double getYPos() {
return yPos;
}
}

最佳答案

可以尽可能缓存更多的计算和内联函数。

尝试用这个替换 filterByRadius 看看是否有任何改进:

public List<OpenCVKeyPoint> filterByRadius(final int radius, List<OpenCVKeyPoint> input) {
init(radius);

// Possibly give a hint to the arraylist on how much space to allocate from the start.
List<OpenCVKeyPoint> filtered = new ArrayList<>();

// calculate once
final int d_init = (5 - radius * 4) / 4;

// Eliminating by covering
for (OpenCVKeyPoint point : input) {

// FIXME do the points need to be doubles, only to be cast to int?
int col = (int) point.getXPos();
int row = (int) point.getYPos();

if (!isSet(col, row)) {
final int rowOffset = (radius + row) * matrixColCount;
final int colOffset = radius + col;

int d = d_init;
int x = 0;
int y = radius;
do {
final int yStart = colOffset - y;
final int yLength = 2 * y;

final int xByMatrixColCount = x * matrixColCount;
final int rowOffsetPlusYStart = rowOffset + yStart;

// Since we are filling a circle, we fill using System.arraycopy, from left to right.

// Row a bottom
System.arraycopy(ones, 0, matrix, (rowOffsetPlusYStart - xByMatrixColCount),
yLength);
if (x != 0) {
// Row a top
System.arraycopy(ones, 0, matrix, (rowOffsetPlusYStart + xByMatrixColCount),
yLength);

// -----
final int xLength = 2 * x;
final int yByMatrixColCount = y * matrixColCount;
final int rowOffsetPlusXStart = rowOffset + colOffset - x;

// Row b bottom
System.arraycopy(ones, 0, matrix, (rowOffsetPlusXStart - yByMatrixColCount),
xLength);

// Row b top
System.arraycopy(ones, 0, matrix, (rowOffsetPlusXStart + yByMatrixColCount),
xLength);
}
if (d < 0) {
d += 2 * x + 1;
} else {
d += 2 * (x - y) + 1;
y--;
}
x++;
} while (x <= y);

filtered.add(point);
}
}
return filtered;
}

这可能不会有太大的改进,但你要求更快,我认为这会稍微快一点,尽管我没有任何测量结果来支持我。如果您对此进行基准测试,那么我很想知道结果!

关于java - 我怎样才能进一步调整以下算法(Java)以使其运行得更快?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/49521049/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com