gpt4 book ai didi

Algorithm for drawing outline of image shape in BezierPath (Canny edge detector)(BezierPath(Canny边缘检测器)中的图像形状轮廓提取算法)

转载 作者:bug小助手 更新时间:2023-10-25 15:17:49 30 4
gpt4 key购买 nike



I'm trying to draw the outline of an image using BezierPath based on the transparency of each pixel.

我正在尝试根据每个像素的透明度使用BezierPath绘制图像的轮廓。


However, I'm having an issue with the logic; my logic also draws the internal outlines.

然而,我对逻辑有一个问题;我的逻辑也画出了内部轮廓。


I only want to draw the external outline with BezierPath.
What I get (the first shape is the original image, the second is the bezierPath):

我只想用BezierPath绘制外部轮廓。我得到的(第一个形状是原始图像,第二个是bezierPath):


enter image description here


My code:

我的代码是:


func processImage(_ image: UIImage) -> UIBezierPath? {
guard let cgImage = image.cgImage else {
print("Error: Couldn't get CGImage from UIImage")
return nil
}

let width = cgImage.width
let height = cgImage.height

// Create a context to perform image processing
let colorSpace = CGColorSpaceCreateDeviceGray()
let context = CGContext(data: nil, width: width, height: height, bitsPerComponent: 8, bytesPerRow: width, space: colorSpace, bitmapInfo: CGImageAlphaInfo.none.rawValue)

guard let context = context else {
print("Error: Couldn't create CGContext")
return nil
}

// Draw the image into the context
context.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))

// Perform Canny edge detection
guard let edgeImage = context.makeImage() else {
print("Error: Couldn't create edge image")
return nil
}

// Create a bezier path for the outline of the shape
let bezierPath = UIBezierPath()

// Iterate over the image pixels to find the edges
for y in 0..<height {
for x in 0..<width {
let pixel = edgeImage.pixel(x: x, y: y)

if pixel > 0 {
let leftPixel = (x > 0) ? edgeImage.pixel(x: x - 1, y: y) : 0
let rightPixel = (x < width - 1) ? edgeImage.pixel(x: x + 1, y: y) : 0
let abovePixel = (y > 0) ? edgeImage.pixel(x: x, y: y - 1) : 0
let belowPixel = (y < height - 1) ? edgeImage.pixel(x: x, y: y + 1) : 0

if leftPixel == 0 || rightPixel == 0 || abovePixel == 0 || belowPixel == 0 {
bezierPath.move(to: CGPoint(x: CGFloat(x), y: CGFloat(y)))
bezierPath.addLine(to: CGPoint(x: CGFloat(x) + 1.0, y: CGFloat(y) + 1.0))
}
}
}
}

return bezierPath
}

extension CGImage {
func pixel(x: Int, y: Int) -> UInt8 {
let data = self.dataProvider!.data
let pointer = CFDataGetBytePtr(data)
let bytesPerRow = self.bytesPerRow

let pixelInfo = (bytesPerRow * y) + x
return pointer![pixelInfo]
}
}

更多回答

You want a flood fill algorithm, starting outside the image to find everything "not" in the shape. Then edge detect on that. en.wikipedia.org/wiki/Flood_fill There are many other approaches, but that one is pretty easy to implement and will work reasonably for many things as long as there are no gaps in your edge. You can also find the fist pixel of an edge as you are, and then try "walking around it" by testing nearby pixels in all directions.

您需要一个整体填充算法,从图像外部开始查找形状中的所有内容。然后对其进行边缘检测。En.wikipedia.org/wiki/Flood_Fill还有许多其他方法,但这种方法很容易实现,只要您的边缘没有缺口,它就可以很好地适用于许多事情。你也可以找到一条边的第一个像素,然后尝试通过在各个方向上测试附近的像素来“绕过它”。

Note that generating thousands of two-pixel lines can be very expensive to work with. You probably will want to simplify the final curve using something like Ramer–Douglas–Peucker (possibly before creating an actual BezierPath).

请注意,生成数千条两像素行的操作成本可能非常高。您可能希望使用类似Ramer-Douglas-Peucker的方法来简化最终曲线(可能在创建实际的BezierPath之前)。

That said, how would you want this to work if the above font were the letter P? I would assume there would be a large internal hole that you would want to treat as edges, even though you don't want to capture the other, smaller internal holes. Solving this well is likely complex, and require you to detect "features" rather than just adjacent pixels. One approach is to scan with a larger (3x3, 5x5, on that order) overlapping "block" that is considered filled in if any of its pixels are filled in. That will ignore small holes, while capturing larger holes. It's challenging.

也就是说,如果上面的字体是字母P,你希望它是如何工作的?我会假设会有一个大的内部孔,您会希望将其视为边,即使您不想捕获其他较小的内部孔。解决这口井可能很复杂,需要你检测“特征”,而不仅仅是相邻的像素。一种方法是用一个更大的(3x3,5x5,顺序为3x3,5x5)重叠的“块”进行扫描,如果它的任何像素被填充,就认为它被填充了。这将忽略小洞,同时捕获较大的洞。这很有挑战性。

Thank you for your response. I will try to implement the flood fill algorithm; it seems to be what I'm looking for. Indeed, I will probably need to simplify the paths, which I will do in step 2. And to address the example of the letter P, I'm not interested in the hole inside for my goal, only the external outline. I will keep you updated. Thank you very much!

谢谢您的回复。我将尝试实现泛洪填充算法;这似乎就是我要寻找的。事实上,我可能需要简化路径,我将在步骤2中这样做。为了解决字母P的例子,我对我的目标不是内部的洞感兴趣,只对外部轮廓感兴趣。我会随时通知你最新情况。非常感谢!

My goal is to get out the the outline as BezierPath, after more investigation I need to use another algorithm Moore-Neighbor tracing algorithm

我的目标是得到BezierPath的轮廓,经过更多的研究后,我需要使用另一种算法摩尔-邻居跟踪算法

优秀答案推荐

From the comments, you found the algorithm (Moore Neighborhood Tracing). Here's an implementation that works well for your problem. I'll comment on some improvements you might consider.

从评论中,您找到了算法(摩尔邻域跟踪)。这里有一个可以很好地解决您的问题的实现。我会评论一下你可能会考虑的一些改进。


First, you need to get the data into a buffer, with one byte per pixel. You seem to know how to do that, so I won't belabor the point. 0 should be "transparent" and non-0 should be "filled." In the literature, these are often black (1) lines on a white (0) background, so I'll use that naming.

首先,您需要将数据放入缓冲区,每个像素一个字节。你似乎知道如何做那件事,所以我就不多说了。0应该是“透明的”,非0应该是“填充的”。在文献中,这些通常是白(0)背景上的黑(1)线,所以我将使用这种命名。


The best introduction I've found (and regularly cited) is Abeer George Ghuneim's Contour Tracing site. Really useful site. I've seen some implementations of MNT that over-check some pixels. I try to follow the algorithm Abeer describes carefully to avoid that.

我找到的最好的介绍(并经常被引用)是阿比尔·乔治·古纳姆的等高线追踪网站。非常有用的网站。我见过MNT的一些实现,它们过度检查了一些像素。我试着遵循阿比尔描述的算法,以避免这种情况。


There is more testing I want to do on this code, but it handles your case.

我还想对这段代码进行更多测试,但它可以处理您的情况。


First, the algorithm operates on a Grid of Cells:

首先,该算法对单元格网格进行操作:


public struct Cell: Equatable {
public var x: Int
public var y: Int
}

public struct Grid: Equatable {
public var width: Int
public var height: Int
public var values: [UInt8]

public var columns: Range<Int> { 0..<width }
public var rows: Range<Int> { 0..<height }

// The pixels immediately outside the grid are white.
// Accessing beyond that is a runtime error.
public subscript (p: Cell) -> Bool {
if p.x == -1 || p.y == -1 || p.x == width || p.y == height { return false }
else { return values[p.y * width + p.x] != 0 }
}

public init?(width: Int, height: Int, values: [UInt8]) {
guard values.count == height * width else { return nil }
self.height = height
self.width = width
self.values = values
}
}

There is also the concept of "direction." This is in two forms: the direction from the center to one of the 8 neighbors, and the "backtrack" direction, which is the direction a cell is "entered" during the search

还有“方向”的概念。这有两种形式:从中心到8个邻域之一的方向和“回溯”方向,即在搜索过程中“进入”像元的方向


enum Direction: Equatable {
case north, northEast, east, southEast, south, southWest, west, northWest

mutating func rotateClockwise() {
self = switch self {
case .north: .northEast
case .northEast: .east
case .east: .southEast
case .southEast: .south
case .south: .southWest
case .southWest: .west
case .west: .northWest
case .northWest: .north
}
}

//
// Given a direction from the center, this is the direction that box was entered from when
// rotating clockwise.
//
// +---+---+---+
// + ↓ + ← + ← +
// +---+---+---+
// + ↓ + + ↑ +
// +---+---+---+
// + → + → + ↑ +
// +---+---+---+
func backtrackDirection() -> Direction {
switch self {
case .north: .west
case .northEast: .west
case .east: .north
case .southEast: .north
case .south: .east
case .southWest: .east
case .west: .south
case .northWest: .south
}
}
}

And Cells can advance in a given direction:

细胞可以沿着给定的方向前进:


extension Cell {
func inDirection(_ direction: Direction) -> Cell {
switch direction {
case .north: Cell(x: x, y: y - 1)
case .northEast: Cell(x: x + 1, y: y - 1)
case .east: Cell(x: x + 1, y: y )
case .southEast: Cell(x: x + 1, y: y + 1)
case .south: Cell(x: x , y: y + 1)
case .southWest: Cell(x: x - 1, y: y + 1)
case .west: Cell(x: x - 1, y: y )
case .northWest: Cell(x: x - 1, y: y - 1)
}
}
}

And finally, the Moore Neighbor algorithm:

最后是摩尔邻居算法:


public struct BorderFinder {
public init() {}

// Returns the point and the direction of the previous point
// Since this scans left-to-right, the previous point is always to the west
// The grid includes x=-1, so it's ok if this is an edge.
func startingPoint(for grid: Grid) -> (point: Cell, direction: Direction)? {
for y in grid.rows {
for x in grid.columns {
let point = Cell(x: x, y: y)
if grid[point] {
return (point, .west)
}
}
}
return nil
}

/// Finds the boundary of a blob within `grid`
///
/// - Parameter grid: an Array of bytes representing a 2D grid of UInt8. Each cell is either zero (white) or non-zero (black).
/// - Returns: An array of points defining the boundary. The boundary includes only black points.
/// If multiple "blobs" exist, it is not defined which will be returned.
/// If no blob is found, an empty array is returned
public func findBorder(in grid: Grid) -> [Cell] {
guard let start = startingPoint(for: grid) else { return [] }
var (point, direction) = start
var boundary: [Cell] = [point]

var rotations = 0
repeat {
direction.rotateClockwise()
let nextPoint = point.inDirection(direction)
if grid[nextPoint] {
boundary.append(nextPoint)
point = nextPoint
direction = direction.backtrackDirection()
rotations = 0
} else {
rotations += 1
}
} while (point, direction) != start && rotations <= 7

return boundary
}
}

This returns a list of Cells. That can be converted to a CGPath as follows:

这将返回一个单元格列表。它可以转换为CGPath,如下所示:


let data = ... Bitmap data with background as 0, and foreground as non-0 ...
let grid = Grid(width: image.width, height: image.height, values: Array(data))!

let points = BorderFinder().findBorder(in: grid)

let path = CGMutablePath()
let start = points.first!

path.move(to: CGPoint(x: start.x, y: start.y))
for point in points.dropFirst() {
let cgPoint = CGPoint(x: point.x, y: point.y)
path.addLine(to: cgPoint)
}
path.closeSubpath()

That generates the following path:

这将生成以下路径:


Contour of original picture


There is a gist of the full sample code I used. (This sample code isn't meant to be a good example of how to prep the image for processing. I just threw it together to work on the algorithm.)

下面是我使用的完整示例代码的要点。(此示例代码并不是一个很好的示例,说明如何准备图像以进行处理。我只是把它们拼凑在一起来研究算法。)


Some thoughts for future work:

对今后工作的一些思考:



  • You can likely get good and faster results by first scaling the image to something smaller. Half-scale definitely works well, but consider even 1/10 scale.

  • You may get better results by first applying a small Gaussian blur to the image. This will eliminate small gaps in the edge, which can cause trouble for the algorithm, and reduce the complexity of the contour.

  • Managing 5000 path elements, each a 2-pixel line, is probably not great. Pre-scaling the image can help a lot. Another approach is applying Ramer–Douglas–Peucker to further simplify the contour.


更多回答

Works perfectly, thank you for your help. You just forgot the 'inDirection' method in your response. But I had understood it:func inDirection(_ direction: Direction) -> Cell { switch direction { case .north: return Cell(x: x, y: y - 1)....

效果很好,谢谢你的帮助。您只是忘记了您的响应中的“间接”方法。但我已经理解了:函数间接(_Direction:Direction)->单元格{切换方向{case.North:Return Cell(x:x,y:y-1)...

Oh yes! Sorry about that. Added.

哦,是的!真对不起。加进去了。

30 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com