
ios - How do I extract words (English) and names from a string in Swift 3.0?

Reposted. Author: 行者123. Updated: 2023-11-28 15:59:26

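I'm using an image-processing API to read the text on an image, and from the string data I get back I need to extract English (dictionary) words as well as common first and last names. In other words, the string contains the text I need, but the results also contain junk (non-words) that I need to filter out. What's the best approach here? I've looked into NSLinguisticTagger, but it isn't a 100% match for what I'm doing. Any other suggestions?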

Would regex help me? I don't know how to write a pattern that matches only words.

Below are 2 example strings that I'm trying to extract words/names from:

(1) “PUMPER im CasSICI 1111 Cassu with Andrew Webster PUMPE im CasSICI 1111 Cassu with Andrew Webster”
// I need to extract: “Pumper With Andrew Webster”

(2) “in the powerful Hazelwood High trilogy SHARON M DRAPER000kFORGEDBY FIRESWINNER SHARON M DRAPER 000k in the powerful Hazelwood High trilogy FORGED BY FIRE S WINNER”
// I need to extract: “Sharon Hazelwood High Draper in the powerful trilogy in forged by fire winner”

Best answer

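I cobbled this class together; it's a mix of real code and pseudocode. I would create a singleton class for the first and last names. See the comments in the code for details. This isn't everything, but it should solve most of your problem.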
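The singleton mentioned above could look roughly like the sketch below. The `NameStore` type, its method names, and the tiny hard-coded name sets are all hypothetical placeholders (the values just mirror the sample arrays used in this answer's code); a real app would presumably load much larger lists from a file or database.

```swift
/// Hypothetical singleton holding the known first and last names.
final class NameStore {
    // Single shared instance; the private init prevents creating others.
    static let shared = NameStore()

    // Stored lowercased so that lookups are case-insensitive.
    // Sample values only, taken from the arrays used in this answer.
    private let firstNames: Set<String> = ["andrew", "sharon"]
    private let lastNames: Set<String> = ["webster", "hazelwood"]

    private init() {}

    /// Returns true if `word` matches a known first name, ignoring case.
    func isFirstName(_ word: String) -> Bool {
        return firstNames.contains(word.lowercased())
    }

    /// Returns true if `word` matches a known last name, ignoring case.
    func isLastName(_ word: String) -> Bool {
        return lastNames.contains(word.lowercased())
    }
}
```

With something like this in place, the `String` extension's `isFirstName()` could simply return `NameStore.shared.isFirstName(self)`. A `Set` is used here instead of an array so that lookups stay fast as the name list grows.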

Update: Adjusted the cleanUpString method to use a switch statement.

Update 2: Added this to handle what UITextChecker doesn't...

return UIReferenceLibraryViewController.dictionaryHasDefinition(forTerm: self)

Wherever you get your OCR text from, you can use it like this:

let stringParser = StringParser()
let cleanedUpText = stringParser.cleanUpString(yourOCRText)

Here's the class:

import UIKit // needed for UITextChecker and UIReferenceLibraryViewController
import Foundation

class StringParser: NSObject {

    // TODO: You'll need to create a singleton class for your first and last names
    // https://krakendev.io/blog/the-right-way-to-write-a-singleton

    func cleanUpString(_ inputString: String) -> String {

        // Split the input on spaces so each token becomes an individual string.
        // Swift 3 syntax; in Swift 4 and later, call inputString.split(separator: " ") directly.
        let inputStringArray = inputString.characters.split(separator: " ").map(String.init)

        var outputArray = [String]()

        for word in inputStringArray {
            // Keep the word if it satisfies any of the desired conditions; otherwise drop it.
            // Swift cases don't fall through, so no explicit break is needed after each append.
            switch word {
            case _ where word.isRealWord():
                outputArray.append(word)
            case _ where word.isFirstName():
                outputArray.append(word.capitalized)
            case _ where word.isLastName():
                outputArray.append(word.capitalized)
            default:
                break
            }
        }

        // Join the cleaned-up words back into a single string and return it.
        return outputArray.joined(separator: " ")
    }
}

extension String {

    func isFirstName() -> Bool {
        let firstNameArray = ["Andrew", "Sharon"] // FIXME: this should be your singleton

        return firstNameArray.contains(self.capitalized)
    }

    func isLastName() -> Bool {
        let lastNameArray = ["Webster", "Hazelwood"] // FIXME: this should be your singleton

        return lastNameArray.contains(self.capitalized)
    }

    func isRealWord() -> Bool {
        // adapted from https://www.hackingwithswift.com/example-code/uikit/how-to-check-a-string-is-spelled-correctly-using-uitextchecker
        let checker = UITextChecker()
        let range = NSRange(location: 0, length: self.utf16.count)
        let misspelledRange = checker.rangeOfMisspelledWord(in: self, range: range, startingAt: 0, wrap: false, language: "en")

        if misspelledRange.location == NSNotFound {
            // cleans up what UITextChecker misses
            return UIReferenceLibraryViewController.dictionaryHasDefinition(forTerm: self) // returns true if there's a definition for it
        }
        return false
    }
}
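One thing the class above does not do is remove repeats, yet the OCR samples in the question contain the same text twice and the desired output lists each word once. The sketch below is one possible pre-filtering step, not part of the original answer: `preFilter` is a hypothetical helper that drops tokens containing non-letters (such as "1111" or "000k") and skips case-insensitive duplicates. It uses only the standard library and assumes Swift 5, unlike the Swift 3 listing above.

```swift
/// Hypothetical helper: keeps tokens made only of letters and drops
/// case-insensitive repeats, preserving first-seen order.
func preFilter(_ text: String) -> [String] {
    var seen = Set<String>()
    var result = [String]()

    for token in text.split(separator: " ") {
        let word = String(token)

        // Skip tokens such as "1111" or "000k" that contain non-letters.
        guard word.allSatisfy({ $0.isLetter }) else { continue }

        // insert(_:) reports whether the lowercased word was new to the set.
        if seen.insert(word.lowercased()).inserted {
            result.append(word)
        }
    }
    return result
}

let sample = "PUMPER im CasSICI 1111 Cassu with Andrew Webster PUMPE im CasSICI 1111 Cassu with Andrew Webster"
print(preFilter(sample).joined(separator: " "))
// prints "PUMPER im CasSICI Cassu with Andrew Webster PUMPE"
```

The surviving tokens would still need to go through the dictionary and name checks in `cleanUpString` to get rid of junk like "CasSICI" and "Cassu"; this step only reduces how many words those slower checks have to look at.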

Regarding "ios - How do I extract words (English) and names from a string in Swift 3.0?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41292904/
