
python - Converting from TensorFlow -> CoreML 3.0 for slot/intent detection

Reposted. Author: 行者123. Updated: 2023-11-30 10:31:50

I am trying to use some models created with this codebase (Slot-Filling-Understanding-Using-RNNs) in my Swift app.

I was able to convert lstm_nopooling, lstm_nopooling300 and lstm to CoreML.

In model.py I used the following code:

    def save_model(self):
        joblib.dump(self.summary, 'models/' + self.name + '.txt')
        self.model.save('models/' + self.name + '.h5')
        try:
            # Convert the trained Keras model and save it next to the .h5 file.
            coreml_model = coremltools.converters.keras.convert(
                self.model,
                input_names="main_input",
                output_names=["intent_output", "slot_output"])
            coreml_model.save('models/' + self.name + '.mlmodel')
        except Exception as e:
            # Report the failure instead of silently swallowing it.
            print("CoreML conversion failed:", e)
        print("Saved model to disk")

I am trying to convert the output vectors back into intents and slots.

This is what I have so far:

        func tokenizeSentences(instr: String) -> [Int] {
            let s = instr.lowercased().split(separator: " ")
            var ret = [Int]()
            if let filepath = Bundle.main.path(forResource: "atis.dict.vocab", ofType: "csv") {
                do {
                    let contents = try String(contentsOfFile: filepath)
                    print(contents)
                    let lines = contents.split { $0.isNewline }
                    // The id of a word is its line index in the vocabulary file.
                    for word in s {
                        if let index = lines.firstIndex(of: word) {
                            print(index.description + " " + word)
                            ret.append(index)
                        }
                    }
                    return ret
                } catch {
                    // contents could not be loaded
                }
            } else {
                // vocabulary file not found
            }
            return ret
        }

        func predictText(instr: String) {
            let model = lstm_nopooling300()
            guard let mlMultiArray = try? MLMultiArray(shape: [20, 1, 1],
                                                       dataType: MLMultiArrayDataType.int32) else {
                fatalError("Unexpected runtime error. MLMultiArray")
            }
            let tokens = tokenizeSentences(instr: instr)
            for (index, element) in tokens.enumerated() {
                mlMultiArray[index] = NSNumber(integerLiteral: element)
            }

            guard let m = try? model.prediction(input: lstm_nopooling300Input.init(main_input: mlMultiArray))
            else {
                fatalError("Unexpected runtime error. MLMultiArray")
            }
            let mm = m.intent_output
            let length = mm.count
            let doublePtr = mm.dataPointer.bindMemory(to: Double.self, capacity: length)
            let doubleBuffer = UnsafeBufferPointer(start: doublePtr, count: length)
            let output = Array(doubleBuffer)
            print("******** intents \(mm.count) ********")
            print(output)
            let mn = m.slot_output
            let length2 = mn.count
            // Read the slot scores from mn (the slot output), not from mm.
            let doublePtr2 = mn.dataPointer.bindMemory(to: Double.self, capacity: length2)
            let doubleBuffer2 = UnsafeBufferPointer(start: doublePtr2, count: length2)
            let output2 = Array(doubleBuffer2)
            print("******** slots \(mn.count) ********")
            print(output2)
        }
    }

When I run my code I get this; the intents output is truncated:

******** intents 540 ********

[0.0028914143331348896, 0.0057610333897173405, 4.1651015635579824e-05, 0.15935245156288147, 5.6665314332349226e-05, 5.7797817134996876e-05, 0.0044302307069301605, 0.00012486864579841495, 0.0004683282459154725, 0.003053907072171569, 3.806956738117151e-05, 0.012112349271774292, 5.861848694621585e-05, 0.0031344725284725428,

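I think the problem is that the ids are in a pickle file, so they are probably in atis/atis.train.pkl.

All I have done is train the model and convert it to CoreML. Now I am trying to use it, but I am not sure what to do next.

I have a text field where I type "current weather in London", and I expect to get something like this (this is from running example.py): {'intent': 'weather_intent', 'slots': [{'name': 'city', 'value': 'London'}]}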

These are the CoreML model's inputs and outputs:

[Image: CoreML model inputs and outputs]

Best answer

Thanks to @MatthijsHollemans, I was able to figure out what to do.

In data_processing.py I added these:

    # Write each id -> label dictionary to a file, one label per line.
    with open('atis/wordlist.csv', 'w') as f:
        for key in ids2words.keys():
            f.write("%s\n" % (ids2words[key]))
    with open('atis/wordlist_slots.csv', 'w') as f:
        for key in ids2slots.keys():
            f.write("%s\n" % (ids2slots[key]))
    with open('atis/wordlist_intents.csv', 'w') as f:
        for key in ids2intents.keys():
            f.write("%s\n" % (ids2intents[key]))

This let me tokenize correctly using wordlist.csv.
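The export and lookup can be sketched in Python roughly as follows. This is only a sketch under assumptions, not the repository's code: the `ids2words` mapping here is a tiny made-up one, and `write_wordlist`, `read_wordlist` and `tokenize` are hypothetical helper names. Writing the rows sorted by id is an added assumption, so that the line number of each entry equals its id regardless of the dictionary's iteration order.

```python
import os
import tempfile


def write_wordlist(path, ids2labels):
    """Write one label per line, ordered by id, so line number == id."""
    with open(path, "w", encoding="utf-8") as f:
        for idx in sorted(ids2labels):
            f.write("%s\n" % ids2labels[idx])


def read_wordlist(path):
    """Read the file back into a list where position == id."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


def tokenize(sentence, wordlist):
    """Map each known lowercase word to its id; unknown words are skipped."""
    word2id = {word: idx for idx, word in enumerate(wordlist)}
    return [word2id[w] for w in sentence.lower().split() if w in word2id]


# Made-up vocabulary for illustration only; ids are contiguous from 0.
ids2words = {2: "london", 0: "current", 1: "weather", 3: "in"}

path = os.path.join(tempfile.mkdtemp(), "wordlist.csv")
write_wordlist(path, ids2words)
wordlist = read_wordlist(path)
token_ids = tokenize("Current weather in London", wordlist)
print(wordlist)
print(token_ids)
```

The same list-based lookup is what the Swift side needs: the index of a word in the imported file is the id the model was trained on.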

Then, when I get the response, using mm.count was wrong; it should have been output.count. With that I can see the intents.

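Find the element with the largest value, then look up its index in wordlist_intents.csv (which I turned into an array, though it should probably be a dictionary) to get the most likely intent.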
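A minimal Python sketch of that lookup, assuming the labels are held in a list whose positions match the ids. The labels and scores below are made up for illustration, and `best_label` is a hypothetical helper name.

```python
def best_label(scores, labels):
    """Return the label with the highest score, together with that score."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best], scores[best]


# Made-up labels and scores for illustration only.
intent_labels = ["flight", "airfare", "weather_intent", "ground_service"]
intent_scores = [0.0029, 0.0058, 0.1594, 0.0044]

label, score = best_label(intent_scores, intent_labels)
print(label, score)
```

When the ids are contiguous and start at 0, a plain list indexed by id is enough; a dictionary only becomes necessary if the ids are sparse.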

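I still need to do the slots, but the basic idea is the same.

The key was to write the dictionaries used in Python out to CSV files and then import them into the project.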

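Update

I realized that mm.count is 540 because a sentence can contain up to 20 words, so the model returns predictions for that many words (20 × 27 = 540 values). So, in my case, I need to split the input on spaces and loop only as many times as there are words, because I won't get more slots than words.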
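The slicing can be sketched in Python like this. It assumes the flat output is laid out with one row of scores per word position, which is what the pointer arithmetic in the code further down relies on. The data is synthetic, and `rows_for_words` and `argmax` are hypothetical helper names.

```python
def rows_for_words(flat_scores, num_words, num_labels, max_words=20):
    """Split a flat score list into one row per word, ignoring unused rows."""
    if len(flat_scores) != max_words * num_labels:
        raise ValueError("unexpected output length")
    used = min(num_words, max_words)
    return [flat_scores[i * num_labels:(i + 1) * num_labels] for i in range(used)]


def argmax(row):
    """Index of the largest value in a row."""
    return max(range(len(row)), key=row.__getitem__)


# Synthetic output: 20 rows of 27 scores, where row i peaks at label i.
NUM_LABELS, MAX_WORDS = 27, 20
flat = [0.0] * (MAX_WORDS * NUM_LABELS)
for i in range(MAX_WORDS):
    flat[i * NUM_LABELS + i] = 1.0

words = "current weather in london".split()
rows = rows_for_words(flat, len(words), NUM_LABELS, MAX_WORDS)
predicted_ids = [argmax(row) for row in rows]
print(len(flat), predicted_ids)
```

Capping the loop at `max_words` also guards against reading past the end of the buffer when the input has more than 20 words.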

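I am doing this in SwiftUI, so I also had to create an observable object so that I could pass the terms around using EnvironmentObject.

So, to loop correctly over the array of doubles in memory, here is my latest code, which does what I expect.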

    func predictText(instr: String) {
        let model = lstm_nopooling300()
        guard let mlMultiArray = try? MLMultiArray(shape: [20, 1, 1],
                                                   dataType: MLMultiArrayDataType.int32) else {
            fatalError("Unexpected runtime error. MLMultiArray")
        }
        let tokens = tokenizeSentences(instr: instr)
        let sent = instr.split(separator: " ")
        print(instr)
        print(tokens)
        for (index, element) in tokens.enumerated() {
            mlMultiArray[index] = NSNumber(integerLiteral: element)
        }

        guard let m = try? model.prediction(input: lstm_nopooling300Input.init(main_input: mlMultiArray))
        else {
            fatalError("Unexpected runtime error. MLMultiArray")
        }

        // The intent output holds 27 scores per word position; read one row per input word.
        let mm = m.intent_output
        let length = mm.count
        let doublePtr = mm.dataPointer.bindMemory(to: Double.self, capacity: length)
        var intents = [String]()
        // A half-open range avoids a crash when the input has no words.
        for i in 0..<sent.count {
            let doubleBuffer = UnsafeBufferPointer(start: doublePtr + i * 27, count: 27)
            let output = Array(doubleBuffer)
            let intent = convertVectorToIntent(vector: output)
            intents.append(intent)
        }
        print(intents)

        // The slot output holds 133 scores per word position.
        let mn = m.slot_output
        let length2 = mn.count
        let doublePtr2 = mn.dataPointer.bindMemory(to: Double.self, capacity: length2)
        var slots = [String]()
        for i in 0..<sent.count {
            let doubleBuffer2 = UnsafeBufferPointer(start: doublePtr2 + i * 133, count: 133)
            let output2 = Array(doubleBuffer2)
            let slot = convertVectorToSlot(vector: output2)
            // Store each predicted slot label followed by the word it belongs to.
            slots.append(slot)
            slots.append(sent[i].description)
        }
        print(slots)
    }
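To get from per-word slot labels to the structure that example.py prints, something like the following Python sketch could be used. It rests on assumptions: that words with no slot are tagged "O", and that the labels may carry IOB prefixes ("B-", "I-"); neither is confirmed by the code above. `build_result` is a hypothetical helper name and the predictions are made up.

```python
def build_result(intent, words, slot_labels, outside="O"):
    """Pair each word with its slot label and drop words tagged as outside."""
    slots = []
    for word, label in zip(words, slot_labels):
        if label == outside:
            continue
        # Strip an IOB prefix such as "B-" or "I-" if present (assumed scheme).
        name = label[2:] if label[:2] in ("B-", "I-") else label
        if slots and label.startswith("I-") and slots[-1]["name"] == name:
            # Continuation of the previous slot: extend its value.
            slots[-1]["value"] += " " + word
        else:
            slots.append({"name": name, "value": word})
    return {"intent": intent, "slots": slots}


# Made-up predictions for illustration only.
words = ["current", "weather", "in", "new", "york"]
slot_labels = ["O", "O", "O", "B-city", "I-city"]
result = build_result("weather_intent", words, slot_labels)
print(result)
```

The same grouping can be done in Swift over the `slots` array, which already stores each label next to its word.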

Regarding python - Converting from TensorFlow -> CoreML 3.0 for slot/intent detection, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59024502/
