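Python NLTK ships with cmudict, which returns the phonemes of any word it recognizes. For example, 'see' -> [u'S', u'IY1']. For a word it does not recognize, the lookup raises an error (a KeyError). For example, "seasee" -> error.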
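
    # Python 2 syntax (print statement), as in the original question
    import nltk

    arpabet = nltk.corpus.cmudict.dict()
    for word in ('s', 'see', 'sea', 'compute', 'comput', 'seesea'):
        try:
            # First listed pronunciation of the word
            print arpabet[word][0]
        except Exception as e:
            # Unknown words raise KeyError; printing it shows the missing key
            print e

    # Output
    [u'EH1', u'S']
    [u'S', u'IY1']
    [u'S', u'IY1']
    [u'K', u'AH0', u'M', u'P', u'Y', u'UW1', u'T']
    'comput'
    'seesea'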

    # Python 2 syntax (print statement), as in the original question
    import nltk

    arpabet = nltk.corpus.cmudict.dict()
    for word in ('s', 'see', 'sea', 'compute', 'comput', 'seesea', 'darfasasawwa'):
        try:
            phone = arpabet[word][0]
        except:
            # Unknown word: look up each prefix of increasing length instead
            try:
                counter = 0
                for i in word:
                    substring = word[0:1+counter]
                    counter += 1
                    try:
                        print substring, arpabet[substring][0]
                    except Exception as e:
                        print e
            except Exception as e:
                print e

    # Output
    c [u'S', u'IY1']
    co [u'K', u'OW1']
    com [u'K', u'AA1', u'M']
    comp [u'K', u'AA1', u'M', u'P']
    compu [u'K', u'AA1', u'M', u'P', u'Y', u'UW0']
    comput 'comput'
    s [u'EH1', u'S']
    se [u'S', u'AW2', u'TH', u'IY1', u'S', u'T']
    see [u'S', u'IY1']
    sees [u'S', u'IY1', u'Z']
    seese [u'S', u'IY1', u'Z']
    seesea 'seesea'
    d [u'D', u'IY1']
    da [u'D', u'AA1']
    dar [u'D', u'AA1', u'R']
    darf 'darf'
    darfa 'darfa'
    darfas 'darfas'
    darfasa 'darfasa'
    darfasas 'darfasas'
    darfasasa 'darfasasa'
    darfasasaw 'darfasasaw'
    darfasasaww 'darfasasaww'
    darfasasawwa 'darfasasawwa'

Best answer

I ran into the same problem and solved it by recursively splitting the unknown word into known words (see wordbreak):

    import nltk
    from functools import lru_cache
    from itertools import product as iterprod

    # Load the CMU Pronouncing Dictionary, downloading it on first use
    try:
        arpabet = nltk.corpus.cmudict.dict()
    except LookupError:
        nltk.download('cmudict')
        arpabet = nltk.corpus.cmudict.dict()

    @lru_cache()
    def wordbreak(s):
        s = s.lower()
        if s in arpabet:
            return arpabet[s]
        # Try split points near the middle of the word first
        middle = len(s)/2
        partition = sorted(list(range(len(s))), key=lambda x: (x-middle)**2-x)
        for i in partition:
            pre, suf = (s[:i], s[i:])
            # The prefix must be a known word; the suffix is split recursively
            if pre in arpabet and wordbreak(suf) is not None:
                # Pair every pronunciation of the prefix with every one of the suffix
                return [x+y for x,y in iterprod(arpabet[pre], wordbreak(suf))]
        return None
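
To try the splitting logic without downloading cmudict, here is a minimal sketch that runs the same function against a tiny stand-in dictionary. TOY_ARPABET is a hypothetical name, and its three entries are copied from the outputs in the question; the real cmudict has far more entries, so results on real data will differ.

```python
from functools import lru_cache
from itertools import product as iterprod

# Hypothetical stand-in for nltk.corpus.cmudict.dict():
# each word maps to a list of pronunciations (lists of phonemes).
TOY_ARPABET = {
    's': [['EH1', 'S']],
    'see': [['S', 'IY1']],
    'sea': [['S', 'IY1']],
}

@lru_cache()
def wordbreak(s):
    s = s.lower()
    if s in TOY_ARPABET:
        return TOY_ARPABET[s]
    # Try split points near the middle of the word first
    middle = len(s) / 2
    partition = sorted(range(len(s)), key=lambda x: (x - middle) ** 2 - x)
    for i in partition:
        pre, suf = s[:i], s[i:]
        # The prefix must be a known word; the suffix is split recursively
        if pre in TOY_ARPABET and wordbreak(suf) is not None:
            return [x + y for x, y in iterprod(TOY_ARPABET[pre], wordbreak(suf))]
    return None

print(wordbreak('seesea'))  # [['S', 'IY1', 'S', 'IY1']]
print(wordbreak('seas'))    # [['S', 'IY1', 'EH1', 'S']]
print(wordbreak('comput'))  # None
```

Note that the function returns a list of pronunciations rather than a single one, and returns None when no split into known words exists, so callers should check for None before indexing the result.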

For more on "python-2.7 - Getting phonemes from any word in Python NLTK or another module?", see the similar question on Stack Overflow: https://stackoverflow.com/questions/33666557/