python - Markov chains on letter scale and random text

Reposted. Author: 太空狗. Updated: 2023-10-30 02:05:48

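I would like to generate a random text using letter frequencies from a book in a .txt file, so that each new character (string.lowercase + ' ') depends on the previous one.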

How do I use Markov chains to do so? Or is it simpler to use 27 arrays with conditional frequencies for each letter?

Best answer

I would like to generate a random text using letter frequencies from a book in a txt file

Consider using collections.Counter to build up the frequencies while looping over the text file two letters at a time.
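As a minimal sketch of that idea (the sample string and the name `pair_counts` are illustrative assumptions, not part of the original answer), pairing a string with itself shifted by one character gives every adjacent letter pair, which Counter can tally directly:

```python
from collections import Counter

# Illustrative sample; in practice this would be the contents of the book.
text = "the quick brown fox"

# zip(text, text[1:]) yields each character together with its successor,
# so the Counter maps (previous, current) pairs to how often they occur.
pair_counts = Counter(zip(text, text[1:]))

print(pair_counts[("t", "h")])     # how many times 'h' directly follows 't'
print(pair_counts.most_common(3))  # the three most frequent adjacent pairs
```

This flat counter holds the same information as the per-letter frequencies; grouping it by the first character of each pair gives one frequency table per letter.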

How do I use markov chains to do so? Or is it simpler to use 27 arrays with conditional frequencies for each letter?

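These two statements are equivalent. The Markov chain is what you are doing. The 27 arrays with conditional frequencies are how you are doing it.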
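To make the equivalence concrete, here is a hedged sketch of the "27 arrays" view: a 27 x 27 table in which row i holds the counts of each character that followed character i. The names `ALPHABET`, `INDEX`, `build_table` and `next_char` are hypothetical helpers introduced for illustration, and the sketch assumes Python 3.6+ for `random.choices`.

```python
import random
import string

ALPHABET = string.ascii_lowercase + " "          # the 27 symbols
INDEX = {c: i for i, c in enumerate(ALPHABET)}   # symbol -> row/column number

def build_table(text):
    """Return a 27 x 27 list of lists; table[i][j] counts symbol j after symbol i."""
    table = [[0] * len(ALPHABET) for _ in ALPHABET]
    lowered = text.lower()
    for prev, curr in zip(lowered, lowered[1:]):
        # Skip any pair that contains a character outside the alphabet.
        if prev in INDEX and curr in INDEX:
            table[INDEX[prev]][INDEX[curr]] += 1
    return table

def next_char(table, prev):
    """Sample the next symbol, weighted by the row belonging to prev."""
    row = table[INDEX[prev]]
    if sum(row) == 0:
        # Assumption: with no observed successor, fall back to a uniform pick.
        return random.choice(ALPHABET)
    return random.choices(ALPHABET, weights=row, k=1)[0]

table = build_table("The quick brown fox")
print(next_char(table, "q"))
```

Each row is one of the 27 conditional-frequency arrays, and repeatedly calling `next_char` on its own output is a walk along the Markov chain.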

Here is some dictionary-based code to get you started:

from collections import defaultdict, Counter
from random import choice, randrange

def pairwise(iterable):
    """Yield overlapping pairs: (s0, s1), (s1, s2), ..."""
    it = iter(iterable)
    try:
        last = next(it)
    except StopIteration:
        return  # empty input: no pairs to yield
    for curr in it:
        yield last, curr
        last = curr

valid = set('abcdefghijklmnopqrstuvwxyz ')

def valid_pair(pair):
    # Keep a pair only if both characters belong to the 27-symbol alphabet.
    last, curr = pair
    return last in valid and curr in valid

def make_markov(text):
    # markov[p][q] counts how often character q directly follows character p.
    markov = defaultdict(Counter)
    lowercased = (c.lower() for c in text)
    for p, q in filter(valid_pair, pairwise(lowercased)):
        markov[p][q] += 1
    return markov

def genrandom(model, n):
    curr = choice(list(model))
    for i in range(n):
        yield curr
        if curr not in model:   # handle case where there is no known successor
            curr = choice(list(model))
        d = model[curr]
        # Pick a successor with probability proportional to its count.
        target = randrange(sum(d.values()))
        cumulative = 0
        for curr, cnt in d.items():
            cumulative += cnt
            if cumulative > target:
                break

model = make_markov('The qui_.ck brown fox')
print(''.join(genrandom(model, 20)))
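The question asks about a whole book in a .txt file, so the following sketch shows one way the same approach could be fed from a file. It is an illustration under stated assumptions, not the answer's own code: `model_from_file` and `generate` are hypothetical names, the file is assumed to be UTF-8 text with at least one valid letter pair, line breaks are treated as spaces, and `random.choices` (Python 3.6+) replaces the manual cumulative-sum loop.

```python
import random
from collections import Counter, defaultdict

VALID = set("abcdefghijklmnopqrstuvwxyz ")

def model_from_file(path, encoding="utf-8"):
    """Build {prev_char: Counter({next_char: count})} by streaming the file."""
    model = defaultdict(Counter)
    prev = None
    with open(path, encoding=encoding) as f:
        for line in f:
            for ch in line.lower():
                if ch == "\n":
                    ch = " "  # assumption: a line break separates words like a space
                if prev in VALID and ch in VALID:
                    model[prev][ch] += 1
                prev = ch
    return model

def generate(model, n):
    """Return n characters; each one is drawn from the counts of its predecessor."""
    curr = random.choice(list(model))
    out = []
    for _ in range(n):
        out.append(curr)
        successors = model.get(curr)
        if not successors:
            # Dead end (character never seen with a successor): restart anywhere.
            curr = random.choice(list(model))
            continue
        curr = random.choices(list(successors), weights=list(successors.values()), k=1)[0]
    return "".join(out)
```

Usage would look like `print(generate(model_from_file("book.txt"), 200))`, where `book.txt` is a placeholder path. Streaming character by character keeps memory use independent of the size of the book.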

Regarding python - Markov chains on letter scale and random text, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/8660015/