
php - How to get all possible variations of a word with interchangeable letters?

Reposted. Author: 可可西里. Updated: 2023-11-01 12:38:34

In Arabic, a letter like "ا" (Alef) has several forms/variants:

(ا, أ, إ, آ)

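The same goes for the letter ي, which can also be written ى.

What I want to do is get all possible variations of a word that contains several أ and ي letters.

For example, the word "أين" should have all of these possible (and in most cases incorrect) variations: أين, إين, اين, آين, أىن, إىن, اىن, آىن ... etc.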
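As a rough check on how large the output gets, the number of variations is the product of the number of forms available at each position. The sketch below counts them without generating them; `countVariants` and `letterGroups` are hypothetical names used only for illustration, and the two groups are the ones listed above.

```javascript
// Hypothetical helper: counts the variations of a word without building them.
// Each letter contributes a factor equal to the size of its group (or 1).
function countVariants(word, groups) {
    return Array.from(word).reduce((total, ch) => {
        const group = groups.find(g => g.includes(ch));
        return total * (group ? group.length : 1);
    }, 1);
}

const letterGroups = [
    ['ا', 'إ', 'أ', 'آ'], // Alef forms
    ['ي', 'ى'],           // Yeh forms
];

console.log(countVariants('أين', letterGroups));    // 4 * 2 * 1 = 8
console.log(countVariants('أيامنا', letterGroups)); // 4 * 2 * 4 * 1 * 1 * 4 = 128
```

The count grows multiplicatively with every interchangeable letter, so longer words can produce a large list.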

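Why? I'm building a small text-correction system that handles grammatical errors and replaces wrong words with the correct ones.

I've been trying to do this in the cleanest way possible, but I ended up with 8 for/foreach loops just to handle "أ".

There must be a better, cleaner way to do this! Any ideas?

Here is my code so far: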

    $alefVariations = ['ا', 'إ', 'أ', 'آ'];
    $word = 'أيامنا';

    // Break the word into letters (the "u" flag makes the split UTF-8 aware;
    // -1 means "no limit", since passing null is deprecated as of PHP 8.1)
    $wordLetters = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);
    $wordAlefLettersIndexes = [];

    // Collect the indexes of the Alef letters
    for ($letterIndex = 0; $letterIndex < count($wordLetters); $letterIndex++) {
        if (in_array($wordLetters[$letterIndex], $alefVariations)) {
            $wordAlefLettersIndexes[] = $letterIndex;
        }
    }

    // For each Alef position, build one copy of the word per Alef form
    $eachLetterVariations = [];
    foreach ($wordAlefLettersIndexes as $alefLettersIndex) {
        foreach ($alefVariations as $alefVariation) {
            $wordCopy = $wordLetters;
            $wordCopy[$alefLettersIndex] = $alefVariation;

            $eachLetterVariations[$alefLettersIndex][] = $wordCopy;
        }
    }

    // Combine each position's copies with every form at each other position
    $variations = [];
    foreach ($wordAlefLettersIndexes as $alefLettersIndex) {
        $alefWordVariations = $eachLetterVariations[$alefLettersIndex];

        foreach ($wordAlefLettersIndexes as $alefLettersIndex_inner) {
            if ($alefLettersIndex == $alefLettersIndex_inner) continue;

            foreach ($alefWordVariations as $alefWordVariation) {
                foreach ($alefVariations as $alefVariation) {
                    $alefWordVariationCopy = $alefWordVariation;
                    $alefWordVariationCopy[$alefLettersIndex_inner] = $alefVariation;

                    $variations[] = $alefWordVariationCopy;
                }
            }
        }
    }

    // Join the letters back into words
    $finalList = [];
    foreach ($variations as $variation) {
        $finalList[] = implode('', $variation);
    }

    return array_unique($finalList);

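Best answer

I don't think this is the right way to do autocorrection, but here is a general solution to the problem you asked about. It uses recursion and is written in JavaScript (I don't know PHP).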

    function solve(word, sameLetters, customIndices = []) {
        const splitLetters = Array.from(word)
            .map((char, index) => { // check whether the current letter belongs to any variation group
                if (customIndices.length === 0 || customIndices.includes(index)) {
                    const variations = sameLetters.find(arr => arr.includes(char));
                    if (variations !== undefined) return variations;
                }
                return [char];
            });

        // at this point splitLetters looks like this:
        // [["ا","إ","أ","آ"],["ي","ى"],["ا"],["م"],["ن"],["ا"]]
        const res = [];
        recurse(splitLetters, 0, '', res); // generates every combination
        return res;
    }

    function recurse(letters, index, cur, res) {
        if (index === letters.length) {
            res.push(cur);
        } else {
            for (const letter of letters[index]) {
                recurse(letters, index + 1, cur + letter, res);
            }
        }
    }

    const sameLetters = [ // the groups of interchangeable letters to enumerate
        ['ا', 'إ', 'أ', 'آ'],
        ['ي', 'ى']
    ];

    const word = 'أيامنا';
    const customIndices = [0, 1]; // only vary the letters at these indices; leave it empty to vary all indices

    const ans = solve(word, sameLetters, customIndices);
    console.log(ans);
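
The same list can probably also be built without recursion, by treating the word as a Cartesian product of the options at each position. The following is only a sketch under that assumption, not part of the original answer; `expandVariants` and `groups` are hypothetical names, and it always varies every position (it has no equivalent of `customIndices`).

```javascript
// Hypothetical helper (not from the original answer): builds all variations
// iteratively as a Cartesian product of the per-position options.
function expandVariants(word, groups) {
    // For each letter, use its group if it has one, otherwise just the letter itself.
    const options = Array.from(word).map(
        ch => groups.find(g => g.includes(ch)) || [ch]
    );
    // Extend every prefix built so far with every option for the next position.
    return options.reduce(
        (prefixes, opts) => prefixes.flatMap(p => opts.map(o => p + o)),
        ['']
    );
}

const groups = [
    ['ا', 'إ', 'أ', 'آ'], // Alef forms
    ['ي', 'ى'],           // Yeh forms
];

const variants = expandVariants('أين', groups);
console.log(variants.length); // 4 * 2 * 1 = 8
console.log(variants);
```

Because each group lists every form exactly once, the result contains no duplicates, so no extra de-duplication step should be needed.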

Regarding "php - How to get all possible variations of a word with interchangeable letters?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50210473/
