
perl - Using "sort" on utf8 strings in Perl

Reposted. Author: 行者123. Updated: 2023-12-01 15:08:48

I am trying to work out how to sort an array alphabetically in Perl. Here is what I have, which works fine for English:

# List of countries (kept like this to keep it clean, as it's re-used in other places)
my $countries = {
'AT' => "íAustria",
'AU' => "Australia",
'BE' => "Belgium",
'BG' => "Bulgaria",
'CA' => "Canada",
'CY' => "Cyprus",
'CZ' => "Czech Republic",
'DK' => "Denmark",
'EN' => "England",
'EE' => "Estonia",
'FI' => "Finland",
'FR' => "France",
'DE' => "Germany",
'GB' => "Great Britain",
'GR' => "Greece",
'HU' => "Hungary",
'IE' => "Ireland",
'IT' => "Italy",
'LV' => "Latvia",
'LT' => "Lithuania",
'LU' => "Luxembourg",
'MT' => "Malta",
'NZ' => "New Zealand",
'NL' => "Netherlands",
'PL' => "Poland",
'PT' => "Portugal",
'RO' => "Romania",
'SK' => "Slovakia",
'SI' => "Slovenia",
'ES' => "Spain",
'SE' => "Sweden",
'CH' => "Switzerland",
'SC' => "Scotland",
'UK' => "United Kingdom",
'US' => "USA",
'TK' => "Turkey",
'NO' => "Norway",
'MX' => "Mexico",
'IL' => "Israel",
'IN' => "India",
'IS' => "Iceland",
'CN' => "China",
'JP' => "Japan",
'VN' => "áVietnamí"
};
# Populate the original loop with "name" and "code"
my @country_loop_orig;
print $IN->header;
foreach (keys %{$countries}) {
    push @country_loop_orig, {
        name => $countries->{$_},   # $countries is a flat code => name hash
        code => $_
    };
}

# sort it alphabetically
my @country_loop = sort { lc($a->{name}) cmp lc($b->{name}) } @country_loop_orig;

This works for the English version:

Australia
Austria
Belgium
Bulgaria
Canada
China
Cyprus
Czech Republic
Denmark
England
Estonia
Finland
France
Germany
Great Britain
Greece
Hungary
Iceland
India
Ireland
Israel
Italy
Japan
Latvia
Lithuania
Luxembourg
Malta
Mexico
Netherlands
New Zealand
Norway
Poland
Portugal
Romania
Scotland
Slovakia
Slovenia
Spain
Sweden
Switzerland
Turkey
United Kingdom
USA
Vietnam

...but when you try it with utf8 characters such as íéó, it doesn't work:

Australia
Belgium
Bulgaria
Canada
China
Cyprus
Czech Republic
Denmark
England
Estonia
Finland
France
Germany
Great Britain
Greece
Hungary
Iceland
India
Ireland
Israel
Italy
Japan
Latvia
Lithuania
Luxembourg
Malta
Mexico
Netherlands
New Zealand
Norway
Poland
Portugal
Romania
Scotland
Slovakia
Slovenia
Spain
Sweden
Switzerland
Turkey
United Kingdom
USA
áVietnam
íAustria

How do you do this? I found Sort::Naturally::XS, but couldn't get it to work.

Best answer

Unicode::Collate should help with this.

A simple example that sorts the last list:

use warnings;
use strict;
use feature 'say';

use Unicode::Collate;

use open ":std", ":encoding(UTF-8)";

open my $fh, '<', "country_list.txt"
    or die "Can't open country_list.txt: $!";
my @list = <$fh>;
chomp @list;

my $uc = Unicode::Collate->new();
my @sorted = $uc->sort(@list);

say for @sorted;

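However, in some languages non-ASCII characters may have a very specific accepted position in the ordering, and the question gives no details about that. In that case Unicode::Collate::Locale may help.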
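As a rough illustration of how a locale can change the result, here is a minimal sketch that compares the default collation with a Swedish one. The word list and the choice of the 'sv' locale are assumptions made for this example and are not from the question; Swedish is used only because it treats "ä" as a separate letter that sorts after "z".

```perl
use strict;
use warnings;
use utf8;                               # the string literals below contain UTF-8
use feature 'say';
use open ':std', ':encoding(UTF-8)';    # encode output as UTF-8

use Unicode::Collate;
use Unicode::Collate::Locale;

# Hypothetical sample words, chosen only to show the difference
my @words = ('zebra', 'äpple', 'apa');

# Default Unicode collation: "ä" is ordered together with "a"
my @default = Unicode::Collate->new->sort(@words);

# Swedish tailoring: "ä" is its own letter, placed after "z"
my @swedish = Unicode::Collate::Locale->new(locale => 'sv')->sort(@words);

say "default: @default";
say "sv:      @swedish";
```

Under these assumptions the default collator should give apa, äpple, zebra, while the Swedish one should give apa, zebra, äpple. Which locale is appropriate depends on the language of the data.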

See (and study) this perl.com article, this post (T. Christiansen), and this Effective Perler article.


If the data to be sorted is in a complex data structure, the cmp method can be used to compare individual items:

my @sorted = sort { $uc->cmp($a, $b) } @list;

In place of $a and $b, you would extract whatever needs to be compared from the complex data structure.
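Applied to the array of hashes from the question, this might look like the sketch below. The four sample records are a shortened, hypothetical stand-in for @country_loop_orig; only the name field is passed to the collator.

```perl
use strict;
use warnings;
use utf8;                               # the string literals below contain UTF-8
use feature 'say';
use open ':std', ':encoding(UTF-8)';    # encode output as UTF-8

use Unicode::Collate;

# Hypothetical sample data, shaped like @country_loop_orig in the question
my @country_loop_orig = (
    { code => 'VN', name => 'áVietnam'  },
    { code => 'BE', name => 'Belgium'   },
    { code => 'AT', name => 'íAustria'  },
    { code => 'AU', name => 'Australia' },
);

my $uc = Unicode::Collate->new();

# Compare the "name" field of each record with the collator's cmp method
my @country_loop =
    sort { $uc->cmp($a->{name}, $b->{name}) } @country_loop_orig;

say "$_->{code}  $_->{name}" for @country_loop;
```

With the default collation the accented names should land next to their unaccented neighbours (Australia, áVietnam, Belgium, íAustria) instead of at the end of the list. The lc() calls from the question are dropped because the default collation puts case differences after letter and accent differences.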

Regarding perl - Using "sort" on utf8 strings in Perl, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46617113/
