
perl - How to use Unicode in Perl CGI parameters

Reposted. Author: 行者123. Updated: 2023-12-05 03:15:09

I have a Perl CGI script that accepts Unicode characters in one of its parameters.
The URL has the form

.../worker.pl?text="some_unicode_chars"&...

In the Perl script, I pass the $text variable to a shell script:

system "a.sh \"$text\" out_put_file"; 

This works fine when I hard-code the text in the Perl script. However, when $text comes from the web through CGI, the output makes no sense.

my $q = CGI->new;  
my $text = $q->param('text');

I suspect the encoding is causing the problem. UTF-8 has given me a lot of trouble. Can anyone help?

Best answer

Maybe this will help. From Perl Programming/Unicode UTF-8:

By default, CGI.pm does not decode your form parameters. You can use the -utf8 pragma, which will treat (and decode) all parameters as UTF-8 strings, but this will fail if you have any binary file upload fields. A better solution involves overriding the param method: (example follows)

[Wrong - see correction] Here is the documentation for the utf-8 pragma. Since uploading binary data does not seem to be a concern for you, using the utf-8 pragma looks like the most direct approach.

Correction: per @Slaven's comment, do not confuse the general Perl utf8 pragma with the -utf8 pragma that is defined for use with CGI.pm:

-utf8

This makes CGI.pm treat all parameters as UTF-8 strings. Use this with care, as it will interfere with the processing of binary uploads. It is better to manually select which fields are expected to return utf-8 strings and convert them using code like this:

use Encode;
my $arg = decode utf8=>param('foo');
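
As a minimal sketch of the per-field approach quoted above, the following decodes one value by hand and encodes it again. To stay self-contained it does not load CGI.pm; the hypothetical `$raw` stands in for the undecoded bytes that `param('text')` would return, and the sample word is only an illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for the raw bytes param('text') would return:
# the UTF-8 encoding of "Español" (ñ is the two bytes 0xC3 0xB1).
my $raw = "Espa\xC3\xB1ol";

# Decode the UTF-8 bytes into a Perl character string.
my $text = decode('UTF-8', $raw);

# length counts bytes before decoding and characters after it.
printf "%d bytes, %d characters\n", length($raw), length($text);

# Encode back to UTF-8 bytes before the value leaves Perl
# (a file, a socket, or an external program's arguments).
my $bytes_out = encode('UTF-8', $text);
print $bytes_out eq $raw ? "round trip ok\n" : "round trip failed\n";
```

In the question's setting, the encoded bytes are what would be handed to a.sh. The list form of `system`, for example `system('a.sh', $bytes_out, 'out_put_file')`, passes each argument directly and so avoids quoting the value for the shell; whether that fits depends on how a.sh is invoked in practice.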

Follow-up: duleshi, you asked: "But I still don't understand the difference between decode in Encode and utf8::decode. How do the Encode and utf8 modules differ?"

From the documentation of the utf8 pragma:

Note that this function does not handle arbitrary encodings. Therefore Encode is recommended for the general purposes; see also Encode.

In other words, the Encode module works with many different encodings (including UTF-8), while the utf8 functions work only with the UTF-8 encoding.
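
A small sketch of that difference, assuming input that is ISO-8859-1 rather than UTF-8 (the sample bytes are made up for illustration): `Encode::decode` takes the encoding name as an argument, while `utf8::decode` can only interpret its argument as UTF-8 and reports failure otherwise.

```perl
use strict;
use warnings;
use Encode ();

# "ñ" as the single ISO-8859-1 byte 0xF1; this is not a valid UTF-8 sequence.
my $latin1 = "Espa\xF1ol";

# Encode::decode is told which encoding the bytes are in.
my $via_encode = Encode::decode('ISO-8859-1', $latin1);
print length($via_encode), " characters via Encode\n";

# utf8::decode only understands UTF-8 and returns false for malformed input.
my $copy = $latin1;
my $ok   = utf8::decode($copy);
print $ok ? "utf8::decode succeeded\n" : "utf8::decode: not valid UTF-8\n";
```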

Here is a Perl program that demonstrates that the two ways of decoding UTF-8 are equivalent. (See also the live demo.)

#!/usr/bin/perl

use strict;
use warnings;
use utf8; # allows 'ñ' to appear in the source code

use Encode;

my $word = "Español"; # the 'ñ' is permitted because of the 'use utf8' pragma

# Convert the string to its UTF-8 equivalent.
my $utf8_word = Encode::encode("UTF-8", $word);

# Use 'utf8::decode' to convert the string back to internal form.
my $word_again_via_utf8 = $utf8_word;
utf8::decode($word_again_via_utf8); # converts in-place

# Use 'Encode::decode' to convert the string back to internal form.
my $word_again_via_Encode = Encode::decode("UTF-8", $utf8_word);

# Do the two conversion methods produce the same result?
# Prints 'Yes'.
print $word_again_via_utf8 eq $word_again_via_Encode ? "Yes\n" : "No\n";

# Do we get back the original internal string after converting both ways?
# Prints 'Yes'.
print $word eq $word_again_via_Encode ? "Yes\n" : "No\n";

Regarding perl - How to use Unicode in Perl CGI parameters, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/20424488/
