awk - gawk FS to split a record into individual characters

Reposted by: 行者123. Updated: 2023-12-04 11:38:29

If the field separator is the empty string, each character becomes a separate field:

$ echo hello | awk -F '' -v OFS=, '{$1 = NF OFS $1} 1'
5,h,e,l,l,o

However, if FS is a regular expression that can match zero times, the same behavior does not occur:

$ echo hello | awk -F ' *' -v OFS=, '{$1 = NF OFS $1} 1'
1,hello

Does anyone know why this is? I couldn't find anything about it in the gawk manual. Is FS="" just a special case?

I'm most interested in understanding why the second case does not split the record into more fields. It is as if awk were treating FS=" *" like FS=" +".

Best answer

Interesting question!

I just pulled the source code of gnu-awk 4.1.0, and I think we can find the answer in the file field.c.

line 371:
 * re_parse_field --- parse fields using a regexp.
 *
 * This is called both from get_field() and from do_split()
 * via (*parse_field)().  This variation is for when FS is a regular
 * expression -- either user-defined or because RS=="" and FS==" "
 */
static long
re_parse_field(lo...

And this line (line 425):

if (REEND(rp, scan) == RESTART(rp, scan)) {   /* null match */

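This is the case that <space>* hits in your question: the regex matches, but the match is empty. The implementation does not increment nf for such a null match, so on input without any spaces the whole line ends up as a single field. Note that this function is also used by do_split().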
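To see the practical effect, here is a quick check on a record that does contain runs of spaces. This is only a sketch and assumes `awk` is gawk; the input string is made up for illustration. Because empty matches never start a new field, " *" should split exactly where " +" does:

```shell
# Input with a real run of spaces; empty matches between letters are skipped.
echo 'a  b c' | awk -F ' *' -v OFS=, '{$1 = NF OFS $1} 1'   # 3,a,b,c
echo 'a  b c' | awk -F ' +' -v OFS=, '{$1 = NF OFS $1} 1'   # 3,a,b,c
```

Both commands report three fields, which matches the observation in the question that FS=" *" behaves like FS=" +".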

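For comparison, if FS is the null string, gawk separates each character into its own field. The gawk documentation states this clearly, and we can also see it in the code: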

line 613:
 * null_parse_field --- each character is a separate field
 *
 * This is called both from get_field() and from do_split()
 * via (*parse_field)().  This variation is for when FS is the null string.
 */
static long
null_parse_field(long up_to,

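If FS is a single character (other than a space), awk does not treat it as a regular expression. The documentation mentions this as well. Again, in the code: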

line 667:
 * sc_parse_field --- single character field separator
 *
 * This is called both from get_field() and from do_split()
 * via (*parse_field)().  This variation is for when FS is a single character
 * other than space.
 */
static long
sc_parse_field(l

If we read that function, there is no regex matching done in it.
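A quick way to observe this from the command line (a sketch, assuming gawk; the sample inputs are invented) is to use separators that would be regex metacharacters if they were interpreted as patterns. As single-character separators they are taken literally:

```shell
# "." and "|" are regex metacharacters, but as a one-character FS they split literally.
echo 'a.b.c' | awk -F '.' '{print NF}'   # 3
echo 'a|b|c' | awk -F '|' '{print NF}'   # 3
```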

In the comments of re_parse_field() and sc_parse_field(), we see that do_split also calls them. That explains why we get 1 rather than 3 from the following command:

kent$  echo "foo"|awk '{split($0,a,/ */);print length(a)}'
1
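If the goal is to get one array element per character from split(), a possible workaround (again assuming gawk) is to pass the null string as the separator instead of a regex that can match nothing, so that the null-string splitting path is used:

```shell
# Regex separator that can match empty: the whole string stays in one element.
echo "foo" | awk '{ n = split($0, a, / */); print n }'   # 1
# Null-string separator: one element per character.
echo "foo" | awk '{ n = split($0, a, ""); print n }'     # 3
```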

Note: to keep this post from getting too long, I did not paste the complete code here. The code can be found at:

http://git.savannah.gnu.org/cgit/gawk.git/

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/22044272/
