linux - Word frequency and -gt

Reposted. Author: 塔克拉玛干. Updated: 2023-11-02 23:33:57

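My code counts the frequency of every word in a file and prints the result, but I would like to know how to show only the words whose length is greater than a variable k. Here is my code: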

#!/bin/bash
if [ $# -eq 0 ]; then
    echo "you need an argument"
    exit 2
fi

echo "Insert k"
read k
for file in $@; do
    if ! [ -f $file ]; then
        echo "Not a file"
        exit 2
    fi
    sed -e 's/\s/\n/g' < $file | sort | uniq -c | sort -nr
done

File contents:

ceva
ceva
aiurea
sebi
este
cel
mai
smecher

Output:

      2 ceva
      1 smecher
      1 sebi
      1 mai
      1 este
      1 cel
      1 aiurea

Best answer

Use awk to count the frequency of only those words whose length is greater than a variable:

awk -v k=3 'length() > k { freq[$0]++} END{for (i in freq) print freq[i], i}' file |
sort -rn

2 ceva
1 smecher
1 sebi
1 este
1 aiurea
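
The one-liner above keys on $0, so it treats each whole line as one word. That matches the sample file, but the question's sed step splits on whitespace, so a line may hold several words. A minimal sketch of a per-field variant follows, assuming whitespace-separated words; the function name count_long_words and the variable min_len are hypothetical and not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper: count words longer than min_len characters, where a
# "word" is any whitespace-separated field rather than a whole line.
# Usage: count_long_words MIN_LEN [FILE...]   (reads stdin if no FILE is given)
count_long_words() {
    min_len=$1
    shift
    awk -v k="$min_len" '
        # Visit every field on the line and count those longer than k.
        { for (i = 1; i <= NF; i++) if (length($i) > k) freq[$i]++ }
        # Print "count word" pairs; awk does not guarantee the loop order.
        END { for (w in freq) print freq[w], w }
    ' "$@" |
    # Sort by count (descending), then by word so ties come out in a stable order.
    sort -k1,1nr -k2
}
```

For example, printf 'ceva ceva aiurea\nsebi este cel mai smecher\n' | count_long_words 3 lists "2 ceva" first, followed by the four remaining words of length greater than 3 with a count of 1 each.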

Full script:

#!/usr/bin/env bash
if [[ $# -eq 0 ]]; then
    echo "you need an argument"
    exit 2
fi

# -r keeps backslashes in the input literal
read -r -p "Insert k: " k

for file in "$@"; do
    if [[ ! -f $file ]]; then
        echo "$file is not a file"
        exit 2
    fi

    echo "$file:"
    # Quote $k so an empty or multi-word answer cannot break the awk command line
    awk -v k="$k" 'length()>k{freq[$0]++} END{for (i in freq) print freq[i], i}' "$file" | sort -rn
done
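
For comparison, the -gt test mentioned in the title can do the filtering inside the question's original sort | uniq -c pipeline. The sketch below is one way to do it, not the accepted answer's method: it assumes k is a non-negative integer, and the function name filter_long_words is hypothetical. It uses tr instead of the question's sed only to collapse runs of whitespace into single newlines.

```shell
#!/bin/sh
# Hypothetical helper: print only the words of FILE that are longer than K.
# Usage: filter_long_words K FILE
filter_long_words() {
    # Put one word per line; -s squeezes repeated newlines into one.
    tr -s '[:space:]' '\n' < "$2" |
    # The "|| [ -n ... ]" part keeps a final word that has no trailing newline.
    while IFS= read -r word || [ -n "$word" ]; do
        # ${#word} expands to the length of the word; -gt compares integers.
        if [ "${#word}" -gt "$1" ]; then
            printf '%s\n' "$word"
        fi
    done
}
```

The filtered words can then go through the same counting steps as in the question, for example: filter_long_words "$k" "$file" | sort | uniq -c | sort -nr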

Regarding linux - Word frequency and -gt, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43481413/
