
linux - Print the differences of file1 to file2 without deleting anything from file2

Reposted. Author: 太空宇宙. Updated: 2023-11-04 09:13:18

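I am writing a script that searches a .csv log file for IPs that appear on a predefined list of blacklisted IPs.

It first imports the log file, then parses the IPs from it, then searches the parsed IPs against the predefined blacklist, and finally (if any results were found) it needs to ask the user whether to save the results to the original log file that was imported.

File 1 corresponds to IP-output.csv in the code.

File 2 corresponds to $filename in the code (the originally imported .csv).

File 1: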

107.147.166.60 ,SUSPICIOUS IP
107.147.167.26 ,SUSPICIOUS IP
108.48.185.186 ,SUSPICIOUS IP
108.51.114.130 ,SUSPICIOUS IP
142.255.102.68 ,SUSPICIOUS IP

File 2:

outlook.office365.com ,174.203.0.118 ,UserLoginFailed
outlook.office365.com ,107.147.166.60 ,UserLoginFailed
outlook.office365.com ,107.147.167.26 ,UserLoginFailed
outlook.office365.com ,174.205.17.24 ,UserLoginFailed
outlook.office365.com ,108.48.185.186 ,UserLoginFailed
outlook.office365.com ,174.226.15.21 ,UserLoginFailed
outlook.office365.com ,108.51.114.130 ,UserLoginFailed
outlook.office365.com ,67.180.23.93 ,UserLoginFailed
outlook.office365.com ,142.255.102.68 ,UserLoginFailed
outlook.office365.com ,164.106.75.235 ,UserLoginFailed

I want to change file 2 to:

outlook.office365.com ,174.203.0.118 ,UserLoginFailed
outlook.office365.com ,107.147.166.60 ,UserLoginFailed ,SUSPICIOUS IP
outlook.office365.com ,107.147.167.26 ,UserLoginFailed ,SUSPICIOUS IP
outlook.office365.com ,174.205.17.24 ,UserLoginFailed
outlook.office365.com ,108.48.185.186 ,UserLoginFailed ,SUSPICIOUS IP
outlook.office365.com ,174.226.15.21 ,UserLoginFailed
outlook.office365.com ,108.51.114.130 ,UserLoginFailed ,SUSPICIOUS IP
outlook.office365.com ,67.180.23.93 ,UserLoginFailed
outlook.office365.com ,142.255.102.68 ,UserLoginFailed ,SUSPICIOUS IP
outlook.office365.com ,164.106.75.235 ,UserLoginFailed

This is the script I wrote:

#!/bin/bash
#
# IP Blacklist Checker
#Import .csv (File within working directory)
echo "Please import a .csv log file to parse/search the IP(s) and UserAgents: "
read filename
#Parsing IPs from .csv log file
echo "Parsing IP(s) from imported log file..."
grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' $filename | sort | uniq > IP-list.txt
echo 'Done'
awk 'END {print NR,"IP(s) Found in imported log file"}' IP-list.txt
echo 'IPs found in imported log file:'
cat IP-list.txt
#searches parsed ip's against blacked ip lists
echo 'Searching parsed IP(s) from pre-defined Blacked IP List Databases...'
fgrep -w -f "IP-list.txt" "IPlist.txt" > IP-output.txt
awk 'END {print NR,"IP(s) Found Blacked IP List Databases"}' IP-output.txt
echo 'Suspicious IPs found in Blacked IP List Databases:'
cat IP-output.txt
while true; do
read -p "Do you want to add results to log file?" yn
case $yn in
[Yy]* ) grep -Ff IP-output.txt $filename | sed 's/$/ ,SUSPICIOUS IP/' > IP-output.csv && awk 'FNR==NR {m[$1]=$0; next} {for (i in m) {match($0,i); val=substr($0, RSTART, RLENGTH); if (val) {sub(val, m[i]); print; next}};} 1' IP-output.csv $filename > $filename; break;;
[Nn]* ) break;;
* ) echo "Please answer yes or no.";;
esac
done
echo "Finished searching parsed IP(s) from pre-defined Blacked IP List Databases."
rm IP-list.txt IP-output.csv IP-output.txt

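The log file I am importing is very long, with 15-20 columns, and IPlist.txt (the blacklisted IPs) contains more than 15,000 IPs. After saving the results to the same log file, the .csv file ends up empty. If I save it under a different name, the columns get out of order: the ",SUSPICIOUS IP" column appears right next to the IP column, whereas I need it as the last column (at the end of the line).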
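The empty file is most likely explained by the `> $filename` redirection at the end of the awk command: the shell truncates the redirection target before awk starts, so awk reads an already empty file. A common workaround, sketched here under the assumption that a temporary file next to the original is acceptable, is to write to a temporary file and rename it over the original only if the command succeeds. The function name `rewrite_in_place` is hypothetical and not part of the script above.

```shell
#!/bin/bash
# Hypothetical helper: run a filter command over a file and replace the file
# with the filter output, without truncating the input first.
# Usage: rewrite_in_place FILE COMMAND [ARGS...]
# Example: rewrite_in_place "$filename" sed 's/$/ ,SUSPICIOUS IP/'
rewrite_in_place() {
    local file=$1
    shift
    local tmp
    # Create the temporary file next to the original so that mv is a rename.
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    # The command reads the original on stdin and writes to the temp file.
    if "$@" < "$file" > "$tmp"; then
        mv -- "$tmp" "$file"
    else
        rm -f -- "$tmp"
        return 1
    fi
}
```

Note that the replaced file takes on the permissions of the temporary file created by `mktemp`, which may be more restrictive than those of the original.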

I also don't know how to prompt to save the file only when something was found, and otherwise just report that nothing was found.
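One way to get this behaviour is the shell's `-s` test, which is true only when a file exists and is not empty. The sketch below assumes the results are in a file such as `IP-output.txt` from the script above; the function name `confirm_if_found` and the "No suspicious IPs found." message are illustrative assumptions.

```shell
#!/bin/bash
# Hypothetical helper: ask the user only when the results file is non-empty.
# Returns 0 if the user answers yes, 1 if nothing was found or the answer is no.
confirm_if_found() {
    local results=$1 yn
    # -s is true only if the file exists and has a size greater than zero.
    if [ ! -s "$results" ]; then
        echo "No suspicious IPs found."
        return 1
    fi
    while true; do
        # Give up if input ends before a yes/no answer is read.
        read -r -p "Do you want to add results to log file? " yn || return 1
        case $yn in
            [Yy]*) return 0 ;;
            [Nn]*) return 1 ;;
            *) echo "Please answer yes or no." ;;
        esac
    done
}
```

It could then guard the save step, for example `if confirm_if_found IP-output.txt; then ...; fi`.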

The result I get:

outlook.office365.com ,174.203.0.118 ,UserLoginFailed
outlook.office365.com ,107.147.166.60 ,SUSPICIOUS IP ,UserLoginFailed
outlook.office365.com ,107.147.167.26 ,SUSPICIOUS IP ,UserLoginFailed
outlook.office365.com ,174.205.17.24 ,UserLoginFailed
outlook.office365.com ,108.48.185.186 ,SUSPICIOUS IP ,UserLoginFailed
outlook.office365.com ,174.226.15.21 ,UserLoginFailed
outlook.office365.com ,108.51.114.130 ,SUSPICIOUS IP ,UserLoginFailed
outlook.office365.com ,67.180.23.93 ,UserLoginFailed
outlook.office365.com ,142.255.102.68 ,SUSPICIOUS IP ,UserLoginFailed
outlook.office365.com ,164.106.75.235 ,UserLoginFailed

Best answer

Do you mean something like this:

awk 'FNR==NR { m[$1]=$0; next; } { for (i in m) { idx = index($0, i); if (idx > 0) { print substr($0, 1, idx-1) m[i]; next; } } } 1' file1.txt file2.txt > newfile2.txt

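It basically processes file1.txt and file2.txt in order. FNR==NR is true for every line of the first file, where the map m is built with the replacement patterns (everything before the first whitespace is mapped to the whole line). For the second file, each line is checked for a match against the keys in m. If there is a match (found with index()), the script prints everything before the match, followed by the value from m. Oh, and the final 1 prints the non-matching lines of file2.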
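As written, the command prints only the text before the match followed by the value from m, so anything after the IP on a matching line is not printed. index() also matches an IP that merely occurs inside a longer one (for example 7.180.23.93 inside 67.180.23.93). A variant that keeps each line intact and appends the flag at the end of the line could look like the sketch below. It assumes fields are separated by commas with optional surrounding spaces, as in the sample data; the function name `flag_suspicious` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print every line of the log file (second argument),
# appending " ,SUSPICIOUS IP" to lines that contain an IP from the list
# file (first argument) as a whole comma-separated field.
# Usage: flag_suspicious LIST_FILE LOG_FILE > NEW_LOG_FILE
flag_suspicious() {
    awk -F',' '
        # Strip leading and trailing blanks from a field.
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }

        # List file: remember the first field (the IP) of each line.
        # Comparing FILENAME with ARGV[1] also works when the list is empty.
        FILENAME == ARGV[1] {
            ip = trim($1)
            if (ip != "") bad[ip] = 1
            next
        }

        # Log file: flag the line if any whole field is a listed IP.
        {
            flagged = 0
            for (i = 1; i <= NF; i++) {
                field = trim($i)
                if (field in bad) { flagged = 1; break }
            }
            if (flagged) print $0 " ,SUSPICIOUS IP"
            else print
        }
    ' "$1" "$2"
}
```

Each field is looked up in the array directly, instead of looping over every listed IP for every line, which should matter with a list of more than 15,000 IPs. Redirect the output to a new file rather than to the log file itself, since the shell truncates a redirection target before awk reads it.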

Regarding "linux - Print the differences of file1 to file2 without deleting anything from file2", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/52019512/
