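I am trying to extract the source and destination MAC and IP addresses, along with the number of packets transferred, from the output of the command "ovs dump-flows". The output of the command looks like this: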
in_port(2),eth(src=00:26:55:e8:b0:43,dst=bc:30:5b:f7:07:fc),eth_type(0x0806),arp(sip=193.170.192.129,tip=193.170.192.142,op=2,sha=00:26:55:e8:b0:43,tha=bc:30:5b:f7:07:fc), packets:0, bytes:0, used:never, actions:1
in_port(2),eth(src=bc:30:5b:f6:dd:fc,dst=bc:30:5b:f7:07:fc),eth_type(0x0800),ipv4(src=193.170.192.143,dst=193.170.192.142,proto=6,tos=0,ttl=64,frag=no),tcp(src=45969,dst=5672), packets:1, bytes:87, used:4.040s, flags:P., actions:1
in_port(2),eth(src=bc:30:5b:f6:dd:fc,dst=bc:30:5b:f7:07:fc),eth_type(0x0800),ipv4(src=193.170.192.143,dst=193.170.192.142,proto=6,tos=0,ttl=64,frag=no),tcp(src=45992,dst=5672), packets:118412, bytes:21787661, used:2.168s, flags:P., actions:1
in_port(2),eth(src=00:18:6e:3a:aa:e8,dst=01:80:c2:00:00:00), packets:29131, bytes:1864384, used:1.200s, actions:drop
The code is:
from pyparsing import *
import datetime,time
import os
f = os.popen('ovs-dpctl dump-flows ovs-system')
flows = f.read()
print("Flows are ", flows)
LBRACE,RBRACE,COMMA,EQUAL,COLON = map(Suppress,'(),=:')
in_port = packets = proto = tos = ttl = src = dst = op = Word(nums)
ipAddress = Combine(Word(nums) + ('.' + Word(nums))*3)
twohex = Word(hexnums,exact=2)
macAddress = Combine(twohex + (':'+twohex)*5)
eth_type = Combine('0x' + Word(hexnums,exact=4))
frag = Word
flowTcp = "in_port" + LBRACE + in_port("in_port") + RBRACE + COMMA + "eth" + LBRACE + "src" + EQUAL + macAddress("src") + COMMA + "dst" + EQUAL + macAddress("dst") + RBRACE + COMMA + "eth_type" + LBRACE + eth_type("eth_type") + RBRACE + COMMA + "ipv4" + LBRACE + "src" + EQUAL + ipAddress("src") + COMMA + "dst" + EQUAL + ipAddress("dst") + COMMA + "proto" + EQUAL + proto("proto") + COMMA + "tos" + EQUAL + tos("tos") + COMMA + "ttl" + EQUAL + ttl("ttl") + COMMA + "frag" + EQUAL + frag("frag") + RBRACE + COMMA + "tcp" + LBRACE + "src" + EQUAL + src("srcPkt") + COMMA + "dst" + EQUAL + dst("dstPkt") + RBRACE + "packets" + COLON + packets("packets")
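The MAC addresses, the IP addresses, and the packet fields all use the same names, "src" and "dst". Because of these repeated names, I am unable to parse and extract the data I need. Please suggest how to do this.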
First, I had to reformat your code so that I could more easily see the structure in the parser:
flowTcp = ("in_port" + LBRACE + in_port("in_port") + RBRACE + COMMA +
           "eth" + LBRACE + "src" + EQUAL + macAddress("src") + COMMA +
           "dst" + EQUAL + macAddress("dst") + RBRACE + COMMA +
           "eth_type" + LBRACE + eth_type("eth_type") + RBRACE + COMMA +
           "ipv4" + LBRACE + "src" + EQUAL + ipAddress("src") + COMMA +
           "dst" + EQUAL + ipAddress("dst") + COMMA +
           "proto" + EQUAL + proto("proto") + COMMA +
           "tos" + EQUAL + tos("tos") + COMMA +
           "ttl" + EQUAL + ttl("ttl") + COMMA +
           "frag" + EQUAL + frag("frag") + RBRACE + COMMA +
           "tcp" + LBRACE +
               "src" + EQUAL + src("srcPkt") + COMMA +
               "dst" + EQUAL + dst("dstPkt") +
           RBRACE +
           "packets" + COLON + packets("packets"))
Then, to parse the examples you posted, I had to make some of these structures optional and add the missing "eth" and "arp" fields (and fix your definition of frag):
frag = oneOf("yes no")
flowTcp = ("in_port" + LBRACE + in_port("in_port") + RBRACE + COMMA +
           "eth" + LBRACE +
               "src" + EQUAL + macAddress("src") + COMMA +
               "dst" + EQUAL + macAddress("dst") +
           RBRACE + COMMA +
           Optional("eth_type" + LBRACE + eth_type("eth_type") + RBRACE + COMMA) +
           Optional("arp" + LBRACE +
                        "sip" + EQUAL + ipAddress("sip") + COMMA +
                        "tip" + EQUAL + ipAddress("tip") + COMMA +
                        "op" + EQUAL + op("op") + COMMA +
                        "sha" + EQUAL + macAddress("sha") + COMMA +
                        "tha" + EQUAL + macAddress("tha") +
                    RBRACE + COMMA) +
           Optional("ipv4" + LBRACE +
                        "src" + EQUAL + ipAddress("src") + COMMA +
                        "dst" + EQUAL + ipAddress("dst") + COMMA +
                        "proto" + EQUAL + proto("proto") + COMMA +
                        "tos" + EQUAL + tos("tos") + COMMA +
                        "ttl" + EQUAL + ttl("ttl") + COMMA +
                        "frag" + EQUAL + frag("frag") +
                    RBRACE + COMMA) +
           # the trailing COMMA is needed because "tcp(...)" is followed by ", packets:"
           Optional("tcp" + LBRACE +
                        "src" + EQUAL + src("srcPkt") + COMMA +
                        "dst" + EQUAL + dst("dstPkt") +
                    RBRACE + COMMA) +
           "packets" + COLON + packets("packets"))
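As a small illustration of what the Optional wrapping does, here is a minimal sketch on a cut-down, made-up line format (not the real dump-flows output). The names mini, with_tcp, and without_tcp are only for illustration, and the code assumes pyparsing is installed:

```python
from pyparsing import Optional, Suppress, Word, nums

LPAR, RPAR, COMMA, COLON = map(Suppress, "(),:")
number = Word(nums)

# The "tcp(...)," clause is wrapped in Optional, so lines without it still parse.
mini = ("in_port" + LPAR + number("in_port") + RPAR + COMMA +
        Optional("tcp" + LPAR + number("port") + RPAR + COMMA) +
        "packets" + COLON + number("packets"))

with_tcp = mini.parseString("in_port(2),tcp(5672), packets:1")
without_tcp = mini.parseString("in_port(2), packets:29131")

print(with_tcp.port)          # the named value is present when the clause matched
print("port" in without_tcp)  # the name is simply absent when it did not
```

Note that the separating COMMA lives inside the Optional, so it is only required when the clause itself is present.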
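At this point the parser "works", but it has the problem you asked about: you have reused some results names, such as "src" and "dst".
The obvious fix is to use distinct names, like "eth_src" and "tcp_src". Instead, I suggest using the pyparsing Group class to add structure to the parsed data. I pulled each substructure out and defined it as its own mini-parser: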
eth = Group("eth" + LBRACE +
            "src" + EQUAL + macAddress("src") + COMMA +
            "dst" + EQUAL + macAddress("dst") +
            RBRACE)
arp = Group("arp" + LBRACE +
            "sip" + EQUAL + ipAddress("sip") + COMMA +
            "tip" + EQUAL + ipAddress("tip") + COMMA +
            "op" + EQUAL + op("op") + COMMA +
            "sha" + EQUAL + macAddress("sha") + COMMA +
            "tha" + EQUAL + macAddress("tha") +
            RBRACE)
ipv4 = Group("ipv4" + LBRACE +
             "src" + EQUAL + ipAddress("src") + COMMA +
             "dst" + EQUAL + ipAddress("dst") + COMMA +
             "proto" + EQUAL + proto("proto") + COMMA +
             "tos" + EQUAL + tos("tos") + COMMA +
             "ttl" + EQUAL + ttl("ttl") + COMMA +
             "frag" + EQUAL + frag("frag") +
             RBRACE)
tcp = Group("tcp" + LBRACE +
            "src" + EQUAL + src("srcPkt") + COMMA +
            "dst" + EQUAL + dst("dstPkt") +
            RBRACE)
Then I added each one back into the main parser, giving each group a results name.
flowTcp = ("in_port" + LBRACE + in_port("in_port") + RBRACE + COMMA +
           eth("eth") + COMMA +
           Optional("eth_type" + LBRACE + eth_type("eth_type") + RBRACE + COMMA) +
           Optional(arp("arp") + COMMA) +
           Optional(ipv4("ipv4") + COMMA) +
           Optional(tcp("tcp") + COMMA) +
           "packets" + COLON + packets("packets"))
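(So I did two things here: I grouped the substructures, and then I gave them names. I could have done this inline without breaking out eth, arp, and so on, but I got lost trying to keep everything in one top-level statement.)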
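To see why the grouping matters, here is a minimal sketch (separate from the parser above) that parses a shortened, made-up line twice: once flat, where the second "src" replaces the first, and once with Group, where each "src" stays inside its own substructure. The names line, flat, and grouped are only for illustration, and the code assumes pyparsing is installed:

```python
from pyparsing import Combine, Group, Suppress, Word, hexnums, nums

LPAR, RPAR, COMMA, EQ = map(Suppress, "(),=")
twohex = Word(hexnums, exact=2)
mac = Combine(twohex + (":" + twohex) * 5)
port = Word(nums)

# Shortened sample line: only the eth and tcp parts of a flow.
line = "eth(src=00:26:55:e8:b0:43,dst=bc:30:5b:f7:07:fc),tcp(src=45969,dst=5672)"

eth_fields = ("eth" + LPAR + "src" + EQ + mac("src") + COMMA +
              "dst" + EQ + mac("dst") + RPAR)
tcp_fields = ("tcp" + LPAR + "src" + EQ + port("src") + COMMA +
              "dst" + EQ + port("dst") + RPAR)

# Flat: both "src" names land in the same namespace, so the later one wins.
flat = eth_fields + COMMA + tcp_fields
flat_result = flat.parseString(line)
print(flat_result.src)

# Grouped: each Group keeps its own names, reachable as result.eth.src etc.
grouped = Group(eth_fields)("eth") + COMMA + Group(tcp_fields)("tcp")
grouped_result = grouped.parseString(line)
print(grouped_result.eth.src)
print(grouped_result.tcp.src)
```

With the groups in place, the same field name can be reused in every substructure, which is why "src" and "dst" no longer need prefixes such as "eth_" or "tcp_".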
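Now I parsed your 4 examples and dumped out the results. The dump() method shows the structure in the parsed results, and the sample code shows how to access the substructures with ordinary attribute notation, such as flowTcpValues.eth.src.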
# data is assumed to hold the flow lines, one per item, e.g. data = flows.splitlines()
for d in data:
    print(d)
    flowTcpValues = flowTcp.parseString(d)
    print(flowTcpValues.dump())
    print(flowTcpValues.packets)
    print(flowTcpValues.eth.src)
    print(flowTcpValues.eth.dst)
    print()
Gives:
Flows are
in_port(2),eth(src=00:26:55:e8:b0:43,dst=bc:30:5b:f7:07:fc),eth_type(0x0806),arp(sip=193.170.192.129,tip=193.170.192.142,op=2,sha=00:26:55:e8:b0:43,tha=bc:30:5b:f7:07:fc), packets:0, bytes:0, used:never, actions:1
['in_port', '2', ['eth', 'src', '00:26:55:e8:b0:43', 'dst', 'bc:30:5b:f7:07:fc'], 'eth_type', '0x0806', ['arp', 'sip', '193.170.192.129', 'tip', '193.170.192.142', 'op', '2', 'sha', '00:26:55:e8:b0:43', 'tha', 'bc:30:5b:f7:07:fc'], 'packets', '0']
- arp: ['arp', 'sip', '193.170.192.129', 'tip', '193.170.192.142', 'op', '2', 'sha', '00:26:55:e8:b0:43', 'tha', 'bc:30:5b:f7:07:fc']
  - op: 2
  - sha: 00:26:55:e8:b0:43
  - sip: 193.170.192.129
  - tha: bc:30:5b:f7:07:fc
  - tip: 193.170.192.142
- eth: ['eth', 'src', '00:26:55:e8:b0:43', 'dst', 'bc:30:5b:f7:07:fc']
  - dst: bc:30:5b:f7:07:fc
  - src: 00:26:55:e8:b0:43
- eth_type: 0x0806
- in_port: 2
- packets: 0
0
00:26:55:e8:b0:43
bc:30:5b:f7:07:fc

in_port(2),eth(src=bc:30:5b:f6:dd:fc,dst=bc:30:5b:f7:07:fc),eth_type(0x0800),ipv4(src=193.170.192.143,dst=193.170.192.142,proto=6,tos=0,ttl=64,frag=no),tcp(src=45969,dst=5672), packets:1, bytes:87, used:4.040s, flags:P., actions:1
['in_port', '2', ['eth', 'src', 'bc:30:5b:f6:dd:fc', 'dst', 'bc:30:5b:f7:07:fc'], 'eth_type', '0x0800', ['ipv4', 'src', '193.170.192.143', 'dst', '193.170.192.142', 'proto', '6', 'tos', '0', 'ttl', '64', 'frag', 'no'], ['tcp', 'src', '45969', 'dst', '5672'], 'packets', '1']
- eth: ['eth', 'src', 'bc:30:5b:f6:dd:fc', 'dst', 'bc:30:5b:f7:07:fc']
  - dst: bc:30:5b:f7:07:fc
  - src: bc:30:5b:f6:dd:fc
- eth_type: 0x0800
- in_port: 2
- ipv4: ['ipv4', 'src', '193.170.192.143', 'dst', '193.170.192.142', 'proto', '6', 'tos', '0', 'ttl', '64', 'frag', 'no']
  - dst: 193.170.192.142
  - frag: no
  - proto: 6
  - src: 193.170.192.143
  - tos: 0
  - ttl: 64
- packets: 1
- tcp: ['tcp', 'src', '45969', 'dst', '5672']
  - dstPkt: 5672
  - srcPkt: 45969
1
bc:30:5b:f6:dd:fc
bc:30:5b:f7:07:fc

in_port(2),eth(src=bc:30:5b:f6:dd:fc,dst=bc:30:5b:f7:07:fc),eth_type(0x0800),ipv4(src=193.170.192.143,dst=193.170.192.142,proto=6,tos=0,ttl=64,frag=no),tcp(src=45992,dst=5672), packets:118412, bytes:21787661, used:2.168s, flags:P., actions:1
['in_port', '2', ['eth', 'src', 'bc:30:5b:f6:dd:fc', 'dst', 'bc:30:5b:f7:07:fc'], 'eth_type', '0x0800', ['ipv4', 'src', '193.170.192.143', 'dst', '193.170.192.142', 'proto', '6', 'tos', '0', 'ttl', '64', 'frag', 'no'], ['tcp', 'src', '45992', 'dst', '5672'], 'packets', '118412']
- eth: ['eth', 'src', 'bc:30:5b:f6:dd:fc', 'dst', 'bc:30:5b:f7:07:fc']
  - dst: bc:30:5b:f7:07:fc
  - src: bc:30:5b:f6:dd:fc
- eth_type: 0x0800
- in_port: 2
- ipv4: ['ipv4', 'src', '193.170.192.143', 'dst', '193.170.192.142', 'proto', '6', 'tos', '0', 'ttl', '64', 'frag', 'no']
  - dst: 193.170.192.142
  - frag: no
  - proto: 6
  - src: 193.170.192.143
  - tos: 0
  - ttl: 64
- packets: 118412
- tcp: ['tcp', 'src', '45992', 'dst', '5672']
  - dstPkt: 5672
  - srcPkt: 45992
118412
bc:30:5b:f6:dd:fc
bc:30:5b:f7:07:fc

in_port(2),eth(src=00:18:6e:3a:aa:e8,dst=01:80:c2:00:00:00), packets:29131, bytes:1864384, used:1.200s, actions:drop
['in_port', '2', ['eth', 'src', '00:18:6e:3a:aa:e8', 'dst', '01:80:c2:00:00:00'], 'packets', '29131']
- eth: ['eth', 'src', '00:18:6e:3a:aa:e8', 'dst', '01:80:c2:00:00:00']
  - dst: 01:80:c2:00:00:00
  - src: 00:18:6e:3a:aa:e8
- in_port: 2
- packets: 29131
29131
00:18:6e:3a:aa:e8
01:80:c2:00:00:00
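Since the goal was to pull out the MAC addresses, the IP addresses, and the packet count, here is one possible way to wrap the grouped parser in a helper that returns a plain dict per flow. This is only a sketch: the summarize function and the dict keys are my own naming, not anything from pyparsing or Open vSwitch, the arp group is left out to keep it short (so ARP lines would not parse with this cut-down version), and it assumes pyparsing is installed:

```python
from pyparsing import (Combine, Group, Optional, Suppress, Word,
                       hexnums, nums, oneOf)

LPAR, RPAR, COMMA, EQ, COLON = map(Suppress, "(),=:")
number = Word(nums)
ip_address = Combine(Word(nums) + ("." + Word(nums)) * 3)
twohex = Word(hexnums, exact=2)
mac_address = Combine(twohex + (":" + twohex) * 5)
eth_type = Combine("0x" + Word(hexnums, exact=4))
frag = oneOf("yes no")

eth = Group("eth" + LPAR + "src" + EQ + mac_address("src") + COMMA +
            "dst" + EQ + mac_address("dst") + RPAR)
ipv4 = Group("ipv4" + LPAR + "src" + EQ + ip_address("src") + COMMA +
             "dst" + EQ + ip_address("dst") + COMMA +
             "proto" + EQ + number("proto") + COMMA +
             "tos" + EQ + number("tos") + COMMA +
             "ttl" + EQ + number("ttl") + COMMA +
             "frag" + EQ + frag("frag") + RPAR)
tcp = Group("tcp" + LPAR + "src" + EQ + number("srcPkt") + COMMA +
            "dst" + EQ + number("dstPkt") + RPAR)

flow = ("in_port" + LPAR + number("in_port") + RPAR + COMMA +
        eth("eth") + COMMA +
        Optional("eth_type" + LPAR + eth_type("eth_type") + RPAR + COMMA) +
        Optional(ipv4("ipv4") + COMMA) +
        Optional(tcp("tcp") + COMMA) +
        "packets" + COLON + number("packets"))


def summarize(line):
    """Return the MAC/IP addresses and packet count of one flow line as a dict."""
    result = flow.parseString(line)
    summary = {"eth_src": result.eth.src,
               "eth_dst": result.eth.dst,
               "packets": int(result.packets)}
    # The ipv4 group is optional, so only report it when it was matched.
    if "ipv4" in result:
        summary["ip_src"] = result.ipv4.src
        summary["ip_dst"] = result.ipv4.dst
    return summary


tcp_line = ("in_port(2),eth(src=bc:30:5b:f6:dd:fc,dst=bc:30:5b:f7:07:fc),"
            "eth_type(0x0800),ipv4(src=193.170.192.143,dst=193.170.192.142,"
            "proto=6,tos=0,ttl=64,frag=no),tcp(src=45969,dst=5672), "
            "packets:1, bytes:87, used:4.040s, flags:P., actions:1")
eth_only_line = ("in_port(2),eth(src=00:18:6e:3a:aa:e8,dst=01:80:c2:00:00:00), "
                 "packets:29131, bytes:1864384, used:1.200s, actions:drop")

print(summarize(tcp_line))
print(summarize(eth_only_line))
```

Testing with "ipv4" in result avoids touching a group that was never matched, so lines that carry only the eth part still produce a usable summary.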