
python - How can I capture query logs from HiveServer2 in real time with a Python client?

Reposted. Author: 可可西里. Updated: 2023-11-01 16:30:57

I am using a modified version of pyhs2 (https://pypi.python.org/pypi/pyhs2) that can run asynchronous queries and call the other TCLIService.Client methods (GetLog, send_GetLog, recv_GetLog) found in the Hue source (https://github.com/cloudera/hue/blob/master/apps/beeswax/gen-py/TCLIService/TCLIService.py#L739).

However, when I call TCLIService.Client.GetLog, I get this error:

$ python example.py
Traceback (most recent call last):
  File "example.py", line 85, in <module>
    rq = client.GetLog(lq)
  File "/Users/toly/hive_streaming/libs/pyhs4/TCLIService/TCLIService.py", line 757, in GetLog
    return self.recv_GetLog()
  File "/Users/toly/hive_streaming/libs/pyhs4/TCLIService/TCLIService.py", line 773, in recv_GetLog
    raise x
thrift.Thrift.TApplicationException: Invalid method name: 'GetLog'

The script connects to the HiveServer2 instance in the Cloudera VM. As far as I can tell, Hue uses the same server, and it works there. I also tried creating the session with client_protocol values from 0 to 7.

import time
import sasl

from thrift.protocol.TBinaryProtocol import TBinaryProtocol
from thrift.transport.TSocket import TSocket
from thrift.transport.TTransport import TBufferedTransport
from libs.pyhs4.cloudera.thrift_sasl import TSaslClientTransport


from libs.pyhs4.TCLIService import TCLIService
from libs.pyhs4.TCLIService.ttypes import TOpenSessionReq, TGetTablesReq, TFetchResultsReq,\
    TStatusCode, TGetResultSetMetadataReq, TGetColumnsReq, TType, TTypeId, \
    TExecuteStatementReq, TGetOperationStatusReq, TFetchOrientation, TCloseOperationReq, \
    TCloseSessionReq, TGetSchemasReq, TCancelOperationReq, TGetLogReq

auth = 'PLAIN'
username = 'apanin'
password = 'none'
host = 'cloudera'
port = 10000
test_hql1 = 'select count(*) from test_text'


def sasl_factory():
    saslc = sasl.Client()
    saslc.setAttr("username", username)
    saslc.setAttr("password", password)
    saslc.init()
    return saslc


def get_type(typeDesc):
    for ttype in typeDesc.types:
        if ttype.primitiveEntry is not None:
            return TTypeId._VALUES_TO_NAMES[ttype.primitiveEntry.type]
        elif ttype.mapEntry is not None:
            return ttype.mapEntry
        elif ttype.unionEntry is not None:
            return ttype.unionEntry
        elif ttype.arrayEntry is not None:
            return ttype.arrayEntry
        elif ttype.structEntry is not None:
            return ttype.structEntry
        elif ttype.userDefinedTypeEntry is not None:
            return ttype.userDefinedTypeEntry


def get_value(colValue):
    if colValue.boolVal is not None:
        return colValue.boolVal.value
    elif colValue.byteVal is not None:
        return colValue.byteVal.value
    elif colValue.i16Val is not None:
        return colValue.i16Val.value
    elif colValue.i32Val is not None:
        return colValue.i32Val.value
    elif colValue.i64Val is not None:
        return colValue.i64Val.value
    elif colValue.doubleVal is not None:
        return colValue.doubleVal.value
    elif colValue.stringVal is not None:
        return colValue.stringVal.value


sock = TSocket(host, port)
transport = TSaslClientTransport(sasl_factory, "PLAIN", sock)
client = TCLIService.Client(TBinaryProtocol(transport))
transport.open()

res = client.OpenSession(TOpenSessionReq(username=username, password=password))
session = res.sessionHandle

query1 = TExecuteStatementReq(session, statement=test_hql1, confOverlay={}, runAsync=True)
response1 = client.ExecuteStatement(query1)
opHandle1 = response1.operationHandle


while True:
    time.sleep(1)

    q1 = TGetOperationStatusReq(operationHandle=opHandle1)
    res1 = client.GetOperationStatus(q1)

    lq = TGetLogReq(opHandle1)
    rq = client.GetLog(lq)

    if res1.operationState == 2:
        break


req = TCloseOperationReq(operationHandle=opHandle1)
client.CloseOperation(req)

req = TCloseSessionReq(sessionHandle=session)
client.CloseSession(req)

How can I capture Hive query logs from HiveServer2 in real time?

UPD: Hive version is 1.2.1

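Best answer

To get the operation logs, call the FetchResults method with the parameter fetchType=1, which makes it return log lines instead of query results.

Example usage: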

query1 = TExecuteStatementReq(session, statement=test_hql1, confOverlay={}, runAsync=True)
response1 = client.ExecuteStatement(query1)
opHandle1 = response1.operationHandle

while True:
    time.sleep(1)

    q1 = TGetOperationStatusReq(operationHandle=opHandle1)
    res1 = client.GetOperationStatus(q1)

    # fetchType=1 requests the operation log; orientation=0 is FETCH_NEXT
    request_logs = TFetchResultsReq(operationHandle=opHandle1, orientation=0, maxRows=10, fetchType=1)
    response_logs = client.FetchResults(request_logs)

    print(response_logs.results)

    # 2 is TOperationState.FINISHED_STATE
    if res1.operationState == 2:
        break
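
The loop above prints the raw row set and only stops when the state is FINISHED (2), so a failed or cancelled query would keep it polling. Below is a minimal sketch of one way to tighten that, not a definitive implementation. It assumes the log text arrives in the first string column of the returned row set, either in the row layout (rows[].colVals[0].stringVal.value) or the column layout (columns[0].stringVal.values), depending on the negotiated protocol version. The names extract_log_lines, stream_logs, get_state and fetch_log_rows are hypothetical helpers introduced here for illustration; they are not part of pyhs2 or TCLIService.

```python
import time

# TOperationState values from TCLIService.thrift:
# 2 = FINISHED, 3 = CANCELED, 4 = CLOSED, 5 = ERROR
TERMINAL_STATES = {2, 3, 4, 5}


def extract_log_lines(row_set):
    """Return log lines from a row set in either column or row layout.

    Assumption: the log text is in the first string column.
    """
    columns = getattr(row_set, "columns", None)
    if columns:
        # Column layout: one TColumn holding a list of strings
        string_col = columns[0].stringVal
        return list(string_col.values) if string_col is not None else []

    lines = []
    # Row layout: one TRow per log line
    for row in getattr(row_set, "rows", None) or []:
        cell = row.colVals[0].stringVal
        if cell is not None and cell.value is not None:
            lines.append(cell.value)
    return lines


def stream_logs(get_state, fetch_log_rows, poll_interval=1.0):
    """Yield log lines until the operation ends and the log is drained.

    get_state and fetch_log_rows are hypothetical zero-argument callables
    that wrap client.GetOperationStatus(...).operationState and
    client.FetchResults(... fetchType=1 ...).results respectively.
    """
    while True:
        # Read the state first so the final fetch still picks up
        # lines written just before the operation ended.
        state = get_state()
        lines = extract_log_lines(fetch_log_rows())
        for line in lines:
            yield line
        if state in TERMINAL_STATES and not lines:
            break
        if not lines:
            time.sleep(poll_interval)


# Possible wiring with the client and opHandle1 from the example above:
# get_state = lambda: client.GetOperationStatus(
#     TGetOperationStatusReq(operationHandle=opHandle1)).operationState
# fetch_log_rows = lambda: client.FetchResults(TFetchResultsReq(
#     operationHandle=opHandle1, orientation=0, maxRows=10,
#     fetchType=1)).results
# for line in stream_logs(get_state, fetch_log_rows):
#     print(line)
```

Passing the two calls in as callables keeps the helper independent of the generated Thrift module, so it can be exercised with stub objects before pointing it at a real server.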

This post is based on a similar question found on Stack Overflow, "python - How to capture query logs from HiveServer2 in real time with a Python client?": https://stackoverflow.com/questions/32530086/
