
Analyzing an Impala-shell Hang: A Tortuous Debugging Journey

Discovering the Problem

I have recently been preparing to add UTF-8 support to Impala, to fix the inconsistent behavior on UTF-8 strings compared with Java-based systems such as Hive and Spark (e.g. , , etc.). Then a simple substring query in impala-shell just hung:

[localhost:21050] default> select substr("引擎", 1, 3);
Query: select substr("引擎", 1, 3)
Query submitted at: 2020-10-28 11:54:26 (Coordinator: http://quanlong-OptiPlex-BJ:25000)
Query progress can be monitored at: http://quanlong-OptiPlex-BJ:25000/query_plan?query_id=464f44d405bb16a1:5b41e00e00000000
+----------------------+
| substr('引擎', 1, 3) |
+----------------------+
| 引                   |
+----------------------+
Fetched 1 row(s) in 0.13s
[localhost:21050] default> select substr("引擎", 1, 2);
Query: select substr("引擎", 1, 2)
Query submitted at: 2020-10-28 11:54:32 (Coordinator: http://quanlong-OptiPlex-BJ:25000)
Query progress can be monitored at: http://quanlong-OptiPlex-BJ:25000/query_plan?query_id=d048231be08f29b4:860c959500000000
Caught exception 'utf8' codec can't decode bytes in position 0-1: unexpected end of data, type= in FetchResults.

The first query, substr("引擎", 1, 3), works fine: each Chinese character takes 3 bytes in UTF-8, and Impala treats STRING values as byte arrays, so the first 3 bytes are exactly the first Chinese character.

The second query returns only the first 2 bytes of the UTF-8 encoding of "引". Decoding them as UTF-8 raises an "unexpected end of data" error, and impala-shell hangs; the only way out is Ctrl+C.
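The same behavior is easy to reproduce in a few lines of plain Python (a standalone illustration, not Impala code):

s = b'\xe5\xbc\x95\xe6\x93\x8e'   # "引擎" in UTF-8: two characters, 3 bytes each

print(s[:3].decode('utf-8'))      # '引' -- like substr("引擎", 1, 3), decodes fine

s[:2].decode('utf-8')             # raises UnicodeDecodeError: unexpected end of data,
                                  # the same error impala-shell printed above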

This used to work in older versions: UTF-8 strings that failed to decode were simply shown as garbled text, and impala-shell did not hang. So I first filed a JIRA to report the problem: . I was on the master branch, at commit 1bd27a3.

A First Look at the Problem

Locating Where impala-shell Hangs

When a process hangs, the usual first step is to dump its stack and see where it is stuck. impala-shell is a Python process, so let's start with pystack:

$ ps aux | grep shell
quanlong 30957  0.5  0.0 296100 24716 pts/22   S+   15:06   0:00 bin/../infra/python/env-gcc7.5.0/bin/python /home/quanlong/workspace/Impala/shell/impala_shell.py
quanlong 31003  0.0  0.0  15960  1088 pts/20   S+   15:06   0:00 grep --color=auto shell
$ sudo /home/quanlong/.local/bin/pystack 30957
Dumping Threads....
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 2058, in
    impala_shell_main()
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 2023, in impala_shell_main
    shell.cmdloop(intro)
  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/cmd.py", line 142, in cmdloop
    stop = self.onecmd(line)
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 687, in onecmd
    return func(arg)
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 1113, in do_select
    return self._execute_stmt(query_str, print_web_link=True)
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 1303, in _execute_stmt
    self.imp_client.close_query(self.last_query_handle)
  File "/home/quanlong/workspace/Impala/shell/impala_client.py", line 847, in close_query
    resp = self._do_hs2_rpc(CloseImpalaOperation, retry_on_error=True)
  File "/home/quanlong/workspace/Impala/shell/impala_client.py", line 959, in _do_hs2_rpc
    return rpc()
  File "/home/quanlong/workspace/Impala/shell/impala_client.py", line 845, in CloseImpalaOperation
    return self.imp_service.CloseImpalaOperation(req)
  File "/home/quanlong/workspace/Impala/shell/build/thrift-11-gen/gen-py/ImpalaService/ImpalaHiveServer2Service.py", line 155, in CloseImpalaOperation
    return self.recv_CloseImpalaOperation()
  File "/home/quanlong/workspace/Impala/shell/build/thrift-11-gen/gen-py/ImpalaService/ImpalaHiveServer2Service.py", line 167, in recv_CloseImpalaOperation
    (fname, mtype, rseqid) = iprot.readMessageBegin()
  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/python/lib/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 148, in readMessageBegin
    name = self.trans.readAll(sz)
  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/python/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 60, in readAll
    chunk = self.read(sz - have)
  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/python/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 162, in read
    self.__rbuf = BufferIO(self.__trans.read(max(sz, self.__rbuf_size)))
  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/python/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 117, in read
    buff = self.handle.recv(sz)
  File "", line 1, in
  File "", line 1, in

Here we can see that impala-shell is stuck in a close_query operation: it sent the hs2 CloseImpalaOperation RPC and is waiting for impalad's reply. All the thrift-layer frames are doing read operations and blocking. So the problem is most likely on the server side, and we need to find out why this CloseImpalaOperation RPC never returns.

Locating Where the Server Side Hangs

The impalad logs show nothing abnormal; the query appears to finish normally. For example, I ran select substr("引擎", 1, 4) again, impala-shell hung as before, and the impalad log looked like this:

I1106 09:49:19.698134 15454 Frontend.java:1532] fc450a0fa19a6fbc:9315824900000000] Analyzing query: select substr("引擎", 1, 4) db: default
I1106 09:49:19.702113 15454 BaseAuthorizationChecker.java:110] fc450a0fa19a6fbc:9315824900000000] Authorization check took 3 ms
I1106 09:49:19.702214 15454 Frontend.java:1574] fc450a0fa19a6fbc:9315824900000000] Analysis and authorization finished.
W1106 09:49:19.706602 15459 executor-group.cc:164] fc450a0fa19a6fbc:9315824900000000] Executor group empty group (using coordinator only) is unhealthy: 0 out of 1 are available.
I1106 09:49:19.706990 15459 admission-controller.cc:1646] fc450a0fa19a6fbc:9315824900000000] Trying to admit id=fc450a0fa19a6fbc:9315824900000000 in pool_name=default-pool executor_group_name=empty group (using coordinator only) per_host_mem_estimate=10.00 MB dedicated_coord_mem_estimate=100.00 MB max_requests=-1 max_queued=200 max_mem=-1.00 B
I1106 09:49:19.707115 15459 admission-controller.cc:1654] fc450a0fa19a6fbc:9315824900000000] Stats: agg_num_running=0, agg_num_queued=0, agg_mem_reserved=0,  local_host(local_mem_admitted=0, num_admitted_running=0, num_queued=0, backend_mem_reserved=0, topN_query_stats: queries=[], total_mem_consumed=0; pool_level_stats: num_running=0, min=0, max=0, pool_total_mem=0)
I1106 09:49:19.707186 15459 admission-controller.cc:1200] fc450a0fa19a6fbc:9315824900000000] Admitting query id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.707155 15454 impala-hs2-server.cc:523] ExecuteStatement(): return_val=TExecuteStatementResp {
  01: status (struct) = TStatus {
    01: statusCode (i32) = 0,
  },
  02: operationHandle (struct) = TOperationHandle {
    01: operationId (struct) = THandleIdentifier {
      01: guid (string) = "\xbco\x9a\xa1\x0f\nE\xfc\x00\x00\x00\x00I\x82\x15\x93",
      02: secret (string) = "\xa5|\xf9A\xbc\x8aG\xd4\x8a\r\xb8\xae\xb4\xcf\x9f\xfc",
    },
    02: operationType (i32) = 0,
    03: hasResultSet (bool) = true,
  },
}
I1106 09:49:19.707319 15459 impala-server.cc:2062] fc450a0fa19a6fbc:9315824900000000] Registering query locations
I1106 09:49:19.707382 15459 coordinator.cc:149] fc450a0fa19a6fbc:9315824900000000] Exec() query_id=fc450a0fa19a6fbc:9315824900000000 stmt=select substr("引擎", 1, 4)
I1106 09:49:19.707787 15454 impala-hs2-server.cc:832] GetResultSetMetadata(): query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.707967 15459 coordinator.cc:473] fc450a0fa19a6fbc:9315824900000000] starting execution on 1 backends for query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.707902 15454 impala-hs2-server.cc:864] GetResultSetMetadata(): return_val=TGetResultSetMetadataResp {
  01: status (struct) = TStatus {
    01: statusCode (i32) = 0,
  },
  02: schema (struct) = TTableSchema {
    01: columns (list) = list[1] {
      [0] = TColumnDesc {
        01: columnName (string) = "substr('\xe5\xbc\x95\xe6\x93\x8e', 1, 4)",
        02: typeDesc (struct) = TTypeDesc {
          01: types (list) = list[1] {
            [0] = TTypeEntry {
              01: primitiveEntry (struct) = TPrimitiveTypeEntry {
                01: type (i32) = 7,
              },
            },
          },
        },
        03: position (i32) = 0,
      },
    },
  },
}
I1106 09:49:19.709297 7513 control-service.cc:142] fc450a0fa19a6fbc:9315824900000000] ExecQueryFInstances(): query_id=fc450a0fa19a6fbc:9315824900000000 coord=quanlong-OptiPlex-BJ:27000 #instances=1
I1106 09:49:19.711102 15462 query-state.cc:897] fc450a0fa19a6fbc:9315824900000000] Executing instance. instance_id=fc450a0fa19a6fbc:9315824900000000 fragment_idx=0 per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=1
I1106 09:49:19.711118 15459 coordinator.cc:532] fc450a0fa19a6fbc:9315824900000000] started execution on 1 backends for query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.810067 15454 coordinator.cc:672] ExecState: query id=fc450a0fa19a6fbc:9315824900000000 execution completed
I1106 09:49:19.810063 15462 krpc-data-stream-mgr.cc:337] fc450a0fa19a6fbc:9315824900000000] cancelling active streams for fragment_instance_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.810087 15454 coordinator.cc:821] Coordinator waiting for backends to finish, 1 remaining. query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.810119 15462 query-state.cc:906] fc450a0fa19a6fbc:9315824900000000] Instance completed. instance_id=fc450a0fa19a6fbc:9315824900000000 #in-flight=0 status=OK
I1106 09:49:19.810176 15461 query-state.cc:464] fc450a0fa19a6fbc:9315824900000000] UpdateBackendExecState(): last report for fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.810976 7513 coordinator.cc:965] Backend completed: host=quanlong-OptiPlex-BJ:27000 remaining=1 query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.811048 15454 coordinator.cc:1284] Release admission control resources for query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.811172 15461 query-state.cc:686] fc450a0fa19a6fbc:9315824900000000] Cancelling fragment instances as directed by the coordinator. Returned status: Cancelled
I1106 09:49:19.811204 15461 query-state.cc:923] fc450a0fa19a6fbc:9315824900000000] Cancel: query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.811213 15461 krpc-data-stream-mgr.cc:337] fc450a0fa19a6fbc:9315824900000000] cancelling active streams for fragment_instance_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.811383 15454 impala-hs2-server.cc:800] CloseOperation(): query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.811399 15454 impala-server.cc:1320] UnregisterQuery(): query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.811409 15454 coordinator-backend-state.cc:921] query_id=fc450a0fa19a6fbc:9315824900000000 target backend=127.0.0.1:27000: Not cancelling because the backend is already done:
I1106 09:49:19.811415 15454 coordinator.cc:932] CancelBackends() query_id=fc450a0fa19a6fbc:9315824900000000, tried to cancel 0 backends
I1106 09:49:19.812331 8319 impala-server.cc:1352] Query successfully unregistered: query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.812489 8319 query-exec-mgr.cc:213] ReleaseQueryState(): deleted query_id=fc450a0fa19a6fbc:9315824900000000

One possibility is that it is stuck somewhere that logs nothing. Fortunately this is a development environment and the problem reproduces reliably, so I can add some logging and rebuild to debug. First, find the code that handles the hs2 CloseImpalaOperation RPC; it is mainly in be/src/service/impala-hs2-server.cc:

void ImpalaServer::CloseOperation(TCloseOperationResp& return_val,
    const TCloseOperationReq& request) {
  TCloseImpalaOperationReq request2;
  request2.operationHandle = request.operationHandle;
  TCloseImpalaOperationResp tmp_resp;
  CloseImpalaOperation(tmp_resp, request2);
  return_val.status = tmp_resp.status;
}

void ImpalaServer::CloseImpalaOperation(TCloseImpalaOperationResp& return_val,
    const TCloseImpalaOperationReq& request) {
  TUniqueId query_id;
  TUniqueId op_secret;
  HS2_RETURN_IF_ERROR(return_val, THandleIdentifierToTUniqueId(
      request.operationHandle.operationId, &query_id, &op_secret),
      SQLSTATE_GENERAL_ERROR);
  VLOG_QUERY << "CloseOperation(): query_id=" << PrintId(query_id);
  QueryHandle query_handle;
  HS2_RETURN_IF_ERROR(
      return_val, GetActiveQueryHandle(query_id, &query_handle), SQLSTATE_GENERAL_ERROR);
  ScopedSessionState session_handle(this);
  const TUniqueId session_id = query_handle->session_id();
  HS2_RETURN_IF_ERROR(return_val, session_handle.WithSession(session_id,
      SecretArg::Operation(op_secret, query_id)), SQLSTATE_GENERAL_ERROR);
  if (query_handle->stmt_type() == TStmtType::DML) {
    Status query_status;
    if (query_handle->GetDmlStats(&return_val.dml_result, &query_status)) {
      return_val.__isset.dml_result = true;
    }
  }
  // TODO: use timeout to get rid of unwanted query_handle.
  HS2_RETURN_IF_ERROR(return_val, UnregisterQuery(query_id, true), SQLSTATE_GENERAL_ERROR);
  return_val.status.__set_statusCode(thrift::TStatusCode::SUCCESS_STATUS);
}

There are some potentially slow operations here, such as the final UnregisterQuery; it might be stuck there. The simple, brute-force approach is to add logging and see where the function stops. CloseImpalaOperation is also called by the CloseOperation above, so add logging there too. I made the following change and recompiled:

diff --git a/be/src/service/impala-hs2-server.cc b/be/src/service/impala-hs2-server.cc
index 5c618fb..82a2820 100644
--- a/be/src/service/impala-hs2-server.cc
+++ b/be/src/service/impala-hs2-server.cc
@@ -786,14 +786,17 @@ void ImpalaServer::CloseOperation(TCloseOperationResp& return_val,
   TCloseImpalaOperationReq request2;
   request2.operationHandle = request.operationHandle;
   TCloseImpalaOperationResp tmp_resp;
+  VLOG_QUERY << "0";
   CloseImpalaOperation(tmp_resp, request2);
   return_val.status = tmp_resp.status;
+  VLOG_QUERY << "5";
 }
 
 void ImpalaServer::CloseImpalaOperation(TCloseImpalaOperationResp& return_val,
     const TCloseImpalaOperationReq& request) {
   TUniqueId query_id;
   TUniqueId op_secret;
+  VLOG_QUERY << "1";
   HS2_RETURN_IF_ERROR(return_val, THandleIdentifierToTUniqueId(
       request.operationHandle.operationId, &query_id, &op_secret),
       SQLSTATE_GENERAL_ERROR);
@@ -802,12 +805,14 @@ void ImpalaServer::CloseImpalaOperation(TCloseImpalaOperationResp& return_val,
   QueryHandle query_handle;
   HS2_RETURN_IF_ERROR(
       return_val, GetActiveQueryHandle(query_id, &query_handle), SQLSTATE_GENERAL_ERROR);
+  VLOG_QUERY << "2";
   ScopedSessionState session_handle(this);
   const TUniqueId session_id = query_handle->session_id();
   HS2_RETURN_IF_ERROR(return_val, session_handle.WithSession(session_id,
       SecretArg::Operation(op_secret, query_id)), SQLSTATE_GENERAL_ERROR);
+  VLOG_QUERY << "3";
   if (query_handle->stmt_type() == TStmtType::DML) {
     Status query_status;
     if (query_handle->GetDmlStats(&return_val.dml_result, &query_status)) {
@@ -819,6 +824,7 @@ void ImpalaServer::CloseImpalaOperation(TCloseImpalaOperationResp& return_val,
   HS2_RETURN_IF_ERROR(return_val, UnregisterQuery(query_id, true), SQLSTATE_GENERAL_ERROR);
   return_val.status.__set_statusCode(thrift::TStatusCode::SUCCESS_STATUS);
+  VLOG_QUERY << "4";
 }
 
 void ImpalaServer::GetResultSetMetadata(TGetResultSetMetadataResp& return_val,

Since only BE code changed, rebuilding impalad with ninja impalad is enough (or make impalad if the build was not configured with ninja). Then restart the cluster and reproduce the issue. All the new logs in ImpalaServer::CloseImpalaOperation were printed, but none of the ones in ImpalaServer::CloseOperation, which means CloseOperation was never called and CloseImpalaOperation finished normally. The log:

I1106 10:12:25.581028 29674 query-state.cc:464] bc46ff8a74592f6b:9146663900000000] UpdateBackendExecState(): last report for bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.581405 29667 impala-hs2-server.cc:799] 1
I1106 10:12:25.581435 29667 impala-hs2-server.cc:803] CloseOperation(): query_id=bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.581471 29667 impala-hs2-server.cc:808] 2
I1106 10:12:25.581503 29667 impala-hs2-server.cc:815] 3
I1106 10:12:25.581519 29667 impala-server.cc:1320] UnregisterQuery(): query_id=bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.581562 29667 coordinator.cc:672] ExecState: query id=bc46ff8a74592f6b:9146663900000000 execution cancelled
I1106 10:12:25.581620 29667 coordinator-backend-state.cc:921] query_id=bc46ff8a74592f6b:9146663900000000 target backend=127.0.0.1:27000: Sending CancelQueryFInstances rpc
I1106 10:12:25.582509 28909 control-service.cc:213] CancelQueryFInstances(): query_id=bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.582554 28909 query-exec-mgr.cc:126] QueryState: query_id=bc46ff8a74592f6b:9146663900000000 refcnt=3
I1106 10:12:25.582571 28909 query-state.cc:923] Cancel: query_id=bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.582594 28909 krpc-data-stream-mgr.cc:337] cancelling active streams for fragment_instance_id=bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.582669 28908 impala-server.cc:1483] Invalid or unknown query handle: bc46ff8a74592f6b:9146663900000000.
I1106 10:12:25.582744 28908 control-service.cc:173] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled): bc46ff8a74592f6b:9146663900000000 remote host=127.0.0.1:43294
I1106 10:12:25.583061 29667 coordinator.cc:932] CancelBackends() query_id=bc46ff8a74592f6b:9146663900000000, tried to cancel 1 backends
I1106 10:12:25.583123 29674 query-state.cc:686] bc46ff8a74592f6b:9146663900000000] Cancelling fragment instances as directed by the coordinator. Returned status: ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled): bc46ff8a74592f6b:9146663900000000 remote host=127.0.0.1:43294
I1106 10:12:25.583168 29674 query-state.cc:923] bc46ff8a74592f6b:9146663900000000] Cancel: query_id=bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.583214 29667 coordinator.cc:1284] Release admission control resources for query_id=bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.583902 29667 impala-hs2-server.cc:827] 4
I1106 10:12:25.586117 29513 impala-server.cc:1352] Query successfully unregistered: query_id=bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.586696 29513 query-exec-mgr.cc:213] ReleaseQueryState(): deleted query_id=bc46ff8a74592f6b:9146663900000000

So where could it be stuck? Let's see who calls CloseImpalaOperation inside impalad; maybe the hang is in the caller. Impala has a GetStackTrace() function, defined in debug-util.h, that prints the stack at runtime. So I changed log "4" to:

+  VLOG_QUERY << "4" << "\n" << GetStackTrace();

Reproducing the problem once more gives the stack:

I1106 10:26:58.427156   565 impala-hs2-server.cc:827] 4
    @          0x2580dac  impala::ImpalaServer::CloseImpalaOperation()
    @          0x2b5ccba  impala::ImpalaHiveServer2ServiceProcessor::process_CloseImpalaOperation()
    @          0x2b5b418  impala::ImpalaHiveServer2ServiceProcessor::dispatchCall()
    @          0x1cd7721  apache::thrift::TDispatchProcessor::process()
    @          0x21ef3fc  apache::thrift::server::TAcceptQueueServer::Task::run()
    @          0x21e4810  impala::ThriftThread::RunRunnable()
    @          0x21e5e4c  boost::_mfi::mf2<>::operator()()
    @          0x21e5ce0  boost::_bi::list3<>::operator()<>()
    @          0x21e5a26  boost::_bi::bind_t<>::operator()()
    @          0x21e5938  boost::detail::function::void_function_obj_invoker0<>::invoke()
    @          0x215e37b  boost::function0<>::operator()()
    @          0x274cef9  impala::Thread::SuperviseThread()
    @          0x2754e96  boost::_bi::list5<>::operator()<>()
    @          0x2754dba  boost::_bi::bind_t<>::operator()()
    @          0x2754d7b  boost::detail::thread_data<>::run()
    @          0x3f41e21  thread_proxy
    @     0x7f2e95b706b9  start_thread
    @     0x7f2e9263e4dc  clone
I1106 10:26:58.577329   565 query-exec-mgr.cc:213] ReleaseQueryState(): deleted query_id=ea4b5873a9a4bc14:240fca4800000000

Functions like ImpalaHiveServer2ServiceProcessor::process_CloseImpalaOperation are generated by Thrift (in be/generated-sources/gen-cpp/ImpalaHiveServer2Service.cpp), so they are unlikely to be the problem. I used a minidump to dump the stacks of all impalad threads (see the "Use Minidump instead of Core Dump" section of an earlier post, or the official documentation), but found no related thread. This line of debugging could not go any further.

Locating the Commit That Introduced the Bug

Time for a different approach: find which commit introduced the bug, then treat the root cause. The place where impala-shell hangs is code newly added by a recent commit:

namely line 1303 of shell/impala_shell.py:

In shell/impala_shell.py, the UnicodeDecodeError handler calls close_query:

1297     except UnicodeDecodeError as e:
1298       # An error occoured possibly during the fetching.
1299       # Depending of which protocol is at use it can come from different places.
1300       # Possibly occours because we try to display binary data which contains
1301       # undecodable elements.
1302       if self.last_query_handle is not None:
1303         self.imp_client.close_query(self.last_query_handle)
1304       print('UnicodeDecodeError : %s \nPlease check for columns containing binary data '
1305           'to find the possible source of the error.' % (e,), file=sys.stderr)

So the shell first hits a UnicodeDecodeError and then hangs while trying to close the query. There are two questions to answer: (1) why is a UnicodeDecodeError raised, and (2) why does closing the query hang.

This commit (fe6e625) does add a piece of code that can raise UnicodeDecodeError, but only in the beeswax-protocol path, in shell/impala_client.py:

1016 class ImpalaBeeswaxClient(ImpalaClient):
...
1108   def fetch(self, query_handle):
1109     while True:
1110       result, rpc_status = self._do_beeswax_rpc(
1111          lambda: self.imp_service.fetch(query_handle, False,
1112                                         self.fetch_size))
1113       if rpc_status != RpcStatus.OK:
1114         raise RPCException()
1115 
1116       def split_row_and_decode_if_needed(row):
1117         try:
1118           return row.split('\t')
1119         except UnicodeDecodeError:
1120           return row.decode('utf-8', 'replace').split('\t')
1121 
1122       yield [split_row_and_decode_if_needed(row) for row in result.data]

I realized I had not tried the beeswax protocol yet. In Impala 4.0, impala-shell talks to impalad over the hs2 protocol by default; switching back to beeswax requires explicitly passing --protocol=beeswax:

$ bin/impala-shell.sh --protocol=beeswax -q 'select substr("引擎", 1, 4)'
Unknown Exception : 'NoneType' object is not iterable
Traceback (most recent call last):
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 1255, in _execute_stmt
    for rows in rows_fetched:
  File "/home/quanlong/workspace/Impala/shell/impala_client.py", line 1111, in fetch
    lambda: self.imp_service.fetch(query_handle, False,
TypeError: 'NoneType' object is not iterable
Could not execute command: select substr("引擎", 1, 4)

With beeswax the shell simply errors out and exits, a different symptom, and there is no utf-8 decoding error at all.

However, reverting this commit does not fix the problem. The cause should be some thrift-related change, since the utf-8 decode error is thrown from the thrift layer. Impala 3.4, released in April, does not have this bug, so we can list all thrift-related commits since then and binary-search for the offending one (a minimal sketch of the idea follows the commit list below).

$ git log --oneline -i --grep=thrift --since='2020-04-01'
981ef10 IMPALA-10215: Implement INSERT INTO for non-partitioned Iceberg tables (Parquet)
ee4043e IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
6542b60 IMPALA-10243: ConcurrentModificationException during parallel INSERTs
63c435c IMPALA-9232 Potential overflow in serializeThriftMsg
6bb3b88 IMPALA-9180 (part 1): Remove legacy ImpalaInternalService
28181cb IMPALA-9930 (part 1): Initial refactor for admission control service
79b17db IMPALA-9382: part 1: transposed profile prototype
13f50ea IMPALA-9229: impala-shell 'profile' to show original and retried queries
fe6e625 IMPALA-10012: ds_hll_sketch() results ascii codec decoding error fix
75146c9 IMPALA-10012: ds_hll_sketch() results ascii codec decoding error fix
074731e IMPALA-3127: Support incremental metadata updates in partition level
ee18dc3 IMPALA-9987: Improve logging around HTTP connections
8aeb282 IMPALA-9540 Test that Impala Shell no longer sends duplicate "Host" headers in http mode.
b67c090 IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
8b09343 IMPALA-9719: Upgrade sasl-0.1.1 -> 0.2.1
e46bbf2 IMPALA-9692 (part 1): Refactor TBackendDescriptor to protobuf
bc9d7e0 IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
bd4458b IMPALA-9422: Re-visit and improve join node and builder's counters
2451846 IMPALA-9543 Reduce duplicate code in thrift CMakeLists.txt
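The bisection itself is just a binary search over this ordered commit list. A minimal sketch of the idea, assuming a hypothetical reproduce_bug(commit) helper that checks out the commit, rebuilds impala-shell and returns True if the hang reproduces:

def first_bad_commit(commits, reproduce_bug):
    # commits is ordered oldest -> newest, and the newest commit is known to be bad
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if reproduce_bug(commits[mid]):
            hi = mid          # the bug is already present here, look earlier
        else:
            lo = mid + 1      # still good here, the bug was introduced later
    return commits[lo]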

The bisection finally pinpointed the problem to this commit:

bc9d7e0 IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.

Details can be found on GitHub:

or on Gerrit:

But this commit is huge and does several things at once: it adds an impala-shell option for specifying an arbitrary virtualenv, sets PYTHONIOENCODING=utf8, upgrades the thrift version impala-shell depends on to 0.11.0, and more. Too much time has passed to revert it cleanly. I suspected the PYTHONIOENCODING=utf8 setting, but changing it to ascii did not fix the problem. Fully digesting this big commit would be a lot of work, so this path was hard to continue down.

A Breakthrough: tcpdump

Back to the original idea: find out where impalad is stuck and why it never returns the RPC result. A colleague suggested using tcpdump to compare a good query with a bad one, which is a great idea! I had already compared the impalad logs of good and bad queries and they looked identical, but comparing the TCP packets is a deeper debugging technique: when the application layer reveals nothing, go one layer down and see what the TCP layer looks like.

In my development environment impala-shell connects to impalad via localhost, so the interface to capture on is lo (local loopback). The current master-branch impala-shell uses hs2 by default and thus connects to impalad's port 21050. Since my impala-shell is the only client connecting to this impalad, capturing TCP packets on port 21050 gives exactly this shell's traffic. In a production environment you would instead capture only the client port used by impala-shell, to separate it from other applications talking to impalad.

The command I used here is

sudo tcpdump -i lo 'port 21050'

Start impala-shell, run a good statement, select substr("引擎", 1, 3), and then a bad one, select substr("引擎", 1, 4). It is immediately obvious that the bad statement produces fewer packets.

The good statement produces 26 TCP packets:

11:04:52.169568 IP localhost.40938 > localhost.21050: Flags [P.], seq 2344715580:2344715671, ack 246892403, win 512, options [nop,nop,TS val 3249330901 ecr 3249311189], length 91
11:04:52.169913 IP localhost.21050 > localhost.40938: Flags [P.], seq 1:181, ack 91, win 512, options [nop,nop,TS val 3249330902 ecr 3249330901], length 180
11:04:52.169944 IP localhost.40938 > localhost.21050: Flags [.], ack 181, win 511, options [nop,nop,TS val 3249330902 ecr 3249330902], length 0
11:04:52.170831 IP localhost.40938 > localhost.21050: Flags [P.], seq 91:292, ack 181, win 512, options [nop,nop,TS val 3249330902 ecr 3249330902], length 201
11:04:52.187515 IP localhost.21050 > localhost.40938: Flags [P.], seq 181:290, ack 292, win 512, options [nop,nop,TS val 3249330919 ecr 3249330902], length 109
11:04:52.187852 IP localhost.40938 > localhost.21050: Flags [P.], seq 292:394, ack 290, win 512, options [nop,nop,TS val 3249330920 ecr 3249330919], length 102
11:04:52.188237 IP localhost.21050 > localhost.40938: Flags [P.], seq 290:411, ack 394, win 512, options [nop,nop,TS val 3249330920 ecr 3249330920], length 121
11:04:52.189283 IP localhost.40938 > localhost.21050: Flags [P.], seq 394:494, ack 411, win 512, options [nop,nop,TS val 3249330921 ecr 3249330920], length 100
11:04:52.189459 IP localhost.21050 > localhost.40938: Flags [P.], seq 411:464, ack 494, win 512, options [nop,nop,TS val 3249330921 ecr 3249330921], length 53
11:04:52.231963 IP localhost.40938 > localhost.21050: Flags [.], ack 464, win 512, options [nop,nop,TS val 3249330964 ecr 3249330921], length 0
11:04:52.290038 IP localhost.40938 > localhost.21050: Flags [P.], seq 494:594, ack 464, win 512, options [nop,nop,TS val 3249331022 ecr 3249330921], length 100
11:04:52.290276 IP localhost.21050 > localhost.40938: Flags [P.], seq 464:517, ack 594, win 512, options [nop,nop,TS val 3249331022 ecr 3249331022], length 53
11:04:52.290348 IP localhost.40938 > localhost.21050: Flags [.], ack 517, win 512, options [nop,nop,TS val 3249331022 ecr 3249331022], length 0
11:04:52.290815 IP localhost.40938 > localhost.21050: Flags [P.], seq 594:706, ack 517, win 512, options [nop,nop,TS val 3249331022 ecr 3249331022], length 112
11:04:52.291191 IP localhost.21050 > localhost.40938: Flags [P.], seq 517:620, ack 706, win 512, options [nop,nop,TS val 3249331023 ecr 3249331022], length 103
11:04:52.292094 IP localhost.40938 > localhost.21050: Flags [P.], seq 706:818, ack 620, win 512, options [nop,nop,TS val 3249331024 ecr 3249331023], length 112
11:04:52.294791 IP localhost.21050 > localhost.40938: Flags [P.], seq 620:715, ack 818, win 512, options [nop,nop,TS val 3249331026 ecr 3249331024], length 95
11:04:52.295077 IP localhost.40938 > localhost.21050: Flags [P.], seq 818:906, ack 715, win 512, options [nop,nop,TS val 3249331027 ecr 3249331026], length 88
11:04:52.295349 IP localhost.21050 > localhost.40938: Flags [P.], seq 715:823, ack 906, win 512, options [nop,nop,TS val 3249331027 ecr 3249331027], length 108
11:04:52.295624 IP localhost.40938 > localhost.21050: Flags [P.], seq 906:1008, ack 823, win 512, options [nop,nop,TS val 3249331027 ecr 3249331027], length 102
11:04:52.335924 IP localhost.21050 > localhost.40938: Flags [.], ack 1008, win 512, options [nop,nop,TS val 3249331068 ecr 3249331027], length 0
11:04:52.458929 IP localhost.21050 > localhost.40938: Flags [P.], seq 823:871, ack 1008, win 512, options [nop,nop,TS val 3249331191 ecr 3249331027], length 48
11:04:52.459074 IP localhost.40938 > localhost.21050: Flags [P.], seq 1008:1172, ack 871, win 512, options [nop,nop,TS val 3249331191 ecr 3249331191], length 164
11:04:52.459080 IP localhost.21050 > localhost.40938: Flags [.], ack 1172, win 511, options [nop,nop,TS val 3249331191 ecr 3249331191], length 0
11:04:52.460225 IP localhost.21050 > localhost.40938: Flags [P.], seq 871:12312, ack 1172, win 512, options [nop,nop,TS val 3249331192 ecr 3249331191], length 11441
11:04:52.503980 IP localhost.40938 > localhost.21050: Flags [.], ack 12312, win 512, options [nop,nop,TS val 3249331236 ecr 3249331192], length 0

The bad statement produces only 19:

11:05:00.805456 IP localhost.40938 > localhost.21050: Flags [P.], seq 1172:1263, ack 12312, win 512, options [nop,nop,TS val 3249339537 ecr 3249331192], length 91
11:05:00.805554 IP localhost.21050 > localhost.40938: Flags [P.], seq 12312:12492, ack 1263, win 512, options [nop,nop,TS val 3249339537 ecr 3249339537], length 180
11:05:00.805561 IP localhost.40938 > localhost.21050: Flags [.], ack 12492, win 511, options [nop,nop,TS val 3249339537 ecr 3249339537], length 0
11:05:00.805751 IP localhost.40938 > localhost.21050: Flags [P.], seq 1263:1464, ack 12492, win 512, options [nop,nop,TS val 3249339537 ecr 3249339537], length 201
11:05:00.810144 IP localhost.21050 > localhost.40938: Flags [P.], seq 12492:12601, ack 1464, win 512, options [nop,nop,TS val 3249339542 ecr 3249339537], length 109
11:05:00.810290 IP localhost.40938 > localhost.21050: Flags [P.], seq 1464:1566, ack 12601, win 512, options [nop,nop,TS val 3249339542 ecr 3249339542], length 102
11:05:00.810415 IP localhost.21050 > localhost.40938: Flags [P.], seq 12601:12722, ack 1566, win 512, options [nop,nop,TS val 3249339542 ecr 3249339542], length 121
11:05:00.810576 IP localhost.40938 > localhost.21050: Flags [P.], seq 1566:1666, ack 12722, win 512, options [nop,nop,TS val 3249339542 ecr 3249339542], length 100
11:05:00.810650 IP localhost.21050 > localhost.40938: Flags [P.], seq 12722:12775, ack 1666, win 512, options [nop,nop,TS val 3249339542 ecr 3249339542], length 53
11:05:00.851921 IP localhost.40938 > localhost.21050: Flags [.], ack 12775, win 512, options [nop,nop,TS val 3249339583 ecr 3249339542], length 0
11:05:00.911128 IP localhost.40938 > localhost.21050: Flags [P.], seq 1666:1766, ack 12775, win 512, options [nop,nop,TS val 3249339643 ecr 3249339542], length 100
11:05:00.911375 IP localhost.21050 > localhost.40938: Flags [P.], seq 12775:12828, ack 1766, win 512, options [nop,nop,TS val 3249339643 ecr 3249339643], length 53
11:05:00.911468 IP localhost.40938 > localhost.21050: Flags [.], ack 12828, win 512, options [nop,nop,TS val 3249339643 ecr 3249339643], length 0
11:05:00.911948 IP localhost.40938 > localhost.21050: Flags [P.], seq 1766:1878, ack 12828, win 512, options [nop,nop,TS val 3249339644 ecr 3249339643], length 112
11:05:00.912416 IP localhost.21050 > localhost.40938: Flags [P.], seq 12828:12932, ack 1878, win 512, options [nop,nop,TS val 3249339644 ecr 3249339644], length 104
11:05:00.912848 IP localhost.40938 > localhost.21050: Flags [P.], seq 1878:1980, ack 12932, win 512, options [nop,nop,TS val 3249339644 ecr 3249339644], length 102
11:05:00.959907 IP localhost.21050 > localhost.40938: Flags [.], ack 1980, win 512, options [nop,nop,TS val 3249339691 ecr 3249339644], length 0
11:05:01.097326 IP localhost.21050 > localhost.40938: Flags [P.], seq 12932:12980, ack 1980, win 512, options [nop,nop,TS val 3249339829 ecr 3249339644], length 48
11:05:01.139967 IP localhost.40938 > localhost.21050: Flags [.], ack 12980, win 512, options [nop,nop,TS val 3249339871 ecr 3249339829], length 0

Here we finally see a difference! Next, reproduce once more and dump the TCP packets of both queries to files. Open impala-shell in one terminal and run the following in another:

sudo tcpdump -i lo 'port 21050' -w good_query_tcp_dump.txt

This dumps the packet contents into good_query_tcp_dump.txt. Run the good query select substr("引擎", 1, 3) in impala-shell, then Ctrl+C the tcpdump process. Next run

sudo tcpdump -i lo 'port 21050' -w bad_query_tcp_dump.txt

and this time run the bad query select substr("引擎", 1, 4) in impala-shell before stopping tcpdump with Ctrl+C again. Now we have the TCP packets of both queries. Opening the dump files directly with vim, we can see readable text inside: FetchResults, GetLog and CloseImpalaOperation all appear in pairs, one for the request and one for the response. The good query also fetches the profile after CloseImpalaOperation:

(Figure: TCP dump of the good query)
But for the bad query there is only one small, unidentified packet after CloseImpalaOperation:
(Figure: TCP dump of the bad query)
From this we can already tell that impalad did return the CloseImpalaOperation result! It looks like impala-shell received the response, then sent out one very short packet and is waiting for a reply. Time to look at what the thrift-layer code on the impala-shell side is doing.

Digging into the Thrift Code

We already dumped the impala-shell stack with pystack; the bottom frames are thrift code:

File "/home/quanlong/workspace/Impala/shell/build/thrift-11-gen/gen-py/ImpalaService/ImpalaHiveServer2Service.py", line 155, in CloseImpalaOperation    return self.recv_CloseImpalaOperation()  File "/home/quanlong/workspace/Impala/shell/build/thrift-11-gen/gen-py/ImpalaService/ImpalaHiveServer2Service.py", line 167, in recv_CloseImpalaOperation    (fname, mtype, rseqid) = iprot.readMessageBegin()  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/python/lib/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 148, in readMessageBegin    name = self.trans.readAll(sz)  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/python/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 60, in readAll    chunk = self.read(sz - have)  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/python/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 162, in read    self.__rbuf = BufferIO(self.__trans.read(max(sz, self.__rbuf_size)))  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/python/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 117, in read    buff = self.handle.recv(sz)

Fortunately the file names and line numbers are all there, so the code is easy to find. Walking down the stack, recv_CloseImpalaOperation() is stuck on its second line, the call to iprot.readMessageBegin():

def recv_CloseImpalaOperation(self):
    iprot = self._iprot
    (fname, mtype, rseqid) = iprot.readMessageBegin()
    if mtype == TMessageType.EXCEPTION:
        x = TApplicationException()
        x.read(iprot)
        iprot.readMessageEnd()
        raise x
    result = CloseImpalaOperation_result()
    result.read(iprot)
    iprot.readMessageEnd()
    if result.success is not None:
        return result.success
    raise TApplicationException(TApplicationException.MISSING_RESULT, "CloseImpalaOperation failed: unknown result")

readMessageBegin() is defined in thrift/protocol/TBinaryProtocol.py, which is thrift's own code. The recv_CloseImpalaOperation above lives in ImpalaService/ImpalaHiveServer2Service.py, which is Python code generated by the thrift compiler.

133 def readMessageBegin(self):
134     sz = self.readI32()
135     if sz < 0:
136         version = sz & TBinaryProtocol.VERSION_MASK
137         if version != TBinaryProtocol.VERSION_1:
138             raise TProtocolException(
139                 type=TProtocolException.BAD_VERSION,
140                 message='Bad version in readMessageBegin: %d' % (sz))
141         type = sz & TBinaryProtocol.TYPE_MASK
142         name = self.readString()
143         seqid = self.readI32()
144     else:
145         if self.strictRead:
146             raise TProtocolException(type=TProtocolException.BAD_VERSION,
147                                      message='No protocol version header')
148         name = self.trans.readAll(sz)  # <-------- stuck at this line
149         type = self.readByte()
150         seqid = self.readI32()
151     return (name, type, seqid)

As the code shows, the stack says we are stuck on the readAll() call at line 148. The body of this function is an if statement, and the sign of the int32 read on its first line decides which branch is taken. We can edit this code directly and print the value of sz. Comparing the good and bad queries, every sz in the good query is negative, while the bad query reads a positive sz, enters the else branch, and then hangs. So the thrift code in impala-shell is very likely interpreting a bogus sz value and waiting for data that will never arrive, and this misread sz is probably made of leftover bytes from the UTF-8 string that failed to decode earlier.
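To make the misread concrete, here is a small standalone sketch of what that first i32 looks like in both cases (plain Python using struct, not the thrift library itself):

import struct

# A strict TBinaryProtocol peer prepends VERSION_1 (0x80010000) to the message
# type, so the first i32 of every well-formed reply is negative.
VERSION_1 = 0x80010000
TYPE_REPLY = 2

header = struct.pack('!I', VERSION_1 | TYPE_REPLY)   # bytes as they appear on the wire
(sz,) = struct.unpack('!i', header)                  # read back as a signed i32
print(sz < 0)        # True -> readMessageBegin() takes the "version header" branch

# If the reader instead picks up stray leftover bytes (e.g. the unread tail of a
# half-consumed reply), they can decode to a positive value, and readAll(sz) then
# blocks forever waiting for sz bytes that the server will never send.
leftover = b'\x00\x00\x30\x39'                       # hypothetical leftover bytes
(sz,) = struct.unpack('!i', leftover)
print(sz)            # 12345 -> self.trans.readAll(12345) hangs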

With this hypothesis, I went back to investigate why the UTF-8 decode fails in the first place. After studying some thrift code, the different behaviors of str and unicode in Python 2 and Python 3, and the recommended way of handling UTF-8 (the "UTF-8 sandwich": bytes outside the system, UTF-8 text inside, encode/decode at the system boundary), I finally found a bug in thrift's fast_decode logic: . Roughly: when deserializing a byte array into a thrift object, Thrift has a fast_decode path that calls a C++ extension library to speed up decoding. That C++ code calls Python's decode function to decode UTF-8 strings, but does not specify an error handler, so a UnicodeDecodeError is raised straight out of it.

Once that was fixed, the impala-shell problem was fixed as well. In fact thrift-0.11.0 already follows the "UTF-8 sandwich" convention, so no UnicodeDecodeError should escape while the client deserializes thrift objects: inside the Thrift layer, strings are already proper UTF-8. The catch is that impalad still uses thrift-0.9.3, where strings inside thrift are still plain byte arrays with no UTF-8 handling at all, which trips up the newer thrift client. Still, this is a client-side bug, and fixing it solves the problem.
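For reference, the "UTF-8 sandwich" idea boils down to the following (a minimal sketch, not the actual thrift code; the receive_from_wire helper is made up for illustration):

def receive_from_wire():
    # hypothetical helper: pretend the peer handed us a truncated UTF-8 string,
    # e.g. the first two bytes of the three-byte character '引'
    return b'\xe5\xbc'

raw = receive_from_wire()                  # outside the system: bytes
text = raw.decode('utf-8', 'replace')      # boundary: decode; 'replace' never raises
print(text)                                # inside the system: text, bad bytes become U+FFFD
payload = text.encode('utf-8')             # boundary: encode again before writing back out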

Summary

This debugging session was tortuous because it started in the wrong direction: I kept looking for where impalad was stuck, since I could not confirm whether it had returned the RPC, and it was hard to suspect the Thrift layer at first. After the investigation stalled, tcpdump broke the deadlock by proving that impalad had already returned the RPC and that impala-shell was simply waiting for input that would never come. Only then did the work get on track: understanding why the Thrift layer throws a UTF-8 decode error, and fixing it. The biggest lesson: systems are layered, and when a problem is opaque at one layer, going one layer down often makes it clear.

Finally, a few useful takeaways:

  • pystack: prints the stack of a Python process
  • minidump: dumps the stacks of a C++ process; frequently used when debugging Impala's BE
  • tcpdump: captures traffic at the TCP layer
  • UTF-8 sandwich: the approach that lets a Python program handle UTF-8 consistently on both Python 2 and Python 3; see this talk for details: https://nedbatchelder.com/text/unipain/unipain.html

