
impala - What is the difference between the INVALIDATE METADATA and REFRESH commands in Impala?

Reposted. Author: 行者123. Updated: 2023-12-02 08:17:21

I came across this link, which concerns Impala version 1.1:

Since Impala 1.1, REFRESH statement only works for existing tables. For new tables you need to issue "INVALIDATE METADATA" statement.

Does this still hold for later versions of Impala?

Best answer

According to Cloudera's Impala guide (Cloudera Enterprise 5.8), which remains unchanged in 5.9:

INVALIDATE METADATA and REFRESH are counterparts: INVALIDATE METADATA waits to reload the metadata when needed for a subsequent query, but reloads all the metadata for the table, which can be an expensive operation, especially for large tables with many partitions. REFRESH reloads the metadata immediately, but only loads the block location data for newly added data files, making it a less expensive operation overall. If data was altered in some more extensive way, such as being reorganized by the HDFS balancer, use INVALIDATE METADATA to avoid a performance penalty from reduced local reads. If you used Impala version 1.0, the INVALIDATE METADATA statement works just like the Impala 1.0 REFRESH statement did, while the Impala 1.1 REFRESH is optimized for the common use case of adding new data files to an existing table, thus the table name argument is now required.
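As a rough illustration of the cases the guide describes for a table Impala already knows about, the statements might be issued in impala-shell as sketched below. The table name `sales` is hypothetical and used only for illustration.

```sql
-- Hypothetical table name `sales`; assumes Impala already knows the table.

-- New data files were added to the table: the cheaper, immediate reload.
REFRESH sales;

-- Data was reorganized more extensively (for example by the HDFS balancer):
-- mark all metadata for the table stale; it is reloaded on the next query.
INVALIDATE METADATA sales;

-- Flush the metadata for all tables (no table name given).
INVALIDATE METADATA;
```

Per the quoted guide, the form without a table name applies to all tables, so it is the most expensive of the three.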

Regarding the handling of existing tables:

The table name is a required parameter [for REFRESH]. To flush the metadata for all tables, use the INVALIDATE METADATA command. Because REFRESH table_name only works for tables that the current Impala node is already aware of, when you create a new table in the Hive shell, enter INVALIDATE METADATA new_table before you can see the new table in impala-shell. Once the table is known by Impala, you can issue REFRESH table_name after you add data files for that table.
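The sequence described in that passage could look roughly like the sketch below. The table name `new_table` follows the guide's wording, while the column definitions and the inserted row are assumptions added purely for illustration.

```sql
-- Step 1, in the Hive shell: create the table (columns are illustrative).
CREATE TABLE new_table (id INT, name STRING);

-- Step 2, in impala-shell: make Impala aware of the table created in Hive.
INVALIDATE METADATA new_table;

-- Step 3, in the Hive shell: add data to the table (illustrative row).
INSERT INTO new_table VALUES (1, 'example');

-- Step 4, in impala-shell: the table is now known, so REFRESH is enough
-- to pick up the newly added data files.
REFRESH new_table;
SELECT COUNT(*) FROM new_table;
```

Running `REFRESH new_table` at step 2 instead would not work according to the guide, because the Impala node is not yet aware of the table.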

So it does appear to remain the same. I believe CDH 5.9 ships with Impala 2.7.

Regarding "impala - What is the difference between the INVALIDATE METADATA and REFRESH commands in Impala?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42239213/
