
clickhouse - Populating a materialized view in ClickHouse exceeds the memory limit


I am trying to create a materialized view with the ReplicatedAggregatingMergeTree engine on top of a table that uses the ReplicatedMergeTree engine.

After a few million rows I get DB::Exception: Memory limit (for query) exceeded. Is there a way to work around this?

CREATE MATERIALIZED VIEW IF NOT EXISTS shared.aggregated_calls_1h
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{shard}/shared/aggregated_calls_1h', '{replica}')
PARTITION BY toRelativeDayNum(retained_until_date)
ORDER BY (
    client_id,
    t,
    is_synthetic,
    source_application_ids,
    source_service_id,
    source_endpoint_id,
    destination_application_ids,
    destination_service_id,
    destination_endpoint_id,
    boundary_application_ids,
    process_snapshot_id,
    docker_snapshot_id,
    host_snapshot_id,
    cluster_snapshot_id,
    http_status
)
SETTINGS index_granularity = 8192
POPULATE
AS
SELECT
    client_id,
    toUInt64(floor(t / (60000 * 60)) * (60000 * 60)) AS t,
    date,
    toDate(retained_until_timestamp / 1000) AS retained_until_date,
    is_synthetic,
    source_application_ids,
    source_service_id,
    source_endpoint_id,
    destination_application_ids,
    destination_service_id,
    destination_endpoint_id,
    boundary_application_ids,
    http_status,
    process_snapshot_id,
    docker_snapshot_id,
    host_snapshot_id,
    cluster_snapshot_id,
    any(destination_endpoint) AS destination_endpoint,
    any(destination_endpoint_type) AS destination_endpoint_type,
    groupUniqArrayArrayState(destination_technologies) AS destination_technologies_state,
    minState(ingestion_time) AS min_ingestion_time_state,
    sumState(batchCount) AS sum_call_count_state,
    sumState(errorCount) AS sum_error_count_state,
    sumState(duration) AS sum_duration_state,
    minState(toUInt64(ceil(duration / batchCount))) AS min_duration_state,
    maxState(toUInt64(ceil(duration / batchCount))) AS max_duration_state,
    quantileTimingWeightedState(0.25)(toUInt64(ceil(duration / batchCount)), batchCount) AS latency_p25_state,
    quantileTimingWeightedState(0.50)(toUInt64(ceil(duration / batchCount)), batchCount) AS latency_p50_state,
    quantileTimingWeightedState(0.75)(toUInt64(ceil(duration / batchCount)), batchCount) AS latency_p75_state,
    quantileTimingWeightedState(0.90)(toUInt64(ceil(duration / batchCount)), batchCount) AS latency_p90_state,
    quantileTimingWeightedState(0.95)(toUInt64(ceil(duration / batchCount)), batchCount) AS latency_p95_state,
    quantileTimingWeightedState(0.98)(toUInt64(ceil(duration / batchCount)), batchCount) AS latency_p98_state,
    quantileTimingWeightedState(0.99)(toUInt64(ceil(duration / batchCount)), batchCount) AS latency_p99_state,
    quantileTimingWeightedState(0.25)(toUInt64(ceil(duration / batchCount) / 100), batchCount) AS latency_p25_large_state,
    quantileTimingWeightedState(0.50)(toUInt64(ceil(duration / batchCount) / 100), batchCount) AS latency_p50_large_state,
    quantileTimingWeightedState(0.75)(toUInt64(ceil(duration / batchCount) / 100), batchCount) AS latency_p75_large_state,
    quantileTimingWeightedState(0.90)(toUInt64(ceil(duration / batchCount) / 100), batchCount) AS latency_p90_large_state,
    quantileTimingWeightedState(0.95)(toUInt64(ceil(duration / batchCount) / 100), batchCount) AS latency_p95_large_state,
    quantileTimingWeightedState(0.98)(toUInt64(ceil(duration / batchCount) / 100), batchCount) AS latency_p98_large_state,
    quantileTimingWeightedState(0.99)(toUInt64(ceil(duration / batchCount) / 100), batchCount) AS latency_p99_large_state,
    sumState(minSelfTime) AS sum_min_self_time_state
FROM shared.calls_v2
WHERE sample_type != 'user_selected'
GROUP BY
    client_id,
    t,
    date,
    retained_until_date,
    is_synthetic,
    source_application_ids,
    source_service_id,
    source_endpoint_id,
    destination_application_ids,
    destination_service_id,
    destination_endpoint_id,
    boundary_application_ids,
    process_snapshot_id,
    docker_snapshot_id,
    host_snapshot_id,
    cluster_snapshot_id,
    http_status
HAVING destination_endpoint_type != 'INTERNAL'

Best answer

You can try increasing the limit with the clickhouse-client --max_memory_usage option.

--max_memory_usage arg "Maximum memory usage for processing of single query. Zero means unlimited."

https://clickhouse.yandex/docs/en/operations/settings/query_complexity/#settings_max_memory_usage
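For example, the same setting can also be raised for the session in which the POPULATE runs (the 20 GB value below is only an illustration; pick a limit that fits the memory actually available on the host):

-- Session-level equivalent of the --max_memory_usage CLI flag.
-- 20000000000 (about 20 GB) is an assumed value, not a recommendation.
SET max_memory_usage = 20000000000;

-- Then run the CREATE MATERIALIZED VIEW ... POPULATE statement from the
-- question in the same clickhouse-client session.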

Or, instead of using POPULATE, manually copy the data into the view's inner table:

INSERT INTO shared.`.inner.aggregated_calls_1h`
SELECT
    client_id,
    toUInt64(floor(t / (60000 * 60)) * (60000 * 60)) AS t,
    ...
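If the full backfill still exceeds the memory limit, the manual copy can be split into smaller pieces. The sketch below is an assumption, not part of the original answer: it runs the view's SELECT once per value of the source table's date column, so each INSERT only has to aggregate one day of shared.calls_v2 (the inner-table name follows the .inner.<view_name> convention used above).

-- Hypothetical batched backfill: one INSERT per day instead of POPULATE,
-- so a single query never aggregates the whole table at once.
INSERT INTO shared.`.inner.aggregated_calls_1h`
SELECT
    client_id,
    toUInt64(floor(t / (60000 * 60)) * (60000 * 60)) AS t,
    -- ... the remaining columns and -State aggregates exactly as in the
    -- view definition above ...
    sumState(minSelfTime) AS sum_min_self_time_state
FROM shared.calls_v2
WHERE sample_type != 'user_selected'
    AND date = '2019-08-01'   -- repeat the INSERT for every day present in shared.calls_v2
GROUP BY
    client_id, t, date, retained_until_date, is_synthetic,
    source_application_ids, source_service_id, source_endpoint_id,
    destination_application_ids, destination_service_id, destination_endpoint_id,
    boundary_application_ids, process_snapshot_id, docker_snapshot_id,
    host_snapshot_id, cluster_snapshot_id, http_status
HAVING destination_endpoint_type != 'INTERNAL'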

Regarding "clickhouse - Populating a materialized view in ClickHouse exceeds the memory limit", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57571242/
