
cassandra - Upgrading from Cassandra 2.1.4 to 2.1.5

Reposted. Author: 行者123. Updated: 2023-12-02 09:13:13

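All,

A few days ago, I upgraded our 6-node EC2 cluster from Cassandra 2.1.4 to 2.1.5.

Since then, CPU usage on all my nodes has "exploded": they sit at 100% CPU for much of the time, and their load average is between 100 and 300 (!!!).

This did not start immediately after the upgrade. It started a few hours later on one node, and slowly more and more nodes began to show the same behavior.
It seems to correlate with the compaction of our largest column family, and once the compaction completes (about 24 hours after it starts) the node seems to return to normal. It has only been 2 days or so, so I am hoping it will not happen again, but I am still monitoring it.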

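Here are my questions:

Is this a bug or expected behavior?

If it is expected behavior:

What is the explanation for this issue?
Is it documented somewhere that I have missed?
Should I do upgrades differently, maybe 1 or 2 nodes at a time every 24 hours or so? What is the best practice?

If it is a bug:

Is it a known issue?
Where should I report it? What data should I include?
Will downgrading back to 2.1.4 work?

Any feedback on this would be great.

Thanks,

Amir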

Update:

This is the structure of the table in question.

CREATE TABLE tbl1 (
    key text PRIMARY KEY,
    created_at timestamp,
    customer_id bigint,
    device_id bigint,
    event text,
    fail_count bigint,
    generation bigint,
    gr_id text,
    imei text,
    raw_post text,
    "timestamp" timestamp
) WITH COMPACT STORAGE
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = 'NONE';


The logs reveal very little (at least to me). Here is a snippet of what they look like:

INFO [WRITE-/10.0.1.142] 2015-05-23 05:43:42,577 YamlConfigurationLoader.java:92 - Loading settings from file:/etc/cassandra/cassandra.yaml
INFO [WRITE-/10.0.1.142] 2015-05-23 05:43:42,580 YamlConfigurationLoader.java:135 - Node configuration:[authenticator=AllowAllAuthenticator; authorizer=AllowAllAuthorizer; auto_snapshot=true; batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024; broadcast_rpc_address=10.0.2.145; cas_contention_timeout_in_ms=1000; client_encryption_options=; cluster_name=Gryphonet21 Cluster; column_index_size_in_kb=64; commit_failure_policy=stop; commitlog_directory=/data/cassandra/commitlog; commitlog_segment_size_in_mb=32; commitlog_sync=periodic; commitlog_sync_period_in_ms=10000; compaction_throughput_mb_per_sec=16; concurrent_counter_writes=32; concurrent_reads=32; concurrent_writes=32; counter_cache_save_period=7200; counter_cache_size_in_mb=null; counter_write_request_timeout_in_ms=5000; cross_node_timeout=false; data_file_directories=[/data/cassandra/data]; disk_failure_policy=stop; dynamic_snitch_badness_threshold=0.1; dynamic_snitch_reset_interval_in_ms=600000; dynamic_snitch_update_interval_in_ms=100; endpoint_snitch=GossipingPropertyFileSnitch; hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024; incremental_backups=false; index_summary_capacity_in_mb=null; index_summary_resize_interval_in_minutes=60; inter_dc_tcp_nodelay=false; internode_compression=all; key_cache_save_period=14400; key_cache_size_in_mb=null; max_hint_window_in_ms=10800000; max_hints_delivery_threads=2; memtable_allocation_type=heap_buffers; native_transport_port=9042; num_tokens=16; partitioner=RandomPartitioner; permissions_validity_in_ms=2000; range_request_timeout_in_ms=10000; read_request_timeout_in_ms=5000; request_scheduler=org.apache.cassandra.scheduler.NoScheduler; request_timeout_in_ms=10000; row_cache_save_period=0; row_cache_size_in_mb=0; rpc_address=0.0.0.0; rpc_keepalive=true; rpc_port=9160; rpc_server_type=sync; saved_caches_directory=/data/cassandra/saved_caches; seed_provider=[{class_name=org.apache.cassandra.locator.SimpleSeedProvider, parameters=[{seeds=10.0.1.141,10.0.2.145,10.0.3.149}]}]]; server_encryption_options=; snapshot_before_compaction=false; ssl_storage_port=7001; sstable_preemptive_open_interval_in_mb=50; start_native_transport=true; start_rpc=true; storage_port=7000; thrift_framed_transport_size_in_mb=15; tombstone_failure_threshold=100000; tombstone_warn_threshold=1000; trickle_fsync=false; trickle_fsync_interval_in_kb=10240; truncate_request_timeout_in_ms=60000; write_request_timeout_in_ms=2000]

INFO [HANDSHAKE-/10.0.1.142] 2015-05-23 05:43:42,591 OutboundTcpConnection.java:494 - Cannot handshake version with /10.0.1.142
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,713 MessagingService.java:887 - 135 MUTATION messages dropped in last 5000ms
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,713 StatusLogger.java:51 - Pool Name Active Pending Completed Blocked All Time Blocked
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,714 StatusLogger.java:66 - CounterMutationStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,714 StatusLogger.java:66 - ReadStage 5 1 5702809 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,715 StatusLogger.java:66 - RequestResponseStage 0 45 29528010 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,715 StatusLogger.java:66 - ReadRepairStage 0 0 997 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,715 StatusLogger.java:66 - MutationStage 0 31 43404309 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,716 StatusLogger.java:66 - GossipStage 0 0 569931 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,716 StatusLogger.java:66 - AntiEntropyStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,716 StatusLogger.java:66 - CacheCleanupExecutor 0 0 0 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,717 StatusLogger.java:66 - MigrationStage 0 0 9 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,829 StatusLogger.java:66 - ValidationExecutor 0 0 0 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,830 StatusLogger.java:66 - Sampler 0 0 0 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,830 StatusLogger.java:66 - MiscStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,831 StatusLogger.java:66 - CommitLogArchiver 0 0 0 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,831 StatusLogger.java:66 - MemtableFlushWriter 1 1 1756 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,831 StatusLogger.java:66 - PendingRangeCalculator 0 0 11 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,832 StatusLogger.java:66 - MemtableReclaimMemory 0 0 1756 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,832 StatusLogger.java:66 - MemtablePostFlush 1 2 3819 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,832 StatusLogger.java:66 - CompactionExecutor 2 32 742 0 0
INFO [ScheduledTasks:1] 2015-05-23 05:43:42,833 StatusLogger.java:66 - InternalResponseStage 0 0 0 0 0
INFO [HANDSHAKE-/10.0.1.142] 2015-05-23 05:43:45,086 OutboundTcpConnection.java:485 - Handshaking version with /10.0.1.142
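
One way to make a StatusLogger dump like the one above easier to read is to pull out only the thread pools that have a backlog. The following is a minimal Python sketch, not part of the original post. It assumes each pool row ends with a pool name followed by five integer columns (Active, Pending, Completed, Blocked, All Time Blocked); the function name `backlogged_pools` and the `min_pending` threshold are illustrative choices.

```python
import re

# A pool row: "StatusLogger.java:<line> - <PoolName> <5 integer columns>".
# The separator is matched loosely so "66-Name" and "66 - Name" both work.
ROW = re.compile(
    r"StatusLogger\.java:\d+\s*-\s*(\w+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$"
)

def backlogged_pools(log_lines, min_pending=1):
    """Return (pool, active, pending) tuples with pending >= min_pending,
    sorted with the largest backlog first."""
    hits = []
    for line in log_lines:
        m = ROW.search(line)
        if m is None:
            continue  # header row and unrelated log messages are skipped
        name = m.group(1)
        active, pending = int(m.group(2)), int(m.group(3))
        if pending >= min_pending:
            hits.append((name, active, pending))
    return sorted(hits, key=lambda h: h[2], reverse=True)

# Rows copied from the log excerpt in the question.
sample = [
    "INFO [ScheduledTasks:1] 2015-05-23 05:43:42,714 StatusLogger.java:66 - ReadStage 5 1 5702809 0 0",
    "INFO [ScheduledTasks:1] 2015-05-23 05:43:42,715 StatusLogger.java:66 - RequestResponseStage 0 45 29528010 0 0",
    "INFO [ScheduledTasks:1] 2015-05-23 05:43:42,715 StatusLogger.java:66 - MutationStage 0 31 43404309 0 0",
    "INFO [ScheduledTasks:1] 2015-05-23 05:43:42,716 StatusLogger.java:66 - GossipStage 0 0 569931 0 0",
    "INFO [ScheduledTasks:1] 2015-05-23 05:43:42,832 StatusLogger.java:66 - CompactionExecutor 2 32 742 0 0",
]

for pool, active, pending in backlogged_pools(sample, min_pending=10):
    print(f"{pool}: active={active} pending={pending}")
```

In this excerpt the pools with the most pending work are RequestResponseStage, CompactionExecutor and MutationStage, which is at least consistent with the dropped MUTATION messages and the suspected link to compaction.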

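Update:

The problem persists. I thought that once each node had gone through one compaction it would return to normal, but that is not the case. After a few hours, CPU jumps back to 100% and the load average is in the 100-300 range.

I am downgrading back to 2.1.4.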

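Update:

I used phact's dumpThreads script to get stack traces. I also tried jvmtop, but it just seemed to hang.

The output is too big to paste here, but you can find it at http://downloads.gryphonet.com/cassandra/.

Username: cassandra
Password: cassandra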
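
When stack traces are too large to read by hand, a quick tally of which thread pools are actually runnable can narrow things down. The Python sketch below is an illustration only: it assumes the dumps use the standard HotSpot `jstack` text layout (a quoted thread name on a header line, followed by a `java.lang.Thread.State:` line), which may not match what the dumpThreads script produces. The sample dump is invented for the example and is not taken from the files linked above.

```python
from collections import Counter
import re

HEADER = re.compile(r'^"([^"]+)"')                        # "ThreadName" ... header line
STATE = re.compile(r"java\.lang\.Thread\.State:\s+(\w+)")  # state line under the header

def runnable_by_pool(dump_text):
    """Count RUNNABLE threads per pool, where the pool is the thread name
    with a trailing ':N' or '-N' index stripped."""
    counts = Counter()
    current = None
    for line in dump_text.splitlines():
        h = HEADER.match(line)
        if h:
            current = re.sub(r"[:\-]\d+$", "", h.group(1))
            continue
        s = STATE.search(line)
        if s and current is not None:
            if s.group(1) == "RUNNABLE":
                counts[current] += 1
            current = None  # only the first state line after a header counts
    return counts

# Hypothetical dump fragment, for illustration only.
SAMPLE_DUMP = '''\
"CompactionExecutor:12" #101 daemon prio=1 tid=0x01 nid=0x1a runnable [0x01]
   java.lang.Thread.State: RUNNABLE
        at example.Frame.one(Frame.java:1)

"CompactionExecutor:13" #102 daemon prio=1 tid=0x02 nid=0x1b runnable [0x02]
   java.lang.Thread.State: RUNNABLE

"SharedPool-Worker-3" #55 daemon prio=5 tid=0x03 nid=0x1c waiting on condition [0x03]
   java.lang.Thread.State: WAITING (parking)
'''

for pool, n in runnable_by_pool(SAMPLE_DUMP).most_common():
    print(pool, n)
```

Comparing these counts across several dumps taken a few seconds apart gives a rough picture of where CPU time is going, although a sampling profiler is more reliable.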

Best answer

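Try using jvmtop to see what the Cassandra process is doing. It has two modes: one shows the currently running threads, and the other (--profile) shows the CPU distribution per class and method. Paste both outputs here.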
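
The two jvmtop modes mentioned above correspond to two invocations against the Cassandra process ID. As a rough sketch, the Python helper below only builds and prints the command lines; the script path `./jvmtop.sh`, the helper name `jvmtop_commands`, and the PID `12345` are placeholders, and the argument layout assumes jvmtop's documented `jvmtop.sh <pid>` and `jvmtop.sh --profile <pid>` forms.

```python
import shlex

def jvmtop_commands(pid, jvmtop="./jvmtop.sh"):
    """Build the argv lists for jvmtop's thread view and its CPU profiler."""
    pid = str(int(pid))  # raises ValueError early for a non-numeric PID
    return [
        [jvmtop, pid],               # detail mode: threads of one JVM
        [jvmtop, "--profile", pid],  # sampling CPU profiler for the same JVM
    ]

# 12345 is a placeholder; substitute the real Cassandra PID.
for argv in jvmtop_commands(12345):
    print(shlex.join(argv))
```

Each list can be passed to `subprocess.run` as-is once the path and PID are correct for the node being inspected.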

Regarding "cassandra - Upgrading from Cassandra 2.1.4 to 2.1.5", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/30404621/
