Let's say I have an Athena table mytable partitioned by columns A, B, and C.
We will write data to the S3 bucket using an UNLOAD query. The data will be partitioned by A/B/C, and each of the three columns will have several distinct values.
We want the table to point to only one C partition for each A/B combination. The way we are thinking of doing this is that each time we introduce a new value of C, we will run:
ALTER TABLE mytable DROP PARTITION (A = 'some A value', B = 'some B value')
ALTER TABLE mytable ADD PARTITION (A = 'some A value', B = 'some B value', C = 'new C value')
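For reference, a minimal sketch of that swap for a single A/B combination. The bucket name, prefix, and literal partition values are placeholders, not details from the setup described here, and the DROP statement names the old C value explicitly rather than relying on a partial partition spec. Athena runs one statement per query, so each statement would be submitted separately.

```sql
-- Placeholder values throughout; substitute your own bucket and partition values.

-- Unregister the old C partition. For an external table this only removes
-- catalog metadata; the objects under the old prefix stay in S3.
ALTER TABLE mytable DROP IF EXISTS PARTITION (A = 'A1', B = 'B1', C = 'C1');

-- Register the new C partition and point it at its S3 prefix.
ALTER TABLE mytable ADD IF NOT EXISTS PARTITION (A = 'A1', B = 'B1', C = 'C2')
LOCATION 's3://my-bucket/my-prefix/A=A1/B=B1/C=C2/';

-- List the partitions the table currently points to.
SHOW PARTITIONS mytable;
```

Running SHOW PARTITIONS after the swap is a quick way to confirm that exactly one C partition is registered for the A/B combination.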
Let's say there are 10 different values for C for a given A/B combo.
A=A1/B=B1/C=C1/
A=A1/B=B1/C=C2/
A=A1/B=B1/C=C3/
A=A1/B=B1/C=C4/
...
If I query mytable on only A and B, and mytable is only looking at 1 partition, would Athena still scan across C2, C3, C4? The query I would run is:
SELECT ...
FROM mytable
WHERE A = <some A value>
and B = <some B value>
I would not query by C, so I want to make sure Athena does not scan over multiple C partitions.
Recommended answer
Yes, Athena would scan all the partitions of C (C1, C2, C3, C4, and so on) unless you filter on C in the WHERE clause.
If you filter on C, as in the query below, only the matching C partition is scanned:
select * from mytable
where A = <some A value> and
B = <some B value> and
C = <some C value>
If you write it like this, without a filter on C:
select * from mytable
where A = <some A value> and
B = <some B value>
It would scan all the partitions of C.
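One way to check this on your own table rather than take it on faith is Athena's "$path" pseudo-column, which returns the S3 object each row was read from. The sketch below uses placeholder values 'A1' and 'B1'. The distinct paths it returns show which C prefixes the query actually read for that A/B combination, and the "Data scanned" figure in the query statistics can be compared with the size of a single C prefix.

```sql
-- Placeholder partition values; substitute your own.
-- Each returned path includes the C=... prefix the data was read from.
SELECT DISTINCT "$path"
FROM mytable
WHERE A = 'A1'
  AND B = 'B1';
```

If only one C partition is registered for that A/B combination, the expectation is that every returned path falls under that one prefix, since Athena reads from the locations of partitions registered in the catalog.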
Comments
Even if mytable is only pointing to 1 partition of C? Then is ALTER TABLE even altering anything?
@JeremyFisher So at any given time you will have only one partition of C registered in your table, is that what you are saying?