
amazon-web-services - AWS DMS replication task from Postgres RDS to Redshift gets Access Denied on the S3 bucket

Reposted. Author: 行者123. Updated: 2023-12-05 04:24:00

We have deployed a DMS replication task to replicate our entire Postgres database to Redshift. The tables are created with the correct schemas, but the data does not make it through to Redshift and is instead left behind in the S3 bucket that DMS uses as an intermediate step. This is all deployed via Terraform.

We have configured the IAM roles as described in the replication instance Terraform docs: all three IAM roles (dms-access-for-endpoint, dms-cloudwatch-logs-role, and dms-vpc-role) have been created. The IAM roles are deployed in a different stack from the DMS deployment, because the roles are also used by another DMS instance, running different tasks, which was deployed successfully.

data "aws_iam_policy_document" "dms_assume_role_document" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      identifiers = [
        "s3.amazonaws.com",
        "iam.amazonaws.com",
        "redshift.amazonaws.com",
        "dms.amazonaws.com",
        "redshift-serverless.amazonaws.com"
      ]
      type = "Service"
    }
  }
}

# Database Migration Service requires the below IAM Roles to be created before
# replication instances can be created. See the DMS Documentation for
# additional information: https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Security.html#CHAP_Security.APIRole
# * dms-vpc-role
# * dms-cloudwatch-logs-role
# * dms-access-for-endpoint
resource "aws_iam_role" "dms_access_for_endpoint" {
  name                  = "dms-access-for-endpoint"
  assume_role_policy    = data.aws_iam_policy_document.dms_assume_role_document.json
  managed_policy_arns   = ["arn:aws:iam::aws:policy/service-role/AmazonDMSRedshiftS3Role"]
  force_detach_policies = true
}

resource "aws_iam_role" "dms_cloudwatch_logs_role" {
  name                  = "dms-cloudwatch-logs-role"
  description           = "Allow DMS to manage CloudWatch logs."
  assume_role_policy    = data.aws_iam_policy_document.dms_assume_role_document.json
  managed_policy_arns   = ["arn:aws:iam::aws:policy/service-role/AmazonDMSCloudWatchLogsRole"]
  force_detach_policies = true
}

resource "aws_iam_role" "dms_vpc_role" {
  name                  = "dms-vpc-role"
  description           = "DMS IAM role for VPC permissions"
  assume_role_policy    = data.aws_iam_policy_document.dms_assume_role_document.json
  managed_policy_arns   = ["arn:aws:iam::aws:policy/service-role/AmazonDMSVPCManagementRole"]
  force_detach_policies = true
}
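The question does not show how the Redshift endpoint itself is wired up. For context, a DMS target endpoint that stages data in the intermediate bucket via the dms-access-for-endpoint role might look roughly like the sketch below; the resource names, cluster address, and credentials variable are illustrative assumptions, not taken from the question.

```hcl
# Hypothetical sketch: a DMS target endpoint for Redshift that stages data in
# the intermediate S3 bucket using the dms-access-for-endpoint role.
# Names, the cluster address, and var.redshift_password are placeholders.
resource "aws_dms_endpoint" "redshift_target" {
  endpoint_id   = "sandbox-redshift-target"
  endpoint_type = "target"
  engine_name   = "redshift"

  server_name   = "example-cluster.abc123.eu-west-2.redshift.amazonaws.com"
  port          = 5439
  database_name = "sandbox"
  username      = "dms_user"
  password      = var.redshift_password

  redshift_settings {
    bucket_name             = aws_s3_bucket.dms_redshift_intermediate.id
    service_access_role_arn = aws_iam_role.dms_access_for_endpoint.arn
  }
}
```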

However, at runtime we see the following logs in CloudWatch:

2022-09-01T16:51:38 [SOURCE_UNLOAD   ]E:  Not retriable error: <AccessDenied> Access Denied [1001705]  (anw_retry_strategy.cpp:118)
2022-09-01T16:51:38 [SOURCE_UNLOAD ]E: Failed to list bucket 'dms-sandbox-redshift-intermediate-storage': error code <AccessDenied>: Access Denied [1001713] (s3_dir_actions.cpp:105)
2022-09-01T16:51:38 [SOURCE_UNLOAD ]E: Failed to list bucket 'dms-sandbox-redshift-intermediate-storage' [1001713] (s3_dir_actions.cpp:209)

We also enabled S3 server access logging on the bucket itself to see whether that would give us more information. This is what we see (anonymised):

<id> dms-sandbox-redshift-intermediate-storage [01/Sep/2022:15:43:32 +0000] 10.128.69.80 arn:aws:sts::<account>:assumed-role/dms-access-for-endpoint/dms-session-for-replication-engine <code> REST.GET.BUCKET - "GET /dms-sandbox-redshift-intermediate-storage?delimiter=%2F&max-keys=1000 HTTP/1.1" 403 AccessDenied 243 - 30 - "-" "aws-sdk-cpp/1.8.80/S3/Linux/4.14.276-211.499.amzn2.x86_64 x86_64 GCC/4.9.3" - <code> SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader s3.eu-west-2.amazonaws.com TLSv1.2 -

The above suggests that dms-session-for-replication is the session receiving the AccessDenied responses, but we were unable to determine what it is or how to fix it.

We tried adding a bucket policy to the S3 bucket itself, but that did not work (the below also includes the S3 server access logs bucket):

resource "aws_s3_bucket" "dms_redshift_intermediate" {
  # Prefixed with `dms-` as that's what the AmazonDMSRedshiftS3Role policy filters on
  bucket = "dms-sandbox-redshift-intermediate-storage"
}

resource "aws_s3_bucket_logging" "log_bucket" {
  bucket        = aws_s3_bucket.dms_redshift_intermediate.id
  target_bucket = aws_s3_bucket.log_bucket.id
  target_prefix = "log/"
}

resource "aws_s3_bucket" "log_bucket" {
  bucket = "${aws_s3_bucket.dms_redshift_intermediate.id}-logs"
}

resource "aws_s3_bucket_acl" "log_bucket" {
  bucket = aws_s3_bucket.log_bucket.id
  acl    = "log-delivery-write"
}

resource "aws_s3_bucket_policy" "dms_redshift_intermediate_policy" {
  bucket = aws_s3_bucket.dms_redshift_intermediate.id
  policy = data.aws_iam_policy_document.dms_redshift_intermediate_policy_document.json
}

data "aws_iam_policy_document" "dms_redshift_intermediate_policy_document" {
  statement {
    actions = [
      "s3:*"
    ]

    principals {
      identifiers = [
        "dms.amazonaws.com",
        "redshift.amazonaws.com"
      ]
      type = "Service"
    }

    resources = [
      aws_s3_bucket.dms_redshift_intermediate.arn,
      "${aws_s3_bucket.dms_redshift_intermediate.arn}/*"
    ]
  }
}
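Worth noting for comparison: the caller in the server access log above is the assumed role (arn:aws:sts::&lt;account&gt;:assumed-role/dms-access-for-endpoint/...), not a service principal, so a bucket policy statement aimed at that caller would use an AWS principal rather than a Service principal. A sketch of what that statement might look like follows; the &lt;account-id&gt; placeholder mirrors the anonymised log and is not from the question.

```hcl
# Sketch only: grants the dms-access-for-endpoint role (the principal seen in
# the access logs) list/read/write access to the bucket, using an AWS
# principal instead of a Service principal. <account-id> is a placeholder.
data "aws_iam_policy_document" "dms_role_bucket_access" {
  statement {
    actions = [
      "s3:ListBucket",
      "s3:GetObject",
      "s3:PutObject",
      "s3:DeleteObject"
    ]

    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::<account-id>:role/dms-access-for-endpoint"]
    }

    resources = [
      aws_s3_bucket.dms_redshift_intermediate.arn,
      "${aws_s3_bucket.dms_redshift_intermediate.arn}/*"
    ]
  }
}
```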

How do we fix the <AccessDenied> responses we are seeing in CloudWatch and allow the data to be loaded into Redshift? DMS is able to PUT items into the S3 bucket, as we can see encrypted CSVs appearing in it (the server access logs confirm this as well), but DMS is then unable to GET the files back out of it for Redshift. The AccessDenied responses also suggest this is an IAM role issue rather than a security group issue, but our IAM roles are configured as per the documentation, so we are confused as to what could be causing the problem.

Best answer

What we assumed was an IAM issue turned out to be a security group issue. Redshift's COPY command was unable to reach S3. By adding an egress rule for HTTPS on port 443 to the Redshift security group, we were able to pull the data through again:

resource "aws_security_group_rule" "https_443_egress" {
  type              = "egress"
  description       = "Allow HTTPS egress from the Redshift SG"
  protocol          = "tcp"
  from_port         = 443
  to_port           = 443
  security_group_id = aws_security_group.redshift.id
  cidr_blocks       = ["0.0.0.0/0"]
}
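If opening egress to 0.0.0.0/0 is undesirable, a common alternative to the rule above (not part of the accepted answer) is an S3 gateway VPC endpoint, which lets traffic from the VPC reach S3 without traversing the internet. A sketch, where the VPC and route table references are assumptions about the surrounding stack; the eu-west-2 region matches the S3 endpoint seen in the access logs.

```hcl
# Alternative sketch (not from the answer): route S3 traffic through a gateway
# VPC endpoint so the Redshift subnets can reach S3 privately. The VPC and
# route table references are assumed to exist elsewhere in the stack.
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.eu-west-2.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = [aws_route_table.private.id]
}
```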

So if you are running into the same problem as in the question, check whether Redshift can reach S3 over HTTPS.

Regarding "amazon-web-services - AWS DMS replication task from Postgres RDS to Redshift gets Access Denied on the S3 bucket", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/73581127/
