
amazon-web-services - AWS Glue CloudFormation exclude patterns (Exclusions: String)

Reposted. Author: 行者123. Updated: 2023-12-03 07:19:10

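I have successfully set up a Glue crawler in the AWS console. I now have a CloudFormation template that reproduces the whole setup, except that I cannot get the Exclusions: field to work in the template. Background: in the AWS Glue API, the Exclusions: field holds glob patterns that exclude files or folders matching a given pattern within the data store (an S3 data store in my case).

Despite considerable effort, I cannot get the glob patterns to show up in the Glue crawler console. Every other value in the script is populated in the crawler configuration: the S3 target, the crawler name, the IAM role, and the grouping behavior all come through from the CFN template. The only exception is the Exclusions field, which the Glue console calls "Exclude patterns". My CFN template passes validation, and I have run the crawler hoping the exclusion globs would still take effect even though they are not displayed, but I cannot seem to populate the Exclusions field. What am I doing wrong?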

Here's the S3Target Exclusion AWS Glue API guide

Here's an AWS sample YAML CFN for a Glue Crawler

Here's a helpful YAML string array guide

YAML

CFNCrawlerSecDeraNUM:
  Type: AWS::Glue::Crawler
  Properties:
    Name: !Ref CFNCrawlerName
    Role: !GetAtt CFNRoleSecDERA.Arn
    #Classifiers: none, use the default classifier
    Description: AWS Glue crawler to crawl SecDERA data
    #Schedule: none, use default run-on-demand
    DatabaseName: !Ref CFNDatabaseName
    Targets:
      S3Targets:
        - Exclusions:
            - "*/readme.htm"
            - "*/sub.txt"
            - "*/pre.txt"
            - "*/tag.txt"
        - Path: "s3://sec-input"
    TablePrefix: !Ref CFNTablePrefixName
    SchemaChangePolicy:
      UpdateBehavior: "UPDATE_IN_DATABASE"
      DeleteBehavior: "LOG"
    # Added single schema grouping Glue API option
    Configuration: "{\"Version\":1.0,\"CrawlerOutput\":{\"Partitions\":{\"AddOrUpdateBehavior\":\"InheritFromTable\"},\"Tables\":{\"AddOrUpdateBehavior\":\"MergeNewColumns\"}},\"Grouping\":{\"TableGroupingPolicy\":\"CombineCompatibleSchemas\"}}"

JSON

"CFNCrawlerSecDeraNUM": {
  "Type": "AWS::Glue::Crawler",
  "Properties": {
    "Name": {
      "Ref": "CFNCrawlerName"
    },
    "Role": {
      "Fn::GetAtt": [
        "CFNRoleSecDERA",
        "Arn"
      ]
    },
    "Description": "AWS Glue crawler to crawl SecDERA data",
    "DatabaseName": {
      "Ref": "CFNDatabaseName"
    },
    "Targets": {
      "S3Targets": [
        {
          "Exclusions": [
            "*/readme.htm",
            "*/sub.txt",
            "*/pre.txt",
            "*/tag.txt"
          ]
        },
        {
          "Path": "s3://sec-input"
        }
      ]
    },
    "TablePrefix": {
      "Ref": "CFNTablePrefixName"
    },
    "SchemaChangePolicy": {
      "UpdateBehavior": "UPDATE_IN_DATABASE",
      "DeleteBehavior": "LOG"
    },
    "Configuration": "{\"Version\":1.0,\"CrawlerOutput\":{\"Partitions\":{\"AddOrUpdateBehavior\":\"InheritFromTable\"},\"Tables\":{\"AddOrUpdateBehavior\":\"MergeNewColumns\"}},\"Grouping\":{\"TableGroupingPolicy\":\"CombineCompatibleSchemas\"}}"
  }
}

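Best answer

You are passing Exclusions as a separate S3Target object in the S3Targets list. Each "-" in the YAML starts a new list item, so the exclusions end up in a target with no Path, and the target that has the Path has no exclusions.

Try changing this: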

  Targets:
    S3Targets:
      - Exclusions:
          - "*/readme.htm"
          - "*/sub.txt"
          - "*/pre.txt"
          - "*/tag.txt"
      - Path: "s3://sec-input"

to this:

  Targets:
    S3Targets:
      - Path: "s3://sec-input"
        Exclusions:
          - "*/readme.htm"
          - "*/sub.txt"
          - "*/pre.txt"
          - "*/tag.txt"
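
To make the difference concrete, here is a minimal Python sketch that writes both S3Targets layouts as plain dicts, roughly the shape a YAML or JSON parser would hand to CloudFormation. The helper name `find_detached_exclusions` is hypothetical and not part of any AWS tooling; it only flags list entries that carry Exclusions without a Path. The last line prints the corrected Targets property as JSON, which may help if you maintain the JSON version of the template.

```python
import json

EXCLUSIONS = ["*/readme.htm", "*/sub.txt", "*/pre.txt", "*/tag.txt"]

# Layout from the question: two list items, hence two separate S3Target objects.
original_targets = [
    {"Exclusions": EXCLUSIONS},
    {"Path": "s3://sec-input"},
]

# Layout from the answer: a single S3Target holding both keys.
fixed_targets = [
    {"Path": "s3://sec-input", "Exclusions": EXCLUSIONS},
]


def find_detached_exclusions(s3_targets):
    """Return the indexes of targets that have Exclusions but no Path.

    Hypothetical helper for illustration only.
    """
    return [
        index
        for index, target in enumerate(s3_targets)
        if "Exclusions" in target and "Path" not in target
    ]


print(find_detached_exclusions(original_targets))  # [0]
print(find_detached_exclusions(fixed_targets))     # []

# JSON form of the corrected Targets property.
print(json.dumps({"Targets": {"S3Targets": fixed_targets}}, indent=2))
```

In the original layout the first entry is reported because its exclusions are not attached to any path; in the corrected layout nothing is reported.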

Regarding amazon-web-services - AWS Glue CloudFormation exclude patterns (Exclusions: String), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58850677/
