
aws-cloudformation - Running a crawler using a CloudFormation template

Reposted. Author: 行者123. Updated: 2023-12-03 07:34:58

This CloudFormation template works as expected and creates all the resources required by this article:

Data visualization and anomaly detection using Amazon Athena and Pandas from Amazon SageMaker | AWS Machine Learning Blog

However, the WorkflowStartTrigger resource does not actually run the crawler. How can I run the crawler using a CloudFormation template?

Resources:
  MyRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          -
            Effect: "Allow"
            Principal:
              Service:
                - "glue.amazonaws.com"
            Action:
              - "sts:AssumeRole"
      Path: "/"
      Policies:
        -
          PolicyName: "root"
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              -
                Effect: "Allow"
                Action: "*"
                Resource: "*"

  MyDatabase:
    Type: AWS::Glue::Database
    Properties:
      CatalogId: !Ref AWS::AccountId
      DatabaseInput:
        Name: "dbcrawler123"
        Description: "TestDatabaseDescription"
        LocationUri: "TestLocationUri"
        Parameters:
          key1: "value1"
          key2: "value2"

  MyCrawler2:
    Type: AWS::Glue::Crawler
    Properties:
      Description: example classifier
      Name: "testcrawler123"
      Role: !GetAtt MyRole.Arn
      DatabaseName: !Ref MyDatabase
      Targets:
        S3Targets:
          - Path: 's3://nytaxi162/'
      SchemaChangePolicy:
        UpdateBehavior: "UPDATE_IN_DATABASE"
        DeleteBehavior: "LOG"
      TablePrefix: test-
      Configuration: "{\"Version\":1.0,\"CrawlerOutput\":{\"Partitions\":{\"AddOrUpdateBehavior\":\"InheritFromTable\"},\"Tables\":{\"AddOrUpdateBehavior\":\"MergeNewColumns\"}}}"

  WorkflowStartTrigger:
    Type: AWS::Glue::Trigger
    Properties:
      Description: Trigger for starting the Crawler
      Name: StartTrigger
      Type: ON_DEMAND
      Actions:
        - CrawlerName: "testcrawler123"

Best answer

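You should be able to do this by creating a custom resource backed by a Lambda function, where the Lambda function is what actually starts the crawler. You should even be able to have it wait for the crawler to finish running.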
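As a rough illustration of that idea, here is a minimal sketch of what such a Lambda handler might look like in Python. It is not taken from the original answer. It assumes the custom resource passes the crawler name in a property called `CrawlerName` (a name chosen here for illustration), that the function's execution role is allowed to call `glue:StartCrawler`, and that starting the crawler only on stack creation is the desired behavior. The `glue` and `respond` parameters are hypothetical hooks added so the function can be exercised without AWS access.

```python
import json
import urllib.request


def send_response(event, context, status, reason=""):
    """PUT the result to the pre-signed URL that CloudFormation supplies."""
    body = json.dumps({
        "Status": status,  # "SUCCESS" or "FAILED"
        "Reason": reason or "See CloudWatch log stream " + context.log_stream_name,
        "PhysicalResourceId": event.get("PhysicalResourceId") or context.log_stream_name,
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "Data": {},
    }).encode("utf-8")
    request = urllib.request.Request(
        event["ResponseURL"],
        data=body,
        method="PUT",
        headers={"Content-Type": ""},
    )
    urllib.request.urlopen(request)


def handler(event, context, glue=None, respond=send_response):
    """Start the Glue crawler when the custom resource is created."""
    try:
        # Assumption: only run on Create; Update and Delete just report success.
        if event["RequestType"] == "Create":
            if glue is None:
                import boto3  # provided by the Lambda Python runtime
                glue = boto3.client("glue")
            # "CrawlerName" is an illustrative property name, not a fixed API.
            glue.start_crawler(Name=event["ResourceProperties"]["CrawlerName"])
        respond(event, context, "SUCCESS")
    except Exception as exc:
        # Always answer CloudFormation, otherwise the stack hangs until timeout.
        respond(event, context, "FAILED", str(exc))
```

In the template, the function would be referenced from a custom resource (for example one of type `AWS::CloudFormation::CustomResource`) whose `ServiceToken` is the function's ARN and whose `CrawlerName` property is `testcrawler123`. To wait for completion, the handler could poll `glue.get_crawler` until the crawler's `State` returns to `READY` before responding, keeping in mind that a Lambda invocation is limited to 15 minutes.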

Regarding aws-cloudformation - running a crawler using a CloudFormation template, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/64300994/
