I was using the Data Pipeline template called Export DynamoDB table to S3 to export a DynamoDB table to a file. I recently switched all of my DynamoDB tables to on-demand capacity, and the template no longer works. I'm fairly sure this is because the old template specifies a percentage of DynamoDB throughput to consume, which is meaningless for an on-demand table.
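To confirm which mode a table is actually in, this AWS CLI check works (the table name is from my setup; note that older tables that have never switched modes may not report a BillingModeSummary at all):

$ aws dynamodb describe-table --table-name LIVE_Invoices \
    --query 'Table.BillingModeSummary.BillingMode'
"PAY_PER_REQUEST"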
I tried exporting the old template to JSON, removing the references to throughput percentage consumption, and creating a new pipeline from that. However, this was unsuccessful.

Can anyone suggest how to convert an old-style pipeline script built around provisioned throughput into one that works with on-demand tables?

Here is my original working script:
{
  "objects": [
    {
      "name": "DDBSourceTable",
      "id": "DDBSourceTable",
      "type": "DynamoDBDataNode",
      "tableName": "#{myDDBTableName}"
    },
    {
      "name": "EmrClusterForBackup",
      "coreInstanceCount": "1",
      "coreInstanceType": "m3.xlarge",
      "releaseLabel": "emr-5.13.0",
      "masterInstanceType": "m3.xlarge",
      "id": "EmrClusterForBackup",
      "region": "#{myDDBRegion}",
      "type": "EmrCluster"
    },
    {
      "failureAndRerunMode": "CASCADE",
      "resourceRole": "DataPipelineDefaultResourceRole",
      "role": "DataPipelineDefaultRole",
      "scheduleType": "ONDEMAND",
      "name": "Default",
      "id": "Default"
    },
    {
      "output": {
        "ref": "S3BackupLocation"
      },
      "input": {
        "ref": "DDBSourceTable"
      },
      "maximumRetries": "2",
      "name": "TableBackupActivity",
      "step": "s3://dynamodb-emr-#{myDDBRegion}/emr-ddb-storage-handler/2.1.0/emr-ddb-2.1.0.jar,org.apache.hadoop.dynamodb.tools.DynamoDbExport,#{output.directoryPath},#{input.tableName},#{input.readThroughputPercent}",
      "id": "TableBackupActivity",
      "runsOn": {
        "ref": "EmrClusterForBackup"
      },
      "type": "EmrActivity",
      "resizeClusterBeforeRunning": "true"
    },
    {
      "directoryPath": "#{myOutputS3Loc}/#{format(@scheduledStartTime, 'YYYY-MM-dd-HH-mm-ss')}",
      "name": "S3BackupLocation",
      "id": "S3BackupLocation",
      "type": "S3DataNode"
    }
  ],
  "parameters": [
    {
      "description": "Output S3 folder",
      "id": "myOutputS3Loc",
      "type": "AWS::S3::ObjectKey"
    },
    {
      "description": "Source DynamoDB table name",
      "id": "myDDBTableName",
      "type": "String"
    },
    {
      "default": "0.25",
      "watermark": "Enter value between 0.1-1.0",
      "description": "DynamoDB read throughput ratio",
      "id": "myDDBReadThroughputRatio",
      "type": "Double"
    },
    {
      "default": "us-east-1",
      "watermark": "us-east-1",
      "description": "Region of the DynamoDB table",
      "id": "myDDBRegion",
      "type": "String"
    }
  ],
  "values": {
    "myDDBRegion": "us-east-1",
    "myDDBTableName": "LIVE_Invoices",
    "myDDBReadThroughputRatio": "0.25",
    "myOutputS3Loc": "s3://company-live-extracts/"
  }
}
And here is the modified script I tried, with the read throughput references removed:

{
  "objects": [
    {
      "name": "DDBSourceTable",
      "id": "DDBSourceTable",
      "type": "DynamoDBDataNode",
      "tableName": "#{myDDBTableName}"
    },
    {
      "name": "EmrClusterForBackup",
      "coreInstanceCount": "1",
      "coreInstanceType": "m3.xlarge",
      "releaseLabel": "emr-5.13.0",
      "masterInstanceType": "m3.xlarge",
      "id": "EmrClusterForBackup",
      "region": "#{myDDBRegion}",
      "type": "EmrCluster"
    },
    {
      "failureAndRerunMode": "CASCADE",
      "resourceRole": "DataPipelineDefaultResourceRole",
      "role": "DataPipelineDefaultRole",
      "scheduleType": "ONDEMAND",
      "name": "Default",
      "id": "Default"
    },
    {
      "output": {
        "ref": "S3BackupLocation"
      },
      "input": {
        "ref": "DDBSourceTable"
      },
      "maximumRetries": "2",
      "name": "TableBackupActivity",
      "step": "s3://dynamodb-emr-#{myDDBRegion}/emr-ddb-storage-handler/2.1.0/emr-ddb-2.1.0.jar,org.apache.hadoop.dynamodb.tools.DynamoDbExport,#{output.directoryPath},#{input.tableName}",
      "id": "TableBackupActivity",
      "runsOn": {
        "ref": "EmrClusterForBackup"
      },
      "type": "EmrActivity",
      "resizeClusterBeforeRunning": "true"
    },
    {
      "directoryPath": "#{myOutputS3Loc}/#{format(@scheduledStartTime, 'YYYY-MM-dd-HH-mm-ss')}",
      "name": "S3BackupLocation",
      "id": "S3BackupLocation",
      "type": "S3DataNode"
    }
  ],
  "parameters": [
    {
      "description": "Output S3 folder",
      "id": "myOutputS3Loc",
      "type": "AWS::S3::ObjectKey"
    },
    {
      "description": "Source DynamoDB table name",
      "id": "myDDBTableName",
      "type": "String"
    },
    {
      "default": "us-east-1",
      "watermark": "us-east-1",
      "description": "Region of the DynamoDB table",
      "id": "myDDBRegion",
      "type": "String"
    }
  ],
  "values": {
    "myDDBRegion": "us-east-1",
    "myDDBTableName": "LIVE_Invoices",
    "myOutputS3Loc": "s3://company-live-extracts/"
  }
}
The EMR job then fails with the following error (stack trace truncated):

at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:322)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:198)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java
Best Answer
I opened a support ticket with AWS about this. Their response was very thorough, and I'll paste it below:
Thank you for contacting us about this issue.

Unfortunately, Data Pipeline's export/import jobs for DynamoDB do not support DynamoDB's new on-demand mode [1].

Tables that use on-demand capacity have no defined capacity for read and write units. Data Pipeline relies on this defined capacity when calculating the throughput for the pipeline.

For example, if you have 100 RCUs (Read Capacity Units) and a pipeline throughput ratio of 0.25 (25%), the effective pipeline throughput would be 25 read units per second (100 * 0.25).

With on-demand capacity, however, the RCUs and WCUs (Write Capacity Units) are reported as 0. Regardless of the pipeline throughput ratio, the calculated effective throughput is 0.

The pipeline will not execute when the effective throughput is less than 1.

Do you need to export your DynamoDB tables to S3?

If you are only exporting these tables as backups, I would recommend using DynamoDB's On-Demand Backup and Restore feature (confusingly similar in name to on-demand capacity) [2].

Note that on-demand backups do not affect your table's throughput, and they complete in seconds. You only pay for the S3 storage costs associated with the backups.

However, these table backups are not directly accessible to customers; they can only be restored to a DynamoDB table. This backup method is not suitable if you want to run analytics on the backup data, or import the data into other systems, accounts, or tables.
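For illustration, the backup-and-restore workflow with the AWS CLI looks roughly like this (the table name is from the question, and the other names and the backup ARN are placeholders):

$ aws dynamodb create-backup --table-name LIVE_Invoices --backup-name LIVE_Invoices-backup
$ aws dynamodb list-backups --table-name LIVE_Invoices
$ aws dynamodb restore-table-from-backup --target-table-name LIVE_Invoices_restored --backup-arn <backup-arn>

Note that restore-table-from-backup creates a new table; you cannot restore into an existing one.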
If you do need to use Data Pipeline to export your DynamoDB data, then the only way forward is to set the tables to Provisioned Capacity mode.

You can do this manually, or include it in the pipeline itself as an activity that runs AWS CLI commands [3].

For example (on-demand capacity is also known as pay-per-request mode):
$ aws dynamodb update-table --table-name myTable --billing-mode PROVISIONED --provisioned-throughput ReadCapacityUnits=100,WriteCapacityUnits=100
$ aws dynamodb update-table --table-name myTable --billing-mode PAY_PER_REQUEST
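Building on that reply, here is a rough sketch of how the mode switch could be embedded in the pipeline itself as ShellCommandActivity objects. The object names and capacity values here are mine, not from AWS; it assumes the AWS CLI is available on the resource the activities run on, and that the resource's instance role is allowed to call dynamodb:UpdateTable:

{
  "name": "SetProvisionedMode",
  "id": "SetProvisionedMode",
  "type": "ShellCommandActivity",
  "runsOn": { "ref": "EmrClusterForBackup" },
  "command": "aws dynamodb update-table --region #{myDDBRegion} --table-name #{myDDBTableName} --billing-mode PROVISIONED --provisioned-throughput ReadCapacityUnits=100,WriteCapacityUnits=100 && aws dynamodb wait table-exists --region #{myDDBRegion} --table-name #{myDDBTableName}"
},
{
  "name": "RestoreOnDemandMode",
  "id": "RestoreOnDemandMode",
  "type": "ShellCommandActivity",
  "runsOn": { "ref": "EmrClusterForBackup" },
  "dependsOn": { "ref": "TableBackupActivity" },
  "command": "aws dynamodb update-table --region #{myDDBRegion} --table-name #{myDDBTableName} --billing-mode PAY_PER_REQUEST"
}

TableBackupActivity would then declare "dependsOn": { "ref": "SetProvisionedMode" } so the export only starts once the table is back in provisioned mode (the wait command polls until the table returns to ACTIVE status). One caveat: DynamoDB limits how often a table can switch billing modes (once per 24 hours at the time of writing), so this pattern caps out at roughly one export per day per table. I haven't verified this end to end; treat it as a starting point rather than a tested solution.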
A similar question about amazon-dynamodb - how to export a DynamoDB table that uses on-demand capacity with Data Pipeline - can be found on Stack Overflow: https://stackoverflow.com/questions/54666788/