
amazon-web-services - Get results directly from Amazon Transcribe (serverless)

Reposted. Author: 行者123. Updated: 2023-12-02 00:13:44

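I am using a serverless Lambda function to transcribe speech to text with Amazon Transcribe. My current script can transcribe a file from S3 and store the result as a JSON file in S3.

Is it possible to get the result directly? I want to store it in a database (PostgreSQL in AWS RDS).

Thanks for any pointers.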

serverless.yml

...
provider:
  name: aws
  runtime: nodejs10.x
  region: eu-central-1
  memorySize: 128
  timeout: 30
  environment:
    S3_AUDIO_BUCKET: ${self:service}-${opt:stage, self:provider.stage}-records
    S3_TRANSCRIPTION_BUCKET: ${self:service}-${opt:stage, self:provider.stage}-transcriptions
    LANGUAGE_CODE: de-DE
  iamRoleStatements:
    - Effect: Allow
      Action:
        - s3:PutObject
        - s3:GetObject
      Resource:
        - 'arn:aws:s3:::${self:provider.environment.S3_AUDIO_BUCKET}/*'
        - 'arn:aws:s3:::${self:provider.environment.S3_TRANSCRIPTION_BUCKET}/*'
    - Effect: Allow
      Action:
        - transcribe:StartTranscriptionJob
      Resource: '*'

functions:

  transcribe:
    handler: handler.transcribe
    events:
      - s3:
          bucket: ${self:provider.environment.S3_AUDIO_BUCKET}
          event: s3:ObjectCreated:*

  createTextinput:
    handler: handler.createTextinput
    events:
      - http:
          path: textinputs
          method: post
          cors: true
...

resources:
  Resources:
    S3TranscriptionBucket:
      Type: 'AWS::S3::Bucket'
      Properties:
        BucketName: ${self:provider.environment.S3_TRANSCRIPTION_BUCKET}
...

handler.js

const db = require('./db_connect');

const awsSdk = require('aws-sdk');

const transcribeService = new awsSdk.TranscribeService();

module.exports.transcribe = (event, context, callback) => {
  const records = event.Records;

  const transcribingPromises = records.map((record) => {
    const recordUrl = [
      'https://s3.amazonaws.com',
      process.env.S3_AUDIO_BUCKET,
      record.s3.object.key,
    ].join('/');

    // create random filename to avoid conflicts in amazon transcribe jobs

    function makeid(length) {
      var result = '';
      var characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
      var charactersLength = characters.length;
      for (var i = 0; i < length; i++) {
        result += characters.charAt(Math.floor(Math.random() * charactersLength));
      }
      return result;
    }

    const TranscriptionJobName = makeid(7);

    return transcribeService.startTranscriptionJob({
      LanguageCode: process.env.LANGUAGE_CODE,
      Media: { MediaFileUri: recordUrl },
      MediaFormat: 'wav',
      TranscriptionJobName,
      //MediaSampleRateHertz: 8000, // normally 8000 if you are using wav file
      OutputBucketName: process.env.S3_TRANSCRIPTION_BUCKET,
    }).promise();
  });

  Promise.all(transcribingPromises)
    .then(() => {
      callback(null, { message: 'Start transcription job successfully' });
    })
    .catch(err => callback(err, { message: 'Error start transcription job' }));
};

module.exports.createTextinput = (event, context, callback) => {
  context.callbackWaitsForEmptyEventLoop = false;
  const data = JSON.parse(event.body);
  db.insert('textinputs', data)
    .then(res => {
      callback(null, {
        statusCode: 200,
        body: "Textinput Created! id: " + res
      })
    })
    .catch(e => {
      callback(null, {
        statusCode: e.statusCode || 500,
        body: "Could not create a Textinput " + e
      })
    })
};

Best answer

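I think your best option is to trigger a Lambda from the S3 event that fires when the transcription is stored, and then post the data to your database. As Dunedan mentioned, you can't go directly from Transcribe to a database.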
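As a rough illustration of what such a triggered Lambda receives, the sketch below pulls the bucket name and object key out of an S3 event notification. The helper name `transcriptLocations`, the sample bucket name, and the sample key are made up for this example. The one detail worth noting is that object keys in S3 event notifications arrive URL-encoded, with spaces sent as `+`, so they should be decoded before being passed back to S3.

```javascript
// Sketch only: transcriptLocations and the sample event are hypothetical.

// Pull bucket and key out of each record of an S3 event notification.
// Keys arrive URL-encoded, with spaces sent as '+', so decode them first.
function transcriptLocations(event) {
  return (event.Records || []).map((record) => ({
    bucket: record.s3.bucket.name,
    key: decodeURIComponent(record.s3.object.key.replace(/\+/g, ' ')),
  }));
}

// Hypothetical event, trimmed to the fields used here.
const sampleEvent = {
  Records: [
    {
      s3: {
        bucket: { name: 'my-service-dev-transcriptions' },
        object: { key: 'my+call%281%29.json' },
      },
    },
  ],
};

console.log(transcriptLocations(sampleEvent));
```

Each returned pair can be used as the `Bucket` and `Key` parameters of an S3 `getObject` call.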

You can add the event to the Lambda through Serverless like this:

storeTranscriptInDB:
  handler: index.storeTranscriptInDB
  events:
    - s3:
        bucket: ${self:provider.environment.S3_TRANSCRIPTION_BUCKET}
        rules:
          - suffix: .json

The S3 key of the transcript file will be event.Records[#].s3.object.key. I would loop through the records to be thorough, and do something like this for each one:

const awsSdk = require('aws-sdk');

const s3 = new awsSdk.S3();

module.exports.storeTranscriptInDB = async (event) => {
  for (const record of event.Records) {
    // S3 event keys are URL-encoded, with spaces sent as '+'
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));
    const params = {
      Bucket: record.s3.bucket.name,
      Key: key,
    };
    // download the transcript JSON that Amazon Transcribe wrote to the bucket
    const transcriptFile = await s3.getObject(params).promise();
    const transcriptObject = JSON.parse(transcriptFile.Body.toString('utf-8'));
    const transcriptResults = transcriptObject.results.transcripts;
    let transcript = '';
    transcriptResults.forEach(result => (transcript += result.transcript + ' '));
    // at this point you can post the transcript variable to your database
  }
};
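
For the final "post to your database" step, here is a minimal sketch of how the parsed transcript could be turned into a parameterized INSERT for PostgreSQL. It runs without AWS or a database connection. The table name `transcripts`, the columns `job_name` and `transcript`, and the helper names are assumptions made for illustration, not part of the original answer. The `{ text, values }` shape is the query config object that a node-postgres style client accepts; the question's own `db` module may expect something different.

```javascript
// Sketch only: table, column, and helper names below are hypothetical.

// Join every transcript segment of a Transcribe result document into one string.
function extractTranscript(transcribeResult) {
  const segments =
    (transcribeResult.results && transcribeResult.results.transcripts) || [];
  return segments.map((segment) => segment.transcript).join(' ').trim();
}

// Build a parameterized INSERT so the transcript text is never concatenated into SQL.
function buildInsertQuery(jobName, transcript) {
  return {
    text: 'INSERT INTO transcripts (job_name, transcript) VALUES ($1, $2)',
    values: [jobName, transcript],
  };
}

// Trimmed-down example of the JSON document Amazon Transcribe writes to S3.
const sample = {
  jobName: 'aB3dE9x',
  results: {
    transcripts: [
      { transcript: 'Guten Tag.' },
      { transcript: 'Wie geht es Ihnen?' },
    ],
  },
  status: 'COMPLETED',
};

const query = buildInsertQuery(sample.jobName, extractTranscript(sample));
console.log(query.text);
console.log(query.values);
```

Inside the handler, the resulting object would be passed to whatever database client the project already uses, in place of the closing comment in the loop above.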

Regarding "amazon-web-services - Get results directly from Amazon Transcribe (serverless)", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57774120/
