typescript - OpenAI rate limit 429 bug


Reposted by: 行者123 Updated: 2023-12-02 22:48:15


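I'm trying to use this repository, which builds semantic search over YouTube videos with OpenAI + Pinecone, but I hit a 429 error at this step: "Run the command npx tsx src/bin/process-yt-playlist.ts to preprocess the transcripts, fetch embeddings for them from OpenAI, and insert them into a Pinecone search index."

Any help is appreciated!

Attached is my openai.ts file: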

import pMap from 'p-map'
import unescape from 'unescape'

import * as config from '@/lib/config'

import * as types from './types'

import pMemoize from 'p-memoize'
import pRetry from 'p-retry'
import pThrottle from 'p-throttle'

// TODO: enforce max OPENAI_EMBEDDING_CTX_LENGTH of 8191

// https://platform.openai.com/docs/guides/rate-limits/what-are-the-rate-limits-for-our-api
// TODO: enforce TPM
const throttleRPM = pThrottle({
  // 3k per minute instead of 3.5k per minute to add padding
  limit: 3000,
  interval: 60 * 1000,
  strict: true
})

type PineconeCaptionVectorPending = {
  id: string
  input: string
  metadata: types.PineconeCaptionMetadata
}

export async function getEmbeddingsForVideoTranscript({
  transcript,
  title,
  openai,
  model = config.openaiEmbeddingModel,
  maxInputTokens = 100, // TODO???
  concurrency = 1
}: {
  transcript: types.Transcript
  title: string
  openai: types.OpenAIApi
  model?: string
  maxInputTokens?: number
  concurrency?: number
}) {
  const { videoId } = transcript

  let pendingVectors: PineconeCaptionVectorPending[] = []
  let currentStart = ''
  let currentNumTokensEstimate = 0
  let currentInput = ''
  let currentPartIndex = 0
  let currentVectorIndex = 0
  let isDone = false

  // const createEmbedding = pMemoize(throttleRPM(createEmbeddingImpl))

  // Pre-compute the embedding inputs, making sure none of them are too long
  do {
    isDone = currentPartIndex >= transcript.parts.length

    const part = transcript.parts[currentPartIndex]
    const text = unescape(part?.text)
      .replaceAll('[Music]', '')
      .replaceAll(/[\t\n]/g, ' ')
      .replaceAll('  ', ' ')
      .trim()
    const numTokens = getNumTokensEstimate(text)

    if (!isDone && currentNumTokensEstimate + numTokens < maxInputTokens) {
      if (!currentStart) {
        currentStart = part.start
      }

      currentNumTokensEstimate += numTokens
      currentInput = `${currentInput} ${text}`

      ++currentPartIndex
    } else {
      currentInput = currentInput.trim()
      if (isDone && !currentInput) {
        break
      }

      const currentVector: PineconeCaptionVectorPending = {
        id: `${videoId}:${currentVectorIndex++}`,
        input: currentInput,
        metadata: {
          title,
          videoId,
          text: currentInput,
          start: currentStart
        }
      }

      pendingVectors.push(currentVector)

      // reset current batch
      currentNumTokensEstimate = 0
      currentStart = ''
      currentInput = ''
    }
  } while (!isDone)
  let index = 0;

  console.log("Entering embeddings calculation")
  // Evaluate all embeddings with a max concurrency
  // const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
  const vectors: types.PineconeCaptionVector[] = await pMap(
    pendingVectors,
    async (pendingVector) => {
      // await delay(6000); // add a delay of 6 seconds before each iteration
      console.log(pendingVector.input + " " + model)


      // const { data: embed } = await openai.createEmbedding({
      //   input: pendingVector.input,
      //   model
      // })

      async function createEmbeddingImpl({
        input = pendingVector.input,
        model = 'text-embedding-ada-002'
      }: {
        input: string
        model?: string
      }): Promise<number[]> {
        const res = await pRetry(
          () =>
            openai.createEmbedding({
              input,
              model
            }),
          {
            retries: 4,
            minTimeout: 1000,
            factor: 2.5
          }
        )
      
        return res.data.data[0].embedding
      }

      const embedding = await pMemoize(throttleRPM(createEmbeddingImpl));
      

      const vector: types.PineconeCaptionVector = {
        id: pendingVector.id,
        metadata: pendingVector.metadata,
        values: await embedding(pendingVector)
      }
      console.log(index + " THIS IS THE NUMBER OF CALLS TO OPENAI Embedding: " + embedding)
      index++;
      return vector
    },
    {
      concurrency
    }
  )

  return vectors
}

function getNumTokensEstimate(input: string): number {
  const numTokens = (input || '')
    .split(/\s/)
    .map((token) => token.trim())
    .filter(Boolean).length

  return numTokens
}

I tried increasing the interval between API calls so that I stay well below the limit, but somehow I still get the same error.
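
As a quick sanity check on that claim, the arithmetic can be written out. The sketch below uses the two numbers visible in the file above: the 6000 ms delay from the commented-out `delay` call and the `limit: 3000` per 60-second interval in `throttleRPM`. The helper names are hypothetical, and the real limit on a given account may differ from the one hard-coded in the file.

```typescript
const MINUTE_MS = 60 * 1000

// Upper bound on requests per minute when each call waits `delayMs` first
// and calls run one at a time (concurrency = 1).
function maxRequestsPerMinute(delayMs: number): number {
  if (delayMs <= 0) throw new RangeError('delayMs must be positive')
  return Math.floor(MINUTE_MS / delayMs)
}

// Smallest spacing between calls that stays within `limitPerMinute`.
function minIntervalMs(limitPerMinute: number): number {
  if (limitPerMinute <= 0) throw new RangeError('limitPerMinute must be positive')
  return MINUTE_MS / limitPerMinute
}

console.log(maxRequestsPerMinute(6000)) // 10
console.log(minIntervalMs(3000)) // 20
```

At roughly 10 requests per minute against a configured 3000, request pacing alone is unlikely to explain a 429, so the response body and the account's quota are worth checking as well.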

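Best answer

OpenAI returns a 429 Rate Limit error if you have no credits left. I had been using the free credits, which expire after 3 months. You can check your available credits on the usage page: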

https://platform.openai.com/account/usage

Side note: once I put a credit card on file, it took about 5 minutes for the limit to go away.
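
Since an exhausted-credit 429 and a genuine rate-limit 429 share the same HTTP status, it can help to inspect the error body before retrying. The sketch below assumes the error thrown by `openai.createEmbedding` is Axios-shaped (`err.response.status`, `err.response.data.error`) and that the body's `type` or `code` field reads `insufficient_quota` when credits have run out. Both are assumptions to verify against the responses you actually receive. `ApiErrorLike`, `Verdict` and `classify429` are hypothetical names.

```typescript
// Hypothetical shape of an Axios-style error; only the fields read below.
type ApiErrorLike = {
  response?: {
    status?: number
    data?: {
      error?: { type?: string | null; code?: string | null; message?: string }
    }
  }
}

type Verdict = 'quota' | 'rate-limit' | 'other'

// Decide what kind of failure a rejected request represents.
// Assumption: an exhausted-credit 429 carries "insufficient_quota" in
// error.type or error.code; any other 429 is treated as a real rate limit.
function classify429(err: unknown): Verdict {
  const res = (err as ApiErrorLike | null | undefined)?.response
  if (res?.status !== 429) return 'other'

  const apiError = res?.data?.error
  const isQuota =
    apiError?.type === 'insufficient_quota' ||
    apiError?.code === 'insufficient_quota'

  return isQuota ? 'quota' : 'rate-limit'
}
```

With a check like this, a `'quota'` verdict could be surfaced immediately (for example by throwing p-retry's `AbortError` inside the retried function) instead of being retried with backoff, since waiting does not restore credits.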

Regarding typescript - OpenAI rate limit 429 bug, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/75763453/
