
python - OpenAI embeddings cosine similarity search 'Input vector should be 1-D' error

Reposted. Author: 行者123. Updated: 2023-12-02 05:47:26

I am getting the following error in a Jupyter Notebook:

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[2], line 39
37 query = input("Enter your query: ")
38 print("Recommended contacts:")
---> 39 for contact in search_contacts(query):
40 print(contact)

Cell In[2], line 33, in search_contacts(query)
31 scores = {}
32 for contact, embedding in embeddings.items():
---> 33 scores[contact] = 1 - cosine(query_embedding, embedding)
34 return sorted(scores, key=scores.get, reverse=True)[:5]

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\scipy\spatial\distance.py:668, in cosine(u, v, w)
626 """
627 Compute the Cosine distance between 1-D arrays.
628
(...)
663
664 """
665 # cosine distance is also referred to as 'uncentered correlation',
666 # or 'reflective correlation'
667 # clamp the result to 0-2
--> 668 return max(0, min(correlation(u, v, w=w, centered=False), 2.0))

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\scipy\spatial\distance.py:608, in correlation(u, v, w, centered)
575 def correlation(u, v, w=None, centered=True):
576 """
577 Compute the correlation distance between two 1-D arrays.
578
(...)
606
607 """
--> 608 u = _validate_vector(u)
609 v = _validate_vector(v)
610 if w is not None:

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\scipy\spatial\distance.py:301, in _validate_vector(u, dtype)
299 if u.ndim == 1:
300 return u
--> 301 raise ValueError("Input vector should be 1-D.")

ValueError: Input vector should be 1-D.

Here is my code:

import pandas as pd
import openai
import numpy as np
from scipy.spatial.distance import cosine

# Authenticate to OpenAI
openai.api_key = "API_KEY"

# Load the CSV file
contacts = pd.read_csv("c:/tmp/connect.csv")

# Generate embeddings for each contact using GPT-3
embeddings = {}
for index, row in contacts.iterrows():
    combined = row["Combined"]
    response = openai.Completion.create(
        model="text-davinci-002",
        prompt=f"generate embeddings for {combined}",
        temperature=0.5,
    )
    embedding = response["choices"][0]["text"]
    embeddings[combined] = embedding

# Search function to return recommended contacts based on a user's query
def search_contacts(query):
    query_embedding = openai.Completion.create(
        model="text-davinci-002",
        prompt=f"generate embeddings for {query}",
        temperature=0.5,
    )["choices"][0]["text"]
    scores = {}
    for contact, embedding in embeddings.items():
        scores[contact] = 1 - cosine(query_embedding, embedding)
    return sorted(scores, key=scores.get, reverse=True)[:5]

# Example usage
query = input("Enter your query: ")
print("Recommended contacts:")
for contact in search_contacts(query):
    print(contact)

My connect.csv file looks like this:

Combined
Full Name: Alex Goodwill; Company: HyperCap; Position: Business Consultant
Full Name: Amy Power; Company: Hollywood; Position: Strategy & Operations - Office of the CEO

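I need help figuring out how to fix this error. I searched Google but did not find anything that helps me understand how I am passing a non-1-D array to the cosine similarity search.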

Best answer

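You are trying to compute the cosine similarity of text rather than of vectors. Embeddings are vector representations of text that carry semantic meaning. You do not create embeddings by prompting the completions endpoint. You need to use the embeddings endpoint.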
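To make the cause concrete, here is a minimal sketch of why SciPy rejects the original inputs. The two strings are made-up stand-ins for the text that the completions endpoint returns; they are not real API output. A Python string converts to a 0-dimensional NumPy array, so the 1-D check shown in the traceback fails.

```python
import numpy as np
from scipy.spatial.distance import cosine

# Hypothetical stand-ins for response["choices"][0]["text"] values.
text_a = "some generated text"
text_b = "other generated text"

# A string becomes a 0-dimensional array, not a 1-D vector.
print(np.asarray(text_a).ndim)

error_message = None
try:
    cosine(text_a, text_b)
except ValueError as exc:
    # Same failure as in the question's traceback.
    error_message = str(exc)
    print(error_message)

# Numeric sequences of equal length are valid 1-D inputs.
distance = cosine([1.0, 0.0], [0.0, 1.0])
print(distance)
```

The embeddings endpoint returns numeric lists of this second kind, which is why switching endpoints resolves the error.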

response = openai.Embedding.create(
    input=[
        "Sample text goes here",
        "there can be one or several phrases in each batch"
    ],
    engine="text-embedding-ada-002"
)

The response will contain an embedding for each phrase. For example:

{
  "data": [
    {
      "embedding": [0, 0, 0,....],
      "index": 0,
      "object": "embedding"
    },
    {
      "embedding": [0, 0, 0,....],
      "index": 1,
      "object": "embedding"
    }
  ],
  "model": "text-embedding-ada-002-v2",
  "object": "list",
  "usage": {
    "prompt_tokens": ,
    "total_tokens":
  }
}

So you can take the embeddings from the response and compute the cosine similarity.

response['data'][0]['embedding']
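As a rough sketch of that last step, the snippet below ranks contacts by cosine similarity once the vectors are available. The helper names `vectors_from_response` and `top_matches`, the `fake_response` dictionary, and the three-element vectors are all illustrative assumptions. The dictionary only mimics the response shape shown above; real embeddings are far longer and come from the API.

```python
import numpy as np
from scipy.spatial.distance import cosine


def vectors_from_response(response):
    """Return one 1-D float array per input phrase, ordered by "index"."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [np.asarray(item["embedding"], dtype=float) for item in items]


def top_matches(query_vector, vectors_by_name, k=5):
    """Rank names by cosine similarity (1 - cosine distance), best first."""
    scores = {
        name: 1 - cosine(query_vector, vector)
        for name, vector in vectors_by_name.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy stand-in for an embeddings response (assumed shape, made-up numbers).
fake_response = {
    "data": [
        {"embedding": [0.0, 1.0, 0.0], "index": 1, "object": "embedding"},
        {"embedding": [1.0, 0.0, 0.0], "index": 0, "object": "embedding"},
    ]
}

first, second = vectors_from_response(fake_response)
contacts = {"contact A": first, "contact B": second}

# Made-up query vector that points mostly in the direction of "contact A".
query_vector = np.array([0.9, 0.1, 0.0])
ranking = top_matches(query_vector, contacts)
print(ranking)
```

In the original script, the same idea would mean storing the embedding vector for each "Combined" value instead of completion text, and embedding the query the same way before scoring.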

Regarding python - OpenAI embeddings cosine similarity search 'Input vector should be 1-D' error, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/75252902/
