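Quickstart: With cloud resources
Weaviate is an open-source vector database built to power AI applications. This quickstart guide shows you how to:
- Set up a collection - Create a collection and import data into it.
- Search - Run a similarity (vector) search on your data.
- RAG - Perform retrieval augmented generation (RAG) with a generative model.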
- Query Agent (Weaviate Cloud only) - Get answers from your data with a natural language prompt/question.
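Prerequisites
A Weaviate Cloud Sandbox instance - You need an admin API key and a REST endpoint URL to connect to your instance. See the instructions below for details. If you don't want to use Weaviate Cloud, see the local quickstart with Docker.
How to set up a Weaviate Cloud Sandbox instance
Go to the Weaviate Cloud console and create a free Sandbox instance.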
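- Cluster provisioning typically takes 1-3 minutes.
- When the cluster is ready, Weaviate Cloud displays a checkmark (✔️) next to the cluster name.
- Note that Weaviate Cloud adds a random suffix to sandbox cluster names to ensure uniqueness.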
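How to retrieve your Weaviate Cloud credentials (WEAVIATE_API_KEY and WEAVIATE_URL)
After you create a Weaviate Cloud instance, you need:
- the REST endpoint URL, and
- the admin API key.
You can retrieve both from the WCD console.
Weaviate supports both the REST and gRPC protocols. For Weaviate Cloud deployments, you only need to provide the REST endpoint URL; the client configures gRPC automatically.
Once you have the REST endpoint URL and the admin API key, you can connect to the Sandbox instance and work with Weaviate.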
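The snippets in this guide read both credentials from the WEAVIATE_URL and WEAVIATE_API_KEY environment variables. The following is a minimal sketch of a pre-flight check that fails with a readable message when either one is missing. The helper name `load_weaviate_credentials` is hypothetical, and prefixing a bare hostname with `https://` is an assumption made for illustration, not documented client behavior.

```python
import os
from urllib.parse import urlparse


def load_weaviate_credentials(env=None):
    """Return (url, api_key) read from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    required = ("WEAVIATE_URL", "WEAVIATE_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))

    url = env["WEAVIATE_URL"].strip()
    # Assumption: a bare hostname is treated as an HTTPS endpoint.
    if "://" not in url:
        url = "https://" + url
    if not urlparse(url).hostname:
        raise RuntimeError(f"WEAVIATE_URL does not look like a URL: {url!r}")
    return url, env["WEAVIATE_API_KEY"].strip()
```

Calling this helper before connecting turns a bare `KeyError` into a message that names the missing variable.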
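Install a client library
Follow the instructions below to install one of the official client libraries, which are available for Python, JavaScript/TypeScript, Go, and Java.
If a snippet doesn't work or you have any feedback, please open a GitHub issue.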
pip install -U weaviate-client[agents]
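To confirm that the installation succeeded, you can look up the installed package version with the Python standard library. This is a small sketch; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("weaviate-client")
    print(f"weaviate-client: {found or 'not installed'}")
```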
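Step 1: Create a collection and import data
You can choose between two paths when importing data: let Weaviate vectorize the data with an embedding model, or import your own precomputed vectors.
The following example creates a collection called Movie. The data is vectorized with the Weaviate Embeddings model provider, a managed embedding inference service for Weaviate Cloud users that generates vector embeddings for your data and queries directly from a Weaviate Cloud database instance. You are free to use any other available embedding model provider instead.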
import weaviate
from weaviate.classes.config import Configure
import os

# Best practice: store your credentials in environment variables
weaviate_url = os.environ["WEAVIATE_URL"]
weaviate_api_key = os.environ["WEAVIATE_API_KEY"]

# Step 1.1: Connect to your Weaviate Cloud instance
with weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=weaviate_api_key,
) as client:
    # Step 1.2: Create a collection
    movies = client.collections.create(
        name="Movie",
        vector_config=Configure.Vectors.text2vec_weaviate(),  # Configure the Weaviate Embeddings vectorizer
    )

    # Step 1.3: Import three objects
    data_objects = [
        {"title": "The Matrix", "description": "A computer hacker learns about the true nature of reality and his role in the war against its controllers.", "genre": "Science Fiction"},
        {"title": "Spirited Away", "description": "A young girl becomes trapped in a mysterious world of spirits and must find a way to save her parents and return home.", "genre": "Animation"},
        {"title": "The Lord of the Rings: The Fellowship of the Ring", "description": "A meek Hobbit and his companions set out on a perilous journey to destroy a powerful ring and save Middle-earth.", "genre": "Fantasy"},
    ]

    movies = client.collections.use("Movie")
    with movies.batch.fixed_size(batch_size=200) as batch:
        for obj in data_objects:
            batch.add_object(properties=obj)

    print(f"Imported & vectorized {len(movies)} objects into the Movie collection")
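The import above groups objects into fixed-size batches (`batch_size=200`). As a conceptual illustration only, the sketch below shows what grouping into fixed-size chunks means. The function `fixed_size_batches` is hypothetical and is not the client's implementation, which also takes care of sending each batch to the server.

```python
from itertools import islice


def fixed_size_batches(objects, batch_size):
    """Yield lists of at most batch_size objects, preserving input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(objects)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


titles = [
    "The Matrix",
    "Spirited Away",
    "The Lord of the Rings: The Fellowship of the Ring",
]
batches = list(fixed_size_batches(titles, batch_size=2))
print([len(batch) for batch in batches])  # [2, 1]
```

With only three objects and a batch size of 200, the quickstart import fits in a single batch.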
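The following example creates a collection called Movie. The data should already contain precomputed vector embeddings, that is, embeddings generated by an embedding model from a provider such as OpenAI, Anthropic, and others. This option is useful when you are migrating data from a different vector database.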
import weaviate
from weaviate.classes.config import Configure
import os

# Best practice: store your credentials in environment variables
weaviate_url = os.environ["WEAVIATE_URL"]
weaviate_api_key = os.environ["WEAVIATE_API_KEY"]

# Step 1.1: Connect to your Weaviate Cloud instance
with weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=weaviate_api_key,
) as client:
    # Step 1.2: Create a collection
    movies = client.collections.create(
        name="Movie",
        vector_config=Configure.Vectors.self_provided(),  # No automatic vectorization since we're providing vectors
    )

    # Step 1.3: Import three objects
    data_objects = [
        {"properties": {"title": "The Matrix", "description": "A computer hacker learns about the true nature of reality and his role in the war against its controllers.", "genre": "Science Fiction"},
         "vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]},
        {"properties": {"title": "Spirited Away", "description": "A young girl becomes trapped in a mysterious world of spirits and must find a way to save her parents and return home.", "genre": "Animation"},
         "vector": [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]},
        {"properties": {"title": "The Lord of the Rings: The Fellowship of the Ring", "description": "A meek Hobbit and his companions set out on a perilous journey to destroy a powerful ring and save Middle-earth.", "genre": "Fantasy"},
         "vector": [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]},
    ]

    # Insert the objects with vectors
    movies = client.collections.get("Movie")
    with movies.batch.fixed_size(batch_size=200) as batch:
        for obj in data_objects:
            batch.add_object(properties=obj["properties"], vector=obj["vector"])

    print(
        f"Imported {len(data_objects)} objects with vectors into the Movie collection"
    )
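When you supply your own vectors, every vector stored in the same vector space needs the same number of dimensions. A quick local check before importing can catch a mismatch early. The sketch below assumes the same `{"properties": ..., "vector": ...}` layout as the example above; the helper name `check_vector_dimensions` is hypothetical.

```python
def check_vector_dimensions(data_objects):
    """Return the shared vector length, or raise ValueError if the lengths differ.

    An empty input also raises, because there is no length to report.
    """
    lengths = {len(obj["vector"]) for obj in data_objects}
    if len(lengths) != 1:
        raise ValueError(f"Expected exactly one vector length, found: {sorted(lengths)}")
    return lengths.pop()


sample_objects = [
    {"properties": {"title": "The Matrix"}, "vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]},
    {"properties": {"title": "Spirited Away"}, "vector": [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]},
]
print(check_vector_dimensions(sample_objects))  # 8
```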
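Step 2: Semantic (vector) search
Semantic search finds results based on meaning. This is called nearText in Weaviate. The following example searches for the 2 objects (limit) whose meaning is most similar to sci-fi.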
import weaviate
import os, json

# Best practice: store your credentials in environment variables
weaviate_url = os.environ["WEAVIATE_URL"]
weaviate_api_key = os.environ["WEAVIATE_API_KEY"]

# Step 2.1: Connect to your Weaviate Cloud instance
with weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=weaviate_api_key,
) as client:
    # Step 2.2: Use this collection
    movies = client.collections.use("Movie")

    # Step 2.3: Perform a semantic search with NearText
    response = movies.query.near_text(
        query="sci-fi",
        limit=2
    )

    for obj in response.objects:
        print(json.dumps(obj.properties, indent=2))  # Inspect the results
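If you imported your own vectors in Step 1, search with a query vector instead. Semantic search finds results based on meaning; when the query is supplied as a vector, this is called nearVector in Weaviate. The following example searches for the 2 objects (limit) whose vectors are most similar to the query vector.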
import weaviate
import os, json

# Best practice: store your credentials in environment variables
weaviate_url = os.environ["WEAVIATE_URL"]
weaviate_api_key = os.environ["WEAVIATE_API_KEY"]

# Step 2.1: Connect to your Weaviate Cloud instance
with weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=weaviate_api_key,
) as client:
    # Step 2.2: Use this collection
    movies = client.collections.use("Movie")

    # Step 2.3: Perform a vector search with NearVector
    response = movies.query.near_vector(
        near_vector=[0.11, 0.21, 0.31, 0.41, 0.51, 0.61, 0.71, 0.81],
        limit=2
    )

    for obj in response.objects:
        print(json.dumps(obj.properties, indent=2))  # Inspect the results
Example response
{
  "genre": "Science Fiction",
  "title": "The Matrix",
  "description": "A computer hacker learns about the true nature of reality and his role in the war against its controllers."
}
{
  "genre": "Fantasy",
  "title": "The Lord of the Rings: The Fellowship of the Ring",
  "description": "A meek Hobbit and his companions set out on a perilous journey to destroy a powerful ring and save Middle-earth."
}
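To build intuition for what "most similar" means in a vector search, the sketch below ranks the three placeholder vectors from Step 1 against the query vector from the nearVector example, using a brute-force cosine similarity. It assumes cosine distance (Weaviate's default metric) and is only an illustration: a real vector index does not compare the query against every object this way. The `stored` dictionary and function name are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Placeholder vectors copied from the self-provided import example.
stored = {
    "The Matrix": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
    "Spirited Away": [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
    "The Lord of the Rings: The Fellowship of the Ring": [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
}
query = [0.11, 0.21, 0.31, 0.41, 0.51, 0.61, 0.71, 0.81]

# Sort titles from most to least similar and keep the top 2 (the "limit").
ranked = sorted(stored, key=lambda title: cosine_similarity(query, stored[title]), reverse=True)
print(ranked[:2])  # ['The Matrix', 'Spirited Away']
```

With these placeholder vectors, the brute-force ranking puts Spirited Away second, so the example response above presumably corresponds to the nearText variant, where the ranking depends on the embedding model rather than on the placeholder values.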
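Step 3: Retrieval augmented generation (RAG)
To perform retrieval augmented generation (RAG) in this step, you need a Claude API key from Anthropic. You can also use another generative model provider.
Retrieval augmented generation (RAG), also called generative search, prompts a large language model (LLM) with a combination of a user query and data retrieved from a database.
The following example combines a semantic search for the query sci-fi with a prompt, and uses the Anthropic generative model (generative-anthropic) to generate a tweet.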
import os
import weaviate
from weaviate.classes.generate import GenerativeConfig

# Best practice: store your credentials in environment variables
weaviate_url = os.environ["WEAVIATE_URL"]
weaviate_api_key = os.environ["WEAVIATE_API_KEY"]
anthropic_api_key = os.environ["ANTHROPIC_API_KEY"]

# Step 3.1: Connect to your Weaviate Cloud instance
with weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=weaviate_api_key,
    headers={"X-Anthropic-Api-Key": anthropic_api_key},
) as client:
    # Step 3.2: Use this collection
    movies = client.collections.use("Movie")

    # Step 3.3: Perform RAG on NearText results
    response = movies.generate.near_text(
        query="sci-fi",
        limit=1,
        grouped_task="Write a tweet with emojis about this movie.",
        generative_provider=GenerativeConfig.anthropic(
            model="claude-3-5-haiku-latest"
        ),  # Configure the Anthropic generative integration for RAG
    )

    print(response.generative.text)  # Inspect the results
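If you imported your own vectors, the following example combines a vector similarity search with a prompt, and uses the Anthropic generative model (generative-anthropic) to generate a tweet.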
import os
import weaviate
from weaviate.classes.generate import GenerativeConfig

# Best practice: store your credentials in environment variables
weaviate_url = os.environ["WEAVIATE_URL"]
weaviate_api_key = os.environ["WEAVIATE_API_KEY"]
anthropic_api_key = os.environ["ANTHROPIC_API_KEY"]

# Step 3.1: Connect to your Weaviate Cloud instance
with weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=weaviate_api_key,
    headers={"X-Anthropic-Api-Key": anthropic_api_key},
) as client:
    # Step 3.2: Use this collection
    movies = client.collections.use("Movie")

    # Step 3.3: Perform RAG on NearVector results
    response = movies.generate.near_vector(
        near_vector=[0.11, 0.21, 0.31, 0.41, 0.51, 0.61, 0.71, 0.81],
        limit=1,
        grouped_task="Write a tweet with emojis about this movie.",
        generative_provider=GenerativeConfig.anthropic(
            model="claude-3-5-haiku-latest"
        ),  # Configure the Anthropic generative integration for RAG
    )

    print(response.generative.text)  # Inspect the results
Example response
🕶️ Unplug from the system & join Neo's journey 💊🐰
"The Matrix" will blow your mind 🤯 as reality unravels 🌀
Kung-fu, slow-mo & mind-bending sci-fi 🥋🕴️
Are you ready to see how deep the rabbit hole goes? 🔴🔵 #TheMatrix #WakeUp
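Conceptually, a grouped task sends the LLM a single prompt that combines the task instruction with all of the retrieved objects. The sketch below illustrates that idea with a hypothetical prompt layout and a hypothetical helper, `build_grouped_prompt`. The template Weaviate actually uses internally is not shown in this guide and may differ.

```python
import json


def build_grouped_prompt(task, retrieved_objects):
    """Combine one task instruction with all retrieved objects in a single prompt."""
    # Serialize each retrieved object on its own line so the model sees every property.
    context = "\n".join(json.dumps(obj, sort_keys=True) for obj in retrieved_objects)
    return f"{task}\n\nRetrieved data:\n{context}"


retrieved = [
    {
        "title": "The Matrix",
        "genre": "Science Fiction",
        "description": "A computer hacker learns about the true nature of reality and his role in the war against its controllers.",
    }
]
prompt = build_grouped_prompt("Write a tweet with emojis about this movie.", retrieved)
print(prompt)
```

Because the search in the examples uses `limit=1`, only one object reaches the prompt, which is why the generated tweet is about a single movie.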
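Step 4: Query Agent
The Weaviate Query Agent is a pre-built agentic service designed to answer natural language queries based on the data stored in Weaviate Cloud. You provide a natural language prompt/question, and the Query Agent handles all of the intermediate steps needed to produce an answer.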
import os
import weaviate
from weaviate.agents.query import QueryAgent

# Best practice: store your credentials in environment variables
weaviate_url = os.environ["WEAVIATE_URL"]
weaviate_api_key = os.environ["WEAVIATE_API_KEY"]

# Step 4.1: Connect to your Weaviate Cloud instance
with weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=weaviate_api_key,
) as client:
    # Step 4.2: Instantiate a new agent object
    qa = QueryAgent(client=client, collections=["Movie"])

    # Step 4.3: Perform a query using Search Mode
    response = qa.search("Find a cool sci-fi movie.", limit=1)

    # Print the response
    for obj in response.search_results.objects:
        print(f"Movie: {obj.properties['title']} - {obj.properties['description']}")
This is the printed response:
Movie: The Matrix - A computer hacker learns about the true nature of reality and his role in the war against its controllers.
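Next steps
We recommend the following resources to continue learning about Weaviate.
Continue with the Quick tour tutorial: an end-to-end guide that covers important topics such as configuring collections, search, and more.
Check out Weaviate Academy: a learning platform centered on AI-native development.
Quick examples of configuring, managing, and querying Weaviate with the client libraries.
Guides and tips for new users learning how to use Weaviate.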
Questions and feedback
If you have any questions or feedback, let us know in the user forum.
