Building RAG Applications with Embedding Models and Vector Databases in Spring AI

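In this article, we will explore the following:

An introduction to embedding models.
Loading data using DocumentReaders.
Storing embeddings in a VectorStore.
Implementing RAG (Retrieval-Augmented Generation), a.k.a. prompt stuffing.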

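Large Language Models (LLMs) from providers such as OpenAI, Azure OpenAI, and Google Vertex AI are trained on large datasets. These models are not trained on your private data, however, so they may not be able to answer questions specific to your domain. Training a model on your private data can be expensive and time-consuming. So how can we use these LLMs to answer questions specific to our domain?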

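One way to achieve this is RAG (Retrieval-Augmented Generation), also known as prompt stuffing. With RAG, we retrieve the relevant documents from a datastore and pass them to the LLM so that it can generate the answer. As part of this process, we use an embedding model to convert the documents into embeddings and store them in a vector database.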

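Understanding Retrieval-Augmented Generation (RAG)
Your business may store structured data in relational databases and unstructured data in NoSQL databases, or even in files. You can query relational and NoSQL databases efficiently using their own query languages, and you can use full-text search engines such as Elasticsearch or Solr to query unstructured data.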

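However, you may want to retrieve data using natural language, based on semantic meaning rather than exact wording.

For example, "I like the Java programming language" and "Java is always my go-to language" have the same semantic meaning but use different words. Trying to retrieve data by matching the exact words may not be effective.

This is where embeddings come into play. An embedding is a vector representation of a word, sentence, or document. You can use these embeddings to retrieve data using natural language.

You can convert your structured and unstructured data into embeddings and store them in a vector database. You can then query the vector database using natural language to retrieve the relevant data, and pass that data to the AI model along with your question to get a response.

Retrieval-Augmented Generation (RAG) is the process of optimizing the output of an LLM by having it draw on an additional knowledge base, beyond its training data, before generating a response.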

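Embedding API
The Embedding API converts words, sentences, documents, or images into embeddings, that is, into vectors of numbers.

For example, the word "Apple" might be represented as the vector [0.1, 0.2, 0.3, 0.4, 0.5], and the sentence "I love Apple" as the vector [0.1, 10.3, -10.2, 90.3, 2.4]. These values are made up for illustration; a real embedding model returns vectors of one fixed length, typically hundreds or thousands of numbers.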
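To make "semantically similar" concrete, here is a minimal sketch of cosine similarity, a measure commonly used to compare embeddings. The class name and the three-dimensional vectors are made up for illustration; they are not output from a real embedding model.

```java
import java.util.List;

class CosineSimilarityDemo {

    // Cosine similarity: dot(a, b) / (|a| * |b|). A value of 1.0 means the vectors point the same way.
    static double cosine(List<Double> a, List<Double> b) {
        if (a.size() != b.size()) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.size(); i++) {
            dot += a.get(i) * b.get(i);
            normA += a.get(i) * a.get(i);
            normB += b.get(i) * b.get(i);
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical vectors standing in for real embeddings.
        List<Double> likeJava = List.of(0.9, 0.1, 0.3);
        List<Double> preferJava = List.of(0.8, 0.2, 0.4);
        List<Double> weather = List.of(0.1, 0.9, 0.0);

        // The two Java-related vectors score much higher than the unrelated pair.
        System.out.println("likeJava vs preferJava: " + cosine(likeJava, preferJava));
        System.out.println("likeJava vs weather:    " + cosine(likeJava, weather));
    }
}
```

A vector store ranks stored embeddings against the query embedding with a measure of this kind, which is why two sentences with different words can still match.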

Spring AI provides the EmbeddingClient interface to convert text or documents into embeddings. You can use any of the supported EmbeddingClient implementations, such as OpenAiEmbeddingClient, OllamaEmbeddingClient, AzureOpenAiEmbeddingClient, VertexAiEmbeddingClient, etc.

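Depending on which implementation you want to use, add the corresponding dependency and configure its properties in the application.properties file.

For example, to use OpenAI's EmbeddingClient, add the following dependency to your pom.xml file.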

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    <version>0.8.0</version>
</dependency>

Configure the properties in the application.properties file.

spring.ai.openai.api-key=${OPENAI_API_KEY}

# You can override the above common api-key for embedding using the following property
spring.ai.openai.embedding.api-key=${OPENAI_API_KEY}

With the above configuration in place, you can inject the EmbeddingClient and convert text or documents into embeddings as follows:

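import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.embedding.EmbeddingClient;
import org.springframework.ai.embedding.EmbeddingRequest;
import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.ai.openai.OpenAiEmbeddingOptions;
import org.springframework.stereotype.Component;

@Component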
class MyComponent {
    private final EmbeddingClient embeddingClient;
    
    public MyComponent(EmbeddingClient embeddingClient) {
        this.embeddingClient = embeddingClient;
    }
    
    public void convertTextToEmbedding() {
        //Example 1: Convert text to embeddings
        List<Double> embeddings1 = embeddingClient.embed("I like Spring Boot");
        
        //Example 2: Convert document to embeddings
        List<Double> embeddings2 = embeddingClient.embed(new Document("I like Spring Boot"));
        
        //Example 3: Convert text to embeddings using options
        EmbeddingRequest embeddingRequest =
                new EmbeddingRequest(List.of("I like Spring Boot"),
                        OpenAiEmbeddingOptions.builder()
                                .withModel("text-embedding-ada-002")
                                .build());
        EmbeddingResponse embeddingResponse = embeddingClient.call(embeddingRequest);
        List<Double> embeddings3 = embeddingResponse.getResult().getOutput();
    }
}

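Vector Database
A vector database is a database that stores embeddings. You can store the embeddings of words, sentences, or documents in it, and then query it using natural language to retrieve the data whose embeddings are most similar to the query.

Spring AI provides the VectorStore interface to store and retrieve embeddings. Currently, Spring AI provides VectorStore implementations such as SimpleVectorStore, ChromaVectorStore, Neo4jVectorStore, PgVectorStore, RedisVectorStore, etc.

Let's see how to use SimpleVectorStore to store and retrieve embeddings.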

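import java.util.List;
import java.util.stream.Collectors;

import org.springframework.ai.document.Document;
import org.springframework.ai.embedding.EmbeddingClient;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.SimpleVectorStore;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.stereotype.Component;

@Configuration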
class AppConfig {
    @Bean
    VectorStore vectorStore(EmbeddingClient embeddingClient) {
        return new SimpleVectorStore(embeddingClient);
    }
}

@Component
class MyComponent {
    private final VectorStore vectorStore;
    
    public MyComponent(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }
    
    public void storeAndRetrieveEmbeddings() {
        // Store embeddings
        List<Document> documents = 
                List.of(new Document("I like Spring Boot"),
                        new Document("I love Java programming language"));
        vectorStore.add(documents);
        
        // Retrieve the documents most similar to the query
        SearchRequest query = SearchRequest.query("Spring Boot").withTopK(2);
        List<Document> similarDocuments = vectorStore.similaritySearch(query);
        String relevantData = similarDocuments.stream()
                            .map(Document::getContent)
                            .collect(Collectors.joining(System.lineSeparator()));
    }
}

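In the above code, we add documents to the VectorStore, which internally uses the EmbeddingClient to convert them into embeddings before storing them.

Then we query the VectorStore using natural language and retrieve the relevant data. The withTopK() method specifies the maximum number of similar documents to return.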
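The store-then-search flow can be sketched without Spring AI at all. The class below is a hypothetical, simplified stand-in rather than how any real VectorStore is implemented: its embed() method is a toy word count over a tiny fixed vocabulary, where a real store would call an embedding model.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

class TinyVectorStore {

    // Hypothetical fixed vocabulary standing in for a real embedding model.
    private static final List<String> VOCAB = List.of("spring", "boot", "java", "cooking", "pasta");

    private record Entry(String text, double[] vector) {}

    private final List<Entry> entries = new ArrayList<>();

    // Toy "embedding": counts how often each vocabulary word occurs in the text.
    static double[] embed(String text) {
        double[] vector = new double[VOCAB.size()];
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            int index = VOCAB.indexOf(token);
            if (index >= 0) {
                vector[index]++;
            }
        }
        return vector;
    }

    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        // Treat an all-zero vector as "not similar to anything".
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    // Embeds the text and keeps both the text and its vector.
    void add(String text) {
        entries.add(new Entry(text, embed(text)));
    }

    // Returns at most topK stored texts, most similar to the query first.
    List<String> similaritySearch(String query, int topK) {
        double[] queryVector = embed(query);
        return entries.stream()
                .sorted(Comparator.comparingDouble((Entry e) -> cosine(queryVector, e.vector())).reversed())
                .limit(topK)
                .map(Entry::text)
                .toList();
    }

    public static void main(String[] args) {
        TinyVectorStore store = new TinyVectorStore();
        store.add("I like Spring Boot");
        store.add("I love Java programming language");
        store.add("Cooking pasta is fun");
        System.out.println(store.similaritySearch("Spring Boot", 2));
    }
}
```

Note that topK is only an upper bound: a search returns fewer results when the store holds fewer documents.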

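Document Readers and Writers
In the above example, we constructed Document instances directly from Strings. In a real-world application, however, you will probably want to read documents from files, databases, or other sources.

Spring AI provides the DocumentReader and DocumentWriter interfaces to read documents from, and write documents to, different sources.

As of now, Spring AI provides DocumentReader implementations such as JsonReader, TextReader, PagePdfDocumentReader, etc.

The VectorStore interface extends the DocumentWriter interface, so you can use any VectorStore implementation as a DocumentWriter.

Let's see how to use TextReader to read a text document and store it in the VectorStore.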

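import java.nio.charset.Charset;
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.reader.TextReader;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Component;

@Component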
class MyComponent {
    private final VectorStore vectorStore;
    
    @Value("classpath:myfile.txt")
    private Resource resource;
    
    public MyComponent(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }
    
    public void storeEmbeddingsFromTextFile() {
        var textReader = new TextReader(resource);
        textReader.setCharset(Charset.defaultCharset());
        List<Document> documents = textReader.get();

        vectorStore.add(documents);
    }
}

In the above example, we read the text from a classpath file and store it in the VectorStore.
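The reader/writer pairing boils down to "something that supplies a list of documents" feeding "something that consumes a list of documents", which is why textReader.get() can be handed straight to the store. A rough plain-Java sketch of that shape follows; the Doc record, textFileReader helper, and list-backed writer are hypothetical stand-ins, not Spring AI types.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Supplier;

class ReaderWriterSketch {

    // Hypothetical stand-in for a document: text plus metadata.
    record Doc(String content, Map<String, Object> metadata) {}

    // Plays the role of a document reader: supplies documents from a text file.
    static Supplier<List<Doc>> textFileReader(Path file) {
        return () -> {
            try {
                String text = Files.readString(file, StandardCharsets.UTF_8);
                return List.of(new Doc(text, Map.of("source", file.getFileName().toString())));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("myfile", ".txt");
        Files.writeString(file, "I like Spring Boot", StandardCharsets.UTF_8);

        // Plays the role of a document writer; a vector store would embed and store here.
        List<Doc> stored = new ArrayList<>();
        Consumer<List<Doc>> writer = stored::addAll;

        // Reader output goes straight into the writer.
        writer.accept(textFileReader(file).get());
        System.out.println(stored);

        Files.delete(file);
    }
}
```

Keeping the two sides this loosely coupled means a new source (JSON, PDF, a database) only needs a new reader, while the storing side stays unchanged.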

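Implementing RAG (Retrieval-Augmented Generation)
Now that we have seen how to convert documents into embeddings, store them in a vector database, and retrieve relevant documents using natural language, let's see how to implement RAG.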

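import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.ai.chat.prompt.SystemPromptTemplate;
import org.springframework.ai.document.Document;
import org.springframework.ai.parser.BeanOutputParser;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController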
class RAGController {
    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    RAGController(ChatClient chatClient, VectorStore vectorStore) {
        this.chatClient = chatClient;
        this.vectorStore = vectorStore;
    }
    
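    // Assume we have already read the documents from a file containing information about people
    // and stored them in the VectorStore, as described in the previous section.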
    
    @GetMapping("/ai/rag/people")
    Person chatWithRag(@RequestParam String name) {
        // Query the VectorStore using natural language to find information about the person.
        List<Document> similarDocuments = 
                vectorStore.similaritySearch(SearchRequest.query(name).withTopK(2));
        String information = similarDocuments.stream()
                .map(Document::getContent)
                .collect(Collectors.joining(System.lineSeparator()));
        
        // Build the systemMessage instructing the AI model to use the passed information
        // to answer the question.
        var systemPromptTemplate = new SystemPromptTemplate("""
              You are a helpful assistant.
              
              Use the following information to answer the question:
              {information}
              """);
        var systemMessage = systemPromptTemplate.createMessage(
                Map.of("information", information));

        // Use BeanOutputParser to parse the response into an instance of Person.
        var outputParser = new BeanOutputParser<>(Person.class);
        
        // Build the userMessage asking the AI model to tell us about the person.
        PromptTemplate userMessagePromptTemplate = new PromptTemplate("""
        Tell me about {name} as if current date is {current_date}.

        {format}
        """);
        Map<String,Object> model = Map.of("name", name,
                "current_date", LocalDate.now(),
                "format", outputParser.getFormat());
        var userMessage = new UserMessage(userMessagePromptTemplate.create(model).getContents());

        var prompt = new Prompt(List.of(systemMessage, userMessage));

        var response = chatClient.call(prompt).getResult().getOutput().getContent();

        return outputParser.parse(response);
    }
}

record Person(String name,
              String dateOfBirth,
              int experienceInYears,
              List<String> books) {
}

The comments in the code above explain each step.

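In summary, the RAG process involves the following steps:

Load documents from different sources using DocumentReaders.
Convert the documents into embeddings using the EmbeddingClient and store them in the VectorStore.
Query the VectorStore using natural language and retrieve the relevant documents.
Construct the SystemMessage instructing the AI model to use the passed information to answer the question.
Construct the UserMessage asking the AI model for the information.
Construct the Prompt and call the AI model to get the response.
Parse the response into the desired format using OutputParsers.
Return the response.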
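Stripped of the framework, the retrieve-then-prompt core of these steps can be sketched in a few lines of plain Java. Everything here is hypothetical scaffolding: the retriever function stands in for a vector store search, the model function stands in for a chat model call, and the facts about "Alice" are invented sample data.

```java
import java.util.List;
import java.util.function.Function;

class PromptStuffingSketch {

    static final String SYSTEM_TEMPLATE = """
            You are a helpful assistant.

            Use the following information to answer the question:
            {information}
            """;

    // Joins the retrieved documents and substitutes them for the {information} placeholder.
    static String buildSystemMessage(List<String> retrievedDocuments) {
        String information = String.join(System.lineSeparator(), retrievedDocuments);
        return SYSTEM_TEMPLATE.replace("{information}", information);
    }

    // retriever and model are hypothetical stand-ins for a vector store search and a chat model call.
    static String answer(String question,
                         Function<String, List<String>> retriever,
                         Function<String, String> model) {
        String systemMessage = buildSystemMessage(retriever.apply(question));
        return model.apply(systemMessage + System.lineSeparator() + "Question: " + question);
    }

    public static void main(String[] args) {
        // Invented sample data; a real retriever would run a similarity search.
        Function<String, List<String>> retriever =
                question -> List.of("Alice was born in 1990.", "Alice wrote the book 'Example Book'.");
        // Stub model: echoes the prompt so the stuffed context is visible.
        Function<String, String> echoModel = prompt -> prompt;

        System.out.println(answer("Tell me about Alice", retriever, echoModel));
    }
}
```

Running it prints the final prompt, which makes it easy to see exactly what context the model would receive before any real model is wired in.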

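Conclusion
In this article, we saw how to use the Embedding API to convert text or documents into embeddings and how to use a vector database to store and retrieve them. We then implemented RAG (Retrieval-Augmented Generation), which passes the retrieved information to the AI model so that it can answer the question.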