
Vivek Skanda. Retrieval Augmented Generation in Production with Haystack: Building Trustworthy, Scalable, Reliable, and Secure AI Systems (Early Release)

  • zip file
  • 2.57 MB
  • contains an epub document
O’Reilly Media, Inc., 2024. — 45 p. — (Early Release) — ISBN 978-1-098-16514-7.
In today's rapidly changing AI technology environment, software engineers often struggle to build real-world applications with large language models (LLMs). The benefits of incorporating open source LLMs into existing workflows are often offset by the need to create custom components. That's where Haystack comes in. This open source framework is a collection of the most useful tools, integrations, and infrastructure building blocks to help you design and build scalable, API-driven LLM backends.
With Haystack, it's easy to build extractive or generative QA, Google-like semantic search to query large-scale textual data, or a reliable and secure ChatGPT-like experience on top of technical documentation. This guide serves as a collection of useful retrieval augmented generation (RAG) mental models and offers ML engineers, AI engineers, and backend engineers a practical blueprint for the LLM software development lifecycle.
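The retrieve-then-generate flow behind such a RAG system can be sketched framework-agnostically. The helpers below are illustrative stand-ins (not Haystack API): lexical overlap stands in for a real ranker such as BM25 or embedding similarity, and the prompt builder shows how retrieved context grounds the model's answer.

```python
import re

# Minimal, framework-agnostic sketch of retrieval augmented generation (RAG).
# All names are illustrative; this is not the Haystack API.

def tokens(text: str) -> set[str]:
    # Lowercased word tokens, punctuation stripped.
    return set(re.findall(r"\w+", text.lower()))

def score(query: str, doc: str) -> int:
    # Naive lexical overlap as a stand-in for BM25 or embedding similarity.
    return len(tokens(query) & tokens(doc))

def retrieve(query: str, corpus: list[str], top_k: int = 2) -> list[str]:
    # Rank documents by relevance and keep the best top_k.
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:top_k]

def build_prompt(query: str, context_docs: list[str]) -> str:
    # Ground the model: instruct it to answer only from the retrieved context.
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Haystack is an open source framework for LLM backends.",
    "BM25 is a lexical ranking function.",
    "Paris is the capital of France.",
]
prompt = build_prompt("What is Haystack?", retrieve("What is Haystack?", corpus))
```

The prompt would then be sent to an LLM; in a production pipeline each stand-in here maps to a dedicated, swappable component (document store, retriever, prompt builder, generator).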
An emerging paradigm is the leveraging of Generative AI to unlock data-centric insights for customers across various industries using large language models (LLMs) such as the OpenAI GPT models, Anthropic’s Claude models, Google Gemini, Meta’s Llama models, Mistral, etc. However, an engine alone cannot propel a vehicle. State-of-the-art LLMs like GPT-4 excel at language-based tasks due to their a priori knowledge, acquired through training on a vast representative corpus of documents (including websites, books, etc.) and tasks involving these documents.
While LLMs demonstrate exceptional out-of-the-box performance, their inherent value is limited. Their enterprise value lies in adapting these LLMs to custom data sources and customer workflows. One approach for this involves feeding the LLM relevant context as part of the input. However, this method presents several challenges, including latency, cost, and model forgetfulness when dealing with large context sizes.
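The naive version of this approach simply concatenates all source text into the prompt. A rough sketch makes the failure mode concrete; the character-based token estimate and the budget value are illustrative only, not a real tokenizer or model limit:

```python
# Naive context stuffing: put every document into one prompt.
# The ~4 chars/token heuristic and the budget below are illustrative only.
MAX_CONTEXT_TOKENS = 8000

def estimate_tokens(text: str) -> int:
    # Crude estimate; real systems use the model's actual tokenizer.
    return len(text) // 4

def stuff_context(question: str, documents: list[str]) -> str:
    prompt = "\n\n".join(documents) + f"\n\nQuestion: {question}"
    if estimate_tokens(prompt) > MAX_CONTEXT_TOKENS:
        # In practice this means higher latency, higher per-token cost,
        # and degraded recall of material buried mid-context.
        raise ValueError("Prompt exceeds the model's context budget")
    return prompt
```

Retrieval augmented generation sidesteps this by selecting only the few most relevant passages before the prompt is assembled.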
Large Language Models like GPT-3.5 have ushered in a new era of artificial intelligence and computing. LLMs are large-scale neural networks, composed of several billion parameters, and trained on natural language processing tasks. Language models aim to model the generative likelihood of word sequences, to predict the probabilities of future (or missing) tokens. The simplest language models are bigram and trigram (n-gram in general) models, where the probability of the following word depends on the previous n-1 words.
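A bigram model, for instance, estimates P(next word | previous word) from corpus counts. A minimal sketch on a toy corpus (the corpus and helper names are illustrative):

```python
from collections import Counter, defaultdict

def train_bigram(corpus: list[str]) -> dict[str, dict[str, float]]:
    # Estimate P(next | prev) = count(prev, next) / count(prev),
    # with <s> and </s> marking sentence boundaries.
    counts: defaultdict[str, Counter] = defaultdict(Counter)
    for sentence in corpus:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return {prev: {w: c / sum(ctr.values()) for w, c in ctr.items()}
            for prev, ctr in counts.items()}

model = train_bigram(["the cat sat", "the cat ran", "the dog sat"])
# "the" is followed by "cat" twice and "dog" once,
# so P(cat | the) = 2/3 and P(dog | the) = 1/3.
```

An LLM plays the same prediction game, but conditions on thousands of preceding tokens with billions of learned parameters instead of raw n-gram counts.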
Brief Contents (Not Yet Final)
Using Generative AI with Haystack (available)
Trustworthy AI (unavailable)
Scalable AI (unavailable)
Observable AI (unavailable)
Governance of AI (unavailable)
Keeping Up with the Pace of AI Development (unavailable)