Data-Driven AI Applications with LlamaIndex
Description
Book Introduction
Let's connect data and LLM to create truly useful AI applications.

Generative AI and large language models (LLMs) are powerful tools with enormous potential, but they also have clear weaknesses, such as generating misleading information (hallucinations), limited context windows, and difficulty reflecting up-to-date data.
This book provides specific guidance on how to use Retrieval-Augmented Generation (RAG) and LlamaIndex to overcome these limitations, and helps you practice the entire process of AI application development, from data collection and indexing through retrieval, querying, and prompt engineering to deployment, by building your own project with Python and Streamlit.
This book covers everything from basic concepts to building chatbots and agents, customization, and actual deployment strategies.
Through this book, you'll go beyond simple hands-on training to experience the entire process of handling and optimizing data, and develop the skills to build accurate and intelligent AI applications that transcend the limitations of an LLM.


Table of Contents
Translator's Preface xii
Beta Reader Review xiv
Preface xvi
About this book xvii

PART I: Introducing Generative AI and LlamaIndex

CHAPTER 01 Understanding Large Language Models 3
1.1 Introducing Generative AI and LLM 4
__1.1.1 What is Generative AI? 4 / 1.1.2 What is LLM? 4
1.2 Understanding the Role of LLM in Modern Technology 6
1.3 Exploring the Challenges Facing LLMs 8
1.4 Augmenting LLM with RAG 12
1.5 Summary 14

CHAPTER 02 LlamaIndex: A Hidden Gem - Introducing the LlamaIndex Ecosystem 15
2.1 Technical Requirements 15
2.2 Language Model Optimization: The Interaction of Fine Tuning, RAG, and LlamaIndex 16
__2.2.1 Is RAG the Only Solution? 16 / 2.2.2 Features of LlamaIndex 18
2.3 Discovering the Advantages of Progressively Disclosing Complexity 20
__2.3.1 Important Aspects to Consider 21
2.4 LlamaIndex Hands-on Project - Introduction to PITS 21
__2.4.1 How it works 21
2.5 Preparing the Coding Environment 23
__2.5.1 Installing Python 24 / 2.5.2 Installing Git 24 / 2.5.3 Installing LlamaIndex 25 / 2.5.4 Registering an OpenAI API Key 25 / 2.5.5 Exploring Streamlit – The Perfect Tool for Quick Build and Deployment 28 / 2.5.6 Installing Streamlit 29 / 2.5.7 Wrapping Up 29 / 2.5.8 Final Checks 30
2.6 Understanding the LlamaIndex Code Repository Structure 31
2.7 Summary 32

PART II: Starting Your First LlamaIndex Project

CHAPTER 03 Starting Your Journey with LlamaIndex 37
3.1 Technical Requirements 37
3.2 Understanding the Essential Components of LlamaIndex: Documents, Nodes, and Indexes 38
__3.2.1 Documents 38 / 3.2.2 Nodes 41 / 3.2.3 Creating Node Objects Manually 43 / 3.2.4 Automatically Extracting Nodes from Documents Using Splitters 43 / 3.2.5 Nodes Don't Like to Be Alone - They Crave Relationships 45 / 3.2.6 Why Are Relationships Important? 46 / 3.2.7 Indexes 47 / 3.2.8 Almost There? 49 / 3.2.9 How Does This Work Under the Hood? 50 / 3.2.10 Quick Review of Core Concepts 52
3.3 Building Your First Interactive, Augmented LLM Application 52
__3.3.1 Understanding logic and debugging applications using LlamaIndex's logging capabilities 54 / 3.3.2 Customizing LLM used in LlamaIndex 55 / 3.3.3 It's as easy as 1-2-3 55 / 3.3.4 Temperature parameters 56 / 3.3.5 Understanding how to use Settings for customization 58
3.4 Practice - Starting the PITS Project 59
__3.4.1 Examining the Source Code 61
3.5 Summary 64

CHAPTER 04 Importing Data into the RAG Workflow 65
4.1 Technical Requirements 65
4.2 Data Collection via LlamaHub 66
4.3 LlamaHub Overview 67
4.4 Collecting Content Using the LlamaHub Data Loader 68
__4.4.1 Collecting data from web pages 68 / 4.4.2 Collecting data from databases 70 / 4.4.3 Collecting bulk data from sources in various file formats 71
4.5 Parsing a Document into Nodes 76
__4.5.1 Understanding a Simple Text Splitter 76 / 4.5.2 Using a More Advanced Node Parser 78 / 4.5.3 Using a Relational Parser 82 / 4.5.4 Confused Between Node Parsers and Text Splitters? 83 / 4.5.5 Understanding Chunk Size and Chunk Overlap 83 / 4.5.6 Including Relationships with include_prev_next_rel 85 / 4.5.7 Practical Ways to Use This Node Creation Model 86
4.6 Using Metadata to Improve Context 88
__4.6.1 SummaryExtractor 90 / 4.6.2 QuestionsAnsweredExtractor 91 / 4.6.3 TitleExtractor 91 / 4.6.4 EntityExtractor 92 / 4.6.5 KeywordExtractor 93 / 4.6.6 PydanticProgramExtractor 94 / 4.6.7 MarvinMetadataExtractor 94 / 4.6.8 Defining a Custom Extractor 95 / 4.6.9 Is More Metadata Always Better? 95
4.7 Estimating the costs that may arise when using a metadata extractor 96
__4.7.1 Simple Best Practices for Minimizing Costs 97 / 4.7.2 Estimating Maximum Costs Before Running the Actual Extractor 97
4.8 Privacy Protection with Metadata Extractors, and Beyond 99
__4.8.1 Deleting Personal Data and Other Sensitive Information 101
4.9 Increase Efficiency Using Data Collection Pipelines 102
4.10 Handling Documents with Mixed Text and Tabular Data 106
4.11 Practice - Uploading Study Materials to PITS 107
4.12 Summary 109

CHAPTER 05 Indexing with LlamaIndex 111
5.1 Technical Requirements 111
5.2 Data Indexing - A Holistic Perspective 112
__5.2.1 Common Features of All Index Types 113
5.3 Understanding VectorStoreIndex 114
__5.3.1 A Simple Example of Using VectorStoreIndex 114 / 5.3.2 Understanding Embeddings 116 / 5.3.3 Understanding Similarity Search 118 / 5.3.4 How Does LlamaIndex Generate These Embeddings? 122 / 5.3.5 Which Embedding Model Should I Use? 124
5.4 Index Persistence and Reuse 125
__5.4.1 Understanding StorageContext 127 / 5.4.2 Differences between Vector Storage and Vector Database 129
5.5 Other index types in LlamaIndex 131
__5.5.1 SummaryIndex 131 / 5.5.2 DocumentSummaryIndex 133 / 5.5.3 KeywordTableIndex 135 / 5.5.4 TreeIndex 137 / 5.5.5 KnowledgeGraphIndex 142
5.6 Building Indexes on Top of Indexes Using ComposableGraph 145
__5.6.1 How to use ComposableGraph 146 / 5.6.2 More detailed explanation of this concept 147
5.7 Estimating the Potential Cost of Index Building and Queries 148
5.8 Practice - Indexing PITS Learning Materials 152
5.9 Summary 153

PART III: Searching and Exploiting Indexed Data 155

CHAPTER 06 Querying Data, Step 1 - Contextual Search 157
6.1 Technical Requirements 157
6.2 Query Mechanism Overview 158
6.3 Understanding the Basic Retrievers 158
__6.3.1 VectorStoreIndex Retriever 160 / 6.3.2 SummaryIndex Retriever 162 / 6.3.3 DocumentSummaryIndex Retriever 164 / 6.3.4 TreeIndex Retriever 167 / 6.3.5 KeywordTableIndex Retriever 170 / 6.3.6 KnowledgeGraphIndex Retriever 172 / 6.3.7 Common Characteristics Shared by All Retrievers 176 / 6.3.8 Efficient Use of Retrieval Mechanisms - Asynchronous Operations 177
6.4 Building an Advanced Search Mechanism 178
__6.4.1 Simple Search Methods 178 / 6.4.2 Implementing Metadata Filters 179 / 6.4.3 Using Selectors for More Advanced Decision Logic 182 / 6.4.4 Understanding Tools 184 / 6.4.5 Query Transformation and Rewriting 186 / 6.4.6 Creating More Specific Subqueries 188
6.5 Understanding the Concepts of Dense and Sparse Search 191
__6.5.1 Dense Search 191 / 6.5.2 Sparse Search 192 / 6.5.3 Implementing Sparse Search in LlamaIndex 195 / 6.5.4 Exploring Other Advanced Search Methods 198
6.6 Summary 199

CHAPTER 07 Querying Data, Step 2 - Postprocessing and Response Synthesis 200
7.1 Technical Requirements 200
7.2 Reordering, Transforming, and Filtering Nodes Using Postprocessors 201
__7.2.1 Exploring how postprocessors filter, transform, and reorder nodes 202 / 7.2.2 SimilarityPostprocessor 204 / 7.2.3 KeywordNodePostprocessor 205 / 7.2.4 PrevNextNodePostprocessor 208 / 7.2.5 LongContextReorder 209 / 7.2.6 PIINodePostprocessor and NERPIINodePostprocessor 209 / 7.2.7 MetadataReplacementPostprocessor 210 / 7.2.8 SentenceEmbeddingOptimizer 212 / 7.2.9 Time-based postprocessors 213 / 7.2.10 Reordering postprocessors 215 / 7.2.11 Final thoughts on node postprocessors 220
7.3 Understanding the Response Synthesizer 220
7.4 Implementing Output Parsing Techniques 224
__7.4.1 Extracting Structured Output Using an Output Parser 225 / 7.4.2 Extracting Structured Output Using a Pydantic Program 229
7.5 Building and Using a Query Engine 230
__7.5.1 Exploring Various Methods of Building a Query Engine 230 / 7.5.2 Advanced Utilization of the QueryEngine Interface 231
7.6 Practice - Creating a Quiz in PITS 239
7.7 Summary 242

CHAPTER 08 Building Chatbots and Agents with LlamaIndex 243
8.1 Technical Requirements 243
8.2 Understanding Chatbots and Agents 244
__8.2.1 Exploring ChatEngine 246 / 8.2.2 Understanding the Different Chat Modes 248
8.3 Implementing Agent Strategies in Your App 258
__8.3.1 Building tools and ToolSpec classes for agents 259 / 8.3.2 Understanding the inference loop 262 / 8.3.3 OpenAIAgent 264 / 8.3.4 ReActAgent 269 / 8.3.5 How do I interact with the agent? 271 / 8.3.6 Enhancing the agent using utility tools 271 / 8.3.7 Using the LLMCompiler agent for more advanced scenarios 276 / 8.3.8 Using the low-level agent protocol API 279
8.4 Hands-on: Implementing Conversation Tracking for PITS 282
8.5 Summary 288

PART IV: Customization, Prompt Engineering, and Conclusion

CHAPTER 09 Customizing and Deploying the LlamaIndex Project 291
9.1 Technical Requirements 291
9.2 Customizing RAG Components 292
__9.2.1 The impact of LLaMA and LLaMA 2 on the open-source environment 292 / 9.2.2 Running local LLMs using LM Studio 293 / 9.2.3 Routing between LLMs using services like Neutrino or OpenRouter 300 / 9.2.4 How about customizing the embedding model? 303 / 9.2.5 Leveraging the plug-and-play convenience of Llama Packs 303 / 9.2.6 Using the Llama CLI 306
9.3 Using Advanced Tracking and Evaluation Techniques 308
__9.3.1 Tracking RAG Workflows with Phoenix 309 / 9.3.2 Evaluating Our RAG System 312
9.4 Introduction to Deployment Using Streamlit 319
9.5 Hands-on - Step-by-Step Deployment Guide 321
__9.5.1 Deploying a PITS Project to the Streamlit Community Cloud 323
9.6 Summary 327

CHAPTER 10 Guidelines and Best Practices for Prompt Engineering 328
10.1 Technical Requirements 328
10.2 Why Prompts Are Your Secret Weapon 329
10.3 Understanding How LlamaIndex Uses Prompts 332
10.4 Customizing the Default Prompt 335
__10.4.1 Using Advanced Prompt Techniques in LlamaIndex 339
10.5 The Golden Rule of Prompt Engineering 340
__10.5.1 Accuracy and clarity of expression 340 / 10.5.2 Directiveness 340 / 10.5.3 Context quality 340 / 10.5.4 Context amount 341 / 10.5.5 Required output format 342 / 10.5.6 Inference cost 342 / 10.5.7 Overall system latency 343 / 10.5.8 Choosing the right LLM for the task 343 / 10.5.9 Common methods for creating effective prompts 346
10.6 Summary 349

CHAPTER 11 Conclusion and Additional Resources 351
11.1 Other Projects and Further Learning 351
__11.1.1 LlamaIndex Example Collection 352 / 11.1.2 Moving Forward - Replit Bounty 355 / 11.1.3 Strength in Numbers - The LlamaIndex Community 356
11.2 Key Takeaways and Final Words of Encouragement 357
__11.2.1 The Future of RAG in the Larger Context of Generative AI 359 / 11.2.2 Philosophical Considerations 362
11.3 Summary 363

Index 365


Into the book
When I first encountered the LlamaIndex framework, I was impressed by its comprehensive official documentation.
But I soon realized that the vast number of options could be overwhelming for beginners.
So my goal was to provide a beginner-friendly guide that would help you explore the framework's features and leverage them in your projects.
The deeper you delve into the internal mechanisms of LlamaIndex, the better you will understand its effectiveness.
This book aims to bridge the gap between official documentation and your understanding by deconstructing complex concepts and providing practical examples, enabling you to confidently build RAG applications while avoiding common pitfalls.

--- p.xvi

LlamaIndex lets you quickly build smart LLM applications that adapt to specific use cases, injecting targeted information to obtain accurate, relevant answers without relying solely on the model's general pre-trained knowledge.
It also provides an easy way to connect external datasets to LLMs such as GPT-4, Claude, and Llama.
In other words, LlamaIndex connects your personalized knowledge with the extensive capabilities of LLM.

--- p.18

LlamaHub lets you access a variety of data sources with just a few lines of code.
The resulting Document objects can be parsed into nodes and indexed according to the application's needs.
The unified output as a LlamaIndex Document object means that you don't have to deal with complex handling of different data types in your core business logic.
This complexity is abstracted by the framework.

--- p.67
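The unified-Document pattern the excerpt describes can be sketched in a few lines of plain Python. This is a conceptual illustration only, not the actual LlamaIndex API: the `Document` dataclass and the two loader functions are simplified, hypothetical stand-ins for LlamaHub readers.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    # Minimal stand-in for a unified document object:
    # every loader returns this shape, whatever the source.
    text: str
    metadata: dict = field(default_factory=dict)

def load_from_text_file(path_stub: str, raw: str) -> Document:
    # A real file loader would read from disk; here the raw text is passed in.
    return Document(text=raw, metadata={"source": path_stub, "kind": "file"})

def load_from_db_row(row: dict) -> Document:
    # A database loader flattens a row into text plus provenance metadata.
    text = " | ".join(f"{k}: {v}" for k, v in row.items())
    return Document(text=text, metadata={"source": "db", "kind": "row"})

# Downstream logic handles one type only, regardless of origin.
docs = [
    load_from_text_file("notes.txt", "LlamaIndex unifies data loading."),
    load_from_db_row({"id": 1, "title": "RAG basics"}),
]
kinds = [d.metadata["kind"] for d in docs]
```

Because both loaders converge on the same shape, the parsing and indexing steps that follow never need to branch on the original data source.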

Over time, software projects accumulate extensive documentation, including technical specifications, API documentation, user guides, and developer notes.
Tracking this information can be difficult, especially when your team needs to quickly reference specific details.
Implementing a SummaryIndex in your project's document repository allows developers to perform fast searches across all documents.
For example, a developer might ask, "What is the error handling procedure for the payment gateway API?"
SummaryIndex scans indexed documents to retrieve relevant sections discussing error handling, without the need for complex embedding models or intensive computational resources.
This index is particularly useful in environments where resource constraints make it impractical to maintain extensive vector storage, or where simplicity and speed are priorities.

--- p.131
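The embedding-free scan described above can be approximated with a simple term-overlap score. This is a conceptual sketch only, not how LlamaIndex implements SummaryIndex internally (there, relevance is typically judged over stored nodes, often by the LLM itself); the scoring function and sample documents are invented for illustration.

```python
import re

def score(query: str, doc: str) -> int:
    # Count distinct query terms that appear in the document
    # (case-insensitive). No embeddings, no vector store.
    terms = set(re.findall(r"\w+", query.lower()))
    words = set(re.findall(r"\w+", doc.lower()))
    return len(terms & words)

docs = [
    "Payment gateway API error handling: retry twice, then raise PaymentError.",
    "User guide: changing your password and profile settings.",
    "Developer notes on the logging subsystem.",
]

query = "error handling for the payment gateway API"
# Scan every document and keep the best match, the way a SummaryIndex
# iterates over all stored content rather than consulting a vector store.
best = max(docs, key=lambda d: score(query, d))
```

The trade-off is visible in the loop itself: every document is touched on every query, which is cheap to set up and maintain but scales linearly with corpus size.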

The RAG applications we build need to be as autonomous as possible in deciding which tools to use based on specific user queries and the datasets they're working with.
Hard-coded solutions produce good results only in limited scenarios.
This is where an inference loop comes in.
The inference loop is a fundamental aspect of an agent, allowing it to intelligently decide which tools to use in different scenarios.
This is important because requirements in complex real-world applications can vary greatly, and static approaches limit the effectiveness of agents.

--- p.262
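A minimal inference loop can be sketched as observe, decide, act over a set of tools. Everything here is a hypothetical stand-in: the `decide` function replaces the LLM's reasoning step with a crude keyword heuristic, and the two tools are toy functions rather than real agent tools.

```python
def calculator(expr: str) -> str:
    # Toy arithmetic tool; builtins disabled to keep eval harmless.
    return str(eval(expr, {"__builtins__": {}}))

def lookup(term: str) -> str:
    # Toy knowledge-lookup tool backed by a tiny dictionary.
    notes = {"RAG": "Retrieval-Augmented Generation"}
    return notes.get(term, "unknown")

TOOLS = {"calculator": calculator, "lookup": lookup}

def decide(query: str) -> tuple[str, str]:
    # Stand-in for the LLM's reasoning step: a real inference loop
    # lets the model choose the tool and its input from context.
    if any(ch.isdigit() for ch in query):
        return "calculator", query
    return "lookup", query

def inference_loop(steps: list[str]) -> list[str]:
    # Observe each query, decide on a tool, act, collect the observation.
    observations = []
    for query in steps:
        tool_name, tool_input = decide(query)
        observations.append(TOOLS[tool_name](tool_input))
    return observations

results = inference_loop(["2+3", "RAG"])
```

The point of the loop structure is that tool choice happens per step at run time, which is exactly what a static, hard-coded pipeline cannot do.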

Sometimes a single LLM may not be ideal for all interactions.
Finding the optimal balance between cost, latency, and accuracy in complex RAG scenarios can be a challenging task when choosing a single LLM.
But what if you could mix multiple LLMs in the same application and dynamically choose which one to use for each interaction? This is precisely the purpose of third-party services like Neutrino and OpenRouter.
These services can significantly improve RAG workflows by providing intelligent routing capabilities for queries between different LLMs.
--- p.300
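The routing idea can be sketched as a simple policy function choosing between models per query. The model names, per-token prices, and heuristic below are invented for illustration; services like Neutrino and OpenRouter make this decision with far more sophisticated, learned policies.

```python
# Hypothetical model catalog with made-up per-1k-token prices.
MODELS = {
    "cheap-small": {"cost_per_1k": 0.1},
    "strong-large": {"cost_per_1k": 2.0},
}

def route(query: str) -> str:
    # Naive policy: short factual queries go to the small model,
    # long or reasoning-heavy queries go to the large one.
    reasoning_markers = ("why", "compare", "explain", "design")
    long_query = len(query.split()) > 20
    if long_query or any(m in query.lower() for m in reasoning_markers):
        return "strong-large"
    return "cheap-small"

def estimated_cost(model: str, tokens: int) -> float:
    # Rough cost estimate for a given token budget.
    return MODELS[model]["cost_per_1k"] * tokens / 1000

choices = [
    route("What is RAG?"),
    route("Explain why dense retrieval can miss exact keyword matches."),
]
```

Even this crude policy captures the balance the excerpt describes: easy queries run at a fraction of the cost, while hard queries still get the stronger model.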

Publisher's Review
Generative AI: It's Time to "Create" It, Not "Watch" It

In the era of generative AI, represented by ChatGPT, this book was created for developers who want to turn their ideas into real-world services but are unsure where to start.
Simply calling an API and prompting a model is not enough. You should be able to ground the model in your own data, interact with it in real time, and deploy the service immediately.
This book guides you through that process step by step.
This practical guide avoids dwelling on abstract theory, teaches through hands-on practice, and gives you the confidence that you can do it too.

This book covers the principles of generative AI and Retrieval-Augmented Generation (RAG) in a hands-on manner, using Python and Streamlit.
In particular, it implements the entire RAG pipeline, from data collection and parsing through indexing, retrieval, and post-processing, with a focus on LlamaIndex.
Rather than simply explaining the functionality, it shows, through specific code, why each step is necessary and how it can be optimized in actual projects.
It also covers a wide range of practical, real-world applications, including chatbot and agent implementation, custom prompt engineering, cost management, and deployment strategies.
Once you understand core components like VectorStore, KnowledgeGraph, and QueryEngine, you'll be able to quickly transform your ideas into data-driven AI applications.

By the time you finish this book, you'll have evolved from a simple user to a developer capable of implementing and improving AI.
For those who want to learn through code rather than complex theories, and who want to complete a practical LLM project using RAG and LlamaIndex, this book will be the most realistic starting point.

Key Contents

● Understanding the LlamaIndex ecosystem and its core components
● Collecting, parsing, and indexing data from various sources (web, databases, files, etc.)
● Designing and using customized indexes such as VectorStore and KnowledgeGraph
● Learning efficient querying, retrieval, post-processing, and response synthesis techniques
● Developing interactive applications by implementing chatbots and agents
● Best practices and strategies for prompt engineering
● Addressing cost estimation, privacy, and ethical considerations
● Deploying and scaling projects using Streamlit
Product Details
- Date of issue: October 30, 2025
- Page count, weight, size: 392 pages | 188 × 245 × 19 mm
- ISBN13: 9791194587842
