📚 This video introduces embeddings and ChromaDB for building a multi-document retriever.
⚙️ A GPU is recommended for faster processing, but everything can also run on a CPU.
📄 The video demonstrates how to work with multiple PDF files instead of text files.
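Loading a folder of PDFs in LangChain typically combines a directory loader with a text splitter. A minimal sketch (the folder name and chunk sizes are assumptions, not taken from the video):

```python
def load_and_split_pdfs(path: str = "new_papers/"):
    """Load every PDF in a folder and split it into overlapping chunks."""
    # Imports are kept inside the function so the sketch can be defined
    # without the optional PDF dependencies (pypdf) installed.
    from langchain.document_loaders import DirectoryLoader, PyPDFLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter

    loader = DirectoryLoader(path, glob="**/*.pdf", loader_cls=PyPDFLoader)
    documents = loader.load()

    # Long PDF pages are split into chunks small enough to embed.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    return splitter.split_documents(documents)
```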
📖 There are two ways of doing embeddings: standard Hugging Face embeddings or instructor embeddings.
🔎 Instructor embeddings are instruction-tuned, so they can be tailored to a specific task or domain.
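The two embedding options mentioned above can be sketched as below; the model names are common defaults and the `device` argument is an assumption, not necessarily what the video uses:

```python
def make_embeddings(use_instructor: bool = True, device: str = "cuda"):
    """Return one of the two local embedding options."""
    if use_instructor:
        # Instructor embeddings (requires the InstructorEmbedding package);
        # lazy import so the sketch is definable without the dependency.
        from langchain.embeddings import HuggingFaceInstructEmbeddings
        return HuggingFaceInstructEmbeddings(
            model_name="hkunlp/instructor-xl",  # assumed model choice
            model_kwargs={"device": device},
        )
    # Plain sentence-transformers embeddings via Hugging Face.
    from langchain.embeddings import HuggingFaceEmbeddings
    return HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-mpnet-base-v2"
    )
```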
💻 LangChain is used to run the embeddings locally.
📥 The model and necessary files are downloaded for usage.
💻 The embeddings are set up for vector storage.
🔍 ChromaDB is used to set up the vector store.
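Wiring the embeddings into a ChromaDB vector store might look like this sketch; the `persist_directory` name is an assumption:

```python
def build_vectorstore(texts, embedding, persist_directory: str = "db"):
    """Embed the chunks and store them in an on-disk Chroma index."""
    from langchain.vectorstores import Chroma  # lazy import: optional dep

    vectordb = Chroma.from_documents(
        documents=texts,
        embedding=embedding,
        persist_directory=persist_directory,
    )
    vectordb.persist()  # flush the index to disk so it can be reloaded later
    return vectordb
```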
🔑 The video shows how instructor embeddings are used in a retriever for LangChain Retrieval QA.
🔍 The retriever uses the instructor embeddings to find context chunks that match a given query.
📚 The top documents selected by the embeddings in the retriever provide relevant information for specific queries.
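Conceptually, the retriever ranks document chunks by the similarity between the query embedding and each chunk embedding, then returns the top k. A dependency-free toy sketch with made-up 3-dimensional vectors:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, doc_vecs, k=2):
    """Indices of the k document vectors most similar to the query."""
    scored = sorted(
        enumerate(doc_vecs),
        key=lambda iv: cosine(query_vec, iv[1]),
        reverse=True,
    )
    return [i for i, _ in scored[:k]]

# Toy "embeddings": docs 0 and 1 are close to the query, doc 2 is not.
doc_vecs = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.0], [0.0, 1.0, 0.2]]
query = [1.0, 0.05, 0.0]
print(top_k(query, doc_vecs, k=2))  # -> [1, 0]
```

A real vector store does the same ranking, just over high-dimensional embeddings with an optimized index instead of a linear scan.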
💡 LangChain Retrieval QA can find answers within the same paper, for example providing information about Toolformer.
🔎 By asking questions about Toolformer, we can learn about its functionalities and the tools that can be used with it.
📚 LangChain Retrieval QA is useful for extracting specific information from papers and can even provide insights from related survey papers.
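Tying the retriever to a QA chain is the standard LangChain pattern; a sketch, where the choice of `k` and `chain_type="stuff"` are common defaults rather than confirmed details from the video:

```python
def build_qa_chain(llm, vectordb, k: int = 3):
    """Wrap a vector store in a Retrieval QA chain."""
    from langchain.chains import RetrievalQA  # lazy import: optional dep

    # The retriever returns the k most similar chunks for each query.
    retriever = vectordb.as_retriever(search_kwargs={"k": k})
    return RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",  # concatenate retrieved chunks into the prompt
        retriever=retriever,
        return_source_documents=True,  # also report which chunks were used
    )
```

Usage would then be something like `qa({"query": "What is Toolformer?"})`, which returns both an answer and the source chunks it was drawn from.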
📚 Using the instructor embedding system works well without relying on OpenAI for embeddings or language models.
🔍 The system is able to retrieve information and answer specific questions about retrieval augmentation and differences between REALM and RAG models.
🔒 Keeping embeddings local gives more privacy, since documents are not sent to an external large language model provider.
💡 A local language model can be used both for generating replies and for computing embeddings.
💻 The ChromaDB database can be deleted and then brought back from the persisted directory.
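Because the index was persisted to disk, it can be wiped and reopened without re-embedding. A sketch (directory name assumed):

```python
import shutil

def reload_vectorstore(embedding, persist_directory: str = "db"):
    """Re-open a previously persisted Chroma index from disk."""
    from langchain.vectorstores import Chroma  # lazy import: optional dep
    return Chroma(
        persist_directory=persist_directory,
        embedding_function=embedding,
    )

def delete_vectorstore(persist_directory: str = "db"):
    """Remove the on-disk index entirely (irreversible)."""
    shutil.rmtree(persist_directory, ignore_errors=True)
```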
🔧 Exploring custom models for various tasks.