Photo Credits to DALL·E 3

Jais on Inference Endpoints

Tip Please check this out in Colab to run the code easily! Introduction Goal I want jais-13B deployed with an API quickly and easily. Info In this blog you will learn: How to leverage TGI and Inference Endpoints with jais How to deploy a model on the HW of your choice using the Hub Client Library Fundamental concepts on how decoding works and why they matter Approach There are lots of options out there that are "1-click", which is really cool!...
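As a rough illustration of the Hub Client Library route, here is a minimal sketch using huggingface_hub's create_inference_endpoint; the endpoint name, repository id, vendor, region, and instance size/type below are placeholders, not the post's actual configuration.

```python
from huggingface_hub import create_inference_endpoint

# Hypothetical configuration -- swap in the repository and hardware you actually want.
endpoint = create_inference_endpoint(
    "jais-13b-demo",                     # endpoint name (placeholder)
    repository="core42/jais-13b-chat",   # model repo on the Hub (placeholder)
    framework="pytorch",
    task="text-generation",
    accelerator="gpu",
    vendor="aws",
    region="us-east-1",
    type="protected",
    instance_size="x4",                  # placeholder size
    instance_type="nvidia-a10g",         # placeholder instance
)
endpoint.wait()                          # block until the endpoint is running
print(endpoint.url)
```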

January 22, 2024 · 7 min · 1431 words · Derek Thomas
Photo Credits to DALL·E 3

Arabic RAG 6: Putting it together

Goal This is part 6 of 6 in our tutorial on Arabic RAG. We have created all of the components we need to make our RAG solution. All that is left is to stitch them together! Note In this blog you will learn how to: Quickly and efficiently deploy jais using Inference Endpoints Combine all the components of RAG into a functional system Create a beautiful Gradio App for RAG If you want to skip all this and actually try the app, here it is: https://huggingface....
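To give a feel for the final step, here is a minimal Gradio sketch that calls a deployed endpoint; the endpoint URL is a placeholder and the retrieval step is omitted, so this is not the full app from the post.

```python
import gradio as gr
from huggingface_hub import InferenceClient

# Placeholder endpoint URL from the deployment step.
client = InferenceClient(model="https://<your-endpoint>.endpoints.huggingface.cloud")

def answer(question: str) -> str:
    # A real RAG app would retrieve context and prepend it to the prompt here.
    return client.text_generation(question, max_new_tokens=200)

gr.Interface(fn=answer, inputs="text", outputs="text").launch()
```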

December 13, 2023 · 6 min · 1161 words · Derek Thomas
Photo Credits to DALL·E 3

Arabic RAG 5: VectorDB

Goal This is part 5 of 6 in our tutorial on Arabic RAG. I'll be using this blog as a guide, but to actually run this tutorial, it's best that you run this notebook as described in part 1. Info In this blog you will learn: Important VectorDB considerations How to prepare your data for ingestion How to use LanceDB for RAG Approach VectorDB Why do we even need a VectorDB?...
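For a sense of what the LanceDB piece looks like, here is a minimal sketch; the table name, schema, and 384-dimensional dummy vectors are assumptions for illustration, not the post's actual setup.

```python
import lancedb

db = lancedb.connect("./lancedb")  # local, file-based database

# Hypothetical schema: each row holds a text chunk and its embedding vector.
table = db.create_table(
    "wiki_ar",
    data=[{"text": "example chunk", "vector": [0.1] * 384}],
    mode="overwrite",
)

# Query with a (dummy) query embedding and return the closest chunks.
hits = table.search([0.1] * 384).limit(5).to_pandas()
print(hits[["text", "_distance"]])
```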

December 6, 2023 · 6 min · 1186 words · Derek Thomas
Photo Credits to DALL·E 3

Arabic RAG 4: Get Embeddings

Goals This is part 4 of 6 in our tutorial on Arabic RAG. I'll be using this blog as a guide, but to actually run this tutorial, it's best that you run this notebook as described in part 1. Before diving in, I want you to think about how much money it costs to embed 2M articles. Make an estimate and see how accurate your guess is at the end of the blog....
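As a back-of-the-envelope warm-up for that estimate, here is a sketch with made-up prices and throughput; none of these numbers come from the post, so treat them purely as placeholders.

```python
# Illustrative cost estimate for embedding 2M articles (all numbers are assumptions).
articles = 2_000_000
avg_tokens_per_article = 500        # assumed average article length
price_per_1k_tokens = 0.0001        # assumed hosted-API pricing
api_cost = articles * avg_tokens_per_article / 1_000 * price_per_1k_tokens
print(f"Hosted API estimate: ${api_cost:,.0f}")       # $100 under these assumptions

gpu_price_per_hour = 1.30           # assumed GPU endpoint price
articles_per_hour = 150_000         # assumed embedding throughput
gpu_cost = articles / articles_per_hour * gpu_price_per_hour
print(f"Self-hosted GPU estimate: ${gpu_cost:,.2f}")  # ~$17 under these assumptions
```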

December 3, 2023 · 8 min · 1502 words · Derek Thomas
Photo Credits to DALL·E 3

Arabic RAG 3: Pre-Processing

Goal This is part 3 of 6 in our tutorial on Arabic RAG. I'll be using this blog as a guide, but to actually run this tutorial, it's best that you run this notebook as described in part 1. In this blog you will learn: Chunking Considerations How to leverage the very useful Haystack library from Deepset for preprocessing your data for RAG How to structure your data and code for parallel pre-processing Preprocessing import json from pathlib import Path import pickle from tqdm....
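Since the post leans on Haystack (1.x) for preprocessing, here is a minimal chunking sketch; the split settings and the sample document dict are illustrative assumptions rather than the post's exact parameters.

```python
from haystack.nodes import PreProcessor

# Illustrative chunking settings -- the post's actual parameters may differ.
preprocessor = PreProcessor(
    split_by="word",
    split_length=200,
    split_overlap=20,
    split_respect_sentence_boundary=True,
)

docs = preprocessor.process(
    [{"content": "نص المقال الكامل هنا ...", "meta": {"title": "عنوان المقال"}}]
)
print(len(docs), "chunks")
```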

November 30, 2023 · 10 min · 1973 words · Derek Thomas
Photo Credits to DALL·E 3

Arabic RAG 2: Tokenizer Analysis

Goal This is part 2 of 6 in our tutorial on Arabic RAG. I'll be using this blog as a guide, but to actually run this tutorial, it's best that you run this notebook as described in part 1. In this blog you will learn: How to choose an Embedding Model Why you need to think about token analysis for Arabic RAG How to analyze a tokenizer to estimate words per token How to visualize this to justify your decisions Why Analyze Tokenization?...
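A minimal sketch of the words-per-token measurement, assuming a short Arabic sample and two arbitrary tokenizers from the Hub (the post's corpus and model shortlist will differ):

```python
from transformers import AutoTokenizer

sample = "هذه جملة عربية قصيرة لاختبار المرمز"  # tiny stand-in for a real corpus

for model_id in ["intfloat/multilingual-e5-large", "sentence-transformers/all-MiniLM-L6-v2"]:
    tok = AutoTokenizer.from_pretrained(model_id)
    n_tokens = len(tok(sample, add_special_tokens=False)["input_ids"])
    n_words = len(sample.split())
    print(f"{model_id}: {n_words / n_tokens:.2f} words per token")
```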

November 28, 2023 · 6 min · 1193 words · Derek Thomas
Photo Credits to DALL·E 3

Arabic RAG 1: Getting the Data

Goal This is part 1 of 6 in our tutorial on Arabic RAG. RAG is short for Retrieval Augmented Generation. It took its name from Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, though its current usage is much more similar to the RALM paper. In this tutorial you will learn: Why RAG is important How to download Wikipedia How to format Wikipedia for scalable processing Addressing Hallucinations Large Language Models (LLMs) get blamed (though unfairly IMHO) for "hallucinating"....
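One way to pull Arabic Wikipedia from the Hub is via the datasets library; the dataset id and snapshot below are an assumption, not necessarily the dump used in the post.

```python
from datasets import load_dataset

# Assumed snapshot/config; pick whichever dump you actually want.
wiki_ar = load_dataset("wikimedia/wikipedia", "20231101.ar", split="train")
print(len(wiki_ar), "articles;", wiki_ar[0]["title"])
```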

November 26, 2023 · 4 min · 850 words · Derek Thomas