LLM Fine-Tuning and RAG Course Chatbot

CS639 project on quantized LLM inference, LoRA fine-tuning, and RAG with Elasticsearch and Streamlit.


[Figure: LLM RAG pipeline flow chart]

This CS639 project explored practical LLM workflows for a course-assistant application. It compared three approaches to exam preparation: inference with the pre-trained model, LoRA fine-tuning on lecture transcripts, and retrieval-augmented generation (RAG).

What It Does

  • Runs 4-bit quantized Llama-3.2-1B-Instruct inference with HuggingFace Transformers.
  • Fine-tunes the model on course lecture transcripts using LoRA.
  • Loads course transcripts into Elasticsearch.
  • Builds a Streamlit RAG chatbot using Haystack and HuggingFace generation, displaying retrieved transcript evidence for transparency.
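
The retrieve-then-generate flow in the last two items can be illustrated with a small, dependency-free sketch. It is not the project's code: the transcript chunks, their IDs, the stopword list, and the helper names (`retrieve`, `build_prompt`) are all hypothetical, and a simple term-overlap score stands in for the Elasticsearch retrieval that Haystack would perform. The generation step is left out; the resulting prompt is what would be passed to the model.

```python
import re
from collections import Counter

# Hypothetical transcript chunks. The project stored real lecture
# transcripts in Elasticsearch; these strings are made up for illustration.
TRANSCRIPTS = [
    {"id": "lec01-003", "text": "Gradient descent updates the weights in the direction of the negative gradient."},
    {"id": "lec04-012", "text": "LoRA freezes the base weights and trains small low-rank adapter matrices."},
    {"id": "lec07-020", "text": "Retrieval-augmented generation adds retrieved passages to the prompt."},
]

# Tiny illustrative stopword list so common words do not dominate the score.
STOPWORDS = {"the", "a", "an", "in", "of", "to", "and", "how", "does", "is", "what"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms, dropping stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def retrieve(query, docs, top_k=2):
    """Rank docs by query-term occurrences (a toy stand-in for a search engine)."""
    query_terms = set(tokenize(query))
    scored = []
    for doc in docs:
        counts = Counter(tokenize(doc["text"]))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((score, doc))
    # Highest score first; documents with no matching term were never added.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(question, evidence):
    """Put the retrieved chunks, tagged with their IDs, in front of the question."""
    context = "\n".join(f"[{doc['id']}] {doc['text']}" for doc in evidence)
    return (
        "Answer the question using only the lecture excerpts below.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )


question = "How does LoRA fine-tuning change the weights?"
evidence = retrieve(question, TRANSCRIPTS)
prompt = build_prompt(question, evidence)
```

Keeping `evidence` as a separate value from `prompt` is what makes the transparency feature straightforward: the same list that feeds the prompt can be rendered in the chat interface next to the answer.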

Tech Stack

HuggingFace Transformers, bitsandbytes, PEFT/LoRA, Elasticsearch, Haystack, Streamlit, Python.
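
As a rough sketch of how the first three libraries fit together, the snippet below loads a model in 4-bit precision and attaches LoRA adapters. Everything specific in it is an assumption rather than the project's configuration: the Hub ID `meta-llama/Llama-3.2-1B-Instruct` (a gated model that requires access approval), the NF4 quantization type, and the LoRA rank, scaling, dropout, and target modules. The helper names `load_4bit_model` and `add_lora_adapters` are hypothetical. Running the loaders requires `torch`, `transformers`, `bitsandbytes`, `peft`, and a CUDA GPU.

```python
# Illustrative settings only; the project's actual values are not stated.
MODEL_ID = "meta-llama/Llama-3.2-1B-Instruct"  # assumed Hub ID for the named model

LORA_SETTINGS = {
    "r": 8,                                  # adapter rank (assumed)
    "lora_alpha": 16,                        # scaling factor (assumed)
    "lora_dropout": 0.05,                    # dropout on adapter inputs (assumed)
    "target_modules": ["q_proj", "v_proj"],  # attention projections (assumed)
    "task_type": "CAUSAL_LM",
}


def load_4bit_model(model_id=MODEL_ID):
    """Load the tokenizer and a 4-bit quantized causal language model."""
    # Local imports keep this file importable without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=quant_config, device_map="auto"
    )
    return tokenizer, model


def add_lora_adapters(model, settings=LORA_SETTINGS):
    """Wrap a quantized model so that only the LoRA adapter weights are trained."""
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    model = prepare_model_for_kbit_training(model)
    return get_peft_model(model, LoraConfig(**settings))
```

A typical call sequence would be `tokenizer, model = load_4bit_model()` followed by `model = add_lora_adapters(model)`, after which the wrapped model can be handed to a training loop over the transcript data.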

Project Specification

