LLM Fine-Tuning and RAG Course Chatbot

Figure: LLM RAG pipeline flow chart

This CS639 project explored practical LLM workflows for course-assistant applications. It compared three approaches to exam preparation: inference with the pre-trained model, LoRA fine-tuning on lecture transcripts, and retrieval-augmented generation (RAG).

What It Does

Runs 4-bit quantized Llama-3.2-1B-Instruct inference with HuggingFace Transformers.
Fine-tunes the model on course lecture transcripts using LoRA.
Loads course transcripts into Elasticsearch.
Builds a Streamlit RAG chatbot using Haystack and HuggingFace generation, displaying retrieved transcript evidence for transparency.
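
The retrieve-then-generate flow behind the chatbot can be sketched in plain Python. This is an illustrative sketch only, not the project's code: a toy word-overlap ranker stands in for the Elasticsearch/Haystack retriever, a stub function stands in for the quantized Llama model, and every name here (Passage, retrieve, build_prompt, answer, stub_generate, the lecture ids and texts) is hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # hypothetical lecture identifier
    text: str    # transcript excerpt


def tokenize(text: str) -> set[str]:
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return {w.strip(".,?!:;()").lower() for w in text.split()} - {""}


def retrieve(query: str, passages: list[Passage], top_k: int = 2) -> list[Passage]:
    """Toy stand-in for the search backend: rank passages by shared-word count."""
    q = tokenize(query)
    scored = [(len(q & tokenize(p.text)), p) for p in passages]
    scored = [sp for sp in scored if sp[0] > 0]  # drop passages with no overlap
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in scored[:top_k]]


def build_prompt(query: str, evidence: list[Passage]) -> str:
    """Place the retrieved excerpts ahead of the question."""
    context = "\n".join(
        f"[{i}] ({p.source}) {p.text}" for i, p in enumerate(evidence, 1)
    )
    return (
        "Answer the question using only the lecture excerpts below.\n\n"
        f"{context}\n\nQuestion: {query}\nAnswer:"
    )


def answer(query: str, passages: list[Passage], generate):
    """Return the generated reply together with the evidence it was given."""
    evidence = retrieve(query, passages)
    prompt = build_prompt(query, evidence)
    return generate(prompt), evidence


# Made-up transcript excerpts, for illustration only.
PASSAGES = [
    Passage("lecture-01", "LoRA adds small low-rank adapter matrices to a frozen model."),
    Passage("lecture-02", "Quantization stores model weights in fewer bits to save memory."),
    Passage("lecture-03", "Retrieval-augmented generation grounds answers in retrieved documents."),
]


def stub_generate(prompt: str) -> str:
    """Placeholder for the real language model call."""
    return "(model output would appear here)"


QUESTION = "How does LoRA change a frozen model?"
reply, evidence = answer(QUESTION, PASSAGES, stub_generate)
for p in evidence:
    print(f"{p.source}: {p.text}")
```

Returning the evidence alongside the reply is what lets a front end such as the Streamlit app show the retrieved transcript passages next to the answer.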

Tech Stack

HuggingFace Transformers, bitsandbytes, PEFT/LoRA, Elasticsearch, Haystack, Streamlit, Python.

Links

Project Specification
