Fine-Tuning T5: A University Cafeteria Chatbot
Training a Text-to-Text Transfer Transformer (T5) to answer questions about the UAM cafeteria menus using Hugging Face and PyTorch.
Project Overview
This project demonstrates the process of fine-tuning a pre-trained language model to specialize in a specific domain. By using a “text-to-text” approach, we transform a general-purpose model into a specialized assistant capable of answering questions about the menus, prices, and services of the Universidad Autónoma de Madrid (UAM) cafeterias.
Technologies Used
- Python: The core programming language.
- Hugging Face Transformers: Library for accessing the T5 model and the Trainer API.
- PyTorch: Deep learning framework used for tensor manipulation and hardware acceleration (CUDA/MPS).
- SentencePiece: Tokenization backend for the T5 model.
- JSON: Format used for the custom Q&A dataset.
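The README does not reproduce the schema of the Q&A file, so the shape below is an assumption for illustration (the "question"/"answer" keys are hypothetical, not confirmed by the project):

```python
import json

# Hypothetical shape for cafeteria.json; the "question"/"answer" keys
# are illustrative assumptions, not the project's confirmed schema.
pairs = [
    {"question": "¿Cuánto cuesta el café con leche?", "answer": "1,20 €"},
    {"question": "¿El bocata de jamón es vegano?", "answer": "NO"},
]

with open("cafeteria.json", "w", encoding="utf-8") as f:
    json.dump(pairs, f, ensure_ascii=False, indent=2)
```

A flat list of pairs like this maps directly onto T5's text-to-text format: each question becomes the input string and each answer the target string.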
Methodology
The workflow follows the standard pipeline for Natural Language Processing (NLP) adaptation:
- Conceptual Setup: Understanding that T5 (Text-to-Text Transfer Transformer) views every task as a string-to-string problem.
- Data Tokenization: Converting Spanish natural language queries into numerical “tokens” that the neural network can process.
- Custom Dataset Creation: Implementing a CafeteriaDataset class in PyTorch to handle input/target encoding, padding, and truncation.
- Model Training: Using the Trainer API to perform backpropagation over 10 epochs, adjusting the model's internal weights (parameters) to minimize the error between generated and expected answers.
- Inference Pipeline: Developing a function to load the fine-tuned weights and generate real-time responses using Beam Search for better text quality.
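The dataset and inference steps above can be sketched as follows. This is a minimal sketch, not the project's exact code: the method signatures, field names, and hyperparameters (max length, beam count) are assumptions.

```python
import torch
from torch.utils.data import Dataset


class CafeteriaDataset(Dataset):
    """Wraps (question, answer) pairs for T5's text-to-text training."""

    def __init__(self, pairs, tokenizer, max_len=64):
        self.pairs = pairs          # list of {"question": ..., "answer": ...}
        self.tokenizer = tokenizer  # e.g. T5TokenizerFast.from_pretrained("t5-small")
        self.max_len = max_len

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        item = self.pairs[idx]
        enc = self.tokenizer(item["question"], max_length=self.max_len,
                             padding="max_length", truncation=True,
                             return_tensors="pt")
        tgt = self.tokenizer(item["answer"], max_length=self.max_len,
                             padding="max_length", truncation=True,
                             return_tensors="pt")
        labels = tgt["input_ids"].squeeze(0)
        # Pad positions are set to -100 so the cross-entropy loss ignores them.
        labels[labels == self.tokenizer.pad_token_id] = -100
        return {"input_ids": enc["input_ids"].squeeze(0),
                "attention_mask": enc["attention_mask"].squeeze(0),
                "labels": labels}


def answer(question, model, tokenizer, num_beams=4, max_new_tokens=32):
    """Generate an answer from a fine-tuned T5 model using beam search."""
    inputs = tokenizer(question, return_tensors="pt")
    output_ids = model.generate(**inputs, num_beams=num_beams,
                                max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With this in place, training reduces to handing the dataset and a TrainingArguments object (with num_train_epochs=10) to Hugging Face's Trainer and calling its train() method.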
Key Concepts: The “Attention” Mechanism
Unlike older models that read text strictly word-by-word, the Transformer architecture uses Attention. This allows the model to:
- Identify that in the question “How much does a coffee cost?”, the words “cost” and “coffee” are the most relevant.
- Ignore filler characters or punctuation that don’t add semantic value to the price inquiry.
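The core computation behind this weighting can be illustrated with a minimal scaled dot-product self-attention pass in PyTorch. The embeddings here are random toy values, not the model's real weights, and real Transformers project separate query/key/value matrices first:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
dim = 8
x = torch.randn(1, 4, dim)  # 4 toy "token" embeddings: (batch, seq, dim)

# Self-attention: queries, keys, and values all come from the same tokens.
scores = x @ x.transpose(-2, -1) / dim ** 0.5  # pairwise relevance scores
weights = F.softmax(scores, dim=-1)            # each row sums to 1
context = weights @ x                          # tokens mixed by relevance
```

The softmax rows are exactly the "attention" distribution: for each token, they say how much every other token (e.g. "coffee" for "cost") contributes to its updated representation.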
Results
By the end of the training process, the model successfully transitioned from general language understanding to specific domain expertise. It can now accurately handle queries such as:
- Input: “¿Cuánto cuesta el café con leche?” (How much is a latte?)
- Output: The specific price defined in the cafeteria.json dataset.
- Input: “¿El bocata de jamón es vegano?” (Is the ham sandwich vegan?)
- Output: “NO”, a context-aware refusal based on the ingredients.