Python · NLP · NLTK · OpenCV · Image Processing · Chatbot

Multimodal AI Chatbot

Completed · Semester Project

A conversational AI that processes both natural language and digital images, enabling mood-reactive visual transformations and context-aware dialogue.

Project Snapshot

Core Capability

Dual-mode processing: conversational NLP + image manipulation

Key Innovation

Mood-based colormap filters that mirror conversation context to visuals

Technical Stack

Python, NLTK, OpenCV, Naïve Bayes classifier

User Interaction

Cross-modal: chat prompts image actions; image loading enriches conversation state

Key Contributions

  • Built a conversational core using NLTK pattern matching with pronoun reflection (ELIZA-style dialogue)
  • Implemented image I/O pipeline: file upload, preset library management, and display utilities
  • Developed three core image filters (grayscale, edge detection, Gaussian blur) plus mood-based colormap system
  • Created mood-context mapping so visual transformations reflect emotional state of conversation
  • Integrated Naïve Bayes classifier to automatically route user input to appropriate response handler
  • Built command parser for natural language filter requests (e.g., "apply edge to my image")

Research Focus

  • How can visual feedback reinforce conversational context and emotional tone?
  • What makes a chatbot feel "alive" vs. mechanical in multimodal interactions?
  • How can image processing serve as a tool for reflection rather than just automation?
  • What happens when dialogue and vision processing are tightly coupled rather than siloed?

Active Build Tracks

  • NLP conversation engine with context preservation across turns
  • Image pipeline: upload, storage, retrieval, and display management
  • Filter library: standard CV transforms + mood-based creative mapping
  • Integration layer bridging text commands to vision operations
  • Classifier for intent detection and routing
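Intent routing with a Naïve Bayes classifier can be sketched with NLTK's `NaiveBayesClassifier` over bag-of-words features. The intent labels and training phrases below are invented for illustration; the project's real training data is not shown here.

```python
from nltk.classify import NaiveBayesClassifier

def features(sentence):
    # Bag-of-words features: presence of each lowercase token.
    return {f"has({w})": True for w in sentence.lower().split()}

# Tiny hypothetical training set with two intents:
# route to the image pipeline vs. the conversation handler.
train = [
    (features("apply blur to my image"), "image_command"),
    (features("apply edge filter please"), "image_command"),
    (features("show me the picture"), "image_command"),
    (features("how are you today"), "chat"),
    (features("tell me a joke"), "chat"),
    (features("i feel sad"), "chat"),
]

classifier = NaiveBayesClassifier.train(train)
intent = classifier.classify(features("apply grayscale to my image"))
```

Once an input is labeled `image_command`, the command parser can extract the filter name; anything labeled `chat` falls through to the ELIZA-style responder.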


Overview

This project explores what happens when you merge conversational AI with real-time image manipulation. Rather than treating text and vision as separate domains, I designed a system where the chatbot's mood and conversation context directly influence how it transforms images. A user can ask the bot questions, load an image, and then apply a filter that reflects the emotional tone of the ongoing chat. The core innovation is the mood-based colormap system—instead of generic filters, the bot applies OpenCV colormaps (Spring for "happy," Ocean for "sad," Jet for "hype") so every transformation feels contextually grounded in conversation.

My Role

Solo developer. I owned the full architecture: conversation logic, image I/O, filter design, and integration. I chose NLTK pattern matching + reflection as the conversational foundation because it allowed me to focus engineering effort on the vision side while still maintaining fluid dialogue. For images, I designed the mood-mapping system to make visual feedback feel less like a utility and more like a collaborative creative act.

Outcome

The chatbot successfully processes both modalities in real time and feels cohesive—loading an image or applying a filter doesn't break the conversation; it extends it. The mood-based colormap system is the standout: it's a simple idea (map emotion names to color palettes), but it makes multimodal interaction feel intentional rather than bolted-on. Technically, all assignment requirements were met: custom image loading, preset library, multiple filter types, and a creative visual component. The larger takeaway: multimodal AI doesn't require massive models or complexity—it requires thoughtful integration of what each modality can express.