
How to Build an AI Document Chatbot using Next.js and Gemini API

Have You Ever Wanted to Chat with Your Documents?

In this tutorial, we'll build an AI document chatbot that can process PDF and Word documents, understand their content, and engage in conversations about them using Google's Gemini API.

Using Next.js for our web application and Google's Gemini AI, the chatbot will not only read your documents but also provide summaries and answer questions about their content.

Whether you're a developer looking to get started with AI or trying to build tools for document analysis, this project offers a practical introduction to combining web development with artificial intelligence.

Project Overview

We'll build a full-stack application that allows users to:

  • Upload PDF and Word documents
  • Process and extract text from documents
  • Get AI-generated summaries of uploaded documents
  • Chat with an AI about the document's contents

Table of Contents

  1. Project Setup
  2. Main Chat Interface
  3. Type Definitions
  4. API Routes
  5. Getting Your Gemini AI API Key
  6. Running the Project
  7. Deploy to Vercel

Prerequisites

  • Node.js 18+ installed
  • Basic knowledge of React and TypeScript
  • Code editor (VS Code recommended)
  • Google Cloud account with Gemini API access
  • Basic understanding of async/await and API calls

Project Setup

1. Create a New Next.js Project

Open up your VS Code terminal and create the project by running this command:

npx create-next-app@latest gemini-chatbot --typescript --tailwind

This command creates a new Next.js project folder called "gemini-chatbot" and configures it with TypeScript and Tailwind CSS.

Next, enter your new project directory using this command:

cd gemini-chatbot

Project Structure

This is how the project structure should look at the end of the article.

gemini-chatbot/
├── src/
│   ├── app/
│   │   ├── api/
│   │   │   ├── chat/
│   │   │   │   └── route.ts         # Chat API endpoint
│   │   │   └── process-document/
│   │   │       └── route.ts         # Document processing endpoint
│   │   ├── page.tsx                 # Main chat interface
│   │   ├── layout.tsx               # Root layout
│   │   └── globals.css              # Global styles
│   └── types/
│       └── chat.ts                  # Type definitions

2. Install Required Dependencies

Let's install the required dependencies for the project using this command:

npm install @google/generative-ai @langchain/community @langchain/google-genai lucide-react

  • @google/generative-ai: Provides access to Google's Gemini AI language models
  • @langchain/community: Provides document loaders for various file formats and handles text splitting and processing
  • @langchain/google-genai: Integrates LangChain with Google's Generative AI and enables features like chat memory and chains
  • lucide-react: Provides the icons used in the chat interface

Main Chat Interface (src/app/page.tsx)

The page.tsx file serves as the main chat interface of our application.

Initial Imports

'use client';

import { useState, useRef, useEffect } from 'react';
import { Message } from '@/types/chat';
import { Send, Upload, Loader, Bot } from 'lucide-react';

From the code above:

  • 'use client': Marks this as a client-side component
  • useState: For managing component state
  • useRef: For DOM references
  • useEffect: For side effects
  • Message: Our custom type for chat messages
  • Icon imports from lucide-react
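The Message type is imported from src/types/chat.ts (see the Type Definitions section). As a reference while reading this component, here is a minimal sketch of that type, inferred from how messages are constructed below (a role string plus text content):

```typescript
// src/types/chat.ts — minimal sketch of the Message type, inferred from how
// messages are created in page.tsx (user input, assistant replies, system notices)
export interface Message {
  role: 'user' | 'assistant' | 'system';
  content: string;
}
```

Keeping the role as a union of string literals lets TypeScript catch typos like 'asistant' at compile time.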

State and Refs

export default function Home() {
  // State management
  const [messages, setMessages] = useState<Message[]>([]);
  const [input, setInput] = useState('');
  const [isLoading, setIsLoading] = useState(false);
  const [error, setError] = useState<string | null>(null);
  const [documentContext, setDocumentContext] = useState<string>('');

  // Refs
  const messagesEndRef = useRef<HTMLDivElement>(null);
  const fileInputRef = useRef<HTMLInputElement>(null);

From the code above:

  • messages: Array of chat messages
  • input: Current text input value
  • isLoading: Loading state indicator
  • error: Error message state
  • documentContext: Stores processed document text
  • messagesEndRef: Reference for auto-scrolling
  • fileInputRef: Reference for the file input element

Auto-scroll Implementation

  // Auto-scroll to bottom of messages
  const scrollToBottom = () => {
    messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
  };

  useEffect(() => {
    scrollToBottom();
  }, [messages]);

From the code above:

  • scrollToBottom: Function to scroll to the latest message
  • useEffect: Triggers scroll when messages update

File Upload Handler Function

  // Handle file upload
  const handleFileUpload = async (files: FileList | null) => {
    if (!files) return;

    setIsLoading(true);
    setError(null);

    try {
      const allowedTypes = [
        'application/pdf',
        'application/msword',
        'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
      ];

      const uploadedFiles = Array.from(files)
        .filter(file => allowedTypes.includes(file.type));

      for (const file of uploadedFiles) {
        const formData = new FormData();
        formData.append('file', file);

        const response = await fetch('/api/process-document', {
          method: 'POST',
          body: formData,
        });

        if (!response.ok) throw new Error('Failed to process document');

        const { text, summary } = await response.json();

        setDocumentContext(prev => prev + '\n' + text);

        setMessages(prev => [
          ...prev,
          {
            role: 'system',
            content: `Document "${file.name}" has been processed.`
          },
          {
            role: 'assistant',
            content: `Here's a summary of the document:\n\n${summary}`
          }
        ]);
      }
    } catch (error) {
      setError('Failed to process document. Please try again.');
    } finally {
      setIsLoading(false);
    }
  };

This code validates file types, processes multiple files sequentially, updates the document context, adds system and summary messages, and handles errors and loading states.
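Two pieces of this handler are worth naming explicitly: the JSON shape it expects back from /api/process-document, and the MIME allow-list used for validation. A small sketch (the interface and helper names here are our own, for illustration):

```typescript
// Shape of the JSON the handler destructures from /api/process-document
interface ProcessDocumentResponse {
  text: string;    // full extracted document text, appended to documentContext
  summary: string; // AI-generated summary, shown as an assistant message
}

// The same MIME allow-list as in the handler, factored into a reusable check
const ALLOWED_TYPES = new Set([
  'application/pdf',
  'application/msword',
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
]);

function isSupportedDocument(mimeType: string): boolean {
  return ALLOWED_TYPES.has(mimeType);
}
```

The last entry in the allow-list is the MIME type for modern .docx files; plain 'application/msword' covers legacy .doc files.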

Chat Submission Handler Function

  // Handle chat submission
  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    if (!input.trim() || isLoading) return;

    const userMessage: Message = {
      role: 'user',
      content: input
    };

    setMessages(prev => [...prev, userMessage]);
    setInput('');
    setIsLoading(true);

    try {
      const response = await fetch('/api/chat', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          messages: [...messages, userMessage],
          documentContext
        }),
      });

      if (!response.ok) throw new Error('Failed to get response');

      const { content } = await response.json();

      setMessages(prev => [...prev, {
        role: 'assistant',
        content
      }]);
    } catch (error) {
      setError('Failed to send message. Please try again.');
    } finally {
      setIsLoading(false);
    }
  };

This code block does the following:

  • Prevents empty submissions
  • Adds user message immediately
  • Sends context to API
  • Handles API response
  • Updates messages with AI response
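On the server side, the /api/chat route (covered in the API Routes section) needs to fold the document context and chat history into a single prompt before calling Gemini. A hedged sketch of what that prompt assembly might look like — the helper name and prompt format are our own choices, not the route's required implementation:

```typescript
interface Message {
  role: 'user' | 'assistant' | 'system';
  content: string;
}

// Hypothetical helper: folds the document text and chat history into one
// prompt string that a text-generation model can consume
function buildPrompt(messages: Message[], documentContext: string): string {
  const history = messages
    .filter(m => m.role !== 'system') // system notices are UI-only
    .map(m => `${m.role === 'user' ? 'User' : 'Assistant'}: ${m.content}`)
    .join('\n');

  return [
    'Answer using the following document as context:',
    documentContext,
    '',
    'Conversation so far:',
    history,
    'Assistant:',
  ].join('\n');
}
```

Filtering out system messages keeps UI notices like "Document ... has been processed." from cluttering the model's context.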

Getting Your Gemini AI API Key

  1. Visit Google AI Studio and sign in with your Google account.
  2. Click "Get API Key", then click "Create API key" to generate your key.

Remember to monitor activity on your app to prevent abuse and unexpected Google Cloud billing, and never expose your API key in client-side code or public repositories.

Running the Project

1. Environment Variable Setup

Create a .env.local file in your project root:

GOOGLE_API_KEY=paste_your_gemini_api_key_here

Replace the placeholder with the Gemini API key you created earlier.
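Because the key lives in .env.local, the server-side API routes read it from process.env.GOOGLE_API_KEY. A small fail-fast sketch (the helper name is our own) that surfaces a clear error instead of a cryptic API failure when the variable is missing:

```typescript
// Hypothetical helper: read a required environment variable, or fail fast
// with a clear message if it was never set
function requireEnv(env: Record<string, string | undefined>, name: string): string {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

// In a route handler: const apiKey = requireEnv(process.env, 'GOOGLE_API_KEY');
```

Note that .env.local is git-ignored by create-next-app, which keeps the key out of your repository.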

2. Start the Development Server

npm run dev

3. Open http://localhost:3000 in Your Browser

Deploy to Vercel

To deploy your project to Vercel, you must have a Vercel account (you can sign up with your GitHub account).

Steps to Deploying Your Project

  1. Prepare Your Project

    Ensure your project is production-ready:

    npm run build
    
  2. Push to GitHub

    To push your project to GitHub, run these commands sequentially:

    # Initialize git repository (if not already done)
    git init
    
    # Add all files
    git add .
    
    # Commit changes
    git commit -m "Initial commit"
    
    # Add your GitHub repository as remote
    git remote add origin https://github.com/yourusername/your-repo-name.git
    
    # Push to GitHub
    git push -u origin main
    
  3. Deploy to Vercel

    Go to your Vercel Dashboard, click "New Project", and import your GitHub repository.

  4. Set the Environment Variables

    Add your Gemini API key to environment variables in the imported project settings:

    GOOGLE_API_KEY=your_gemini_api_key_here
    
  5. Visit Your Deployed Site

    Once the build completes, Vercel provides a link where you can visit your deployed site.

Conclusion

This project shows how to build an AI chatbot that can interact with documents using Google's Gemini AI. Along the way, we covered file upload and validation, integrating an AI model into API routes, and building a clean chat UI.

The complete code is available in the repository, and you can extend it further by adding features like:

  • Authentication
  • Document management
  • More file format support
  • Improved error handling
  • Conversation history persistence
  • Pricing

Author
Jethro Magaji
Jethro Magaji is a student at Kaduna State University with frontend development and UI/UX design skills. He is passionate about blockchain technology and uses creative, user-centered thinking to solve business problems. He spends most of his time either learning a new skill or teaching others what he loves doing best.