QueryMind - Intelligent Database Querying Through Natural Language

Abstract

Businesses need natural-language access to their databases, but exposing records to large language models compromises data confidentiality and conflicts with governance mandates. This paper introduces QueryMind, a system that addresses the problem by translating natural-language questions into safe SQL through a schema-aware Retrieval-Augmented Generation (RAG) pipeline with SELECT-only validation.

We tested the system on an e-commerce database with five model configurations. The best setup, Llama 3.1:8b with mxbai-embed-large, achieved 100% accuracy across all benchmark queries. The entire pipeline averaged 33 seconds per query, with 86% of that time spent on LLM generation. The smaller models answered faster, in 4 to 7 seconds on average, but were only 40% to 60% accurate.

Security tests across five scenarios indicated that the system withstood SQL injection, XSS, and session-hijacking attacks. Local and remote databases performed almost identically, with only a 7% difference in total processing time. The results show that larger LLMs take longer to process but generate more accurate and safer SQL. They also show that a fully local, schema-restricted setup is reliable and keeps data private.

The Problem

SQL Complexity

Traditional database querying requires deep knowledge of SQL syntax, schema structure, and table relationships.

Accessibility Barrier

Non-technical business users and executives struggle to extract data insights without technical support.

Privacy Concerns

Organizations cannot expose sensitive data to external AI APIs due to privacy and compliance requirements.

Key Features

💬

Natural Language Interface

Ask questions in plain English like "Show customers in Bahrain" or "Count orders per customer"

🧠

RAG-Enhanced Accuracy

Retrieval-Augmented Generation ensures accurate SQL queries by leveraging schema context
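
The retrieval step can be illustrated with a toy example. This is a minimal sketch, not QueryMind's actual retriever: it swaps in bag-of-words cosine similarity for a real embedding model such as mxbai-embed-large, and the schema snippets, `tokenize`, `cosine`, and `retrieve` are all hypothetical names introduced here for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical schema snippets: table and column names only, no row data.
SCHEMA_DOCS = {
    "customers": "table customers columns id name country email",
    "orders": "table orders columns id customer_id order_date total",
    "products": "table products columns id name price category",
}

def tokenize(text):
    """Lowercase, split on non-letters, and strip a plural 's' (crude stemming)."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words]

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, k=2):
    """Return the names of the k schema snippets most similar to the question."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(doc))), name) for name, doc in SCHEMA_DOCS.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]

print(retrieve("Count orders per customer"))  # ['orders', 'customers']
```

Only the retrieved snippets would then be placed in the LLM prompt, which keeps the prompt short and focused on the tables relevant to the question.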

🔐

Schema-Only Access

The LLM accesses only the database schema (table and column names), never the actual records, which keeps row data private
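
One way to build such a schema-only context is to read database metadata rather than table contents. The sketch below is an assumption-laden illustration: it uses Python's built-in `sqlite3` as a stand-in for MariaDB (where the equivalent metadata would typically come from `information_schema`), and `schema_only_context` is a hypothetical helper, not a QueryMind function.

```python
import sqlite3

def schema_only_context(conn):
    """Build 'table(col type, ...)' lines from metadata; no data rows are selected."""
    cur = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    tables = [row[0] for row in cur.fetchall()]
    lines = []
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        col_text = ", ".join(f"{col[1]} {col[2]}" for col in cols)
        lines.append(f"{table}({col_text})")
    return "\n".join(lines)

# Demo database (sample data for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
conn.execute("INSERT INTO customers VALUES (101, 'Ahmed Ali', 'Bahrain')")

context = schema_only_context(conn)
print(context)  # customers(id INTEGER, name TEXT, country TEXT)
```

The stored customer name never appears in `context`, because the function queries only the catalog and column definitions.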

🖥️

100% Local Processing

No data sent to external APIs; runs entirely on your machine for complete privacy

👤

Session Isolation

Each user gets isolated vector stores for their schema, ensuring data separation
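
The isolation idea can be sketched as a registry keyed by session ID. This is a simplified illustration under stated assumptions: a plain list stands in for a per-user vector collection, and the `SessionStores` class and its methods are hypothetical names, not part of QueryMind or ChromaDB.

```python
import uuid

class SessionStores:
    """Keeps one private schema store per session ID.

    A plain list stands in for a per-session vector collection (assumption).
    """

    def __init__(self):
        self._stores = {}

    def create_session(self):
        # Random, hard-to-guess identifier for the new session.
        session_id = uuid.uuid4().hex
        self._stores[session_id] = []
        return session_id

    def add_schema(self, session_id, chunk):
        self._stores[session_id].append(chunk)

    def get_schema(self, session_id):
        # Return a copy so callers cannot mutate the stored list by reference.
        return list(self._stores[session_id])

    def close_session(self, session_id):
        # Drop the session's store entirely when the user logs out.
        self._stores.pop(session_id, None)

stores = SessionStores()
alice = stores.create_session()
bob = stores.create_session()
stores.add_schema(alice, "customers(id, name, country)")
print(stores.get_schema(alice))  # ['customers(id, name, country)']
print(stores.get_schema(bob))    # []
```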

⚡

Performance Metrics

Track RAG retrieval, LLM generation, and execution times for transparency
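
Per-stage timing of this kind can be collected with a small context manager. The following is a minimal sketch; `timed` and `format_timings` are hypothetical helper names, and the `time.sleep` call is only a placeholder for a real pipeline stage.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(stage, timings):
    """Record the wall-clock duration of a pipeline stage into `timings`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start

def format_timings(timings):
    """Render timings in insertion order, e.g. 'RAG: 2.45s LLM: 28.10s'."""
    return " ".join(f"{stage}: {seconds:.2f}s" for stage, seconds in timings.items())

timings = {}
with timed("RAG", timings):
    time.sleep(0.01)  # placeholder for schema retrieval
with timed("LLM", timings):
    time.sleep(0.01)  # placeholder for SQL generation
with timed("Exec", timings):
    time.sleep(0.01)  # placeholder for query execution
print(format_timings(timings))
```

Using `try`/`finally` means a stage's duration is still recorded when the stage raises an exception, which helps when diagnosing slow failures.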

See QueryMind in Action

Watch how natural-language questions are converted to SQL queries in real time

1. Ask Your Question

List all customers from Bahrain

2. AI Generates SQL

SELECT * FROM customers
WHERE country = 'Bahrain';

RAG: 2.45s, LLM: 28.1s, Exec: 0.03s

3. View Results

ID	Name	Country
101	Ahmed Ali	Bahrain
205	Sara Khalid	Bahrain
312	Omar Jassim	Bahrain

System Architecture

User Interface (Flask Web App)

↓

Input Validator | RAG Retrieval | SQL Validator

↓

ChromaDB | Ollama LLM | MariaDB
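
The SQL Validator stage in the architecture above enforces SELECT-only queries before anything reaches the database. The sketch below shows one conservative way to do that; it is an illustration, not QueryMind's actual validator. The function name `is_safe_select` and the keyword deny-list are assumptions, and a production system would likely pair such a check with a read-only database account.

```python
import re

# Keywords that should never appear in a read-only query (illustrative list).
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant|revoke|into|load_file)\b",
    re.IGNORECASE,
)

def is_safe_select(sql):
    """Return True only for a single, comment-free SELECT statement."""
    statement = sql.strip()
    # Allow one trailing semicolon, but reject stacked statements.
    if statement.endswith(";"):
        statement = statement[:-1].rstrip()
    if ";" in statement:
        return False
    # Reject comment markers, which are often used to hide injected payloads.
    if "--" in statement or "/*" in statement or "#" in statement:
        return False
    if not re.match(r"(?is)^select\b", statement):
        return False
    # Blank out string literals so a value like 'drop' is not mistaken for a keyword.
    without_literals = re.sub(r"'(?:[^']|'')*'", "''", statement)
    return FORBIDDEN.search(without_literals) is None

print(is_safe_select("SELECT * FROM customers WHERE country = 'Bahrain';"))  # True
print(is_safe_select("SELECT 1; DROP TABLE customers"))                      # False
```

The check is deliberately strict: it rejects some harmless queries (for example, a literal containing a semicolon, or a query starting with `WITH`) in exchange for a simpler rule that is easier to audit.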

Experimental Results

We evaluated QueryMind with different model configurations to find the optimal balance between accuracy and performance.

Model Configuration	Accuracy	Avg Generation Time
Llama 3.1:8b + mxbai-embed-large	100%	28.97s
Qwen3:1.7b + mxbai-embed-large	60%	6.69s
Gemma3:1b + mxbai-embed-large	60%	5.19s
Qwen3:1.7b + all-minilm	60%	4.52s
Gemma3:1b + all-minilm	40%	4.20s
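
The speed side of the trade-off in the table can be quantified with a short calculation. The figures below are copied from the table; the `speedups` helper is a hypothetical name used only for this illustration.

```python
# (configuration, accuracy in percent, average generation time in seconds), from the table above.
RESULTS = [
    ("Llama 3.1:8b + mxbai-embed-large", 100, 28.97),
    ("Qwen3:1.7b + mxbai-embed-large", 60, 6.69),
    ("Gemma3:1b + mxbai-embed-large", 60, 5.19),
    ("Qwen3:1.7b + all-minilm", 60, 4.52),
    ("Gemma3:1b + all-minilm", 40, 4.20),
]

def speedups(results):
    """Map each configuration to its speedup relative to the first (baseline) row."""
    baseline = results[0][2]
    return {name: baseline / seconds for name, _, seconds in results}

for (name, accuracy, _), factor in zip(RESULTS, speedups(RESULTS).values()):
    print(f"{name}: {accuracy}% accurate, {factor:.1f}x the baseline speed")
```

On these numbers the smaller configurations generate SQL roughly 4.3 to 6.9 times faster than the Llama 3.1:8b baseline, at the cost of 40 to 60 percentage points of accuracy.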

Key Findings

Larger models achieve higher accuracy but require more generation time
Smaller models are faster but less accurate for complex queries
Embedding model choice can affect accuracy: Gemma3:1b scored 60% with mxbai-embed-large but 40% with all-minilm, while Qwen3:1.7b scored 60% with both
All configurations pass security validation tests

Meet Our Team

Cybersecurity Graduates - Batch 2025

Mohamed Almaraghi

Project Manager

Ali Alsowaidi

LLM Developer

Abdulrahman Bumeajib

Backend Developer

Acknowledgment

🎓

Dr. Abdulla Aldoseri

Project Supervisor

We extend our deepest gratitude to Dr. Abdulla Aldoseri for his invaluable guidance, continuous support, and mentorship throughout this project. His expertise in artificial intelligence and cybersecurity provided crucial direction that helped us navigate complex technical challenges. Dr. Aldoseri's dedication to our success and his insightful feedback were instrumental in shaping QueryMind into a robust and secure system. We are truly grateful for his commitment to our academic and professional growth.