Database Query Prompt Templates

AI prompt templates for writing database queries. Create optimized SQL, MongoDB, and other database queries.

Overview

Database queries can make or break application performance. These prompts help you write queries that are correct, efficient, and maintainable, whether you're working with SQL, MongoDB, or other databases. The key is providing your schema context and explaining what data you need, not just asking for 'a query.'

Best Practices

1. Always include your table/collection schema: column names, types, and relationships
2. Describe the data you need in business terms, not just technical terms
3. Mention the database system and version (PostgreSQL 15, MySQL 8, MongoDB 6, etc.)
4. Specify data volumes so the AI can consider performance implications
5. Include any existing indexes or constraints that might affect query design
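
Practice 1, for instance, is easiest to satisfy by pasting `CREATE TABLE` statements directly rather than describing tables in prose; the tables below are purely illustrative:

```sql
-- Illustrative schema snippet to paste into a prompt
CREATE TABLE users (
  id         BIGINT PRIMARY KEY,
  email      VARCHAR(255) UNIQUE NOT NULL,
  created_at TIMESTAMP NOT NULL
);

CREATE TABLE orders (
  id         BIGINT PRIMARY KEY,
  user_id    BIGINT NOT NULL REFERENCES users(id), -- one user has many orders
  total      DECIMAL(10, 2) NOT NULL,
  created_at TIMESTAMP NOT NULL
);
```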

Prompt Templates

1. SQL Query Generator

Write a SQL query for this requirement:

**Database:** [POSTGRESQL / MYSQL / SQLITE / etc.]
**Tables involved:**
```sql
[PASTE SCHEMA OR DESCRIBE TABLES]
```

**What I need:**
[DESCRIBE THE DATA YOU WANT TO RETRIEVE]

**Filters/conditions:**
- [CONDITION 1]
- [CONDITION 2]

**Sorting:** [HOW TO ORDER RESULTS]
**Pagination:** [IF NEEDED]

**Example expected output:**
| column1 | column2 | column3 |
|---------|---------|---------|
| ...     | ...     | ...     |

**Performance notes:**
- Table sizes: [APPROXIMATE ROW COUNTS]
- Existing indexes: [LIST INDEXES IF KNOWN]
**Example input:**

Database: PostgreSQL 15
Tables: users (id, name, email, created_at), orders (id, user_id, total, status, created_at), order_items (order_id, product_id, quantity, price)
Need: Top 10 customers by total spend in the last 30 days with their order count
Filters: Only completed orders, exclude test accounts (email like '%@test.com')
Sorting: By total spend descending
Table sizes: 100K users, 5M orders
```sql
-- Top 10 customers by spend in last 30 days
-- Excludes test accounts and non-completed orders

SELECT 
 u.id,
 u.name,
 u.email,
 COUNT(o.id) AS order_count,
 SUM(o.total) AS total_spent
FROM users u
INNER JOIN orders o ON o.user_id = u.id
WHERE 
 o.status = 'completed'
 AND o.created_at >= CURRENT_DATE - INTERVAL '30 days'
 AND u.email NOT LIKE '%@test.com'
GROUP BY u.id, u.name, u.email
ORDER BY total_spent DESC
LIMIT 10;
```

**Performance recommendations:**

For 5M orders, add these indexes if not present:
```sql
-- Composite index for the WHERE + JOIN
CREATE INDEX idx_orders_user_status_date 
 ON orders(user_id, status, created_at)
 WHERE status = 'completed';
```

**Query explanation:**
- `INNER JOIN` excludes users with no orders (appropriate here since we want customers with spend)
- Filtering in WHERE before GROUP BY reduces rows to aggregate
- Partial index with `WHERE status = 'completed'` is smaller and faster since we only query completed orders
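
To confirm the planner actually uses such an index, PostgreSQL's `EXPLAIN` can execute the query and report real timings; shown here on a trimmed variant of the query above, look for an Index Scan on `idx_orders_user_status_date` in the output:

```sql
-- ANALYZE runs the query for real; BUFFERS adds I/O counts
EXPLAIN (ANALYZE, BUFFERS)
SELECT u.id, u.name, SUM(o.total) AS total_spent
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE o.status = 'completed'
  AND o.created_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY u.id, u.name
ORDER BY total_spent DESC
LIMIT 10;
```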

When you need to write complex queries with joins, aggregations, or specific performance requirements

  • Include table sizes; a query that's fine for 10K rows can be disastrous for 10M rows
  • Show expected output format to clarify exactly what columns and shape you need

2. Query Optimization

Optimize this slow query:

**Database:** [DATABASE SYSTEM AND VERSION]

**Current query:**
```sql
[PASTE SLOW QUERY]
```

**Current performance:**
- Execution time: [HOW LONG IT TAKES]
- EXPLAIN output: [PASTE IF AVAILABLE]

**Schema:**
```sql
[RELEVANT TABLE DEFINITIONS WITH INDEXES]
```

**Data volumes:**
- [TABLE 1]: [ROW COUNT]
- [TABLE 2]: [ROW COUNT]

**Acceptable tradeoffs:**
- [e.g. Can denormalize if needed]
- [e.g. Slightly stale data is OK]

Provide:
1. Analysis of why it's slow
2. Optimized query
3. Recommended indexes
4. Any schema changes if beneficial
**Example input:**

Database: MySQL 8
Query: SELECT with 3 JOINs taking 45 seconds
Execution time: 45s (needs to be under 1s)
Table sizes: products (500K), categories (1K), reviews (10M), product_tags (2M)
Tradeoffs: Can cache results, denormalization OK for read performance
## Analysis: Why It's Slow

1. **Full table scan on reviews** - No index on `product_id`, scanning 10M rows
2. **Subquery for avg rating** - Executes once per product (N+1 pattern)
3. **OR condition breaking index** - `WHERE category_id = 5 OR featured = 1` can't use indexes efficiently

## Optimized Query

```sql
-- Step 1: Pre-aggregate reviews (can be materialized view)
WITH review_stats AS (
 SELECT 
 product_id,
 AVG(rating) AS avg_rating,
 COUNT(*) AS review_count
 FROM reviews
 WHERE created_at >= DATE_SUB(NOW(), INTERVAL 1 YEAR)
 GROUP BY product_id
)
SELECT 
 p.id,
 p.name,
 p.price,
 c.name AS category_name,
 COALESCE(rs.avg_rating, 0) AS avg_rating,
 COALESCE(rs.review_count, 0) AS review_count
FROM products p
INNER JOIN categories c ON c.id = p.category_id
LEFT JOIN review_stats rs ON rs.product_id = p.id
WHERE p.category_id = 5

UNION -- UNION (not UNION ALL) deduplicates products that are both featured and in category 5

SELECT... -- Same query as above, but with WHERE p.featured = 1
```

## Recommended Indexes

```sql
-- Critical: Index for reviews aggregation
CREATE INDEX idx_reviews_product_date 
 ON reviews(product_id, created_at, rating);

-- For product lookups
CREATE INDEX idx_products_category ON products(category_id);
-- Note: MySQL does not support partial indexes, so index the whole column
CREATE INDEX idx_products_featured ON products(featured);
```

## Consider: Materialized View

Since reviews don't change constantly:

```sql
CREATE TABLE review_stats_cache AS
SELECT product_id, AVG(rating) as avg_rating, COUNT(*) as review_count
FROM reviews GROUP BY product_id;

-- Refresh hourly via cron/scheduler
```
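
One way to schedule that refresh inside MySQL itself is the event scheduler. This is a sketch: it assumes `product_id` is the cache table's primary key and the `event_scheduler` variable is ON, and note that `REPLACE` won't remove rows for products whose reviews were all deleted:

```sql
-- Rebuild the cache hourly; REPLACE updates existing rows by primary key
CREATE EVENT refresh_review_stats
ON SCHEDULE EVERY 1 HOUR
DO
  REPLACE INTO review_stats_cache
  SELECT product_id, AVG(rating) AS avg_rating, COUNT(*) AS review_count
  FROM reviews
  GROUP BY product_id;
```

If readers must never see a half-built cache, an atomic `RENAME TABLE` swap of a freshly built copy is the safer pattern.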

**Expected improvement:** 45s → <500ms with indexes, <100ms with materialized view

When queries are too slow and you need to understand why and how to fix them

  • Always include EXPLAIN output if possible; it shows exactly where time is spent
  • Mention acceptable tradeoffs; often the fastest solution involves caching or denormalization
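
For the first tip, MySQL 8.0.18+ can produce actual per-step timings rather than just estimates; the query below is a placeholder for your own slow query:

```sql
-- Prints the plan tree with actual row counts and per-step timings
EXPLAIN ANALYZE
SELECT p.id, p.name, COUNT(r.id)
FROM products p
JOIN reviews r ON r.product_id = p.id
WHERE p.category_id = 5
GROUP BY p.id, p.name;
```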

3. MongoDB Query Builder

Write a MongoDB query/aggregation:

**MongoDB version:** [VERSION]
**Collection:** [NAME]

**Document structure:**
```javascript
[SHOW SAMPLE DOCUMENT]
```

**What I need:**
[DESCRIBE THE DATA OR AGGREGATION]

**Conditions:**
- [FILTER 1]
- [FILTER 2]

**Output format:**
```javascript
[DESIRED OUTPUT SHAPE]
```

**Performance considerations:**
- Collection size: [DOCUMENT COUNT]
- Indexes: [EXISTING INDEXES]
**Example input:**

MongoDB: 6.0
Collection: events
Document: { userId, type, properties: { page, duration... }, timestamp }
Need: User engagement stats - avg session duration and page views per user for active users this week
Conditions: type = 'pageview' or 'session', timestamp this week
Output: { userId, avgDuration, pageViews }
```javascript
// User engagement stats for the past week
db.events.aggregate([
 // Stage 1: Filter to relevant events this week
 {
 $match: {
 type: { $in: ['pageview', 'session'] },
 timestamp: {
 $gte: new Date(Date.now() - 7 * 24 * 60 * 60 * 1000)
 }
 }
 },
 
 // Stage 2: Group by user, separate pageviews from sessions
 {
 $group: {
 _id: '$userId',
 pageViews: {
 $sum: { $cond: [{ $eq: ['$type', 'pageview'] }, 1, 0] }
 },
 totalDuration: {
 $sum: {
 $cond: [
 { $eq: ['$type', 'session'] },
 '$properties.duration',
 0
 ]
 }
 },
 sessionCount: {
 $sum: { $cond: [{ $eq: ['$type', 'session'] }, 1, 0] }
 }
 }
 },
 
 // Stage 3: Calculate average and reshape output
 {
 $project: {
 _id: 0,
 userId: '$_id',
 pageViews: 1,
 avgDuration: {
 $cond: [
 { $gt: ['$sessionCount', 0] },
 { $divide: ['$totalDuration', '$sessionCount'] },
 0
 ]
 }
 }
 },
 
 // Stage 4: Sort by engagement (optional)
 { $sort: { pageViews: -1 } }
]);
```

**Recommended index:**
```javascript
// Compound index covering the $match stage
db.events.createIndex(
 { type: 1, timestamp: -1 },
 { name: 'idx_type_timestamp' }
);

// If querying specific users often:
db.events.createIndex(
 { userId: 1, type: 1, timestamp: -1 }
);
```

**Performance notes:**
- `$match` at the start uses index and reduces documents early
- Single `$group` stage avoids multiple passes over data
- Conditional `$sum` is more efficient than separate `$match` stages per type
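
To verify the `$match` stage really uses the compound index, mongosh can explain the pipeline; an IXSCAN on `idx_type_timestamp` in the winning plan means the index is used instead of a collection scan:

```javascript
// executionStats reports totalDocsExamined vs. nReturned —
// a large gap suggests the index isn't selective or isn't used
db.events.explain('executionStats').aggregate([
  {
    $match: {
      type: { $in: ['pageview', 'session'] },
      timestamp: { $gte: new Date(Date.now() - 7 * 24 * 60 * 60 * 1000) }
    }
  }
]);
```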

When building MongoDB queries, especially aggregation pipelines for analytics or reporting

  • Put $match stages first to reduce documents flowing through the pipeline
  • Show a sample document; MongoDB's schema-less nature means assumptions are dangerous

4. Database Schema Design

Design a database schema for this feature:

**Database:** [SQL / NOSQL - SPECIFIC SYSTEM]
**Feature:** [WHAT YOU'RE BUILDING]

**Data requirements:**
- [ENTITY 1 AND ITS ATTRIBUTES]
- [ENTITY 2 AND ITS ATTRIBUTES]

**Relationships:**
- [RELATIONSHIP 1 - e.g. User has many Orders]
- [RELATIONSHIP 2]

**Query patterns (how data will be accessed):**
- [QUERY 1 - e.g. Get user's recent orders]
- [QUERY 2 - e.g. Search products by category]
- [QUERY 3 - e.g. Dashboard aggregations]

**Scale expectations:**
- [EXPECTED DATA VOLUMES]
- [READ/WRITE RATIO]

**Constraints:**
- [CONSTRAINT 1 - e.g. email must be unique]
- [CONSTRAINT 2 - e.g. need audit history]
**Example input:**

Database: PostgreSQL
Feature: Multi-tenant SaaS subscription billing
Entities: Organization, User, Subscription, Invoice, Payment
Relationships: Org has many Users, Org has one Subscription, Subscription has many Invoices
Query patterns: Get org's current subscription, list invoices by date range, monthly revenue report
Scale: 10K orgs, 500K users, 1M invoices, read-heavy (10:1)
Constraints: Orgs isolated (multi-tenant), need full audit trail
## Schema Design

```sql
-- Organizations (tenants)
CREATE TABLE organizations (
 id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
 name VARCHAR(255) NOT NULL,
 slug VARCHAR(100) UNIQUE NOT NULL,
 created_at TIMESTAMPTZ DEFAULT NOW(),
 updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- Users (belong to one org)
CREATE TABLE users (
 id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
 organization_id UUID NOT NULL REFERENCES organizations(id),
 email VARCHAR(255) NOT NULL,
 name VARCHAR(255),
 role VARCHAR(50) DEFAULT 'member',
 created_at TIMESTAMPTZ DEFAULT NOW(),
 UNIQUE(organization_id, email) -- Email unique per org
);

-- Subscriptions (one per org, keeps history)
CREATE TABLE subscriptions (
 id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
 organization_id UUID NOT NULL REFERENCES organizations(id),
 plan_id VARCHAR(50) NOT NULL, -- 'starter', 'pro', 'enterprise'
 status VARCHAR(20) NOT NULL, -- 'active', 'canceled', 'past_due'
 current_period_start TIMESTAMPTZ NOT NULL,
 current_period_end TIMESTAMPTZ NOT NULL,
 canceled_at TIMESTAMPTZ,
 created_at TIMESTAMPTZ DEFAULT NOW(),
 
 -- Only one active subscription per org
 -- (requires the btree_gist extension for UUID equality in GiST)
 CONSTRAINT one_active_per_org EXCLUDE USING gist (
 organization_id WITH =,
 tstzrange(current_period_start, current_period_end) WITH &&
 ) WHERE (status = 'active')
);

-- Invoices
CREATE TABLE invoices (
 id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
 organization_id UUID NOT NULL REFERENCES organizations(id),
 subscription_id UUID REFERENCES subscriptions(id),
 amount_cents INTEGER NOT NULL,
 currency CHAR(3) DEFAULT 'USD',
 status VARCHAR(20) NOT NULL, -- 'draft', 'open', 'paid', 'void'
 due_date DATE NOT NULL,
 paid_at TIMESTAMPTZ,
 created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Payments
CREATE TABLE payments (
 id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
 invoice_id UUID NOT NULL REFERENCES invoices(id),
 amount_cents INTEGER NOT NULL,
 payment_method VARCHAR(50), -- 'card', 'bank_transfer'
 external_id VARCHAR(255), -- Stripe charge ID, etc.
 status VARCHAR(20) NOT NULL,
 created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Audit log (append-only)
CREATE TABLE audit_log (
 id BIGSERIAL PRIMARY KEY,
 organization_id UUID NOT NULL,
 user_id UUID,
 action VARCHAR(100) NOT NULL,
 entity_type VARCHAR(50) NOT NULL,
 entity_id UUID NOT NULL,
 changes JSONB,
 created_at TIMESTAMPTZ DEFAULT NOW()
);
```

## Indexes for Query Patterns

```sql
-- Get org's current subscription
CREATE INDEX idx_subscriptions_org_status 
 ON subscriptions(organization_id, status)
 WHERE status = 'active';

-- List invoices by date range
CREATE INDEX idx_invoices_org_date 
 ON invoices(organization_id, created_at DESC);

-- Monthly revenue report
CREATE INDEX idx_invoices_status_date 
 ON invoices(status, created_at)
 WHERE status = 'paid';

-- Audit log queries (always by org)
CREATE INDEX idx_audit_org_date 
 ON audit_log(organization_id, created_at DESC);
```
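
As a sanity check that the last index matches the reporting pattern, the monthly revenue report could be written as:

```sql
-- Monthly revenue from paid invoices; scans only idx_invoices_status_date
SELECT
  date_trunc('month', created_at) AS month,
  SUM(amount_cents) / 100.0 AS revenue
FROM invoices
WHERE status = 'paid'
GROUP BY 1
ORDER BY 1;
```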

## Multi-Tenancy Strategy

Using row-level security for tenant isolation:

```sql
ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON invoices
 USING (organization_id = current_setting('app.current_org')::uuid);
```
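
With the policy in place, the application sets the tenant before querying; `SET LOCAL` scopes it to the transaction, so a pooled connection can't leak another tenant's context (the UUID below is a placeholder):

```sql
BEGIN;
SET LOCAL app.current_org = '00000000-0000-0000-0000-000000000001';
SELECT id, amount_cents, status FROM invoices;  -- RLS filters to this org
COMMIT;
```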

## Notes

- **UUIDs** for primary keys: Better for distributed systems, harder to guess
- **Subscription history**: Keep all subscriptions, query `WHERE status = 'active'` for current
- **Amount in cents**: Avoids floating-point issues with currency
- **Audit log is append-only**: Never update or delete, just insert
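
If the audit trail is written by the database itself, a per-table trigger is one option; this sketch covers invoices only, and the function and trigger names are illustrative:

```sql
-- Record every insert/update to invoices in the append-only audit_log
CREATE OR REPLACE FUNCTION log_invoice_change() RETURNS trigger AS $$
BEGIN
  INSERT INTO audit_log (organization_id, action, entity_type, entity_id, changes)
  VALUES (NEW.organization_id, TG_OP, 'invoice', NEW.id, to_jsonb(NEW));
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER invoices_audit
AFTER INSERT OR UPDATE ON invoices
FOR EACH ROW EXECUTE FUNCTION log_invoice_change();
```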

When designing new features, planning database architecture, or reviewing schema decisions

  • List your query patterns upfront; they drive index design and sometimes schema structure
  • Include scale expectations; a schema for 10K rows differs from one for 10B rows

Common Mistakes to Avoid

Not providing the schema: the AI guesses column names and types, often incorrectly

Forgetting to mention the database system: SQL syntax varies significantly between PostgreSQL, MySQL, and SQLite

Asking for 'optimized' queries without providing table sizes or existing indexes
