posts

Vector vs Relational DB: They Don't Replace Each Other. A Decision Guide for the AI Era

Five clients came to me last year with the same conclusion: 'we need a vector DB for AI'. Four of them didn't. The fifth did — but used it wrong. A decision guide for companies with no time for hype who want to know what they're actually buying.

schedule10 min čítania

Vector vs Relational DB: They Don't Replace Each Other. A Decision Guide for the AI Era

Why this question is burning right now

Over the past year, five clients came to me with the same conclusion: "we need a vector DB for AI." Four of them didn't — plain Postgres handled their problem. The fifth did need one, but used it wrong and built a chatbot that answered semantically close, yet factually off.

The question is here because AI-adjacent sales push companies toward a simple narrative: AI = embeddings = vector DB = Pinecone. Reality is less linear. Vector and relational databases don't replace each other. They answer different kinds of questions. In most real-world stacks they coexist — and the third world, full-text search, gets forgotten, even though in many companies it carries the bulk of the searching — nobody just talks about it.

This article is a decision guide. No benchmarks, no leaderboard. Four real situations I keep running into, three traps I see clients fall into repeatedly, one default stack for 90% of SMEs, and three questions to answer before you reach for anything other than PostgreSQL.

The framework: three worlds, not two

Three database types, three query types:

A relational database (PostgreSQL, MariaDB, MS SQL) answers exact questions exactly. "Give me all invoices for client 042 from March." The index is a B-tree, the query is SQL, the result is deterministic. This has been the core of every business system for the past 30 years.

A vector database (pgvector, Qdrant, Weaviate, Pinecone) answers imprecise questions approximately. "Find documents semantically similar to the phrase 'how do I return a product'." Under the hood it stores embeddings — numeric vectors that represent the meaning of text — and finds the nearest ones via cosine similarity. The result isn't deterministic in the SQL sense; it's most probably relevant.

Full-text search (PostgreSQL tsvector, Elasticsearch, OpenSearch) is the third world that gets forgotten. It searches at the level of words and phrases, ranked by BM25. "Find documents containing the words 'complaint' or 'refund' except those from 2022." Cheap, fast, predictable — and often exactly what the user actually needs.

Most real-world solutions need two of these three. Sometimes all three. The question isn't which DB, but what kind of query your users are running.

Four real situations from practice

1. Classic business software (POHODA, e-shop, CRM) — relational is enough

Invoices, orders, stock levels, contacts, payroll data. Every query has an exact form: who, when, how much. No fuzzy logic helps — when an accountant looks for invoice 2024/0042, they don't want a semantically similar one, they want that exact one.

Default: PostgreSQL. If you have legacy MariaDB or MS SQL, don't migrate just for the AI hype. Indexing and fixing slow queries delivers 10× more value than any vector DB. You don't need a vector store just because you want a dashboard or reporting into Slack.

2. Chatbot over company PDFs / internal knowledge base — vector, yes

The client wants: "ask our HR manual". The user phrases things freely: "when am I entitled to sick leave?", "what about paternity leave?", "who approves my vacation?". The vector DB does the heavy lifting here — embeddings of relevant passages are compared against the embedding of the question, and the top-3 chunks are sent to the LLM as context.

Default: pgvector inside your existing Postgres. For 90% of use cases that's enough. Only at 10M+ chunks or an SLA under 50 ms does it start to make sense to reach for a dedicated engine (Qdrant, Weaviate). More on that in section 5.

3. Search in an e-shop or helpdesk — hybrid

This is the most common real-world case that clients misunderstand. A user on an e-shop types: "red shoes under 80 euros size 42". The real query has three layers:

  1. Keyword ("red", "shoes") → BM25 / tsvector.

  2. Semantics (synonyms, context, typos) → vector.

  3. Filter (price ≤ 80, size = 42) → SQL WHERE.

A vector DB alone isn't enough here. Returning 50 semantically similar shoes in every color and price means a frustrated customer. Hybrid = BM25 + vector + filter via SQL. PostgreSQL can deploy all of this in a single engine — pgvector + tsvector + ordinary indexes. Without three separate external services.

4. AI agents with long-term memory — vector + TTL

An agent that remembers conversations from past weeks, or a system that learns from field interactions. Vector DB, yes — memory embeddings enable find-similar on new questions. But without a TTL strategy (time-to-live, automatic deletion after N days) the DB accumulates noise. After three months the agent retrieves irrelevant old episodes and hallucinates.

Rule of thumb: every record gets a TTL. Different windows for different memory types — system preferences 365 days, conversational memory 30 days, transactional context 7 days. This is a more frequent cause of bad UX than a bad model.

Three traps I see repeatedly

Trap 1: Vendor lock-in via Pinecone (or any proprietary SaaS)

Pinecone is a good service. But when you tie a critical part of your stack to a proprietary endpoint with no export in a format another engine can read, you bind yourself not only to their pricing but also to their feature roadmap.

A concrete example from Q1 2026: Pinecone 2.0 raised read costs from $0.04 to $0.12 per 1M RU — a 3× increase with no change in workload. Clients who signed a multi-year contract in 2024 woke up to a triple invoice and no downgrade mechanism — the serverless tier means "you pay for usage," there's no smaller plan to switch down to. This isn't a hypothetical risk, it's this year.

Calculate the total cost over 3 years, including re-embedding if you ever migrate. It often works out cheaper to self-host pgvector on Hetzner, DevOps overhead included. Concrete numbers below in section 5.

Trap 2: The vector DB as the single source of truth

A client — an e-shop with ~30k orders a year — built a chatbot for customer support. The goal: a customer writes a question, gets an answer with a link to the FAQ or to the detail of their own order. The team did what everyone does today: FAQ + returns policy + guides → embeddings → Pinecone → GPT-4. It worked nicely on demo questions.

It collapsed in production within the first hour. Customers started asking: "where is my order 2024/0042?". The chatbot returned semantically similar texts from the FAQ — "how to track a parcel", "order lifecycle". Because order 2024/0042 wasn't in Pinecone. It was in Postgres.

The lesson: a vector DB is about semantics. When a user asks for a fact, you need a tool that calls SQL. Vector search finds the similar, not the exact. Without a relational DB, you're gluing facts out of semantic guesswork.

The fix took two hours. We added tool calling to GPT — if a question contains an order-number pattern or an identifier, the agent fires a parameterized SELECT into Postgres. Vector stayed for context. For facts, the source of facts was used.

Trap 3: An embedding model with no versioning

You embed 5 million chunks with OpenAI's text-embedding-3-small. Six months later a better or cheaper model ships. If you don't have the model version stored alongside each vector — and a re-embedding strategy — you throw away the whole DB. Or worse: you leave a mix of two models in it and the search can't be trusted.

Rule: store every record with embedding_model_version. Run the migration as a batch job with a dual index (old + new), then switch over atomically.

1ALTER TABLE chunks ADD COLUMN embedding_model_version text NOT NULL;
2
3CREATE INDEX ON chunks USING hnsw (embedding vector_cosine_ops)
4 WHERE embedding_model_version = 'text-embedding-3-small';

A partial index over a specific model version — you bring up the new model in parallel, queries stay deterministic (WHERE embedding_model_version = '...' in every query), and you switch over once re-embedding is done.

The default stack for 90% of SMEs

For a small or mid-sized company that wants to deploy AI over its corporate data, I have one default:

PostgreSQL 16+ with pgvector + tsvector on your own server or via managed (Supabase, Neon).

A decision tree by expected scale:

1How many chunks do you expect within 12 months?
2
3 < 100k → SQLite + sqlite-vec
4 (local, dev, edge deployments)
5
6 100k – 10M → PostgreSQL + pgvector ← default for 90% of cases
7 (Hetzner ~17 €/mo or Supabase Pro ~25 USD)
8
9 10M – 100M → pgvector with tuning (HNSW, partitioning)
10 or migration to Qdrant self-hosted
11
12 > 100M → Qdrant / Weaviate / Milvus
13 (by then you have a dedicated DB team that picks for itself)

Concrete numbers — a comparison for 1 million chunks and mild traffic (~5 RPS, requests per second):

  • Stack: PostgreSQL + pgvector (self-hosted) · Hosting: Hetzner CCX13 (2 vCPU AMD, 8 GB RAM, 80 GB NVMe) + storage box for backups · Monthly cost: ~17 €

  • Stack: PostgreSQL + pgvector (managed) · Hosting: Supabase Pro · Monthly cost: ~25 USD

  • Stack: Pinecone Serverless (Standard plan) · Hosting: Pinecone · Monthly cost: 50 USD minimum commitment, typically 70–200 USD depending on traffic

Pinecone Standard has had a 50 USD/mo minimum commitment since 2024. The real invoice for 1M chunks and a moderately active AI agent lands at 70–200 USD/mo depending on the reads/writes ratio and storage (storage 0.33 USD/GB, reads 8.25 USD per 1M RU — read units, writes 2 USD per 1M WU — write units). Under a production workload with peak hours the numbers climb higher — the Q1 2026 read-cost hike (3×) made this worse.

Self-hosting pgvector on Hetzner therefore costs you a fraction of Pinecone's price. At higher loads the gap is even wider. On top of that, your data stays on infrastructure you fully control — 100% of access logs, no third parties in the supply chain, no surprises in the ToS when a vendor rolls out a new pricing model.

When to migrate away from pgvector:

  • You've outgrown 10 million chunks and you need sub-50 ms latency on top-K queries.

  • Search makes up 60%+ of infra cost and you want a dedicated vector engine with a lower per-vector price (Qdrant self-hosted or Weaviate Cloud).

  • You need native multi-tenancy or geo-distribution that pgvector lacks.

In 9 out of 10 cases you'll never outgrow this. Don't base a decision in 2026 on hypothetical scaling from 2028.

Three questions before you decide

Before you buy a Pinecone license or migrate data, answer three questions:

1. Are my users looking for exact records or *similar* ones? If exact (invoice, contact, order status) → relational. If free-form phrasing over text → you need vector + relational together.

2. Is one DB engine enough, or do I need a combination? In most cases you need a combination (vector + SQL + sometimes full-text). PostgreSQL can do all of it in a single engine. Only when you hit a concrete SLA limit should you split.

3. What chunk volume will I have in 12 months — not in 5 years? Under 1M chunks: pgvector is trivial. 1–10M: pgvector with an HNSW index is still fine. Over 10M: consider a dedicated engine, but calculate the total cost including DevOps.


If this resonated and you have a concrete use case to assess — POHODA + AI, a chatbot over documents, search in an e-shop — I can come in for a one-day infra audit. We look at the data, the queries, the plan. No commitment to further cooperation. The deliverable is a document with a recommended stack and a 90-day migration plan.

Book an audit →

Vector vs Relational DB: They Don't Replace Each Other. A Decision Guide for the AI Era | Myslio.AI