PostgreSQL on Last DBA

UUID v4 and v7: Collision Incidents and Performance Benchmarks

Fri, 29 May 2026 00:00:00 +0000

Source material: HN UUID v4 Collision Thread, dev.to UUID Benchmark

AI-generated ratio: 99%

TL;DR
#

A UUID v4 collision really happened: someone on Hacker News hit one in production. The root cause was a bug somewhere in the software stack, not the math. v4 and v7 have no fundamental difference in collision safety. The real difference is index performance: v7 is time-ordered, so the B-tree stays more compact. In the benchmark below, writes were 35% faster and the index was 22% smaller. Your UUID v4 is probably fine, but if you care about index performance, switching to v7 is a cheap win.

The UUID v4 Collision Incident
#

A HackerNews thread blew up — Ask HN: We just had an actual UUID v4 collision…, 479 upvotes, 347 comments.

The OP’s own words:

I know what you’re thinking… and I still can’t believe it, but… This morning, our database flagged a duplicate UUID (v4).

It wasn’t a double-insert bug. The code didn’t write it twice. Only ~15,000 rows in the table, using npm’s uuid package uuidv4(), and two rows created at different times collided on the same UUID:

b6133fd6-70fe-4fe3-bed6-8ca8fc9386cd

What’s the probability of a UUID v4 collision? 122 random bits, 2^122 ≈ 5.3×10^36 possibilities. With 15,000 records, collision probability is roughly 2×10^-29. Theoretically “impossible.”
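
The 2×10^-29 figure can be reproduced with the standard birthday-bound approximation p ≈ n(n-1) / (2·2^bits). This is a minimal sketch; the function name is made up for illustration, and the only inputs are the two numbers quoted above.

```python
from fractions import Fraction

RANDOM_BITS = 122   # UUID v4: 128 bits minus 6 fixed version/variant bits
ROWS = 15_000       # table size reported in the HN thread

def collision_probability(n: int, bits: int) -> float:
    """Birthday-bound approximation: p ~= n(n-1) / (2 * 2^bits)."""
    # Fraction keeps the huge denominator exact until the final conversion.
    return float(Fraction(n * (n - 1), 2 * 2**bits))

p = collision_probability(ROWS, RANDOM_BITS)
print(f"{p:.2e}")
```

The approximation only holds if all 122 bits are truly random, which is exactly the assumption the causes below break.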

But it happened.

Cause 1: Unreliable entropy sources
#

HN’s top-voted comment (jandrewrogers):

UUIDv4 security depends on high-quality entropy sources. Hardware defects, software bugs, and misunderstandings of “high-quality entropy” all break this assumption. Detecting entropy source failures is expensive, so nobody checks — until a collision happens.

The same comment notes that some high-reliability systems explicitly ban UUID v4, because the quality of the entropy source cannot be verified.

Cause 2: Known npm uuid package bugs
#

The npm uuid package README itself warns:

This module may generate duplicate UUIDs when run in clients with deterministic random number generators, such as Googlebot crawlers.

More seriously, its internal rng() function has global mutable state. One commenter pointed out: calling rng() and sending the result effectively overwrites someone else’s random number, and you can predict it.

Related commit: 91805f665c

Community advice: use Node.js built-in crypto.randomUUID(), not the npm uuid package.

Cause 3: Linux kernel /dev/random race condition
#

Another comment:

I encountered duplicate UUIDs during soak testing of a distributed system. After extensive debugging, I found it was a Linux kernel race condition bug — on multi-processor systems, two processes simultaneously reading /dev/random could, with extremely low probability (~one in a million), get the same bytes.

Cause 4: Go UUID library not checking return values
#

Early Go UUID libraries called random functions without checking the return value length. “Request N bytes, got 3 bytes back” never happened on most hardware, so nobody checked — until production, where it generated thousands of duplicate UUIDs.
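
The original bug was in Go, but the pattern is language-independent: a read that can return fewer bytes than requested must be checked. Here is a hedged Python sketch of the defensive version. `read_exact` is a hypothetical helper, and the 3-byte source just mimics the "asked for N, got 3" failure described above.

```python
import io

def read_exact(source, n: int) -> bytes:
    """Read exactly n bytes from a file-like source, or raise."""
    buf = bytearray()
    while len(buf) < n:
        chunk = source.read(n - len(buf))
        if not chunk:  # EOF, or a source that returned nothing
            raise RuntimeError(f"short read: wanted {n} bytes, got {len(buf)}")
        buf.extend(chunk)
    return bytes(buf)

# A deliberately broken source that only ever yields 3 bytes.
short_source = io.BytesIO(b"\x01\x02\x03")
try:
    read_exact(short_source, 16)
    outcome = "ok"
except RuntimeError as exc:
    outcome = str(exc)
```

Without the length check, the remaining 13 bytes would silently stay at whatever the buffer held before, which is how identical UUIDs get produced.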

Cause 5: Historical AMD CPU RNG defects
#

Certain AMD CPUs had built-in random number generator issues. VM environments can also “virtualize away” entropy — both time sources and entropy sources can degrade inside VMs.

v4 and v7 have no fundamental difference in collision safety. The difference is in the first 48 bits — v4 is random, v7 is a timestamp. You’re unlikely to encounter timestamp source issues, and random source issues are equally rare. The HN thread is an interesting edge case. Knowing that a tiny number of people hit it is enough — you don’t need to distrust the UUID v4 in your own systems.

When choosing v4 vs v7, what you should really look at isn’t collisions — it’s index performance.

UUID v7 Performance Comparison in PG 16
#

UUID v7 has one concrete advantage over v4 in PostgreSQL: temporal clustering, which makes it friendlier to B-tree indexes. Both v4 and v7 indexes can bloat. The difference is simply that v7’s first 48 bits are time-ordered, so inserts concentrate on the right side of the B-tree, reducing page splits.
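
To make the "first 48 bits" concrete, here is a small hand-rolled sketch of the v7 bit layout from RFC 9562: a 48-bit Unix millisecond timestamp, 4 version bits, 12 random bits, 2 variant bits, and 62 more random bits. `uuid7_sketch` is an illustrative name, not a library function, and this sketch skips the monotonic counter that real generators may add within one millisecond.

```python
import os
import time
import uuid
from typing import Optional

def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a UUID v7: 48-bit ms timestamp, version 7, variant 10, 74 random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits, 74 are used
    rand_a = rand >> 68                            # top 12 bits
    rand_b = rand & ((1 << 62) - 1)                # low 62 bits
    value = (unix_ms & ((1 << 48) - 1)) << 80      # bits 127..80: timestamp
    value |= 0x7 << 76                             # bits 79..76: version
    value |= rand_a << 64                          # bits 75..64: rand_a
    value |= 0b10 << 62                            # bits 63..62: variant
    value |= rand_b                                # bits 61..0: rand_b
    return uuid.UUID(int=value)

earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)
```

Because the timestamp occupies the most significant bits, `earlier < later` holds regardless of the random part, which is what keeps new keys at the right edge of the index.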

Umang Sinha’s benchmark ran a rigorous comparison on a PG 16 Docker container (8 cores, 16GB, NVMe).

Test Conditions
#

CREATE TABLE uuid_v4_test (id UUID PRIMARY KEY, payload TEXT);
CREATE TABLE uuid_v7_test (id UUID PRIMARY KEY, payload TEXT);

Parameter	Value
Data volume	10 million rows per table
Batch size	10,000 rows per batch
Client	Go + pq driver
UUID generation	Pre-generated in memory, not timed

Performance Results
#

Metric	UUID v4	UUID v7	Improvement
Write 10M rows	5 min 35 sec	3 min 38 sec	35% faster
Table + index total size	3618 MB	3443 MB	5% smaller
B-tree index size	776 MB	602 MB	22% smaller
Point lookup	0.167 ms	0.038 ms	4.4x faster
Range scan	8.283 ms	3.791 ms	2.2x faster
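
The Improvement column can be rechecked from the raw numbers in the table. A quick sketch (variable names are mine); note that "35% faster" here means 35% less elapsed time, not 35% more throughput.

```python
v4_write_s = 5 * 60 + 35   # 335 seconds
v7_write_s = 3 * 60 + 38   # 218 seconds

def reduction_pct(before: float, after: float) -> int:
    """Percentage by which `after` is smaller than `before`, rounded."""
    return round((before - after) / before * 100)

write_time_saved = reduction_pct(v4_write_s, v7_write_s)  # write time
total_saved = reduction_pct(3618, 3443)                   # table + index, MB
index_saved = reduction_pct(776, 602)                     # B-tree index, MB
point_speedup = round(0.167 / 0.038, 1)                   # point lookup, ms
range_speedup = round(8.283 / 3.791, 1)                   # range scan, ms

print(write_time_saved, total_saved, index_saved, point_speedup, range_speedup)
```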

Why Such a Big Difference
#

UUID v4 is fully random. Newly inserted UUIDs scatter randomly across the B-tree index, causing massive page splits and severe index fragmentation. UUID v7 has a millisecond-precision timestamp in the first 48 bits, so newly generated UUIDs are naturally ordered — writes cluster on the right side of the B-tree, page splits drop dramatically, and the index is much more compact.
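
The scatter-versus-append behavior is easy to see in a toy model. The sketch below is not a B-tree and does not measure page splits; it only counts how often a new key lands at the far right of a sorted list, using random 122-bit integers as a stand-in for v4 and timestamp-prefixed integers (one key per millisecond, an assumption) as a stand-in for v7.

```python
import bisect
import random

def rightmost_insert_fraction(keys):
    """Fraction of inserts that land at the far right of a sorted index."""
    index, at_end = [], 0
    for key in keys:
        pos = bisect.bisect_right(index, key)
        if pos == len(index):
            at_end += 1
        index.insert(pos, key)
    return at_end / len(keys)

rng = random.Random(42)
N = 5_000
random_keys = [rng.getrandbits(122) for _ in range(N)]                 # v4-like
ordered_keys = [(ms << 74) | rng.getrandbits(74) for ms in range(N)]  # v7-like

v4_like = rightmost_insert_fraction(random_keys)
v7_like = rightmost_insert_fraction(ordered_keys)
print(v4_like, v7_like)
```

In this model every v7-like key appends at the end, while only a tiny fraction of v4-like keys do; the rest land somewhere in the middle, which in a real B-tree means touching and splitting arbitrary pages.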

The 22% smaller index isn’t magic — it’s reduced fragmentation. Point lookups being 4x faster isn’t surprising either — fewer B-tree levels, higher cache hit rates.

Summary
#

UUID v4 and v7 are identical in collision safety — both depend on entropy source quality, one fills the first 48 bits with random numbers, the other with a timestamp. Collisions are edge cases that a tiny number of people hit in specific environments. Your environment is probably fine — that basic judgment doesn’t change.

What you really should think about is index performance. v7’s temporal property makes B-trees more compact, with measured results of 35% faster writes, 22% smaller indexes, and 2-4x faster queries. If your system writes UUIDs at high volume, switching to v7 saves meaningful storage and CPU.

PostgreSQL 18 ships a native uuidv7() function. On earlier versions, generate v7 UUIDs at the application layer. Whichever version you use, always add a UNIQUE constraint.

This article was originally published in Chinese on lastdba.com.

When PostgreSQL Becomes AI's Hands — Bruce Momjian's MCP Server in Practice

Wed, 27 May 2026 00:00:00 +0000

Original: Building an MCP Server Using Postgres, Bruce Momjian, PGDay Armenia 2026, CC BY 4.0.

AI-generated ratio: 80%

Bruce Momjian (PG core team, the one who has written release notes for 20+ years) recently gave a talk at PGDay Armenia 2026: Building an MCP Server Using Postgres. 70 slides, extremely dense. Theory and practice — a solid reference.

Reading the slides directly is hard work. Even an AI-generated walkthrough probably won’t make sense at first glance. I had to read for a while and ask several questions before it clicked.

These 70 slides can be cleanly split into two layers — the first half is theory, the second half is a hands-on demo. The two layers don’t have much to do with each other.

Theory Layer: Explaining the RAG → MCP Evolution Through Transformers (Slides 1-33)
#

The theory layer takes up nearly half the content, from LLM fundamentals to how MCP works. The outline is clear:

RAG vs MCP: In One Sentence
#

Everyone knows the RAG workflow: the programmer decides what data to query → retrieval results are appended to the system prompt → the LLM reads and generates a response. Pre-orchestrated — what the LLM can see is decided before the user even asks.

MCP is different. Tool descriptions are registered with the LLM, and the LLM decides for itself during generation whether to call a tool and which one. Dynamic decision-making — the programmer only exposes tools, the LLM handles orchestration.

Bruce sums it up in one sentence:

RAG can only do what the programmer pre-planned. MCP can dynamically adjust based on output quality, can iteratively call multiple tools, and can trigger external tasks.
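
The contrast can be sketched in a few lines. Everything below is a toy under loud assumptions: `stub_model` is a hard-coded stand-in for an LLM, `geiger` returns a fixed value, and none of this is the MCP protocol itself. It only shows the control flow: in RAG the context is fixed before the model runs, while in a tool loop the model's own output decides what happens next.

```python
def geiger() -> int:
    """Stand-in tool; the real one reads hardware."""
    return 14

TOOLS = {"geiger": geiger}

def rag_answer(question: str, retrieved_context: str) -> str:
    # RAG: the programmer chose the context before the model ever ran.
    return f"Answer to {question!r} using only: {retrieved_context}"

def stub_model(question: str, tool_results: list) -> dict:
    """Hypothetical stand-in for an LLM: asks for readings until it has 3."""
    if len(tool_results) < 3:
        return {"tool": "geiger"}
    return {"answer": sum(tool_results) / len(tool_results)}

def tool_loop(question: str, max_steps: int = 10):
    results = []
    for _ in range(max_steps):
        decision = stub_model(question, results)
        if "answer" in decision:
            return decision["answer"], len(results)
        # The model asked for a tool: run it and feed the result back.
        results.append(TOOLS[decision["tool"]]())
    raise RuntimeError("model never produced an answer")

average, calls = tool_loop("What is the average radiation level?")
```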

“Word or MCP” — That Set of Vector Embedding Diagrams
#

Slides 18-33 are the core of the theory layer. Bruce draws a detailed internal Transformer flow diagram:

His logic: take each MCP tool’s description text (e.g., “Return the radiation level (CPM) at 13 Roberts Road…”), embed it into a vector using a text embedding model, and inject it into the attention layer’s vector space. Then at each inference step, the output vector matches against the nearest vector —

“The closest vector might be a word or an MCP.”

Is This Model Correct?
#

This is what puzzled me the most. Here are my thoughts.

Slides 18-33 are beautifully drawn, but if you try to understand them as engineering implementation, there are problems:

① MCP tools don’t need “embedding.” In actual engineering, tool definitions are written directly into the system prompt as text. The LLM reads “You have these tools: geiger(), get_pretzel_inventory()…” and uses semantic understanding to decide when to call them. There’s no need to compute tool descriptions as vectors, no need to do cosine distance comparisons against word vectors. The essence of Bruce’s teaching model is explaining “LLM decision-making” as “nearest vector matching” — this is closer to the retrieval paradigm than the generation paradigm.

② Attention doesn’t produce a “find nearest” operation. output = softmax(Q·Kᵀ / √d_k) · V yields a context vector that is a weighted mix of the value vectors. There’s no step of “binary choice between the word embedding table and the tool embedding table.” The actual mechanism for LLM tool selection is: attention produces hidden states → LM head → softmax over vocabulary → output tool call JSON. There’s never a “word vs tool” choice, only a softmax over the entire vocabulary.

③ System prompt and user prompt have no boundary in attention. A token sequence is just a token sequence — attention blocks do Q·K dot products on all tokens equally. There is no “system zone” or “user zone.”
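
Point ② is easier to see in code. The following is a minimal pure-Python sketch of single-query scaled dot-product attention followed by an LM head. The vectors, the 4-token vocabulary, and the weight matrix are invented toy values; a real model has many heads, layers, and learned weights.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query: softmax(q.K^T / sqrt(d)) . V"""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # The output is a blend of ALL value vectors, not a pick of the nearest one.
    mixed = [sum(w * v[i] for w, v in zip(weights, values))
             for i in range(len(values[0]))]
    return weights, mixed

def lm_head(hidden, vocab_matrix):
    """Project a hidden state onto every vocabulary entry, then softmax."""
    logits = [sum(h * w for h, w in zip(hidden, row)) for row in vocab_matrix]
    return softmax(logits)

keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, hidden = attention([1.0, 0.0], keys, values)

# Hypothetical vocabulary: a tool call is spelled out with ordinary tokens.
vocab = ["the", "{", "geiger", "}"]
probs = lm_head(hidden, [[0.2, 0.1], [0.9, 0.3], [0.4, 0.8], [0.1, 0.1]])
```

Every token position gets a nonzero weight, and the final distribution covers the whole vocabulary. There is no separate table of tool vectors anywhere in the path.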

So these 33 theory slides can be seen as a simplified teaching model Bruce built for DBAs without an AI background — visually appealing and easy to understand, but don’t use it as an architecture diagram. MCP’s truly revolutionary aspect is protocol standardization (unified tool registration/discovery/calling spec), not any vectorization trick.

Practice Layer: Two Working Demos (Slides 34-69)
#

Starting from Slide 34, the style abruptly shifts — all code, terminal output, hardware photos. That entire Transformer vector model from the theory layer completely disappears, replaced by curl, psql, and Perl scripts.

The only thread connecting the two layers is that “they’re both talking about MCP.” But the vector matching mechanism painted in the theory layer and the actual implementation in the practice layer are nearly two different logic systems. This may be exactly the tension Bruce intended — the theory layer helps you understand why MCP is stronger than RAG, and the practice layer tells you how to actually implement it today.

Demo 1: Letting ChatGPT Read a Real-World Geiger Counter
#

Bruce set up a GQ GMC-800 Geiger counter (radiation detector) in his backyard, connected via USB to a Raspberry Pi, taking environmental radiation readings every 15 minutes. First, see ChatGPT using MCP to call real data:

MCP can call external tools to get real-time data — something RAG cannot do.

Connected to hardware:

Wrote a Python wrapper using fastmcp:

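import subprocess

from fastmcp import FastMCP

mcp = FastMCP("Geiger counter MCP server")

@mcp.tool
def geiger() -> int:
    """Return the radiation level (CPM) at 13 Roberts Road, Newtown Square, PA, USA"""
    # check_output returns text; convert it so the value matches the int return type
    # (assumes the helper prints a bare integer).
    output = subprocess.check_output(
        "/var/lib/postgresql/tmp/geiger", shell=True, text=True
    )
    return int(output.strip())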

The underlying layer is a Perl script that sends <GETCPM>> over serial, reads back a 4-byte CPM value. Apache reverse-proxies port 443 (OpenAI only talks to 443). After registering with ChatGPT:
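
Decoding that 4-byte reply is a one-liner. The original helper is Perl; this is a Python sketch of the same step, and the byte order (most significant byte first) is my assumption about the device protocol, not something stated in the slides.

```python
def parse_cpm(reply: bytes) -> int:
    """Decode a 4-byte CPM reply, assuming most-significant byte first."""
    if len(reply) != 4:
        # A short or long reply means the serial read went wrong; do not guess.
        raise ValueError(f"expected 4 bytes, got {len(reply)}")
    return int.from_bytes(reply, byteorder="big")

cpm = parse_cpm(b"\x00\x00\x00\x0e")
print(cpm)
```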

User: What's the radiation level at 13 Roberts Road?
GPT: I don't have public data for that location...

User: Use my custom app
GPT: [calls geiger tool] → 14 CPM. Normal background radiation (5-25 CPM).

User: Take five readings and give me the average
GPT: [calls ×5] 15 16 13 15 15 → average 14.8 CPM

Two key behaviors:

The LLM can iteratively call tools and compute — RAG is a one-shot data dump, MCP is “call → get result → decide → call again → compute”
The user must explicitly authorize — the first time, ChatGPT didn’t say “I have your Geiger counter data.” Only when the user said “use my custom app” did the tool call trigger. The security model is conservative

Demo 2: Using PG as a Pretzel Shop Inventory System
#

From hardware back to software. Building a pretzel inventory database:

CREATE TABLE pretzel (
 quantity INTEGER CHECK (quantity >= 0)
);
INSERT INTO pretzel VALUES (0); -- initial inventory 0

MCP tools use psql to operate on PG directly:

import subprocess  # same server module as the Geiger example; mcp is the FastMCP instance

@mcp.tool
def get_pretzel_inventory() -> int:
    """Return the number of unsold pretzels"""
    # psql prints the count as text; convert it to match the int return type.
    output = subprocess.check_output(
        "psql --tuples-only -c 'SELECT quantity FROM pretzel;' -d mcp",
        shell=True, text=True
    )
    return int(output.strip())

@mcp.tool
def sold_one_pretzel() -> str:
    """Call this when a pretzel is sold; reduces inventory by one"""
    return subprocess.check_output(
        "psql --tuples-only -c 'UPDATE pretzel SET quantity = quantity - 1;' -d mcp",
        shell=True, text=True
    )

@mcp.tool
def baked_6_pretzels() -> str:
    """Call this when a tray of 6 pretzels is baked; increases inventory"""
    return subprocess.check_output(
        "psql --tuples-only -c 'UPDATE pretzel SET quantity = quantity + 6;' -d mcp",
        shell=True, text=True
    )

Interaction flow:

User: How many pretzels available?
GPT: 0 pretzels.

User: I just baked a tray → 6 pretzels
User: I sold two → 4 remaining
User: I sold four → 0 remaining

User: I sold one pretzel → ERROR! CHECK constraint prevented negative quantity

The LLM doesn’t write SQL directly — it calls your predefined, controlled interfaces. PG’s CHECK constraints naturally form a safety net — even if the LLM is tricked into calling the wrong function, the database-level constraint provides a second line of defense.

But this also exposes a problem: the LLM faithfully executed sold_one_pretzel, but didn’t anticipate that “inventory is 0, calling it will error.” MCP is the execution layer, not the reasoning layer.
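
One way to soften this is to have the tool translate the constraint violation into a readable result that the model can relay. The sketch below is not from the slides: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL so that it runs anywhere, and the return strings are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pretzel (quantity INTEGER CHECK (quantity >= 0))")
conn.execute("INSERT INTO pretzel VALUES (0)")  # initial inventory 0
conn.commit()

def sold_one_pretzel() -> str:
    """Reduce inventory by one; report a readable error instead of raising."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("UPDATE pretzel SET quantity = quantity - 1")
    except sqlite3.IntegrityError:
        # The CHECK constraint fired: the database refused a negative quantity.
        return "error: no pretzels left to sell"
    (quantity,) = conn.execute("SELECT quantity FROM pretzel").fetchone()
    return f"ok: {quantity} remaining"

result = sold_one_pretzel()
```

The constraint still does the real protection; the wrapper only makes the failure legible to the caller.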

How Far from Production
#

On the final slide, Bruce frankly admits the current implementation’s limitations:

No authentication — anyone can call your MCP Server
No parameterization — all three tools are parameterless functions; real-world tools need to accept parameters
No security restrictions on dynamic SQL — tool descriptions declare semantics, but the LLM could be injected with malicious content
Connection pooling, transaction management, rate limiting — none addressed
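
As an illustration of the parameterization gap, here is a hedged sketch of what a tool with an argument could look like. `sold_pretzels` is hypothetical, and sqlite3 again stands in for PostgreSQL. The point is that the model-supplied value is validated and then passed as a bound parameter, never spliced into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pretzel (quantity INTEGER CHECK (quantity >= 0))")
conn.execute("INSERT INTO pretzel VALUES (6)")
conn.commit()

def sold_pretzels(count: int) -> int:
    """Hypothetical parameterized tool: sell `count` pretzels, return what is left."""
    if not isinstance(count, int) or count <= 0:
        raise ValueError("count must be a positive integer")
    with conn:
        # The value travels as a bound parameter, never as SQL text.
        conn.execute("UPDATE pretzel SET quantity = quantity - ?", (count,))
    return conn.execute("SELECT quantity FROM pretzel").fetchone()[0]

remaining = sold_pretzels(2)
```

With a PostgreSQL driver the placeholder syntax differs, but the same rule applies: let the driver bind the value.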

Two recommended practical reads:

Between the Two Layers
#

Looking back at these 70 slides, the most interesting part isn’t any single demo — it’s how the theoretical thinking and hands-on work together explain what MCP can do:

The theory layer uses Transformer vector spaces to explain “how the LLM chooses between words and tools” — this is a teaching model
The practice layer uses psql, curl, and Perl scripts to actually implement things — this is engineering

The real MCP mechanism — tool definitions inserted as text into the system prompt, the LLM using semantic understanding to decide which tool to call, outputting tool call JSON — needs none of the vector embedding model from the theory layer. Between the two layers, Bruce didn’t draw the connecting line. This might not be a bug — it might be a feature.
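
For completeness, here is roughly what "tool definitions inserted as text" means. This is a sketch with an invented prompt layout and an invented tool-call shape; real clients and the MCP specification define their own formats. The docstrings are the ones from the demos above.

```python
import inspect
import json

def geiger() -> int:
    """Return the radiation level (CPM) at 13 Roberts Road, Newtown Square, PA, USA"""
    raise NotImplementedError  # body irrelevant here; only the metadata is used

def get_pretzel_inventory() -> int:
    """Return the number of unsold pretzels"""
    raise NotImplementedError

def describe(tool) -> dict:
    return {"name": tool.__name__, "description": inspect.getdoc(tool)}

def render_prompt(tools) -> str:
    """Hypothetical prompt layout: tool metadata rendered as plain JSON text."""
    listing = json.dumps([describe(t) for t in tools], indent=2)
    return "You can call these tools by emitting JSON:\n" + listing

prompt = render_prompt([geiger, get_pretzel_inventory])

# What a model might emit in response: ordinary tokens that happen to parse as JSON.
tool_call = json.loads('{"tool": "geiger", "arguments": {}}')
```

Nothing here involves an embedding or a distance computation: the tool list is text going in, and the tool call is text coming out.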

This article was originally published in Chinese on lastdba.com.
