The idea that computers are deterministic is one of the most persistent and dangerous myths in engineering. A single CPU executing a single integer instruction is deterministic. Everything built on top of that is not.
Floating Point: The Numbers Lie
Floating-point numbers approximate real numbers using a fixed number of bits. Most real numbers can't be represented exactly; they get rounded to the nearest representable value.
0.1 + 0.2 = 0.30000000000000004 // in IEEE 754 float64
This is not a bug. This is the specification. But it means:
(a + b) + c ≠ a + (b + c): addition is not associative. Rearranging the order of operations changes the result.
Different compilers: may reorder FP operations for performance. -ffast-math in GCC explicitly permits this. Same source code, different binary, different results.
Different hardware: x87 uses 80-bit internal precision, SSE uses 64-bit, and older ARM NEON units flush denormals to zero. Same algorithm, different CPU, different answer.
FMA (fused multiply-add): a*b+c computed with one rounding step instead of two. Whether the compiler uses FMA changes the result in the last bit.
Every neural network, every physics simulation, every financial calculation is built on floating point. None of them are bit-for-bit reproducible across hardware without extreme care.
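A quick sketch in Python (any IEEE 754 double-precision environment behaves the same way) makes the associativity problem concrete:

a, b, c = 0.1, 0.2, 0.3

# Same three numbers, two groupings, two different answers.
print((a + b) + c)            # 0.6000000000000001
print(a + (b + c))            # 0.6

# A large value can absorb a small one entirely, so the order decides whether it survives.
print((1e16 + 1.0) - 1e16)    # 0.0
print((1e16 - 1e16) + 1.0)    # 1.0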
Thread Scheduling: The OS Decides
Your program has four threads. The OS decides which one runs on which core, for how long, and when to interrupt it. This decision depends on:
What else is running on the machine
Thermal throttling state
Power management decisions
Interrupt arrival timing
Memory pressure and page fault handling
Two threads writing to the same variable without a lock: the result depends on which thread's write lands last. This changes every run. Every microsecond of timing difference in the OS scheduler produces a different interleaving of instructions, and a potentially different outcome.
Thread A: read counter (5) → increment → write (6)
Thread B: read counter (5) → increment → write (6)
Expected: 7 Actual: 6 // sometimes
The word "sometimes" is the signature of non-determinism. The bug only appears under specific timing. It passes every test, then fails in production. This is a heisenbug: observing it (adding logging, running a debugger) changes the timing and makes it disappear.
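A minimal sketch of the lost update in Python. The time.sleep(0) is a stand-in for the scheduler pre-empting a thread at the worst possible moment; without it, whether the race manifests depends on interpreter version and timing, which is rather the point:

import threading
import time

counter = 0

def worker(iterations):
    global counter
    for _ in range(iterations):
        value = counter        # read
        time.sleep(0)          # yield: simulates pre-emption mid-update
        counter = value + 1    # write: may overwrite another thread's increment

threads = [threading.Thread(target=worker, args=(1_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("expected 4000, got", counter)   # usually less, and different on every run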
Clocks: Time Is a Lie
Every call to "get current time" returns a different value. Any logic that depends on it is non-deterministic by definition.
Wall Clock
Can jump backwards (NTP sync: the daemon can slew time gradually or step it instantly, depending on how far off you are). Can leap forward (daylight saving, leap seconds). Two machines disagree on "now" by milliseconds to seconds. Timestamps from different machines are not comparable without a synchronisation protocol.
NTP adjustments: ntpd/chrony periodically discipline the clock. A large correction can shift wall time by hundreds of milliseconds in a single adjustment. Any code measuring elapsed time via wall clock (Date.now(), gettimeofday()) will see phantom durations: intervals that appear negative or impossibly long.
Process pre-emption: the OS scheduler can suspend your process at any point to run another. Your code sees this as time vanishing. A function that "takes 2ms" might measure as 50ms because the kernel gave the CPU to someone else in the middle. On a busy system, pre-emption is frequent and unpredictable. Priority inversion, CPU throttling, and other runnable processes all affect when your thread gets time.
Hypervisor pauses: in virtualised environments (EC2, Azure VMs, VMware), the hypervisor can steal entire vCPUs to serve other tenants. Your guest OS doesn't know this happened; it just sees time jump forward when it gets scheduled back. These "steal time" pauses can be tens to hundreds of milliseconds. In extreme cases (live migration, noisy neighbours), seconds. Every timing assumption your code makes is invalidated: health checks fire, locks expire, leader elections trigger, all because the hypervisor took your CPU away and your wall clock jumped.
Monotonic Clock
Never goes backwards. Good for measuring durations. But different on every machine. And the resolution varies: some systems give you nanoseconds, some give you milliseconds. Two measurements of the "same" duration will differ.
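In Python terms the distinction looks roughly like this: time.time() follows the wall clock, time.monotonic() only moves forward, and neither gives you a load-independent measurement:

import time

wall_start = time.time()        # wall clock: moves with NTP, DST, manual changes
mono_start = time.monotonic()   # monotonic: never goes backwards, absolute value is meaningless

time.sleep(0.1)                 # stand-in for real work

wall_elapsed = time.time() - wall_start        # can come out negative or inflated after a clock step
mono_elapsed = time.monotonic() - mono_start   # always >= 0, but still stretched by pre-emption

print(f"wall: {wall_elapsed:.4f}s  monotonic: {mono_elapsed:.4f}s")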
Timeouts are non-deterministic: they depend on system load.
Cache TTLs are non-deterministic: expiry timing varies.
Rate limiters are non-deterministic: request timing varies.
Pre-emption pauses are non-deterministic: the OS scheduler decides, not you.
Hypervisor steal time is non-deterministic: you can't even detect it from inside the VM without checking /proc/stat (see the sketch after this list).
NTP corrections are non-deterministic: clock discipline happens when the daemon decides.
Any test that uses sleep() or Date.now() is a non-deterministic test.
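A Linux-only sketch of that steal-time check, assuming the standard /proc/stat layout:

import os

def read_steal_seconds():
    # First line of /proc/stat: cpu user nice system idle iowait irq softirq steal guest guest_nice
    with open("/proc/stat") as f:
        fields = f.readline().split()
    steal_ticks = int(fields[8])                     # accumulated steal time, in clock ticks
    return steal_ticks / os.sysconf("SC_CLK_TCK")    # ticks are USER_HZ, typically 100 per second

print(read_steal_seconds(), "seconds stolen by the hypervisor since boot")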
Memory: The Invisible State
The program sees virtual addresses. The CPU sees physical pages, cache lines, TLB entries, and prefetch queues. None of this is visible to your code, but it affects behaviour:
Cache hits vs misses: same algorithm, 100× speed difference depending on what was accessed recently
ASLR: Address Space Layout Randomization. Memory addresses differ every execution. Pointer values differ. Any code that depends on pointer ordering is non-deterministic (see the sketch after this list)
Page faults: first access to a memory page triggers a kernel trap. When this happens depends on access patterns and OS paging decisions
NUMA topology: which CPU socket owns which memory bank. Cross-socket access is slower. Thread placement affects performance non-deterministically
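A small CPython illustration of the ASLR point: id() happens to expose an object's address there, so two runs of the same program typically print different values:

class Node:
    pass

nodes = [Node() for _ in range(3)]

# In CPython, id() is the object's memory address. ASLR and allocator state
# mean these values typically differ on every run of the same program.
print([hex(id(n)) for n in nodes])

# Sorting by id() is sorting by pointer: valid within one run, different on the next.
# Anything that persists or compares this order is non-deterministic across runs.
print([hex(id(n)) for n in sorted(nodes, key=id)])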
GPUs: Massively Parallel Non-Determinism
A GPU runs thousands of threads simultaneously. The order in which they complete is non-deterministic. For integer work, this doesn't matter: addition is commutative and associative. For floating point, it does:
1024 threads each compute a partial sum. The reduction tree adds them in pairs. Which pairs complete first varies → addition order varies → rounding varies → result varies.
atomicAdd on float: the order of accumulation is hardware-dependent. Same kernel, same data, different result each launch.
This is why the same neural network, with the same weights, same input, and same seed, can produce different outputs on different GPUs, or even on the same GPU across runs. The model's forward pass involves millions of floating-point reductions, each with non-deterministic accumulation order.
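You don't need a GPU to see the underlying effect. A rough stand-in in Python: shuffling the accumulation order of the same values, the way a different thread-completion order would, already shifts the result:

import random

rng = random.Random(0)
# Mixed magnitudes, like the partial sums inside a large reduction.
values = [rng.uniform(-1.0, 1.0) * 10 ** rng.randint(-8, 8) for _ in range(10_000)]

results = set()
for _ in range(5):
    rng.shuffle(values)     # a different completion order each "launch"
    total = 0.0
    for v in values:
        total += v          # each addition rounds, and the rounding depends on the running total
    results.add(total)

print(results)              # typically several close-but-different values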
I/O: The Outside World
The moment your program reads from disk, network, stdin, a sensor, or anything external, determinism is gone.
Disk reads: latency varies with physical head position (HDD), wear levelling (SSD), OS caching, and concurrent I/O from other processes
Network: latency, packet loss, reordering, retransmission. Two identical requests take different times. Responses arrive in different orders
User input: a human is the ultimate non-deterministic system. Keystroke timing, mouse position, decision-making
Sensors: temperature, accelerometer, GPS. Physical world measurements have noise, drift, and sampling jitter
Languages and Runtimes
Many non-determinism sources are baked into the tools you use:
Python dict: insertion-ordered since 3.7, but set iteration order is non-deterministic across runs (hash randomisation, enabled by default)
Go map: iteration order is intentionally randomised to prevent developers depending on it
Java HashMap: no guaranteed iteration order, and the order can change when the map resizes and across JVM versions
Garbage collection: when GC runs, how long it pauses, which objects it collects. All non-deterministic. Finalizer order is unspecified
JIT compilation: which methods get compiled, when, and how aggressively depends on runtime profiling. Same code, different optimisation, different performance characteristics
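A small sketch of the Python case; save it as a script (the file name set_order.py is just an example) and run it twice:

# set_order.py -- run twice and compare the output.
# CPython seeds its string hash randomly per process, so the iteration
# order of a set of strings usually differs between runs.
words = {"alpha", "bravo", "charlie", "delta", "echo"}
print(list(words))

# Pinning the seed makes runs comparable again:
#   PYTHONHASHSEED=0 python set_order.py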
What You Can Actually Do About It
Fix the seed: control PRNG state. Necessary but not sufficient.
Use deterministic kernels: PyTorch's torch.use_deterministic_algorithms(True) forces deterministic CUDA kernels where they exist. Slower, but reproducible. Mostly.
Avoid FP accumulation order dependency: use Kahan summation (sketched after this list), or sort before reducing
Lock everything: serialise concurrent access. Kills performance, guarantees ordering
Use logical clocks: Lamport timestamps, vector clocks. Don't depend on wall time
Make I/O a seam: inject deterministic test doubles for disk, network, time, and randomness
Accept it: design for non-determinism. Idempotent operations, CRDTs, eventual consistency, property-based tests instead of exact equality
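As a sketch of the compensated-summation idea mentioned above, checked against math.fsum as a correctly rounded reference:

import math

def kahan_sum(values):
    # Compensated (Kahan) summation: carry the rounding error of each
    # addition forward instead of silently dropping it.
    total = 0.0
    compensation = 0.0
    for v in values:
        y = v - compensation
        t = total + y                    # the low-order bits of y are lost here...
        compensation = (t - total) - y   # ...and recovered here for the next step
        total = t
    return total

values = [0.1] * 1_000_000
exact = math.fsum(values)                # correctly rounded reference sum

print(sum(values) - exact)               # naive sum: error builds up with every addition
print(kahan_sum(values) - exact)         # compensated: stays within a rounding step or two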
The Truth
One instruction on one core: deterministic
One thread, integer only, no I/O: deterministic
One thread, floating point: deterministic on the same hardware with the same compiler. Not across platforms.
Multiple threads: non-deterministic unless fully synchronised
Multiple machines: non-deterministic. Always.
GPU: non-deterministic by default. Deterministic mode available at a performance cost.
The real world: non-deterministic. No exceptions.
A computer is a deterministic machine the same way a billiard table is frictionless: true in the textbook, never true in practice. Every real system is a deterministic core wrapped in layers of non-determinism. The skill is knowing where the boundaries are.
Determinism is the collapsed state. Non-determinism is the superposition. Most systems exist in superposition most of the time; we just pretend they've collapsed because the output usually looks the same.