A CPU Running on GPU
The nCPU project pushes the boundaries of useful absurdity: it's a complete processor in which every component is implemented as a neural network running on the GPU. Registers, memory, flags, program counter — everything is a PyTorch tensor.
How It Works
Each ALU operation is a trained model:
- Addition: uses the Kogge-Stone carry-lookahead algorithm, implemented as 8 neural passes
- Multiplication: learned lookup table for byte pairs
- Logical operations: vectorized neural truth tables
- Shifts: attention-based bit routing
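To make the "vectorized neural truth table" idea concrete, here is a minimal sketch. It uses NumPy as a stand-in for the project's GPU tensors, and the tables are hard-coded rather than learned; the function name `logic_op` is illustrative, not the project's API.

```python
import numpy as np

# Each 2-input logic op becomes a 4-entry table indexed by the bit pair
# (a_bit, b_bit) -> index 2*a_bit + b_bit. In the real project these
# entries would come out of trained networks; here they are hard-coded.
TRUTH = {
    "AND": np.array([0, 0, 0, 1], dtype=np.uint8),
    "OR":  np.array([0, 1, 1, 1], dtype=np.uint8),
    "XOR": np.array([0, 1, 1, 0], dtype=np.uint8),
}

def logic_op(a: int, b: int, op: str) -> int:
    """Apply an 8-bit logical op via one vectorized table lookup."""
    a_bits = np.unpackbits(np.array([a], dtype=np.uint8))
    b_bits = np.unpackbits(np.array([b], dtype=np.uint8))
    out_bits = TRUTH[op][2 * a_bits + b_bits]  # all 8 bit positions at once
    return int(np.packbits(out_bits)[0])

print(logic_op(0b11001100, 0b10101010, "XOR"))  # 0b01100110 = 102
```

The lookup is a single vectorized gather over all bit positions, which is why these ops parallelize so well on a GPU.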
The result? 100% accuracy on integer arithmetic, verified by 347 automated tests.
Performance Inversion
The most counter-intuitive finding: multiplication is 12x faster than addition.
In a classic CPU, MUL is almost always slower than ADD. Here, it's reversed: the lookup table for MUL (21 µs) has no sequential dependency, while the carry-lookahead adder (248 µs) requires O(log n) propagation stages.
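The asymmetry can be sketched in a few lines. This is a NumPy stand-in for the project's GPU tensors, with illustrative function names; the point is the shape of the computation, not the actual implementation.

```python
import numpy as np

# MUL: one precomputed 256x256 product table. Multiplying two bytes is a
# single indexed read with no dependency between bit positions.
MUL_TABLE = np.outer(np.arange(256), np.arange(256)).astype(np.uint16)

def mul_lookup(a: int, b: int) -> int:
    return int(MUL_TABLE[a, b])  # O(1): one gather

# ADD: Kogge-Stone parallel-prefix carries. The loop runs log2(width)
# stages, and each stage must wait for the previous one -- this is the
# sequential dependency that makes addition the slow path here.
def kogge_stone_add(a: int, b: int, width: int = 8) -> int:
    a_bits = (a >> np.arange(width)) & 1  # LSB-first bit planes
    b_bits = (b >> np.arange(width)) & 1
    g = a_bits & b_bits                   # carry generate
    p = a_bits ^ b_bits                   # carry propagate
    d = 1
    while d < width:                      # 3 dependent stages for width=8
        g = g | (p & np.concatenate([np.zeros(d, dtype=g.dtype), g[:-d]]))
        p = p & np.concatenate([np.ones(d, dtype=p.dtype), p[:-d]])
        d *= 2
    carries = np.concatenate([[0], g[:-1]])  # carry into each bit
    s_bits = (a_bits ^ b_bits) ^ carries
    return int((s_bits << np.arange(width)).sum() & (2**width - 1))

print(mul_lookup(13, 7))        # 91
print(kogge_stone_add(200, 100))  # 44 (300 mod 256)
```

Within each prefix stage everything is vectorized, but the stages themselves cannot be fused: stage k needs the generate/propagate values produced by stage k-1, so the adder pays a fixed number of sequential round trips that the multiply's single gather avoids.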
23 Models, 135 MB
The "CPU" includes 23 trained models totaling 135 MB:
- Arithmetic (ADD/SUB/CMP): 100% accuracy
- Multiplication: 100% accuracy
- Logic (AND/OR/XOR): 100% accuracy
- Shifts: 100% accuracy
- Math functions (sin, cos, sqrt, exp, log): trained
Overall Performance
- ~262 µs per instruction cycle
- ~4,975 instructions per second
- Model loading in 60 ms
This is obviously orders of magnitude slower than a real CPU. Performance isn't the point.
Why It's Interesting
This project is a thought experiment made concrete. It demonstrates that neural networks can exactly encode arbitrary fixed-width functions, and that those pieces compose into a complete processor.
It's also an exploration of boundaries between hardware and software, between deterministic and learned computation. The code is available on GitHub for those who want to explore this strange intersection.
