Neural Networks: The Mechanism Behind the Metaphor

By Leigh Garrity, May 9, 2026

A neural network is a mathematical function composed of stacked layers. Each layer takes a set of numbers as input, multiplies them by a matrix of weights, adds a constant offset, and passes the result through a nonlinear function. The output of that operation becomes the input to the next layer. Repeat this enough times and you have a system capable of approximating extraordinarily complex relationships between inputs and outputs.

The "neural" in the name is historical baggage. The architecture was loosely inspired by how neurons fire in biological brains — input exceeds threshold, signal propagates forward. But the resemblance is superficial, and the metaphor has been doing damage ever since. What you're actually working with is linear algebra at scale, with a nonlinear twist inserted at each layer to prevent the whole stack from collapsing into a single matrix multiplication. The brain metaphor makes the thing sound more mysterious than it is, and also less precise than it needs to be. Worth noting, then setting down.
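The collapse described above can be checked with a few lines of arithmetic. This is a minimal sketch in plain Python: the weight values, the helper names (`matmul`, `apply`, `relu`), and the choice of ReLU as the nonlinearity are all assumptions made for illustration, and the offsets are left out to keep the matrices small.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def apply(matrix, vector):
    """Multiply a matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def relu(vector):
    """Set negative values to zero, leave the rest unchanged."""
    return [max(0.0, v) for v in vector]

# Hypothetical weights for two tiny layers (offsets omitted for brevity).
W1 = [[1.0, -2.0], [0.5, 1.0]]
W2 = [[2.0, 1.0], [-1.0, 3.0]]
x = [1.0, 2.0]

# Two purely linear layers applied in sequence...
two_linear = apply(W2, apply(W1, x))
# ...give the same result as one layer whose matrix is the product of W2 and W1.
collapsed = apply(matmul(W2, W1), x)

# With a nonlinearity between the layers, the single-matrix shortcut fails.
with_relu = apply(W2, relu(apply(W1, x)))

print(two_linear, collapsed, with_relu)
```

The first two results are identical, which is the collapse: without the nonlinear step, the second layer adds nothing a single matrix could not do. The third result differs because the ReLU zeroed out a negative intermediate value before the second layer saw it.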

• neural network: A mathematical function built from sequential layers, each applying a weighted linear transformation followed by a nonlinear activation. The biological framing is historical; the mechanism is matrix multiplication, repeated.

How the Layers Actually Work

Take an input: a vector of numbers representing whatever the network is processing. The layer's weights are just numbers too, stored in a matrix. Multiplying the input vector by that matrix produces a new vector, in which each element is a weighted sum of all the inputs. A constant offset is added to each element. Then an activation function gets applied to each element of that vector, a simple nonlinear operation, something like "if this value is negative, set it to zero." The result is the output of one layer.

That output becomes the input to the next layer. Same sequence: multiply by weights, add offset, apply activation. And again. And again.
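The sequence above (multiply by weights, add offset, apply activation, repeat) can be sketched in plain Python. The function names `layer` and `forward`, the two-layer `network`, and every number in it are hypothetical, chosen only so the arithmetic is easy to follow by hand. The activation is the "negative becomes zero" rule, usually called ReLU.

```python
def relu(vector):
    """If a value is negative, set it to zero."""
    return [max(0.0, v) for v in vector]

def layer(weights, offset, inputs):
    """One layer: weighted sums, plus offset, then the activation."""
    weighted = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
    shifted = [v + b for v, b in zip(weighted, offset)]
    return relu(shifted)

def forward(layers, inputs):
    """Feed each layer's output into the next layer."""
    activations = inputs
    for weights, offset in layers:
        activations = layer(weights, offset, activations)
    return activations

# Hypothetical network: 2 inputs -> 3 hidden values -> 1 output.
network = [
    ([[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]], [0.0, 1.0, -1.0]),
    ([[1.0, 1.0, 1.0]], [0.5]),
]

hidden = layer(*network[0], [2.0, 1.0])
output = forward(network, [2.0, 1.0])
print(hidden, output)  # [1.0, 2.5, 0.0] [4.0]
```

The whole forward pass is the loop inside `forward`. A production framework does the same thing with much larger matrices and optimized hardware kernels, and many real architectures skip the activation on the final layer, but the per-layer operation is the one shown here.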

No hidden complexity underneath it. The sophistication of what these systems can do emerges from scale and depth, not from any exotic operation at the layer level.

"Deep" means the stack is tall. A shallow network might have two or three layers. A deep network has many more. Large-scale models used in production today can have dozens to hundreds of layers, each containing millions of parameters or more. The exact numbers matter less than the proportion: the total parameter count in a large model is measured in the billions, and most of those parameters are weights in layer matrices. The word "deep" in "deep learning" is literal. It refers to the depth of the layer stack. Nothing more.

Why does depth matter mechanically? Each layer can learn to represent the input at a different level of abstraction. In a network trained on images, early layers tend to detect low-level features — edges, gradients, local contrast. Middle layers detect shapes and textures. Later layers detect higher-order combinations that correspond to recognizable structures. Nobody programmed these representations. They emerged from training. The network discovered them by adjusting weights until its outputs matched the desired outputs closely enough.

That capacity for emergent abstraction is what makes deep networks powerful. It is also, precisely, what makes them hard to interpret.

After training, the weights are fixed. The network is a deterministic function: put the same numbers in, get the same numbers out. There is nothing stochastic happening at inference time in the basic case. The weights are the model. Everything the network "knows" is encoded in those numbers, and those numbers are the problem we'll get to shortly.
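Both claims, that the function is deterministic and that the weights are the model, can be illustrated with a toy example. This sketch assumes a hypothetical two-layer network with made-up weights; it writes the weights out as JSON text, reads them back, and compares outputs. Real models use binary formats rather than JSON, so treat the serialization step as a stand-in.

```python
import json

def forward(layers, inputs):
    """Run inputs through each (weights, offset) layer with a ReLU activation."""
    activations = inputs
    for weights, offset in layers:
        activations = [
            max(0.0, sum(w * x for w, x in zip(row, activations)) + b)
            for row, b in zip(weights, offset)
        ]
    return activations

# Hypothetical "trained" weights: nothing but nested lists of numbers.
trained = [
    [[[1.0, -1.0], [0.5, 0.5]], [0.0, 1.0]],
    [[[2.0, 1.0]], [-0.5]],
]

# Serialize the numbers to text and load them back.
restored = json.loads(json.dumps(trained))

x = [3.0, 1.0]
first = forward(trained, x)
second = forward(trained, x)
from_disk = forward(restored, x)

print(first, second, from_disk)  # [6.5] [6.5] [6.5]
```

Nothing else has to be saved. Copy the numbers and you have copied the model; run it twice on the same input and you get the same output twice.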

• depth: The number of sequential transformation layers in a network. Greater depth allows the network to represent increasingly abstract features of the input without any explicit programming of those representations. "Deep learning" is an architectural description, not a claim about the sophistication of what's learned.

When You'll Actually Need This

The first time a CAIO asks whether their agency's AI procurement should include model interpretability requirements, you need a real answer. Not a vendor answer. A real one.

Interpretability is an active research field. It is not a solved problem. The weights in a trained network are floating-point numbers. They don't have names. They don't correspond to concepts a human defined. You cannot open a trained network and read off why it made a particular decision the way you can open an audit log and trace an access event.

"We trained it and it works" is closer to the truth than most vendors will admit. The performance is real — these systems do things that are genuinely useful. But the mechanism that produces the performance is not fully understood, even by the researchers who built the systems. Mechanistic interpretability is an active subfield trying to change this, and there's genuine progress being made. Researchers have identified structures inside trained networks that appear to correspond to recognizable concepts — circuits that detect specific features, attention patterns that track relationships. But the field is not at a point where you can point to a specific layer and say "this is where the model decided the document was classified."

In federal procurement, this matters concretely. Agencies operating under FISMA, deploying AI in adjudication or benefits contexts, or trying to satisfy emerging executive-branch AI governance requirements are going to ask about explainability. The honest answer is that explainability for deep networks is a spectrum, not a binary, and the state of the art is still being written. Knowing this — and being able to say it without flinching — is more credible than the vendor who tells the CAIO it's handled.

The buyer who hears you say "interpretability is an open problem and here's where the research currently stands" will trust you more than the buyer who hears a clean answer that doesn't match what their technical staff already knows.

• interpretability: The degree to which a trained network's internal reasoning can be examined and explained by humans. For deep networks, this remains an active research problem. Vendors who describe it as solved are ahead of the science.

Okta / IDAM Concept Mapping

The LDAP directory tree is a useful structural anchor. It's hierarchical — entries organized from root to leaves, each with a defined position in the tree. It's queryable — you can ask it precise questions and get precise, traceable answers. And it's transparent by design: every access decision is auditable. The logic that governs a query result is human-readable, stored in attributes and schema that an administrator wrote and can read. You can trace exactly why a user got or didn't get access, step by step, all the way up the tree.

A neural network's layer structure looks similar on a diagram. Inputs at one end, outputs at the other, organized hierarchy in between. The resemblance is real, and it's a useful starting point for the conversation. Up to a point.

LDAP's transparency is not incidental — it's the point. The rules governing access decisions are legible because humans wrote them to be legible. A neural network's "rules" are the weights: millions of floating-point numbers that emerged from training. Nobody wrote them. Nobody can read them in any meaningful sense. The intermediate representations between layers don't correspond to concepts a human defined or can look up. The opacity isn't a configuration problem waiting for a better admin interface. It's a fundamental property of how these systems work, and the research community is still working out what to do about it.
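The contrast can be made concrete with a toy comparison. Nothing below is an LDAP or Okta API: the rule list, the `rule_decision` helper, the user record, and the weights are all hypothetical stand-ins, written in plain Python only to show what each kind of system hands back when you ask it why.

```python
# Hypothetical hand-written access rules: each one has a name a human chose.
RULES = [
    ("member of group 'hr-admins'", lambda user: "hr-admins" in user["groups"]),
    ("account is active", lambda user: user["active"]),
]

def rule_decision(user):
    """Return the decision plus a step-by-step trace of every rule checked."""
    trace = [(name, check(user)) for name, check in RULES]
    allowed = all(passed for _, passed in trace)
    return allowed, trace

# Hypothetical learned weights: numbers with no names attached.
WEIGHTS = [0.8731, -1.2046, 0.3319]
OFFSET = -0.1507

def network_score(features):
    """One tiny layer: weighted sum, plus offset, then ReLU."""
    return max(0.0, sum(w * x for w, x in zip(WEIGHTS, features)) + OFFSET)

user = {"groups": ["hr-admins"], "active": False}
allowed, trace = rule_decision(user)
score = network_score([1.0, 0.0, 1.0])

print(allowed)
for name, passed in trace:
    print(f"  {name}: {passed}")
print(score)
```

The rule-based path returns a denial and the named rule that caused it. The network path returns a number. You can print every weight that produced that number, and none of them will tell you what it stands for.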

LDAP: transparent and auditable by design. Neural networks: opaque by nature, with interpretability as an ongoing research project. That gap is not a gap in the product. It's a gap in the science. Your buyer needs to know the difference.

The mechanical picture is actually simple. Layers of weighted transformations, stacked deep, producing outputs that nobody fully understands from the inside. The performance is real. So is the opacity. The seller who can hold both without flinching is the one who gets invited back for the technical deep-dive.
