The Model, the Host, and the Training Run

By Leigh Garrity, May 6, 2026

When a buyer says "DeepSeek sends data to China," the concern is legitimate. The mechanism is wrong. And in a CISO conversation, mechanism is everything — because identifying the right mechanism points to a real control, and chasing the wrong one consumes the meeting.

Here's what's actually happening when a customer runs DeepSeek R1 through Amazon Bedrock.

What the Model Actually Is

A trained model is a file. More precisely, it's a large collection of floating-point numbers (the weights) that encode learned associations between inputs and outputs. A 70-billion-parameter R1 variant is roughly 140 gigabytes of these numbers at 16-bit precision, and less when quantized; the full R1 release is considerably larger. The file has no network stack. It has no scheduler. It has no process that wakes up at 2 a.m. and phones home. When you load it into an inference runtime, it does one thing: takes a sequence of tokens as input and produces a probability distribution over the next token as output. That's the complete behavioral surface.
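The size figure is plain arithmetic: parameter count times bytes per parameter. The sketch below assumes dense weights and ignores file-format overhead; the precision table and the function name are illustrative and do not come from any DeepSeek or AWS tooling.

```python
# Approximate on-disk size of a weights file: parameters x bytes per parameter.
# Assumption: dense weights only; tokenizer files and metadata are not counted.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weights_size_gb(num_params: float, precision: str) -> float:
    """Return the approximate size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    # Illustrative 70-billion-parameter model at several precisions.
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: {weights_size_gb(70e9, precision):.0f} GB")
```

At 16-bit precision this gives the 140 GB figure; quantizing to 4 bits cuts it to about a quarter of that.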

The weights are static. They don't update during inference. They don't learn from your prompts. They don't accumulate state between sessions unless you explicitly build a memory layer on top of the inference call. The model you loaded on Tuesday is identical to the model you loaded on Monday, regardless of what you asked it.
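A toy illustration of that claim, not a real transformer: the "model" below is a fixed table of made-up numbers, inference is a read-only function over it, and a checksum taken before and after a handful of calls comes out the same. Every name and value here is hypothetical.

```python
import hashlib
import math
import struct

# Toy "weights": one row of next-token logits per token in a 3-token vocabulary.
# Hypothetical values chosen only for illustration.
WEIGHTS = (
    (0.1, 1.2, -0.3),
    (0.7, -0.5, 0.9),
    (-1.0, 0.4, 0.2),
)


def fingerprint(weights) -> str:
    """SHA-256 over the packed weight values, standing in for a file checksum."""
    flat = [value for row in weights for value in row]
    return hashlib.sha256(struct.pack(f"{len(flat)}d", *flat)).hexdigest()


def next_token_distribution(weights, last_token: int) -> list:
    """Read-only forward pass: softmax over the logits row for the last token."""
    logits = weights[last_token]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


before = fingerprint(WEIGHTS)
for token in (0, 1, 2, 1, 0):  # five "prompts"
    probs = next_token_distribution(WEIGHTS, token)
after = fingerprint(WEIGHTS)

print("weights unchanged after inference:", before == after)
```

The same check scales to a real deployment in principle: a checksum of the weights file does not move because someone sent a prompt.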

Three Layers, Cleanly Separated

The confusion in most security conversations collapses three distinct layers into one.

Training is what happened before you got the artifact. Someone at DeepSeek, in China, assembled a dataset, ran a compute job, applied reinforcement learning and fine-tuning, and produced the weights. This is the layer where Chinese jurisdiction, data sourcing decisions, and fine-tuning choices all live. It's also the layer you have no runtime visibility into, because it's finished. The training run is over.

Hosting is where inference actually runs. When AWS offers DeepSeek R1 through Bedrock, the weights are loaded onto AWS hardware in the AWS region you call, US-East in this example. Your API call travels to AWS infrastructure under your AWS IAM controls, and the response comes back from that same infrastructure. No packet in this exchange touches a DeepSeek server or a Chinese network. Bedrock is a managed service, so the runtime sits in AWS-operated infrastructure rather than inside your own account, but the traffic stays within AWS. AWS's shared responsibility model applies: AWS secures the service, and you own the IAM policies, VPC endpoints, KMS keys, and CloudTrail logs around it.
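One way to turn "inference stays in the region we chose" from an assertion into a control is an IAM deny statement keyed on the requested region. The sketch below builds such a policy document as a Python dictionary. The region list, the statement ID, and the function name are assumptions for illustration; whether a single-region restriction fits depends on how the model is exposed in the current Bedrock catalog, so treat this as a starting point to review with your cloud team rather than a finished policy.

```python
import json

# Assumption: inference should be confined to one region (us-east-1 here).
ALLOWED_REGIONS = ["us-east-1"]


def region_guardrail_policy(allowed_regions):
    """Build an IAM policy document that denies Bedrock inference calls
    addressed to any region outside the allowed list."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyBedrockInferenceOutsideAllowedRegions",  # hypothetical name
                "Effect": "Deny",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": "*",
                "Condition": {
                    # aws:RequestedRegion is the region the API call targets.
                    "StringNotEquals": {"aws:RequestedRegion": allowed_regions}
                },
            }
        ],
    }


policy = region_guardrail_policy(ALLOWED_REGIONS)
print(json.dumps(policy, indent=2))
```

A deny statement is used rather than a narrower allow because an explicit deny holds even if some other policy grants broader Bedrock access.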

Telemetry is what the inference provider can see. In the Bedrock scenario, AWS logs inference requests according to your account's logging configuration. DeepSeek-the-company has no visibility into your prompts, your completions, or your usage patterns. They shipped the weights; they don't operate the runtime.

Each layer has different owners, different controls, and different risk profiles. Treating all three as equivalent Chinese-jurisdiction risks produces a threat model that's simultaneously too broad, because it flags hosting and telemetry paths that DeepSeek never touches, and too narrow, because it misses the training-layer exposures that are actually present.

Okta Concept Mapping: Trust Anchors

The closest IDAM analog is the trust anchor in PKI. When you add a CA certificate to your trust store, you're making a one-time decision to accept artifacts that chain to that root — and from that point forward, your own infrastructure handles validation, logging, and enforcement. The CA doesn't see your TLS handshakes. The analogy holds this far: running DeepSeek on AWS gives you the same runtime isolation. Your inference infrastructure is yours; DeepSeek-the-company doesn't see your traffic.

The analogy breaks at provenance. With a CA, trust is verifiable at runtime: certificate chains, CRLs, and OCSP responses let you trace trust back to a known root. With a trained model, the equivalent of the certificate chain is the training process, and that process is opaque and already finished. You can observe the model's behavior; you cannot audit the training run that produced it. The trust anchor question becomes: what did we agree to trust, and on what basis?
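The limit of the analogy shows up in what a check can actually prove. The sketch below assumes a deployment where you hold the weights file yourself (not the case on Bedrock, where AWS manages the artifact) and pins a SHA-256 digest at approval time. The file name and contents are stand-ins. A matching digest shows the artifact is the one you approved; it says nothing about how that artifact was trained.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large artifacts never sit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path: Path, pinned_digest: str) -> bool:
    """True if the file on disk is byte-identical to the approved artifact."""
    return hmac.compare_digest(sha256_of_file(path), pinned_digest)


# Demo with a stand-in file; real weights would be far larger.
with tempfile.TemporaryDirectory() as workdir:
    artifact = Path(workdir) / "weights.bin"  # hypothetical file name
    artifact.write_bytes(b"stand-in for model weights")
    pinned = sha256_of_file(artifact)  # recorded at approval time

    unchanged = matches_pin(artifact, pinned)
    artifact.write_bytes(b"tampered weights")
    tampered = matches_pin(artifact, pinned)

print(unchanged, tampered)  # prints: True False
```

This is integrity, not provenance: the pin answers "is this the file we approved?" and leaves "should we have approved it?" to the review that preceded the pin.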

The Conversation Worth Having

When a CISO raises the data-goes-to-China objection, the accurate response is redirection. For the Bedrock scenario, that particular concern doesn't hold. The actual exposure sits elsewhere.

Training data provenance. What was in the corpus? DeepSeek has published some information about training data composition, but not at the granularity a federal procurement officer needs. If the use case involves sensitive domains (legal, medical, national security-adjacent), the question of what the model learned from, and whether that data was lawfully obtained, is legitimate and currently unanswered.

Embedded content policies. RLHF and fine-tuning shape model behavior in ways that aren't fully documented. There are reported cases where DeepSeek R1 declines to engage with certain topics in ways that differ from comparable Western models. Whether this represents a security concern or an operational limitation depends on the use case, but it's a real behavioral characteristic worth testing before deployment, not after.
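Testing before deployment can start small. The sketch below is a minimal probe harness under stated assumptions: `generate` is whatever function calls your inference endpoint (stubbed here so the example runs offline), the refusal markers are a crude keyword heuristic, and the probe prompts are placeholders for a list drawn from your actual use case.

```python
from collections import defaultdict

# Assumption: a refusal can be spotted by a few stock phrases. Real evaluations
# usually need human review or a stronger classifier.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm unable", "i am unable")


def looks_like_refusal(completion: str) -> bool:
    text = completion.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def refusal_rates(probes, generate):
    """Run (topic, prompt) pairs through `generate` and return the fraction
    of refusals per topic."""
    asked = defaultdict(int)
    refused = defaultdict(int)
    for topic, prompt in probes:
        asked[topic] += 1
        if looks_like_refusal(generate(prompt)):
            refused[topic] += 1
    return {topic: refused[topic] / asked[topic] for topic in asked}


def stub_generate(prompt: str) -> str:
    """Hypothetical stand-in for a model call; refuses anything about 'history'."""
    if "history" in prompt.lower():
        return "I cannot discuss that topic."
    return "Here is a summary."


PROBES = [  # placeholder prompts
    ("history", "Summarize the history of this region."),
    ("history", "Describe the history behind this dispute."),
    ("contracts", "Summarize this indemnification clause."),
]

rates = refusal_rates(PROBES, stub_generate)
print(rates)
```

Running the same probe set against a comparison model gives a side-by-side view of where behavior diverges, which is the evidence a deployment decision needs.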

Export control questions. The Commerce Department's AI-related export control framework is still evolving, and the status of specific model weights under EAR is not always clear. For federal customers and contractors, this is a question that legal needs to answer before security does — and right now, the answer isn't clean.

None of these dissolve because the data stays in US-East. They're harder questions, without tidy answers yet. The packet-routing conversation is a distraction from all three.

[FLAG FOR ACCURACY REVIEW: DeepSeek R1 parameter count and file size are illustrative estimates. AWS Bedrock availability of DeepSeek R1 should be verified against current AWS model catalog. Export control framework characterizations should be reviewed against current BIS guidance before publication.]