LLMs in Software Systems

Traditional software is usually built from components that behave predictably enough to be tested, reasoned about, and composed into larger systems.

Large language models introduce a different kind of component. Their outputs are probabilistic, which means the same input can produce different results. This does not make them unusable. It means the surrounding software architecture matters more.

Before deciding how to build systems around language models, it helps to observe what kind of component they actually are.

Deterministic and Probabilistic Behavior

A deterministic component behaves like this:

same input → same output

A probabilistic component behaves more like this:

same input → set of possible outputs

Traditional software engineering relies heavily on the first pattern. LLM-based systems introduce the second.
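
The two patterns can be put side by side in a minimal sketch. The function names, the candidate outputs, and the weights below are all hypothetical; the second function is a toy stand-in for a sampled model, not a real LLM call.

```python
import random


def deterministic_component(text: str) -> str:
    """Same input always maps to the same output."""
    return text.strip().lower()


def probabilistic_component(text: str, rng: random.Random) -> str:
    """Toy stand-in for an LLM: samples one of several plausible outputs.

    The candidates and weights are made up for illustration only.
    """
    cleaned = text.strip()
    candidates = [cleaned.lower(), cleaned.title(), cleaned.upper()]
    weights = [0.7, 0.2, 0.1]
    return rng.choices(candidates, weights=weights, k=1)[0]


def observed_outputs(text: str, runs: int, seed: int = 0) -> set:
    """Collect the distinct outputs seen over repeated calls with one input."""
    rng = random.Random(seed)
    return {probabilistic_component(text, rng) for _ in range(runs)}
```

Calling `deterministic_component` twice with the same string gives equal results, while `observed_outputs` returns a set that usually holds more than one element. A test written for the first pattern (compare against one expected value) does not carry over to the second without change.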

Simple Observations

Some tasks are narrow and stable.

What is the capital of Japan?
→ Tokyo

Some tasks require representation and transformation.

A shop sells 3 pencils for $2. How much do 10 pencils cost?
→ $6.67
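
The transformation behind that answer can be written out explicitly. This assumes pencils can be bought individually at the proportional price and that the result is rounded to the nearest cent; neither assumption is stated in the question itself.

```latex
\[
\text{price per pencil} = \frac{\$2}{3},
\qquad
10 \times \frac{\$2}{3} = \frac{\$20}{3} \approx \$6.67
\]
```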

Some tasks depend heavily on domain structure.

Why do we wear seatbelts in cars?
Why is orbital velocity needed for a rocket to stay around Earth?

These examples show that model behavior cannot be characterized in the abstract. Success and failure are shaped by the domain, the structure of the task, and the constraints of the expected answer.

Failures Are Informative

When a model fails, the failure often has structure.

It may indicate:

  • a local calculation mistake
  • a formatting problem
  • a domain misunderstanding
  • a missing environmental constraint
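
As a rough sketch, these four failure kinds can be given names that code can branch on. The `ErrorType` enum and `classify_price_answer` function are hypothetical, and the mapping of "formatting problem" to a structural error is an assumption. The classifier only separates malformed output from a wrong value; telling a local slip from a domain misunderstanding would need more context than a single answer provides.

```python
import re
from enum import Enum
from typing import Optional


class ErrorType(Enum):
    LOCAL = "local"                  # a local calculation mistake
    STRUCTURAL = "structural"        # a formatting problem
    DOMAIN = "domain"                # a domain misunderstanding
    ENVIRONMENTAL = "environmental"  # a missing environmental constraint


def classify_price_answer(
    raw: str, expected: float, tolerance: float = 0.005
) -> Optional[ErrorType]:
    """Return None if the answer is acceptable, otherwise a coarse error type.

    Assumes the expected format is a dollar amount with two decimals, e.g. "$6.67".
    """
    match = re.fullmatch(r"\$(\d+\.\d{2})", raw.strip())
    if match is None:
        return ErrorType.STRUCTURAL  # output does not fit the expected shape
    value = float(match.group(1))
    if abs(value - expected) <= tolerance:
        return None
    # A wrong number is labeled LOCAL here as a simplification.
    return ErrorType.LOCAL
```

For the pencil question, `classify_price_answer("$6.67", 20 / 3)` is accepted, while a reply such as `"about 6.67 dollars"` is flagged as structural even though the number inside it is right.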

This matters because software systems do not just consume answers. They depend on outputs that must fit into larger processes, so knowing how an output failed determines what the system can do about it.

Why This Matters

If model behavior is probabilistic, then reliability cannot come from the model alone. It has to come from how the model is used inside the system.

This suggests a different question:

not just how to prompt the model,
but what role the model should play inside software.

That leads to a more useful view: the language model as an operator inside a constrained system.
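
One way to picture an operator inside a constrained system is a wrapper that accepts a model output only when it passes a check, with a bounded number of attempts. This is a sketch under stated assumptions: `constrained_call`, `parse_single_word`, and the `model` callable are hypothetical names, and retry-until-valid is only one of several possible constraint strategies.

```python
from typing import Callable, Optional


def constrained_call(
    model: Callable[[str], str],
    prompt: str,
    parse: Callable[[str], Optional[str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Call the model until `parse` accepts an output or attempts run out.

    `model` is any callable from prompt to text (an assumption; no real
    client library is implied). Returns None if no attempt was accepted,
    leaving the fallback decision to the caller.
    """
    for _ in range(max_attempts):
        raw = model(prompt)
        parsed = parse(raw)
        if parsed is not None:
            return parsed
    return None


def parse_single_word(raw: str) -> Optional[str]:
    """Accept only a single alphabetic word, optionally followed by a period."""
    cleaned = raw.strip().rstrip(".")
    return cleaned if cleaned.isalpha() else None
```

The model stays probabilistic, but the surrounding code sees only two outcomes: a value that passed the check, or an explicit `None`. The rest of the system can be written against that narrower contract.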

Observing Operator Behavior

Once the model is viewed as an operator, a natural next step is to observe how its behavior changes across domains, formats, and constraints.

[Diagram: operator behavior across inputs. Inputs (user prompt, domain, output type, constraints) → LLM operator → outputs (raw output, parsed output, error type, verification). Error classification: domain · local · structural · environmental]

Given a prompt, a domain category, an expected output type, and optional constraints, the system can observe:

  • raw output
  • parsed output
  • classified error type
  • whether the error is domain-level, local, structural, or environmental
  • what changed if retried
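
The observations listed above could be captured in a single record per run. The sketch below is hypothetical throughout: the `Observation` fields mirror the list, `observe` takes the model, parser, and classifier as plain callables, and "what changed if retried" is reduced to whether the parsed output differs on a second call.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Observation:
    prompt: str
    domain: str
    output_type: str
    constraints: Tuple[str, ...]
    raw_output: str
    parsed_output: Optional[str]
    error_type: Optional[str]  # "domain", "local", "structural", "environmental", or None
    changed_on_retry: bool


def observe(
    model: Callable[[str], str],
    parse: Callable[[str], Optional[str]],
    classify: Callable[[str, Optional[str]], Optional[str]],
    prompt: str,
    domain: str,
    output_type: str,
    constraints: Tuple[str, ...] = (),
) -> Observation:
    """Run the model twice on one prompt and record what was seen."""
    raw = model(prompt)
    parsed = parse(raw)
    retry_parsed = parse(model(prompt))  # second call, same input
    return Observation(
        prompt=prompt,
        domain=domain,
        output_type=output_type,
        constraints=constraints,
        raw_output=raw,
        parsed_output=parsed,
        error_type=classify(raw, parsed),
        changed_on_retry=(retry_parsed != parsed),
    )
```

Collecting many such records across domains and output types gives a table that can be grouped by `error_type` or `domain`, which is what makes the behavior something to study rather than guess at.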

This kind of observation turns model behavior from a black box into something that can be studied, categorized, and engineered around.

Summary

Language models are probabilistic components. Traditional software expects deterministic behavior. The gap between these two creates the core design challenge.

Reliability cannot come from the model alone. It comes from how the model is used inside the system: what role it plays, what constraints surround it, and how failures are handled.

probabilistic component
    ↓
constrained system role
    ↓
deterministic software behavior

That is the starting point for everything that follows.