Progress Requires Observability

Software engineers have known for decades that you cannot debug what you cannot see. Logs, metrics, traces, dashboards — these are not luxuries. They are the minimum infrastructure required to make correct decisions about a running system.

But it is important to clarify what this means.

Observability is not simply watching a system. An engineer staring at a dashboard is observing the system. Observability describes something different: the system itself exposing enough signals to explain its internal behavior. If the system does not emit those signals, no amount of monitoring or inspection can reconstruct what happened.

This distinction matters even more for autonomous systems. Modern agent systems operate inside environments that already exist — web applications, APIs, document pipelines, enterprise systems. The agent does not control the instrumentation of those environments. It only receives whatever signals the environment exposes.

Progress therefore depends entirely on what the system reveals. If the environment exposes the signals required to understand state transitions, the agent can diagnose failures and adjust its behavior. If the environment hides those signals, the agent is forced to guess.

observing a system     → looking at outputs
observable system      → signals explain state
autonomous system      → decisions depend on signals

What Observability Actually Means

Observability is not monitoring. Monitoring tells you when a known metric crosses a threshold. Observability is the capacity to infer internal system state from external signals.

monitoring:     did the known thing fail?
observability:  what is actually happening inside the system?

Observability is not about collecting data. It is about exposing the right data.

A system is observable when its outputs, errors, and side effects carry enough information to reconstruct the decisions that produced them. This matters because corrective action requires diagnosis, and diagnosis requires evidence.

Observability should not be confused with simply observing a system. An engineer can observe a system through dashboards or logs, but those tools only work because the system was designed to emit signals in the first place. Observability is therefore a property of the system, not a property of the observer. A system that does not emit the signals needed to infer its internal state remains opaque regardless of how much monitoring infrastructure surrounds it.

system → outputs, errors → diagnosis → corrective action
         (signals)         (inference)  (response)

Without sufficient signals, a gap opens between what happened and what can be determined about what happened. That gap is the difference between progress and stagnation.

When Hidden State Breaks Decisions

The observability problem becomes concrete when two different environment states produce the same observation. The agent receives identical information in both cases, but the correct action is different.

REAL ENVIRONMENT STATE
state A (captcha required)  → g(x) → observation: "login page"
state B (normal login)      → g(x) → observation: "login page"

agent decision: ?

CORRECT ACTION DIFFERS
state A → solve captcha
state B → submit login

true states → collapsed → blind

Observability failure. Two different environment states produce the same observation. Because the agent cannot distinguish the states, it cannot determine the correct action.
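The collapse described above can be sketched in a few lines of Python. This is an illustrative sketch, not a real browser integration: `EnvState`, `g_narrow`, `g_wide`, and `choose_action` are hypothetical names introduced here, and the observation is assumed to be a plain dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvState:
    """Hypothetical true environment state (never seen directly by the agent)."""
    page: str
    captcha_required: bool


def g_narrow(state: EnvState) -> dict:
    """Observation function that exposes only the page name."""
    return {"page": state.page}


def g_wide(state: EnvState) -> dict:
    """Observation function that also exposes the captcha flag."""
    return {"page": state.page, "captcha_required": state.captcha_required}


def choose_action(observation: dict) -> str:
    """Pick an action from the observation alone, as an agent must."""
    if "captcha_required" not in observation:
        return "guess"  # the variable that decides the action is hidden
    return "solve captcha" if observation["captcha_required"] else "submit login"


state_a = EnvState(page="login page", captcha_required=True)
state_b = EnvState(page="login page", captcha_required=False)

# Narrow observation: both states collapse into the same observation.
aliased = g_narrow(state_a) == g_narrow(state_b)
# Wide observation: the two states become distinguishable again.
separated = g_wide(state_a) != g_wide(state_b)
```

With `g_narrow`, no decision rule can do better than guessing, because the input to the rule is identical in both states. Only a change to the observation function, not to the agent, separates them.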

The Engineering Parallel

Consider a production system with no logs. When a request fails, the engineer has nothing to work with. The failure occurred somewhere inside the system, but the system exposed no trace of its internal path.

request → [ system ] → failure

no logs
no metrics
no trace
no structured error

The engineer cannot improve this system. Not because the problem is hard, but because the system does not expose the information needed to reason about it.

Now add structured logging, request tracing, and error categorization:

request → [ system ] → failure
                ↓
         trace: auth → db → serialize → fail at step 3
         error: missing field "account_id"
         latency: 340ms

Now the engineer can act. The signal revealed the internal state. The system became observable, and therefore improvable.
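A minimal sketch of the second system, assuming a three-step pipeline that mirrors the trace above. The step functions, the `StepError` class, and the shape of the returned dictionary are all hypothetical and chosen only for illustration.

```python
import json
import time


class StepError(Exception):
    """Structured error that records which step failed and why."""

    def __init__(self, step: str, reason: str):
        super().__init__(f"{step}: {reason}")
        self.step = step
        self.reason = reason


def auth(request: dict) -> dict:
    return request  # placeholder: assume authentication succeeds


def db(request: dict) -> dict:
    return dict(request)  # placeholder: assume the lookup adds nothing


def serialize(request: dict) -> str:
    if "account_id" not in request:
        raise StepError("serialize", 'missing field "account_id"')
    return json.dumps(request)


def handle(request: dict) -> dict:
    """Run the pipeline and return the outcome together with its signals."""
    trace = []
    start = time.perf_counter()
    try:
        data = request
        for step in (auth, db, serialize):
            trace.append(step.__name__)  # record the path taken
            data = step(data)
        outcome = {"status": "ok", "error": None}
    except StepError as exc:
        outcome = {"status": "failure", "error": exc.reason, "failed_step": exc.step}
    outcome["trace"] = trace
    outcome["latency_ms"] = round((time.perf_counter() - start) * 1000, 3)
    return outcome
```

Calling `handle({"user": "u1"})` returns a failure whose `trace` ends at `serialize` and whose `error` names the missing field, which is the information the engineer in the first scenario never had.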

Why Observability Matters for Agents

In traditional software systems, observability exists to help engineers understand what the system is doing. Logs, metrics, traces, and dashboards expose internal behavior through external signals. When these signals are present, engineers can diagnose failures and determine the correct corrective action.

Agent systems operate under the same constraint.

Agents do not instrument the environments they interact with. They inherit whatever signals the environment exposes. In browser automation, for example, the environment is the browser and the page itself — both of which already exist independently of the agent. This makes observability a structural constraint rather than a tooling choice.

An agent does not have direct access to the internal state of the environment it interacts with. It sees only the signals that the software system exposes. These signals might include:

page structure
API responses
error messages
state transitions
side effects of previous actions

If these signals are incomplete, the agent cannot reliably determine what action to take next. In that situation, the system behaves like a black box.

The agent may retry actions, explore alternative paths, or produce explanations for what might be happening. But without sufficient signals, those actions become guesses rather than decisions.

ENGINEER            logs → diagnosis → fix → deploy
AGENT               signals → inference → action → observe
SHARED CONSTRAINT   progress depends on what the system reveals

Why Many Agent Failures Are Misdiagnosed

When agents fail, the natural reaction is to blame the model. Perhaps the reasoning was incorrect. Perhaps the prompt was insufficient. Perhaps the model needs more context or memory.

Sometimes those explanations are correct.

But many failures that appear to be reasoning failures are actually observability failures. The agent was never given enough information to determine the correct next action in the first place.

This is similar to debugging a production system without logs. Even the best engineer cannot reliably diagnose a problem when the system does not reveal what is happening internally.

Agents face the same limitation. If the signals do not reveal the state of the environment, the agent cannot reason its way to the correct action.

Failure Mode                  Root Cause
action failed silently        no error returned
wrong element selected        no visibility into page state
state changed unexpectedly    no change notification
context was stale             no freshness signal

In each case, the system failed to expose state that would have allowed the agent to correct itself. The environment was not observable at the resolution the agent required.

Observability and Decision Making

Correct action requires diagnosis. Diagnosis requires evidence.

In an observable system, outputs, errors, and side effects provide that evidence. These signals allow the operator — human or agent — to reconstruct what the system is doing.

When those signals are missing, decision making becomes speculative. The system may still produce outputs, but those outputs cannot be reliably interpreted.

opaque system + capable agent = unreliable behavior
observable system + capable agent = improvable behavior

Observability in Autonomous Systems

Autonomous software systems make decisions continuously. For these systems to remain stable, the environment must expose signals that allow the agent to answer three questions:

1. What just happened?
2. What state is the system currently in?
3. What action will move the system toward the desired state?

If those questions cannot be answered from the available signals, the system is not observable. In that case, progress depends on trial and error rather than informed action.
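The three questions can be treated as a checklist over whatever signals are available. The sketch below assumes a simplified model in which each question is answered by exactly one field. The `Signals` class and its field names are hypothetical, and real environments will rarely map this cleanly.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Signals:
    """Hypothetical bundle of signals received after an action."""
    last_action_result: Optional[str] = None       # what just happened?
    current_state: Optional[str] = None            # what state is the system in?
    available_actions: Optional[List[str]] = None  # what could move it forward?


QUESTIONS = {
    "last_action_result": "What just happened?",
    "current_state": "What state is the system currently in?",
    "available_actions": "What action will move the system toward the desired state?",
}


def unanswered(signals: Signals) -> List[str]:
    """Return the questions that the available signals cannot answer."""
    return [q for name, q in QUESTIONS.items() if getattr(signals, name) is None]


def is_observable(signals: Signals) -> bool:
    """Observable, in this simplified sense, when no question is left open."""
    return not unanswered(signals)
```

An agent loop could call `unanswered` before acting and treat a non-empty result as a reason to gather more signals rather than to act.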

The Observability Stack

In traditional engineering, observability has a well-known stack: logs, metrics, and traces. Each layer exposes different information about system behavior.

For agent systems, the equivalent stack maps to the operator–environment interaction model.

ENGINEERING     AGENT SYSTEMS
logs            error signals        (what)
metrics         state exposure       (how much)
traces          action history       (path)
dashboards      environment model    (overview)

observability = improvability

Observability as a Precondition

Observability is not a feature. It is a precondition for every other system property.

reliability     requires knowing when failures occur
correctness     requires knowing what was produced
adaptability    requires knowing what changed
improvement     requires knowing what to change

You cannot verify a boundary you cannot observe. You cannot correct an error you did not detect. You cannot improve a system whose behavior is invisible.

This is why observability sits between error signals and environment discovery in the operator model. Signals expose state. Observability explains what that state means. Discovery uses that explanation to navigate unknown territory.

Observability Before Intelligence

A common assumption in agent development is that better reasoning models will solve reliability problems.

But reasoning operates on the information available to the system. If the system does not expose the variables that determine success or failure, better reasoning does not solve the problem. It only produces more sophisticated guesses.

For this reason, when failures stem from missing signals, improving observability can produce larger reliability gains than improving the model itself.

Designing for Observability

If observability is a precondition for reliability, then system design must treat signal exposure as a first-class concern.

every action should produce a verifiable signal
every failure should produce a structured error
every state transition should be detectable
every environment constraint should be discoverable

Systems that satisfy these properties give the operator — whether human or autonomous — the information required to make correct decisions. Systems that do not satisfy them create an upper bound on reliability that no amount of model improvement can overcome.
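As one possible reading of the four properties, the sketch below shows a toy environment whose every action returns a result object. The `Form` class, the `ActionResult` fields, and the `NAME_TOO_LONG` code are assumptions made for illustration and do not describe any particular system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActionResult:
    """Verifiable signal returned by every action."""
    ok: bool
    state_before: str
    state_after: str
    error_code: Optional[str] = None    # structured, machine-readable failure
    error_detail: Optional[str] = None  # human-readable explanation

    @property
    def state_changed(self) -> bool:
        """State transitions are detectable by comparing before and after."""
        return self.state_before != self.state_after


class Form:
    """Hypothetical environment: a form with one discoverable constraint."""
    MAX_NAME_LENGTH = 8

    def __init__(self) -> None:
        self.state = "empty"

    def constraints(self) -> dict:
        """Expose constraints so the operator does not have to guess them."""
        return {"max_name_length": self.MAX_NAME_LENGTH}

    def submit(self, name: str) -> ActionResult:
        before = self.state
        if len(name) > self.MAX_NAME_LENGTH:
            detail = f"name has {len(name)} characters, limit is {self.MAX_NAME_LENGTH}"
            return ActionResult(False, before, self.state, "NAME_TOO_LONG", detail)
        self.state = "submitted"
        return ActionResult(True, before, self.state)
```

The design choice here is that failure is a returned value with a code, not silence: a rejected submission reports `NAME_TOO_LONG` and an unchanged state, so the operator can tell "nothing happened" apart from "it worked".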

Relationship to Signals

Error signals are one specific kind of observable. They indicate when outputs violate system constraints. Observability generalizes this: it encompasses all signals, not just failure signals.

error signals    ⊂    observable signals    ⊂    environment state

Error signals are the most action-relevant observables because they directly indicate what needs correction. But a system with only error signals and no other observability cannot diagnose why failures occur or predict when they will recur.

Relationship to Environment Discovery

Environment discovery is the process of learning the structure of an unknown environment through interaction. Observability determines what the discovery process can learn.

signals expose state           (signals)
observability explains state   (this article)
discovery navigates state      (environment discovery)

If the observation function is too narrow, the system cannot discover constraints that lie outside the observable space — regardless of how many interactions it performs.
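A small sketch of this limit, assuming a toy environment with a hidden numeric limit and two hypothetical observation functions, `narrow` and `wide`. The `probe` and `infer_limit` helpers are likewise invented for illustration.

```python
from typing import Callable, Dict, Iterable, Optional


def narrow(accepted: bool) -> str:
    """Same signal whether or not the value was accepted."""
    return "done"


def wide(accepted: bool) -> str:
    """Signal that distinguishes accepted from rejected values."""
    return "accepted" if accepted else "rejected"


def probe(limit: int, observe: Callable[[bool], str],
          candidates: Iterable[int]) -> Dict[int, str]:
    """Interact with every candidate value and record what is observed."""
    return {x: observe(x <= limit) for x in candidates}


def infer_limit(observations: Dict[int, str]) -> Optional[int]:
    """Infer the hidden limit, or None if the observations cannot reveal it."""
    accepted = [x for x, obs in observations.items() if obs == "accepted"]
    rejected = [x for x, obs in observations.items() if obs == "rejected"]
    if not accepted or not rejected:
        return None  # no boundary is visible in the observations
    return max(accepted)


# A thousand interactions through the narrow function reveal nothing.
blind = infer_limit(probe(5, narrow, range(1000)))
# Ten interactions through the wide function are enough.
found = infer_limit(probe(5, wide, range(10)))
```

Here `blind` is `None` and `found` is `5`: the number of interactions matters far less than what each interaction is allowed to reveal.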

Progress Requires Observability

Before improving reasoning, planning, or memory, it is worth asking a simpler question:

Does the system expose enough information
to determine the correct next action?

If the answer is no, the agent will struggle regardless of model capability. An autonomous system cannot improve performance on variables it cannot observe.

environment → signals → observability → diagnosis → correction → progress

Progress requires observability.