Representation Mapping

Layer 1: operator view (abstract)
Layer 2: representation mapping (concrete)

Language models are often introduced as systems that answer questions.

question → model → answer

That framing is useful, but inside software systems language models are frequently used in a different way. Instead of generating final answers, they transform information between representations.

This process is called representation mapping.

Basic Idea

A representation is a particular way of encoding information. Common examples:

HTML documents
natural language text
logs and traces
JSON objects
system state

Representation mapping converts information from one representation into another.

representation_A → model → representation_B

The model acts as a bridge between the two.

Mathematical View

Let A be an environment representation, B be a system representation, and M be the language model.

The transformation can be written as:

M : A → B

The model maps elements from domain A into domain B. Because language models are probabilistic, M(x) is better read as a distribution over B than as a single value, and each call draws one sample from it:

y ~ M(x)

where:
  x ∈ A    environment representation
  y ∈ B    system representation

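Because y is a sample, the same x can produce different outputs across calls. The system therefore checks each y and decides whether it is acceptable before acting on it.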
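The sampling view above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: make_stub_model is a hypothetical stand-in for a real model call, is_json_object is one possible acceptance check, and the retry limit is arbitrary.

```python
import json
from typing import Callable, List, Optional


def make_stub_model(candidates: List[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for M: returns the given candidates in order, ignoring x."""
    remaining = iter(candidates)

    def model(x: str) -> str:
        return next(remaining)

    return model


def is_json_object(y: str) -> bool:
    """Acceptance check (an assumption): y must parse as a JSON object."""
    try:
        return isinstance(json.loads(y), dict)
    except json.JSONDecodeError:
        return False


def sample_until_accepted(
    model: Callable[[str], str],
    x: str,
    accept: Callable[[str], bool],
    max_attempts: int = 3,
) -> Optional[str]:
    """Draw y ~ M(x) up to max_attempts times; return the first accepted y, else None."""
    for _ in range(max_attempts):
        y = model(x)
        if accept(y):
            return y
    return None


# Usage: the first sample is rejected, the second is accepted.
model = make_stub_model(["not json", '{"username_field": "user"}'])
result = sample_until_accepted(model, "<form>...</form>", is_json_object)
print(result)
```

Returning None after a bounded number of attempts keeps the failure case explicit, so the caller decides what happens when no sample is acceptable.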

Why Representation Mapping Works Well With LLMs

Language models are trained on large amounts of data that mix structured and unstructured forms: code, documents, markup, natural language, configuration files, and structured outputs.

Because they have seen these forms side by side, they tend to be good at translating between representations.

HTML       → JSON
text       → schema
logs       → categories
documents  → structured objects

Instead of developers writing rules for every case, the model performs the mapping.

Example: Browser Automation

Browser environments contain strong structure: HTML, DOM hierarchy, form semantics, navigation elements.

A login page may contain many elements, but an automation system only needs a small representation describing how to interact with it.

Environment

DOM
HTML elements
form fields
buttons

System Representation

{
  "username_field": "...",
  "password_field": "...",
  "submit_action": "..."
}

DOM → login representation

Once that representation exists, the automation system can proceed deterministically.

This is useful for several reasons.

First, it reduces the amount of prompt engineering required. The system does not need to encode the entire interaction path in the prompt. It only needs the model to map the environment into a representation the receiver can use.

Second, it takes advantage of structure the model already understands well. Language models have already seen large amounts of HTML, forms, labels, and structured formats like JSON. In many cases the model is better used as a translator between these representations than as an open-ended planner.

Third, it removes a large class of infrastructure that would otherwise need to be built and maintained manually. Without this pattern, the system often drifts toward brittle selectors, custom parsing logic, UI recreation layers, or prompt-heavy task logic. Those approaches usually create more maintenance and reliability problems later.

Used this way, the language model is not just answering a question. It is acting as an operator over a structured environment, and a single mapping step replaces the selectors, parsers, and task-specific prompts the system would otherwise have to maintain.
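As a rough sketch of the receiving side of this example, the code below checks a candidate login representation against the page before any automation runs. It assumes, purely for illustration, that each value in the representation is an element id; the sample page, the ids, and the function name check_login_representation are hypothetical.

```python
import json
from html.parser import HTMLParser
from typing import Optional

REQUIRED_KEYS = ("username_field", "password_field", "submit_action")


class IdCollector(HTMLParser):
    """Collects every id attribute that appears in the page."""

    def __init__(self) -> None:
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def check_login_representation(html: str, candidate_json: str) -> Optional[dict]:
    """Return the parsed representation if every required key names an
    element id that exists in the page; otherwise return None."""
    try:
        rep = json.loads(candidate_json)
    except json.JSONDecodeError:
        return None
    if not isinstance(rep, dict):
        return None
    collector = IdCollector()
    collector.feed(html)
    for key in REQUIRED_KEYS:
        value = rep.get(key)
        if not isinstance(value, str) or value not in collector.ids:
            return None
    return rep


# Hypothetical page and model output, for illustration only.
PAGE = (
    '<form><input id="user"><input id="pass" type="password">'
    '<button id="go">Sign in</button></form>'
)
CANDIDATE = '{"username_field": "user", "password_field": "pass", "submit_action": "go"}'
print(check_login_representation(PAGE, CANDIDATE))
```

A representation that points at an element the page does not contain is rejected here, before the deterministic automation code ever uses it.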

Example: Document Processing

Many business systems depend on extracting structure from documents such as emails, invoices, support tickets, and reports.

These environments already contain useful information, but not in a form that software can directly operate on.

Environment

emails
support tickets
contracts
reports

System Representation

{
  "issue_type": "refund",
  "priority": "high",
  "requires_action": true
}

document → structured system object

Once that representation exists, the rest of the system can behave deterministically.

This is useful for several reasons.

First, it reduces the amount of prompt engineering required. The system no longer depends on asking for the perfect final answer in one shot. It only needs the model to map the document into a representation the system understands.

Second, it uses strengths the model already has. Language models are very good at reading natural language, identifying entities, inferring categories, and producing structured formats such as JSON.

Third, it avoids a large amount of brittle document-specific infrastructure. Without this pattern, systems often drift toward custom parsers, ad hoc regex pipelines, rule-heavy extraction logic, or endless prompt tweaking. Those approaches usually become difficult to maintain as document formats and workflows expand.

Used this way, the language model acts as an operator that converts messy business inputs into stable system objects.
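One way to make the "stable system object" concrete is to parse the model's JSON into a typed record. The sketch below uses the three fields from the example above; the class name TicketRecord and the allowed values for issue_type and priority are assumptions made for illustration.

```python
import json
from dataclasses import dataclass

# Allowed values are assumptions for this sketch, not taken from a real system.
ISSUE_TYPES = {"refund", "billing", "technical"}
PRIORITIES = {"low", "medium", "high"}


@dataclass(frozen=True)
class TicketRecord:
    issue_type: str
    priority: str
    requires_action: bool


def parse_ticket(raw: str) -> TicketRecord:
    """Turn model output into a TicketRecord, raising ValueError on any mismatch."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != {"issue_type", "priority", "requires_action"}:
        raise ValueError("unexpected or missing fields")
    issue_type = data["issue_type"]
    priority = data["priority"]
    requires_action = data["requires_action"]
    if not isinstance(issue_type, str) or issue_type not in ISSUE_TYPES:
        raise ValueError("unknown issue_type")
    if not isinstance(priority, str) or priority not in PRIORITIES:
        raise ValueError("unknown priority")
    if not isinstance(requires_action, bool):
        raise ValueError("requires_action must be true or false")
    return TicketRecord(issue_type, priority, requires_action)


record = parse_ticket(
    '{"issue_type": "refund", "priority": "high", "requires_action": true}'
)
print(record)
```

Downstream code can then depend on TicketRecord rather than on raw model text, which keeps the probabilistic step isolated in one place.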

Example: Log Analysis

Operational systems generate large volumes of raw signals: logs, stack traces, monitoring events, failure reports.

These signals contain useful information, but they are often too noisy or unstructured for downstream systems to use directly.

Environment

system logs
error traces
monitoring events

System Representation

{
  "failure_category": "database_timeout",
  "component": "payments-service",
  "severity": "warning"
}

logs → diagnostic representation

Once that representation exists, the system can classify, route, alert, or recover in more deterministic ways.

This is useful for several reasons.

First, it reduces the need for hand-built classification logic. Instead of maintaining large rule sets for every possible log format, the system can use the model to map diverse raw signals into a stable internal representation.

Second, it leverages knowledge the model already has. Language models have seen many examples of stack traces, log formats, service names, and operational terminology, which makes them good at identifying likely categories and structure.

Third, it removes a class of infrastructure that often becomes brittle over time. Without this pattern, teams usually end up with fragile log parsers, regex-heavy alerting systems, and inconsistent incident routing behavior. Those systems become increasingly hard to maintain as services and failure modes grow.

Used this way, the language model acts as an operator that converts noisy operational data into a representation the system can reason about.
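Once a diagnostic representation like the one above exists, routing can be a plain table lookup. The sketch below is illustrative only: the severity levels, action names, owner names, and fallback queue are all hypothetical.

```python
from typing import Tuple

# Hypothetical routing tables for this sketch.
SEVERITY_ACTION = {"info": "record", "warning": "ticket", "critical": "page"}
CATEGORY_OWNER = {
    "database_timeout": "database-team",
    "out_of_memory": "platform-team",
}
FALLBACK = ("ticket", "triage-queue")


def route(diagnostic: dict) -> Tuple[str, str]:
    """Map a diagnostic representation to an (action, owner) pair."""
    severity = diagnostic.get("severity")
    category = diagnostic.get("failure_category")
    if not isinstance(severity, str) or severity not in SEVERITY_ACTION:
        # Unknown severity: hand off to human triage rather than guess.
        return FALLBACK
    owner = "triage-queue"
    if isinstance(category, str):
        owner = CATEGORY_OWNER.get(category, "triage-queue")
    return (SEVERITY_ACTION[severity], owner)


diagnostic = {
    "failure_category": "database_timeout",
    "component": "payments-service",
    "severity": "warning",
}
print(route(diagnostic))
```

The routing function contains no model calls, so the same diagnostic always produces the same decision.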

Why This Is Useful

Representation mapping allows language models to interact with deterministic software systems. Instead of asking the model to solve the entire problem, the system asks the model to perform a translation step.

environment → model → representation → system logic

Once the correct representation exists, deterministic software can take over.
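The four stages in the diagram can be written as one small function. This is a minimal sketch: run_pipeline and its parameters are hypothetical names, and the stub model stands in for a real model call.

```python
from typing import Callable, Optional, TypeVar

Env = TypeVar("Env")
Rep = TypeVar("Rep")
Out = TypeVar("Out")


def run_pipeline(
    environment: Env,
    model: Callable[[Env], str],
    parse: Callable[[str], Optional[Rep]],
    system_logic: Callable[[Rep], Out],
    fallback: Out,
) -> Out:
    """environment -> model -> representation -> system logic."""
    raw = model(environment)      # probabilistic translation step
    representation = parse(raw)   # None means the candidate was rejected
    if representation is None:
        return fallback
    return system_logic(representation)  # deterministic from here on


# Hypothetical stages, for illustration only.
def stub_model(text: str) -> str:
    return "refund" if "refund" in text.lower() else "something else"


def parse_category(raw: str) -> Optional[str]:
    return raw if raw in {"refund", "billing"} else None


def queue_for(category: str) -> str:
    return {"refund": "payments-queue", "billing": "billing-queue"}[category]


print(run_pipeline("I want a refund", stub_model, parse_category, queue_for, "manual-review"))
```

Only the model stage is probabilistic; parse and system_logic are ordinary functions that can be unit tested on their own.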

Representation Mapping vs Domain Operators

Representation mapping and domain operators are closely related but describe different layers of the system.

Domain Operators

Domain operators describe the conceptual role of the model inside a system. The model is treated as an operator that transforms one domain into another.

⊙ : A → B

Architectural view

Representation Mapping

Representation mapping describes what the operator is actually doing: the transformation between specific representations.

HTML → spec
docs → object
logs → state

Implementation view

Relationship to Deterministic Boundaries

Representation mapping produces candidate representations. These representations may still contain errors or inconsistencies. Deterministic boundaries ensure that only valid representations enter the system.

Representation mapping creates structure. Deterministic boundaries verify that structure.
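A deterministic boundary can be as simple as a function that lists everything wrong with a candidate. The sketch below assumes a small hand-written schema format (field name mapped to an expected type and an optional set of allowed values); the schema, the severity levels, and the function name violations are hypothetical.

```python
from typing import Any, Dict, FrozenSet, List, Optional, Tuple

# Schema format (an assumption for this sketch):
# field name -> (expected type, allowed values or None for "any value")
Schema = Dict[str, Tuple[type, Optional[FrozenSet[Any]]]]

LOG_SCHEMA: Schema = {
    "failure_category": (str, None),
    "component": (str, None),
    "severity": (str, frozenset({"info", "warning", "critical"})),
}


def violations(candidate: Any, schema: Schema) -> List[str]:
    """Return a list of problems; an empty list means the candidate may enter the system."""
    if not isinstance(candidate, dict):
        return ["candidate is not an object"]
    problems = []
    for field, (expected_type, allowed) in schema.items():
        if field not in candidate:
            problems.append(f"missing field: {field}")
            continue
        value = candidate[field]
        # Compare exact types, because bool is a subclass of int in Python.
        if type(value) is not expected_type:
            problems.append(f"wrong type for {field}")
        elif allowed is not None and value not in allowed:
            problems.append(f"unexpected value for {field}: {value!r}")
    for field in candidate:
        if field not in schema:
            problems.append(f"unknown field: {field}")
    return problems


candidate = {
    "failure_category": "database_timeout",
    "component": "payments-service",
    "severity": "warning",
}
print(violations(candidate, LOG_SCHEMA))
```

Returning the full list of problems, rather than stopping at the first one, makes rejected candidates easier to log and debug.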

Summary

Representation mapping is the process of transforming information from one representation into another that software systems can use.

Language models are particularly well suited for this task because they have learned relationships across many forms of structured and semi-structured data.

Instead of producing final answers, the model acts as a translator between environments and systems.

environment → model → representation → system

Once the representation has been checked and accepted, deterministic software can take over.