to2d

A space for ideas, notes, and ongoing work.

Sequential Keyboard Actions

Nov 2025

BrowserAgent Update

One of the biggest reliability gaps in browser agents isn't navigation; it's keyboard behavior.

The computer-use model emits sequences like:

Down Down Down
Tab*5 Enter
Tab Tab Enter

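The agent previously treated every key string as a single combination (keys pressed together), so sequences like these were never pressed one after another, which caused form-filling failures.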

I extended the agent to support:

  • space-separated sequential keypresses
  • repetition syntax (Tab*5)
  • automatic detection between sequences and combos
  • clearer validation and error reporting
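As a rough illustration of the parsing described above, here is a minimal sketch, not the agent's actual code. The function name `parse_keys` is hypothetical, and the sketch assumes combos are written with `+` (for example `Control+A`) and contain no spaces, which the post does not specify.

```python
import re
from typing import List

# Hypothetical sketch; names and the '+' combo convention are assumptions.
# Matches repetition tokens such as "Tab*5".
_REPEAT = re.compile(r"^(?P<key>[^*\s]+)\*(?P<count>\d+)$")


def parse_keys(spec: str) -> List[str]:
    """Expand a key spec into an ordered list of presses.

    Space-separated tokens are pressed one after another.
    "Tab*5" expands to five "Tab" presses.
    A token without "*" (a plain key or a combo like "Control+A")
    is kept whole as a single press.
    """
    tokens = spec.split()
    if not tokens:
        raise ValueError("empty key spec")

    presses: List[str] = []
    for token in tokens:
        match = _REPEAT.match(token)
        if match:
            count = int(match.group("count"))
            if count < 1:
                raise ValueError(f"repeat count must be >= 1 in {token!r}")
            presses.extend([match.group("key")] * count)
        elif "*" in token:
            # Reject things like "Tab*" or "Tab*x" with a clear message.
            raise ValueError(f"malformed repetition {token!r}")
        else:
            presses.append(token)
    return presses


def is_sequence(spec: str) -> bool:
    """True when the spec expands to more than one press."""
    return len(parse_keys(spec)) > 1
```

With this sketch, `parse_keys("Tab*5 Enter")` yields five `Tab` presses followed by `Enter`, while `parse_keys("Control+A")` stays a single combo. Each element could then be handed, in order, to whatever single-press call the browser driver exposes.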

This closes a major ergonomics gap for complex form automation.
