How Coline Code Works

Coline Team

A deep dive into how we built a full cloud IDE with on-demand compute, persistent workspaces, and an AI pair programmer inside a productivity platform.

The Problem

We wanted users to be able to write and run code inside Coline. Not a code block in a note. A real development environment: file system, terminal, Git, dependency management, the works. And it had to work entirely in the browser with zero local setup.

How It's Structured

Coline Code has three layers: a web client (the editor), a control plane (Next.js API routes that handle orchestration and billing), and a runtime plane (Docker containers on EC2 where user code actually runs).

Splitting control from compute was the first real decision. It means if a runtime crashes, we still know the full state of the workspace and can spin up a replacement. It also means we can scale compute independently. And because the control plane is just API routes on our existing Next.js deployment, we shipped the first version without building any new infrastructure for orchestration.

Giving Users a Machine

When you open a workspace, we provision an EC2 instance, boot a Docker container with our runtime image, and you are coding in under 60 seconds. The interesting decision here is that we pull the runtime image fresh from our container registry on every boot instead of baking it into a machine image.

That costs us about 15 seconds of startup time, but it makes deploys trivial. We push a new Docker image and every new workspace gets it. No image rebuilds, no version coordination, no rollout process. For a runtime we ship changes to multiple times a week, the tradeoff is obvious.

Idle instances get stopped automatically by a cron job. Stopped instances keep their disk, so resuming is fast. When a workspace is fully done, the instance terminates and the workspace lives on as a snapshot.
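The core decision the idle-stop cron makes is simple: which running instances have gone quiet for too long. A minimal sketch of that check, assuming an illustrative `Instance` shape and a 30-minute threshold (neither is from the post):

```typescript
// Sketch of the idle-stop check a cron job might run. The Instance type
// and the 30-minute threshold are illustrative assumptions, not Coline's
// actual values.
interface Instance {
  id: string;
  state: "running" | "stopped" | "terminated";
  lastActivityAt: number; // epoch ms of the last file edit or terminal input
}

const IDLE_THRESHOLD_MS = 30 * 60 * 1000; // assumed: 30 minutes of inactivity

function instancesToStop(instances: Instance[], now: number): string[] {
  return instances
    .filter((i) => i.state === "running")
    .filter((i) => now - i.lastActivityAt > IDLE_THRESHOLD_MS)
    .map((i) => i.id);
}
```

Stopped instances are skipped entirely, since they keep their disk and cost little until the workspace is terminated.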

The Runtime

Each instance runs a Node.js server inside Docker. It handles everything that happens inside a workspace: the file system, terminals, Git, and command execution.

Terminals are real PTY processes, not emulated. We use node-pty and bridge I/O over WebSocket. Multiple clients can attach to the same session, which matters when Kairo (our AI agent) needs to watch command output alongside the user.
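The multi-client attach boils down to a fan-out: every chunk the PTY emits is broadcast to every attached sink. A stripped-down sketch of that logic, with plain callbacks standing in for the real WebSocket and node-pty endpoints:

```typescript
// Minimal sketch of a terminal session that fans PTY output out to every
// attached client (an editor tab, Kairo, etc.). In the real runtime the
// sink would be a WebSocket and the source a node-pty process; here both
// are plain callbacks so the fan-out logic stands alone.
type Sink = (chunk: string) => void;

class TerminalSession {
  private clients = new Map<string, Sink>();

  attach(clientId: string, sink: Sink): void {
    this.clients.set(clientId, sink);
  }

  detach(clientId: string): void {
    this.clients.delete(clientId);
  }

  // Called with each chunk the PTY emits; every attached client sees it.
  onPtyData(chunk: string): void {
    for (const sink of this.clients.values()) sink(chunk);
  }
}
```

Because the session, not the client connection, owns the PTY, a client disconnecting does not kill the shell; it just stops receiving output.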

For Git, we just shell out to the CLI. We thought about using a library like isomorphic-git, but shelling out to the real thing turned out to be simpler, more correct, and easier to debug. For clones we use partial clones with blob filters so you are not downloading the entire history of a large repo.
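A blob-filtered partial clone is a standard git feature (`--filter=blob:none` fetches commits and trees up front and downloads file contents lazily on checkout). A sketch of how a runtime might assemble the arguments before shelling out; the URL and destination are placeholders:

```typescript
// Sketch of assembling a blob-filtered clone command before handing it to
// the git CLI. --filter=blob:none is real git; the repo URL and destination
// here are placeholders.
function partialCloneArgs(repoUrl: string, dest: string): string[] {
  return [
    "clone",
    "--filter=blob:none", // partial clone: fetch blobs lazily, on checkout
    repoUrl,
    dest,
  ];
}
```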

Keeping Files Alive

The hard problem in a cloud IDE is not running code. It is making sure your files survive when compute stops. We solve this with a sync queue that runs on the runtime itself.

Every file change queues the workspace for a snapshot. But we do not sync immediately. The queue is debounced: it waits 30 seconds after the last change, with a hard cap at 2 minutes. So during rapid edits, we are not constantly uploading. But you also never lose more than a couple minutes of work.
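The debounce policy above can be expressed as two timestamps: flush 30 seconds after the last change, but never more than 2 minutes after the first unsynced change. A sketch that computes the flush deadline from those timestamps (the class name and timer-driving details are assumptions; the real queue presumably drives a timer off this deadline):

```typescript
// Sketch of the snapshot debounce policy: flush debounceMs after the last
// change, capped at maxWaitMs after the first unsynced change. Computing
// the deadline from timestamps keeps the policy pure and testable; a real
// queue would arm a timer for nextFlushAt().
class SnapshotDebouncer {
  private firstChangeAt: number | null = null;
  private lastChangeAt: number | null = null;

  constructor(
    private readonly debounceMs = 30_000, // 30s after the last change
    private readonly maxWaitMs = 120_000, // hard cap: 2 minutes
  ) {}

  recordChange(now: number): void {
    if (this.firstChangeAt === null) this.firstChangeAt = now;
    this.lastChangeAt = now;
  }

  // When the pending snapshot should be taken, or null if nothing is pending.
  nextFlushAt(): number | null {
    if (this.firstChangeAt === null || this.lastChangeAt === null) return null;
    return Math.min(
      this.lastChangeAt + this.debounceMs,
      this.firstChangeAt + this.maxWaitMs,
    );
  }

  markFlushed(): void {
    this.firstChangeAt = null;
    this.lastChangeAt = null;
  }
}
```

During a burst of edits the deadline keeps sliding forward with each change until the 2-minute cap wins, which is exactly the "never lose more than a couple minutes of work" guarantee.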

Snapshots go to Cloudflare R2 as compressed archives. When you reopen a workspace later, we provision a fresh instance and restore from the latest snapshot.

That restore flow is where we learned the most. The naive version streamed the archive directly from R2 to the new instance. It failed constantly. EC2 says an instance is "running" well before it can actually serve requests, and streams break in creative ways when one end just booted. We ended up doing two things: exponential-backoff health checks before sending any data, and fully buffering the archive in memory instead of streaming. Not elegant, but it works reliably, and reliability beats elegance here.
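The restore ordering described above can be sketched as: buffer the whole archive first, poll the runtime with exponential backoff until it answers, and only then send. The probe, fetch, and send functions here are injected placeholders, and the attempt counts and delays are illustrative:

```typescript
// Sketch of the restore ordering: poll the new instance's health with
// exponential backoff, then send a fully buffered archive. probe, fetch,
// and send are injected placeholders; the limits are illustrative.
async function waitUntilHealthy(
  probe: () => Promise<boolean>,
  maxAttempts = 6,
  baseDelayMs = 500,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((r) => setTimeout(r, ms)),
): Promise<void> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (await probe()) return;
    await sleep(baseDelayMs * 2 ** attempt); // 500ms, 1s, 2s, 4s, ...
  }
  throw new Error("runtime never became healthy");
}

async function restoreWorkspace(
  fetchArchive: () => Promise<Uint8Array>, // e.g. download from R2
  probe: () => Promise<boolean>,
  send: (archive: Uint8Array) => Promise<void>,
): Promise<void> {
  // Buffer the whole archive up front; never stream into a cold instance.
  const archive = await fetchArchive();
  await waitUntilHealthy(probe);
  await send(archive);
}
```

Buffering trades memory for a property streams cannot give you here: if the send fails, you still hold the complete archive and can simply retry.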

We also built a Merkle tree system that fingerprints all source files in a workspace. This lets us detect whether anything changed since the last sync without downloading the full snapshot, which saves a lot of unnecessary transfers.
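The cheap "did anything change" check falls out of comparing root hashes. A toy version of that fingerprinting, flattened to a single level (a production Merkle tree would keep interior nodes per directory so changed subtrees can be localized; this sketch only shows the root comparison):

```typescript
import { createHash } from "node:crypto";

// Toy fingerprint over workspace files: hash each file's path + contents,
// then hash the sorted leaf hashes into one root. Equal roots mean nothing
// changed since the last sync, without downloading the snapshot. A real
// Merkle tree would also keep per-directory nodes to localize changes.
type FileMap = Record<string, string>; // path -> contents

function sha256(data: string): string {
  return createHash("sha256").update(data).digest("hex");
}

function workspaceRoot(files: FileMap): string {
  const leaves = Object.keys(files)
    .sort() // deterministic order regardless of enumeration
    .map((path) => sha256(`${path}\0${files[path]}`));
  return sha256(leaves.join(""));
}
```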

Kairo Code

Every workspace comes with Kairo, our AI agent. It is not a chatbot in a sidebar. Kairo reads your files, writes code, runs terminal commands, and navigates the codebase using LSP (go to definition, find references, hover for type info).

The most important design decision we made: Kairo uses the exact same API as the editor. When it reads a file, it hits the same endpoint. When it runs tests, it uses the same command execution. There is no separate environment, no shadow filesystem. Kairo sees exactly what you see, including the effects of its own changes.

This sounds obvious in retrospect, but early on we considered giving the agent its own sandboxed context for safety. The problem is that two views of the same workspace inevitably diverge. You end up with a whole class of bugs where something works for the AI but not for the user, or vice versa. Sharing the same runtime eliminated all of that.
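The shared-runtime idea reduces to both clients consuming one interface. A minimal sketch, with a hypothetical `WorkspaceApi` and an in-memory backend standing in for the real runtime endpoints:

```typescript
// Sketch of "one API for both": editor and agent are handed the same
// WorkspaceApi, so a write from either is immediately visible to the
// other. WorkspaceApi and the in-memory backend are illustrative.
interface WorkspaceApi {
  readFile(path: string): string | undefined;
  writeFile(path: string, contents: string): void;
}

class InMemoryWorkspace implements WorkspaceApi {
  private files = new Map<string, string>();
  readFile(path: string): string | undefined {
    return this.files.get(path);
  }
  writeFile(path: string, contents: string): void {
    this.files.set(path, contents);
  }
}

// Both "clients" are just consumers of the same instance. The whole class
// of it-works-for-the-AI-but-not-the-user bugs disappears because there is
// only one state to diverge from.
function makeClients(api: WorkspaceApi) {
  return { editor: api, agent: api };
}
```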

Beyond standard file and grep search, Kairo can do semantic search across the codebase. Ask "where does the app handle payments" and it finds the relevant files by meaning, not just by matching the word "payment" in a filename. The index is built from embeddings over source files, and it updates as the workspace changes.
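At retrieval time, embedding-based search usually reduces to ranking files by vector similarity against the query embedding. A toy sketch of that step with cosine similarity; the real embeddings come from a model, and the index shape here is an assumption:

```typescript
// Toy retrieval step for semantic search: rank files by cosine similarity
// between a query embedding and per-file embeddings. Real embeddings come
// from a model; here they are plain number vectors, and the index shape
// is illustrative.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function semanticSearch(
  query: number[],
  index: Array<{ path: string; embedding: number[] }>,
  topK = 3,
): string[] {
  return index
    .map((e) => ({ path: e.path, score: cosine(query, e.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((e) => e.path);
}
```

This is what lets "where does the app handle payments" surface a file whose name never contains the word "payment".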

Kairo also has persistent memory. When you tell it something about your project ("we use Zustand", "prefer single quotes"), it stores that and retrieves it in future sessions. The agent gets better the more you use it without you needing to repeat yourself.

All edits show up as diffs you review before applying. Destructive actions require confirmation. The agent never force-pushes and checks for secrets before committing.

The Editor

The editor is our own component. File tree, multi-tab editing, built-in terminal, diff viewer for reviewing Kairo's changes. There are two layouts: a full-width editor mode for focused coding, and an agent view where the editor and Kairo sit side by side and the agent can highlight files and diffs inline as it works.

You can start a workspace by cloning from GitHub, importing from Coline Files, or uploading a local folder. Every path converges to the same workspace format, so the experience is the same regardless of how you got there.

What We Learned

Do not stream between things that might not be ready. Streaming archives between R2 and a freshly booted instance failed about 20% of the time. Buffering the payload before sending fixed it. Sometimes the boring solution is the right one.

"Running" is not "ready." EC2 reports instances as running before they can serve requests. Without health checks with backoff, about 15% of cold workspace restores failed. We now retry with exponential backoff and re-verify health between attempts.

Optimize for deploy speed, not boot speed. Baking a custom machine image would shave 15 seconds off startup but make every deploy a multi-step process. Pulling a fresh Docker image on boot means shipping a runtime update is one command. When you are iterating on the runtime daily, this pays for itself immediately.

Give the AI the same interface as the user. No special endpoints, no separate state. Same API, same filesystem, same terminal. Fewer moving parts, fewer bugs where the two views of the workspace disagree.

What's Next

We are building Coline in Coline Code. Right now we are working on warm container pools for faster cold starts, collaborative editing, and expanding Kairo's LSP coverage. The compute layer is designed to be swappable, so we can move to a different runtime infrastructure without touching the client or the control plane.

Try Coline Code

Clone a repo, spin up cloud compute, and start coding with Kairo right in your browser. No local setup required.

No credit card required.

Written by the Coline Team