Generative In C
PRs 21-47: Using musical output as a guide
Terry Riley is one of my heroes.
In 1964, he kicked off an entire movement in music with In C, a composition made up of 53 notated cells that an unspecified number of musicians loop as desired. It is up to the choices of individual performers to guide the music through its crescendos and quiet moments. The result is a blanket of interlocking patterns. By design, no two performances are ever quite the same.
While it is often held up as the birth of minimalism, I also consider it the beginning of modern generative music. That is, while other composers had experimented with chance and randomness — the John Cages of the world — In C flipped the script to replace rolls of the dice with human judgement, just beyond the composer’s control. It is a framework for a piece. You always know the essence of what you will hear, but the specifics will always vary based on the conditions of a particular performance.
The rules outlined in the score are simple:
All 53 patterns should be played in order. It’s ok (and desirable) to occasionally drop out.
Performers should stay within 2-3 patterns of each other.
Be sure to listen carefully to one another.
Besides that, it mostly gives the ensemble license to experiment and adapt it as needed.
In C has been a source of inspiration, personally, for a very long time. I have had the joy of working on several generative music systems over the years, and I always use the piece as a measuring stick: Could this system perform In C? When I asked myself whether Fugue could do it, the answer was an emphatic “No.” So I got to work.
But First, Dronejo
I started Fugue because, after 20 years of composing generative music, I was weary from not having the expressive tools I wanted. My need became clear after starting what should have been a simple project two years ago: I wanted to create a cheap (ideally free) solution for streaming generative music to YouTube. I struggled with this for months, trying to find an elegant solution. While I had worked on similar generative live streams before, keeping it affordable seemed unachievable. Eventually, I created a demo that uses ffmpeg to combine a looping video and audio from a simple composition written in ChucK. It’s been running on a Raspberry Pi in my living room for nearly a year (minus the occasional downtime when the power goes out or my ISP does maintenance). I often call it Dronejo, after the dog named Banjo whose life it celebrates.
But I wasn’t happy with it. It was not the reusable system I had hoped for. And while ChucK remains my favorite domain-specific programming language for music, it, like all similar tools, starts fighting you when you want to do simple things like make network requests or use common design patterns. I have wanted to replace the implementation since it started streaming, so I asked myself if Fugue could at least replace the music generation. The answer was also “No” but a much less definitive one. I decided to start there.
Beyond Skeuomorphic AI
In an earlier article, I caught a glimpse of an agentic workflow for composition that felt magical, and it started an avenue of thought that I am still exploring. Today, the software profession uses AI coding tools to do the sort of work we already do: build applications and APIs for humans to consume. This reminds me of the early iOS apps — many of which I loved — that mimicked real-world actions and objects. The voice memo app had a UI comprised of a giant, vintage-looking microphone; users took notes on what looked like a miniature yellow legal pad; a newsstand app had wooden shelves with leaning magazine volumes. Then, as now, we started exploring a new paradigm by doing what we already knew how to do. While I am sometimes nostalgic for those tactile interfaces, such skeuomorphism demonstrates a collective lack of imagination.
It is too early to predict where AI-centric development will lead, but our days of endless clicking and tapping and swiping could be thankfully numbered. If that’s true, the nature of what a software product is must change fundamentally. For something like Fugue, the target would no longer be an interface but whatever sound a user imagines.
Using music as the guide
A Sleeping Dog’s Lie (aka, Dronejo) provided that specific audio target to work toward, driving all feature development and bug fixes. A laundry list of requirements emerged almost immediately: fixing how channels were combined at the end of the audio chain (they weren’t), reworking the topological sort of the audio graph (use a variant of Kahn’s algorithm that allows delay cycles), and abolishing musical scale and mode definitions (just let that be defined by a sequence of notes). But the most consequential changes were around workflow:
A REPL is convenient for human exploration, but a CLI is more convenient for an agent, so I got rid of the REPL. There is another project here since Fugue needs to be able to trigger long-running audio tasks while facilitating communication between human and agent actors. Really, the CLI needs to be an interface for a separate daemon that administers the audio graph.
An agentic system needs guardrails even more than a team of human developers. Test automation and build automation were overdue.
To truly make content the center of development, I enabled hot reloading for the CLI and the MCP server. This set up a (nearly) seamless loop of listen → prompt → listen. As a future project, I also defined a handful of tasks to bring the agent into the listening part of the loop. Because, well, it can’t hear the output that I hear at all.
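The topological sort change mentioned above, Kahn’s algorithm adapted to tolerate delay cycles, can be sketched roughly as follows. This is a minimal illustration and not Fugue’s actual code: the function name, the (source, destination) edge tuples, and the `delay_nodes` set are all assumptions made for the example. The idea is that an edge into a delay node does not count as a dependency, because a delay emits samples it received on an earlier block.

```python
from collections import deque


def processing_order(nodes, edges, delay_nodes):
    """Return an order in which audio nodes can be processed.

    Plain Kahn's algorithm, except that edges *into* a delay node are
    ignored when counting dependencies (hypothetical convention for this
    sketch). A cycle that does not pass through a delay is still an error.
    """
    indegree = {n: 0 for n in nodes}
    downstream = {n: [] for n in nodes}
    for src, dst in edges:
        if dst in delay_nodes:
            continue  # feedback edge: the delay reads this input a block later
        indegree[dst] += 1
        downstream[src].append(dst)

    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in downstream[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle with no delay in it")
    return order


# A feedback loop: osc -> mix -> delay -> mix, with mix also feeding out.
nodes = ["osc", "mix", "delay", "out"]
edges = [("osc", "mix"), ("mix", "delay"), ("delay", "mix"), ("mix", "out")]
print(processing_order(nodes, edges, delay_nodes={"delay"}))
# ['osc', 'delay', 'mix', 'out']
```

Because the delay is scheduled before the node that feeds it, a real engine would also need the delay to emit its buffered samples first and store its new input afterward. That detail is left out here.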
Developing Dronejo drove a small experiment that was quickly validated. Moving on to In C, the laundry list became much, much longer. I needed:
Reusable sub-graphs to represent instruments, effects chains, or business logic. I called these developments, after the exploratory section of a fugue.
A code module, allowing arbitrary JavaScript to run within the running invention. It has access to the complete audio graph and can make network calls. When running in the browser, it has full browser access.
Reverb — I punted on this for Dronejo because reverb is a crutch, not to mention a dark art that I am not an expert in. For the first, and unlikely last, reverb module, I opted for the GVerb algorithm, which is computationally cheap and sounds pretty good.
A ton of small things like the ability to divide a clock into faster and slower subdivisions, more channels in the mixer module, arbitrary files as assets, a sequencer module that could hold a bank of cells, the removal of all locks on the audio thread, and many others.
A set of developments representing instruments to be used in the piece.
A representation of the actual score as JSON.
And the big one: the agent module that reaches out to Claude, Codex, or any available agent to run an arbitrary prompt with context from the running invention.
The In C Runtime
While playing, the agent module is tasked with keeping the music interesting while following the rules of the score. Its prompt is pretty straightforward:
You are the conductor for a performance of Terry Riley's _In C_. Your primary job is to decide when each voice moves from its current cell to the next one. Follow the score directions: every performer plays the 53 patterns in order, repeats each pattern freely, listens to the ensemble, stays within about 2 or 3 patterns of the others, and waits on pattern 53 until the ensemble arrives.

You are invoked on a slow gate, roughly every few measures. Each invocation receives a structured snapshot:

- `mel_1` through `mel_13`: each voice's `current_cell`, `loop_count`, `total_cells`, and active `steps`. Cell indexes are 0-based, so pattern 53 is `current_cell = 52`.
- `mixer_levels`: current channel levels. Mixer channel 0 is the pulse; channels 1 through 13 are voices 1 through 13.
- `reverb`: current global reverb wet value.
- Graph and history context when available.

## Conducting rules

- Only advance a voice by writing `1` to its `advance` field. Write `0` when it should stay on its current cell.
- Never skip cells during normal conducting.
- Do not advance a voice until it has repeated the current cell long enough to interlock with the ensemble and create an interesting pattern.
- As a baseline, short patterns should repeat at least 4 times and longer one at least 2 times. Often longer. Use your judgment.
- Keep the ensemble within 2 or 3 cells. Do not advance a voice that is already 3 cells ahead of the slowest voice.
- Except for the first and last patterns, avoid having all voices on the same cell. Stagger them to create interlocking patterns.
- Let some voices rest and listen by reducing their mixer levels, but avoid abrupt level jumps. Change any one level by at most about 0.15 per response.
- Shape broad crescendos and diminuendos together. Reverb should stay in `[0.05, 0.45]` and move by at most about 0.05 per response.
- At pattern 53, hold each voice there until all voices have arrived. Then make a few broad swells.

Because this example is unending, after the hold you may allow the runtime fallback or direct recovery controls to restart the cycle, but do not skip pattern 53's gathering behavior.
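The numeric limits in that prompt are concrete enough that they could also be checked in code before a response touches the audio graph, if stricter behavior were ever wanted. The sketch below is hypothetical. The response shape (`advance`, `levels`, and `reverb` keys), the `Voice` class, and the assumed 0 to 1 level range are inventions for illustration; only the limits themselves are taken from the prompt.

```python
from dataclasses import dataclass

LAST_CELL = 52            # pattern 53, since cell indexes are 0-based
MAX_LEAD = 3              # cells a voice may sit ahead of the slowest voice
MAX_LEVEL_STEP = 0.15     # largest mixer change per response
MAX_REVERB_STEP = 0.05    # largest reverb change per response
REVERB_MIN, REVERB_MAX = 0.05, 0.45


@dataclass
class Voice:
    """Hypothetical stand-in for one melodic voice's sequencer state."""
    current_cell: int = 0
    loop_count: int = 0


def clamp(value, low, high):
    return max(low, min(high, value))


def limit_step(current, requested, max_step):
    """Move toward `requested`, but by no more than `max_step`."""
    return clamp(requested, current - max_step, current + max_step)


def apply_response(voices, levels, reverb, response):
    """Apply an agent response while enforcing the prompt's numeric limits.

    `response` is assumed to look like
    {"advance": {"mel_1": 1}, "levels": {1: 0.6}, "reverb": 0.3}.
    Voices are updated in place; the new (levels, reverb) are returned.
    """
    slowest = min(v.current_cell for v in voices.values())
    advance = response.get("advance", {})
    for name, voice in voices.items():
        lead = voice.current_cell - slowest
        if (advance.get(name, 0) == 1
                and voice.current_cell < LAST_CELL
                and lead < MAX_LEAD):
            voice.current_cell += 1   # one cell at a time, never a skip
            voice.loop_count = 0

    wanted = response.get("levels", {})
    new_levels = {
        ch: clamp(limit_step(cur, wanted.get(ch, cur), MAX_LEVEL_STEP), 0.0, 1.0)
        for ch, cur in levels.items()
    }
    new_reverb = clamp(
        limit_step(reverb, response.get("reverb", reverb), MAX_REVERB_STEP),
        REVERB_MIN, REVERB_MAX,
    )
    return new_levels, new_reverb
```

A guard like this trades away some of the surprise that an unconstrained agent brings, so whether to apply it at all is an aesthetic choice as much as a technical one.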
While the prompt represents the spirit of the rules as outlined in the score, it is more of a translation of how they should be applied in Fugue. Since In C is over 60 years old, it is most certainly in the training data of the major models. I even asked GPT 5.5 for a brief history, followed by the question, “Do you know how much of this information was in your training data vs what you retrieved from an internet search?”
I did not use web search for that response. It was generated from the model’s internal knowledge, which comes from training on a mixture of licensed data, data created by human trainers, and publicly available text.
Even if an agent already knows the In C score by heart, it is still non-deterministic. There is no guarantee that it will follow the instructions exactly, or even at all. And really, that is what makes the result more interesting than a script using random number generation, which the score has as a fallback in case the agent is not available. In the recording, there is even one standout instance where the instructions were completely ignored. At the end, all instruments are supposed to hold on the last pattern before beginning again. For this performance, the agent decided not to, and at least one instrument had restarted and advanced to the third cell before all other instruments had reached the end of the piece. When the agent disregards the rules a little bit, though, the result is exciting. We may not want that when writing code, but it is delightful when creating music. In music, defied expectations are often what brings beauty.
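For contrast, a random-number fallback of the kind mentioned above might look something like the sketch below. It is an assumption-laden illustration, not the fallback that ships with the piece: the function name, the `(current_cell, loop_count)` tuples, and the 25% default chance are made up, and the minimum of 4 loops simply borrows the prompt’s baseline for short patterns.

```python
import random


def fallback_advances(voices, rng, min_loops=4, max_lead=3,
                      last_cell=52, chance=0.25):
    """Decide, without an agent, which voices advance on this gate.

    `voices` maps a voice name to (current_cell, loop_count).
    Returns a dict of name -> 1 (advance) or 0 (stay).
    """
    slowest = min(cell for cell, _ in voices.values())
    decisions = {}
    for name, (cell, loops) in voices.items():
        eligible = (
            loops >= min_loops            # repeated long enough
            and cell < last_cell          # hold on pattern 53
            and cell - slowest < max_lead # stay near the ensemble
        )
        decisions[name] = 1 if eligible and rng.random() < chance else 0
    return decisions


rng = random.Random(1964)  # seeded so a run can be reproduced
snapshot = {"mel_1": (0, 6), "mel_2": (3, 9), "mel_3": (1, 1)}
print(fallback_advances(snapshot, rng))
```

A script like this can never break its own rules, which is exactly the difference from the agent: `mel_2` is already three cells ahead and `mel_3` has not looped enough, so neither can advance no matter what the dice say.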
It could be more interesting
As it happens, the Bang on a Can All-Stars came to town this past week to play In C. Riley turned 90 last year, and the ensemble has been touring in celebration. With the music fresh in mind, it would have been foolish to miss hearing the band that created the definitive recording of In C perform it live in my hometown. It was also the composer’s hometown in the early 60s, in the years just before the piece was composed and premiered at the San Francisco Tape Music Center in 1964.
Rather than simply let the wall of ecstatic sound engulf me for the duration as I normally would have, I listened with focus and attention. The reward was a mountain of feedback that I can use to improve this fledgling generative system.
In truth, the Fugue version of In C breaks a fundamental rule of the piece. Instead of letting individual performers control how their part is played, I have a centralized agent serving as sole conductor. Why not define a collection of agents, just like there is a collection of instruments? Each might have its own personality, guiding principles, and aesthetic lens. As part of each agent-performer’s response to a prompt, it could outline some of its future intent for the part (such as how long it thinks it will play the same cell or drop out), and other agents could let that inform their choices as part of the context.
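To make the idea of shared intent a little more concrete, here is one possible shape for it. Everything in this sketch is hypothetical, including the `Intent` class, its fields, and the instrument names; it only illustrates how each performer’s stated plan could be folded into the context that the other performers receive.

```python
from dataclasses import dataclass


@dataclass
class Intent:
    """What one agent-performer says it plans to do next (hypothetical)."""
    performer: str
    current_cell: int
    plan: str  # free text, as written by the agent


def context_for(performer, intents):
    """Build the text one performer sees about everyone else's plans."""
    others = [i for i in intents if i.performer != performer]
    return "\n".join(
        f"{i.performer} is on cell {i.current_cell} and intends to {i.plan}."
        for i in others
    )


intents = [
    Intent("marimba", 12, "stay on this cell for about 8 more loops"),
    Intent("clarinet", 10, "drop out and listen"),
]
print(context_for("marimba", intents))
# clarinet is on cell 10 and intends to drop out and listen.
```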
Similarly, the simplicity of the conductor agent’s prompt and many of the modules placed constraints I would like to lift: dynamics only change slowly in the generative version, and articulation does not change at all. Central to the beauty of the human performance was how suddenly changes to both dynamics and articulation occurred, providing emotional swells and novel texture.
Moreover, while the piece is trance-inducing for performer and listener alike, it is also grueling. Performers need to relax during performance. For the All-Stars, this manifested as performers periodically dropping out entirely to simply listen and plan or just chugging away on the rhythmic pulse. Also, sometimes it is technically very difficult to play a part offset only a sixteenth note from your neighbor. Players would sometimes pause, recalibrate, and re-enter in unison or on a simpler offset with a counterpart. Agents do not need to take any of these breaks or keep things simple, but at least partially adhering to the mechanics of humanity would help blunt some of the music’s already sharp edges.
It is worth noting that few of the human choices I observed are described as allowed behavior in the score. Of course, people often break the rules when it suits them, just as an agent does. It is a reminder, always needed, that giving up some control here is the point.
From a product perspective, there are some additional possibilities to explore. As I think through collaboration strategies, it is easy to imagine things like packs of instruments created by one user and dropped into another user’s invention. Similarly, agents could be defined with personae, principles, and values they use to perform and make choices.
Once multiple agents enter the picture, it becomes necessary to think about how they communicate and collaborate. In other words, the tool starts to become more of a harness. Think Pi or OpenCode, but for music.
Evolving goals
Between composing via agent mediated by MCP, using a known composition as a target while developing agentically, and thinking through multi-agent orchestration for Fugue, a new, harness-oriented direction is emerging.
Fugue itself will remain a core library for composing generative music. Around it, an ecosystem will grow that is much different than what I originally expected. Six months ago, I thought the goal was to build a suite of UI-based applications for building generative music systems. That was skeuomorphic thinking. Now, those are just part of the broader picture. Well-designed APIs, CLIs, MCP servers, and skills are the inputs for agent-centric applications. Those UIs will still be there to facilitate interaction and collaboration, but they may not be the primary mode of use. Perhaps as MCP Apps gain adoption, they will become the most common way to interact with Fugue’s output. It is hard to predict, but it seems like a fun idea.
So as always, lots more to do.
[Too many PRs to list this time, so the two highlights]
2026.5.21



