
Codegen

2024

My freshman summer I worked at a small company in San Francisco founded by former Palantir engineers. Our technology aimed to tackle large codebases and extract insights from them, the same way you would from a large industrial organization.

The idea is to model the codebase as one large graph, the same way Tree-sitter builds a syntax tree for a single file of code. Connect the functions across files, connect the imports, connect the references, and you can navigate your entire codebase.

One of the many cool things you can now do is carry out large-scale refactors with very little effort. Want to change a function? The graph knows which other functions reference it and which files will be affected, so the change can be applied everywhere it needs to be. This is what Codegen's GraphSitter does. It's a language server with a strong understanding of the codebase.
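To make the idea concrete, here is a minimal sketch in Python of a reference graph that answers "what is affected if I change this function?". This is not GraphSitter's actual API; the `CodeGraph` class, its method names, and the example function and file names are all made up for illustration.

```python
from collections import defaultdict


class CodeGraph:
    """Hypothetical toy model: functions are nodes, references are edges."""

    def __init__(self):
        self.file_of = {}                 # function name -> file that defines it
        self.callers = defaultdict(set)   # callee -> functions that reference it

    def add_function(self, name, file):
        self.file_of[name] = file

    def add_reference(self, caller, callee):
        self.callers[callee].add(caller)

    def affected_by(self, name):
        """Functions that reference `name`, directly or transitively."""
        seen = set()
        stack = [name]
        while stack:
            current = stack.pop()
            for caller in self.callers[current]:
                if caller not in seen:
                    seen.add(caller)
                    stack.append(caller)
        seen.discard(name)  # a function on a cycle should not list itself
        return seen

    def affected_files(self, name):
        """Files touched by a change to `name`, including its own file."""
        return {self.file_of[fn] for fn in self.affected_by(name) | {name}}


graph = CodeGraph()
graph.add_function("parse", "parser.py")
graph.add_function("load", "loader.py")
graph.add_function("main", "app.py")
graph.add_function("helper", "utils.py")
graph.add_reference("load", "parse")   # load() calls parse()
graph.add_reference("main", "load")    # main() calls load()
graph.add_reference("main", "helper")  # main() calls helper()

print(graph.affected_by("parse"))
print(graph.affected_files("parse"))
```

Storing the edges in reverse (callee to callers) is what makes the "who references this?" question cheap: a refactor only has to walk upward from the function being changed.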

Now add some language-model assistance to write your short, simple scripts and you have a very powerful tool. Ramp used this approach to fix a giant self-referencing loop in their codebase that had been incredibly hard to debug. With Codegen, the task became trivial.
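I can't share what Ramp's loop actually looked like, but once the codebase is a graph, finding a self-referencing loop is ordinary cycle detection. Below is a rough sketch using depth-first search; the `find_cycle` function and the tiny `calls` graph are hypothetical and not taken from Codegen or Ramp.

```python
def find_cycle(graph):
    """Return one cycle as a list of nodes (start repeated at the end), or None.

    `graph` maps each node to the nodes it references.
    """
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    path = []                      # nodes on the current DFS path, in order

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                # Reached a node already on the path: that closes a loop.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical reference graph: a -> b -> c -> a is the loop.
calls = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
print(find_cycle(calls))  # ['a', 'b', 'c', 'a']
```

The recursion keeps the sketch short, but on a very large graph it could hit Python's recursion limit, so a real tool would more likely use an explicit stack.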

But one challenge remained: understanding this graph. That's why I built the GraphVisualizer, a tool that works on top of GraphSitter to create visualizations. I did a lot of graph analysis to work out the ways in which the graph could be modified, drawing on my limited but solid prior experience with graph theory.

The exact figure is hard to pinpoint, but Ramp's estimated savings were significant, thanks to the reduced effort of modifying their codebase. Notion, Duolingo, and others also used the tool to find and remove dead code, to understand function dependencies before modifying them, and to explain their code to peers. My section of the product got a lot of traffic and even boosted adoption of the platform. It offered engineers a way to do things they had always dreamed of doing.
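Dead-code detection is a good example of why the graph view helps. Here is a minimal sketch, assuming you know the program's entry points: anything you can't reach from them is a candidate for removal. The `find_dead_code` function and the example function names are hypothetical, not how Codegen implements it.

```python
def find_dead_code(calls, entry_points):
    """Return functions that are never reached from any entry point.

    `calls` maps each function to the functions it references.
    """
    reachable = set()
    stack = list(entry_points)
    while stack:
        fn = stack.pop()
        if fn in reachable:
            continue
        reachable.add(fn)
        stack.extend(calls.get(fn, ()))

    # Every function mentioned anywhere, as a caller or as a callee.
    all_functions = set(calls) | {c for callees in calls.values() for c in callees}
    return all_functions - reachable


# Hypothetical call graph: old_export() and legacy_format() are only
# referenced by each other's side of the graph, never from main().
calls = {
    "main": ["load"],
    "load": ["parse"],
    "parse": [],
    "old_export": ["legacy_format"],
    "legacy_format": [],
}
print(find_dead_code(calls, ["main"]))
```

Working from reachability rather than "has zero callers" matters: `legacy_format` has a caller here, but that caller is itself dead, so both should be flagged.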

This was my first time writing production-ready code and my first time owning an entire subsystem of an application used frequently by engineers for everyday tasks. Understanding what people want and converting it to a tangible piece of software was hard. We also became popular on Twitter for a while.

You see how I visualize my dead code? Very demure. Very cutesy. pic.twitter.com/uhhdRNRHUY

— Chase (@ChaseMc67) August 24, 2024


It was neat building something with a group of people, stepping into frontiers no one had entered before, and creating technology that people actually used. It was a wonderful experience, capped off with a retreat to Hawaii. That's us just before we learned to surf in the waters of Kauai.