The AI Organization, part I
Most of the discussion of AI today focuses on 1:1 interactions, but we aren’t talking as much about how AI is going to change organizations
If your job involves coding or writing, change is coming: Large Language Models (LLMs) will change the way you work, and it may happen very quickly.
The consensus opinion now is that LLMs are going to be as important as the printing press1 or perhaps even the Industrial Revolution. Some are even saying it’s the end of the world as we know it.
Most of the discussion of AI today focuses on 1:1 interactions like ChatGPT, but we aren’t talking as much about how AI is going to change organizations. AI will dramatically enhance our collective communication abilities, and this will lead us to completely overhaul the structure of the corporation.
Throughout human history our technological development and scale of cooperation have been dictated by information and communication technologies2. LLMs erode many of the assumptions about information flow and coordination costs in the modern corporation. It's a bit like dropping cell phones into a classic movie like Home Alone... Morning Alone — it doesn't have quite the same ring to it. We predict an equally transformative evolution into what we call the AI organization. We humbly propose the following definition:
> An AI organization is an entity that primarily uses AI to manage information flow within the organization and decide on team composition and function.
But before diving into how AI impacts organizations, we’ll first need to understand what corporations or firms are, and how pre-LLM communication technology underpins the modern ones.
Why Companies?
Look around you — almost everything you see was created by a corporation. Why? One answer is legal: governments make it possible to create legal entities that trade and sell goods & services as one unit. This creates a convenient unit for human action, especially when dealing with taxes and regulatory requirements. But the legal explanation does not tell the full story of why the market has adopted this vehicle and not another.
What prevents us from operating solely as independent contractors, or conversely, working for a single all-encompassing entity such as a giant company or the government?
The economist Ronald Coase3 believed that transaction costs were the primary factor controlling the size of firms. For Coase, firms grow up to the point where the cost of organizing one more transaction inside the firm (e.g. adding an extra person to work on a task) equals the cost of carrying out the same transaction on the open market (e.g. hiring a contractor).
To understand this, Coase spends a lot of time on the following puzzle: outside a firm, market signals (i.e. price) primarily guide decisions, while inside, human judgement (“management”) is the primary guiding force. Coase quotes economist Dennis Robertson’s description of firms in the sea that is the market: they are
“islands of conscious power in this ocean of unconscious co-operation like lumps of butter coagulating in a pail of buttermilk”
Organizations incur many transaction costs when they get help from entities outside the organization: it requires time to find the right outside organization or contractor, a well-defined spec of the work required, and, of course, time to negotiate contractual details. This is simply too much to do for every task, especially in the face of rapidly changing business needs.
Within a firm, these transaction costs disappear. There is a devoted talent pool with deep organizational knowledge, capable of doing iterative work with a flexible project scope and time horizon.
Of course, the catch is that when we make decisions inside a firm all the market signals are gone and we must rely solely on our judgement. Inside a firm, especially for the largest and most complex companies, it becomes very difficult for even the founders & owners to understand their own organization, and we are plagued by high coordination costs.
The modern corporation
In the past, limited communication and distribution networks constrained corporation size. If we asked an 18th-century reader to look around them, it’s much more likely their possessions and services were NOT produced by a corporation.
Prior to the invention of the telegraph, information could only travel as fast as the fastest ship or horse, constraining the scale and degree of coordination at which a corporation could operate. The British East India Company, while the largest company of its time, was structured more as a collection of regional operations. Each operated with a significant degree of autonomy due to the difficulty of communication between divisions. Transmitting information between London and Calcutta, for example, could take several months. It’s hard to imagine a product like the iPhone being created in these conditions - communicating the centralized vision of Apple's executive team to an international manufacturing operation would have been immensely challenging.
The telegraph made instant communication possible within a firm, and railroads gave firms access to a geographically distributed market for their services. The railroad companies themselves were among the first to rise to this unprecedented scale. The New York and Erie Railway was one of the largest railroad companies in the 1850s. However, its general superintendent, Daniel McCallum, noticed that it was far less efficient than smaller railways. The telegraph had led to an over-reliance on centralized control. Could these companies maintain their economies of scale while also preserving some of the efficiencies of smaller railways?
To address this challenge McCallum invented the modern org chart, which organized staff in a hierarchy with fungible role types like ‘Superintendent of Road’ and ‘Brakemen’. McCallum gave more responsibility to divisional superintendents, which allowed them to focus on their divisions and effectively operate as if they were smaller and more efficient rail lines. McCallum’s invention flourished and created the modern corporation that provides so many of the goods and services we rely on today.
Scaling the modern organization
With the innovation of the org chart McCallum eased the tradeoff between scale and efficiency that many organizations face as they grow. This new ‘operating system’ for firms operates even more effectively in modern times by leveraging the internet, but the tradeoff between scale and efficiency has not gone away. Structures and processes that work very well at small scales are difficult to maintain at larger scales. Firms today are still plagued by problems arising from scale4.
This tradeoff is not unique to organizations and is also found in natural settings. Natural beehives are much more efficient than industrial ones, but they are impossible to manage at scale. Small farms are also more efficient per unit of land because they can grow various crops side by side - unfortunately, it is infeasible to scale them into industrial operations5.
Many readers will recognize how this dynamic plays out in organizations. Small organizations have the advantage that everyone knows each other. Every member can share context with every other member of the organization. Processes and projects can be very specifically tailored to the makeup of the team and the current situation the organization faces. It’s also easier to keep a small group aligned & motivated to take on an ambitious mission - hello, startups!
A large org on the other hand has more staff (this can scale pretty far: Amazon has over 1M employees, while ancient Athens, for reference, had around 150k people) but suffers from high coordination costs6. Executives deal with scaling pains by creating a reporting hierarchy via an org chart. This system allows information to be encapsulated and sub-organizations to focus on their own objectives without the friction of needing to interact with every other part of the org. This parallels programming practice, where we routinely divide codebases into modules, selectively concealing information through encapsulation. Notably, at Amazon, the mandate for teams to interact exclusively via APIs is believed to have catalyzed the creation of AWS7.
In today's organizations, management decides team composition and function. Unfortunately, we often see unnecessary duplication and a lack of sharing of hard-earned knowledge between teams. And the way we originally organize teams can quickly become suboptimal, no longer reflecting an organization’s changing needs. This is where the power of AI comes into play.
History repeats itself?
Stepping back, we see:
Firms reach an efficiency limit and practical maximum scale (e.g. VOC or British East India Company)
A new communication technology emerges boosting firm productivity — the telegraph (and later the internet)
Firms struggle to integrate that technology into the old corporate structure (traditional pyramid structure)
The firm is refactored into a new structure that allows it to fully utilize the new technology — what we have today: the org chart, OKRs, stand-ups, and so on, up to a new scaling limit (e.g. Amazon today)
We predict a similar evolution today:
Firms at a practical efficiency and scaling limit with our present technology
New communication technology emerges: Large Language Models, boosting firm productivity — You are (probably) here
The new technology (LLMs) is running on an old corporate operating system and not fully benefiting the organization — this is the Copilot model we see today — You (might be) here
Refactoring the firm with a new structure, the AI organization — We will help you get here
Making companies go brrrr: why LLMs dissolve the assumptions behind the modern firm and give rise to AI organizations
Here are some familiar characteristics of the modern firm:
Formal org charts with well-defined roles and levels (e.g. PM, level 6)
Cooperation between teams, especially those in different divisions, incurs high coordination costs. Two teams working on similar problems in different divisions, as a rule, don't talk and may only be vaguely aware of each other's work. It's telling that at Google, the internal search over its monorepo and document base functioned as a sort of mixer for the company, since it was often the only way two teams working on similar topics could find out about each other.
Technical personnel are often trapped in knowledge bubbles within their own divisions, unable to share their expertise with other teams facing similar problems.
Use of OKRs to plan objectives and sync teams, typically on a quarterly basis.
Sprints and stand-ups within teams, going over basic facts about what they did and how close they are to achieving their objective.
As we’ll see, LLMs route around and reduce the need for every one of these.
> Formal org charts with well-defined roles and levels (e.g. PM, level 6)
> Cooperation between teams, especially in different divisions, has high coordination costs. Two teams working on similar problems in different divisions, as a rule, don't talk and may only be vaguely aware of each other's work.
Right now we’re hand-designing information flows and team structure. Instead, let’s use LLMs to share information between teams and help route important work to the right people.
LLMs can summarize the work everyone in an organization does by parsing their code, messages, and documents, producing a comprehensive picture of each person's role and expertise.
LLMs, in conjunction with other AI techniques, can also identify common problems across an organization and rank them by severity.
Important information can then be routed to the people in the organization with the most relevant expertise.
This way of organizing information effectively forms dynamic 'flash' teams that cut across traditional organizational boundaries.
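A minimal sketch of how this routing could work, under loose assumptions: `summarize` stands in for any LLM summarization call and `embed` for any text-embedding model (both are placeholders, not specific APIs), and matching is plain cosine similarity.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Contributor:
    name: str
    artifacts: list[str]               # commits, docs, and messages attributed to this person
    profile: np.ndarray | None = None  # embedding of a summary of their work


def summarize(texts: list[str]) -> str:
    """Placeholder for an LLM call that condenses someone's recent work into a short summary."""
    raise NotImplementedError


def embed(text: str) -> np.ndarray:
    """Placeholder for any text-embedding model; assumed to return a unit-norm vector."""
    raise NotImplementedError


def build_profiles(team: list[Contributor]) -> None:
    """Summarize each person's artifacts and embed the summary as their expertise profile."""
    for person in team:
        person.profile = embed(summarize(person.artifacts))


def route_problem(problem: str, team: list[Contributor], k: int = 3) -> list[str]:
    """Return the k people whose work profiles are most similar to a newly identified problem."""
    query = embed(problem)
    scores = {
        person.name: float(np.dot(query, person.profile))  # cosine similarity for unit vectors
        for person in team
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

The people returned for a given problem are, in effect, the seed of one of these flash teams, regardless of where they sit on the org chart.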
> Technical personnel are often trapped in knowledge bubbles within their own divisions, unable to share their expertise with other teams facing similar problems.
By training LLMs on company code and docs, and/or embedding that corpus in a vector space, we can capture institutional knowledge ('tribal knowledge'), spread it around the organization, and safeguard it against loss due to personnel changes8.
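Here is a minimal retrieval sketch along those lines, again with placeholder `embed` and `ask_llm` functions standing in for whatever embedding model and chat LLM a company chooses (these names are assumptions, not real APIs):

```python
import numpy as np


def embed(text: str) -> np.ndarray:
    """Placeholder for any text-embedding model; assumed to return a unit-norm vector."""
    raise NotImplementedError


def ask_llm(prompt: str) -> str:
    """Placeholder for any chat-style LLM call."""
    raise NotImplementedError


def build_index(chunks: list[str]) -> np.ndarray:
    """Embed every chunk of code and docs once; the matrix itself is the captured tribal knowledge."""
    return np.vstack([embed(chunk) for chunk in chunks])


def answer(question: str, chunks: list[str], index: np.ndarray, top_k: int = 5) -> str:
    """Answer a question from the nearest chunks, so the knowledge outlives any one employee."""
    query = embed(question)
    nearest = np.argsort(index @ query)[::-1][:top_k]
    context = "\n\n".join(chunks[i] for i in nearest)
    return ask_llm(
        f"Using only the context below, answer the question.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

In practice the index would live in a vector database and be refreshed as the corpus changes, but the core idea, captured knowledge that any team can query, fits in a few lines.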
> Use OKRs to plan objectives and for teams to sync typically on a quarterly basis
> Sprints and stand-ups within teams, going over basic facts about what they did and how close they are to achieving their objective
LLMs can break down objectives into tasks at all levels — executives, managers, engineers, or PMs — providing technical scope. This can happen continuously to ensure alignment. The need to ask "who should I approach for..." is eliminated, and information that would otherwise sit in an outdated wiki is continuously updated and readily accessible. Executives, meanwhile, get a dynamic dashboard they can chat with, offering insights into their organization. Meetings and stand-ups become less prevalent, in part because you can interact with simulated versions of any team member9.
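As a rough sketch of the planning piece, assuming only a generic `chat` wrapper around whatever LLM the organization uses (a placeholder, not a specific SDK), an objective plus organizational context can be broken into scoped, owned tasks continuously rather than once a quarter:

```python
import json


def chat(prompt: str) -> str:
    """Placeholder for a call to any chat-completion LLM."""
    raise NotImplementedError


def break_down_objective(objective: str,
                         repo_summary: str,
                         team_profiles: dict[str, str]) -> list[dict]:
    """Ask the model for concrete tasks, each with a technical scope and a suggested owner."""
    prompt = (
        "You help plan engineering work.\n"
        f"Objective: {objective}\n"
        f"Codebase summary: {repo_summary}\n"
        f"Team expertise: {json.dumps(team_profiles)}\n"
        "Respond with a JSON list of objects with keys: "
        "task, technical_scope, suggested_owner."
    )
    return json.loads(chat(prompt))
```

Re-running this whenever the codebase or the objective changes is what keeps the plan, the owners, and the "who do I ask about X?" answers continuously up to date.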
So the AI organization will feature:
Universal dissemination of organizational know-how and context
More fluid and less defined team boundaries.10
Accelerated development and execution
What’s the catch? AI organizations will likely give up some degree of legibility, such as having a well-defined org chart, but already the biggest and most important organizations are difficult for their founders and owners to grasp. The shift towards the AI organization is the natural next stage in the evolution of organizations.
To facilitate this transformative shift, we've created Mutable.ai.
Introducing Mutable.ai: Empowering AI Organizations
At Mutable.ai, our mission is to build the foundational AI tools that will accelerate the emergence of AI organizations: unifying how organizations communicate, build, and store knowledge on a single platform, all powered by the revolutionary capabilities of LLMs.
Given the central role of software creation in the global economy11, Mutable.ai is starting with software dev teams, building a multiplayer universal programming platform. Superficially, this will look like a unification of Slack, VS Code, and GitHub, but we aim to avoid vendor lock-in, support integrations with your current tools (like GitHub), and include non-engineers (especially PMs!) from day one.
Transcending Copilot
While the Copilot model is a valuable starting point, it doesn’t meet the prerequisites for supporting AI organizations: unified, AI-driven information flow linking all stakeholders. To paraphrase Henry Ford, Copilot built a faster horse; we're building a Ferrari.
We believe simply bolting AI functionality onto existing tools doesn't create the seamless workflow we’d expect in an AI organization. Working in a true AI organization will feel almost effortless - imagine ops firmly out of the way, programming overhead eliminated, instant access to the information you need from across the entire organization, and support from the colleagues best positioned to make you more effective.
Mutable.ai, an AI native platform for AI organizations
So, instead of an AI copiloting the analog of an Airbus from the 70s, think of Mutable.ai as a Jaeger for your company, piloted by your team. This means shipping more features to your customers in less time, understanding what your team is building in real time, and fearlessly leveraging unfamiliar technologies without compromising the integrity of your systems.
To aid information flow within your organization, we're offering:
A codebase chat for asking questions about your codebase as well as pull requests or commits (TL;DR — too long, didn’t review); soon we will expand to include your docs, messaging, and issue/ticketing system
An auto wiki feature that serves as living documentation of your codebase (and soon Notion/G-docs and Slack)
An auto bug identification feature that flags potential issues in your codebase, incorporates notes, issues, and active tickets, and even suggests good people to investigate them — forming a sort of virtual flash team to solve your most pressing problems
An auto stand-up feature that serves as a snapshot and history of work across teams, utilizing all commits across company repos
We're also breaking down technical barriers and piercing knowledge bubbles:
Embedding the entire company corpus (code + other information) and, in some cases, fine-tuning on the company corpus
Multi-file editing with natural language that is aware of your codebase
Automated test generation that is aware of your testing style12
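To give a flavor of that last point (an illustrative sketch of the general technique, not a description of Mutable.ai's implementation), style-aware test generation can be as simple as showing the model existing tests from the same repository before asking for new ones:

```python
def chat(prompt: str) -> str:
    """Placeholder for a call to any chat-completion LLM."""
    raise NotImplementedError


def generate_tests(function_source: str, example_tests: list[str]) -> str:
    """Draft tests for a function while imitating the repo's existing testing conventions."""
    examples = "\n\n".join(example_tests)
    prompt = (
        "Below are existing tests from this repository. Match their style, "
        "fixtures, and naming conventions exactly.\n\n"
        f"{examples}\n\n"
        "Now write tests in the same style for the following function:\n\n"
        f"{function_source}"
    )
    return chat(prompt)
```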
These features come with privacy controls that give you complete control over how information is shared depending on role, team, and seniority. For example, execs will have a bird's-eye view of their entire organization that allows them to understand what is happening at any desired level of granularity. The more information you put in, the better it will work, but you are in control of who sees what.
Coda
It’s clear we’ve hit an inflection point in the history of the firm. LLMs and AI more generally will transform the modern organization as we know it. The AI organization, the next stage in the evolution of the company, holds tremendous promise to lower coordination costs and to increase firm velocity without compromising safety.
Just as one structure is being torn down, we now see the scaffolds of a grander future one. While we hope we’ve made it clear that this structure must exist, it’s not entirely clear yet how exactly it will look. In the coming months, we will do what we do best: talk to you, ship features, and provide more examples and case studies (perhaps featuring you!) about the structure of this new organization.
If you want to be the first organization to have its own AI-powered mecha custom-fit to your team, please reach out to [email protected]. We’d love to partner with you. We’re a small but mighty team whose members were formerly part of organizations like DeepMind, Google, Stanford, and AWS.
Our new codebase app is out today; check out app.mutable.ai for a sneak peek of this future.13
Best,
Omar Shams, Founder CEO
[1] Tyler Cowen thinks LLMs could be as important as the printing press — with the advent of movable type, some clergy were horrified that everyone could read the Bible and comment on it; people like Martin Luther did exactly that. It’s a fun exercise to contemplate what the analog of a Martin Luther using LLMs would do today.
[2] Although it is a long topic that probably deserves its own essay, it is worth pointing out that LLMs represent a fascinating "full circle" moment for language and communication technology. Any communication technology must pick a balance of expressivity and precision.
The highly precise and abstract languages of math and computer code have come to dominate technology — "Software is eating the world"ᵃ. However, with this specificity we have given up a good deal of transparency - code and variables are less intuitive than words.
Recently we have seen programming languages trending towards increased expressiveness at the expense of precision. This trend can be seen in the shift from assembly to C to Python. The advent of LLMs will supercharge this shift, allowing us to program with plain English.
Humanity has climbed a ladderᵇ starting from orally transmitted language towards increasingly abstract and capable languages. We have now come full circle, with LLMs that are able to operate natively in the languages we use to converse with each other.
[2a] With the notable exception of physical tasks, at least until robotics can learn in a more self-supervised way ("learn-to-plumb" is the new "learn-to-code")
[3] Coase pays a lot of attention to property rights and argues that, in a deep sense, the initial allocation of property boundaries doesn't matter. If you can ignore transaction costs, the efficient outcome will prevail according to market prices. This is the so-called Coase theorem, but importantly the assumption (which is never really true!) is that transaction costs are zero. It's interesting to apply this in the context of a larger organization and take transaction costs to be coordination costs. If coordination costs were a lot lower, companies would be more effective - the AI organization. Sometimes it's argued that if there were no transaction costs, the optimal firm size would be one, but Coase makes an exception for "marketing" costs and "brand," which in the valley we would call "distribution". So, what is the optimal firm size? It's hard to predict, but it would make sense for people to work together in the same organization when united by a common mission or objective.
[4] More is different — a big lesson from physics is that if you keep adding more of something, at some point you aren't thinking about the original thing anymore but about some new, emergent thing (quarks → atoms → phonons → ...). In 2018 I asked Demis, the founder of DeepMind, whether we simply needed to scale up our algorithms to see emergence, using an argument along these lines. Right now it's interesting to see a debate in the research community about whether emergence “really” exists, because on some measures there is sudden emergence while other measures, like perplexity, decrease smoothly. Spoiler: it's both, and it depends on the scale at which you measure things!
[5] For a nice treatment of these topics, check out ‘Seeing Like a State’ and the inimitable Byrne Hobart’s treatment of the “East Asian economic miracle” (Byrne is an investor in Mutable.ai).
[6] Most organizations today are inspired by the corps system Napoleon invented, which was crucial to the success of La Grande Armée. The most important feature of the corps system was that each corps was a self-sufficient army that could achieve military objectives without support from another corps. Breaking down his army in this way was probably why Napoleon was able to move so fast relative to his adversaries. Speed is everything for startups.
After getting their butts kicked by Napoleon, the Prussians developed mission-type tactics, Auftragstaktik, which we’re so familiar with today - you’re given a mission and some guidance, but you are expected to achieve the mission rather than follow an exact process at all times. This is actually close to asking an LLM to write a function versus writing it out yourself—you won’t always get what you expect, but sometimes you’re pleasantly surprised.
Also, to state the obvious, troops in Napoleonic Europe needed to be colocated to work together, so even if an artillery officer from corps II would have been well suited to help corps III with something, it wasn’t physically possible for them to work together if they were 50 miles apart. Physical constraints like this are much rarer today.
[7] From Stone’s biography of Bezos: “A video-game designer argues that intelligent systems can be created from the bottom up if one devises a set of primitive building blocks. The book was influential in the creation of Amazon Web Services, or AWS, the service that popularized the notion of the cloud”.
[8] The formula for Roman concrete, opus caementicium, was lost with the fall of the Roman Empire, and famous structures like the Pantheon (pictured on the cover) would collapse if built with modern concrete. AI can mimic the benefits of one-on-one tutors (solving Bloom's 2 sigma problem) and permanently store expert tutoring on disk.
[9] You can talk to a simulated version of your teammate to get information about what they’re working on; to the extent that the information you’re looking for is already baked into the company corpus, you can get it directly and quickly from the AI.
[10] The breakdown of hand-designed corporate org charts mirrors the shift in AI research away from symbolic AI and towards the dominant deep learning paradigm of today. In "The Bitter Lesson," Rich Sutton argues that over the long run, techniques which can utilize more compute to learn tend to outperform those with built-in specialized knowledge. This has been shown decisively by today's LLMs: deep learning's capability to be scaled up on massive amounts of compute and data ultimately led to vastly more intelligent models than symbolic AI ever produced. These models are typically less interpretable, as they have less of our specialized knowledge built in.
We predict the same shift is coming for organizations. Replacing the role of human intuition and tradition in shaping organizations with data and compute may seem counterintuitive. It may lead to less interpretable organizations, with corporate structure derived from data rather than a clearly defined hierarchy. But the bitter lesson tells us that the gains in efficiency from the AI-driven organization will be worth it.
[11] Stripe published a research paper on software engineering efficiency and its effect on global GDP, putting the figure at around $3 trillion. And with 25M developers and growing, it’s fair to say there’s a lot of money to be made in the developer tools space.
[12] Google has posters titled "Testing on the Toilet" distributed in the bathrooms on campus that offer tips for following Google's distinctive flavor of testing. There is also a concept of "readability" in a language, whereby you can only submit your code into Google's monorepo in a particular language if you have "readability" in that language, or if someone else who does approves it. The problem is that Google has a long wait and a complicated process for granting readability, which feels like the old taxi medallion system. Now we can build "Uber for readability" using AI and bust the cartel, establish green zones for non-readability-havers to write... Sorry, I got a little carried away there.
[13] For those who prefer videos, here is a playlist showing off some of the functionality we offer:
And a nice thing we put together for our launch
Comments
Interesting read. It seems that George Miller’s work, which led to the concept of 7 +/- 2 as the limit of human working memory chunks (1 +/- 2 for too many people, it seems, some days...), needs to be considered in the discussion. Beyond the cost considerations correctly discussed in the article, I’ve often thought that much of org design and management work is an effort to get around, first, the physical limitations of one person’s work (e.g. exhaustion and hours in the day) and, second, the mental limitation of how many details a person can coordinate in planning (7 +/- 2, more or less).
The article describes AI putting the correct information before a team. Given that the AI “should” be able to identify who created all of the work product that went into the LLM within a company dataset, I consider it a possibility that organizational structure would mostly dissolve and an AI would simply assign team members to an issue based upon work-product evaluations. A bit like how pitching is handled in the major leagues: you never know when the game starts if the starter will pitch a complete game, middle relief, just relief, just a closer, or something else. Perhaps the most organic position in sports. It seems like AI could do the same with the vast majority of organizational positions.
In this vision of the future, aren’t the companies—such as Microsoft and Google—with the most widespread access to an organisation’s data across a range of applications best positioned to run away with the benefits?
They already have visibility into what users are doing across platforms like Gmail, Office, Teams, and Docs, giving them an enormous advantage in understanding workflows, coordinating data, and training more powerful AI models.
At best, startups get indirect, gated access to this data through APIs. They are reliant on the incumbent’s systems. As this future is realised and becomes sufficiently valuable, these gatekeepers could tighten control over that access, limiting what startups can do even further.
Incumbents like Microsoft and Google could simply implement AI-driven organisational features directly into their existing ecosystems.
Effectively doing to new organisational AI productivity startups what Apple did to flashlight apps.