As 2024 comes to a close, another year of rapid change in the technology industry is in the books. Now is a good time to reflect on where the AI field is going and where it’s taking us.
Comprehensively reviewing every highlight in AI this year would be an exhausting undertaking, one that could well stretch into next year to complete. Instead, I'm going to organize this retrospective around what I see as the key trends and themes, with an eye toward strategic, big-picture thinking. So unless I explicitly call out what I see as a truly impactful standalone event, I will generally fold events into a broader trend or theme. This is my "compression algorithm" for covering as much ground in as little space as possible.
So, in no particular order, here's my list of the biggest AI news of 2024. In part II of this two-part series, I'll follow up with a discussion of where these trends might be heading in 2025.
#1: The Rise of Agentic AI
If 2023 was the year of frantic scrambling and feverish hype, 2024 was a year of more sober assessment and consolidation. Probably the strongest consolidating trend is the move toward agentic AI. This year saw swift growth in both progress and interest in agents, AI programs that can semi-autonomously take action and operate on the world. That included sophisticated developments in popular development frameworks such as CrewAI, LangGraph, and AutoGen, as well as Salesforce launching Agentforce, its suite of customer service agents. 2025 will be the first year where we see how agentic AI plays out in the wild, but the groundwork was laid this year. Agentic AI is transitioning from an experimental hypothetical to a concrete, applicable reality.
AI that can interact with an external environment is a prospect too intriguing to ignore, so we can expect interest in agents to continue rising well into next year.
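To make the idea a bit more concrete, here's a minimal sketch of the kind of loop that sits at the heart of most agent frameworks: the model picks an action, a tool executes it, and the observation is fed back in. The `llm_decide` function and the toy tool are hypothetical placeholders rather than any particular framework's API.

```python
# Minimal sketch of an agentic loop: the model repeatedly decides on an
# action, a tool executes it, and the observation is fed back in.
# `llm_decide` and the tool below are hypothetical placeholders.

def search_web(query: str) -> str:
    """Stand-in tool: pretend to search and return a snippet."""
    return f"Top result for '{query}'"

TOOLS = {"search_web": search_web}

def run_agent(llm_decide, goal: str, max_steps: int = 5) -> str:
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        # The model picks the next action, or decides it is finished.
        action = llm_decide(history)  # e.g. {"tool": "search_web", "input": "..."}
        if action.get("final_answer"):
            return action["final_answer"]
        observation = TOOLS[action["tool"]](action["input"])
        history.append(f"{action['tool']} -> {observation}")
    return "Stopped after reaching the step limit."
```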
#2: The Ascent of MLOps/LLMOps
2024 saw solid maturation across various MLOps frameworks such as Amazon Bedrock, Weights & Biases, Google AI Studio, and MLflow, showcasing the serious strides the industry has made in AI model development, inspection, testing, and deployment infrastructure. MLOps is now maturing into a serious discipline in its own right, a respectable sub-specialization of DevOps that requires familiarity with one or more of these platforms. AI developers now have access to fully featured, sophisticated IDEs and operations frameworks for monitoring, iterating on, and deploying machine learning projects, and any serious production-grade ML or LLM application now has to think about incorporating them into its workflow.
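To give a flavor of what this tooling looks like in practice, here's a minimal experiment-tracking sketch using MLflow's logging API (one of the frameworks mentioned above). The parameter names and metric values are just placeholders.

```python
import mlflow

# Minimal MLflow experiment-tracking sketch: log parameters, a metric,
# and a small artifact for a single training run. Names are placeholders.
mlflow.set_experiment("demo-experiment")

with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("learning_rate", 1e-3)
    mlflow.log_param("epochs", 10)

    # ... train the model here ...
    validation_accuracy = 0.91  # placeholder result

    mlflow.log_metric("val_accuracy", validation_accuracy)
    mlflow.log_dict({"notes": "baseline run"}, "run_notes.json")
```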
#3: A Slowdown in Neural Scaling, and a Switch-Up in Scaling Strategy
According to various reports, the big-data scaling methods that rapidly catapulted the upward trajectory of model capabilities are showing signs of plateauing. This slowdown could be attributed to a number of causes, but a likely culprit is that the jet fuel that powered this upward trend, vast quantities of readily available high-quality data, is drying up. This data drought has prompted engineers to explore newfangled and still-unproven synthetic data generation methodologies, or to shift focus to test-time compute scaling (giving the model more computational power and time during inference). It remains to be seen how long this scaling paradigm holds up, or whether we will need new architectures and fresh thinking to keep pushing the envelope on model capability. In the long term the answer will most likely be the latter, but the situation is still fuzzy for the time being.
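For a rough intuition of what test-time compute scaling means, here's a schematic best-of-N sampling sketch: rather than training a bigger model, you spend extra inference compute generating several candidate answers and keeping the best-scored one. The `generate` and `score` functions are hypothetical stand-ins for a model call and a verifier or reward model, not any vendor's API.

```python
# Schematic best-of-N sampling: trade extra inference compute for quality.
# `generate` and `score` are hypothetical stand-ins for a model call and
# a verifier/reward model.

def best_of_n(generate, score, prompt: str, n: int = 8) -> str:
    candidates = [generate(prompt) for _ in range(n)]  # n forward passes
    return max(candidates, key=score)                  # keep the best-scored answer
```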
#4: Chain of Thought and Automated Reasoning Models
OpenAI debuted its o1 model this year, which displays built-in chain-of-thought reasoning capabilities. Leveraging increased test-time compute, the model thinks out its response in an ordered sequence of planning and reasoning steps. Rather than just being trained on data, it's trained how to reason over data. Consequently, it's quite powerful at certain tasks, especially those of a STEM variety. It's significantly more expensive to run, however, so it remains to be seen whether models that leverage these more advanced reasoning processes can stay cost-effective enough to be deployed at scale in the long term, or how much better they really are than other, simpler models for specific use cases.
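For context, calling a reasoning model looks much like calling any other chat model; the planning happens server-side before the visible answer comes back. Here's a minimal sketch using the OpenAI Python SDK (the model name reflects what was available at the time of writing and may change).

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Reasoning models are invoked like ordinary chat models; the planning and
# reasoning steps happen internally before the visible answer is returned.
response = client.chat.completions.create(
    model="o1-preview",  # model name as of writing; subject to change
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
)
print(response.choices[0].message.content)
```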
Nevertheless, OpenAI once again takes the crown this year for producing what is hands down the most impressive model in terms of sheer cognitive firepower.
#5: Anthropic’s Computer Use
Again on the agentic AI front, Anthropic’s latest model is now capable of a fascinating experimental “computer use” feature that allows it to navigate a computer as if it were a person. The technology is still in an early phase, but if it continues to mature, this capability could unlock a whole range of novel applications. Interest in the feature will surely persist into the new year. Computer use also dovetails in a major way with the theme of agentic AI.
#6: Anthropic’s Model Context Protocol
Perhaps less sexy than some of the other news, Anthropic also released its open-source Model Context Protocol (MCP) late this year. MCP provides a more streamlined mechanism for connecting LLMs to data sources via a two-way communication channel, giving LLMs a direct means to push and pull data from external repositories and environments. A clean API for establishing transactions between external resources and AI models could greatly enhance development and open up a plethora of new application possibilities for LLMs.
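Conceptually, MCP standardizes a JSON-RPC-style exchange between an LLM application (the client) and an MCP server that exposes tools and resources. The sketch below is a simplified illustration of that request/response flow, with a hypothetical `query_database` tool; treat it as a rough picture rather than a verbatim rendering of the spec.

```python
import json

# Simplified illustration of the JSON-RPC-style exchange MCP standardizes:
# the client (an LLM application) asks an MCP server to invoke a tool, and
# the server returns structured content. Method and field names follow the
# published spec as I understand it; this is a sketch, not a transcript.

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "query_database",  # hypothetical tool exposed by a server
        "arguments": {"sql": "SELECT count(*) FROM orders"},
    },
}

response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": "42"}]},
}

print(json.dumps(request, indent=2))
print(json.dumps(response, indent=2))
```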
#7: The Debut of AI-Centric IDEs
On the software development front, many developers are beginning to acquaint themselves with AI-centric code editors such as Windsurf and Cursor. These editors integrate AI at a deep level, building generative features such as built-in chat windows, inline completions, “smart terminals,” and even AI agents directly into the editor.
Figuring out how best to use AI for programming tasks is still a fresh subject of debate, and the SWE community remains divided on the topic. Far from being a passing trend, however, it’s becoming increasingly obvious that AI will become a staple of workaday software development practice. Currently, its role is still largely disruptive, and at times intrusive and distracting, but it can be remarkably useful. The use of AI-generated code has some serious methodological issues that need to be ironed out, and over time I expect it to find a more settled place in development workflows.
#8: A Continued Renaissance for AI Research
This one is my personal favorite standout of the year and also the trend with the clearest long-term impact, so I’m going to comment on it in more depth.
The field of AI is a funnel: one end opens with wide-ranging, on-paper research, which then narrows through a filtering implementation process that terminates in practical, real-world end products. Typically, there is a lag between what gets worked out on paper, often in mathematics, and what eventually gets baked into usable, production-grade software releases. While it’s certainly not quick and easy to do good scientific or theoretical work, it doesn’t carry the same red tape, scalability requirements, and QA demands as enterprise-grade software production typically does. Much of it is couched in math that “moves at the speed of thought” rather than the speed of business. Because the turnover rate in computer science research is typically faster than in business, the research side of the funnel tends to be much thicker than the practical implementation side, creating a “delayed impact” effect.
As someone who tries to keep up with the research, I can safely say that scientific breakthroughs show no sign of abating and may indeed be accelerating. Terms that sound remote to laypeople, such as Neuro-symbolic Architectures, Kolmogorov-Arnold Networks, Masked Diffusion Models, or State Space Models, and many, many more besides, could prove to have enormous practical utility further down the line. So much exciting research is taking place that it seems impossible for any one individual to assimilate it all.
Entrepreneurs and organizations that accelerate their theory-to-practice pipeline, grabbing onto some of these conceptual advances early and figuring out how to productize and build off them, could gain a massive advantage over competitors who remain stuck in yesteryear’s paradigm. There is a rich and practically bottomless grab-bag of potential opportunities to pick over thanks to the flowering of AI research. And now that compute infrastructure has been greatly enhanced thanks to a “picks and shovels” investment strategy in AI chips and improved development and testing frameworks, the delay between theoretical advance and practical application could be getting shorter than ever.
The astonishing progress of AI research is for me the healthiest sign that on an intellectual level, the field of AI is showing good vital signs. As a scientific field, it’s objectively booming—there are no “bubbles” when it comes to truth. This solid theoretical foundation is what the industry can safely build off of in perpetuity. Even if there are hiccups along the way, it’s clear to me that if the right research paths get the right love and attention, we may well be on our way toward that mystical goal of AGI sooner rather than later.
Even if scaling ends up a bust, many fruitful avenues of exploration are opening up. The AI boom may change course, but with no shortage of ideas in sight, it likely won’t grind to a halt.
#9: Gradual Over Breathless Adoption
Businesses and organizations have generally settled on a gradualist approach to adopting AI, but interest and curiosity are steadily growing. Outside the tech industry, a gap in education and understanding about the technology is still very prevalent, leaving many hesitant and distrustful of it. As a consultant, I notice this time and again.
It appears that rather than making sudden, drastic changes to their operations, organizations are slowly integrating AI into production processes via tentative trials and pilot programs, internal experiments, and other contained, reversible, and measured rollout initiatives. Such a tempered, gradualist approach is prudent; it shows a sane degree of caution about a still largely unproven technology, and reflects a more sober, realistic attitude compared to the unrestrained mad-dash hype of 2023. The bigger tech companies with money to burn may be more willing to take risks, and what they do will have the public benefit of providing everyone else with data about what works and what doesn’t.
Clearly, there is an educational opportunity here: attracting more customers and generating value by making AI/ML technology more understandable and accessible to the public and to industries outside tech.
#10: New AI Media Content Capabilities
2024 brought AI that can generate reasonably good-sounding music samples, as well as startling video generation capabilities such as OpenAI’s Sora and Google’s Veo. 2024 also seems to be the year when image generation tech got over its “too many fingers on the hand” problem. Incredible new capacities for content generation are likely to continue being developed and released through 2025.
#11: OpenAI Gives Reinforcement Learning Fine Tuning to Customers
OpenAI is keeping industry watchers busy until the very end of the year with their “12 days of shipmas” media blitz. On day 2 they announced they would be offering customers access to their reinforcement fine-tuning algorithms. To my knowledge, this is the first time a commercially available reinforcement learning suite has been made widely available to non-specialists. It also marks the first time a training-grade method, rather than a purely fine-tuning method, has been made publicly available, allowing customers to fit models to their data at a deeper level than previously permitted. (RL technically applies to both training and fine-tuning; the point is that it’s effective enough to serve both purposes.) Reinforcement learning is a powerful method that can achieve results with only a few examples, meaning users don’t need vast troves of data, as in pre-training, to get meaningful results.
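To convey the general idea (this is a conceptual sketch, not OpenAI’s actual API), reinforcement-style fine-tuning replaces labeled target outputs with a grader that scores the model’s own attempts, and the training signal comes from those scores.

```python
# Conceptual sketch of reinforcement-style fine-tuning, NOT OpenAI's API:
# the model generates attempts, a grader scores them, and updates push the
# model toward higher-scoring behavior. `model.generate` and
# `model.update_from_scores` are hypothetical.

def grade(attempt: str, reference: str) -> float:
    """Toy grader: reward exact matches, partially reward near misses."""
    if attempt.strip() == reference.strip():
        return 1.0
    return 0.5 if reference.lower() in attempt.lower() else 0.0

def reinforcement_finetune(model, dataset, steps: int = 100):
    for _ in range(steps):
        for prompt, reference in dataset:  # a handful of examples can go a long way
            attempts = [model.generate(prompt) for _ in range(4)]
            scores = [grade(a, reference) for a in attempts]
            model.update_from_scores(prompt, attempts, scores)
```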
#12: Novel Generative Architectures
One particular highlight, and a bit of a sleeper hit, was the startup Liquid AI’s achievement of a general-purpose generative model based on a non-transformer architecture. This development is especially remarkable because it serves as an informative counter-signal to all the chatter about transformer-based generative models. It provides evidence that we don’t need to confine ourselves to transformers, and indeed, some of the biggest gains might come from outside the transformer paradigm.
Using a different set of techniques couched in signal processing and differential equations, Liquid’s state space model is highly performant and efficient, and operates on a separate class of principles than transformers. State space models are more deterministic and stateful than transformers, which may make them more reliable and better at retaining information across certain tasks involving complex time-series processing.
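For readers wondering what “couched in signal processing and differential equations” means in practice: at its core, a discretized linear state space layer is a simple recurrence in which a hidden state is updated by fixed matrices at every step, which is what gives these models their stateful character. Here’s a bare-bones sketch with toy dimensions, making no claim about Liquid AI’s actual architecture.

```python
import numpy as np

# Bare-bones discretized linear state space recurrence:
#   h_t = A @ h_{t-1} + B @ x_t
#   y_t = C @ h_t
# Toy dimensions and random matrices; this illustrates the general family,
# not Liquid AI's actual architecture.

rng = np.random.default_rng(0)
state_dim, input_dim, output_dim, seq_len = 8, 4, 2, 16

A = 0.9 * np.eye(state_dim)                # state transition (kept stable)
B = rng.normal(size=(state_dim, input_dim))
C = rng.normal(size=(output_dim, state_dim))

x = rng.normal(size=(seq_len, input_dim))  # input sequence
h = np.zeros(state_dim)                    # hidden state carried across steps
outputs = []
for t in range(seq_len):
    h = A @ h + B @ x[t]
    outputs.append(C @ h)

print(np.stack(outputs).shape)             # (16, 2)
```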
The startup is still in its infancy, but they are definitely one to watch. If I were an investor, I’d be pouring money into Liquid AI.
#13: Startup Decart’s Model Generates A Playable Version of Minecraft
Startup Decart has created an AI model that can effectively “daydream” a playable version of the popular game Minecraft. This is an impressive feat of world-modeling. An AI model that can run a stable video game of significant complexity like Minecraft is doing a lot more than a mere stochastic language model: it is sustaining a rule-based, three-dimensional system containing many objects and properties, much like the human mind does when it constructs a mental model of the world. Even those who don’t care about video games should take this news and exercise some lateral thinking; this feat is a major accomplishment and could pave the way toward more sophisticated simulation tech in the future.
2024 was another busy year for AI, and much more was accomplished than I could ever hope to address here. Overall, 2024 was the year when agentic AI grew up, models began to reason (sort of), smooth scaling encountered some friction, research flourished, and various lesser-known startups did incredible work outside the main spotlight.
In the second part of this two-part series, I will discuss what might be coming next year and how these trends and themes may evolve.