Memory: Reviewing ChatGPT’s Most Underrated Feature
If you play around with ChatGPT enough, you’ll inevitably encounter its interesting memory feature, which OpenAI casually dropped in early 2024. With all the hype around fancy new models and sparkling new features, it’s easy for Memory to get lost in the fray. I’ve always found the feature promising but not quite fully realized: in my opinion, it hides enormous untapped potential. Generative AI engineers ought to take another look at it, so that’s what we’re doing.
So far as context management goes, Memory serves as a kind of “horizontal” counterpoint to Claude’s more “vertical” Projects feature. (Note: context means everything the LLM has access to when processing an input.) Whereas Claude’s Projects stacks context up manually within user-specified “bins,” Memory shares context across sessions more organically but less intentionally. With the model trained to know when to update its memory, bits of context get captured and carried over across sessions without much user involvement.
Most fascinatingly, depending on what you choose to divulge to ChatGPT, Memory functions as a means to build a pseudo-personal rapport with the bot. Tell ChatGPT your name, a little about yourself, your hopes and aspirations, and thanks to Memory it almost feels as if the bot gets to know you. I find this humanizing touch is a notable value-add that gives ChatGPT an edge over competitors that lack a similar memory feature. (To my knowledge only Gemini has a similar feature, added in November of 2024.)
That said, Memory is not without flaws. The number of memories is capped at a relatively low value, roughly 100. This restrictiveness means I am pretty much constantly opening the memory manager in the settings and pruning unnecessary additions (which are frustratingly frequent). Compounding the trouble, perhaps because of the simulated personal connection Memory gives ChatGPT or perhaps in spite of it, I am sometimes reluctant to wipe the memory entirely and start afresh, as that feels almost “wrong.” The feature also serves a dual purpose: memory for the app, and a nice summary list of my own highlights from working with it. So dumping it feels wasteful, and as someone who likes keeping notes, I can fret over deletions. The combination of limited memory and attachment to memory creates a real tension. This pain could be relieved by the ability to either export the memories or consolidate them, but we’ll get to that.
Furthermore, when memory updates trigger can be finicky. I understand the desire to keep them organic, but much of the time ChatGPT doesn’t seem to have a clue what it should prioritize memorizing. (I’ve found some workarounds, such as explicitly forbidding certain topics from being memorized via the customization prompt available in the settings, and, thinking like a programmer, telling it to memorize prompts that include the tag /mem and to ignore prompts with the tag /nmem. Results are mixed.)
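For anyone building their own memory layer rather than coaxing ChatGPT’s, the gating I try to approximate through custom instructions might look something like the sketch below. Everything in it is hypothetical (the function name, the topic set, the default behavior); only the /mem and /nmem tags come from my own convention above:

```python
# Hypothetical gating layer for a homemade memory store; none of this is an
# OpenAI API. The /mem and /nmem tags mirror the custom-instruction
# convention described above.
BLACKLISTED_TOPICS = {"medical", "finances"}  # topics never to memorize

def should_memorize(prompt: str, detected_topics: set[str]) -> bool:
    """Return True if a prompt is eligible for the memory store."""
    if "/nmem" in prompt:                     # explicit opt-out always wins
        return False
    if detected_topics & BLACKLISTED_TOPICS:  # user-declared no-go topics
        return False
    if "/mem" in prompt:                      # explicit opt-in
        return True
    return False  # otherwise defer to whatever salience logic runs upstream
```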
As generative AI ushers in an era of more explicitly “cognitive” applications, features like Memory deserve a closer look. In biology, memory is foundational to cognition: you can’t have one without the other. Features like Memory blend the user’s natural cognition with the AI’s artificial cognition, presenting a new frontier of exciting development possibilities. Here’s how Memory might be improved, in hopes of sparking developers’ imaginations about such features in future generative AI applications.
1. Memory Consolidation: Much like the human brain, ChatGPT and similar AI applications would benefit from periodic memory consolidation. In humans, memory consolidation is a complicated neurophysiological process that underwrites the passage of short-term memory into long-term memory. For LLM apps, the process would be similar but simpler: a periodic pass through the memory store, during which semantically related items are packaged together into the fewest possible consolidated chunks. This could optimize space and conceivably improve performance by placing related pieces of context together. (A rough sketch of such a pass appears after this list.)
2. Export memories: As a temporary hack, you can ask ChatGPT to summarize what it has memorized. It does a shoddy job of this if asked bluntly, but with a little pressure to be more comprehensive you can eventually coax a decent reproduction out of it with minimal loss. Still, it’s hard to get a verbatim print-out of the memories this way. A direct and lossless approach would be to permit the memories to be exported straight from the memory management settings.
3. More memory: Pretty self-explanatory. More memory would mean more of a good thing. I can understand why OpenAI would not want to explode this limit for stability reasons, but a bigger buffer naturally scales how much the feature can carry across sessions.
4. Edit memories: It would be convenient to be able to edit each stored memory however one pleases.
5. More explicit, imperative blacklists: If there are certain things I explicitly do not want ChatGPT to remember, it would be nice to declare them in one central location and pin them for future reference (much like the BLACKLISTED_TOPICS set in the earlier sketch). The opposite, imperative whitelists that emphasize topics to remember more often, could also be implemented.
6. Functional forgetting/garbage collection: While I can understand the reluctance to implement a potentially complicated feature like this, some prioritization scheme could enhance Memory. A Hebbian-like “use it or lose it” system could down-regulate and eventually drop memories that come up rarely, while up-regulating and reinforcing memories that are used frequently. Frequency of usage could be a proxy for relevance to the user. This could spare the user from taking on janitorial duties to clean up irrelevant memories as often. If developed to a sufficiently advanced state, who knows what other emergent properties it might unlock in terms of synchronizing the user’s intents with the technology? That said, this would be a tricky feature to get right. (A decay-based sketch appears after this list.)
7. Reminders: The flip side of memory is reminding. It would be interesting to see more dynamic memory interactions get developed, and reminders are one way to leverage Memory in service of the user-bot interaction as a whole. Although this might be considered swallowed up by OpenAI’s recently released “Tasks” feature, closer integration with memory functionality could make the responses feel more organic and less explicitly scheduled.
8. Fine-tune memory updates: Getting this just right is no small task, and it’s already technically impressive that the memories update the way they do presently. Some kind of salience analysis is required, however, to keep memory updates both relevant and appropriately frequent. Real-time learning from user feedback might be one way to make this happen. (The salience sketch after this list gestures at one naive approach.)
9. Hierarchical/associationist memory organization: This one is more speculative and imaginative. I have no clue how Memory is implemented on the backend, but my instinct is that it’s probably simple and flat. The opportunity may exist to build intricate graph-theoretical memory structures that greatly enhance performance. Without getting into details, one could imagine a tree-like structure where small, “atomic” memories are linked to ever larger related memory structures, working with the LLM in a RAG-like fashion to capture the relationships between content and context. Memories would be represented as a network. All of this is to say that a more biologically inspired form of memory built into an LLM could be revolutionary. (The final sketch below illustrates the shape of the idea.)
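To make a few of these ideas concrete, here are some rough Python sketches. None of them reflects how OpenAI actually implements Memory; the model names, thresholds, and prompts are all stand-in assumptions. First, the consolidation pass from idea 1: embed each memory, greedily group similar ones, and have a model merge each group into a single chunk.

```python
# Sketch of a consolidation pass over plain-string memories, using the OpenAI
# SDK for embeddings and merging. The 0.8 threshold and the merge prompt are
# arbitrary choices.
from openai import OpenAI

client = OpenAI()

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / ((sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5))

def consolidate(memories: list[str], threshold: float = 0.8) -> list[str]:
    """Greedily group semantically similar memories, then merge each group."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=memories)
    vecs = [d.embedding for d in resp.data]
    groups: list[list[int]] = []
    for i, v in enumerate(vecs):
        for g in groups:
            if cosine(v, vecs[g[0]]) >= threshold:  # compare to the group's anchor
                g.append(i)
                break
        else:
            groups.append([i])
    merged = []
    for g in groups:
        if len(g) == 1:
            merged.append(memories[g[0]])
            continue
        joined = "\n".join(memories[i] for i in g)
        out = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content":
                       f"Merge these related memories into one concise memory:\n{joined}"}],
        )
        merged.append(out.choices[0].message.content.strip())
    return merged
```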
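Next, a minimal take on the Hebbian forgetting from idea 6: every retrieval reinforces a memory, a periodic sweep exponentially decays idle ones, and anything falling below a floor gets dropped. The half-life, boost, and floor values here are invented.

```python
# Sketch of "use it or lose it" forgetting. All constants are invented;
# nothing here reflects how ChatGPT's Memory actually works.
import time

class DecayingStore:
    def __init__(self, half_life_days: float = 30.0, floor: float = 0.1):
        self.half_life = half_life_days * 86400.0  # half-life in seconds
        self.floor = floor
        self.items: dict[str, dict] = {}  # text -> {"strength", "touched"}

    def add(self, text: str) -> None:
        self.items[text] = {"strength": 1.0, "touched": time.time()}

    def touch(self, text: str) -> None:
        """Retrieval reinforces: bump strength and reset the idle clock."""
        item = self.items[text]
        item["strength"] = min(item["strength"] + 0.5, 5.0)
        item["touched"] = time.time()

    def sweep(self) -> None:
        """Periodic pass: exponentially decay idle memories, drop the weakest."""
        now = time.time()
        for text, item in list(self.items.items()):
            idle = now - item["touched"]
            item["strength"] *= 0.5 ** (idle / self.half_life)
            item["touched"] = now
            if item["strength"] < self.floor:
                del self.items[text]  # functionally "forgotten"
```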
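For the salience analysis in idea 8, the crudest possible version is to have a model score each candidate fact and store only those above a threshold; real-time learning could then tune that threshold from user feedback (say, how often auto-saved memories get deleted). The scale and threshold are arbitrary.

```python
# Naive salience gate for candidate memories. The 0-10 scale and the 7.0
# threshold are arbitrary; a deployed system would presumably learn them.
from openai import OpenAI

client = OpenAI()

def salience(fact: str) -> float:
    out = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content":
                   "On a scale of 0-10, how worth remembering long-term is this "
                   f"fact about the user? Reply with a number only.\n{fact}"}],
    )
    return float(out.choices[0].message.content.strip())

def maybe_store(fact: str, store: list[str], threshold: float = 7.0) -> None:
    if salience(fact) >= threshold:
        store.append(fact)  # below-threshold facts are simply never memorized
```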
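Finally, the shape of the hierarchical organization in idea 9: atomic memories at the leaves, summaries at the inner nodes, and retrieval that walks down only the relevant branches. The keyword-overlap scoring is a stand-in for embedding similarity.

```python
# Sketch of a hierarchical memory tree. Inner nodes hold summaries, leaves
# hold atomic memories; keyword overlap stands in for real similarity search.
from dataclasses import dataclass, field

@dataclass
class MemoryNode:
    text: str
    children: list["MemoryNode"] = field(default_factory=list)

def retrieve(node: MemoryNode, query: str, min_overlap: int = 1) -> list[str]:
    """Descend only into branches whose text shares words with the query."""
    words = set(query.lower().split())
    if len(words & set(node.text.lower().split())) < min_overlap:
        return []
    if not node.children:  # leaf: an atomic memory worth surfacing
        return [node.text]
    hits = [m for child in node.children for m in retrieve(child, query, min_overlap)]
    return hits or [node.text]  # fall back to the summary if no leaf matched
```

A toy query shows the intended behavior, surfacing only the relevant leaf:

```python
root = MemoryNode("user preferences and projects", [
    MemoryNode("coding preferences", [
        MemoryNode("prefers Python type hints"),
        MemoryNode("uses pytest for testing"),
    ]),
    MemoryNode("personal details", [MemoryNode("lives in Chicago")]),
])
print(retrieve(root, "user testing preferences"))  # -> ['uses pytest for testing']
```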
These are just a sampling of ideas for how Memory could be improved or extended. For developers working on generative AI, I hope this provides some inspiration. Managing context is all-important for generative AI applications, and Memory is one way to do it. As an area of R&D, there’s much more to be done to make working with LLMs more fluid and responsive. As with all things generative AI, the surface remains only scratched.