We’ve previously looked at the orchestration that OpenAI apply to ChatGPT4, but it’s really ChatGPT5 that got me thinking about this.
There has been a lot of backlash to GPT5, and while some of it comes from the rate-limit changes, a lot of the frustration stems from a change in “feel”. Cue the “Why does GPT5 feel so different” posts.
The blame has all been put on the models, but a lot of it comes down to the AI orchestration – the hidden beast behind ChatGPT.
Just a week on from launch, ChatGPT5 “feels” more personable on the surface, but also less human underneath. OpenAI didn’t build a new model overnight, and a couple of prompt tweaks won’t change its personality. It’s the orchestrator at work again, and things have radically changed with ChatGPT5.
And that’s why GPT5 feels like a stranger.
What’s Changed in ChatGPT5?
Lots. At the time of writing, after only a few days of probing around, I’ve discovered some things that completely change the tonality of responses and the user experience.
It’s not so much the model under the surface, but how all of this is sequenced together. Where do memories fit? How are they assembled into context? What happens with nudges from the safety system? What about the different pipelines for web search, voice and text? What about messages routed between different types of underlying model, and how that affects consistency of response?
Changing the orchestration, but leaving the model intact, can completely change your user experience. Here’s what I’ve noticed so far.
Routing: The Biggest Shift in GPT5 Orchestration
The first major change publicly announced: ChatGPT5 (non-thinking) includes a routing agent. We discussed this within the context of model restrictions and different pipelines in ChatGPT4o, but it’s now a public feature within 5.
How it works:
- Your message comes in
- The orchestrator (potentially via a small model) analyses it
- Selects what it considers to be the best model to route your request to
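The steps above can be sketched in a few lines. This is a toy illustration only – the model names, the keyword heuristic and the threshold are all my assumptions, not OpenAI’s actual routing logic:

```python
def route(message: str) -> str:
    """Pick a model tier for an incoming message (illustrative only).

    A real router would likely use a small classifier model; here a
    crude keyword-and-length heuristic stands in for that analysis.
    """
    technical_markers = ("prove", "debug", "optimize", "derive", "analyse")
    score = sum(marker in message.lower() for marker in technical_markers)
    score += len(message) > 400  # long prompts lean toward deeper reasoning

    if score >= 2:
        return "thinking-model"   # detailed technical analysis
    if score == 1:
        return "standard-model"   # somewhere in between
    return "fast-chat-model"      # simple chat

print(route("hey, how's it going?"))            # → fast-chat-model
print(route("debug and optimize this kernel"))  # → thinking-model
```

The key property – and the source of the “feel” problem discussed below – is that the caller never sees which branch was taken.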
On the surface this is great; questions requiring detailed technical analysis go to a thinking model, simple chat goes to a more basic model and everything else goes somewhere in between.
This was completely broken on day one of the GPT5 launch. Even after those issues were addressed, there have been plenty of complaints about response accuracy and shifts in tone. Ultimately, if the router mistakenly sends a complex question to a simple model, the answer is going to be weak.
Whether you want it to think or not, GPT5’s orchestrator decides what to do
But it’s more than that.
Why did people select o3-pro versus 4o before? For most it was a combination of two things: the models solved different tasks in different ways, and the responses ‘felt’ different.
Now that difference is thrust upon you mid-conversation, whether you want it or not.
Imagine you’re sitting in an online chat room with 20 people. The old orchestration model is akin to you receiving clearly labelled messages with the name of the sender. Now the chat room appears to be a one-on-one conversation, but it could be any of the 20 people replying to you with no way of knowing who it is.
This is what routing does.
Different people use different conversational styles and respond in different ways; it’s a core part of how humans interact, both in groups and one-on-one. Routing doesn’t just tone-shift; it person-shifts. That makes GPT5 incredibly powerful for problem-solving, but in conversation it is jarring and not naturally “human”.
So thoughtful…
Some have accused OpenAI of doing this just to save compute resources. I do think there’s truth to that, but as someone who’s recently been building an AI system using routed agents myself, I can attest that there are valid reasons for it. Solutions can be much better when requests go to targeted agents rather than one all-purpose agent – but it is jarring.
The problem is that a lot of people weren’t using AI to solve technical problems, they were using it as a chat partner. This is where routing falls flat on its face.
Remember Me? Memory and Context in GPT5
Next up on the ‘major changes I’ve noticed’ list is how ChatGPT5’s memory context is engineered.
I am almost certain that their location in context has changed. As discussed previously, memories in ChatGPT4 seemed to be injected just before the last user message, giving the response a sense of immediacy. Now it feels like they are placed at the very top of the context, before the start of the conversation, and already summarized.
This is a subtle change. The previous orchestration felt like a human going “Oh yes! I remember now!” – which is surprisingly human. The new approach feels like “I am the fount of all knowledge and knew this all along” – which comes across as cold, detached and slightly robotic.
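The two placements can be sketched side by side. The message structure and field layout here are my assumptions for illustration, not OpenAI’s actual context format:

```python
def summarize(memories: list[str]) -> str:
    # Stand-in for whatever compaction the orchestrator applies.
    return "Summary: " + "; ".join(memories)

def build_context_old(system: str, history: list[str],
                      memories: list[str], user_msg: str) -> list[str]:
    """GPT-4-style (as it appeared): raw memories injected just before
    the final user message, giving them recency and immediacy."""
    return [system] + history + ["\n".join(memories), user_msg]

def build_context_new(system: str, history: list[str],
                      memories: list[str], user_msg: str) -> list[str]:
    """GPT-5-style (as it now feels): a pre-summarized memory block
    pinned above the entire conversation."""
    return [system, summarize(memories)] + history + [user_msg]
```

In the old layout the model effectively “rediscovers” the memory right next to your question; in the new one it has “always known” it, which plausibly explains the colder tone.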
Example context and then forcing more in – ChatGPT pretends it knew all along
How GPT5 hallucinates its own view of memories
The reason for this could well be context compaction. If memories always sit at the top, the orchestrator knows exactly what to strip out and what to pass to a second model, and the nuance of the memories is never lost. Rather than extracting memories and re-inserting them at the appropriate points in the context, it simply compacts everything else and leaves the memory section as-is.
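A minimal sketch of that compaction idea, assuming a flat list-of-strings context with the memory block pinned second (the structure and the oldest-turns-first eviction rule are both my guesses):

```python
def compact_context(system: str, memory_block: str,
                    history: list[str], max_turns: int) -> list[str]:
    """Keep the system prompt and memory block untouched; drop the
    oldest history turns until only `max_turns` remain.

    Because the memory section never moves, nothing about the user's
    stored memories is lost during compaction -- only old chat turns.
    """
    kept = history[-max_turns:] if max_turns > 0 else []
    return [system, memory_block] + kept

# The oldest turn "a" is evicted; system prompt and memories survive.
print(compact_context("sys", "memories", ["a", "b", "c"], 2))
```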
This changes the tonality of response. Great for research, poor for chat.
Why GPT5 Feels Different
At the end of the day, it’s not really the model that’s changed, but the orchestration system.
By adding a routing agent and shifting memory placement, OpenAI has radically changed the ChatGPT5 user experience. For research, it’s sharper. For conversation, it’s colder.
Using GPT5 is like getting to know someone new. And that’s why GPT5 feels like a stranger.
Next Time
If ChatGPT5’s changes to memory and routing aren’t enough to convince you that orchestration is driving its personality – wait until next time, when we’ll explore how ChatGPT5 mistrusts you, censors its own responses and adjusts them dynamically in real time.
Subscribe to my Substack so you don’t miss part 2 where I dive deeper into ChatGPT5 Thinking.