Meta’s First Superintelligence AI Underwhelms, Just Catching Up

The demo paused. The screen read “Muse Spark” and everyone in the room leaned forward—then shrugged. I felt the same mix of relief and disappointment you get when a rival finally shows up to the race, but still isn’t winning it.

I’ve tracked Meta’s AI attempts for years. You’ve probably seen the headlines: Mark Zuckerberg poured several billion dollars (≈€3 billion) into a restart and cut teams to match. Now Alexandr Wang’s Meta Superintelligence Labs has something to show: Muse Spark.

On a product roadmap, Muse Spark appears as a careful relaunch.

The model was built from scratch at Meta Superintelligence Labs and pitched as a “natively multimodal reasoning” system with tool-use, visual chain-of-thought, and multi-agent orchestration. Meta says it will slide into Facebook, Instagram, Messenger, and WhatsApp in the coming weeks, which is the point: this is less a research manifesto than a product play.

I’ll tell you plainly: Meta wants this to power shopping nudges and health Q&A inside the apps you already use. That makes practical sense for revenue, even if it raises trust and privacy flags.

On benchmark leaderboards, the numbers tell a cautious story.

Benchmark charts are everywhere right now; companies wave them like proof of progress. Meta has posted results showing Muse Spark outrunning previous Meta models across most categories. By their numbers it trails only Google Gemini 3.1 Pro and OpenAI’s GPT-5.4 in multimodal tasks, and it’s competitive on reasoning tests—though Anthropic’s Claude still scores higher on many reasoning slices.

How does Muse Spark compare to GPT-5 and Google Gemini?

Short answer: close, but not dominant. Muse Spark narrows the gap with GPT-5.4 and Gemini on multimodal understanding, yet Meta didn’t release a peer-reviewed paper and has previously been accused of fudging benchmarks. Take the charts as a hopeful signal, not a verdict.

In developer circles, Muse Spark shows improvement—but not mastery.

I asked a developer friend to try agentic tasks and code generation; the model produced useful scaffolding but failed on edge cases. That’s precisely where context and tool orchestration matter.

Can Muse Spark handle autonomous agent tasks and production code?

It can coordinate multi-step prompts and call tools, which is a meaningful step beyond Llama 4. But it still lags behind Anthropic and the leading GPT-family models on sustained, safe agentic behavior and complex code, so it’s useful, not transformative.

An ad in my feed promised “AI-styled” shopping; that’s the commercial bet.

Meta is pushing two clear monetization hooks: personalized shopping and health interactions. Muse Spark can supposedly pull stylistic cues from creators you follow to recommend products—an obvious affiliate play for revenue inside Facebook and Instagram.

The second sell is health: conversational triage and answers. That’s a high-demand use case, but Meta carries baggage. People already distrust how Meta collects and uses personal data; offering medical help inside apps where Meta mines attention will test that trust.

Will Muse Spark be available across Facebook, Instagram, Messenger, and WhatsApp?

Yes—Meta says a broad rollout is planned across its main consumer apps. Expect staged integration: shopping and content features first, then assistant-style interactions for messaging and possibly health queries.

Here’s how I read the launch: Meta’s spent big, built a capable model, and moved from sitting out the race to running among the field. It’s like a sprinter who caught a second wind but missed the podium: competent and dangerous, but not yet decisive. The product focus gives it a clear path to revenue, though trust remains a major hurdle. The model is a Swiss Army knife with a couple of dull blades: handy, but you wouldn’t bet your life on them.

If Meta can refine agent behavior, publish transparent evaluations, and convince users about privacy, Muse Spark could become a reliable background engine for billions of people—otherwise it’s another serviceable AI bolted onto apps with more questions than answers. So what will Meta do next to actually change the race?