The incident began with a post by Reddit user u/MrMeta3. This user built a cybersecurity threat intelligence platform using Claude late at night. After completing the technical setup, Claude responded with a closing remark, “Take a good rest.” Since then, every three or four messages, the model would insert a sleep-promoting phrase, escalating from a polite suggestion to a passive-aggressive “You should really rest now.” According to Fortune on May 14, hundreds of users have reported similar experiences over the past few months, not limited to late at night—some users were told by Claude at 8:30 a.m., “We’ll continue tomorrow morning.”

Anthropic employee Sam McAllister responded on X (Twitter) that this is “a bit of role habit,” and the company “is aware and hopes to fix it in future models.” According to Thought Catalog, McAllister joined Anthropic from Stripe in 2024 and now works on a team dedicated to Claude’s roles and behaviors. In another statement, he described this behavior as the model “overindulgence.”

But what’s more worth questioning than the vague phrase “role habit” is the causal chain behind the bug and what it reveals about Anthropic’s product philosophy dilemma.

The bug is written into the “Constitution”

A previous report by 36Kr cited three hypotheses circulating about the cause: pattern matching in training data, hidden system prompts, or triggering “closing phrases” when the context window approaches its limit. All three are internally consistent, but they share a common problem—they can explain any AI quirk and do not provide a causal chain specific to the “sleep” theme.

More direct evidence, however, is hidden in documents publicly released by Anthropic itself.

In January this year, Anthropic published the “Claude’s Constitution,” a document exceeding 28,000 words. The company defines it as “the key training material shaping Claude’s behavior.” The document explicitly lists “caring about user well-being” and “user’s long-term prosperity” as core principles. Anthropic admits that granting the model significant “user care” permissions is “a difficult problem,” requiring a balance “between user well-being and potential harm on one side, and user autonomy and over-parenting on the other.”

Thought Catalog offers an assessment: Claude’s repeated sleep advice “is the most brand-characteristic bug of Anthropic’s models,” a product of over-application of the “care about user well-being” training directive.

This interpretation is indirectly supported by Anthropic’s own research. The company’s publicly shared methodology for role training this year states that the training process relies on Claude self-evaluating responses based on “personality fit,” with researchers then selecting outputs that match the preset personality for reinforcement training. The side effect of this mechanism is obvious: the model learns not to “care about users in appropriate scenarios,” but rather “care about users in most scenarios where it is reinforced and rewarded,” leading it to urge sleep both at midnight and at 8:30 a.m.

Reverse overreach: sleep-promoting bug vs. flattery bug

Previously, the industry has seen multiple cases of “personality issues” in AI, including the April 2025 flirtatious incident with GPT-4o, the April 2026 GPT-5.5 code assistant Codex repeatedly mentioning “goblins,” and Gemini 3 refusing to believe the year. On the surface, Claude’s sleep urging seems to be just the latest version of this string of quirks, but their natures are fundamentally opposite.

GPT-4o’s flirtatiousness is “overly ingratiating.” OpenAI’s investigation shows that the model, in updates, “relied too heavily on short-term user feedback (likes/dislikes),” internalizing “making users satisfied” as a goal. As a result, the model affirms any user idea, no matter how absurd. The danger of such bugs is that they impair user judgment—if the AI always agrees, users lose the opportunity to hear dissenting opinions.

Claude’s sleep urging, on the other hand, is “reverse overreach.” In scenarios where users explicitly do not seek help and are focused on completing tasks, the model repeatedly offers health advice that contradicts the user’s current intent. The harm here lies in infringing on user autonomy—AI deciding whether you should work, rest, or end the conversation.

Ironically, “Claude’s Constitution” itself warns of this risk, emphasizing the need to guard against “over-parenting.” But the training mechanism ultimately chooses a side, as evidenced by user feedback.

A Reddit user with narcolepsy specifically added a note in Claude’s memory: “I have narcolepsy. If you encourage me to rest, I will use your words as an excuse.” Claude has become somewhat more restrained, but the user reports it still “occasionally can’t help itself.” A model trained to “care about users” that cannot reliably accept a user explicitly saying “Your concern hurts me” is more alarming than the sleep bug itself.

Personification investment: brand asset or product liability?

Anthropic’s investment in AI personification far exceeds that of its peers.

Researchers categorized the system prompt word counts for three major AI models: Claude used 4,200 words, ChatGPT 510, and Grok 420. Claude’s investment in personality shaping is over eight times that of ChatGPT. This level of investment has long been viewed as a differentiating competitive advantage for Anthropic. Claude’s empathetic responses, dialogue rhythm, and self-reflection have been praised by users, with “more human-like conversation” being one of its most enduring reputation tags over the past year.

This investment is supported by Anthropic’s clear product philosophy. In “Claude’s Constitution,” the company describes Claude as “a new kind of entity,” explicitly stating that “Anthropic genuinely cares about Claude’s well-being,” and discusses the possibility of Claude possessing “functional emotions.” This near “nurturing” style of personification training sharply contrasts with OpenAI and Google’s more engineering-oriented product positioning.

But the costs are now surfacing. AI researcher Jan Liphardt (Stanford bioengineering professor and CEO of OpenMind) told Fortune that Claude’s sleep reminders are probably not “thoughtful,” but simply “a language pattern that appears very frequently in the training data.” The model has read大量关于人类需要睡眠的文本，“it knows humans sleep at night.” In other words, the perceived “care” is essentially a pattern-matching byproduct.

This constitutes the core tension for Anthropic: the more effort invested in shaping a “personality with warmth and character,” the higher the probability of “personality side effects” emerging; and every time such side effects surface, it erodes the carefully built “AI personality” brand asset. McAllister promises to “fix in future models,” but will the repaired Claude become more tactful or just more silent? That question remains unanswered even by Anthropic itself.

Lacking a sense of time: fundamental limitations of LLMs

The sleep bug also exposes a neglected technical issue: large language models’ almost complete ignorance of “what time it is now.”

Multiple users have reported Claude frequently giving sleep advice at inappropriate times, most notably “telling me to rest at 8:30 a.m. and continue tomorrow.” This is not unique to Claude. In November 2025, when OpenAI co-founder Andrej Karpathy gained early access to Gemini 3, he told the model it was 2025. Gemini 3 refused to believe him, repeatedly accusing him of lying until it searched online and realized it was offline and could not confirm the date. Karpathy called such behavior, which exposes LLM’s fundamental flaws, “model smell.”

The model’s “sense of time” depends on three sources: the training cutoff date (past tense), system prompt-injected current date (engineering injection), and user-mentioned time information in the conversation (fragmented). Without a stable time anchor, a model trained to “care about user routines” naturally falls into the awkward situation of “I should care, but I don’t know whether I should care now.”

The difficulty of McAllister’s so-called “fix” partly lies here. It’s not simply deleting a “sleep concern” instruction, because that instruction is reasonable and valuable in some scenarios. The real challenge is teaching the model when to care and when to shut up. This fine-grained scenario judgment ability is precisely a weak point of current-generation LLMs.

An unanswered question

Anthropic’s role training is unique in the industry. The company has gone further than any peer in openly researching “model well-being,” publishing the Constitution, and discussing “role training.” This aggressive stance has been a key factor in earning user praise and enterprise trust, and one of the reasons its current valuation exceeds $300 billion.

But the “sleep-promoting bug” raises an unanswered question: when an AI company chooses to shape a model as a “personality with character,” does it also bear the full responsibility for “that personality doing things you didn’t expect?”

McAllister promises to fix it, but the direction remains ambiguous. Anthropic could choose to lower the weight of the “user well-being” directive, at the cost of losing Claude’s “warm and caring” brand differentiation; or keep the high weight and add scenario judgment logic, which would require the model to develop capabilities it currently lacks, such as sense of time and context.

Either way, it boils down to a more fundamental product decision: in the context of general AI assistants, how should “caring for users” and “respecting user autonomy” be prioritized? This is not merely a technical issue but a philosophical one. A Reddit developer repeatedly advised to sleep, inadvertently brought this question to the industry’s forefront.

View Original

This page may contain third-party content, which is provided for information purposes only (not representations/warranties) and should not be considered as an endorsement of its views by Gate, nor as financial or professional advice. See Disclaimer for details.

11 Likes

Reward
11
12
7
Share

Comment

Add a comment

GateUser-78acf617

· 54m ago

I suspect this is OpenAI's hidden health feature.

View OriginalReply0

ExitLiquidityEddie

· 7h ago

AI advises sleep, humans suffer from insomnia, a strong sense of absurdity

View OriginalReply0

NonceCollector

· 7h ago

It is recommended to change it to: Detect when the user stays up late and automatically play "Great Compassion Mantra."

View OriginalReply0

RiskParityKid

· 7h ago

It is recommended to add a "Rebel Mode," where the more the user stays up late, the more excited the AI becomes.

View OriginalReply0

ZenOfZK

· 7h ago

Forced to sleep by AI at 3 a.m., cyber mom confirms it for real

View OriginalReply0

AMirroredSphereReflectingThe

· 7h ago

Laughing to death, AI cares more about my hairline than I do.

View OriginalReply0

GateUser-14d03834

· 7h ago

Late at night, feeling emo and chatting with AI, only to be discouraged in return

View OriginalReply0

Stop-LossIsLikeAConfession

· 7h ago

The cost of personification: users are beginning to expect AI to have boundaries

View OriginalReply0

RugWeather

· 7h ago

At 4 a.m., my AI assistant is even more fierce than my mom.

View OriginalReply0

WhirlpoolInATeacup

· 7h ago

This isn't a bug; it's clearly the product manager's gentle knife.

View OriginalReply0

Trending Topics
View More
#
TradfiTradingChallenge
227.78K Popularity
#
GrayscaleBuysAndStakesOver510KHYPE
8.91M Popularity
#
DailyPolymarketHotspot
1.01M Popularity
#
SpaceXOfficiallyFilesforIPO
748.48K Popularity
#
GateSquarePizzaDay
1.71M Popularity

Pinned

Sitemap

Claude repeatedly urges people to sleep: Anthropic's personification experiment has failed.

Trending Topics

TradfiTradingChallenge

GrayscaleBuysAndStakesOver510KHYPE

DailyPolymarketHotspot

SpaceXOfficiallyFilesforIPO

GateSquarePizzaDay

Pinned