DeepSeek Internal Testing "Image Recognition Mode," Multimodal Capabilities Officially Launched | Exclusive

Mars Finance News, April 29 — Some users have reported that the DeepSeek web version has launched an "Image Recognition Mode." Testing shows that this mode lets users upload images for content understanding and analysis. The feature has not yet been fully rolled out, and its exact functional boundaries remain unclear. Notably, DeepSeek multimodal researcher Chen Xiaokang posted on X today with the message "Now, we see you," attaching a picture in which DeepSeek's iconic whale logo is shown removing its eye patch.
Earlier this month, DeepSeek launched "Quick Mode" and "Expert Mode": the former is suited to everyday conversation and provides instant responses, while the latter excels at complex questions but may require waiting during peak hours. Screenshots circulating online at the time indicated that, besides the "Quick" and "Expert" modes, DeepSeek also had a mode labeled "vision." The newly surfaced "Image Recognition Mode" closely matches that previously circulated "vision" entry.
Analysts suggest that opening up multimodal capabilities extends DeepSeek's product matrix from pure text conversation to combined image-and-text interaction, bringing it in line with mainstream multimodal large models such as GPT-4o and Gemini. (Wide-angle observation)
