Karpathy Launches nanochat: A Minimal, Full-Stack ChatGPT
PLUS: Google Expands Nano Banana AI Image Tools
Nano Banana Now Enhances Visual Creation Across Google Products
Google has announced that Nano Banana, its AI image-editing model (which originated in Gemini 2.5 Flash), is rolling out to core Google products: Search (via Lens), NotebookLM, and soon Photos. With more than 5 billion images already generated via Gemini, the expansion aims to let users transform or generate visuals in the contexts where they already work - browsing, note-taking, or managing photos.

Key Points:
Integrated image editing in Search / Lens - In Search (via the Google app), users can snap or upload a photo, then use a new “Create” mode in Lens to transform or edit the image using Nano Banana. This brings generative editing into everyday visual discovery.
NotebookLM enhancements: illustrations & “Brief” format - In NotebookLM, Nano Banana will power new visual styles (e.g. watercolor, anime) to illustrate summaries and generate contextual images from sources. It also enables a “Brief” view — a condensed format with visuals to support quick insights.
Coming to Photos: broader deployment - Google plans to extend Nano Banana’s capabilities into Photos in the near future, giving users more control over their image gallery (editing, stylization) within the product they already use.
Conclusion
By embedding Nano Banana into Search, NotebookLM, and Photos, Google is making generative image editing less of a niche tool and more of an integrated part of how people interact with visuals daily. This move emphasizes utility over hype — allowing you to edit, stylize, or generate images in situ rather than switching contexts. The real test will be how seamless the experience feels and whether it complements users’ creativity without friction or confusion.
Karpathy Launches nanochat: A Minimal, Full-Stack ChatGPT Clone in ~8,000 Lines
Andrej Karpathy announced a new open-source repo called nanochat, described as one of the “most unhinged” things he’s written. Unlike his earlier project nanoGPT (which focused solely on pretraining), nanochat is a dependency-minimal, full training-and-inference pipeline in a single cohesive codebase. Run one script on a cloud GPU box and, in about 4 hours and for roughly $100, you can have a ChatGPT-style interface running on your own model.

Key Points:
Full pipeline in one repository - nanochat includes tokenizer training (in Rust), Transformer pretraining, midtraining on conversational, multiple-choice, and tool-use data, supervised fine-tuning, optional reinforcement learning (GRPO), and efficient inference (KV caching, tool usage, CLI and web UI) - all in roughly 8,000 lines of code. A simplified sketch of the GRPO idea follows this list.
Cost / time efficiency & scaling
A minimal “chatbot clone” can be trained in about 4 hours on an 8×H100 node (roughly $100 of compute) and supports basic interactive conversation.
With ~12 hours of training, it surpasses GPT-2’s score on the CORE benchmark. With 40+ hours (~$1,000 of compute), the model becomes more coherent and can tackle math, code, and multiple-choice tasks.
The data regime: pretraining on “FineWeb” (≈24 GB), then midtraining using SmolTalk conversations, MMLU auxiliary data, GSM8K, etc.
Minimal, hackable, modular design - Karpathy emphasizes readability, modularity, and minimal external dependencies. The goal is to make nanochat easy to fork, learn from, and iterate on. He positions it as a “strong baseline” stack and hopes it becomes a research harness or benchmark in the style of nanoGPT.
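The reinforcement-learning step above uses GRPO (Group Relative Policy Optimization): for each prompt, the model samples a group of completions, scores them with a reward, and turns those rewards into group-normalized advantages that weight a policy-gradient update. The snippet below is a minimal sketch of that idea only, not nanochat’s actual code; the sequence-level log-probabilities, the toy reward values, and the omission of PPO-style clipping and any KL penalty are all simplifying assumptions.

```python
# Simplified sketch of the GRPO idea (group-relative advantages), not nanochat's code.
# Assumptions: rewards are already computed per completion (e.g. 1.0 if an answer is
# correct, else 0.0), and `logprobs` holds the summed token log-probabilities the
# current policy assigns to each sampled completion.
import torch

def grpo_loss(logprobs: torch.Tensor, rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """logprobs, rewards: shape (G,) for a group of G completions of one prompt."""
    # Group-relative advantage: how much better each completion did than its siblings.
    advantages = (rewards - rewards.mean()) / (rewards.std() + eps)
    # Plain policy-gradient surrogate; real GRPO adds PPO-style clipping and a KL term.
    return -(advantages.detach() * logprobs).mean()

# Toy usage: 4 sampled completions for one prompt, two of them judged correct.
logprobs = torch.tensor([-12.3, -15.1, -9.8, -20.4], requires_grad=True)
rewards = torch.tensor([1.0, 0.0, 1.0, 0.0])
loss = grpo_loss(logprobs, rewards)
loss.backward()  # gradients push probability toward the above-average completions
print(loss.item(), logprobs.grad)
```

A setup like this fits tasks with automatically checkable answers, such as the GSM8K math problems mentioned in the data mix above, where each sampled completion can simply be scored 1 or 0.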
Conclusion
nanochat is an audacious attempt to collapse the entire stack of LLM development—training, inference, tooling—into a single, digestible, hackable codebase. Its affordability, speed, and transparency make it a compelling learning tool and experimentation platform for AI developers. While it won’t match the capabilities of large proprietary models yet, its true value lies in enabling hands-on exploration, iteration, and open research in a way that is rare at this scale.
Apple Eyes Prompt AI’s Team, Tech for Visual AI Push
Apple is reportedly closing in on acquiring the engineering team and underlying technology from Prompt AI, a computer vision / visual intelligence startup. Rather than buying the entire company, the deal appears to function like an “acquihire,” absorbing the talent and IP. The move signals Apple’s continued push to bolster its AI and vision capabilities internally.

Key Points:
Talent + tech, not full acquisition - The deal focuses on bringing in Prompt AI’s engineers and software assets, rather than acquiring its full business or user base. Some employees may receive retention offers or incentives as part of the transition.
Prompt AI & its domain - Prompt AI works in the visual / computer-vision space; its flagship app, Seemour, connects to home security cameras to detect objects, people, or anomalies, delivering alerts and textual summaries.
Wider AI talent trend & strategic motive - Apple is trying to rapidly bootstrap internal AI / vision capabilities by integrating startups rather than building everything from scratch. This is part of a broader pattern of big tech using acquihires to catch up in AI.
Conclusion
If this deal goes through, it would underscore Apple’s urgency to strengthen its AI stack, especially on the vision/computer-vision front, and reduce its dependence on external models or licensors. For Prompt AI’s engineers, it’s a fast route to resources and scale; for Apple, it’s a strategic shortcut to advanced capabilities. Success will hinge on how well Apple integrates the team and IP, and how they align with its broader AI roadmap (Siri, on-device vision, AR/VR, etc.).
Thank you for reading.