Shizuku AI

a16z's first Japanese bet

Mar 27, 2026

Thesis

When we think about the application layer of LLMs, it is easy to think of practical vertical use cases such as software development, legal, or finance, but the top use case of generative AI is actually therapy and companionship. The demand for companionship is not new, but the technology behind it has moved quickly in the last few years. A clear line runs from Twitch to VTubers to AI companions. Each step captured demand the prior format couldn’t reach. Shizuku AI, founded by Akio Kodaira in August 2025 and backed by a16z in a $15M seed round at a $75M valuation, aims to build the next generation of this technology in the form of AI VTubers.

Founding Story

Shizuku AI, the company, was founded by Akio Kodaira (CEO) in August 2025. In 2023, while still completing his Ph.D. at UC Berkeley, Kodaira launched an AI VTuber named Shizuku on YouTube. With Shizuku he ran dozens of livestreams and built a community of thousands of followers. On the research side, Kodaira was a lead author on papers such as StreamDiffusion, which showed real-time image generation fast enough for live video applications. Kodaira also worked on real-time video generation at Meta and Luma AI, giving him the technical edge to compete at the frontier.

Evolution Of Technology

Although the field Shizuku AI plays in could broadly be called entertainment, I'd like to narrow in on companionship and trace how it, and the technologies around it, have evolved to the present day. Thinking from first principles, companionship as a service is the experience of focused, reciprocal attention from another mind, delivered through a medium in exchange for value. Back in 1956, social scientists Donald Horton and Richard Wohl coined the term parasocial relationship to describe the illusion of intimacy TV viewers developed with on-screen personas. The idea persists: a 2017 Google study found that 40% of millennial YouTube subscribers said their favorite creators understood them better than their friends. The underlying model is that emotional investment in a media figure has turned into an economic engine, one that powers a multi-billion dollar human livestreaming industry and a rapidly growing virtual character industry. By observing how focused attention has been made cheaper, more scalable, and more accessible over time, we can make a reasonable prediction of how companionship technology will evolve. Each wave can be accused of replacing the real thing with some form of artificial intimacy, yet each wave is adopted much faster than incumbents predict because it serves latent demand that could not previously be reached.

Twitch

Twitch launched in June 2011 as a gaming live-stream platform and quickly became one of the largest companionship platforms in the world. Twitch had the parasocial dynamics of broadcast media, but it also had a direct payment layer through subscriptions, donations, and Bits, allowing viewers to purchase micro-moments of attention from a streamer in real time. Some of Twitch's numbers are significant:

240 million monthly active viewers as of early 2025, up from 55 million in 2015.
20.8 billion hours of content consumed in 2024, with a peak of 24.3 billion hours during the pandemic era.
$1.8 billion in estimated revenue in 2024, generated through subscriptions (58%), advertising (33%), and Bits/virtual goods (9%).
Just Chatting became the #1 most-watched category on Twitch in 2023, accumulating 2.86 billion hours of global watch time.
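
To make the revenue split concrete, here is a minimal Python sketch that converts the percentages quoted above into dollar amounts. It uses only the figures from this post (the $1.8 billion total is itself an estimate); the variable and function names are my own, and grouping subscriptions with Bits as "viewer-paid" revenue is an assumption made here to illustrate the direct payment layer.

```python
# Figures quoted above: estimated 2024 Twitch revenue and its split by source.
TOTAL_REVENUE_USD = 1.8e9
REVENUE_SHARES = {
    "subscriptions": 0.58,
    "advertising": 0.33,
    "bits_virtual_goods": 0.09,
}


def revenue_by_source(total, shares):
    """Return the dollar amount implied by each share of the total."""
    return {source: total * share for source, share in shares.items()}


breakdown = revenue_by_source(TOTAL_REVENUE_USD, REVENUE_SHARES)

# Assumption: subscriptions plus Bits approximate revenue paid directly by viewers.
viewer_paid = breakdown["subscriptions"] + breakdown["bits_virtual_goods"]

for source, usd in breakdown.items():
    print(f"{source}: ${usd / 1e6:,.0f}M")
print(f"viewer-paid (subscriptions + Bits): ${viewer_paid / 1e6:,.0f}M")
```

Under these assumptions, roughly two thirds of the estimated revenue (about $1.2 billion) comes from viewers paying directly rather than from advertisers.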

While the MAU and revenue figures are impressive, Just Chatting being the #1 most-watched category is less intuitive. It accumulated more watch hours than any individual game, including League of Legends (1.23 billion hours). It is important to note that Twitch tracks each game as its own category, while Just Chatting is one giant umbrella. Even with this in mind, the demand for unstructured content that fits this category of companionship is notable. Researchers have described Twitch parasocial relationships as one-and-a-half-sided because Twitch offers intermittent moments of genuine reciprocity when a streamer reads a donation or responds to a chat message. The possibility, but not certainty, of being noticed creates what behavioral psychologists recognize as a driver of sustained engagement.

The Twitch companionship model has some limitations: it is one-to-many (the streamer cannot give sustained individual attention), time-bound (streamers sleep, burn out, and take breaks), and expensive to produce (top streamers are scarce human talent with leverage).

VTubers

In late 2016, a character named Kizuna AI debuted on YouTube. She was a 3D animated anime character and coined the term “Virtual YouTuber”. Within ten months, she had over 2 million subscribers. Her success triggered an industry: by January 2020, there were over 10,000 active VTubers. YouTube’s 2020 Culture and Trends report highlighted VTubers as a notable trend, with 1.5 billion views per month by October 2020. The VTuber model decoupled the companion from the physical person. The real human still performs behind the avatar, but the audience bonds with the character, not the individual. This had significant implications because characters became ownable IP. The VTuber character belongs to the agency. Cover Corp (Hololive) and AnyColor (Nijisanji) own the characters their talent performs. If a performer leaves, the character can, in theory, continue with a new voice. These characteristics have made the VTuber market quite lucrative.

$3+ billion global VTuber economy in 2025.
Cover Corp (Hololive): ¥13 billion (~$80M USD) revenue in FY2026/Q3, 88 managed VTubers, 80M+ combined subscribers.
AnyColor (Nijisanji): ¥42.9 billion (~$286M USD) revenue in FY2025, 34% YoY growth, 170 managed VTubers.

VTubers proved that audiences will form deep emotional bonds with an artificial persona. The human behind the avatar is anonymous and the audience attaches to the character. But VTubers still require a human performer for voice, improvisation, and emotional responsiveness. The performer remains the bottleneck.

AI Companions

Running parallel to the VTuber explosion, a separate category emerged: AI companion apps that replace the human entirely and restore the one-to-one intimacy that live streaming sacrificed for scale.

337 active, revenue-generating AI companion apps worldwide, with 128 released in 2025 alone.
$82 million in revenue generated in H1 2025, on track for $120M+ for the full year, representing 64% year-over-year revenue growth.
Character.AI: 20M MAU, 180M+ monthly website visits, 10 billion messages/month, $32.2M revenue in 2024.
Replika: 30M+ registered users, 85% report emotional connections, $24M revenue in 2024.
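
As a rough sanity check on how these apps monetize compared with live streaming, the sketch below divides each platform's quoted revenue by its quoted user count. All inputs come from this post; the comparison itself is my own back-of-the-envelope assumption, and the denominators are not like-for-like (monthly active viewers for Twitch, MAU for Character.AI, registered users for Replika), so treat the output as directional only.

```python
# Figures quoted in this post. The user counts and revenue years do not
# line up exactly, so the ratios are only a directional comparison.
PLATFORMS = {
    # 2024 estimated revenue, early-2025 monthly active viewers
    "Twitch": {"revenue_usd": 1.8e9, "users": 240e6},
    # 2024 revenue, monthly active users
    "Character.AI": {"revenue_usd": 32.2e6, "users": 20e6},
    # 2024 revenue, registered users (a looser measure than MAU)
    "Replika": {"revenue_usd": 24e6, "users": 30e6},
}


def revenue_per_user(platforms):
    """Return annual revenue divided by the quoted user count for each platform."""
    return {name: p["revenue_usd"] / p["users"] for name, p in platforms.items()}


per_user = revenue_per_user(PLATFORMS)
for name, usd in per_user.items():
    print(f"{name}: ${usd:.2f} per user per year")
```

With these inputs, Twitch comes out around $7.50 per monthly viewer, Character.AI around $1.61 per monthly active user, and Replika around $0.80 per registered user.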

Despite strong growth, current AI companions have a fundamental gap: they are text-based, reactive, and lack the performative, visual, communal dimension that makes VTubers and streamers engaging. There is no character to see, no live stream to gather around, no community to belong to. a16z, in its investment thesis for Shizuku AI, identified this as “the fundamental challenge facing AI companions today”, where monotonous, reactive conversations fail to sustain long-term engagement. Users eventually churn because the AI has no presence.

AI VTubers

The AI VTuber is one plausible evolution of the technology we have evaluated so far: a format that combines the visual character identity and communal experience of a VTuber with the always-on availability and one-to-one capability of an AI companion. It streams live, responds to viewers in real time, has a face and a voice to bond with, and never sleeps.

One example of an AI VTuber is Neuro-sama, created by the pseudonymous developer Vedal. Neuro-sama streams autonomously on Twitch, driven entirely by AI with no human performer, and has achieved extraordinary traction.

343,000+ peak Twitch subscribers in January 2026, making the channel the third most-subscribed in Twitch history, behind only names like Kai Cenat.
Estimated seven-figure annual revenue from subscriptions alone, with peak monthly subscription revenue of approximately $400,000, before accounting for donations, Bits, sponsorships, and ad revenue.

Neuro-sama’s fans are said to engage in real-time co-performance where they actively shape the AI’s behavior through chat in the moment, rather than passively consuming content or remixing it after the fact. This creates a qualitatively different relationship than either traditional streaming or text-based AI companions.

Shizuku AI is building at this intersection. The character speaks Japanese and English, sings, and interacts with live-stream viewers in real time using LLM-driven conversation and text-to-speech synthesis.

Shizuku’s success hinges on solving the key failure mode of current AI companions: their conversations are reactive and monotonous, and cannot sustain engagement. Shizuku AI’s approach is to deploy the character across YouTube, Discord, and X simultaneously, building a community and a data flywheel where real interaction data trains increasingly proactive, engaging conversational models. Shizuku is also a16z’s first Japanese investment, and this choice seems very intentional. Japan has the world’s largest VTuber industry (home to Cover Corp and AnyColor), world-class AI and animation engineering talent at a fraction of Silicon Valley costs, and an aging, increasingly isolated population that creates structural demand for exactly what Shizuku offers.

Kodaira himself is the rare founder who has operated on every layer of this stack. He built a live AI VTuber with a real community, co-authored StreamDiffusion to push real-time generation to the speed live interaction demands, and shipped AI systems at Meta and Luma AI. He has deep experience building the infrastructure other people are now trying to use.

In every wave we analyzed, from Twitch → VTubers → AI Companions, the incumbents called the new format a poor substitute for the real thing, and it still found a way to serve demand the previous format couldn’t reach. Is the next wave here? If you’re building in this space, I want to hear from you. If you’re interested in joining Shizuku, they are also hiring.
