The Era of AI Directors: Technology Creating Hong Kong Martial Arts Vibes - Thousand Oaks

These days, if you look at X, Facebook, or YouTube, you keep seeing videos made with Seedance, the model from China's ByteDance.

Seedance 2.0, the model that generates these clips, was released this February, and its announcement turned Hollywood upside down.

Seedance 2.0 has been released in over 100 countries through the fal platform, but the U.S. is not among them.

Even if you open a link from X in the U.S., you just get a message saying, "This service is not available in your area."

The technology itself is genuinely impressive. Seedance 2.0 has topped the global leaderboard in the AI video arena, surpassing Google Veo 3, OpenAI Sora 2, and Runway Gen 4.5. A Chinese company has achieved excellent results with far less funding; the shock of the DeepSeek moment has now been repeated in video.

Watching this, I honestly feel conflicted. Living in the LA area, where the video industry is concentrated, I have friends working in VFX and a sibling in advertising production.

Until last year, people said, "AI-generated videos look awkward and obvious," but this year, their tune has changed. One friend mentioned, "It's now harder to promise clients that we won't use AI." This indicates that price competition has begun.

If a production budget of over $3,000 can be replaced with just a few hundred dollars, which client would ignore that?

Interestingly, a few lines of text still can't produce those videos. Type "a cat doing martial arts" and you just get a cat flailing around.


The videos people produce on Seedance or Kling are actually the result of prompts closer to a script in length.

"A ginger cat in a gi takes a stance, styled like a 1980s Hong Kong martial arts film, in slow motion, with the camera moving from a wide shot to a close-up."
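A prompt like the one above is less free text than structured direction: subject, style, motion, and camera move, assembled into one paragraph. Here is a minimal Python sketch of that idea; the template and every field name are my own illustration, not any platform's actual schema or API.

```python
# A minimal sketch: treat a video prompt as structured "direction" rather than
# free text. The template and field names are hypothetical, not any vendor's API.

SHOT_TEMPLATE = (
    "{subject}, styled like {style}, {motion}, "
    "with the camera moving from {start_frame} to {end_frame}."
)

def direct_shot(subject: str, style: str, motion: str,
                start_frame: str, end_frame: str) -> str:
    """Assemble one script-length prompt paragraph from directorial choices."""
    return SHOT_TEMPLATE.format(
        subject=subject, style=style, motion=motion,
        start_frame=start_frame, end_frame=end_frame,
    )

prompt = direct_shot(
    subject="A ginger cat in a gi taking a stance",
    style="a 1980s Hong Kong martial arts film",
    motion="in slow motion",
    start_frame="a wide shot",
    end_frame="a close-up",
)
print(prompt)
```

The point of the sketch is that each slot is a separate creative decision, which is exactly why the work starts to resemble directing a shot list rather than typing a caption.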

Writing that kind of English paragraph has become a new job in itself: not just a prompt engineer, but a prompt director.

The real skill is who can extract the desired image most accurately in a few lines. And the reason China is advancing so fast in this field comes down to data and infrastructure.

ByteDance operates TikTok and Douyin, and Kuaishou has its own short-form platform.

Video data is coming in endlessly, and once a model is created, it is immediately tested on CapCut with 800 million users. The feedback loop is one step shorter than that of American companies.

American companies take one step at a time, going through compliance, copyright, and safety reviews, while in China, they just throw it out there and see what happens.

The recent incident in which Seedance was found to have trained on Disney's IP and later issued a statement about "respecting intellectual property rights" illustrates this difference well.

I don't know how far this trend will go. However, what is certain is that five years ago, saying "it looks like AI video" was met with ridicule, but now saying "I can't tell if it's AI video" is considered a compliment.

Living on the outskirts of Hollywood and seeing billboards every day, I increasingly feel that the day is coming when you can't tell whether the model in the ad is a person or a pixel.

When that day comes, the more important question will be not who wins, but "which tools am I using?"

It's a bit funny that while living in the U.S., I can't use the most popular tools, but I don't think that situation will last long.