Madio

How long does AI take to render a math animation?

May 10, 2026 · Sanatan Sharma

A common question from anyone who has not used an AI animation tool is how long it actually takes to get a finished clip. The honest answer depends on four stages, each with its own time profile, and on which model and which backend you are running.

This post breaks down the stages, gives typical times on Madio specifically, lists what makes a render slower, compares Gemini Flash to Gemini Pro Thinking, and ends with practical optimizations.

The four stages

Every AI animation goes through roughly the same pipeline. The names differ between tools but the stages do not.

Stage one is LLM code generation. The prompt goes to a model that emits scene code. On Madio, that scene code is Python that imports Manim Community Edition. The model is Google Gemini 3, defaulting to Flash for speed. The output is a Python file that defines a Scene subclass with a construct method.
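As a rough illustration of what stage one produces, the sketch below holds a small scene file as a string and uses Python's standard ast module to confirm that it defines a Scene subclass with a construct method. The scene name CircleToSquare and the helper find_scene_classes are made up for this example, and the check is an assumption about how such output could be sanity-tested, not a description of Madio's internals.

```python
import ast

# Hypothetical example of LLM-generated scene code (not actual Madio output).
SCENE_CODE = '''
from manim import *

class CircleToSquare(Scene):
    def construct(self):
        circle = Circle()
        square = Square()
        self.play(Create(circle))
        self.play(Transform(circle, square))
        self.wait(1)
'''

def find_scene_classes(source: str) -> list[str]:
    """Return names of classes that directly subclass Scene and define construct()."""
    tree = ast.parse(source)
    names = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.ClassDef):
            continue
        bases = {b.id for b in node.bases if isinstance(b, ast.Name)}
        methods = {n.name for n in node.body if isinstance(n, ast.FunctionDef)}
        if "Scene" in bases and "construct" in methods:
            names.append(node.name)
    return names

print(find_scene_classes(SCENE_CODE))  # ['CircleToSquare']
```

The check only parses the file, so it runs without Manim installed. It catches syntax errors and a missing construct method, but not runtime errors inside the scene.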

Stage two is the render. The scene code is executed in a sandboxed environment with Manim Community v0.18.1 installed. Manim walks through the construct method and emits one frame per animation tick at the target frame rate, usually 30 fps. The rendered frames are encoded into an MP4 with ffmpeg.
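To get a feel for how much work the render stage involves, here is a back-of-the-envelope frame count. It assumes the 30 fps rate mentioned above and a 1920 by 1080 frame; the function names are illustrative, and the pixel total is an idealized upper bound rather than a measure of what Manim actually computes per frame.

```python
def frame_count(duration_s: float, fps: int = 30) -> int:
    """Number of frames needed for a clip of the given length."""
    return round(duration_s * fps)

def pixels_drawn(duration_s: float, width: int = 1920,
                 height: int = 1080, fps: int = 30) -> int:
    """Idealized pixel total: every pixel of every frame counted once."""
    return frame_count(duration_s, fps) * width * height

print(frame_count(30))   # 900
print(pixels_drawn(30))  # 1866240000
```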

Stage three is optional narration. On Pro and Team tiers, Madio adds AI-generated narration aligned to the scene timing. This stage runs after the silent video is rendered and adds an audio track via ffmpeg.

Stage four is upload. The finished MP4 is uploaded to S3 and the URL returned to the user. This stage is bounded by network speed and is consistent across renders.

Typical times per stage on Madio

Numbers below are for a 30-second math animation at 1080p on the Starter tier with the default Gemini Flash backend.

Stage one, LLM code generation: 15 to 30 seconds for a typical prompt. Simple prompts return scene code in around 15 seconds. Complex prompts with multiple objects, transformations, and labels can take up to 60 seconds. Prompts that ask for derivations or step-by-step proofs run longer because the LLM emits more code.

Stage two, render: 60 to 120 seconds for a 30-second clip. The rule of thumb is that 1080p Manim render time is 2 to 4 times the clip duration on a single CPU core. A 60-second clip takes 2 to 4 minutes to render. A 30-second clip takes 1 to 2 minutes.

Stage three, narration: 10 to 20 seconds when enabled. Narration uses a fast TTS pipeline and is not the bottleneck for short clips.

Stage four, upload: 3 to 8 seconds depending on file size and network conditions.

For a 30-second clip on the default backend, total wall-clock time is 90 to 180 seconds. For a 60-second clip, total is 3 to 5 minutes. For a 180-second clip on Pro, total is 5 to 10 minutes.
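The per-stage ranges above can be added up in a few lines. This is a minimal sketch using only the figures quoted in this post for the Flash backend at 1080p; it holds LLM time constant regardless of clip length, which is a simplification, and the function name is made up.

```python
# Stage ranges in seconds, taken from the figures above (Flash backend, 1080p).
LLM_S = (15, 30)
NARRATION_S = (10, 20)
UPLOAD_S = (3, 8)
RENDER_FACTOR = (2, 4)  # render time as a multiple of clip duration

def estimate_total(clip_s: float, narration: bool = False) -> tuple[float, float]:
    """Return a (low, high) wall-clock estimate in seconds for a first-try render."""
    low = LLM_S[0] + RENDER_FACTOR[0] * clip_s + UPLOAD_S[0]
    high = LLM_S[1] + RENDER_FACTOR[1] * clip_s + UPLOAD_S[1]
    if narration:
        low += NARRATION_S[0]
        high += NARRATION_S[1]
    return low, high

print(estimate_total(30))                  # (78, 158)
print(estimate_total(30, narration=True))  # (88, 178)
```

With narration included, the sum lands close to the 90 to 180 seconds quoted for a 30-second clip.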

What makes it slower

A handful of factors can move you from the fast end of those ranges to the slow end, or beyond.

Long videos. Manim render time scales linearly with clip duration. A 180-second clip takes six times as long as a 30-second clip in the render stage alone. LLM code generation scales sublinearly with clip length, but for clips longer than about 90 seconds it also slows noticeably because the script the model has to emit is longer.

3D scenes. Manim's ThreeDScene is significantly slower to render than 2D. Camera moves through 3D space add another multiplicative factor. A 30-second 3D scene can take 4 to 6 minutes to render where its 2D equivalent takes 1 to 2 minutes.

Retries. If the first scene code does not compile or produces a broken animation, Madio retries with a corrected prompt. Each retry adds the full LLM stage plus the full render stage. A clip that needs two retries before producing valid output takes roughly three times as long as a first-try success.
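The retry arithmetic can be sketched as follows. The stage times passed in (20, 90, and 5 seconds) are illustrative midpoints of the ranges in this post, not measurements, and the sketch assumes the upload happens once, after the last successful attempt.

```python
def total_with_retries(llm_s: float, render_s: float,
                       upload_s: float, retries: int) -> float:
    """Wall-clock seconds when each retry repeats the LLM and render stages."""
    attempts = 1 + retries
    return attempts * (llm_s + render_s) + upload_s

first_try = total_with_retries(20, 90, 5, retries=0)    # 115
two_retries = total_with_retries(20, 90, 5, retries=2)  # 335
print(two_retries / first_try)  # roughly 2.9
```

The ratio comes out slightly under 3 because the upload is not repeated.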

High-resolution output. 4K renders take roughly 4 times as long as 1080p renders because there are 4 times the pixels per frame. 4K output is only available on the Team tier at 79 USD per month with 1000 credits. For most explainer use cases, 1080p is the right choice.
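The pixel arithmetic behind that factor is easy to check. The sketch below assumes the standard 16:9 dimensions for each resolution label and treats render time as proportional to pixel count, which is the approximation used in this post rather than a guarantee.

```python
# Standard 16:9 frame sizes (width, height) in pixels.
RESOLUTIONS = {"720p": (1280, 720), "1080p": (1920, 1080), "4K": (3840, 2160)}

def pixel_ratio(target: str, baseline: str = "1080p") -> float:
    """Pixels per frame at the target resolution, relative to the baseline."""
    tw, th = RESOLUTIONS[target]
    bw, bh = RESOLUTIONS[baseline]
    return (tw * th) / (bw * bh)

print(pixel_ratio("4K"))              # 4.0
print(round(pixel_ratio("720p"), 2))  # 0.44
```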

Complex Manim primitives. Calls like ApplyPointwiseFunction, ParametricFunction with high sampling, and VMobject with many points are slower than simple shapes. AI-generated code does not optimize for speed by default.

For a longer technical breakdown of each stage, see "From prompt to MP4: the AI animation pipeline explained".

Gemini Flash vs Gemini Pro Thinking

Madio defaults to Gemini Flash for speed. Pro Thinking is available on the Pro tier as an opt-in for more complex prompts.

On Flash, LLM code generation takes 15 to 30 seconds for a typical prompt. The output is good for the five formats covered in "Explainer formats AI handles well". Flash struggles with multi-step derivations that run longer than about 30 seconds of output.

On Pro Thinking, LLM code generation takes 60 to 180 seconds. The model uses chain-of-thought to plan the scene before emitting code. The output is more reliable on derivations, more careful with mathematical notation, and more likely to use MathTex correctly versus Tex. The trade-off is a 3 to 4 minute total wait for a 30-second clip versus 90 to 180 seconds on Flash.

When to choose which: Flash for first drafts, simple animations, and content where you want to iterate quickly. Pro Thinking for final renders, complex multi-step derivations, and animations where correctness matters more than speed.
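That rule of thumb can be written down as a small helper. This is only a sketch: the function, its parameters, and the backend labels "flash" and "pro-thinking" are hypothetical names for illustration, not Madio API values.

```python
def choose_backend(final_render: bool, multi_step_derivation: bool,
                   pro_tier: bool = True) -> str:
    """Pick a backend label following the rule of thumb above."""
    # Pro Thinking is opt-in and only offered on the Pro tier.
    if pro_tier and (final_render or multi_step_derivation):
        return "pro-thinking"
    # Drafts, simple animations, and quick iteration stay on Flash.
    return "flash"

print(choose_backend(final_render=False, multi_step_derivation=False))  # flash
print(choose_backend(final_render=True, multi_step_derivation=False))   # pro-thinking
```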

How render time compares to alternatives

For perspective on whether 90 to 180 seconds is fast or slow, here is how the alternatives compare.

Hand-coded Manim on a developer machine: faster after the developer is set up. A practiced Manim user can iterate on a 30-second clip in 30 to 60 seconds per render once the scene file exists. The setup cost is hours; the per-render cost after that is below a minute.

Hand-coded Manim cold: 4 to 20 hours for a first scene depending on how much Python and LaTeX setup is needed.

After Effects or DaVinci Resolve: real-time preview, but 1 to 5 hours of build time for a comparable 30-second math animation. The render time itself is short; the human time dominates.

Generic text-to-video models like Sora or Runway: 30 to 120 seconds per render. The output is rarely faithful to mathematical content. Rendering is faster, but the output is often unusable for math animation, which makes the comparison misleading.

Slide-based explainer tools like Animaker or Veed: 30 to 60 minutes of human time, near-instant render. Different output category, not comparable directly.

The 90 to 180 second figure for Madio is in the same ballpark as a generic text-to-video model and is dramatically faster than hand coding from scratch.

Optimizations

A few techniques to reduce Madio render time when speed matters.

Use shorter clips. A 30-second clip finishes in roughly half the time of a 60-second clip end to end. For social-media formats, 30 seconds is often the right length anyway.

Use lower resolution for previews. The free tier renders at 720p. Use it to check that the prompt produces what you want, then re-render at 1080p on a paid tier. The 720p preview takes roughly half the render time of the 1080p final.

Choose Flash for iteration. Once the prompt is right, optionally switch to Pro Thinking for the final version.

Avoid 3D unless required. A 2D animation of the same concept renders 3 to 5 times faster.

Write specific prompts. Vague prompts cause retries. A prompt that names the visual content explicitly produces correct code on the first try more often. The 12 prompt patterns in "Prompting AI for math animations" are the practical guide.

Start from templates. The /templates library contains starting prompts that have been validated to render correctly. Forking a template costs less than starting from scratch.

For full pricing context, see /pricing. The free tier is the right place to test render times against your own content. Five renders is enough to measure where your prompts land in the time ranges above.

A practical expectation

For most users, the practical expectation should be this. A short math animation, 30 to 60 seconds, takes 2 to 5 minutes to render on Madio end to end on the default backend. Plan to send the prompt, do something else for a few minutes, and come back. If the result is good, you are done. If it is not, refine and re-run. Two to three iterations typically land a publishable clip.

Compared to hand-coded Manim from a non-coder perspective, this is fast. Compared to a video editor like After Effects from an experienced motion designer's perspective, the AI route is slower per render but eliminates the 1 to 5 hours of build time per scene.

Where to start

If you want to time it for yourself, use the free tier at /create with a 30-second prompt and a stopwatch. Render five clips and average the times. Your numbers should land within the ranges above. If they do not, the prompt is likely the variable.
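If you would rather let Python do the averaging, something like the sketch below works. The readings in my_times are made-up placeholder values, not measurements; replace them with your own five stopwatch times. The 90 to 180 second range is the 30-second, default-backend figure from this post.

```python
from statistics import mean

# Expected range in seconds for a 30-second clip on the default backend.
EXPECTED_RANGE_S = (90, 180)

def summarize(times_s: list[float]) -> dict:
    """Average a set of stopwatch readings and compare the mean to the range."""
    avg = mean(times_s)
    low, high = EXPECTED_RANGE_S
    return {
        "mean_s": avg,
        "min_s": min(times_s),
        "max_s": max(times_s),
        "within_range": low <= avg <= high,
    }

# Placeholder readings for illustration only; substitute your own.
my_times = [95, 120, 140, 110, 135]
print(summarize(my_times))
```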

The render time is bounded by physics, not branding. Manim is a real frame-by-frame renderer. AI cannot skip that work. What AI does is replace the hours of human coding with seconds of LLM inference. The wall-clock time saved is in the human stage, not the render stage. That is the actual value proposition. Once that is clear, the 2 to 5 minute wait stops feeling slow.
