How it works

ClipoStack turns a raw clip into a publish-ready cut in minutes. You upload footage, we transcribe it, draft an edit plan with explicit B-roll moments, and render the final file — all in one browser tab. No plugins, no installers, no timeline wrangling.

Most short clips go from upload to rendered video file in under five minutes.

From upload to export, in one flow

  1. Upload — drop a video file into the studio. We save it under a display name you can return to any time.
  2. Transcribe — speech becomes text with word-level timing, so captions land on the beat instead of being guessed from silence.
  3. Plan — the assistant proposes a director-style edit: hook at the top, emphasis beats, and explicit B-roll slots where cutaways belong. You're never staring at a blank timeline.
  4. B-roll — for each slot, search stock or pick from your saved library. Thumbnails preview on hover so you can scan fast.
  5. Export — render the final video with burned-in captions and cutaways in one click. Download the video file when it's done.
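If it helps to picture what the plan in step 3 holds, here is a minimal sketch in Python. The class names, fields, and the asset name are assumptions made up for illustration; ClipoStack's real plan format is not shown on this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BrollSlot:
    """A moment in the plan where a cutaway belongs (hypothetical shape)."""
    start: float                      # seconds from the start of the clip
    end: float
    note: str                         # what the cutaway should show
    asset_id: Optional[str] = None    # set once stock or library footage is picked


@dataclass
class EditPlan:
    """Director-style plan: hook, emphasis beats, and B-roll slots."""
    hook_end: float                   # the hook runs from 0 to this time
    emphasis_beats: List[float] = field(default_factory=list)
    broll_slots: List[BrollSlot] = field(default_factory=list)

    def open_slots(self) -> List[BrollSlot]:
        """Slots that still need footage before export."""
        return [s for s in self.broll_slots if s.asset_id is None]


plan = EditPlan(
    hook_end=3.0,
    emphasis_beats=[5.2, 11.8],
    broll_slots=[
        BrollSlot(start=4.0, end=6.5, note="city skyline"),
        BrollSlot(start=12.0, end=14.0, note="hands typing"),
    ],
)
plan.broll_slots[0].asset_id = "library/skyline-01"  # made-up asset name
print([slot.note for slot in plan.open_slots()])     # ['hands typing']
```

Treating each B-roll slot as its own record is what lets step 4 work slot by slot: you fill one, and the rest stay visibly open.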

What makes it different

  • Captions that lock to the word. When your transcript returns timestamps, subtitles snap to each word — the way the best vertical creators cut by hand.
  • B-roll as first-class plan items. The AI marks the exact moments where a cutaway lands, so you pick footage against a structure instead of scrubbing the timeline looking for gaps.
  • Stock search that lives inside your library. Found a clip you like? It downloads into your asset folder and is reusable across every future video, so your library compounds over time.
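To make "captions that lock to the word" concrete, here is a rough Python sketch of one way word-level timestamps can be grouped into short caption cues. It is an illustration, not ClipoStack's actual captioning code: the word dictionaries, the function names, and the `max_words` and `max_gap` defaults are all assumptions.

```python
def srt_time(seconds):
    """Format seconds as an SRT-style timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02},{ms:03}"


def group_words(words, max_words=3, max_gap=0.6):
    """Group timed words into cues; start a new cue when the current
    one is full or the speaker pauses longer than max_gap seconds."""
    groups, current = [], []
    for word in words:
        too_long = len(current) >= max_words
        paused = bool(current) and word["start"] - current[-1]["end"] > max_gap
        if too_long or paused:
            groups.append(current)
            current = []
        current.append(word)
    if current:
        groups.append(current)
    # Each cue starts on its first word and ends on its last word.
    return [
        {
            "start": g[0]["start"],
            "end": g[-1]["end"],
            "text": " ".join(w["text"] for w in g),
        }
        for g in groups
    ]


# Hypothetical transcript output with per-word timing, in seconds.
words = [
    {"text": "Stop", "start": 0.0, "end": 0.3},
    {"text": "scrolling", "start": 0.3, "end": 0.8},
    {"text": "now", "start": 0.85, "end": 1.1},
    {"text": "Here's", "start": 2.0, "end": 2.2},
    {"text": "why", "start": 2.2, "end": 2.5},
]

cues = group_words(words)
for cue in cues:
    print(srt_time(cue["start"]), "-->", srt_time(cue["end"]), cue["text"])
```

Because every cue boundary comes from a real word timestamp, the caption changes exactly when the speaker does, with no guessing from pauses.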

Formats, limits & languages

Input: MP4, MOV, MKV, AVI, WebM. Horizontal and vertical both welcome.
Transcription: Multilingual out of the box (Whisper-large-v3-turbo); most European languages work well with no extra setup.
Output: MP4 with burned-in captions and cutaways, ready for YouTube, Reels, TikTok, Shorts.
Plan limits: Source minutes, renders, and stock searches are metered per month. Your remaining quota is visible in the studio.
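
If you batch files before uploading, a quick local check against the input formats listed above can save a failed upload. This is a minimal Python sketch that looks only at the file extension; the function name is hypothetical, and the studio may well inspect the file contents rather than the name.

```python
from pathlib import Path

# Extensions for the input formats listed on this page.
ACCEPTED_EXTENSIONS = {".mp4", ".mov", ".mkv", ".avi", ".webm"}


def looks_supported(filename):
    """Return True if the file extension matches a listed input format."""
    return Path(filename).suffix.lower() in ACCEPTED_EXTENSIONS


print(looks_supported("interview.MOV"))  # True
print(looks_supported("notes.txt"))      # False
```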

Common questions

Do I need to write a script first?

No. Upload any clip where someone is talking. The transcript and edit plan are generated from the audio — the assistant does the structure for you.

Can I edit the AI's suggestions?

Every part of the plan is editable: caption text and placement, B-roll choices, timing, cuts. The AI gives you a starting point, not a locked result.

What happens to my footage?

Your files stay with the video they belong to until you delete them. Read the Privacy page for exactly where they live and who can access them.