Help Page
1 Start Your Project
Click the AI-Generate Project button in the Info tab to get started. You can write a short prompt and select the number of scenes. This will create the basic structure of your movie, including the title, visual style, characters, and scenes.
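To picture what AI-Generate produces, here is a hypothetical sketch of the resulting project structure. The field names and values are illustrative only and are not the app's actual data format.

```python
# Hypothetical sketch of the structure AI-Generate creates; field names
# and values are illustrative, not the app's actual format.
project = {
    "title": "The Last Lighthouse",
    "synopsis": "A keeper discovers the light she tends is all that holds back the sea.",
    "visual_style": "Moody coastal realism, desaturated blues, soft volumetric light",
    "characters": [
        {"name": "Mara", "portrait": None, "voice": None},  # filled in during step 3
    ],
    "scenes": [
        {"description": "Mara climbs the lighthouse at dusk.", "shots": []},  # shots added in step 4
    ],
}
```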
2 Define the Visual Style
Carefully examine the generated structure. You can edit the title and synopsis. Additionally, check and improve the Visual Style prompt if needed. This will ensure a consistent aesthetic across your project.
3 Set Characters
Generate or upload a portrait image for each character and pair it with a voice file to maintain Character Consistency throughout your film.
4 Outline Your Scenes & Shots
In the main navbar, you can view your Scenes list. Click on a scene, then click on its text description to generate the scene's structure, i.e. its Shots, each with image/video, speech, sound, and music elements.
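As a mental model, each Shot bundles the elements listed above. The sketch below is purely illustrative and does not reflect the app's actual schema.

```python
# Hypothetical sketch of a single Shot's elements; names are illustrative.
shot = {
    "description": "Close-up of Mara lighting the lamp.",
    "image": None,   # text-to-image result (step 5)
    "video": None,   # image-to-video result (step 5)
    "speech": None,  # dialogue audio (step 6)
    "sound": None,   # sound effect (step 6)
    "music": None,   # scene soundtrack (step 6)
}
```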
5 Generate Visuals
For each shot, generate an Image using AI or upload your own files. You can choose between different text-to-image models that offer various levels of quality, prompt adherence, and cost. Once you have an image you like, convert it into a Video using the Image-to-Video tool.
6 Create Audio Assets
Generate Speech/Dialogue and Sound Effect assets for the shots that need them. You can also create or upload a Soundtrack for the scene.
7 Animate Characters
For shots featuring a character, select the Image Character option to create an image, and then use the Face Talking tool to sync the image with the corresponding audio, creating a talking face effect.
8 Edit As Needed
Use the New and Remove buttons on scenes, shots, and other elements to refine your project. You can also change the order of the shots with the Move button.
9 Compile Your Movie
Click the Export button on the right side of the top navbar to compile all visible video and sound assets into a complete movie. You can select the resolution of the video and the volume of each audio track.
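The Export button handles the compilation inside the app, but conceptually it resembles scaling the video to the chosen resolution and mixing the audio tracks at their chosen volumes. The sketch below only illustrates that idea with a plain ffmpeg call; the file names, resolution, and volume levels are made up, and this is not the platform's actual pipeline.

```python
# Illustrative only: scale video and mix two audio tracks at chosen volumes.
import subprocess

subprocess.run([
    "ffmpeg",
    "-i", "scene1.mp4",   # video track
    "-i", "speech.mp3",   # dialogue track
    "-i", "music.mp3",    # soundtrack
    "-filter_complex",
    "[0:v]scale=1920:1080[v];"                   # output resolution
    "[1:a]volume=1.0[a1];[2:a]volume=0.4[a2];"   # per-track volume
    "[a1][a2]amix=inputs=2[a]",                  # mix audio tracks
    "-map", "[v]", "-map", "[a]",
    "movie.mp4",
], check=True)
```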
10 Publish or Download
Download your created video and/or publish it on our Cinema page for other users to see your work. You can also upload and publish videos that you have edited using external tools.
GPT-3.5
Text 0.30/1000 tokens
GPT-4
An advanced language model that pushes the boundaries of natural language understanding and generation.
Text 6.00/1000 tokens
GPT-4o
An optimized variant of GPT-4, specifically fine-tuned for efficiency and speed without compromising on quality.
Text 3.00/1000 tokens
GPT-4o-mini
The most cost-efficient small model, smarter and cheaper than GPT-3.5 Turbo.
Text 0.12/1000 tokens
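Text models are billed per 1,000 tokens, so cost scales linearly with the number of tokens a generation consumes. Here is a quick sketch of that arithmetic, assuming the listed rate applies to the total token count (the token numbers are made-up examples):

```python
# Per-1,000-token rates from the list above; token counts are made up.
RATE_PER_1K = {"GPT-3.5": 0.30, "GPT-4": 6.00, "GPT-4o": 3.00, "GPT-4o-mini": 0.12}

def text_cost(model: str, tokens: int) -> float:
    """Cost of a generation that consumes `tokens` tokens."""
    return tokens / 1000 * RATE_PER_1K[model]

print(text_cost("GPT-4o-mini", 2500))  # 2,500 tokens at 0.12/1,000 -> 0.30
print(text_cost("GPT-4", 2500))        # 2,500 tokens at 6.00/1,000 -> 15.00
```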
SDXL (stabilityai)
A text-to-image generative AI model that creates beautiful images.
Image 2 credits 12 seconds Github
SDXL (replicate)
The official SDXL model served through the Stable Diffusion REST API, designed for seamless and efficient text-to-image generation. It leverages Stable Diffusion to create high-quality, detailed images from textual descriptions.
Image 1 credit
SDXL-Lora
An improved outpainting model that supports LoRA URLs. It uses PatchMatch to improve the mask quality.
Outpaint Github
SDXL ip adapter
An image prompt adapter that enables a pretrained SDXL text-to-image diffusion model to generate images conditioned on an image prompt.
Image Character 6 credits 49 seconds Github
Clarity
High-resolution image upscaler and enhancer (ClarityAI.co), a free Magnific alternative.
Enhance 7 credits 60 seconds Github
SD-Core
Stable Diffusion's primary text-to-image service, Stable Image Core represents the best quality achievable at high speed. No prompt engineering is required: try asking for a style, a scene, or a character, and see what you get.
Image 6 credits
SDXL (fine-tuned cinematic)
A cinematic model fine-tuned on SDXL.
Image 1 credit 7 seconds
Stable Video (replicate)
SVD is a research-only image-to-video model.
Video 12 credits 84 seconds Github
Sadtalker 1
Stylized Audio-Driven Single Image Talking Face Animation.
Face Talking 70 credits 360 seconds Github
Sadtalker 2
Stylized Audio-Driven Single Image Talking Face Animation.
Face Talking 26 credits 240 seconds Github
ffmpeg-smooth
A tool to smooth videos by converting them to 30 fps using frame interpolation.
Smooth 1 credit
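ffmpeg's motion-interpolation filter gives a rough idea of the technique this tool applies; the sketch below is an illustration, not the tool's exact command.

```python
# Illustrative only: interpolate frames up to 30 fps with ffmpeg's
# minterpolate filter (motion-compensated interpolation).
import subprocess

subprocess.run([
    "ffmpeg", "-i", "input.mp4",
    "-vf", "minterpolate=fps=30:mi_mode=mci",
    "output_smooth.mp4",
], check=True)
```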
ffmpeg-reverse
A tool designed to reverse videos, allowing playback from end to beginning.
Reverse 1 credit
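Reversal maps directly onto ffmpeg's reverse filters; the sketch below shows the general idea, not necessarily this tool's exact invocation.

```python
# Illustrative only: reverse both the video and audio streams with ffmpeg.
import subprocess

subprocess.run([
    "ffmpeg", "-i", "input.mp4",
    "-vf", "reverse",      # reverse video frames
    "-af", "areverse",     # reverse audio samples
    "output_reversed.mp4",
], check=True)
```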
video-retalking 1
Audio-based Lip Synchronization for Talking Head Video.
Face Talking Video 52 credits 360 seconds Github
video-retalking 2
VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing in the Wild.
Face Talking Video 36 credits 360 seconds Github
XTTS-V2
Coqui XTTS-v2: Multilingual Text To Speech Voice Cloning.
Speech Research 1 credit 7 seconds Github
whisperspeech
An open-source text-to-speech system built by inverting Whisper.
Speech 3 credits 26 seconds Github
hierspeechpp
Zero-shot speech synthesizer for text-to-speech and voice conversion.
Speech 1 credit 4 seconds Github
AudioLDM
Text-to-audio generation with latent diffusion models.
Sound Research 6 credits 143 seconds Github
Stable-Audio
Stable Audio Open is an open-source model optimized for generating short audio samples, sound effects, and production elements using text prompts.
Sound 1 credit 7 seconds Github
Audiocraft
(WIP) Audiocraft is a library for audio processing and generation with deep learning.
Music 9 credits 240 seconds Github
Stable-Audio
Stable Audio Open is an open-source model optimized for generating short audio samples, sound effects, and production elements using text prompts.
Music 16 credits 100 seconds Github
Creative 4k
Takes images and upscales them all the way to 4K resolution. It performs heavy reimagining.
Enhance 50 credits
SD3 (stabilityai)
An advanced text-to-image generative AI model that represents the latest iteration in the Stable Diffusion series. SD3 combines cutting-edge technology with enhanced algorithms to produce stunning, high-quality images from textual descriptions.
Image 13 credits
Pulid
Generates images from a reference face, using fine-tuned SDXL checkpoints.
Image Character Research 2 credits 16 seconds Github
Conservative 4k
Takes images and upscales them all the way to 4K resolution. It minimizes alterations to the image and should not be used to reimagine an image.
Enhance 50 credits
Consistent Character
Create images of a given character in different poses.
Image Character 24 credits 180 seconds Github
SD-Ultra
The most advanced Stable Diffusion text to image generation service, Stable Image Ultra creates the highest quality images with unprecedented prompt understanding. Ultra excels in typography, complex compositions, dynamic lighting, vibrant hues, and overall cohesion and structure of an art piece.
Image 16 credits
SD3
A text-to-image model with greatly improved performance in image quality, typography, complex prompt understanding, and resource-efficiency.
Image 7 credits Github
SD-outpaint
The Outpaint service inserts additional content in an image to fill in the space in any direction.
Outpaint 8 credits
FLUX schnell
The fastest image generation model tailored for local development and personal use.
Image 1 credit Github
live-portrait 1
Portrait animation using a driving video source.
Video Character 16 credits 110 seconds Github
SD-sketch
Upgrades rough hand-drawn sketches to refined outputs with precise control.
Image 2 Image 6 credits
SD-structure
This service excels at generating images that maintain the structure of an input image, making it especially valuable for advanced content creation scenarios such as recreating scenes or rendering characters from models.
Image 2 Image 6 credits
SD-style
Extracts stylistic elements from an input image (control image) and uses them to guide the creation of an output image based on the prompt. The result is a new image in the same style as the control image.
Image 2 Image 8 credits
CogVideoX
Image-to-Video Diffusion Models with An Expert Transformer.
Video 60 credits 400 seconds Github
CogVideoX (replicate)
Text-to-Video Diffusion Models with An Expert Transformer.
Text 2 Video 60 credits 400 seconds Github
Luma dream machine
Text 2 Video 100 credits 60 seconds
FLUX realism
Image 1 credit 3 seconds
Luma dream machine
Video 100 credits 60 seconds
Stable Video (fal)
Video 15 credits
Sadtalker (fal)
Face Talking 9 credits 40 seconds
Kling
Text 2 Video 30 credits 360 seconds
Kling
Video 30 credits 360 seconds
FLUX pro
Faster, better FLUX Pro. A text-to-image model with excellent image quality, prompt adherence, and output diversity.
Image 8 credits
Recraft v3
Recraft V3 (code-named red_panda) is a text-to-image model with the ability to generate long texts, and images in a wide list of styles. As of today, it is SOTA in image generation, proven by the Text-to-Image Benchmark by Artificial Analysis.
Image 8 credits
Minimax
Generate 6s videos with prompts or images. (Also known as Hailuo).
Text 2 Video 100 credits 400 seconds
Minimax
Generate 6s videos with prompts or images. (Also known as Hailuo).
Video 100 credits 400 seconds
Hunyuan
Hunyuan Video is an open video generation model with high visual quality, motion diversity, text-video alignment, and generation stability.
Text 2 Video 100 credits 200 seconds
Kling pro
Generate video clips from your prompts using Kling 1.5 (pro).
Text 2 Video 125 credits 500 seconds
Kling pro
Generate video clips from your images using Kling 1.5 (pro).
Video 125 credits 500 seconds
Minimax (replicate)
Generate 6s videos with prompts or images. (Also known as Hailuo).
Video 100 credits 300 seconds
MMAudio
Advanced AI model that synthesizes high-quality audio from video content, enabling seamless video-to-audio transformation.
Sound 2 credits 10 seconds Github
Photon
High-quality image generation model optimized for creative professional workflows and ultra-high fidelity outputs.
Image 6 credits
Minimax Live
An image-to-video (I2V) model specifically trained for Live2D and general animation use cases.
Text 2 Video 100 credits 400 seconds
Minimax Live
An image-to-video (I2V) model specifically trained for Live2D and general animation use cases.
Video 100 credits 400 seconds
sync
Generates realistic lipsync animations from audio, using advanced algorithms for high-quality synchronization.
Face Talking Video 14 credits 100 seconds