| name | description |
|---|---|
| video-generation | Use this skill when the user requests to generate, create, or imagine videos. Supports structured prompts and reference images for guided generation. |
# Video Generation Skill
## Overview

This skill generates high-quality videos using structured prompts and a Python script. The workflow includes creating JSON-formatted prompts and executing video generation through the RunningHub API.
## Core Capabilities
- Create structured JSON prompts for AIGC video generation
- Generate videos through the RunningHub Vidu model (text-to-video-q3-turbo)
- Support for videos up to 16 seconds long, with audio
- Automatic camera switching and dialogue generation
## Workflow

### Step 1: Understand Requirements
When a user requests video generation, identify:
- Subject/content: What should be in the video
- Style preferences: Art style, mood, color palette
- Technical specs: Aspect ratio, resolution, duration
- Audio requirements: Background music, dialogue, sound effects
### Step 2: Create Structured Prompt
Generate a structured JSON file in `/mnt/user-data/workspace/` with the naming pattern `{descriptive-name}.json`.
The prompt should include visual descriptions, camera movements, and audio specifications in natural language.
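For instance, a prompt file for a hypothetical request could look like the sketch below. The file name, request, and field values are illustrative assumptions; the field names mirror the full example later in this document, and this sketch omits the characters and dialogue fields on the assumption that they are not needed for a scene with no speech.

```bash
# Hypothetical example: write a minimal prompt file for a short harbor-at-sunset clip.
# The path, file name, and contents are illustrative; adapt them to the user's request.
cat > /mnt/user-data/workspace/sunset-harbor.json <<'EOF'
{
  "title": "Sunset Over a Quiet Harbor",
  "background": {
    "description": "Small fishing harbor at dusk, warm golden light, calm water.",
    "location": "Coastal harbor"
  },
  "camera": {
    "type": "Wide establishing shot",
    "movement": "Slow pan from the pier toward the horizon"
  },
  "audio": [
    { "type": "Gentle waves and distant gulls", "volume": 0.6 }
  ]
}
EOF
```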
### Step 3: Execute Generation
Call the Python script:

```bash
python /mnt/skills/public/video-generation/scripts/generate.py \
  --prompt-file /mnt/user-data/workspace/prompt-file.json \
  --output-file /mnt/user-data/outputs/generated-video.mp4 \
  --aspect-ratio 16:9
```
Parameters:

- `--prompt-file`: Absolute path to the JSON prompt file (required)
- `--output-file`: Absolute path to the output video file (required)
- `--aspect-ratio`: Aspect ratio of the generated video (optional, default: `16:9`)
> [!NOTE]
> Do NOT read the Python script; just call it with the parameters.
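Putting Step 3 together, a complete invocation might look like the sketch below. The prompt and output paths are illustrative, the exit-status check is an optional addition rather than part of the skill, and `RUNNINGHUB_API_KEY` is assumed to already be exported as described in the next section.

```bash
# Illustrative invocation; replace the prompt and output paths with the real files.
python /mnt/skills/public/video-generation/scripts/generate.py \
  --prompt-file /mnt/user-data/workspace/sunset-harbor.json \
  --output-file /mnt/user-data/outputs/sunset-harbor.mp4 \
  --aspect-ratio 16:9 \
  || echo "generate.py exited with an error; review its output before retrying" >&2
```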
## Environment Variables

Set the following environment variable before running the script:

- `RUNNINGHUB_API_KEY`: Your RunningHub API key
Example:

```bash
export RUNNINGHUB_API_KEY=<your-runninghub-api-key>
```
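If the invocation is wrapped in a script, an optional pre-flight check (an illustrative addition, not required by the skill) can fail fast when the key is missing:

```bash
# Optional pre-flight check: abort early if the API key has not been exported.
if [ -z "${RUNNINGHUB_API_KEY:-}" ]; then
  echo "RUNNINGHUB_API_KEY is not set; export it before running generate.py" >&2
  exit 1
fi
```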
## Video Generation Example

User request: "Generate a short video clip depicting the opening scene from 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe'."
**Step 1:** Create a JSON prompt file at `/mnt/user-data/workspace/narnia-farewell-scene.json` with the following content:
```json
{
  "title": "The Chronicles of Narnia - Train Station Farewell",
  "background": {
    "description": "World War II evacuation scene at a crowded London train station. Steam and smoke fill the air as children are being sent to the countryside to escape the Blitz.",
    "era": "1940s wartime Britain",
    "location": "London railway station platform"
  },
  "characters": ["Mrs. Pevensie", "Lucy Pevensie"],
  "camera": {
    "type": "Close-up two-shot",
    "movement": "Static with subtle handheld movement",
    "angle": "Profile view, intimate framing",
    "focus": "Both faces in focus, background soft bokeh"
  },
  "dialogue": [
    {
      "character": "Mrs. Pevensie",
      "text": "You must be brave for me, darling. I'll come for you... I promise."
    },
    {
      "character": "Lucy Pevensie",
      "text": "I will be, mother. I promise."
    }
  ],
  "audio": [
    {
      "type": "Train whistle blows (signaling departure)",
      "volume": 1
    },
    {
      "type": "Strings swell emotionally, then fade",
      "volume": 0.5
    },
    {
      "type": "Ambient sound of the train station",
      "volume": 0.5
    }
  ]
}
```
**Step 2:** Use the `generate.py` script to generate the video:

```bash
python /mnt/skills/public/video-generation/scripts/generate.py \
  --prompt-file /mnt/user-data/workspace/narnia-farewell-scene.json \
  --output-file /mnt/user-data/outputs/narnia-farewell-scene.mp4 \
  --aspect-ratio 16:9
```
As above, do NOT read the Python script; just call it with the parameters.
## Output Handling
After generation:
- Videos are typically saved in `/mnt/user-data/outputs/` (a verification sketch follows this list)
- Share generated videos with the user using the `present_files` tool
- Provide a brief description of the generation result
- Offer to iterate if adjustments are needed
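As an illustrative sketch of this step (the output path below is the default from Step 3 and may differ per request), the result can be checked before it is shared:

```bash
# Illustrative check: confirm the video exists and is non-empty before presenting it.
OUTPUT=/mnt/user-data/outputs/generated-video.mp4
if [ -s "$OUTPUT" ]; then
  ls -lh "$OUTPUT"   # size and timestamp, useful for the brief description
else
  echo "No video was produced at $OUTPUT; consider re-running generation" >&2
fi
```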
## Notes
- Always use English for prompts regardless of user's language
- JSON format ensures structured, parsable prompts
- The RunningHub Vidu model supports generating videos up to 16 seconds long
- Audio is generated automatically, including dialogue and sound effects
- The model has "director thinking" capability for automatic camera switching
- Iterative refinement is normal for optimal results