Video Generation Prompting Guide for the Open-Source LTX-2 Model

Artificial Intelligence · Alan

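To get the most out of the LTX-2 model, a good prompt is decisive. The key is to paint a complete picture of the story you are telling, one that flows naturally from beginning to end and covers every element the model needs to bring your vision to life. If you are new to writing video prompts, this guide will help you craft effective ones.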

Example prompt:

An action-packed, cinematic shot of a monster truck driving fast towards the camera. As the truck passes the camera, it pans left to follow the truck's reckless drive. Dust and motion blur surround the truck, with a handheld feel to the camera as it tries to track its ride into the distance. The truck then drifts and turns around, then drives back towards the camera until seen in extreme close-up.

Example prompt:

A warm sunny backyard. The camera starts in a tight cinematic close-up of a woman and a man in their 30s, facing each other with serious expressions. The woman, emotional and dramatic, says softly, “That’s it… Dad’s lost it. And we’ve lost Dad.”
The man exhales, slightly annoyed: “Stop being so dramatic, Jess.”
A beat. He glances aside, then mutters defensively, “He’s just having fun.”
The camera slowly pans right, revealing the grandfather in the garden wearing enormous butterfly wings, waving his arms in the air like he’s trying to take off.
He shouts, “Wheeeew!” as he flaps his wings with full commitment.
The woman covers her face, on the verge of tears. The tone is deadpan, absurd, and quietly tragic.

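Key Elements to Include

  • Establish the shot: Use cinematography terms that match your preferred film genre. Include elements such as scale or category-specific characteristics to further refine the style you want.

  • Set the scene: Describe lighting conditions, color palette, surface textures, and atmosphere to shape the mood.

  • Describe the action: Write the core action as a natural sequence that flows from beginning to end.

  • Define your character(s): Include age, hairstyle, clothing, and distinguishing details. Express emotion through body language.

  • Identify camera movement(s): Specify when and how the view should change. Describing how the subject or object appears after the camera moves helps the model understand how to complete the motion.

  • Describe the audio: Use clear descriptions for ambient sound, music, sound effects, and speech. For dialogue, place the text between quotation marks and, if needed, mention the language and accent you want the character to use.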
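
One way to keep all six elements in view is to draft them separately and then join them into a single paragraph. The sketch below is only an illustration: `PromptParts` and `build_prompt` are hypothetical names, not part of LTX-2 or any related tooling, and the order in which the elements are joined is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """One field per key element listed above (field names are hypothetical)."""
    shot: str
    scene: str
    action: str
    characters: str
    camera: str
    audio: str


def _as_sentence(text: str) -> str:
    """Trim whitespace and make sure the fragment ends with closing punctuation."""
    text = text.strip()
    if text and text[-1] not in '.!?"”':
        text += "."
    return text


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty elements into one flowing paragraph."""
    # Assumed order: shot, scene, characters, action, camera, audio.
    ordered = [parts.shot, parts.scene, parts.characters,
               parts.action, parts.camera, parts.audio]
    return " ".join(_as_sentence(p) for p in ordered if p.strip())


if __name__ == "__main__":
    example = PromptParts(
        shot="A tight cinematic close-up in a warm, sunny backyard",
        scene="Soft afternoon light falls across a wooden fence",
        action="A woman in her 30s turns toward a man and speaks softly",
        characters="Both wear casual summer clothes and serious expressions",
        camera="The camera slowly pans right to reveal the garden",
        audio='She says quietly, "He is just having fun."',
    )
    print(build_prompt(example))
```

The generated text is only a starting point. Reading it aloud and rewriting it so the action flows naturally is still part of the work.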

Example prompt:

INT. OVEN – DAY. Static camera from inside the oven, looking outward through the slightly fogged glass door. Warm golden light glows around freshly baked cookies. The baker’s face fills the frame, eyes wide with focus, his breath fogging the glass as he leans in. Subtle reflections move across the glass as steam rises.
Baker (whispering dramatically): “Today… I achieve perfection.”
He leans even closer, nose nearly touching the glass.
“Golden edges. Soft center. The gods themselves will smell these cookies and weep.”
Baker: “Wait—”
(beat)
“Did I… forget the chocolate chips?”
Cut to side view — coworker pops into frame, chewing casually.
Coworker (mouth full): “Nope. You forgot the sugar.”
Quick zoom back to the baker’s horrified face, pressed against the oven door, as cookies deflate behind the glass. Steam drifts upward in slow motion.
Pixar-style acting and timing.

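For Best Results

  • Keep it coherent: Keep your prompt in a single flowing paragraph to give the model one cohesive scene to work with.

  • Use present tense: Use present-tense verbs to describe movement and action.

  • Match the detail: Match the level of detail to the shot scale. Close-ups need more precise detail than wide shots.

  • Camera relationships: When describing camera movement, focus on the relationship between the camera and the subject.

  • Length: Expect to write 4 to 8 descriptive sentences to cover all the key aspects of the prompt.

  • Keep iterating: Don't be afraid to experiment! LTX-2 is designed for fast experimentation, so refining your prompt is part of the workflow.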
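
Two of the tips above, the single flowing paragraph and the 4 to 8 sentence length, are easy to check mechanically while iterating. The following is a rough sketch under stated assumptions: `count_sentences` and `check_prompt` are hypothetical helpers, and the sentence splitter is a simple heuristic that will miscount abbreviations and heavily punctuated dialogue.

```python
import re
from typing import List

# Split on whitespace that follows ., ! or ?, optionally with a closing quote.
_SENTENCE_BREAK = re.compile(r'(?:(?<=[.!?])|(?<=[.!?]["”’]))\s+')


def count_sentences(prompt: str) -> int:
    """Heuristic sentence count; treat the result as an estimate."""
    pieces = [p for p in _SENTENCE_BREAK.split(prompt.strip()) if p]
    return len(pieces)


def check_prompt(prompt: str, min_sentences: int = 4,
                 max_sentences: int = 8) -> List[str]:
    """Return a list of warnings; an empty list means nothing was flagged."""
    warnings = []
    # A blank line inside the prompt means it has more than one paragraph.
    if re.search(r"\n\s*\n", prompt.strip()):
        warnings.append("prompt spans several paragraphs; "
                        "consider one flowing paragraph")
    n = count_sentences(prompt)
    if not min_sentences <= n <= max_sentences:
        warnings.append(f"{n} sentences; the guide suggests "
                        f"{min_sentences} to {max_sentences}")
    return warnings
```

A warning is a prompt to take a second look, not a rule. Several of the example prompts in this guide are longer than 8 sentences and are laid out as multiple paragraphs.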

Example prompt:

INT. DAYTIME TALK SHOW SET – AFTERNOON
Soft studio lighting glows across a warm-toned set. The audience murmurs faintly as the camera pans to reveal three guests seated on a couch — a middle-aged couple and the show’s host sitting across from them.
The host leans forward, voice steady but probing:
Host: “When did you first notice that your daughter, Missy, started to spiral?”
The woman’s face crumples; she takes a shaky breath and begins to cry. Her husband places a comforting hand on her shoulder, looking down before turning back toward the host.
Father (quietly, with guilt): “We… we don’t know what we did wrong.”
The studio falls silent for a moment. The camera cuts to the host, who looks gravely into the lens.
Host (to camera): “Let’s take a look at a short piece our team prepared — chronicling Missy’s downward path.”
The lights dim slightly as the camera pushes in on the mother’s tear-streaked face. The studio monitors flicker to life, beginning to play the segment as the audience holds its breath.

Other Useful Terms

(This is not an exhaustive list. Use it as a reference to help you craft the look you want.)

Categories

  • Animation: stop-motion, 2D/3D animation, claymation, hand-drawn

Example prompt:

Pinocchio is sitting in an interrogation room, looking nervous and slightly sweating. He’s saying very quietly to himself, “I didn’t do it… I didn’t do it… I’m not a murderer.” Pinocchio’s nose is quickly getting longer and longer. The camera is zooming in on the double-sided mirror at the back of the room. The mirror is turning black as the camera approaches it, and exposes a blurry silhouette of two FBI detectives who stand in the dimly lit room on the other side. One of them is saying, “I’m telling you, I have a feeling something is off with this kiddo.”

  • Stylized: comic book, cyberpunk, 8-bit pixel, surreal, minimalist, painterly, illustrated

Example prompt:

A young African American woman wears a futuristic transparent visor and a bodysuit with a tube attached to her neck. She is soldering a robotic arm. She stops and looks to her right as she hears a suspicious, heavy thud in the distance. She gets up slowly from her chair and says with an angry African American accent: “Rick, I told you to close that goddamn door after you!” Then a futuristic blue alien explorer with dreadlocks, wearing a rugged outfit, walks into the scene excitedly holding a futuristic device and says in a low robotic voice: “Fuck the door, look what I found!” The alien hands the woman the device; she looks down at it excitedly as the camera zooms in on her intrigued, illuminated face. She then says: “Is this what I think it is?” She smiles excitedly. Sci-fi style cinematic scene.

  • Cinematic: period drama, film noir, fantasy, epic space opera, thriller, modern romance, experimental film, arthouse, documentary

Example prompt:

Cinematic, action-packed shot. The man says silently: “We need to run.” The camera zooms in on his mouth, then he immediately screams: “NOW!” The camera zooms back out, he turns around and starts running away, and the camera tracks his run in handheld style. The camera cranes up and shows him running into the distance down the street on a busy New York night.

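Visual Details

  • Lighting conditions: flickering candlelight, neon glow, natural sunlight, dramatic shadows

  • Textures: rough stone, smooth metal, worn fabric, glossy surfaces

  • Color palette: vibrant, muted, monochrome, high contrast

  • Atmospheric elements: fog, rain, dust, particles, smoke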

Example prompt:

The camera opens in a calm, sunlit frog yoga studio. Warm morning light washes over the wooden floor as incense smoke drifts lazily in the air. The senior frog instructor sits cross-legged at the center, eyes closed, voice deep and calm. “We are one with the pond.” All the frogs answer softly: “Ommm…” “We are one with the mud.” “Ommm…” He smiles faintly. “We are one with the flies.” A quiet pause.
The camera slowly pans to the side — one frog twitches, eyes darting. Suddenly — *thwip!* — its tongue snaps out, catching a fly mid-air and pulling it into its mouth. The master exhales slowly, still serene.
“But we do not chase the flies…”
Beat. “…not during class.” The guilty frog freezes, then lowers its head in visible shame, folding its hands back into the meditative pose. The other frogs resume their chant: “Ommm…” Camera holds for a moment on the embarrassed frog, eyes closed too tightly, pretending nothing happened.

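Sound and Voice

  • Ambience: coffee shop ambience, dripping rain and wind, forest ambience with birdsong

  • Dialogue style: energetic announcer, resonant voice with gravitas, distorted radio style, robotic monotone, childlike curiosity

  • Volume: quiet whisper, mutter, shout, scream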

Example prompt:

A warm, intimate cinematic performance inside a cozy, wood-paneled bar, lit with soft amber practical lights and shallow depth of field that creates glowing bokeh in the background. The shot opens in a medium close-up on a young female singer in her 20s with short brown hair and bangs, singing into a microphone while strumming an acoustic guitar, her eyes closed and posture relaxed. The camera slowly arcs left around her, keeping her face and mic in sharp focus as two male band members playing guitars remain softly blurred behind her. Warm light wraps around her face and hair as framed photos and wooden walls drift past in the background. Ambient live music fills the space, led by her clear vocals over gentle acoustic strumming.

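Technical Style Markers

  • Camera language: follows, tracks, pans across, circles around, tilts upward, pushes in, pulls back, overhead view, handheld movement, over-the-shoulder, wide establishing shot, static frame

  • Film characteristics: jittery stop-motion, pixelated edges, lens flares, film grain

  • Scale indicators: expansive, epic, intimate, claustrophobic

  • Pacing and temporal effects: slow motion, time-lapse, rapid cuts, long takes, continuous shot, freeze-frame, fade-in, fade-out, seamless transition, dynamic movement, sudden stop

  • Specific visual effects (if relevant): particle systems, motion blur, depth of field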

Example prompt:

An animated cinematic shot. A robot walks slowly; the camera dollies back and keeps the robot's slow walk in a medium shot. The robot starts running slowly and heavily. It then stops, and the camera keeps dollying back until a similar blue robot appears in an over-the-shoulder shot.

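What LTX-2 Does Well

  • Cinematic compositions: Wide, medium, and close-up shots with thoughtful lighting, shallow depth of field, and natural motion.

  • Emotive human moments: LTX-2 is good at single-subject emotional expression, subtle gestures, and facial nuance.

  • Atmosphere and setting: Weather effects such as fog, mist, golden-hour light, soft shadows, rain, reflections, and ambient textures all help make a scene feel real.

  • Clear, readable camera language: Clear directions such as “slow push in,” “handheld tracking,” or “over-the-shoulder” improve consistency.

  • Stylized aesthetics: Painterly, film noir, analog film looks, fashion editorial, pixelated animation, or surreal art styles work especially well when named early in the prompt.

  • Lighting and mood control: Backlighting, color palettes, soft rim light, and flickering lamps anchor the tone better than generic mood words.

  • Voice: Characters can speak and sing in a variety of languages.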

Example prompt:

EXT. SMALL TOWN STREET – MORNING – LIVE NEWS BROADCAST
The shot opens on a news reporter standing in front of a row of cordoned-off cars, yellow caution tape fluttering behind him. The light is warm, early sun reflecting off the camera lens. The faint hum of chatter and distant drilling fills the air.
The reporter, composed but visibly excited, looks directly into the camera, microphone in hand.
Reporter (live):
“Thank you, Sylvia. And yes — this is a sentence I never thought I’d say on live television — but this morning, here in the quiet town of New Castle, Vermont… black gold has been found!”
He gestures slightly toward the field behind him.
Reporter (grinning):
“If my cameraman can pan over, you’ll see what all the excitement’s about.”
The camera pans right, slowly revealing a construction site surrounded by workers in hard hats. A beat of silence — then, with a sudden roar, a geyser of oil erupts from the ground, blasting upward in a violent plume.
Workers cheer and scramble, the black stream glistening in the morning light. The camera shakes slightly, trying to stay focused through the chaos.
Reporter (off-screen, shouting over the noise):
“There it is, folks — the moment New Castle will never forget!”
The camera catches the sunlight gleaming off the oil mist before pulling back, revealing the entire scene — the small-town skyline silhouetted against the wild fountain of oil.

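What to Avoid with LTX-2

  • Internal states: Avoid emotional labels such as “sad” or “confused” without describing visual cues. Use posture, gesture, and facial expression instead.

  • Text and logos: LTX-2 currently cannot generate readable or coherent text. Avoid signage, brand names, and printed material.

  • Complex physics or chaotic motion: Non-linear or fast-twisting motion (for example, jumping or juggling) can lead to artifacts or glitches. Dancing, however, usually works well.

  • Overloaded scenes: Too many characters, layered actions, or excessive objects reduce clarity and model accuracy.

  • Inconsistent lighting logic: Avoid mixing conflicting light sources (for example, “a warm sunset with cold fluorescent light”) unless there is a clear motivation.

  • Overcomplicated prompts: The more actions, characters, and instructions you add, the higher the chance that some of them will not appear in the output. Start simple and layer on additional instructions as you iterate.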
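
Some of the pitfalls above can be surfaced with a simple keyword scan before a prompt is submitted. The sketch below is an assumption-laden illustration, not an official checker: `RISKY_TERMS` and `find_risky_terms` are hypothetical, and the word lists are deliberately tiny samples drawn from the categories in this section.

```python
import re
from typing import Dict, List

# Hypothetical sample word lists; extend them for your own prompts.
RISKY_TERMS: Dict[str, List[str]] = {
    "emotion label (describe posture, gesture, or expression instead)":
        ["sad", "confused", "happy", "angry"],
    "readable text (signage, brand names, printed material)":
        ["sign", "logo", "billboard", "newspaper"],
    "complex or chaotic motion":
        ["juggling", "jumping", "backflip"],
}


def find_risky_terms(prompt: str) -> List[str]:
    """Return one note per listed word found as a whole word (case-insensitive)."""
    notes = []
    lowered = prompt.lower()
    for reason, words in RISKY_TERMS.items():
        for word in words:
            if re.search(rf"\b{re.escape(word)}\b", lowered):
                notes.append(f"'{word}': {reason}")
    return notes
```

A hit only marks a phrase worth reviewing. An emotion word is fine when it is accompanied by visible cues, and a scan like this cannot judge that.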

Translated and adapted from: Prompting Guide for LTX-2
