Imagine a typical task for a real estate developer in the UAE: you need to sell apartments in residential towers that do not exist yet. Today there is only desert where, in five years, there should be gardens, pools and a comfortable lifestyle. How do you visualize something that has not been built yet?
Svyazi. Creative Agency, led by founder Ilya Zmienko (who wrote this article based on his own hands-on experience), faced exactly this challenge. Using this case as a reference, we will show how a marketing team can create video content for campaigns with AI tools, step by step, from idea to final export.
Research the market and generate ideas in ChatGPT
Before you start prototyping, you need to understand your audience, competitors and the value proposition you want to highlight. On this basis you will generate ideas — the concepts that will later become the core of your future video.
Open ChatGPT, select the Thinking model and describe your brief: the product, the audience and how many concepts you need. In our case we received an answer with several distinct concepts, each with its own angle and emotional tone.
To get truly strong ideas, it is important not just to ask ChatGPT for “some concepts”, but to set structured tasks.
The more step-by-step your dialogue is, the more precisely the model will understand your vision. It is equally important to clarify constraints: what cannot be used in the project, which visual language is mandatory, which associations you want to trigger in viewers, and which ones you want to avoid.
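One way to keep such briefs structured and repeatable is a simple template. This is only an illustrative sketch: the field names and the wording of the template are our own assumptions, not part of ChatGPT or any official API.

```python
# Minimal sketch of a reusable ideation brief for ChatGPT.
# All field names and the template wording are illustrative
# assumptions, not part of any official tool or API.

BRIEF_TEMPLATE = """You are a creative director for a {industry} brand.
Audience: {audience}
Value proposition: {value_proposition}
Constraints (must NOT appear): {forbidden}
Mandatory visual language: {visual_language}
Task: propose {n_concepts} distinct video concepts, each with
its own angle and emotional tone. Number them."""

def build_brief(**fields) -> str:
    """Fill the template; raises KeyError if a field is missing."""
    return BRIEF_TEMPLATE.format(**fields)

prompt = build_brief(
    industry="UAE real estate",
    audience="international buyers of off-plan apartments",
    value_proposition="a future lifestyle: gardens, pools, community",
    forbidden="construction sites, cranes, empty desert",
    visual_language="warm golden-hour light, aspirational tone",
    n_concepts=7,
)
print(prompt)
```

Keeping the brief as a template makes it easy to rerun the same structured request with different constraints and compare the resulting concept pools.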
Build a quick prototype in Sora
Once you have a pool of ideas, you need to understand which of them can realistically be produced. For this, use Sora. It is ideal for creating rough drafts:
- Visualize the concepts you came up with
- Quickly check whether the idea works visually
- Select the 2–3 strongest options from your pool of concepts
Sora works best with short, clear prompts. It is important to give only the key markers — for example, time of day, overall style, mood and a few objects that must remain unchanged. Very long descriptions often lead to blurry, unfocused visuals.
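Sora itself is driven from its own interface, but the "key markers only" discipline can be sketched as a tiny helper that assembles a prompt and rejects over-long ones. The marker set and the 300-character threshold are our own assumptions, not Sora documentation.

```python
# Sketch: assemble a short Sora-style prompt from a few key markers.
# The marker set and the 300-character limit are illustrative
# assumptions, not Sora documentation.

def sora_prompt(time_of_day: str, style: str, mood: str,
                fixed_objects: list[str]) -> str:
    parts = [time_of_day, style, mood,
             "keep unchanged: " + ", ".join(fixed_objects)]
    prompt = "; ".join(parts)
    # Long prompts tend to produce blurry, unfocused drafts.
    if len(prompt) > 300:
        raise ValueError("prompt too long, trim it to the key markers")
    return prompt

draft = sora_prompt(
    time_of_day="golden hour",
    style="cinematic drone shot",
    mood="calm, aspirational",
    fixed_objects=["twin residential towers", "central pool"],
)
print(draft)
```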
Create detailed visuals in Midjourney
After you have basic video sketches from Sora, you can move on to more refined visuals.
- Upload the draft frames from Sora into ChatGPT.
- Ask ChatGPT to write a detailed Midjourney prompt based on each image.
- Refine the prompt manually, adding specific details and removing anything redundant.
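The third step is mostly editorial, but its mechanical part, stripping terms you never want in a Midjourney prompt, can be sketched as a small filter. The banned-word list below is an illustrative assumption for this project.

```python
# Sketch: strip unwanted terms from a generated Midjourney prompt.
# The banned-word list is an illustrative assumption for this project.

BANNED = {"desert", "construction", "crane"}

def refine(prompt: str, banned: set[str] = BANNED) -> str:
    """Drop words whose lowercase form is banned; keep the rest in order."""
    kept = [w for w in prompt.split()
            if w.strip(",.").lower() not in banned]
    return " ".join(kept)

raw = "twin residential towers at golden hour, desert, wide aerial shot"
print(refine(raw))
```

In practice the manual pass also adds detail (lens, lighting, materials); a filter like this only guarantees the hard exclusions survive every regeneration.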
Midjourney produces stylised, visually appealing images that you can already show to stakeholders. However, Midjourney is not the best tool for visualising objects that do not yet exist in any form. For example, it will struggle to create a precise “automatic robotic arm for surgeons”, even if you upload technical drawings.
Midjourney understands your intent best when the prompt describes a complete scene.
This helps the model maintain consistency and produce images that feel coherent, not like a random collage.
Optional: photorealism in Unreal Engine
Unreal Engine is a full-scale game engine with a huge number of settings and possibilities. A detailed tutorial would not fit into one article.
Unreal Engine is suitable for projects where accuracy and realism are critical. For example, for visualising real buildings based on architectural drawings. The engine creates photorealistic 3D visualisations without the need for a dedicated 3D modeller.
However, Unreal Engine requires time and specialised skills, so it is rarely used for very fast, low-budget videos.
Typical use cases: real estate, product visualisations, technically complex projects.
Animate and build the video
Once the idea has been visualised, you can move on to animation. We recommend using several AI tools in parallel — send the same prompt to different video models and compare the results.
Kling. Suitable for simple animation and stitching shots together. Quickly creates dynamic elements and transitions.
Higgsfield. Produces animations with more controllable and predictable results. We recommend combining it with Kling: for example, generate the base animation in Higgsfield and then refine timing and transitions in Kling.
Veo 3. Generates video and works well with characters and people. It is suitable for dialogue-driven clips and for animating human movement, gestures and facial expressions.
By running one concept through several tools, you get a choice — and can select the version that best fits your brand.
Add voice and music with Suno or Artlist
When the video sequence is ready, it needs sound. For this you can use Suno or Artlist.
Suno is better suited for emotional music tracks, especially when the rhythm needs to match a specific edit. It is helpful to specify tempo, overall atmosphere and desired mood in advance — the model responds well to such input.
Artlist has the edge for clean, professional voice-over and music, particularly in corporate or presentation videos. It is easier to find a voice that sounds natural and convincing for your audience — for example, a neutral global English accent suited to an international audience in the UAE.
Upscale in Topaz
Upscaling increases a video's resolution and perceived detail. When a frame needs to look crisp on large screens, you upscale it.
Topaz (for example, Topaz Video AI) handles this task very well. For upscaling you do not need additional prompts — you simply choose the desired resolution and quality profile, and the tool processes your video.
Final AI video pipeline
To sum up, here is what a practical AI-powered pipeline for marketing videos can look like:
- ChatGPT → research and idea generation (5–10 concepts).
- Sora → fast prototyping (select 2–3 ideas).
- ChatGPT → creation of detailed prompts for visual tools.
- Midjourney (or Unreal Engine) → final visualisation of key scenes.
- Kling / Higgsfield / Veo 3 → animation and video assembly.
- Suno / Artlist → music and voice-over.
- Topaz → upscaling and final polishing.
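For teams that track production status, the pipeline above can also be kept as plain data, for checklists or a simple dashboard. The records below merely restate the list; nothing here calls the tools themselves, and the record structure is our own.

```python
# Sketch: the AI video pipeline as ordered stage records.
# Purely descriptive; the tool names come from the article,
# the record structure is our own.

from dataclasses import dataclass

@dataclass(frozen=True)
class Stage:
    tools: tuple[str, ...]
    purpose: str

PIPELINE = [
    Stage(("ChatGPT",), "research and idea generation (5-10 concepts)"),
    Stage(("Sora",), "fast prototyping (select 2-3 ideas)"),
    Stage(("ChatGPT",), "detailed prompts for visual tools"),
    Stage(("Midjourney", "Unreal Engine"), "final visualisation of key scenes"),
    Stage(("Kling", "Higgsfield", "Veo 3"), "animation and video assembly"),
    Stage(("Suno", "Artlist"), "music and voice-over"),
    Stage(("Topaz",), "upscaling and final polishing"),
]

for i, stage in enumerate(PIPELINE, 1):
    print(f"{i}. {' / '.join(stage.tools)} -> {stage.purpose}")
```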
This approach allows marketing teams in the UAE and beyond to work with AI as a structured production pipeline — from first insight to finished video — instead of a chaotic set of experiments.