Introducing our AI overlords

With the emergence of AI image generation tools such as Google Imagen, DALL-E 2 and Midjourney, and with my day job keeping me too busy to create any hobby project content myself, I thought it might be possible to create a short adventure game with the help of these AI tools.

As a side project to my side project!

The image generation tool I managed to get access to is called Midjourney. It is a very capable tool for hallucinating image content. In my testing I have found it especially good at creating locations and characters for a game that takes place in the future and generally looks weird.

Futuristic city corner environment study.

As you can see, it is possible to create some very cool graphical content. The details in these images do not make perfect sense, but the overall shapes and concepts do. This is plenty for this project!

Some oddball characters. AI prompt: “RETRACTED maintenance crew RETRACTED”.

Generating images with AI (trying to)

Working with an AI image generator turned out to be very different from what I had imagined. A lot more “involved”. Even though an image can be generated from a single string of text, the arrangement of the words is an art form in itself. Finding the perfect wording is very important. A slight change in the input might end up generating a completely different image.

This makes AI great for happy accidents: surprising images that might end up looking amazing. It makes me think that AI is the best thing to happen to creativity in a long, long time. You get a relentless concept artist that is always willing to show you insane ideas.

This is also the major drawback of working with an AI. It is pretty much guaranteed that you will never get the same result twice, so you cannot reproduce one subject in a different setting. Say you want to create a scene in a game where we have a wide angle shot of a location and then the same image in a different state.

Prompt “rustic kitchen wide angle beautiful morning shot on film --w 450” generates this result. From these 4, we can upscale one or ask for a new set of 4 images. None of these is perfect. As you can see, the images are generally fine, but the first image on the second row, for example, is just bonkers. But let’s say we love it: upscale!

So here is the result of the upscale. This did not turn out pretty! But let’s say we love it and want to get a version with the cupboard doors open. Why not? Here we go! Prompt “rustic kitchen with cabinet doors open wide angle beautiful morning shot on film --w 450”.

More rustic kitchen!

These rustic kitchens turned out completely different. They do not even have their cabinet doors open! Darnit! (But they do look a lot better than the first batch. This is a happy accident!)

This behaviour also makes it impossible to generate the same location at different times of the day. Everything is random every time. The results can never be replicated, not even close.

So clearly the AI is not very co-operative, and the results do not seem to look too great either. How on earth could this wild tool be used to create graphics for a game, where consistency of style and shapes and adherence to the given specifications are key? And we would also like the game to look great!

Is this at all possible?

Taming the beast

Clearly the most important “skill” in using any AI image generator is your ability to write the prompts. Coherence in design comes from figuring out the prefix and suffix phrases that provide nice-looking and expected results each time. Once you have discovered the correct prompt language, creating images becomes easy.

Let’s explore some of these “brackets”, as I call them. I will generate a set of pretty random topics in different styles.

“Dramatic cinematic ### oil painting --w 450”

“Beautiful melancholy ### shot on film --w 450”

“Funny ### in style of anime --w 450”

It quickly becomes apparent that the most important bit is to describe the style and other features of the image, rather than what the image is about. The “--w 450” at the end tells the system to make the image in a wide aspect ratio. (For game screens I like to use 500 to get some extra width for parallax.)
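The bracket idea can be sketched in a few lines of Python. This is only an illustration of the text substitution, not a Midjourney API: the names `BRACKETS`, `fill_bracket` and `prompts_for` are made up for this example, and it assumes the width flag is typed with two plain hyphens. The resulting strings would still be pasted into Midjourney by hand.

```python
# Hypothetical helper: fills the "###" slot of a prompt bracket with a subject.
# The bracket texts are the three from this post; everything else is illustrative.

BRACKETS = {
    "oil": "Dramatic cinematic ### oil painting --w 450",
    "film": "Beautiful melancholy ### shot on film --w 450",
    "anime": "Funny ### in style of anime --w 450",
}


def fill_bracket(bracket: str, subject: str, placeholder: str = "###") -> str:
    """Return the bracket with its placeholder replaced by the subject."""
    if placeholder not in bracket:
        raise ValueError(f"bracket has no {placeholder!r} slot: {bracket!r}")
    return bracket.replace(placeholder, subject)


def prompts_for(subjects, bracket):
    """Build one prompt per subject, all sharing the same bracket (same look)."""
    return [fill_bracket(bracket, subject) for subject in subjects]


if __name__ == "__main__":
    for prompt in prompts_for(["rustic kitchen", "city corner"], BRACKETS["film"]):
        print(prompt)
```

Keeping the brackets in one place means the style wording is written once and only the subject changes from prompt to prompt.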

Also, you need to be open to the options the AI provides. It might be impossible to get what you want, but you might get something you can accept.

These brackets become the secret sauce for nailing the look of your project. So essentially they are the “IP” of the project and anyone who knows the wording can reproduce the look of your project 100%.

AI use in context of game graphics


Bear in mind that some images are simply out of reach for this AI. I tried to create a staircase, for example, and Midjourney just did not know how to make a staircase that makes any sense whatsoever.

So we better not have any scenes that take place in a staircase!

Because of this unpredictability, one clear goal we have for this project is to start with only a general outline of the story and then see what locations, characters and objects are even possible. After we have a rough outline and a first batch of AI generated locations, we can go in deeper and let the story be shaped by the AI’s decisions.

This is very limiting, for sure. And what if some location the AI gives us is 70% there for what we need for the story? Remember the open cupboards example: altering the locations with Midjourney just is not possible. In these cases I am going to be compositing the final image from multiple AI generated passes. Creating the missing elements in isolation and then adding them into the image by hand. The “old fashioned way”.
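In practice this compositing happens by hand in an image editor, but the operation underneath is easy to show. Below is a minimal sketch of the standard “over” blend, assuming images are plain lists of rows of (r, g, b, a) tuples with channels from 0.0 to 1.0 and straight (non-premultiplied) alpha. The function names `over` and `paste` are made up for this example.

```python
# Sketch of alpha "over" compositing on tiny in-memory images.
# An image here is a list of rows; each pixel is (r, g, b, a) with values 0.0-1.0.

def over(fg, bg):
    """Blend one foreground pixel over one background pixel."""
    fa, ba = fg[3], bg[3]
    out_a = fa + ba * (1.0 - fa)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)  # both pixels fully transparent
    rgb = tuple(
        (f * fa + b * ba * (1.0 - fa)) / out_a
        for f, b in zip(fg[:3], bg[:3])
    )
    return rgb + (out_a,)


def paste(background, element, left, top):
    """Return a copy of background with element blended in at (left, top)."""
    result = [row[:] for row in background]
    for y, row in enumerate(element):
        for x, pixel in enumerate(row):
            ty, tx = top + y, left + x
            # Skip element pixels that fall outside the background.
            if 0 <= ty < len(result) and 0 <= tx < len(result[ty]):
                result[ty][tx] = over(pixel, result[ty][tx])
    return result
```

An element generated in isolation (a door, say) would be cut out so that everything around it is transparent, then pasted at the spot where it belongs in the location image.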


One great trick I found is to add “interior” and “wide angle” to the location prompt. For an adventure game this tends to create very, very nice looking locations that could almost be used as is.

As you can see, interiors work great. I decided to create the scenes void of characters as we need more individual control over those. One issue I ran into was the lack of doors. For this I decided to let the AI create the doors separately and place them in the locations by hand.

The same technique will be used to add other elements to the scenes that might be hard to explain to the AI.


Exteriors also work quite well. The biggest issue I faced was that sometimes the walkable areas are a little hidden. These scenes might still work for establishing shots.

For an adventure game type deal these exteriors also work very well. We will never run out of areas to visit!


For the best bang for the buck, I figured we could ask the AI to create the characters separately, and I would model them in 3D and project the AI generated imagery onto the meshes. This allows us to use motion capture to add animations for the AI generated characters.

Most background characters will work with just a very simple one-sided projection. But for the main characters we need to piece them together from multiple AI generated characters to get full coverage.
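To show what a one-sided projection amounts to, here is a minimal sketch, assuming the simplest case: an orthographic projection straight from the front, with the mesh given as a list of (x, y, z) vertices. A real 3D package does this with its own projection tools; the function name `planar_uvs` is made up for this example.

```python
# Sketch of a front-facing planar projection: each vertex gets a texture
# coordinate from where it sits in the mesh's front view. Depth (z) is ignored.

def planar_uvs(vertices):
    """Map (x, y, z) vertices to (u, v) pairs in the 0.0-1.0 range."""
    xs = [v[0] for v in vertices]
    ys = [v[1] for v in vertices]
    min_x, min_y = min(xs), min(ys)
    # Fall back to 1.0 so a perfectly flat edge-on mesh does not divide by zero.
    width = (max(xs) - min_x) or 1.0
    height = (max(ys) - min_y) or 1.0
    return [((x - min_x) / width, (y - min_y) / height) for x, y, _z in vertices]
```

Because depth is ignored, a vertex on the character's back lands on the same spot of the image as the vertex directly in front of it. That is acceptable for a background character seen from one side, and it is the reason a main character needs separate front and back images.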

Using rigid body and cloth simulations on hanging hoses, scraggly cloth, etc. will add great life to the characters and make them less like projected paintings on simple geometry.

I found out that adding “crew” after the character made the AI create multiple versions for me to sample all at once and then kit bash to make the character textures. Also I need to create the front and the back versions for full coverage.

Once I have the characters modelled, they will be usable from all angles in any background, allowing us to overcome the AI limitation of not being able to replicate the same characters in different shots. We are also able to tap into a vast library of pre-existing mocap content and easily create new animations using the Rokoko suit or AI video to motion conversion.

I will use similar projection modelling to get things like vehicles and robots into the game.

