As a fun little prototype, I wanted to see how much work it would be to use AI to create as much of the art as possible for a 2.5D point-and-click-style game.
I figured the backgrounds would be a given, as the game would pretty much be 2D, but the characters were more complicated. I would need a lightning-fast pipeline, one that lets me use mocap data so that creating animations would not be a bottleneck. I needed a way to translate AI-generated art into a form that can accept motion capture.
The best way seemed to be to generate character concepts with AI and model them into 3D.
I have been working in the games industry for 20+ years, modeling characters and backgrounds for a large part of it, so it would be interesting to see whether AI would let me work faster, and by how much.
I planned to use the characters in wide shots only, so any low-quality or messed-up details would not matter as much. That said, AI tech has come a long way, and the coherence of the art is far better than it was mere months ago.
I started by asking the AI (Midjourney in this prototype, though I use Stable Diffusion more) to make me a model sheet with turnaround images of a character, similar to one you would have at an animation studio. That would be a good starting point for a thing like this.
The first result was very promising. It was not quite what I needed, though, as it was in black and white, which model sheets usually are. I try not to reference any living (or dead) artists. With AI it is impossible not to pick up influences from a bunch of artists, but this way I hope the waters are muddied enough that there is no direct reference to anyone specific.
The second try was not great either. But it was a step in the right direction.
After some tweaking I had a prompt that produced art I could use as a base for the 3D step:
The images had surprisingly good coherence across the different angles. These are very usable for modeling and texture projection. There is one major issue though: the images have perspective shift and are often from odd angles. But I had some ideas on how to overcome these issues.
As these images are created by an AI, no one has a claim to their copyright. That is the “contract” you sign when working with AI at the moment. They can be freely used by anyone, as replication of these images is pretty trivial. At least this is how I feel about AI art.
I chose an image the AI produced that had a good front and side view, but a challenging back view, just so I would have to solve that problem and see how it affects things. Is it even possible to extract textures for the final model from this?
I started by using the side and the front view to get a first draft of the character going in Modo. Once the character was done, it was clear that the different texture projections did not match properly on the mesh.
To fix this, I created morph maps for each projection, making sure the front, side and back three-quarter views matched the mesh as well as possible.
Next I needed to do an old-school UV unwrap on the mesh. It is time-consuming and tedious, but it had to be done properly so that blending the textures together in Photoshop would be easier. After the UVs were done it was time to project all the different images onto the UVs using their respective morph maps.
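I blended the projections by hand, but the idea behind deciding which view "wins" where can be sketched in a few lines. This is only a minimal illustration, not what Modo or Photoshop do internally: each projection is weighted by how directly the surface faces the camera it was projected from. The axis layout, the mirrored side view, the `sharpness` value and all function names are assumptions made up for this example.

```python
import math

# Assumed view directions: unit vectors pointing from the character toward
# each projecting camera (+Z front, +X side, -Z back). Hypothetical layout.
VIEWS = {
    "front": (0.0, 0.0, 1.0),
    "side": (1.0, 0.0, 0.0),
    "back": (0.0, 0.0, -1.0),
}
# Assumption: the single side drawing is mirrored onto both sides of the mesh.
MIRRORED = {"side"}


def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("zero-length vector")
    return tuple(c / length for c in v)


def projection_weights(normal, sharpness=4.0):
    """Return per-view blend weights (summing to 1) for one surface normal."""
    n = normalize(normal)
    raw = {}
    for name, direction in VIEWS.items():
        facing = sum(a * b for a, b in zip(n, direction))
        if name in MIRRORED:
            facing = abs(facing)
        # Surfaces facing away from a camera get no contribution from it.
        raw[name] = max(0.0, facing) ** sharpness
    total = sum(raw.values())
    if total == 0.0:
        # No view sees this surface (e.g. top of the head): fall back to an even mix.
        return {name: 1.0 / len(raw) for name in raw}
    return {name: w / total for name, w in raw.items()}


def blend_colors(normal, colors):
    """Mix the RGB samples from each projection using the weights above."""
    weights = projection_weights(normal)
    return tuple(
        sum(weights[view] * colors[view][i] for view in weights) for i in range(3)
    )
```

A higher `sharpness` gives harder seams between views, a lower one gives softer but blurrier transitions. Areas that fall back to the even mix are the ones that need hand painting.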
The model was now finished and ready for rigging, for which I used Mixamo. In the past I have not used any auto-rigging tools, but in the spirit of efficiency I wanted to see whether this process could be made as quick as possible. Mixamo did a pretty good job of rigging the character, but I needed to do some touch-up, especially on the beard.
The background image was a lot simpler, as it would be used as a 2D image, not a 3D model.
Even though the image is 2D, it takes some work to merge the 3D character into it beautifully. First, we need to get the camera data from the image. I used a free tool called fSpy to reverse-engineer the camera.
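To give a feel for what "reverse-engineering the camera" means, here is a minimal sketch of the standard geometry behind this kind of tool. It is not fSpy's actual code. It assumes you have marked two vanishing points belonging to perpendicular directions in the scene (say, two walls meeting at a corner), that the optical center sits in the middle of the image, and that pixels are square. The function names are made up for this example.

```python
import math


def focal_length_from_vanishing_points(vp1, vp2, principal_point):
    """Estimate the focal length in pixels from two vanishing points.

    For two perpendicular scene directions, the rays through their vanishing
    points must be perpendicular too, which gives
    f^2 = -(vp1 - p) . (vp2 - p), where p is the principal point.
    """
    px, py = principal_point
    dot = (vp1[0] - px) * (vp2[0] - px) + (vp1[1] - py) * (vp2[1] - py)
    if dot >= 0.0:
        raise ValueError("vanishing points are not consistent with perpendicular directions")
    return math.sqrt(-dot)


def horizontal_fov_degrees(focal_px, image_width_px):
    """Convert a focal length in pixels to a horizontal field of view."""
    return math.degrees(2.0 * math.atan((image_width_px / 2.0) / focal_px))


# Example with made-up numbers: a 1000x600 image, vanishing points far off
# to the right and to the left on the horizon line.
focal = focal_length_from_vanishing_points((1500.0, 300.0), (-500.0, 300.0), (500.0, 300.0))
fov = horizontal_fov_degrees(focal, 1000.0)
```

AI-generated backgrounds do not always have consistent perspective, so the vanishing points may not agree perfectly and the result is best treated as a starting point to adjust by eye.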
A 3D version of the location is required so that the perspective shift on the character matches, and so that the character can pass behind objects. It is also useful for setting up the lights in 3D space and for casting shadows from the location onto the character.
Blender also has a great tool for generating UVs from camera projection, so it was the perfect tool for this step, although I needed to learn everything from scratch as the software is very alien to me.
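Generating UVs from a camera projection boils down to asking, for every vertex, where it lands in the image when seen through the solved camera. The sketch below shows that idea with a simple pinhole model. It is not Blender's API, just the math, assuming the point is already in camera space with the camera looking down its negative Z axis, the optical center in the middle of the image, and a focal length expressed in pixels. The function name is hypothetical.

```python
def project_to_uv(point_cam, focal_px, width_px, height_px):
    """Project a camera-space point to image UVs in the 0..1 range.

    Assumes a pinhole camera at the origin looking down -Z, with +X to the
    right and +Y up, and (0.5, 0.5) as the center of the image.
    """
    x, y, z = point_cam
    if z >= 0.0:
        raise ValueError("point is behind the camera")
    depth = -z
    # Perspective divide: farther points move toward the image center.
    pixel_x = focal_px * x / depth
    pixel_y = focal_px * y / depth
    u = 0.5 + pixel_x / width_px
    v = 0.5 + pixel_y / height_px
    return (u, v)


# Example with made-up numbers: a vertex 10 units in front of the camera,
# 1 unit right and 1 unit up, seen in a 1000x500 image.
uv = project_to_uv((1.0, 1.0, -10.0), 1000.0, 1000.0, 500.0)
```

Because every vertex gets its UV straight from the camera's point of view, the background image lines up on the rough mesh only from that one camera, which is exactly what a fixed-camera point-and-click scene needs.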
Now that I had all the pieces of the puzzle, it was just a matter of putting them together. I used Unity for this. I created a simple animator for the character locomotion, scaled the 3D location to match the character, and set up some lights to match the scene lighting so the character would not stand out so much.
I spent 18 hours working on the 3D assets and Unity scripting for this prototype: 12 hours on the character (model, UVs, morphs, textures, rigging), 3 hours on the location camera and mesh, and the remaining 3 hours on game scripting. This does not include any of the AI prompting.
By my estimate, the AI saved me at least 2 days of work on the character and 3 on the location.
The end results are not perfect by any means. Closeups of the character are not ideal. The location is hard to art direct and can be very random. But if you accept the AI's shortcomings and work around them, creating a full game using AI as a co-worker is perfectly possible. It enables the creation of games that would otherwise be left unmade because of budget or time constraints, for sure!