AI-generated art is growing incredibly quickly. In most cases, these tools let you type in a short description of what you want, and the system creates the image automatically.
A recent episode of the Cortex podcast got into it quite a bit, and it opened my eyes to some great tools out there. Marques Brownlee has a great video that walks through the DALL-E 2 technology, arguably the best software out right now:
While most of us aren’t able to access DALL-E 2 at this point, there are a variety of other tools available. The most accessible is one called Stable Diffusion, which you can install on a Windows computer right now with some effort, or on a Mac with a simpler installation. It can be a bit messy to set up, but it works, mostly.
I’ve run a bunch of examples through it. Some came out very poorly, like “three men standing at a whiteboard”:
Or maybe “homer simpson as a disney princess”:
That one is still somewhat impressive from a logic perspective (that’s Homer dressed up kind of like Belle), but it’s a mess.
Some did a bit better, like a “happy dog wearing a suit”:
Or maybe “walter white in fortnite”:
Not yet…
Ultimately, the technology isn’t quite there yet, but it’s close. As the video above shows, it can already do some amazing things. So what does that mean for us?
In the very short term, I think it could be good for things like stock imagery. While my “whiteboard” example above wasn’t good, other tools handle that kind of prompt better, and they’ll be a fantastic way to get stock images for anything you need.
Not too much later, though, we’re gonna see this used for more nefarious purposes. DALL-E has intentional restrictions built in to avoid this (no specific people, no adult content, etc.), but as more of these tools become open source, that will change quickly. There’s already so much fake news out there, and when you can couple it with a very realistic-looking fake image, things will get much worse. Photoshop can already do this, but it takes skill and time. When anyone can just type in something like “Donald Trump choking a man” and get a result, we could see a flood of those kinds of images.
Further down the road, perhaps in 3-5 years, we’ll see this technology move to video. It’s a bit slow for that now, but it won’t be too long before we can get a realistic-looking video of anything we want, which has huge implications.
That opens up the need for tools that can help to verify images and video as being authentic, but I’m not sure how that would really work. It’ll be interesting to see what people come up with to counter these kinds of tools.
In the meantime, it’s worth playing with tools like Stable Diffusion to see what’s possible and get a better sense of where we might be going.