Inspired by images of the universe recently released by NASA, the first prompt I gave the Midjourney research lab’s artificial intelligence (AI) tool was “a spaceship surrounded by galaxies”. The result, as shown below, was an image of a ship suspended in space that seems to mirror the cosmos around it, pretty much true to the prompt.
For Midjourney founder David Holz, a powerful aspect of generative AI is its “ability to unify with language”, where we can “use language as a tool to create things”. Simply put, generative AI uses user commands to create new images based on the data set it has learned from different sources over time.
The rise of text-to-image generation has also raised philosophical questions about the definition of an “artist”.
British mathematician Marcus du Sautoy asserts in his book, The Creativity Code: Art and Innovation in the Age of AI (2019), “Art is ultimately an expression of human free will and until computers have their own version, art created by a computer will always be traceable to the human desire to create.” He states that if we were to create a ‘mind’ in a machine, it might offer insight into its thoughts. “But we are still a long way from creating conscious code,” concludes du Sautoy.
Similarly, Holz notes, “It’s important that we don’t see this as an AI ‘artist’. We think of it more as using AI to augment our imagination. It is not necessarily about art but about imagination. We ask, ‘what if’. AI somehow increases the power of our imagination.”
Midjourney lets its users feed their prompts to a bot on its Discord server, which then generates four images matching the text. The user can choose to explore more variations or upscale the best fit into a higher-quality image. The bot entered open beta last month, giving users a number of free trials to bring their imaginations to life. The generated images can also be turned into NFTs, for which Midjourney charged royalties until recently.
“It’s a giant community of almost a million people who all make pictures together, dream and have fun with each other. All invites are public and everyone can see each other’s images…it’s quite unique,” Holz told indianexpress.com.
Holz co-founded Leap Motion, a company that builds hand-tracking motion-capture user interfaces, in 2010, and was on the 2014 Forbes 30 Under 30 list. He now runs Midjourney, a small, self-funded research and design lab that is exploring a range of projects, including the AI visualization tool, with 10 other colleagues.
Elaborating on the response the AI bot has received, Holz says, “Many people are very happy and find using the product to be a deeply emotional experience. People use it for everything from projects to art therapy. There are people who have always had things in mind but were unable to express them before. Some people have conditions like aphantasia, where the mind can’t visualize things, and they’re now using the bot to visualize for the first time in their lives. Lots of great things are happening.”
Midjourney also works to prevent misuse of the platform to generate offensive images. Its community guidelines urge users to refrain from using prompts that are “inherently disrespectful, offensive, or otherwise abusive” and from generating “adult or gore content”. Midjourney also employs moderators who watch for policy violators and either give them a warning or ban them. It has automated content moderation as well, under which certain words are banned on the server. The AI, too, learns from user data, says Holz. “If people don’t like something, it generates less.”
I stumbled across the Midjourney bot while scrolling through my Twitter feed, where I saw the user psychedelhic’s renditions of a somewhat post-apocalyptic Delhi.
Having tried AI bots like Disco Diffusion and Craiyon before, I found that an interesting part of discovering Midjourney was seeing how different AIs respond to the same text. The images below show results generated with the same prompt, “city during monsoon rains”, by Midjourney; Disco Diffusion, a free AI tool hosted on Google Colab; and Craiyon, formerly known as DALL-E mini.
While Craiyon offers relatively realistic images, Disco Diffusion shows surreal and impressionistic results, and Midjourney falls somewhere between the two.
According to Holz, Midjourney can be understood as a “playful and imaginative sandbox”. “The goal is to give everyone access to this sandbox, so that everyone can understand what is possible and where we are as a civilization. What can we do? What does this mean for the future?”
Holz dismisses fears that AI is here to “replace” humans or their jobs. “When computer graphics were invented, there were similar questions: will it replace artists? And it did not. On the contrary, computer graphics make artists more powerful,” he says.
Holz adds, “Every time we see something new there’s a temptation to try to figure out if it’s dangerous and we treat it like a tiger. AI is not a tiger. It’s actually more like a big river of water. A tiger is dangerous in a very different way from water. Water is something you can build a boat for, you can learn to swim, or you can create dams that generate electricity. It’s not trying to eat us, it’s not mad at us. It has no emotions or feelings or thoughts. It’s like a powerful force. It’s an opportunity.”