DALL-E. Meta. Firefly. Stable Diffusion. Like it or not, it's plain that the AI image generation market is undeniably oversaturated now. Still, there has always been one standout.
To me, it's obvious that Midjourney was the best AI image generator in the business. However, I do acknowledge that it still has some flaws, particularly with generating realistic images and handling long prompts and text.
That's why I've been patiently waiting for Midjourney V6, and last night, it finally came. I quickly hopped on Discord and started generating as many images as I could. Let me tell you a quick spoiler: it's worth the wait.
Here are some of the best images I created using Midjourney V6, along with the results of the same prompts applied to Midjourney V5.2:
Midjourney V5 and V6 Output Comparison
It's been a little over 24 hours since Midjourney V6 came out and let me tell you: the hype is real. This has been, by far, my favorite image generator. It somehow fixed every single one of my problems with the previous version. Here are some of my favorite examples:
Portraits
a woman lying in bed with her eyes closed, golden hour, closeup
My biggest gripe with Midjourney was that it couldn't really generate realistic images on par with DALL-E or Meta. The release of V6 seems to have solved that problem. Its realism is on a whole different level now. No more waxy faces and exaggerated features. V6's output is so good that, even if you zoom in, you can see the imperfections that make us human. That's an immense improvement.
Landscapes
landscape, an autumn in the lake during dusk, tranquility
Don't get me wrong: V5.2's image is pretty good, but it's not exactly the look I'm going for. I'm looking for realistic lake images, something that V6 was able to give me. This upgraded version can create authentic-looking images without sacrificing artistic quality. It's way better than DALL-E 3 on this front, in my opinion.
Product Renders
product photography, a perfume, studio lighting, shadow play, jasmine, soft
I'll admit: I'm not too sure about this one. The key difference is that the product images I'm getting from V5.2 look processed and market-ready, while V6's look more raw, like they were taken straight out of a camera. It might have something to do with the phrasing of my prompts, since I've gotten used to cluttered V5 prompts, something that I need to work on given V6's improved nuance.
I will say this though: if you're a seasoned editor looking for detailed, well-shot raw images, V6 is a lot better than V5.2.
Movie Stills (Animated)
animated movie still, a young girl following a magical cat to a tree,
inspired by hayao miyazaki, whimsical, magical realism, clean lines, detailed, 8k
This is a great time to talk about nuance. In my prompt, I specifically requested a film still that looks like Hayao Miyazaki's work. V5.2's output didn't follow this at all, instead going for a generic 3D DreamWorks style of animation. On the other hand, V6 followed this instruction to a tee. It looks straight out of Howl's Moving Castle.
I also highly suggest you zoom in and look at the details in V6's output. The still is so much more vivid and lively. It's genuinely mind-blowing how much Midjourney has improved over the last couple of months.
Movie Stills (Live Action)
film still, back shot of a man in a green jacket, symmetrical,
muted colors, directed by wes anderson
Midjourney V5 definitely had a problem with oversimplifying or overcomplicating prompts, especially ones with a lot of context. Look at the example above: I kept it minimal, but V5 still wasn't able to be creative with the prompt it was given. V6 solves this problem by filling in the gaps of my prompt while retaining its original idea.
PS. Yes, I know. The guy is missing his right ear but hey, it's V6's first week!
Flat Illustrations
logo for a shoe company, clear background, paul rand
I never really had any issue with generating logos with V5.2 but, after seeing these images side-by-side, I can tell in hindsight that there was room for improvement. V6's output keeps the minimalism of V5.2 while adding its own unique spin to the illustrations, which gives them more identity.
Surrealism
the planets in the galaxy as hatching eggs of lovecraftian entities,
surrealism, cosmic, lovecraftian, ethereal, celestial bodies
I've always praised Midjourney's surrealist images as one of its strong points. However, it tends to overpopulate its outputs with so many subjects that you often can't figure out what's going on, something you can see above.
V6, with its improved nuance, manages to strike a balance between fulfilling the prompt and being creative. You can now clearly see what it's trying to portray, even with little to no information about the subject.
Text Generation
a restaurant in a quiet stylish neighborhood with a neon sign that says “Closed”,
night, streetlights
One of Midjourney's biggest promises before V6 came out was that it would fix its text generation, which is a huge problem across all AI generators. The only one I've tried that's decent on that end is DALL-E 3, but it looks like Midjourney's next in line.
It perfectly wrote “Closed” in the V6 image, even adding its own flair. As for V5.2, well, unless you've got a restaurant called “CORSTARB,” I don't think it's cut out for text generation.
However, it's still not perfect, as you can see here:
comic panel, panicked captain america yelling “Get out of here”, speech bubbles, gritty
This just shows that Midjourney still doesn't fully recognize letters, since it's missing a word from my prompt. In my opinion, this works best with one- or two-word texts only. But hey, it's miles better than its competitors. Even DALL-E 3 isn't this good.
High Context
A detailed oil painting of an old sea captain, steering his ship through a storm.
Saltwater is splashing against his weathered face, determination in his eyes.
Twirling malevolent clouds are seen above and stern waves threaten to submerge
the ship while seagulls dive and twirl through the chaotic landscape.
Thunder and lights embark in the distance,
illuminating the scene with an eerie green glow
Just a heads-up, I borrowed this prompt from OpenAI's DALL-E 3 page. It's often hard to think of elements to add to a prompt. This is also a prompt that OpenAI used to test DALL-E's nuance, so I could run it through V5 and V6 as well, and then compare.
V5.2 actually did a pretty good job, but still missed a couple of elements like the eerie green glow, seagulls, and thunder. V6 followed everything except the seagulls, though there's still one solitary seagull in the background, so this one passes the smell test.
So, Did It Improve?
It did improve, by a lot.
I couldn't show you every test I've done yet (I'm reserving some for my next article) but it's already 100 times better than V5.2 in my book. It managed to solve the text generation and nuance issues while simultaneously improving its creativity. Every image I've created so far with V6 is crisp, detailed, and accurate.
What else is there to ask for?
The Bottom Line
When V5 came out, some said that it was a backward step from V4.
Gradually, the team listened to the community and improved its creativity, even adding some functionalities in the process. The result was Midjourney V5.2, which was already my favorite AI image generator on the market.
Midjourney V6 is a huge improvement on V5.2. It took everything that was already good with V5.2 and significantly tweaked its model to create more detailed and accurate images. Everything that I've complained about with V5.2 (nuance, text, realism) has been fixed, and then some.
The best thing is that we can only expect it to get better from here on out. The Midjourney team is already crowdsourcing user opinions on images through A/B testing to improve its model.
Mark my words: Midjourney V6 is a turning point in AI image generation. Overnight, tons of graphic designer jobs might've just been wiped out. This might've been the very first realization of that.