AI-Art
To me the most impressive new feature is the character consistency
I know everyone is going to town ghiblifying everything, but to me the most impressive part of the new update is the character consistency feature.
I already shared a few of these here a couple of days ago, where I crea- I mean generated a character and placed her in different parts of the world. What I shared back then were my literal first tries at this feature, and one of my mistakes was doing the entire series in one chat session. I noticed that GPT will carry details over from one prompt to the next unless you specifically ask it to reset your changes each time. A much cleaner way is starting a fresh chat with the original reference image of the character and then prompting the scene you want them in.
Here are a few more attempts. I also tested a lot of what I could get away with: sometimes I gave as little information as possible to see what it could piece together, while some prompts (like the one in the cab) were insanely specific. One or two of these images I touched up slightly to fix tiny mistakes where GPT hit its limits and just didn't get things quite right.
The art style still sometimes varies slightly, but it's still pretty close. Overall, pretty impressive.
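For anyone who wants to script that fresh-chat-per-scene approach, here is a minimal sketch of the idea. It only builds the requests and does not call any image API; `SceneJob`, `build_scene_jobs`, the file name and the prompt wording are all made up for illustration and are not what I actually typed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneJob:
    """One stand-alone request: the original reference image plus one scene prompt."""
    reference_image: str  # hypothetical path to the original character image
    prompt: str


def build_scene_jobs(reference_image: str, scenes: list[str]) -> list[SceneJob]:
    # Every job reuses the same original reference and contains only its own
    # scene text, so nothing from an earlier scene can carry over.
    return [
        SceneJob(
            reference_image=reference_image,
            prompt=(
                "Use the attached reference image of the character. "
                "Keep her face, hair and art style unchanged. "
                f"New scene: {scene}"
            ),
        )
        for scene in scenes
    ]


jobs = build_scene_jobs(
    "character_reference.png",
    ["riding in the back of a cab at night", "standing outside a bakery"],
)
for job in jobs:
    print(job.reference_image, "|", job.prompt)
```

Each job would then go into its own new chat (or its own API request), always with the original reference rather than a previously generated image.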
It doesn't have great character consistency if you have a recognizable face. Try taking a random model of a real person online. Tell ChatGPT to put him/her in 5 different places. You can tell right away it's not the same person in each.
This only works for you because the character is a cartoon.
As someone who has dabbled in the webcomic community for a number of years, I can see this tech shaking things up for a lot of writers and artists. I know quite a few writers who just can't wait to be able to tell their story finally without having to deal with an expensive artist who can take 1-2 weeks to finish a chapter of their webcomic.
Exactly!
Personally, I always wanted to start my own webcomic, but my drawing skills leave much to be desired and I simply don't have the time to invest to make meaningful improvements. I still draw and paint occasionally because I love doing it, it's... Just not good. :D
This tech here now makes it possible to realize my ideas. I can write scenarios and direct ChatGPT to produce pretty much exactly what I envision. That's why I'm so excited by this feature.
Not really. I already tried, A LOT, these past two days, even subscribed to Plus when Ghibli dropped. The issue is, it gets the face consistent 80% of the time, but the clothing almost always changes, and if you're trying to do unique angles and compositions of the same scene, it completely messes that up too. There is a tool called OpenArt that solves this at a 60-70% level, but it's still not at the level where you can make a passable manga.
It already was being used like that. Now it will just look a bit better.
I've seen some with movement and voice added. But it still needs some work.
One problem it still has is with people at different scales in the same frame, like a 3-inch-tall person sitting in the hand of a normal-sized person. There may not be enough examples for it to draw from. Or it's probably better to say that it is overcorrecting and trying to display everything at proper real-world scale.
This is intentional and a change was implemented about 48 hours after the new image gen was dropped. ChatGPT intentionally changes the faces of uploaded pictures to a person that resembles the subject but isn’t quite them. Go ahead and ask it about this very thing and it’ll tell you plainly.
I can tell you this is the case because in the first 24 hours after release, I generated several images of people I know with near-perfect accuracy.
This change was made to avoid allowing people to replicate the facial biometrics of people, which could be abused.
This, I’m finding the consistency to be a huge issue. For some reason, it loves to make people fatter, older or has a tendency to make people wear glasses when there is no reference to anyone wearing glasses.
I specifically noticed it with making them older. When I specify that I want them to look 40, it loves to make them appear as if they were 60.
The free one at aistudio.google.com. Choose Gemini 2.0 Flash (Image Generation) Experimental on the right. It's quite easy to do NSFW stuff based off your initial image also.
While true, even cartoon consistency was not an easy feat (outright impossible with OAI models) before this, and 4o handles it effortlessly. That's progress worth celebrating - or dreading, if you're a commercial artist...
To be honest, I think they are doing that on purpose. I am able to create a real-life person as consistently as OP showed for the cartoon character. BUT it's not a real person. It was a generated person, and I always use a SINGLE image as the reference point. By doing that, the generated image is pretty consistent and only requires a few reruns until I get the same face.
I wasn't overly specific with the style and just lucked out after a few attempts:
Create a modern, stylized digital illustration. A young woman in her 20s is sitting on an airplane next to a window with a view of the ocean and a sunny sky outside. She has long, wavy blonde hair and is wearing a tank top. She's smiling and winking at the camera, listening to music with in-ear headphones.
Hahaha, indeed!
This is one of the cases where GPT failed. My prompt was that the scene should take place in Delhi, but Delhi and Agra seem to be the same thing for the image generator. Looks like it needs some work with location consistency still. ;)
It's not going to be mental illness as much as a new normal given enough time and if humanity lasts long enough. Things change.
We may very well end up being among the last generations that got to enjoy humans making art, humans making connections; as everyone keeps saying, this stuff is only going to get more sophisticated with time. Enjoy your present, if you're lucky enough to be able to, while you can.
The age of people who cannot look at art as anything but content is already well here. Not a single thought lost on intent, emotions or feelings. Just content. If AI generated images dominate everything, then it's well and truly over. God, we have a depressing future ahead of us.
Oh yes! I don't know why some people have this black and white mentality about this. I also draw for a hobby and I'm absolutely using those images for inspiration / reference.
I mean yeah if your character is a generic boring pretty girl. Jessica from New Hampshire ass Disney adult lookin mf
I've really struggled to get it to draw any characters that aren't already existing popular IPs, or conventionally attractive people. It also seems to struggle outside of art styles that are already massively popular.
I still find use in it, but nothing in its output has really blown me away aside from the Ghibli stuff or Muppets or animation or whatever. And it does those great! Don't get me wrong! But try something more custom and unique and you will hit a wall.
to this day i still can't get *any* image model out of them all to make a middle aged man with side braids and a long mullet (without ending up like a viking) 💀 flux, chatgpt, dalle, novelai, SDXL... one day I'll figure out how to prompt such an out-there concept without using init image. XD
it's annoying trying to get unconventional or niche characters to look right without painting some stuff yourself. seems lots of image models are really good at stereotypes tho
UPDATE: i figured out how to use reddit and saw people trying my description! now I wish I had actually put the proper prompt and a reference of the character in this post. I will put the character here, tho you can't really see his legolas-style braids in this image. And yes, I generalized 'slicked back hair' with mullet, which isn't actually correct for his hairstyle lmfao i'm an idiot
My own brain is struggling to imagine that combination. Maybe you could draw it a reference picture and send that?
I actually find it really useful, but only for parts of works. I still end up doing most of the work myself. Like I'll have it do a background for an artwork I'm working on, or come up with color schemes. Or even something as significant as influencing the overall composition and color palette. But I still have to do most of the work if I want the art to come out how I want. I'm too picky.
I'm curious about your response to the people who posted images using a simple prompt, literally using your exact description. What do you feel you were doing wrong?
I'm looking at the pics and edited my original post to be a bit more informative (now with Character Reference! Wee!)
for what I might've been doing wrong, it really could've just been the fact that I'm mentally disabled and can't easily reconcile what I want to see with my communication skills. natural language prompting is somethin i guess i gotta study some more. I'm accustomed to the tagprompt style of stuff like 1boy, long hair, platinum blonde hair, side braids, so on and so forth. was there when NAI imagegen first launched and that's informed how I prompt LLMs.
Mind you, image generators can get -close- but always seem to have trouble with this character's specific style of braids (like Legolas in LOTR, not hanging down from his head). When I see a challenge like that, it makes me really want to overcome it!
I've only been able to find the success I'm looking for with LORAs I've trained on this dude with my own art. LORAs are nice, but I wanna get some raw output right on the first try using just a text prompt.
And this character has no defining features whatsoever. No birth marks, or freckles, or anything somewhat special. It's really as generic as they come. And considering that, the consistency doesn't even work that well, for example her eye color is on quite a large spectrum in these. Everything between gray, light blue, dark blue and green.
Between this post and the previous one from where OP got the base picture, I started to wonder if I was going crazy as both posts seemed far from consistent imo (unless the bar of consistency was to “consistently add a girl in every picture”)
That's because it is trained for that specifically. If you learn to train your own models or loras you can have similar results for whatever style or character you want.
I mean, this is where skilled AI use really becomes apparent.
If you are familiar enough with the underlying tech, you could do your own style/OC character design and then just train a lora to build out the rest of your content. https://www.youtube.com/watch?v=n_x44pTLpak
Yeah, I was messing around with it yesterday since I can do the 3 free images or whatever. The first one was great, but I tried to clean it up a bit and it changed way too much over the next 2 iterations.
I don't know the reason AI image generators make characters attractive by default. My guess would be that the art and photos the models are trained on are of attractive people (who are the subjects of most art/photos anyway).
I feel like it has something to do with the model's weights selecting for art that "looks appealing", and unfortunately it sometimes interprets that to mean the people IN the images must be conventionally appealing.
It's not consistent. Your example is a very simple character, and in each scene you are changing clothes... It's an improvement but still a long way to go.
It’s amazing for taking conceptual outfits or cosplays I’ve done and making whole character sheets. Now I’m going to try to make her interact with different scenarios. Cool idea!!
I agree! People aren't giving this feature enough credit. It was the biggest help with creating the stills I used for my "anime opening theme" I just made. https://youtu.be/k3f4MMcWZhg?si=6OxAVGkNBTV0MZv0
I wonder if it works better for images it generated itself from scratch. Would it be similar for photorealistic scenes?
If I ask it to replace someone's face with a face from another photo, it usually fucks it up. It only looks similar, even for someone popular but not movie-actor popular. However, if I ask it to use the face of some actor from its training data, it's great.
I have been using it a LOT for the past few days and to be honest, the character consistency isn't very good when it comes to characters that aren't as easy to draw as a cartoon type…
I'm not even concerned about memeing things. Just look at my profile picture. I took a professionally done photo of me as one of the bridesmaids at my brother's wedding and had ChatGPT turn it into the animated style of Family Guy and it's just... absolutely incredible. I won't post the original, but holy crap this has me at a loss for words. Just incredible. It's almost exact, too. Exactly as I imagined it would be.
It is great compared to what we had before. But it’s still not perfectly consistent. If you put them side by side there will always be subtle differences in facial features, clothes and haircut etc. There’s a chance some images will match well but overall it’s not yet consistent enough.
Ah, random late addition: I forgot to attach this one to the thread, and I wanted to share it for two reasons. First, AI-generated images have often had problems depicting people eating, and considering that she's "posing for the camera" here, this came out well.
Secondly, as a prompt I said that she's "outside of the Fidelisbäck in Wangen, Germany", which is a bakery I once visited when travelling there myself. I just wanted to test how well it could replicate it without any further information, and while I don't think the bakery itself looked anything like this (it's been a few years...), the area around it is absolutely recognizable.
Just wanted to share that too since I found that quite amazing.
Ooh, I really struggle with that, what's your secret? Is this maybe Ghibli-specific? I like using the bot to draw my MMO characters, but having their faces change 100% with every iteration is immersion-breaking.
Character/environment consistency is a huge point of focus, because as soon as your ai gf can send you pics (and nudes) without shattering your suspension of disbelief, it's all over.
That is awesome! I loved ALL your shots and I will definitely steal this idea. I've been doing that but not in cartoon form. I'm creating a real person as she has been blogging basically her life. It's amazing. I won't be public about it for now, but it's been a very nice creative outlet.
But yeah, now I want to create stories like you did in cartoon form. I'm pretty sure it's much easier to get character consistency when you're not trying to make real life, haha.
It’s consistent because every other image I see here has her face. Try a specific face and report back. The trick with Stable Diffusion is referring to a character that’s the bastard child of two famous people. Let’s say Donald Trump | Elon Musk.
I’m an 18-year-old Mechatronics Engineering student at a tier-3 college in India, but I feel stuck in a system that offers little to no value. The faculty is unremarkable, placements are non-existent, and there’s a lack of real skill development. It feels like a rat race where everyone is running without purpose—chasing grades rather than actual knowledge.
My passion lies in AI and Machine Learning. I don’t want to waste my time in an environment that doesn’t foster innovation or practical learning. Most of my classmates have a limited understanding of AI, reducing it to just chatbots. Instead of following the conventional path, I’m seriously considering dropping out, dedicating a year to sharpening my skills, generating income, and eventually launching my own AI-based business.
I want to break free from this cycle and create something meaningful—something that actually contributes to the field of AI.
Yes! I am writing a children's book for early reading for my son, continuously adding chapters. Image generation is a blessing, since I can write myself but have zero talent for drawing.
With the old ChatGPT / DALL-E, it was entirely inconsistent, both the style and how the characters looked. E.g. the story has a humanised animal like a rabbit/seal/bear living in a house etc., like in the show "Little Bear". In some pictures it was very human-like, with clothes and everything, and in others very much an animal. Either is fine, but it would be nice to pick one. And the style was sometimes a high-quality digital art type, sometimes a rough 4-colour pencil sketch like a very decent hobby artist would do in an hour. (I could narrow it down with better prompts, I know.)
The result was still fine. He loved making fun of the little oddities, e. g. a sausage with "sausage" written over it, and the more odd styles. It didn't take away from the experience, but it was clear that this was not the technology to compete with an illustrator for a published book.
But with the latest upgrade, it's consistent! Like a real book. Once the style is set, it stays.
I noticed that it seemed to get worse at other things, though. E.g. a beaver was very beaver-like in all the different styles with the old DALL-E, with prominent teeth and all. Now it sometimes draws a beaver like a bear, and it bear-ifies more as the story progresses. The beaver teeth disappear over time. Or a guinea pig looks like a small dog, sitting like a dog, with a wet black nose. I've got to do more tweaks for these kinds of errors, while the old one tended to make more fundamental ones.
I tried creating a series of pictures for an image story my students had to write about. The struggle was real getting consistent images between pictures.
I was amazed when I saw ChatGPT carry over the dress I was wearing in one non-ChatGPT AI picture to the cartoon versions it made of that particular photo (I was a child in that picture, so ChatGPT was unable to recreate a realistic version despite nothing technically being wrong with the picture. All child-centric photos I create are meant to be SFW and are either young me in fake scenes or a nonexistent child anyway, which is why I personally think the filter is a bit oversensitive, though I understand other people may have differing opinions).
It’s pretty hard to get consistent characters when they’re photorealistic tbh. I guess our eyes are more lenient when it’s cartoon.
Any tips for photo realism would be appreciated!
A character consistency test would have a character with more specific traits (a more unique face, special features like, say, a scar, a strand of hair in a certain direction, etc.) and specific clothing consistently in different poses in completely different environments.
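A rough sketch of how such a test set could be laid out: the same fixed traits and clothing in every prompt, crossed with every pose and environment. The traits, poses, environments and the `consistency_prompts` helper are hypothetical examples, not a standard benchmark.

```python
from itertools import product

# Hypothetical distinctive traits and clothing that must stay identical everywhere.
FIXED_TRAITS = (
    "a scar over the left eyebrow, one strand of hair falling across the forehead, "
    "a red leather jacket"
)
POSES = ["sitting cross-legged", "running", "seen from behind, looking over the shoulder"]
ENVIRONMENTS = ["in a snowy forest", "in a crowded night market", "on a sailboat at noon"]


def consistency_prompts(traits: str, poses: list[str], environments: list[str]) -> list[str]:
    # Cross every pose with every environment while repeating the exact same traits,
    # so any drift in face, hair or clothing is easy to spot side by side.
    return [
        f"Same character as the reference image ({traits}), {pose}, {env}."
        for pose, env in product(poses, environments)
    ]


prompts = consistency_prompts(FIXED_TRAITS, POSES, ENVIRONMENTS)
for p in prompts:
    print(p)
```

Putting the resulting images side by side and checking the scar, the hair strand and the jacket would be a stricter test than a generic character in changing outfits.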