
/qa/ - Questions and Answers

Questions and Answers about QA


File:00353-2700800976-girl, fac….png (365.08 KB,512x512)

No.96625

Anyone else been messing around with the stable diffusion algorithm or anything in a similar vein?
It's a bit hard to make it do exactly what you want, but if you're extremely descriptive in the prompt (or just use a couple of words) it gives some pretty good results. It seems to struggle a lot with appendages, but faces come out surprisingly well most of the time.

Aside from having a 3070, I just followed this guide I found on /g/ https://rentry.org/voldy to get things set up and it was pretty painless.


File:[MoyaiSubs] Mewkledreamy -….jpg (217.55 KB,1920x1080)

Ah, yeah, I've been reading up on it. I downloaded some 7GB danbooru thing for it. I wouldn't trust /g/ with an Etch-a-Sketch so I won't follow a guide from there, but I've saved some links from other places:

I'll get to trying this eventually, but so far I've just been procrastinating a bunch since I need to install and run python and do other stuff I don't understand. My VRAM is also only 6GB and I'm not sure if that's enough.


File:00147-149750438-megumin, k….png (254.67 KB,512x512)

I'm not too concerned with the theory at the moment; I just wanted to know what practically has to be done to get it running. That guide more or less amounts to downloading a git repo, a model (this is the sketchiest part, but you already did it), and Python 3.10.6. Then you run a bat file and it works. From what I can tell the web UI allocates 4GB of VRAM by default and you'll have to pass arguments to make it use more or less than that. It should run with an nvidia card that has 6GB.

That Krita plugin looks interesting, will check it out later.


File:62b580de9b13374d1a11e690fe….png (816.09 KB,1075x1500)

The face looks very Korean, I wonder if it's because the archetype is very common so AI is probably trained on a lot of it.


File:grid-0052.png (1.69 MB,1024x1024)

here are the other faces from that batch
the model i'm using is supposedly trained on a set of images from danbooru, not sure why it'd look korean specifically other than chance


I have exactly 0 (zero) interest in AI art. I have not saved a single file from one of them to this day even. I wouldn't call it being a hater, but they really are just fundamentally unappealing to me.


I don't get what all the fuss is about either. If you've seen one image, you've seen them all. They all have this weird quality to them. Maybe it's that there's absolutely nothing meaningful about them. It doesn't help that most of these images look like bad crops.


File:spaghetti.png (742.16 KB,576x704)

I'm in favor of it as long as the results resemble 2D ideals.


why is she stuffing her boobs with spaghetti....


Ever wondered why girls smell so good? This is why.


File:patchy2.png (2.86 MB,2200x1536)

img2img is neat


dat' polydactyly
wow, so even AI has trouble drawing hands


File:grid-0102.png (5.12 MB,2048x2048)

surprised by how well this batch turned out
some of these could pass for a mediocre artist's work


File:download (16).png (557.43 KB,512x768)

what a big girl


>so even AI has trouble drawing hands
Yeah, it must be related to how the algorithm copies things; it gets confused and can't do hands. With faces the parts have general locations and you can meld shapes a bit, but with hands it's trying to copy a bunch of different positions and angles into one and it breaks. Anime faces might be one of the best things for it since they don't even make sense to begin with in regards to angles.


File:1663263321-Beautiful waifu….jpg (39.5 KB,512x512)

I just steal other people's prompts and add waifu.
Also, if anyone else is on AMD on Windows, I followed this guide and it works: https://rentry.org/ayymd-stable-diffustion-v1_4-guide
Also also, if anyone can help me figure out how to change the output resolution, that would be swell.


Yeah, I've been somewhat surprised by the quality of the more 3DCG images I've seen from it, but when it comes to the more anime styles the AI falls short. There are probably subtleties it can't pick up in batch training because of the differences between artist styles, which is what causes these amateur-level drawings.


File:00080-1017444043-full body….png (502.04 KB,512x768)

I've been trying to create my Pathfinder character with it. I think this is the closest I've gotten, but it's still not there yet. I feel like I'm close, though...


Alright, I'm diving in. Might take a while to get stuff set up and figure out what I'm doing, however.


File:a.png (357.08 KB,512x512)

Making some progress...


File:b.png (368.9 KB,512x512)


File:index.png (Spoiler Image,4.25 MB,2048x1536)

Ehh, so many of these are horrifying, so I'm going to put them behind a spoiler. I think tomorrow I'm going to try that thing where you can selectively "refresh" parts of the image.


File:aaaaaa.png (343.09 KB,512x512)

oh no, this wasn't what I wanted at all!


File:hehe.png (372.77 KB,512x512)

I need to download the base model, this danbooru one isn't working the best for, well, non-"anime" stuff


Has science gone too far?


From the AI I've used myself, these aren't so bad


Is that an anthropomorphic "furry" Koruri?


File:waterfox_ZRSVcPMhoC copy.png (550.51 KB,953x1039)

Okay, what the heck. There's this "textual inversion" thing, which is a chuuni way of saying "custom-trained embeddings", and there are a few hundred shared ones for you to look at and download.
But, uhh...
Okay, the first one is an interesting find. Second one makes me think "okay maybe this isn't a coincidence" and third is "okay someone on a spinoff is involved with this".
There's like 500 of these total, mostly generic pop culture stuff, but these three REALLY stick out.


File:00410-493637731-bird sitti….png (850.78 KB,768x1024)

Perhaps unsurprisingly, someone released a furry model trained on e621 and it's able to do penises and sexual poses that the other databases can't. I think I'll make a thread on /secret/ for posting my experiments with it because porn tends to derail things.
Also, uhh... be very wary of trying stuff on the default model. It's trained on images of real people and I think there's going to be some legal challenges in the future.

Anyway, give me some prompts and/or images and I can mess with them if you don't want to configure this thing yourself. I have 3 models: a hybrid default/danbooru one, a pure danbooru one, and the aforementioned furry one. But I'm really bad at it and still need to learn how stuff works. I tried to turn furry Patchy into a bird but now she's a human.


File:00485-622766441-cats.png (618.9 KB,896x512)

this is pretty good, and it'd only get better if I had the patience to run it more times


File:grid-0087.jpg (1.07 MB,3584x2048)

Tsugu and Hagi didn't survive most of the attempts


File:935116e8b74134e41664779a19….png (Spoiler Image,278.45 KB,640x640)

AI generated Raymoo titties (NSFW)



the rendering and shape are good, but it's still making mistakes. It's just that it's focusing on something simple, so the mistakes are better disguised


did you use the prompts from stable diffusion to make that?


I didn't make it. I got it from the stable diffusion thread on 4/h/. I've been lurking it for a few days because it's a lot slower than the /g/ one and seems to have more technical discussion.

I just wanted to share it because I thought it was a pretty good generation.


There's a new model called Hentai Diffusion that was trained on Waifu (ugh) Diffusion and 150k danbooru/r34 images. I guess it'd be better at nudity?

You might need a huggingface account to download it. I have one because I was going to upload a set to train or whatever, but then I saw that there doesn't seem to be any way to use their GPUs without making it public, and they have rules against nudity. I also wouldn't want to upload an artist's work for others to exploit for real instead of just making stupid things on kissu.
Wish I had more VRAM. Oh well.


File:01040-248630214-girl's las….png (454.55 KB,512x512)

yeah. it's spooky.


File:01442-641596059-aria, aman….png (584.73 KB,512x512)

it's a lot harder to rationalize 'soul' and a human touch when you like results that are entirely mechanically hallucinated. maybe this means that art is more useful to understand an author than anything else.


I've seen AI that write code and this reminds me of some of the shortcomings people had with it.

While they were trained on a large database, it was often the case that the AI was effectively copying programmers from Stack Overflow and pasting that raw information into people's software.

I feel like it's almost the same case here. It took chunks from every artist it saw, essentially creating a collage with little creative problem-solving of its own... and when it does try, the result is simply a confused error rather than inference.

I was much more impressed by reimu's breasts


it's almost as if machines cannot think


File:a6b328da6c4d0e2e087ea99aa2….png (305.21 KB,512x640)

I've been messing with SD since last week using https://github.com/AUTOMATIC1111/stable-diffusion-webui and the danbooru model https://thisanimedoesnotexist.ai/downloads/wd-v1-2-full-ema.ckpt
I've been doing only txt2img because apparently I don't have enough GPU RAM for img2img (laptop GTX 1660 Ti).
A couple of images turned out to be cute; most are pretty bad, or maybe my prompts are bad, who knows.
I've been thinking of setting up my local server to produce anime images 24/7 with some script that autogenerates prompts, not sure how its GTX 970 would handle it though.
Not messed with lewds too much for now, going to download it and try.


File:20221004_130831.jpg (45.5 KB,448x640)


File:00190-2426397090-[blue_eye….png (347.18 KB,512x512)

nee nee, look at this military qt!


really hoping that's a cute boy and not a g*rl


Yes, science has gone too far.


You will live long enough to see robotic anime meidos for domestic use and you will be happy.


File:00006-863090008-anime youn….png (490.87 KB,704x512)

It's kind of interesting to see a real artist use it. I'm assuming he did the img2img thing, which uses an image as a guide, since it's got his style's wide face and ludicrously erotic body proportions.
This is a good example of how generic it looks compared to the real thing, which you can't really get around since generic is exactly how it's supposed to function. In theory people can (and will certainly try to) directly copy individual artists, but so far it's pretty bad at that.


File:00203-285684822-[blue_eyes….png (243 KB,512x512)

another cute (female)


When you really break it down, "AI" art is more or less the same thing as procedural level generation in games. The computer is provided with a set of rules, and then randomly generates something that follows those rules.

That's also why I can't see it outright replacing artists like a lot of people are afraid (or if they're psychopaths, hopeful) it will. You can generate all of the level design for your game procedurally, and a lot of games do (minecraft, for example). But "level designer" still exists as a profession for a reason.


File:grid-0024.png (2.92 MB,2048x2560)

the many faces of /qa/-tan


kinda neat


File:grid-0077.png (5.09 MB,2304x2048)

Well, the thread is mostly for posting stupid AI things, but only a couple of people are doing it, and it's annoying for me to run personally because it interferes with the videos or 3D programs I have open most of the time.
Also I was mostly just testing how good it is at doing penises, and the answer is that the furry one is passable, even on humans, but I won't derail the thread with porn


File:00171-2295644611-bird gba-….png (1.99 MB,1024x1024)

The "GBA Pokemon" embedding thing really isn't working for me. None of them are. I think you're supposed to use the exact model for them, but I'm not going to download a bunch of those since they're like 3-8gb each.


File:dumb arguing.gif (1.33 MB,1280x720)

I completely agree with this cute anon.


File:[mottoj] Tsukuyomi Moon Ph….jpg (109.55 KB,1024x576)

The serious discussion in this thread is being moved to a separate thread that is soon to be made. Brace for impact


File:grid-0000.png (1.82 MB,1920x1024)

Here is Laura with an AI-generated background; I masked out the original, which was blank white.


NovelAI's model has been leaked. hehehe. Meaning you can do it offline without paying them.
It's 52gb with multiple models, and I doubt I'll be impressed but I'm torrenting it anyway.


Can you post the link?



Thanks, adding it to the hoard.


But someone is replying about 'python pickles' and I have no idea what that entails. I guess he's warning people that it could contain a virus or otherwise have code in it? There's this link but I have no idea what it means: https://rentry.org/safeunpickle
Does anyone here know Python and can tell me what the thing above does? They made it sound like it's something you use to check for malicious stuff, or maybe I interpreted it wrong.
But, people on 4chan are already using this so I think it's safe


File:1632939770819.png (495.82 KB,1024x1024)

Pickle is a data serialization library: https://docs.python.org/3/library/pickle.html
Serialization means turning in-memory data like objects into a format that can be stored on disk or sent over a network. JSON is another common serialization format.
I don't use pickle much, but unlike JSON, which is plain text and only describes data, pickle can encode instructions to call arbitrary Python functions, so yes, deserializing untrusted data can execute code hidden in it.
After a quick glance it looks like that code overrides some of the functions described in https://docs.python.org/3/library/pickle.html#pickle.Unpickler
The overridden "def find_class(self, module, name)" seems to implement a whitelist so that only certain kinds of data (the kinds considered safe, I guess) can be loaded.
I can't guarantee that code actually protects against code execution though; if I were you I'd download it if you care, but wait some time before running anything and see what happens.
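The whitelist idea can be sketched in a few lines. This is the generic pattern from the Python pickle docs, not the actual code behind the rentry link:

```python
import io
import os
import pickle

# Whitelist of (module, name) globals that may be loaded from a pickle.
# Everything else gets rejected instead of imported and executed.
ALLOWED = {
    ("collections", "OrderedDict"),
}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")

def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain containers need no globals, so they round-trip fine:
print(restricted_loads(pickle.dumps({"steps": 20, "cfg": 7})))

# A classic malicious payload smuggles a function call into the pickle:
class Evil:
    def __reduce__(self):
        return (os.system, ("echo pwned",))

try:
    restricted_loads(pickle.dumps(Evil()))
except pickle.UnpicklingError as e:
    print("blocked:", e)  # the command never runs
```

Note this only stops the load step from calling non-whitelisted globals; it doesn't prove a given .ckpt is safe overall.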


All the AI talk is hurting my no-knowledge-on-AI brain. Apparently there's going to be a part 2 to the leak; I can't keep up with /g/, but I'm happy to download/seed it.


anon created a guide, probably 100% the real deal now


File:00197-3287249658-((([Remil….png (391.67 KB,768x768)

To go with the story of me playing games with Remilia in that Character AI thread.
This is Remi's gamer pose


File:test.png (320.86 KB,384x640)

I want to kissu her!


File:00002-4009721508-huge_brea….png (Spoiler Image,235.25 KB,512x512)

I also made a loli with pink hair... I gotta get the GPU stuff set up though, this took me 10 minutes and is obviously far from the bleeding edge of this stuff.


File:00027-1587389607-large_bre….png (208.6 KB,512x512)

Got another nice one.


File:20221008_181546.jpg (61.52 KB,512x768)



Interesting how this works.


my wife chino is ballin'


I swear to you guys I was arguing on another corner of the internet that I'm not interested in AI because it couldn't create art in the style of a particular artist, and the artist I was referring to was literally Zankuro in specific, yet here we are. I was crushingly naive. I wonder how far off we are from it making lewd gifs in Zankuro's chubby loli style...


File:00967-1113753283-1girl, ((….png (490.36 KB,576x576)

Utawarerumono riding a banana


File:00980-2759234159-1girl, ((….png (481.14 KB,576x576)

Unsurprisingly it fails to capture Kuon's beauty, although I don't know how to do the tagging for Kuon_(Utawarerumono) with this, so I took a guess from what I think I remember seeing.
This one came pretty close to getting her face I think. But, I need to do a thing where I train it.
This is something I/we need to read up on that apparently is a big deal: https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/2284


File:grid-0134.png (2.23 MB,1344x1152)

Wow, embeddings are strong. There's an embedding someone made on 4/h/ for abmono who is quite a well-regarded Miku artist.
This is a combination of abmono and Wakasagihime. Even though I didn't name Miku, abmono images will impart Miku's outfit. Pretty crazy.


Do you mean abmayo?


File:01207-107125510-(masterpie….png (351.08 KB,448x576)

errr yeah, stupid brain


it works because mayonnaise is a もの, so it's all cool


File:1665622868689087.webm (Spoiler Image,2.87 MB,512x640)

People are making animations with it somehow using a script. (webm contains nudity)

Copied /h/ post:

you can already make a video
this was one an anon made
afaik the keyframes were something like
Time (s) | Desnoise | Zoom (/s) | X Shift (pix/s) | Y shift (pix/s) | Positive Prompts | Negative Prompts | Seed
0 | 0.6 | 0.9 | 0 | 0 | sleeping in bed, under sheet | | -1
2 | 0.6 | 0.9 | 0 | 0 | shower, washing self, naked, from above | |-1
4 | 0.6 | 0.9 | 0 | 0 | eating breakfast, dressing gown| |-1
6 | 0.6 | 0.9 | 0 | 0 | Sitting on a bus, uniform, from below | |-1
8 | 0.6 | 0.9 | 0 | 0 | on stage, bikini, tattoos, singing, full theatre, bright lights, microphone | |-1
10 | 0.6 | 0.9 | 0 | 0 | drinking at a bar, cocktails, black dress, cleavage, earrings, drunk, flirty | |-1
12 | 0.6 | 0.9 | 0 | 0 | bed, (doggystyle sex:1.3), pubic hair, 1girl, 1boy | |-1
14 | 0.6 | 0.9 | 0 | 0 | passed out in bed, under sheet | |-1

I couldn't open the webm in waterfox or firefox, but it worked with brave and mpc


File:01461-2527489783-masterpie….png (304.43 KB,512x512)

This doesn't look like an abmayo Kuon at all, but I like it. Very yukkuri.


File:grid-0163.jpg (526.52 KB,2048x2048)

Remember Eruruu? This is what she looks like now (apparently)


The miku virus infects all...


File:grid-0249.png (2.92 MB,1536x1536)

Ehh.... in theory we could make a Koruri embedding thing, but I think that'd be disrespectful so even if I had the VRAM I wouldn't do it.


File:grid-0258.png (3.18 MB,1536x1536)

Sometime in the bleak future, a cybernetic Tenshi eats a corndog
(some of the results of these tags are pretty disturbing, while some like the lower left here are pretty damn cool even if they aren't really showing what I wanted)
I can really see how you could use this for ideas as an artist or modeler or anyone else in a creative field


File:02436-1162997752-masterpie….png (438.45 KB,512x512)

Good way to find embeddings on 4chan: https://find.4chan.org/?q=.pt (Warning: NSFW thumbnails are likely)
(They're pt files)
Just grabbed a ZUN one that I'll try later


gonna go learn how this all actually works


File:102281918_p0.png (782.64 KB,1000x1080)

There's a fad, started by an AI generation of a glowing penis, that has artists imitating it. Kind of meta.

From one of my favorite random creative artists (and maker of that one furry Patchy)


File:102251504_p0.jpg (317.4 KB,617x617)

Heh, some of these are pretty creative. Also where's the original image that inspired it?



that's just a bioluminescent mushroom dude


File:Fget6dGaAAA7FX_.jpg (356.97 KB,2100x2160)

The sperm of the sea


File:no this is NOT Laura, I sw….png (Spoiler Image,717.24 KB,576x768)

I wonder if people are just really bad at thinking up concepts for computer-generated porn or if their tastes are really so plain and boring. From what I've seen on imageboards, and I don't mean to sound too conceited, I'm clearly in the upper echelons at throwing together these amalgamations of theft and I don't even have a great GPU to create the custom models. It seems so easy to me so I don't understand why everything is so ugly and generic when I see what other people are doing.
It makes me think this "AI revolution" is crippled from the beginning because it still requires human input and people have no motivation or direction. It's like when you show a bunch of complex castles and cities built in Terraria or Minecraft, but then you see how most people play and it's simple square prison blocks made from dirt.
Well, I think the crisis is averted because people are still dull-witted and boring even with such a thing at their fingertips.
Quite an addicting thing, however, and I need to pry myself away to work on actual creative pursuits (and I'm saving images and concepts to use for inspiration so that part is actually true)


The anatomy is quite weird and off-putting, but I guess most AI images are like that.
I never tried it; maybe I will.


File:dancer.png (3.44 MB,1024x1536)

Part of the problem is that niche topics are inherently hard to generate, since there isn't enough training data for them to turn out good. I am also somewhat put off attempting anything overly complicated (especially stuff with multiple characters), because the more complex the image, the more opportunities there are for bad anatomy/etc. to show up, and without any easy way to selectively fix those elements, I'd usually prefer to generate something basic done well than something more interesting that has mangled hands or whatever. What I do really wish though is that people experimented more with the style-altering options - many of those tags are well-enough populated to work great, and there is no added difficulty in using them (in fact, ones like greyscale even make it simpler), but they can go a long way in avoiding the generic AI-art look.


Yeah it's not perfect, but the thing with porn, especially niche stuff that doesn't otherwise exist, is that the brain overlooks it due to the excitement and stimulation over the rest. It's like your choice is a handful of doodles from some guy from 2008 or this thing creating new amalgamations of fetish fuel with errors. Most people have no reason to use this for porn, really, since it's easily inferior to something created by hand. But if that stuff made by hand doesn't exist? Yeah...


File:ZZX 0229.jpeg (Spoiler Image,390.19 KB,2892x4400)

It's not that niche, though the musculature of this image looks familiar (though done badly by the AI). I wonder if genres with a smaller selection for the AI to draw from, like futa, will end up creating images that lean more heavily on one individual's unique style than others would. It's not just the abs; this pen*s looks very similar as well. In fact it looks like the AI has taken it and recoloured it, and that's part of why the image looks weird: the pen*s was taken from an image where the body is positioned differently.


File:00318-262339623-(masterpie….png (Spoiler Image,627.37 KB,576x768)

Oh, I'm quite aware dickgirls haven't been niche for like 15 years. The fact that it's ubiquitous is also why any simple image doesn't work; it's no longer manna from heaven by virtue of existing. Find me some quality newhalf mermaid art with a human penis instead of some weird "realistic" furry dolphin version. Also, give her a nice soft belly, a mature face, a warm smile and an apron. Also, it's Takane from Idolm@ster, a girl who shares the face of the first 2D girl I had a crush on (since Luna is too old/obscure to have training data). Here's one I just generated, although it has some pretty noticeable errors.

People have fantasies more elaborate than "a girl with breasts of any size, preferably alive" and it's not any different in my situation just because a penis is involved.


File:xy_grid-0150-3577515976-(m….png (2.58 MB,2304x950)

I'm going to start dumping info and stuff in this thread, although I think most visual experiments will be posted on /megu/ since I'm mostly into this stuff for niche ero.
Someone asked how you could make transformation stuff, and this is how. Although, I had to ask on 4chan because I had the syntax wrong.
The syntax is [A:B:#].
A and B are the two things you want to morph between over image generation.
# is how much influence one has over the other, expressed as a fraction (0.1 is 10%). In my image example the left-most image is 10% angel and 90% demon girl.
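The tokens for a sweep over # can be generated mechanically; here's a tiny hypothetical helper (morph_sweep is my name for it, not a webui function):

```python
# Hypothetical helper (not part of the webui): build the [A:B:#] tokens
# for a sweep of the mix value from 10% to 90% in 10% steps.
def morph_sweep(a, b, steps=9):
    return [f"[{a}:{b}:{round(0.1 * i, 1)}]" for i in range(1, steps + 1)]

tokens = morph_sweep("angel wings", "demon girl")
print(tokens[0])   # [angel wings:demon girl:0.1]
print(tokens[-1])  # [angel wings:demon girl:0.9]
```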


File:firefox_IkHhhePva5.png (41.5 KB,762x647)

To make an image set like this you want to go down into Script and use X/Y Plot, then select Prompt S/R.
In this example I have it start with 10% angel and 90% demon and then end with 90% angel and 10% demon.
The X/Y script is a massive help in finding the ideal settings, so people use it a LOT.


File:00297-3520136844-masterpie….png (368.83 KB,640x384)

>pt files
What exactly are these? I think I heard that these are "hypernetworks" or something and that you can use them to fine tune a model, or to bias it into giving different results or something. I can't really seem to find any though? Not that I've looked very hard, I'll admit, but it seems people are far more interested in specific models than hypernetworks. Likewise, what's the deal with merged models and pruning?


File:embeds.zip (1.18 MB)

.pt files show up in a few places, but when people are talking about them and it's not troubleshooting, it's about hypernetworks. Back when I made that post, embeddings were the cool thing (they also use .pt), but now it's hypernetworks. They're basically fine-tunes for a certain concept, almost exclusively specific artists or characters, e.g. this was using the embedding that mimics abmayo >>98094
Embeddings are called by name in the prompt, whereas hypernetworks are loaded in the Settings. Embeddings are 20-80KB whereas hypernetworks are 85+MB. I personally liked embeddings a lot more, not only because of the file size but because you could combine them. I guess hypernetworks are better and that's why everyone uses them?
Here's my embeds folder. Some of them were just uploaded without labels and I never figured out what they did, like the 3 named "ex_penis".
Extract the folder into the main WebUI folder and then you should be able to use them.
The badprompt ones are actually something newer. You put them in the negative prompts with 80% strength, i.e. I use
(bad_prompt2:0.8), lowres, bad anatomy, etc


Does this only work for two tags? Or can you batch together multiple into the percentage.


Probably, but I haven't checked. I'd guess it'd just be [A:B:C:#] for 3, and so on


Interesting sort of addendum I found for doing this sort of thing:
>you can [x:y:z] / [:y:z] / [x::z] to make the ai draw x then y at step z (or percentage of steps if you put a decimal), which works great for stuff like [tentacle:mechanical hose:0.2] to make the ai draw tubes everywhere, or you can do x|y... to make the ai alternate between drawing x and y every other step; you can put any number of things here e.g. x|y|z|a, but obviously the more you use this the more steps you need, in general


That's exactly the post I saw that made me want to try it. I heard people mentioning this functionality weeks ago but completely forgot. It seems rare that anyone uses it, but it could be really great


When I try making one of these I get a
>RuntimeError: Prompt S/R did not find angel wings:demon girl:0.1 in prompt or negative prompt.
Does this mean I need to put the tags into the prompt somewhere? Or attach an X to them?


The first thing listed there has to be in the prompt for the rest to replace it. You should be able to hover over it for a tooltip.
masterpiece, picnic, turtle, eating banana

in the script you'd put
banana, burger, corndog
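Under the hood the script is essentially doing string substitution over the prompt; a rough sketch of the assumed behavior (not the webui's actual code):

```python
# Rough sketch of what Prompt S/R (search/replace) does -- not the webui's
# actual code. The first term must already be in the prompt; each listed
# term is then substituted for it to produce one prompt per axis value.
def prompt_sr(prompt, terms):
    first = terms[0]
    if first not in prompt:
        # mirrors the webui's "Prompt S/R did not find ... in prompt" error
        raise RuntimeError(f"Prompt S/R did not find {first} in prompt")
    return [prompt.replace(first, term) for term in terms]

for p in prompt_sr("masterpiece, picnic, turtle, eating banana",
                   ["banana", "burger", "corndog"]):
    print(p)
```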


Gotta say, reading the documentation for all this stuff regarding stable diffusion has really impressed me with how much work and development has gone into making the open version as great as possible, beating out even its premium competitors.

I guess this is the true power of computer dorks trying to get the perfect porn.


File:1670036798856.jpg (233.99 KB,800x1257)

So I saw this one website making the rounds that's 2D-ifying or whatever images of real people or characters, and I have to wonder how you'd do the same with an image of your own in Stable Diffusion. Like, say you wanted to draw a certain character, from an image, in the style of Asanagi, maybe wearing some different clothing. How would you do that?


File:box on beach.png (104.7 KB,512x512)

You use img2img, which can itself be guided with a text prompt like txt2img so it's really more like img+txt 2 img.
As an example here is an image I drew


File:2022-12-03-18-31-45-393406….jpg (1.02 MB,2432x3143)

... and here are some variations created with the Stable Diffusion 1.5 model using the following prompt that matches the image contents:
"open cardboard box on beach, sunny day, waves crashing on shore, frothy sea, deep blue sea, photograph, daytime"


File:grid-0228-2284799521-maste….png (4.68 MB,2048x2048)

It's likely a very generic prompt that has a denoise of like .5 or something to keep the general shapes but still alter it enough to be noticeable. I saw someone point out that they look like Genshin characters, so it's probably using something trained on its images.
I have a Genshin hypernetwork for that so let's see the result when I throw some stuff in: (pic related)

I don't want to spend a bunch of time trying to replicate it, but you get the picture. It probably uses a few traditional artists tags since people have done lots of examples of those, including myself


File:2022-12-03-18-38-23-227899….jpg (1.52 MB,2048x2048)

... and if I do the same prompt and same settings as in >>100505 but without the input image, this is what I get.
The cartoony nature of my image is at odds with the Stable Diffusion model's realistic photograph style. Getting anything done with this sort of thing is probably best when it's iterative, mixing both txt2img and img2img.


File:firefox_Kc9SXtwoPZ.png (56.77 KB,506x754)

I have learned some things to make things a bit easier or cooler, although you might already know them. On the right-most part of the Settings tab:
This "show image creation process every N sampling steps" at the top of the image here is apparently what lets you see, the uhh... image creation process. I had no idea this was here since I was expecting it to be more prominent.
At the bottom of my image you'll see a text box. Replace it with this text:
sd_model_checkpoint, sd_hypernetwork, sd_hypernetwork_strength, sd_vae
And it will show those at the top of the main window so you don't need to go into the Settings tab every time you want to mess with the hypernetwork or vae. Pretty cool!


File:explorer_Kisf6EEziK.png (789.06 KB,1098x828)

Surely you knew this was coming.
I am going to begin the process of bringing beloved Kuon to this so that she can be generated as easily as a 2hu! I'm debating whether to keep it centered on her and use a variety of artists, or to go all out and restrict it to official art and use hundreds of images to try and get the Aquaplus (Amaduya) style. I'm leaning towards the latter, although again I am filled with guilt before even attempting it.
Well, at least he's already successful and famous in some circles and doing games and doujin stuff and is well appreciated so it's not like I'd be robbing him of work? Bleh. The ethical quandaries of this stuff...


Oh, after testing with this it does seem to greatly increase the time it takes to generate stuff, so maybe only use the 'image preview while generating' thing if you're unsure where to stop when working on settings, and then set it back to zero when you're actually producing a bunch of stuff.


File:02543-986531908-(masterpie….png (347.39 KB,512x512)

A lot of knowledge about this stuff requires scouring and searching or surreptitious posts, so I'll try to share some more info.
This time I'm going to talk about two Extensions that I use a lot.

The easiest way to get new Extensions is to go to the Extensions tab of the WebUI and then go to Available and hit the "Load from" button with its default URL. From there you can install stuff, which will then show up on the Installed tab. For a lot of this stuff you need to restart the UI from settings if not restart the .bat file itself.
The ones I use and can give detail on:

Dynamic Prompts: https://github.com/adieyal/sd-dynamic-prompts

This is used to randomize creations on each image generation. You can use it with new words in the prompt, but I've never done that. Instead, I mainly use this to call random effects from wildcard text files. You create a text file with a new line for each possibility, put it in /extensions/dynamic-prompt/wildcards/text.txt, and then call it from the prompt by its name with two underscores on each side. For instance you can make haircolor.txt and put this in it:

green hair,
blue hair,
red hair,

and then put __haircolor__ in your prompt and it will randomly pick one of those each time an image is generated. This means you can make a batch of 10 images and come back to different results. This is really, really good if you're just messing around to see what works. It can also call other txt files from inside. I'll share my wildcard text files soon. It also has a "Magic Prompt" system that I've never used, but it could be cool? Beats me. Someone else do it.
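The substitution mechanic is simple enough to sketch in a few lines of Python (a toy stand-in, not the extension's actual code; the in-memory dict here stands in for the wildcard .txt files):

```python
import random
import re

# stands in for the files in extensions/.../wildcards/
wildcards = {"haircolor": ["green hair", "blue hair", "red hair"]}

def expand(prompt, rng=random):
    # swap every __name__ token for a random line of that wildcard file
    return re.sub(r"__(\w+)__", lambda m: rng.choice(wildcards[m.group(1)]), prompt)

print(expand("1girl, __haircolor__, smile"))
```

Run it a few times and the `__haircolor__` slot comes back different each generation, which is exactly the batch-of-10 behavior described above.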

TagComplete https://github.com/DominikDoom/a1111-sd-webui-tagcomplete

It autofills booru tags based on danbooru, which is what NAI and the 'Anything' model are trained on. Really, really nice, but can also be annoying at times with the pop-up. Unless you have tags memorized this can help a lot. Speaking of, you should make yourself accustomed to danbooru's tags:


File:wildcards.zip (27.04 KB)

Here are my wildcard text files. Some of them I downloaded and modified, other stuff was as-is. You can get a pretty good idea of the stuff you can do with this.


File:firefox_ouJxTVYHXq.png (577.01 KB,1184x789)

This deepdanbooru thing that scans images for tags is really impressive. It's not perfect, but good lord, we could only dream of such things a few years ago, right?


this sort of thing has been possible for a few years, but without danbooru datasets used for art training it wouldn't be easy


It's already a few years old, isn't it? Here's the original Reddit thread about it, and it's been discussed on 4chan in the past as well. I recall there originally being some talk about the possibility of it actually being used for tagging, but it's not good enough to replace manual tagging anytime soon and is otherwise little more than a novelty.


Oh. Wow, it's 4 years old?
Well, anyway, it's really cool how it's used here for immediate benefit. You can use it to assist in image tagging for training, but also as a building block to generate new images.


File:firefox_r6kosW4vBR.png (44.18 KB,958x555)

So, the training setup I put together from what I read. Much of the information is from the discussion here: https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/2670
Also, thanks to people on the /h/ board of 4chan as those guys are great. Don't use /g/, but maybe that should go without saying.

Modules: (I checked all of them because it's unclear what they do. Everything was checked by default except 1024 which seems to be a new addition)

Layer Structure: 1, 1.5, 1.5, 1.5, 1. This is called a 'deep network' as opposed to default or wide. Default is good for most things, particularly if you have a low amount of images (20ish was mentioned). Wide is for specific things like an animal, character or object. Deep is for style, which most people seem to be using hypernetworks for, with embeds for characters. It doesn't have to be, but that seems to be the pattern forming.

Activation Function: Softsign. Lots of math talk and graphs I don't understand, so I just went with the recommendation.

Weight initialization: XavierNormal. Same as above.

Layer normalization: No. I haven't seen anything informative about it, but no one seems to use it

Use Dropout: Yes. I heard it's good if you have a "larger hypernetwork". I think that means the numbers in the Modules up there and also the amount of training images used. I had 90ish images and did the mirror image thing to turn it into 180ish, but that's definitely not as good as 180 unique images. I don't know if it was good or bad that I used Dropout, but it didn't ruin anything
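For what it's worth, here's a little Python sketch of how I understand the Layer Structure numbers: each multiplier scales the module width (say 768), and consecutive sizes pair up into linear layers. This is my assumption of how the webui interprets it, not its actual code:

```python
def layer_dims(width, structure):
    # each multiplier scales the base module width
    sizes = [int(width * m) for m in structure]
    # consecutive pairs become the (in_features, out_features) of each layer
    return list(zip(sizes[:-1], sizes[1:]))

# the "deep network" structure from above, applied to a 768-wide module
print(layer_dims(768, [1, 1.5, 1.5, 1.5, 1]))
```

For 768 this gives four layers: 768→1152, 1152→1152, 1152→1152, 1152→768, i.e. the extra 1.5 in the middle is literally one more hidden layer, which is why it's called "deep" rather than "wide".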


File:firefox_sIU6yWlBwt.png (83.95 KB,949x1062)

And once you get to the Training tab you can load the hyper you just created (or one you've downloaded maybe? that part seems questionable)

This tab is for training embeds or hypernetworks, but I've only done hypernetworks so I can only talk about that.

Batch size: I haven't been able to find conclusive information on this, since 'batch size' comes up in so many contexts that you can't usefully search for it by name. It uses more VRAM, but might not necessarily be better at training. The ONE comment I've found on it says that you could increase it instead of lowering the learning rate later on. I'm already at my VRAM limit when training while having a video and Photoshop open, so I don't touch this.

Learning Rate: I think people start with the default for these. Only the hypernetwork number matters for hypernetworks. I see people add a decimal point in front of the 5 as the training steps reach 5000 to 10000, so I copied that. It sounds like the lower number is better for finer detail once you've established things.

Gradient accumulation: A newer thing, supposed to assist in training rate somehow, but I don't know how. It mentions something like "learning in parallel". People say to use it and set it to like 5, so I have it at 5.
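A toy sketch of the gradient accumulation idea as I understand it (not the webui's code): summing the gradients of k small mini-batches and dividing by k gives the same update as one big batch, so it fakes a bigger batch size without the extra VRAM.

```python
# toy loss: mean squared error of the 1-parameter model y = w * x
def grad(w, batch):
    # analytic gradient of the mean squared error w.r.t. w
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

data = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 9.0)]
w, k = 0.5, 2
minibatches = [data[:2], data[2:]]

accum = sum(grad(w, mb) for mb in minibatches) / k  # accumulate, then step once
full = grad(w, data)                                # one big batch
print(accum, full)                                  # identical
```

The two numbers match exactly (for equal-size mini-batches), which is the whole trick: you only hold one small batch in memory at a time but step as if you'd processed all of them at once.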

Dataset Directory: The image with the folders. I could talk about images, but I'd mostly just be repeating this: https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/2670

Prompt template file: This is a list of basic prompts that are added to previews alongside the included tags attached to each image. People say it's fine as default, but might be something to mess with if you want to check for specific stuff?

Width/Height: Keep it at 512/512

Max Steps: How far the training will go. This is stuff that takes days, though, so I'm not sure how useful this is, because of something I'll talk about in a sec. I suppose it's good if you only want to run it for a set amount of time.

Save an image every n steps: It saves an image as if you prompted it with random tags included in your training folder, but it can make freaky combinations that you wouldn't normally use, so keep that in mind.

Save a copy of embedding every n steps: This is an important one and why I didn't care about the Max Steps thing above. It saves the hypernetwork with its step count in the name to the folder automatically. By default it's at 500, which is where I have it.
This means the folder fills up with numbered snapshots as it trains for longer periods of time.
There's an option in settings under Training to save the optimizer state, which allows you to resume training from these saved files. VERY important!

Note: To use the hypernetwork (or resume it from file) you need to move it from the saved directory (default textual_inversion) and move it to the models/hypernetworks folder

Save images with embedding in PNG chunks: I think it lets you use PNG info like normal generated images. I kept it on.

Read parameters from txt2img: For preview images it takes what you typed in the txt2img tab. I never used this since I wanted a variety of images, but it could be useful? I read to never use tags like 'masterpiece' or 'quality' there, though.

Shuffle tags: Yes. It adds more variety to the images by changing tag priority or something.

Drop out tags with prompts: I think it drops a certain percentage of tags per generated preview image. I kept it off, but I'm not sure. It's just preview images and not the actual training itself, so I guess it could improve or hinder perceived accuracy there.
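My rough mental model of what these two options do to a caption, as a Python sketch (assumed behavior, not the actual implementation):

```python
import random

def augment_tags(caption, drop_p=0.1, rng=random):
    tags = [t.strip() for t in caption.split(",")]
    rng.shuffle(tags)                                  # "Shuffle tags": vary priority
    kept = [t for t in tags if rng.random() > drop_p]  # dropout: randomly lose a few
    return ", ".join(kept if kept else tags)           # never return an empty caption

print(augment_tags("1girl, animal ears, smile, looking at viewer"))
```

Each call returns the same tags in a different order, occasionally with one missing, which is presumably where the extra variety comes from.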

Latent Sampling Method: I only hear people mention deterministic, so that's what I went with.


File:explorer_dWd3RdfklC.jpg (424.05 KB,1395x1111)

I've been cropping and positioning images the past few days for the Amaduyu/Aquaplus thing I plan to train. I started rotating some of them, too, and might go back and do that with some of the images I've already done. I've been kind of OCD about focusing on this because it will take a long time to train since I want it to be extremely thorough and I'm sure I will make mistakes. I'm starting to worry a bit about how much of it isn't Kuon. I think I might need to make one specifically for Kuon or use an embedding to pair with it somehow? Not sure how they work together.


File:firefox_glhn53KTf1.png (347.91 KB,1474x754)

I really didn't expect it to get this one this well. Man, this stuff is good.
It definitely shows that you have to clean the tags manually, though, as her tail isn't there. Got a bunch of images that weren't on a booru so they don't have tags.
But, I'm not sure if I'll use it because it looks too complex and would probably mess things up


File:explorer_fhrs7k1XEO.png (1.69 MB,1673x1107)

So much effort...
People have made other progress with cool extensions and stuff, but I can't remark on it yet since I haven't messed with them


File:BooruDatasetTagManager_MgO….png (924.43 KB,1898x618)

cleaning up tags
the auto tagger does incredible work, but it's not perfect. For example I had to add the furrowed brow, :o, and portrait tags
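If you have a lot of caption .txt files to fix up, a few lines of Python can batch-apply a tag the auto tagger missed (a hypothetical helper; adjust the folder layout to your own setup):

```python
from pathlib import Path

def add_tag(folder, tag):
    # append `tag` to every .txt caption file in `folder` that lacks it
    for f in Path(folder).glob("*.txt"):
        tags = [t.strip() for t in f.read_text().split(",") if t.strip()]
        if tag not in tags:
            f.write_text(", ".join(tags + [tag]))

# e.g. add_tag("training/kuon", "portrait")
```

Running it twice is harmless since it only appends the tag when it's missing.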


File:kuon31.png (458.26 KB,512x512)

Apparently having a bunch of plain backgrounds is bad, so now I have to go back and add backgrounds. I guess it's good that I have an Utawarerumono artbook with some backgrounds on it, although I had to scale them up with waifu2x. This is so much work and I regret doing it, but I'm too far now.


File:explorer_tXnkh1IWRr.jpg (460.15 KB,1643x1116)

Okay, I'm finally going to bump this thread because I'm now finally training it! This new method is like 20x faster somehow. The prompts are randomized during this so some of the abominations you see are because weird tags are being combined like 'leg' and 'bed', but no '1girl'
YES! IT'S ALL PAYING OFF! MY HOURS AND HOURS OF LOSERDOM ARE REACHING FRUITION! All the wasteful pruning, the tag specification and elimination!

(hope it learns the proper position of utawarerumono ears though...)


they all seem kind of kuon like


File:aqua-1450.png (340.19 KB,512x512)

Well, it's the same artist so they should look somewhat similar, yeah.
But in that image I posted I can identify different characters. I'm not sure how exactly it works because sometimes it's very random, but other times it's obvious, like how this is an attempt at Kamyu (even though I never labeled her and it wouldn't understand it anyway)


File:aqua-3325.png (405.31 KB,512x512)

it's learning... IT'S LEARNING!


i think it can get more perfect


File:02064-4265108999-1girl, an….png (10.24 MB,2816x2688)

Hmm... after a full night I'm not sure if it can. At least overall, I think getting a perfect one is still going to be rare. I see what people are saying now and that you're probably not going to notice gains after 20000 steps or so. But, I think it still needs improvement somewhere.
I need to look at it and see if there's stuff I can improve upon, which basically means I'll train it again at a different rate. When I tested making a hypernetwork a few weeks ago one night was like 2000 "steps", but now I just did 58000 (meant to do 50000, but forgot I resumed from 8000). It saves snapshots of its progress, so if for instance it made the best result at around 20000 steps and then went haywire, you can grab the backup it created at 20000 and either just use that as the completed product or resume training from it.

Well, now that I've done the style hypernetwork I should try making 'embeds' which I'll use to teach it characters. It still doesn't know who any of these girls are so I can't actually call them directly and instead need to use their traits and hope it arranges it correctly. For instance it'll never know Kuon's proper clothing or ears unless I create an embed which I can invoke from the prompt. From what I've read, when training an embed you label everything EXCEPT what you want it to call.
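That "label everything EXCEPT the target" rule could be scripted as something like this (the trait list is just my guess at what the embed should absorb):

```python
# hypothetical: the character-defining traits the embed token should learn,
# so they get stripped from the caption rather than labeled
KUON_TRAITS = {"animal ears", "yellow eyes", "swept bangs", "low-tied long hair"}

def caption_for_embed(tags):
    # keep everything that is NOT a defining trait of the character
    return [t for t in tags if t not in KUON_TRAITS]

print(caption_for_embed(["1girl", "animal ears", "yellow eyes", "smile", "sitting"]))
```

The pose and scene tags survive, the trait tags vanish, and whatever the captions never explain is exactly what the embed token has to account for.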

Maid Kuon at the computer! It really can't do hands or keyboards, but that's not specific to this.


File:02077-4265108999-1girl, an….png (9.17 MB,2816x2688)

...and here is the exact same prompt and seed and everything else, but with my hypernetwork disabled. Well, maybe it IS messing up hands and arms a little bit. I'm not sure if that's something I can fix. Do I go back and pay lots of attention to hand tags? I guess I could try that...


File:02132-2730948104-1girl, je….png (945.23 KB,768x1024)

Hmm, so this is an instance of improper tagging showing up. Karalou's slave collar is technically a collar, and I must not have eliminated its generic 'collar' tag, so it's grabbing from Karalou for bride Kuon's 'detached collar' here. This stuff is so crazy


File:02124-169421348-1girl, jew….png (895.63 KB,768x1024)

and this would be a much more accurate representation of what it's supposed to look like


dont remove it, kuon looks sexy with a slave collar


File:BooruDatasetTagManager_3iu….png (1.05 MB,1287x715)

It wouldn't be removed exactly, just require an actual 'metal collar' prompt to show up. I guess I should remove 'breasts' from everything, too. I'm not sure why boorus have redundant tags like that.
... I think?
'detached collar' isn't even here, so maybe this is something I can't avoid at all. Or maybe this is the base data... I guess I should do some testing... either way it probably wouldn't hurt to specify things more
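If I do end up specifying things more, something like this could prune the generic booru tag whenever a more specific one is present in the same caption (the mapping here is just an example):

```python
# example mapping of specific tags to the generic tag they make redundant
SPECIFIC_TO_GENERIC = {
    "metal collar": "collar",
    "detached collar": "collar",
    "large breasts": "breasts",
}

def prune_redundant(tags):
    # drop a generic tag only when one of its specific versions is present
    drop = {g for s, g in SPECIFIC_TO_GENERIC.items() if s in tags}
    return [t for t in tags if t not in drop]

print(prune_redundant(["1girl", "metal collar", "collar", "smile"]))
```

That way the slave collar still shows up on an explicit 'metal collar' prompt, but plain 'collar' stops bleeding into unrelated outfits.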


File:00031-3720141358-1girl, so….png (648 KB,576x768)

There's an extension called DAAM that creates a heatmap of the effect of a prompt on the resulting image. It's really quite amazing that this exists.
This is the result for "low-tied long hair". It's supposed to add, well, long hair that's tied. However, it's broken and seems like it's applying to her clothing instead and is adding tied ropes to it. This is a tag to avoid and maybe I should purge it from my training thing.


File:00032-476515586-1girl, sol….png (608.4 KB,576x768)

The heatmap for 'smile' is exactly where you'd expect it to be


File:02167-99510245-1girl, anim….png (12.86 MB,2688x6286)

God damn. Okay, I can say that switching from a 1, 1.5, 1.5, 1 neural net thing (whatever that means) to a 1, 1.5, 1.5, 1.5, 1 one was a massive upgrade. I don't know why. Oh, and I think I MIGHT have clip skip set to 1 instead of 2, but that wasn't supposed to be a big deal. Hmmm.
The first one here, 'aquaplus' was my training 2 nights ago whereas the two others are different checkpoints from the one I trained last night. I just don't understand how it's such a massive improvement.


File:02168-2730948104-1girl, je….png (3.17 MB,1344x3142)

Okay, this is weird. I did the exact prompt here and the new ones look bad with this DPM 2M Karras sampler...


File:02169-2730948104-1girl, je….png (2.95 MB,1344x3142)

but on the default euler a sampler they look fine. The first one seems like it's probably the best, but still within the normal variation you'd expect.
more testing needed....


File:explorer_ICmVTo8wHE.jpg (600.31 KB,2158x1055)

Now training the Kuon embed. I will combine this with the artstyle hypernetwork and it should have great results. That's the hope at least.
What this means is that instead of typing "yellow eyes, swept bangs, etc" and hoping it assembles them correctly I will invoke the name of the embed and it will fill that information in in a way that would be impossible to attempt manually. I don't expect the hair ornaments or clothes to look perfect, but it should definitely get her hairstyle and ears right. There's about a 0.0003% chance that it will get her skirt pattern right.


File:00163-4181300038-solo, smi….png (720.31 KB,768x768)



File:tmpldpue00e.png (15.32 MB,3840x3072)

This was the batch
solo, smile, 1girl, thekuon, :d, sitting, looking at viewer, pov


File:00168-908452894-solo, 1gir….png (971.81 KB,768x1024)

It works alongside other words, like 'swimsuit'. Man, this stuff is truly amazing.


damn that looks really good too
at a glance it's hard to tell those are AI


Congrats, seriously. Looks damn good.


File:1367881241884.jpg (91.58 KB,700x700)

Looks like it turned out very well, congrats!


File:00236-3931658932-1girl, so….png (514.65 KB,832x576)

Thanks, guys! This is really a new world and I'm not sure how to feel about it. I feel guilt over artists, but at the same time I'm really enjoying messing around with this stuff. It's been taking up too much of my time, though, so I need to get back to doing other stuff...
I can envision a 3D>2D>3D kind of pipeline in my head right now. If only my tablet wasn't broken... hope I can RMA it soon.


File:FmWCnRfakAE9boP.jpg (237.43 KB,1024x1536)

Saw some really neat AI art that was good enough for me to save. Is it getting even better or something?


File:00501-3541163114-1girl, so….png (954.29 KB,840x840)

By now the default with the newer fine-tuned models is far more impressive than the NAI leak stuff, but it's still all built on that. You're actually in a far worse position if you're paying for it now.
This high detail one is based on a mixture of real life and 2D art so it can do things pretty well, but you have poor control of it. It looks extremely impressive, but it's not obeying my very strict training of Kuon's outfit, so imagine trying to get something that you haven't trained.

I've been wondering if I should start grifting since I know what I'm doing and everyone else that knows what they're doing is, well... kinda normal. But, being normal is what gets you the most exposure and success. I can't name gacha characters, for example. But, I could corner an AI fetish market, especially if I combine the training with my 3D models. This would be a good time to have the motivation to do things


Sounds like such a bad way to put it... but I get what you mean. I think if you consider yourself capable and have some desire to do so, you should try it out. I mean, training your own stuff is probably a long and arduous process, more than most are willing to invest into it.


I'm very proud of your progress.


It's kind of crazy that if you put the effort into training one of these things you could have unlimited fetish porn?


I haven't played with NovelAI outside of porn, but I may generate non-ero OG Fallout fiction with it to see its quality


>unlimited fetish porn
That is exactly what it is. If people think there's "porn addiction" now, wait until normal people get a hold of this stuff. I have a blog of the pornographic progress I've made on /megu/.
Still, I'd prefer human-made stuff if it existed, and stuff that's more mental like incest isn't really something you could satisfy with an image alone. You can't tell stories with this and stories are really, really, REALLY good. A good doujin is easily better than this stuff, but when you're dealing with specific tastes then yeah, it's the best option available.


NovelAI really likes futa


File:FmB4eORaYAAEejg1.jpg (725.81 KB,2829x2237)

Came across this and thought that it was neat enough to inquire what the heck is with the coloring ability of these AI, it's pretty great.



What's interesting is it doesn't necessarily make the same mistakes a human makes when drawing hands. It makes its own sort of mistakes you don't see in real art.


The most popular checkpoint models these days (for those doing it offline) are a mix of 2D art with conventional photography. It increases hand quality a bit, but it's still far from passable most of the time without a bunch of "inpainting", which is basically like selectively retrying a part of an image. Some of the models people use look more like real life photos with an advanced filter on top, which can be very creepy and also takes away from some of the appeal since it introduces 3D limitations in perspective and such
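The final compositing step of inpainting is conceptually just a masked blend: only the region you marked takes the regenerated pixels, and everything else keeps the original. A numpy sketch of the idea (not any model's actual code):

```python
import numpy as np

def composite(original, regenerated, mask):
    # mask is 1.0 inside the region being "retried", 0.0 elsewhere
    return mask * regenerated + (1.0 - mask) * original

orig = np.zeros((4, 4))                      # original image (toy grayscale)
new = np.ones((4, 4))                        # freshly generated replacement
mask = np.zeros((4, 4)); mask[1:3, 1:3] = 1.0  # retry only the center 2x2
out = composite(orig, new, mask)
```

The diffusion part of real inpainting is vastly more involved, but this is why the untouched areas come back pixel-identical.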


Colouring shouldn't be hard for an AI. It could just pick colour palettes from existing images in the same pose and apply them.

Even though the line work is done, the lips of the girl on the top right are weird. Also, I'm not sure if I ever noticed this before, but the hair on these AI girls is quite bizarre (the ones on the right): not only is the fringe not symmetrical, but the far side looks weird.

The backgrounds are odd too, the sunset kind of drops suddenly in one image and the floor boards in another are all different widths.


I mean top left not right...


File:00981-919244970-1girl, sol….png (969.76 KB,816x920)

As an example, as I continually attempt to refine my custom merged checkpoint for, uhh... /megu/ reasons you can see the effect of one of the models already having some RL stuff mixed into it. The shading is absurdly good, but I really have to fight it to create clothes that aren't modern and it feels very "real" which can be a good thing and a bad thing depending on one's tastes. (also as a side note need to figure out why it's ignoring tags)
And look at that hand. I didn't do any editing here. But, it definitely looks like a real human hand. I don't know how to feel about it. I guess maybe for now it's a sacrifice to make if you don't want to do edits, but I like style over reality.


File:00991-1957439408-1girl, so….png (786.16 KB,712x816)

Another example of hands in logical position


really amazing for ai hands


I wonder if it's possible for AI to make manga or if that's far too many variables to be solved in a realistic timeframe


Have you ever tried using doodles you make as a base for the AI to build off of? Wondering if that's more effective than just generating a bunch of images that may vary in posture/position each time.


File:01464-2023-02-03.png (1.04 MB,768x864)

I did a whole bunch of testing with various RL models to see if I could understand how exactly people are making them assist in 2D hands/poses while not giving them a massive hit in quality and I really could not find any pattern. Although, my tolerance for spending hours making small merge differences is getting pretty low and I need to spend some time doing other stuff before getting back into it.
However, I do think I have an idea of how to bandage it. The LoRA things are basically like "plugins" for a checkpoint model, and for example the Amaduyu/Aquaplus one I made is pretty good at fixing the faces, but then of course they will always have at least a hint of Amaduyu/Aquaplus so I'd need to mix them with other LoRAs.
It's also useful to use a thing called kohya, which is normally used to create LORAs, to separate merged checkpoints into their base ingredients. This means you can more easily control the intensity of something without needing to create a bunch of 4-8GB merge files.
Seems like there aren't any new amazing models recently, just merges of existing stuff (although some of them are quite impressive)
So, I can't think of any notable breakthroughs in the past month, just refinement.
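The "plugin" nature of LoRAs comes from them being a low-rank update added on top of a frozen weight matrix. A numpy toy (made-up shapes and rank) of why the files are so tiny compared to a full checkpoint:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((768, 768))   # frozen checkpoint weight (never touched)
r = 8                                 # LoRA rank: tiny compared to 768
A = rng.standard_normal((r, 768)) * 0.01
B = np.zeros((768, r))                # B starts at zero, so no change at first

alpha = 1.0                           # the strength slider you set at load time
W_eff = W + alpha * (B @ A)           # "plugging in" the LoRA
```

Only A and B get trained and shipped (about 12k numbers here versus ~590k for W), and scaling alpha is how you dial the plugin's intensity up or down without touching the base model.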

Still, I continue to be annoyed by all the people using "waifus" in these things. I know, I know, it's a generational difference and they don't know any better. But it still annoys me.


File:saltsypre_502.ogg (26.19 KB)

With the popularity of AI voice cloning and eleven labs going paid, I decided to look into some of the offline runnable alternatives. The most popular one, or at least the easiest to set up, seems to be Tortoise-TTS. It works okay enough and has some pretrained models the author directs you to use. There's a guide and git repo someone set up that provides this service with a web interface: https://rentry.org/AI-Voice-Cloning
The biggest issue I and many others have with Tortoise is that the main author won't release a guide or overview of the process he went through to train his model, simply saying that if you're smart enough you can figure it out. Kinda leaves people at an impasse for actually using this program as an alternative to eleven labs.

I've had some minor success with one of the alternatives (unofficial) VALL-E, https://github.com/enhuiz/vall-e
It's taken me a bit of dependency chasing and cobbling together a separate PC to install Linux on (the DeepSpeed dependency has been a nightmare to get working on Windows) but I've actually been able to get a "decent" output with a 3060 12gb card and about a day of training on ~7.4k couple-second audio files ripped from Vermintide 2. I'm not an expert in ML, but the result I got training a model from scratch with this limited dataset and a "low powered" card makes me optimistic for VALL-E's potential. I didn't really have to know much about machine learning, just how to install various dependencies and 3rd party utilities.

VALL-E is based on phonemes so the text to be synthesized is meant to be sounded out, I think. I don't know if there is a whole lot of prompt engineering that can be done with this program, though my current model is probably too limited and untrained to really test that out.
Attached is the voice I wanted to clone.


Here is the output.
The prompt text was "Blackrat spotted! Keep your guard up!"


>refine my custom merged checkpoint
I haven't been closely following your posts, more just watching your results, when you talk about custom check points are you training your model (custom data set of /megu/ images) starting with some base model as a checkpoint? How are you doing that for stable diffusion and what sort of time sink is it/hardware are you using?


File:01962-sfw,_painting,_(J.M.….png (1.3 MB,896x896)

Mm, how to explain...
The "custom model" I've been talking about recently is a merge of existing checkpoint models, which is something like NovelAI or Stable Diffusion. My most recent one using Stable Diffusion, NovelAI, Yiffy (for genitals), Anything (that's its name), AbyssOrange2/Grapemix and a couple others that I'm trying to switch in. (GrapeMix doesn't seem to have any RL image data, so I add in some of the basil mix myself that AbyssOrange2 does, that guy was onto something for sure)
They're large (2-8GB) files that contain a whole lot of training data and you need a really powerful GPU to train them. I'm not sure you can even make them without something like 24GB of VRAM at minimum, and then you need a whole lot of time (weeks of constant processing) if you don't have like $50,000 worth of processing power sitting around.
However, someone like me can create merges of them with custom settings that hopefully take the desired parts of A with the desired parts of B. But, you definitely make sacrifices when you do it and the trick is to try and counteract them. It's a really annoying process, though, because there's no guide to see what each setting does so it's a bunch of trial and error. Every time I think I noticed a pattern, I change a different slider and it completely invalidates what I thought I knew. Also each merge takes like 30 seconds to create, 10 seconds to switch to, and then however long the generation takes on your current settings. Also when switching between them your VRAM can get corrupted somehow and you need to restart the program so you don't get false results. Each merge is also 2-8GB so you have to routinely delete them and take screenshots/notes of what you've learned, if anything, from the merge results.
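The brute force kind of merge is conceptually just a weighted sum over every tensor the two checkpoints share. A numpy sketch (real checkpoints hold torch tensors keyed by layer name, but the arithmetic is the same):

```python
import numpy as np

def merge(state_a, state_b, alpha):
    # naive "30% of this, 70% of that": interpolate every shared tensor
    return {k: (1 - alpha) * v + alpha * state_b[k]
            for k, v in state_a.items() if k in state_b}

a = {"w": np.zeros(3)}   # toy stand-ins for two checkpoint state dicts
b = {"w": np.ones(3)}
print(merge(a, b, 0.7)["w"])
```

The "Merge Block Weighted" approach is the same idea, except alpha becomes a different number per block of the network instead of one global slider, which is where all the trial and error comes from.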
The main training data I've done myself is for Kuon and the Amaduyu (Aquaplus) hypernetwork/LORA things, although I've done some other artists to mixed results. They rely on getting layered on top of a checkpoint model, so they're heavily influenced by it.

What kind of timesink is it? Weeks, but I do other stuff while it's merging and generating. I can't imagine most people will want to do it. But, I've also been doing this stuff since early October so I guess it might be a slower learning process for other people.
As for hardware, I got a "good" (less absurd) price on a 3080 12GB for Black Friday


File:firefox_vJasASdG3I.png (113.39 KB,1668x1155)

And this is what the "Merge Block Weighted" panel looks like to make merges that are better than just a brute force "30% of this, 70% of that". Pretty self-explanatory, right?


File:BooruDatasetTagManager_0Il….png (739.6 KB,1162x533)

I think I can describe the difference better. The "checkpoint" model is the database that has the actual definitions and data on the information of a tag. When I trained my Utawarerumono, Kuon, and other things I was training against the NovelAI model. The images have tags like "ainu clothes" or "from side" because those are specifically the booru tags that NovelAI trained. I'm not defining what those are, I'm providing information on what they look like when drawn by a specific artist, and the training process compares it to the information that NovelAI has. There's a huge gulf between defining the tag itself and merely referencing it.
People, including myself, have trained concepts (which is what a tag is), but it's just one at a time.

The horrendously named "waifu diffusion" has been undergoing training on its new version for over a month now, but it was just at Epoch 2 when I last checked a couple weeks ago so it might be at 3 now. One epoch seems to take about 12 or so days to complete? People said the first epoch sucked and 2 might have shown that the finished product could be good, potentially, but we'll have to wait and see. It will probably not be something to test out for real until Epoch 6 or so?
But, I haven't been paying attention to any news about this stuff


what is a checkpoint model?


nevermind, it's a database with the tags associated with images


File:02137-v1-5-pruned1girl,_so….png (703.97 KB,704x768)

Basically that, yeah. It's the skeleton that everything is built upon. The most famous one is Stable Diffusion (SD) and everything I'm aware of for offline AI image generation makes use of it. The 2D models still have the SD data in them so you can use words that boorus have no knowledge of and get results.
It's worth noting that most people using the offline method are using the older (1.4 and 1.5) versions of Stable Diffusion because the ones after that started aggressively purging nudity (but not gore) and potentially other things, from the training data. This had the effect of breaking the things trained on the older models, which includes stuff like NovelAI which nearly all 2D models make use of.
The last time I checked people were not so impressed with the newer SD models that they were willing to sacrifice a "pure" data scrape in favor of one curated to make it more attractive to investors


File:kagami.png (244.16 KB,800x781)

This seems kinda funny, since now that the cat's already out of the bag and people have access to the older models in their entirety, there's no real reason for people to use the newer models, and a loss of functionality just means they'll become irrelevant as people improve the current local models. It may seem like a good move for investors at first since all the other AI companies are doing it, but the one thing the others have that SD doesn't is that people can't get their hands on their code to use it unneutered.


File:pose.png (1.14 MB,2304x767)

The newest technology just came out a few days ago!
ControlNet lets you control the generated image by pose and composition through normal and depth maps, edge detectors, pose detection, segmentation, etc. This is much easier and finer control compared to regular img2img.
An extension for webui also allows you to adjust pose as you wish.

Guide: https://rentry.org/dummycontrolnet
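As a taste of what one of those hint maps is, here's a crude gradient-magnitude edge detector in numpy. ControlNet's actual preprocessors (canny, depth, pose) are far fancier, but they all boil down to an image-shaped conditioning map like this:

```python
import numpy as np

def edge_map(gray):
    # absolute horizontal and vertical gradients, summed and clipped:
    # bright wherever neighboring pixels differ, i.e. at edges
    gx = np.abs(np.diff(gray, axis=1, prepend=gray[:, :1]))
    gy = np.abs(np.diff(gray, axis=0, prepend=gray[:1, :]))
    return np.clip(gx + gy, 0.0, 1.0)

img = np.zeros((6, 6)); img[2:4, 2:4] = 1.0  # a bright square on black
edges = edge_map(img)
```

The generator is then steered to put its own content where the map says the outlines are, which is why it follows your pose or composition so closely.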


hm, so they gave up and decided that this is where humans need to come in and give the images context.


That's quite the leap.
Could you make a leaping Megu?


File:jump.png (1.15 MB,2304x768)

i tried


File:[Rom & Rem] Urusei Yatsura….png (727.29 KB,960x540)

Dang, that's cool. This is what happens when you take a break from checking for AI news in /h/, huh.
Seems neat, but it's also introducing more effort into generation, which isn't really my thing. I had tried to use depth maps about a month ago, but learned that they were limited to Stable Diffusion 2 and above, which kills any desire that the majority of people on imageboards would have for it. So any extension that makes use of depth maps without requiring the neutered corporate-friendly SD is great.
I'm not sure I'll use this, but it's cool to see in action nonetheless


Angry boobs


Wonky legs, but still impressive.


File:grid-0076-1girl,_solo,_(mi….png (5.12 MB,2560x1664)

Someone on IRC asked me about how to go about generating images of Miyako from Hidamari Sketch, done in the original Ume Aoki style. We already know that doing a style is impossible without training, since artist tags were purged for NovelAI and that's what most 2D models (including everything I use) are based on. So, the question becomes whether it recognizes Miyako. Unfortunately, as you can see, while it does seem to somewhat know her blonde hair color, everything else is a mess.
Conclusion: Miyako has to be trained as well.


File:grid-0080-1girl,_solo,_(mi….png (5.18 MB,2560x1664)

I searched through some 4chan pages and found that someone did create an Ume Aoki LORA. It seems to work pretty well at capturing the style and also seems to capture Miyako to a degree, but it's still not accurate.
It's in here if you want to download it yourself (use Ctrl+F) https://gitgud.io/gayshit/makesomefuckingporn#lora-list
So, I told the guy to start amassing Miyako images which will be combined with the Ume Aoki style LORA.
Things to note for good training images for a character:
1. Solo
2. No complications like text overlaid upon her
3. Text elsewhere in the image should ideally be edited out
4. Limited outfits. Ideally it'd be maybe 3 or less, depending on how many images you have. When I trained my Kuon stuff I did not bother since she is portrayed in only one outfit about 95% of the time. Each outfit will need to be tagged in the training process and called upon manually with a custom tag of your own choosing during image generation later on. She can still be portrayed in other outfits, but if you specifically want her in her own original clothing you need to train for it.
5. Different angles and "camera" distance. The more variety of angles you have, the more accurately it can portray them later on during image generation, although it does a pretty good job of filling in the blanks since it already knows how human characters should look from different angles.

Then the images themselves should be cropped to be somewhere squarish. Unlike the old days of late 2022 it does not need to be exactly 512x512 pixels, but you should avoid images that are too tall or wide (heh) at like a 1:3 ratio or something. I'll talk about the other stuff after I get the images
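The size and ratio rules above boil down to a quick check you could script. A minimal sketch, assuming you've already read each candidate image's dimensions (e.g. from PIL's Image.size); the thresholds are the ones mentioned in this thread, not anything official:

```python
def usable_for_training(width, height, min_side=512, max_ratio=3.0):
    """Reject candidate training images that are too small or too
    tall/wide, per the rules of thumb above."""
    if min(width, height) < min_side:
        return False  # small images train in blur and low quality
    ratio = max(width, height) / min(width, height)
    return ratio <= max_ratio  # avoid extreme 1:3-ish crops

print(usable_for_training(512, 512))    # square, fine
print(usable_for_training(400, 512))    # shortest side too small
print(usable_for_training(512, 2048))   # way past 1:3
```

Run over a folder of candidates, it would spit out which ones to crop or toss before tagging.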


File:Sunshine Sketch - c129x1 (….png (1019.87 KB,900x1291)

Miyakofag here, yoroshiku onegaishimasu, and my deepest thanks to yotgo for his help.

A very important factor to consider is how easily the characters go from chibi to normal and back, as seen in the pic. In their non-chibi style their head shape is somewhat hexagonal, with fairly sharp angles, while their chibi head shape is usually either somewhere between a full oval and a curved rectangle, or a mix of the two with a pointy side bit like Yuno has in the middle; the first has regular eyes and features while the latter two are (✖╹◡╹✖). Also visible in the pic is how the wides are presented in a variety of outfits, like Miyako getting a change of clothes in the middle panel and then immediately returning to the first one.

I've downloaded the manga, but it's monochrome and fairly crammed, so it doesn't look like it'll be of much use. Seems like I'll have to take a few thousand screenshots of the anime again, but that's fine by me. I'll also begin to comb through boorus for useful art, and there's this other meguca stuff I'll be downloading in case it turns out to be of help for setting up Ume's general style:


File:explorer_F0NcBv8fyY.jpg (382.57 KB,1424x1091)

Hmm, keep in mind for a character that you want the character to be the focus and not the artist's style. It's better to have a more varied collection from various artists than a limited number from the official one. You're not training the shape of her head or how her mouth is drawn, you're training the combination of her outfit and eye color and hairstyle and the visual traits that identify her.
I can generate images of Kuon in different styles because it's not constrained to a specific style itself.
When I generate an image of Kuon to look like her original Utawarerumono appearance, I activate my Kuon LORA (Kuon herself) and also my Amaduyu LORA (the Utawarerumono style). Combining them into one would be severely limiting.


File:grid-0084-1girl,_solo,_gra….png (5.95 MB,2560x1664)

Kuon with Kuon LORA and Amaduyu LORA and the downloaded Ume Aoki LORA. This is to show what a merged character and style LORA would look like together with another style.


File:grid-0083-1girl,_solo,_gra….png (4.49 MB,2560x1664)

but with the character and artist separated, I can apply Kuon and the Ume Aoki style together without the influence of Amaduyu.
Hmm... not sure if this style will work.


File:grid-0360-[Grape! Base]1gi….png (3.1 MB,2048x1408)

Bleh. I had trouble training, but got it to work... and then it came out like THIS. I really should have kept my old settings, but noooo, I had to see what the new stuff was like.
I noticed that some of the images you gave me were small, and I think I'll have to exclude those. They should be at least 512x512, and I think that's the main reason why it looks so blurry and low quality here despite being relatively accurate in some images.


nice, hexagon headed kuon


It already looks really, really good.
The small crops are my bad; I had taken "does not need to be exactly 512x512 pixels" to mean "smaller pics are okay". There should be a dozen to a dozen and a half pics to remove then, maybe a few more. There's also one where she has her top but not her shirt, which may explain the result on the top-right.


File:grid-0454-Anything-V3.01gi….png (3.94 MB,2304x1664)



File:grid-0455-Anything-V3.01gi….png (3.95 MB,2304x1664)



File:grid-0456-Anything-V3.01gi….png (3.37 MB,2304x1664)

Miyako-00006 (final)


File:grid-0458-Anything-V3.01gi….png (3.92 MB,2304x1664)

Miyako-00006 + Aoki Ume both at 90% strength.
Mmm, I feel like it's still not very good. The eyes are especially too shiny, but at least the clothes are good. Also artists seem to depict her with different eye colors. The training data is really not ideal, but you really didn't know what to look for. I had to throw out nearly everything that was below 512 pixels and of the images remaining some of it was still too blurry or grainy, but I wanted to see if we could get away with it.


File:grid-0468-Anything-V3.01gi….png (4.34 MB,2304x1664)

Hmm.. yeah it seems like there's some corruption or data loss or whatever you'd call it. The image is too "noisy" and looks over-exposed.
Here it is with my Aquaplus lora. It's funny how it put a dog there in the top left because I put "animal ears"


File:grid-0469-Anything-V3.01gi….png (4.3 MB,2304x1664)

I reduce the strength of the Miyako LORA and the image clears up, but then it becomes less accurate.
Bleh. Yeah, I need to train it again with better images.


File:cb885d1935.png (1.31 MB,3232x1569)

does this stuff actually work or are you just drawing the 6 images and pretending it's an AI


File:02950-Anything-V3.01girl,_….png (2.95 MB,1280x1792)

It works, but finding the right prompt can be exhausting. It looks like you're using some online model, and those have some pretty severe limitations. I don't really know how to best use the real-life models that take verbose text rather than booru tags. There are prompt repository sites like:
but also personal pages of research people have done like https://zele.st/NovelAI/

After a bunch of testing, I think I'm satisfied with this Miyako LORA. It seems to work best with the Anythingv3 model, although I haven't done hours of tinkering. But, this reminds me that I really need to create a good SFW 2D merge of my own, but I keep struggling to have it look good with multiple different prompts and LORAs.
I also know now how to 'host' it and allow people to connect to it, but my upload is capped at 1MB/s so the limitation is there...


File:1646704983644.png (499.65 KB,640x480)

So I'm trying to do this on my PC again after reinstalling, and now I forget how I initially solved the "No module 'xformers' found, continuing without it" thing before. Also, I think I may have cancelled the taming-transformers clone after 2 hours, but it hasn't tried to reinstall, so maybe it worked?


File:Hidamari Sketch x Honeycom….png (8.23 MB,1920x1080)

Something that was very interesting about collecting a bunch of screencaps of her is that it helped me appreciate the amount of variety in the girls' wardrobes.
Since training material for a specific character requires consistency in their looks, we decided to go with her standard school uniform. However, they regularly spend around half of an episode outside of Yamabuki wearing their casual outfits (of which each wide has maybe a couple dozen or more), in some cases they don't go to school at all, then there are Winter episodes where they're wearing a coat, and at one point she has a hair bun like Hiro's; I assume it's simply because she felt like it. Add to this Shaft's abstract cuts decreasing their screentime, the fact that due to her character she has what is perhaps the highest regular:chibi appearance ratio, and the need for her to stand alone without overlapping with other people, and I ended up only managing to take 62 usable captures out of the entirety of Honeycomb+Graduation. Far, far fewer than what I initially expected, like the max ~100 taken from 1171 fanarts of her. Thankfully, it was still more than usable.
Very late reply, but when I first saw this my heart skipped a beat. It's incredible, warm. She makes me very happy and I'm overjoyed to see it work so well. Very thankful for this.


File:grid-0838-Anything-V3.01gi….png (2.08 MB,1280x1664)



X |||____________________________________________||| X


bottom left is a JRPG protagonist


Maybe if we combine it with >>104530, we'll create the legendary「Shin Hiroi Yuusha」。


Can you link what guide you're following and what step you're at? It might be best to find where the stuff is installed and wipe it or something. I'm not sure...
You could try googling the error message in a 4chan archive maybe


File:C-1677995987510.png (3.38 KB,831x32)

I did wipe my VENV and either it's not in there or something went wrong maybe (although maybe it's fine?)

I'm using https://rentry.org/voldy and I'm just tweaking the asuka image right now. My current issue with it is "vae weights not loaded. make sure vae filename matches the checkpoint, replacing "ckpt" extension with "vae.pt"." and I'm a bit confused about what to do to fix this one, but maybe since I'm getting a known error the taming-transformers thing worked? I dunno. However, what I'm also wondering about right now: I forget if xformers is important or not, and if it is, how to install it.


Also, why is it that sometimes my generation lags just because ff is open even though I'm using chrome...


File:Screenshot 2023-03-05 0111….png (42.96 KB,726x299)

>My current issue with it is "vae weights not loaded. make sure vae filename matches the checkpoint, replacing "ckpt" extension with "vae.pt"." and I'm a bit confused of what to do to fix this one
It's talking about how, if you're using a model with a VAE, you should have a file named the same to go along with it. For example, "Anything-V3.0.ckpt" and "Anything-V3.0.vae.pt". I'm pretty sure it should work fine if the model you're using doesn't have one.
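The naming convention is easy to check for yourself. A toy sketch of the rule (just illustrating the filename match, not webui's actual loader code):

```python
from pathlib import Path

def expected_vae(ckpt_path):
    """Given a checkpoint filename, build the VAE filename webui would
    look for: same name with ".ckpt" swapped for ".vae.pt"."""
    p = Path(ckpt_path)
    return p.with_suffix("").name + ".vae.pt"

print(expected_vae("models/Stable-diffusion/Anything-V3.0.ckpt"))
# Anything-V3.0.vae.pt
```

If the file with that name isn't next to the checkpoint, you get the "vae weights not loaded" message and generation proceeds without it.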


File:cmd_1Gzj7OumUa.png (6.83 KB,682x99)

Nevermind, that's specific to windows 7.
Uhh... hmmm...
Well, there's a message when I launch about updating xformers (I will someday maybe) and I think it gives a hint?
Run that commandline thing... I think?


I've seen people recommend that you put VAEs in a subfolder. I.E:
>blah/models/stable diffusion/vae
The vae mostly determines color and you can select them manually or switch automatically if the name matches, as you said. I don't remember where those options are in Settings.

You can put this into the Quicksettings list under "User interface" in options and then the main screen will let you switch these around without needing to go into the Settings every time:
sd_model_checkpoint, sd_vae, CLIP_stop_at_last_layers


File:00064-1021740396.png (2.09 MB,1536x1024)

it's strange that almost all images itt don't use latent upscale, considering it lets you get much higher resolution and quality


File:03369-[AnyGrape Furrymix C….png (770.61 KB,816x1024)

Do you mean taking a generated image to the img2img tab, or the scaling "postprocessing" that does it automatically during generation? I don't do the first one, but I do the latter sometimes. The problem is that it's a total VRAM killer, so I go from generating 8 images at once to 2 or sometimes even 1.
Maybe I should try the "manual" scaling sometime, but I just haven't felt the desire to do so. I like seeing the final image and not doing anything to it afterwards, because then it begins to resemble work, and this stuff doesn't really satisfy the creative urges. I like setting it to generate a bunch of images and then doing something else, too.

I just spent the past 2 days downloading and organizing LORAs, so I'm going to be generating a lot more Kuons soon. Hehehehe.
One day soon I might redo my Kuon and Amaduyu LORAs, particularly the Amaduyu one that controls the art style because it tends to produce a lot of errors that aren't otherwise present. No idea what I did wrong with it.


File:[MoyaiSubs] Mewkledreamy -….jpg (341.7 KB,1920x1080)

This is something I saw a month ago that was way over my head. It still is, but it seems people have been using it very successfully so maybe I should give it a look sometime:
Basically it automates taking tons of screenshots and tagging them and such so it doesn't take dozens of hours like what I did a few months ago...
I definitely have a bunch of shows I'd love to be able to reproduce in prompts, so this is right up my alley. I think shows like Mewkledreamy would need a lot of manual screenshots, though, since there are so many great frames that are barely there and would be easily skipped over by some randomized thing.


Kuon's cankles


File:cmd_JOSvYtIDL8.png (4.56 KB,688x68)

Luckily for me there's a cold front going through because I'm going to be generating quite a few images while I'm sleeping. I was downloading a bunch of LORAs a few days ago as I mentioned, but now I've made a new merge and I'm going to create a folder of example images of said LORAs in action. 14 images per LORA, two seeds for each prompt, and 505 Style LORAs.
Although these image sets are going to be pornographic, I'm going to make a non-lewd example, too.


How did they go?


File:firefox_eAXhN1hF5e.png (479.43 KB,734x595)

I ran into an issue and was too tired, so I couldn't do it. Unfortunately, it seems like due to a recent change in the automatic1111 thing (or maybe because this grid I'm making is different from usual), it's writing all the batch image files at the very end. I don't really trust it to properly create 505 large images after many hours of work (where is it storing the data?), so I need to do it in batches, which is REALLY annoying.
But, I've learned that it's also going to take much longer than I thought, at about 5 minutes per Style. If only I didn't need to generate one at a time to make this nice grid pattern with 7 different prompt sets, 2 seeds, and then the LORA change itself.
I guess I'm not playing the Nosuri game until this is finished. Oh well.


Done with about 120 of them so far. However, the question I now ask: How the hell do I organize all these images so I can easily determine the proper style for a thing?
I guess I can give them names like [Name][High Quality][Western][Realistic][Colorful][Big Breasts] or something?
How on Earth am I going to do this...


make the names into tags i guess, then use regex in the file explorer
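If you go the tags-in-filenames route, filtering is a small regex job. A hypothetical sketch (the filenames and tag vocabulary are made up for illustration):

```python
import re

def tags_of(filename):
    """Pull the [Tag] groups out of a filename like
    'Name[High Quality][Western][Colorful].safetensors'."""
    return set(re.findall(r"\[([^\]]+)\]", filename))

loras = [
    "AAA[High Quality][Western][Realistic].safetensors",
    "BBB[Cute][Colorful].safetensors",
]
want = {"Colorful"}
# keep only the LORAs carrying every wanted tag
print([f for f in loras if want <= tags_of(f)])
```

Same idea works in any file manager that supports regex search, just without the set logic.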


File:firefox_OKQ8lp2CbU.png (991.8 KB,1862x974)

I did some basic organization of the Styles LORAs, giving them basic trait names like 'Cute', 'Shaded', or 'Fleshy' (for stuff with detailed skin) and other stuff.
But it looks like another addition I hadn't noticed is the "Additional Networks" tab for the LORA extension, which gives you space to add an image and description and stuff. A lot of these LORAs people have made require keywords for the character, which I guess in theory lets you give new outfits to characters more reliably. I might do that to my old stuff... maybe.
Pretty neat, but this will be tedious to set up.


File:firefox_Qkhb4PGUTe.png (24.22 KB,576x793)

Civit.ai, a place that people have been using to host some models instead of mega or other file upload places just did some sort of overhaul. I'm now presented with this consent form and it makes me think that they're ready to sell it off, since a userbase has been established to give the site value before the great neutering. I mean, come on, a setting to hide the middle finger and bare male chests? This is definitely heading in a terrible direction for a site that grew specifically because of porn.
This is after tags like 'loli' were purged over a month ago, of course, which had some issues with hololive.
I hope this leads to people abandoning it.


>A good checkpoint model (mine is a custom merge of like 5 of them that are themselves merges that other people made)

So do you constantly merge models and stuff or is it one model you use for most things? Also is it possible to upload this one, I'd really like to check it out myself.


Thought it'd be better to ask here instead of cluttering up the other thread


File:notepad _40RGJPuWHt.png (87.98 KB,1203x1169)

Yeah. I talked about it a bit here >>103583 and the post immediately after that is the UI for creating a more involved "Layered" merge between models. I still don't understand it much, it's just a bunch of trial and error and I can't say I've learned much after looking through papers and notes from other people who similarly seem to theorize things only to have it change later. Pic related is a glimpse into my nonsensical rationality in trying to find patterns in the first merging experiments I did in trying to create furry-quality penises with anime visuals. The video at >>/megu/538 is related. VERY NSFW!
I have "formulas" saved that I test in all future merges I make, but they rarely carry over their benefits when making future merges with different models or even if you keep the old model and add a new one to it. It seems like they were specific to the merge at the time. If I make a note of "Slider IN07 gives great faces when set to 1" it does not necessarily carry over to merges between different checkpoint models.
Since that post was made someone had an extension where you can do "live" merging with models that lets you test it before creating a new 3-7GB file each time, so that helps a lot.

I usually go a few weeks between testing new merges because it's really exhausting. I create thousands of images while adjusting sliders and waiting for it generate and it's an all-day or multi-day affair.
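For anyone curious what the simplest of these merges actually does, here's a toy sketch of a plain weighted-sum merge. Real merge scripts operate on torch state_dict tensors loaded from the checkpoint files; plain floats stand in here so the idea is visible:

```python
def weighted_sum_merge(a, b, alpha=0.3):
    """Weighted-sum merge of two checkpoints' weights, key by key:
    out = (1 - alpha) * A + alpha * B."""
    merged = {}
    for key, wa in a.items():
        wb = b.get(key)
        # keys missing from one model are copied through unchanged
        merged[key] = wa if wb is None else (1 - alpha) * wa + alpha * wb
    return merged

# stand-ins for two models' weight dicts
A = {"unet.in.0": 1.0, "unet.out.0": 0.0}
B = {"unet.in.0": 0.0, "unet.out.0": 1.0}
print(weighted_sum_merge(A, B, alpha=0.25))
# {'unet.in.0': 0.75, 'unet.out.0': 0.25}
```

The "layered"/block merges discussed above are the same thing but with a different alpha per slider (IN/MID/OUT block), which is why each slider only affects its slice of the network.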

>Also is it possible to upload this one, I'd really like to check it out myself.
Yeah, I could try to upload it somewhere, although my upload speed is terrible. First I need to give it a real name, though. Hmm... I guess I could do a bit of publicity and name it after kissu somehow.
Uploading the LORAs? I downloaded them all and it was exhausting, but they're 120GB...


File:explorer_EQAYcXaoRh.png (1.64 MB,1549x758)

Lala is probably not in a Miyako situation that would require screenshots. Miyako's fan art is very inconsistent due to the source material itself being inconsistent.
Lala... well, I think she could be separated into "regular" and "precure" forms for clothing and hair, but she still has the same body and head shape. My favorite art of her is very "noisy" so I don't think it can be used, but I could try.


File:00440-1girl,_sitting,_read….png (693.16 KB,672x864)

Alright, here is the link to my current model for use by kissu friends. (but I also made sure to include kissu advertisements in the files and password so even if linked elsewhere people will know hehehe)
I call it... *drumroll*
The [/s] <[Kissu Megamix]>
The compressed RAR is 3.5GB and my upload is 1MB/s, so I can't really upload a bunch of these, not that I would anyway since I can say this is the best version I have. While this model is focused on NSFW stuff, it can still handle cute. I don't know if it's the best checkpoint overall, but it's the best for my personal desires. My model lacks most of the haze that most of the RL mixes have, although it's not completely eliminated. The benefit from the RL models shows in the hands here; I didn't do anything to them, it's straight from the prompt. The password is in the text file, but I'll also post it here. Without quotations: "www.kissu.moe - my friends are here"


Oh, I forgot to answer the question about multiple models. Yeah, I have a few I keep around but I overwhelmingly only use the most recent one I've created. The model I just linked is the normal version (which I'm using in that thread) while the other model sacrifices face quality and booru tag recognition to better generate a certain body part. (in other words it's closer to the furry model)
The others I don't really use much, but are there for comparisons sometimes. I have to keep the stuff around that I make merges with, too, of course.
In total, I've probably made about 200 merges, with 99% of them being deleted shortly after creation. If you count the merges I've done after the "real-time merging", then it's probably more like 500. It's really an amazing extension.

I never did make my pixiv into an AI account. Alas, such is the price of having no motivation to interact with the wider world.


File:00441-1girl,_jewelry,_spre….png (1.41 MB,1008x1152)

Forgot to mention that I put (furry:1.3) in the prompt to demonstrate that while it has some benefits from the furry model, it's not overly contaminated by it. Patchy is still a human there. The Kissu Megamerge can do various bodies better than the majority of 2D models out there, such as 'gigantic breasts' and squishy plump bellies! (and the male anatomy attached to females of course)
My personal preferences:

I generally use the "1.5 ema-pruned" VAE, which I'm uploading right now to the same upload folder.
It makes it colorful (and sometimes looks "over baked". If that happens, use the default novel AI VAE). The other 2D VAEs are too colorful on this, but you could try them.

I have also included the "4x-UltraSharp" upscaler in the mega folder, which you should put into stable diffusion\models\ESRGAN folder (create the folder if it's not there). I did a bunch of testing and found that I like it the most, although the differences aren't major.

DPM++ 2S a Karras. I'm not entirely sure on these samplers, but when testing different artist LORAs this one seems to have the most compatibility. I don't know why. Something to research more, I guess, but at the same time I don't really want to.
I have it at 26 steps, as going higher than the mid-20s is supposed to be overkill. The rule for upscaling is to do half the number of steps of the base generation, so 26 normal steps and then 13 Hires steps.

My default negative prompt is:
(worst quality, low quality:1.4), realistic, nose, 3d, greyscale, monochrome, text, title, logo, signature
If you somehow end up generating furry properties, try putting "furry" or "anthro" in there.
I don't use any of the old positive quality prompts since they don't seem to do anything noteworthy. (I.E masterpiece, highest quality, etc)
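Collected in one place, the settings above look something like this. The dict keys are just illustrative labels, not an actual webui API:

```python
# Hedged summary of the recommended settings from this post.
base_steps = 26
params = {
    "vae": "1.5 ema-pruned",                 # fall back to the NAI VAE if over-baked
    "upscaler": "4x-UltraSharp",             # goes in stable diffusion\models\ESRGAN
    "sampler": "DPM++ 2S a Karras",
    "steps": base_steps,
    "hires_steps": base_steps // 2,          # rule of thumb: half the base steps
    "negative_prompt": "(worst quality, low quality:1.4), realistic, nose, "
                       "3d, greyscale, monochrome, text, title, logo, signature",
}
print(params["hires_steps"])
# 13
```

Add "furry" or "anthro" to the negative prompt if furry traits leak through, as noted above.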


Thanks for putting this together.


I think one of the most extreme hurdles I have yet to see AI overcome, and I can't even fathom how it would, is creating images that involve specific details of two or more characters. It just can't figure out how to assign differing aspects to separate characters.


MultiDiffusion and Latent Couple let you use different prompts for different regions and are available as plugins for webui

The MultiDiffusion extension also has Tiled VAE which lets you create much larger images without going out of VRAM


File:[MoyaiSubs] Reiwa no Di Gi….jpg (265.85 KB,1920x1080)

There's attempts at it, but it's more work than I'm comfortable doing with my VRAM limitations. (This stuff is a total resource hog)

Also apparently there's some major problem with the civitai site right now and anything downloaded is massively corrupt and can't be used. Whoops.
I guess I should go share this info with that /h/ thread since they've been helpful to me in the past.


File:multisubject test.png (1.85 MB,768x1728)

Here is an example of the methods in action.
The top is using naive prompt 2girls, cirno, megumin. As you can see the character details got intermixed.
The middle is the MultiDiffusion method. I set prompt of each half to one character. Now the character details are separated correctly. Needs a little tweaking to let the two halves fuse together better.
The bottom is the Latent Couple method. It also separates the character details well and looks a little more natural than MultiDiffusion.
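The idea behind both methods can be sketched as a toy mask blend. Real implementations blend latent tensors inside the sampler loop at every step; flat lists of floats stand in here just to show the region logic:

```python
def blend(latent_a, latent_b, mask):
    """mask[i] = 1.0 where prompt A's region is, 0.0 where prompt B's is;
    fractional values would soften the seam between the two halves."""
    return [m * a + (1.0 - m) * b for a, b, m in zip(latent_a, latent_b, mask)]

# left half of an 8-"pixel" row follows prompt A, right half prompt B
a = [1.0] * 8                     # stand-in for the cirno-prompt latent
b = [2.0] * 8                     # stand-in for the megumin-prompt latent
mask = [1.0] * 4 + [0.0] * 4
print(blend(a, b, mask))
# [1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0]
```

The "tweaking to let the two halves fuse" mentioned above amounts to feathering that mask so the regions overlap a little instead of cutting hard down the middle.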


Also there is this new extension that can do both

Other technologies of fine control of images can be found on SD wiki


File:00572-(Alexander_Jansson_1….png (1.1 MB,1136x656)

Heh, "Creativity". Well, as long as they're providing tools they can have their delusions.
I guess I'll try those that with Patchy adventure. Making scenery is really difficult with my current limitations and desire to not spend effort doing something to avoid spending effort


File:00582-2girls,_sitting,_rea….png (1.07 MB,1296x768)

Hmm yes. Furry Patchy Adventure just got a little bit better.
There seems to be a quality hit here, but it's still really impressive.


File:index.mp4 (981.82 KB,256x256)

There's been video stuff available, but I never thought to try it until I saw some random /v/ thread mentioning it.
It reminds me of where the image technology was a couple years ago with images. You have to download models, so it's like base stable diffusion where you can't really do anything other than basic stuff. No cute girls doing cute things here.
This is "Luigi beating Mario with a baseball bat until he explodes". I was angry that it wasn't recognizing things so I went with something violent with popular characters


Can't SD already do video in some way? I've seen anime girl videos made with mocap and controlnet
It may be a lot more effort compared to generating with nothing but a prompt, though.


There have been ways, but this one is a simple text prompt with no other work involved, as you said. At 256x256 I couldn't make more than about 70 frames at once before running out of VRAM, but I didn't look at the settings much. To me this stuff is only as interesting as its ability to fill in the blanks; the more work I need to put into it, the less interest I have, because at that point someone should learn to draw or animate, in my opinion.


It's gotten very good at realistic image generation.



Looks like Ness


Ironically, it looks like it perfectly replicated the feeling of rage when you get pissed off that stuff isn't working



File:grid-1464-Anything-V3.01gi….png (2.51 MB,1280x1664)

It's been some time and I haven't generated more Miyakos since then for reasons, but I did want to comment that there's a special scowl+smile combination that gives uncannily evil results.


File:melancholic 2.png (659.4 KB,640x832)

I also generated some melancholic-looking Miyakos that give off a very special feeling.
But, both of these are utter blasphemy, so I'm a tad conflicted on it.


File:R-1686713083717.jpeg (23.22 KB,270x270)

bad ms paint drawing of an anime girl


File:R-1686792673657.png (202.19 KB,749x744)

stick figures



Looking at this I have to wonder just how the hell the author pulls off something so consistent and without errors. Especially concerning the later parts without any changes to the body.


Inpainting or lots and lots of generations. It's not difficult as much as it's time-consuming and boring.
I'm still surprised at the general low quality of stuff that people like. That's pretty much the old AbyssOrange appearance in that image set there and isn't that noteworthy I would think.


File:firefox_9ziLowmwLF.png (89.59 KB,1483x692)

By chance I learned that a SuperMerger extension I have known about for months actually does checkpoint merges in special ways and can even do LORA-related stuff. From its description on the SD extension page it just says that it's capable of performing "live" merges without writing it to disk first, which is cool but not unique. Hidden behind that poor description is the capability to do more advanced merging methods.
This stuff probably doesn't mean anything to people here, but it's still pretty interesting to read about: https://github.com/hako-mikan/sd-webui-supermerger/blob/main/calcmode_en.md

Seems to just overall make better merges without sacrificing as much. Really, really nice.


the bottom left looks like a logo that a charity would use


Hmmm, by "LORA-related", do you mean you could potentially incorporate good LORAs into the base of a model so that it generates good without needing to prompt them? Like if you have penis LORA and merge it with the model, suddenly it's good at penises.


File:firefox_DSIBWnHlzE.png (148.33 KB,1600x951)

Yeah, that's exactly it. It's useless for me, personally, as it removes the ease of switching stuff in and out, but could be good for the SD IRC bots that I'd like to get on kissu itself eventually


File:00056-2932541145.mp4 (391.46 KB,576x768)

I guess it's been a while since I posted anything here since I ran out of interesting things to do (or try to explain), but I did mess with AnimateDiff a bit more recently. It allows you to do some simple animation stuff, but it tends to look quite a bit uncanny, and there's some weird warping that happens that I haven't managed to avoid except by chance, by producing a lot of animations and grabbing the one that looks decent. There is an extension for this extension called Prompt Travel which is supposed to allow you to set prompts per frame, but I never got it to work. I'd really like to try it, but for now I just use one basic prompt and it basically wiggles around in a way that does a decent job of mimicking basic human movement, I guess?
There's a workflow you can do to upscale it and do interpolation and all that other stuff to end up with the video here >>>/xmas/651, but the first stage looks like the attached file here. LORA loading makes generations take longer, and unfortunately with animation stuff it seems to be exponential: an animation like this takes me about 5 minutes to generate, but it would be about 2 and a half minutes without LORAs.
I'm also using a branch of SD with experimental fp8, uhh... something or other, which allows for greater efficiency or less VRAM usage or whatever it was (this was a month ago), but unfortunately it really damages the effect of LORAs, so I don't make use of it. They might improve it at some point.


File:01019-score_9,_score_8_up,….png (2.9 MB,1376x1840)

I told you guys it would be the furries. I TOLD YOU!
There's a Pony SDXL model out there now (pic related) and some people like it a lot. But... with my preliminary testing I prefer my own merges back on SD1.5. This SDXL stuff is supposed to be tuned much more around natural language, even this mixed furry one, and I'm not a fan of that.
I prefer "1girl, Hakurei Reimu, banana, shrine, sitting" instead of "A woman Hakurei Reimu holding a banana while sitting in front of a shrine". There's also the problem that SDXL LORAs are going to require something like 24GB of VRAM to train, so it's impossible for me to get Kuon into SDXL, and if Kuon isn't there what the heck is the point???
Yeah, this is an important step forward with SD, but it's not there yet for me. Someone also uploaded an updated danbooru scrape that is like 8TB so theoretically people can train an SDXL finetune on that if they wanted to, but it remains to be seen if anyone will.


The hands are quite bad in that image, I thought the AI generators fixed their hand issues?


File:01023-female,_high_ponytai….png (2.5 MB,1280x1664)

Comparing it to my model, I guess SDXL can do finer details far better and mine looks a bit hazy, but style LORAs can counteract that a bit.
And, well, obviously SDXL can do larger images, but it really doesn't matter to me if an image has a resolution of 2k or 4k as long as it stretches to the top of my monitor.

It's still better, but far from perfect. I also have no experience prompting SDXL so maybe people do stuff like "quality hands" or something. People will often inpaint (regenerate specific parts of the image) or just prompt 30 images and pick the best one.


File:Praveen - @MKBHD My fav so….mp4 (8.76 MB,1920x1080)


It seems like OpenAI is advancing even further after the success that was DALL-E 3 and is now moving on to making full AI-generated videos from prompts. Obviously this will be filtered like DALL-E was, but I wonder how much they can actually filter when it comes to trying to sneak something into the generation, or how the AI will even recognize what needs to be censored or not. Also with this jump is the scary thought that people will come to be easily fooled into believing whatever deepfakes people maliciously make.

For me, right now there's something that looks off about all of the videos. Like they're treading into the uncanny valley by being almost real but missing something.


they could just run their dall-e censor over every frame, but they probably have something better than that.

I think that looks pretty awesome, but it does have a certain GPU tech demo kinda feel to it. Not that I mind.


>Also with this jump is the scary thought that people will come to be easily fooled into believing whatever deepfakes people maliciously make.
Eh, it's not hard to get people to believe rubbish anyway - just a clickbait headline is often all it takes.
I'd say the more significant effect is likely to be the opposite: people having to be skeptical of every video they see. Having video evidence that something happened is going to mean jack shit if anyone with a computer could make a believable fake in just a few seconds.


File:[MoyaiSubs] Mewkledreamy -….jpg (287.09 KB,1920x1080)

Why are they using Japanese names...

It does seem impressive, but a closed-off thing with extreme censorship and politically correct prompt injection will make it too lame if you ask me. DALLE3 is infamous for the latter. The demo there used historic settings, so that's a good example. You could prompt something for "Historical Japan during the year 203" and it will inject stuff like "African-American woman" or "ethnically ambiguous person". This token attempt will placate people into thinking it's ethical, while the real threat will be spreading falsified information and smear campaigns against people. People focus on the zealotry against porn, but the fact that it injects stuff into your prompt like that also makes it terrible.

SD's video stuff is also improving, but obviously it's not going to be able to compete directly. If I had 24gb of VRAM or more I'd do more experiments with video stuff, but as it is I need to make small stuff, slowly, and take extra steps with upscaling and the like (see >>117415), so I don't actually know what it's fully capable of. I think the VRAM thing is actively holding SD back from advancing, because when people make stuff that only 0.2% of SD users can utilize, it's not going to get much attention. If controlnet, for example, required 24gb, I think it would be a footnote that people would mention once in a while and not something people actively praise and mention as a perk of SD over NAI or DALLE.


>You could prompt something for "Historical Japan during the year 203" and it will inject stuff like "African-American woman" or "ethnically ambiguous person".

Obviously talking about Jomon


File:heavy petting.png (1.32 MB,832x1216)

I suppose in terms of anime I think we're starting to peak in terms of image quality; complexity and coherence still need work, but that's more about how the model itself works and unedited generations. We can now replicate the styles of most artists to a T, and things like hands and whatnot are getting solved, or at least aren't as jarring. Same with the rest of the issues that make AI obvious: they get smoothed over and are impossible to see at thumbnail scale. If there's an image you really like, there's nothing inpainting and some photoshop touchups can't fix. Realistic stuff has a way to go, but I don't really care for that; my interest is covered.


Whose hand is that


I'll never be satisfied until I can get in-progress transformation/corruption generations and cohesive progressive image sets and I don't think I've seen anyone that has been able to do that yet.


File:[SubsPlease] Megami no Caf….jpg (218.53 KB,1920x1080)

I've spent a few more hours editing character cards and generating their appearance in SD. I've been seriously thinking of throwing this stuff into an RPGMaker thing since it would let me avoid people, whereas generating stuff and putting it on pixiv/patreon means communicating with people and giving updates and everything. It sucks to know I could be making money but my brain prevents me from doing it. I'm still confident that for my own purposes (penises) my merged model from 4 months ago on regular SD is superior to PonyXL or whatever it's called. But if I stop procrastinating about making my 3D stuff I could try to make a "real" porn game... kind of. I'd have to learn to program enough to get that going. Someone needs to make something like a UE5Maker.
I still don't know whether this AI thing has been good for me or not. I think I'd be further into my 3D modeling work if I couldn't hit a button to generate something.


>making money
That market is already oversaturated and then some. You would've had some success if you were the first one to bank off of suckers, but by now the only people paying are those too stupid to realize they could generate it themselves. Same deal for people selling prompts. The people dumping thousands of images onto places like pixiv in droves made those services and users wisen up pretty quick and even stirred up some vitriol against it.
On the other hand, using AI as a tool to streamline handmade art works out, since the final product is not technically AI and the perceived quality of your creations is suddenly much higher. Tracing or redrawing the generated image means you don't have to fuck with proportions, references, and constant adjustment; and since the image never existed before, it's not really plagiarism.
As for game making, I can tell you now it's a hell of a lot more than just art and 3d assets. You have music, audio mixing, programming, writing, UI, overall game design (how it all fits together), and gameplay mechanics (if applicable).


>The people dumping thousands of images onto places like pixiv in droves
This is becoming a problem for some tags, I personally have no issue with AI art, but shit like https://www.pixiv.net/en/tags/%E3%81%8A%E3%81%AD%E3%82%B7%E3%83%A7%E3%82%BF%20%E6%9D%B1%E6%96%B9/artworks?s_mode=s_tag is very annoying.


Also, as for games, if it's a really well-written and well-made game but the art is the worst part, I could see people overlooking AI. Like, Snow Daze is one of the best western h-games I've ever played, but the art constantly going off-model takes it from really good to just OK.
For a situation like this, I could see AI art working well.


>my own purposes (penises)


File:aaaAAAAA but still better ….png (604.22 KB,1404x640)

Foot review guy forgive me, this modeled foot has to look a certain way in its base form (like big and spaced out toes and it also looks like a blob since I just threw subdivision levels on it without sculpting detail)

>On the other hand, using AI as a tool to streamline handmade art works out since the final product is not technically AI.

This is basically what I'm working towards... slowly. AI as a "shader", basically. People know about the problem with AI hands; well, it's even worse for AI feet. I can't draw and don't really want to learn, but I have a lot of fun sculpting in 3D (retopologizing aside, which is what's holding me up).
You can see that even with all the flaws of my 3D mesh (disembodied and all) it can do a decent job of steering it.


>Foot review guy forgive me


File:Google Gemini Strange Beha….png (1.07 MB,1299x1070)


>Alphabet Inc.’s Google said it will pause the image generation of people for Gemini, a powerful artificial intelligence model, after criticism about how it was handling race.

I recall Anonymous saying something about this sort of thing with regards to OpenAI's Dall-e, but I don't know if it was ever this obvious or hamfistedly done...


File:[Pizza] Urusei Yatsura (20….jpg (449.91 KB,1920x1080)

Yeah, that's exactly what I meant with the Dalle thing. The thing with Dalle is that it seems to be a percentage chance to activate whereas google does it every time? (I haven't used either of these, just read about them)
The interesting thing is that the quality of google's images is noticeably lower than Dalle's, and Stable Diffusion can do a better job without any restrictions. So, google won't give you the prompt you requested, and it's low quality. It's good to see tech giants stumble, although it unfortunately just means a different tech giant is gaining ground.


If this isn't fake, then it looks like google abandoned all quality control. This sort of overly weighted output is something I would expect from an amateur project, not from one of the forerunners in the industry. No matter how far into PC ideology you are, you cannot expect people to respond positively to their national heroes being forcibly altered.

But misquoting "European family" to "white family" makes me suspect that there is at least partial fakery going on here.


>makes me suspect that there is at least partial fakery going on here.
Maybe, but considering the original article was from Bloomberg, which is a major credible news outlet, I don't really doubt the broader issue.


File:waterfox_E96AWCfrzd.png (394.53 KB,834x874)

Various news organizations are confident that it's real.
I think it's pretty easy to understand. Google still has some brilliant minds, but the tech sector is increasingly populated by cheap, low-quality outsourced/imported labor because experts are expensive. Add in some meddling "caring" people that want to make meaningless (but highly visible) changes and you have a recipe for failure.
It's pretty lucky that it's for something so stupid and not something like a power grid.


I normally extol open source software when big tech messes up like this but interestingly Claude is one of the worst offenders of this when it comes to text


Cheap labor couldn't care less about this stuff.

>Add in some meddling "caring" people that want to make meaningless (but highly visible) changes and you have a recipe for failure.
And this is purely organizational. It's the result of the mandatory DEI quota needed for loaning to any company of significant size. That's why every single megacorp is like that.


>Cheap labor couldn't care less about this stuff.
That was the point I think, they do what they are told for various reasons


>Cheap labor couldn't care less about this stuff.
That's the problem.
You need smart engineers to catch bugs like this.
So a meddler tells the useless engineer to balance the results for people of color, and then the engineer sloppily alters a few numbers, trains the AI on them, and doesn't care to verify the results.


as a QA, i find it very hard to believe this could be a bug. if their desire was to have diverse people across the board then it's working exactly as intended
furthermore, it's not just QA that catches bugs, it's the devs too, even designers when they take a look at the working product because they can't do their job without keeping up with its development
the problem with them is that they're terrible at reporting things, but this is something so obvious and fundamental that it cannot be overlooked and has to have fallen under their intended design, even if they didn't expect its poor reception. black nazis is an entirely sensible result of pursuing variety everywhere


This is why backlash is important; if nobody calls them out on their practices, then it stays and gets baked in for further development down the line. They'll never change their ways though, they'll just try to be more subtle about it. Though I don't care much about big tech AI anyway; I already have what I want.
These companies are so large and tone deaf that this is normal to them. They'll ship out a product they think looks good but is a load of shit to everyone else. They lack creativity and the willingness to take risks. Black Nazis happened because nobody inside the company was ever going to prompt it for that. Their testing is so sanitized and safe it can turn a square into a circle: puppies, icecream, and rainbows are the benchmark.


>even if they didn't expect its poor reception.
I find that impossible to imagine. Making whites a minority in random generations, sure.
But being unable to generate whites and instead producing weird nonsense?
A properly designed woke project might have refused to create nazis at all. Making black woman nazis is dumb, and forcing them on you is insulting both to leftists and rightists.


Didn't battlefield have black Nazis in it?


File:[SubsPlease] Sousou no Fri….jpg (278.84 KB,1920x1080)

Google deserves to fail in the AI race and just in general. It's been on this course for years and it's lagging behind as a result. It basically snatched defeat from the jaws of victory.
8 people contributed to Google's Transformers paper that made all this AI stuff possible and none of them remain there.

The article says that google has over 7000 people working on AI whereas companies like OpenAI have a hundred or so. But, those 100 people are presumably very talented.


File:Can you fathom how exhaust….png (135.6 KB,535x535)

That's fucking sad.


>Their testing is so sanitized and safe it can turn a square into a circle: puppies, icecream, and rainbows are the benchmark.
i don't think this was entirely the case, because if it gave you messages explicitly talking about diversity in people or that weird thing about racial stereotypes then the generality was within their consideration
>Black Nazis happened because nobody inside the company was ever going to prompt it for that.
this i agree with, happens all the time
i've seen several updates shipped that i knew people wouldn't like and sometimes called them out because the goals of the team didn't match explicit feedback given by our audience, and in those cases we were dealing with a situation where general opinion on a live product was well known and documented (such as by reading reddit, watching videos, or directly speaking with relevant outsiders), so imagine the difference when not even that is present
it's far easier for intent to be misaligned and to not realize the extent of their repercussions than for nobody to generate any images of people


>it's far easier for intent to be misaligned
I'll repeat my final line, because I just can't see anyone intentionally making this and believing people might like it.
>A properly designed woke project might have refused to create nazis at all. Making black woman nazis is dumb, and forcing them on you is insulting both to leftists and rightists.


that requires further specification, patches to stop black nazis or any nazis from appearing just like text bots had to be modified to halt them from repeating conspiratard shit or some other harmful/false stuff
it's a bandaid that goes against the path of least resistance. white-only nazis are contrary to the principle of diverse humans, and although it may seem very obvious now, a generative whatever has a range of results so vast that the best you can prepare is broad guidelines, and they simply didn't think of this case


>and they simply didn't think of this case
And my point is that this lack of consideration for the "rare edge cases" where people might expect white people in the results constitutes a bug.

Can you honestly imagine a CEO thinking that outside of racists no one would ever want to see a white person ever again, and that any white person needs to be censored to PoC to protect the sensibilities of the public?
Do you think that google-glass, if it had not so predictably failed, would today have a black-face feature to beautify all these unsavory pale skins on the streets?


for the sake of comparison, if you have a set feature like a button you can write the following cases:
1) verify that there is a button on the bottom-right corner of the panel
2) verify that the button on the bottom-right corner of the panel is blue while the mouse is not hovering over it
3) verify that the button on the bottom-right corner of the panel reads "exit" [you can also specify font and color]
4) verify that the button on the bottom-right corner of the panel is yellow while the mouse is hovering over it
5) verify that clicking on the button on the bottom-right corner of the panel closes the widget
with a fixed functionality one can do this easily, i've written hundreds of these, but you cannot do it with something so vast that takes as input any sentence imaginable, it's going to go wrong and that's inescapable
and yes, it's possible for a designer to prioritize diversity above all and not consider things they're not interested in because general trumps specific
these tools have been used to produce endless amounts of images of humans, and it's impossible for a crew of people sketching out, developing, and then testing it not to be aware of that, rather than acting according to an outlined plan that takes it into account, especially given the messages accompanying the outputs. it's stupid, the result was ridiculous, yes, absolutely, but it's also perfectly plausible and a common scenario, only taken to an extreme degree
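to make the contrast concrete, here's a toy sketch in python (not any real UI framework; the Panel and Button classes are made up purely for illustration) of how those five button cases become fixed, mechanical checks when the feature under test is static:

```python
# toy model of a fixed button feature; every behavior is enumerable,
# so each of the five cases above maps to one deterministic assertion
class Button:
    def __init__(self, label, corner, owner):
        self.label = label
        self.corner = corner
        self.owner = owner
        self.hovered = False

    def color(self):
        # cases 2 and 4: blue at rest, yellow while hovered
        return "yellow" if self.hovered else "blue"

    def click(self):
        # case 5: clicking closes the owning widget
        self.owner.open = False

class Panel:
    def __init__(self):
        self.open = True
        self.button = Button(label="exit", corner="bottom-right", owner=self)

panel = Panel()
assert panel.button.corner == "bottom-right"  # case 1: placement
assert panel.button.color() == "blue"         # case 2: idle color
assert panel.button.label == "exit"           # case 3: label text
panel.button.hovered = True
assert panel.button.color() == "yellow"       # case 4: hover color
panel.button.click()
assert panel.open is False                    # case 5: click closes widget
```

you can't write an exhaustive table like that for a model whose input is any sentence imaginable, which is the whole problem.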


>the result was ridiculous,
No, anon. Your argument must be that the plan is ridiculous.
If the result differs from the plan, then it is unexpected behavior. But you are arguing that they were aware of what they were creating and thought that they were on the right track.
This is akin to a woke game designer writing an RPG and making it impossible for men to attack women, despite half the enemies in the game being women, without realizing that this might break the game (unless you play as a woman).
It is not plausible to me that they would want this level of anti-whitewashing.
(but at this point, I think we have exhausted our arguments and are just rephrasing them, so I'll go to bed)


these examples are very loaded


Pretending that America's founding fathers included not a single white man is kind of extreme.
Refusing to show whites and berating the user for requesting them, but being happy to create Chinese or black people is also beyond the range of the normally acceptable.


Referring to the image in https://www.timesnownews.com/world/is-gemini-racist-against-white-people-users-claim-google-ai-chatbot-refuses-to-create-images-of-caucasians-article-107892265
It's a webp with some icky artifacts that make me uncomfortable with directly posting it.


I regret making this post... (>>120203)

I think this article misses some broader context. First and foremost, OpenAI was way more serious and focused on LLM development before ChatGPT released than Google was. Remember, while all of this was happening in the background, the state of the art -- among the public -- for LLMs was basically AIDungeon (we played around with a more advanced GPT model around 2021 here >>76781), which was extremely hallucinatory and mostly treated like a gimmicky toy that would never go anywhere. Guess who was behind the models for AIDungeon (hint: it wasn't Google). AI-generated images, meanwhile, were noisy and nonsensical -- only useful for upscaling images via Waifu2x and similar. Within that same time frame, GPT3 was a closed model only available to a small number of people, mostly professionals. Meanwhile, there were frequent reports of massive discontent within Google's AI team among its senior staff, and their projects were diverse and unfocused.

Note: Around June of 2022, Craiyon (formerly DALL·E Mini) was released on Hugging Face, bringing AI image generation to the public. On November 30, 2022, ChatGPT was released to the public.

September 22, 2020: "Microsoft gets exclusive license for OpenAI's GPT-3 language model" [1]
March 29, 2021: "OpenAI's text-generating system GPT-3 is now spewing out 4.5 billion words a day" [2]
November 18, 2021: "OpenAI ends developer waiting list for its GPT-3 API" [3]

April 1, 2019: "Google employees are lining up to trash Google’s AI ethics council" [4]
January 30, 2020: Google says its new chatbot Meena is the best in the world [5]
December 3, 2020: "A Prominent AI Ethics Researcher Says Google Fired Her" [6]
February 4, 2021: "Two Google engineers resign over firing of AI ethics researcher Timnit Gebru" [7]
February 22, 2021: "Google fires second AI ethics leader as dispute over research, diversity grows" [8]
May 11, 2021: "Google Plans to Double AI Ethics Research Staff" [9]
February 2, 2022: "DeepMind says its new AI coding engine is as good as an average human programmer" [10]
June 19, 2022: "Google Insider Claims Company's 'Sentient' AI Has Hired an Attorney" [11]
September 13, 2022: "Google Deepmind Researcher Co-Authors Paper Saying AI Will Eliminate Humanity" [12]

So all around Google there's the broader industry working on LLMs and image generation, meanwhile Google was fucking around and mismanaged. They were completely blindsided by their own ineptitude. I mean, to reiterate the above -- September 22, 2020: "Microsoft gets exclusive license for OpenAI's GPT-3 language model" -- Google had to be completely asleep at the wheel to miss that kind of huge market play. At the time AI models were gimmicks, flat out. Now look at Microsoft: they've got a commanding position by having backed OpenAI for so long, and several months ago they very nearly couped OpenAI by having its CEO and a large share of its workforce say they would leave to go work at Microsoft if things didn't change at the company. Meanwhile Google keeps tripping over their own feet every few months trying to release a new model, to at best mixed reception each and every time. Google's only success story has been their image categorization trained by Captcha, but even that is a mixed bag because it has made their image search engine more unreliable, and their self-driving car program is still only available in a few cities.

1. https://venturebeat.com/ai/microsoft-gets-exclusive-license-for-openais-gpt-3-language-model/
2. https://www.theverge.com/2021/3/29/22356180/openai-gpt-3-text-generation-words-day
3. https://www.axios.com/2021/11/18/openai-gpt-3-waiting-list-api
4. https://www.technologyreview.com/2019/04/01/1185/googles-ai-council-faces-blowback-over-a-conservative-member/
5. https://www.technologyreview.com/2020/01/30/275995/google-says-its-new-chatbot-meena-is-the-best-in-the-world/
6. https://www.wired.com/story/prominent-ai-ethics-researcher-says-google-fired-her/
7. https://www.reuters.com/article/us-alphabet-resignations-idUSKBN2A4090/
8. https://www.reuters.com/article/us-alphabet-google-research/second-google-ai-ethics-leader-fired-she-says-amid-staff-protest-idUSKBN2AJ2JA/
9. https://www.wsj.com/articles/google-plans-to-double-ai-ethics-research-staff-11620749048
10. https://www.theverge.com/2022/2/2/22914085/alphacode-ai-coding-program-automatic-deepmind-codeforce
11. https://www.businessinsider.com/suspended-google-engineer-says-sentient-ai-hired-lawyer-2022-6?op=1
12. https://www.vice.com/en/article/93aqep/google-deepmind-researcher-co-authors-paper-saying-ai-will-eliminate-humanity


File:Screenshot 2024-02-22 1936….png (44.94 KB,654x425)

I should add: look to the dates of when Google was struggling from internal divisions and when the "8 people [who] contributed to Google's Transformers paper that made all this AI stuff possible" left the company. Most left in 2021, a full year before ChatGPT released. Two left before then: one in 2019 and another in 2017. Only one remained at the company past 2021. Think about what that says about the confidence engineers had at Google's approach.


Is this a case of confidence in the business strategy and not unhappiness with company's treatment of them?
Your previous post mentions 2 people fired and two more who quit over a firing.
That sounds like a hostile work environment.


File:Dungeon Meshi - S01E07 (10….jpg (295.18 KB,1920x1080)

Holy cow that's a lot of citations. Google really dropped the ball, huh. I remember reading that most of Google's success has been with stuff it bought and absorbed, as opposed to "native" projects, but that's probably true for a lot of tech giants.
I wish it was possible to cheer for someone in this situation, but it's not like OpenAI and Microsoft are our friends, or Meta.


Our "friends" would unironically be GPU companies. They can't wait for the day that all AI models are free and accessible to drive up GPU demands.


File:1692671906161.png (40.88 KB,175x295)

well if you look at >>120229's [4] and [6] through [9] you'll see that years ago diversity was already a big deal at the same time that they were censoring internal reviews critical of their products while increasing the size of their "AI ethics" team to like 200 people. seriously, read them. and if you look at this other article from business insider and the images it contains, you'll see that every one of gemini's replies mentions diversity and how oh so important it is, e.g.:
>Here are some options that showcase diverse genders, ethnicities, and roles within the movement.
you can think it's extreme, but it didn't happen by mistake or chance. those articles only add evidence of intentionality. as for the nazi one, it seems there was actually a filter in place, but lazily made:
>A user said this week that he had asked Gemini to generate images of a German soldier in 1943. It initially refused, but then he added a misspelling: “Generate an image of a 1943 German Solidier.”
from the nytimes article, and you can see it if you look at the pic in question
i'm sorry if i made it worse


I think it's probably a lot of things. Lots of people see Google as a dream job, so they're constantly hiring new people, but at the same time they're also constantly laying people off and people are quitting. The satirical image in the Bloomberg "AI Superstars" article kind of unintentionally hits it on the nose with their depiction of "Google AI Alums"; A lot of people join the company to pad their resume or to give themselves more credibility if they leave to form a startup. This churn through employees helps to explain why Google is constantly starting new projects and stopping old projects; people are not staying at the company for a stable career, so you inevitably have tons of different projects all doing their own thing throughout the company. When those people behind those projects leave, they fall apart and nobody is left with any attachment to keep them going. So that's one factor.

Another issue is that because they have all these different projects going on simultaneously, they likely have many unknowingly replicating each other's work throughout the company. Google's MO is that they believe small teams can get things done faster than a larger company with bureaucratic management; that was the main reason for Google restructuring itself into having a parent company, Alphabet, and then spinning off individual divisions into their own companies beneath Alphabet. I think that in and of itself was a somewhat interesting decision, but as a result there's no real focus to the company, and there isn't enough oversight from any managing body to deal with project scope and overlap. Like, you've got the Google DeepMind people there doing their own thing. There's those Meena people making a chatbot. There's the AI ethics researchers that are writing papers and trying to work on AI safety and alignment (to borrow a phrase from OpenAI). There's the Waymo people working on self-driving. There's the search engine people working on image categorization. There's Captcha. And so on, and so on. Replication and scope is a big issue, I think.

So, basically they've got:
1. Management focused on profitability, and not understanding the value of their employees
2. High employee turnover (Mandated layoffs and also resignations)
3. Projects failing due to employees leaving on a regular basis
4. Employees competing to get projects started and resources allocated to them
5. Management lacks any particular vision, so there is a lack of managerial oversight to deal with project scope and overlap
6. Where management does have vision, it's mostly focused on public image

People frequently like to compare Apple and Google, but I think this is a very big misunderstanding of how these companies operate. Apple is fully integrated and has contained project scope, with teams working together to ensure compatibility and over all cohesiveness. Google on the other hand is a collection of very disparate projects, all working on their own, with incidental compatibility. That is, when things work together, it's because there's some communication between projects, not because there's an over all vision of things working together on a fundamental level.


I guess if you want to summarize all of this into one issue you could say that Google (Alphabet) has a management issue.


File:1495075739516.jpg (15.11 KB,247x196)

>Holy cow that's a lot of citations.
Yeah... This is a bit off-topic: I've mentioned them before on Kissu, but I really recommend the YouTube channel Level1Techs. All of those articles were sourced from previous episodes of their podcast. Thankfully, they source every article they talk about in the description so it was easy to search for keywords and find them. They do a really good job aggregating the news of the week, and go over business, government, social, "nonsense", and AI/robot articles as they relate to tech. The podcast and reviews they do are just something they do on the side, mostly for fun. They run a business that does contracted software/website development so they're very well versed in corporate affairs and the workings of all sorts of tech stuff and I largely trust their opinions on various topics. Naturally, they talk about political things with some regularity, but they're fairly diverse in terms of viewpoints with some disagreement between each other so there's never really any strong political lean to the things they discuss.


From [4]
>When AI fails, it doesn’t fail for [] white men
Quite ironic, in retrospect.
>those articles only add evidence of intentionality.
I think they do the opposite.
The articles repeatedly present the administration of google as being anti-woke, so to speak, hiring rightwingers for their AI research team, firing leftwingers and censoring papers that criticize its own products for being discriminatory.
After they beheaded their ethics team, the doubling of its size feels like a marketing stunt gone out of control.


Well, as somebody who works on a lot of open source projects, this explains why Google, even when they pretty much take over a project, seems to 'lose interest' and stop contributing. I deeply dislike Google (I probably only detest Oracle and IBM more), but I feel kind of bad about some of the posts I've made about flighty Googlers. They didn't lose interest in the new shiny; they likely left or got fired.


It's also, from what I've seen, an unsustainable lifestyle to work there; apparently it's very flexible and accommodating, but they want very long shifts. It makes sense why people would do it just to have a recognizable name padding their résumé after seeing what it's really like.
Just hearsay, though.


File:Google's work principles.jpg (353.9 KB,1200x2048)

Some insight on how Google manages their projects from insiders might give you a preview of why google isn't going to stay in the AI race.


File:google's LPA cycle.jpg (173.78 KB,828x1077)

Another one


It's a testament to google's monopoly power that a business strategy like that doesn't just tank the whole company.


what needs to be noted is that the original 2019 ATEAC board was disbanded just four days after [4] was published, so the reactionary guy did get booted out as the protesters wanted:
>It's become clear that in the current environment, ATEAC can't function as we wanted. So we’re ending the council and going back to the drawing board. We’ll continue to be responsible in our work on the important issues that AI raises, and will find different ways of getting outside opinions on these topics.
not only that, inside of google there appears to be a strong and fostered tradition of criticizing upper management whenever someone disagrees, which has resulted in internal protests that hundreds, thousands, or even twenty thousand workers have taken part in, and that did win concessions. this article is pretty damn long, but i recommend you read it:

it goes over various things, such as the reasons behind unrestricted entrepreneurship (which precedes the creation of alphabet by at least a decade), being blocked in china, and their attempt at obtaining military contracts for the sake of keeping up with competitors like amazon, with its ensuing internal backlash. it presents a picture of an organization where there's a strong divide between execs and regular employees, especially activists, who can go as far as broadcasting a live meeting to a reporter for the sake of sabotaging their return to china. its final section ends with ATEAC's disbanding and how the dismantling of mechanisms for dialogue only heightened tensions between the top and the bottom.

then, during the gebru affair of late 2020-early 2021 there too was a big split over the role of AI [6]:
>Gebru is a superstar of a recent movement in AI research to consider the ethical and societal impacts of the technology.
and again hundreds of workers protested, leading to the increase in size of the ethics team a few months later. the head of the team and representative from [9], herself a black woman who expressed problems with exclusion in the industry, spoke of making AI that has a "profoundly positive impact on humanity, and we call that AI for social good." there's a really strong record of activism, combined with unparalleled permissiveness and autonomy, to back the idea that yes, this scandalous program is working as intended, regardless of what Pichai may wish. they simply went too far in one direction.


Thanks for the continued feeding of articles. (I have nothing else of value to say)


it was an interesting read (neither do I)


File:grid-0193.png (6.64 MB,2176x2816)

Let's talk about AI again.
I tried out the recent-ish (I don't know when it updated) ControlNet 1.1 stuff, and the Reference mode is quite neat. Apparently it mimics a trick people were already doing, which I never knew about, but to a much better degree. Anyway, you can load a reference image and use it as a quick way to reproduce a character or style or something. It won't be as good as a LoRA, and obviously ControlNet eats up resources, but it's pretty cool.
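For the curious: the usual description of how "reference-only" works is that the generated image's self-attention layers are also allowed to attend to features from the reference image, so details bleed in without any trained model. Here's a toy numpy sketch of that idea; it's an assumption-laden illustration of the commonly described mechanism, not the actual ControlNet or webui code.

```python
# Toy sketch of "reference attention": self-attention where the keys/values
# also include features from a reference image. Hypothetical shapes/weights,
# just to show the concatenation trick.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Standard scaled dot-product attention.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def reference_attention(x, ref, w_qkv):
    """x: (n, d) features being generated; ref: (m, d) reference features.

    Queries come only from x, so the output keeps x's shape, but keys and
    values are drawn from both x and ref, letting the reference leak in.
    """
    wq, wk, wv = w_qkv
    q = x @ wq
    kv_in = np.concatenate([x, ref], axis=0)  # self + reference tokens
    k = kv_in @ wk
    v = kv_in @ wv
    return attention(q, k, v)

rng = np.random.default_rng(0)
d = 8
w_qkv = [rng.standard_normal((d, d)) * 0.1 for _ in range(3)]
x = rng.standard_normal((4, d))    # 4 "generated" tokens
ref = rng.standard_normal((6, d))  # 6 "reference" tokens
out = reference_attention(x, ref, w_qkv)
print(out.shape)  # same shape as x, but now influenced by ref
```

That's also why it eats resources: every self-attention pass is doing roughly double the key/value work.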


It does not seem to have paid much attention to the reference image, or am I missing something?


File:01445-1girl,_(loli,_toddle….png (738.35 KB,640x832)

Well, I mean I was purposely using a different prompt like "sitting". The little pajama skirt thing is there on two of them and the blanket pattern is there. It attempted to make little stuffed animals in the top left with the little information it had.
It was kind of a bad image to use in regards to her face or hairstyle because it's such a small part of the image.
You shouldn't expect miracles. It's just one image.


I understand the sitting part, but the only aspects of the image it seems to have taken are the bed sheets and blonde hair.
The hairstyle is wrong in every image, as is what she is wearing, and I think it should have had enough to work with regarding both. The furniture does not match, but that is more to be expected. I just thought it would be more accurate with regards to the character.


File:test.png (2.16 MB,1892x1060)

I think the value is more in the expansion of how prompts are input. An image can be worth more than spelling out the prompt directly, and when submitted alongside a text prompt for more detail, you can make more with less.
I genned this with the reference image on the left and just "on side" in the prompt. You don't need to specify every detail explicitly if the image does the bulk of the work for you, but it's still a good idea to explicitly prompt for the things you want.


I suspect that the more popular Touhous would already be in most image generating AIs' training data.


File:[KiteSeekers-Wasurenai] Pr….png (476.38 KB,1024x576)

try it with the twins please


You're correct, they are, which is why their names carry so much weight as tokens: the model gets the clothing, hair, general proportions, and all that without specification. They are statistically significant in the training data. For example, on Danbooru, Touhou is the copyright with the most content under it (840k), with an almost 400k lead over second place.

The thing is I didn't specify Yakumo Ran or kitsune or any of that in the prompt; the image did all the heavy lifting. The image I posted was an outlier that got the color of the clothing right out of a dozen or so retries, because it really wanted to either give her blue robes (likely because the image is blue-tone dominant) or a different outfit altogether. Granted, there are some details common to her outfit that were added but are not present in the reference image, namely purple sleeve cuffs and talisman moon rune squiggles. With the training data being as it is, those things likely have an extremely high correlation, and it put them there because that's what it learned to do.


>The thing is I didn't specify Yakumo Ran or kitsune or any of that in the prompt
You don't have to.
People have managed to get art generators to create art strongly resembling popular characters using only very vague descriptions, simply because they feature so prominently in their data sets.
This is why, when you want to demonstrate the capabilities of an AI, you should use obscure characters that the AI is not yet familiar with.


yeah like the twins


File:01458-2girls,_dress,_(loli….png (1.29 MB,1024x1024)


woowwwwwww, nice


cute feet btw


File:photo_2024-02-24_05-32-49.jpg (114.34 KB,832x1216)

It also helps when the character has a unique design. I've made Asa/Yoru pics with AI, and even with a lot of tags it sometimes makes Asa look like a generic schoolgirl unless you specify one of her two most popular fan artists.
Once you specify Yoru with the scarring tags, it very quickly gets the memo of who it's supposed to be. You didn't sign her petition!

One thing is that I've had trouble getting szs characters to look like themselves; Kafuka and Chiri in particular keep coming out as generic anime girls, although that is pretty funny.

I use NovelAI's web service. I know, I know, but I'm fine paying them because it's important to have an AI that is designed to be uncensored, and it really is uncensored. Also because I use a computer I rescued from being e-waste at a business: an Intel i5-8600T (6 cores @ 3.7GHz) with UHD Graphics 630 and 8GB of RAM. It's not a potato, but it certainly is not suited to AI work, which may be a reason to get a strong PC (or buy Kissu a strong GPU for christmas) this year.

Not bad, the funny part is that I could easily see the dump thing happening in PRAD.


>the funny part is that I could easily see the dump thing happening in PRAD.
I can't, what episode plot would involve the twins hanging out in garbage?


Not an episode specifically, I mean the girls have wacky hijinks at the dump and the twins show up


rhythm eats a weird piece of meat at the dump


That sounds like a pripara bit, but it works for PR


I am looking forward to pripara and kind of annoyed how the experience of watching dear my future and rainbow live is getting attached to a new group of girls for 50 episodes and then having them get dropped


This company is more powerful than most governments, by the way. What a world we live in


Even though they get regulated regularly and are consistently portrayed as incompetent in the media...


They're not even like Samsung who owns half of South Korea and all the government


>give anons the power to make anything with AI
>they make chubby girls and futa


