No.843
Anyone else been messing around with the stable diffusion algorithm or anything in a similar vein?
It's a bit hard to make it do exactly what you want, but if you're either extremely descriptive in the prompt or just use a couple of words it gives some pretty good results. It seems to struggle a lot with appendages, but faces come out surprisingly well most of the time.
Aside from having a 3070, I just followed this guide I found on /g/:
https://rentry.org/voldy
Getting things set up was pretty painless.
No.844
Ah, yeah, I've been reading up on it. I downloaded some 7GB danbooru thing for it. I wouldn't trust /g/ with an Etch-a-Sketch so I won't follow a guide from there, but I've saved some links from other places:
https://moritz.pm/posts/parameters
https://github.com/sddebz/stable-diffusion-krita-plugin
https://lexica.art/
https://docs.google.com/document/d/1gSaw378uDgCfn6Gzn3u_o6u2y_G69ZFPLmGOkmM-Ptk/edit
I'll get to trying this eventually, but so far I've just been procrastinating a bunch since I need to install and run Python and do other stuff I don't understand. My VRAM is also only 6GB and I'm not sure if that's enough.
No.845
>>844
I'm not too concerned with the theory at the moment; I more just wanted to know what practically has to be done to get it running. That guide more or less amounts to downloading a git repo, a model (this is the sketchiest part, but you already did it), and Python 3.10.6. Then you run a .bat file and it works. From what I can tell the web UI allocates 4GB of VRAM by default, and you have to pass command-line arguments to make it use more or less. It should run on an Nvidia card with 6GB.
That Krita plugin looks interesting, will check it out later.
No.847
>>846
here are the other faces from that batch
the model i'm using is supposedly trained on a set of images from danbooru, not sure why it'd look korean specifically other than chance
No.848
I have exactly 0 (zero) interest in AI art. I have not saved a single file from one of them to this day even. I wouldn't call it being a hater, but they really are just fundamentally unappealing to me.
No.849
I don't get what all the fuss is about either. If you've seen one image, you've seen them all. They all have this weird quality to them. Maybe it's that there's absolutely nothing meaningful about them. Doesn't help that most of these images look like bad crops.
No.851
>>850
why is she stuffing her boobs with spaghetti....
No.852
>>851
Ever wondered why girls smell so good? This is why.
No.854
>>850
dat' polydactyly
wow, so even AI has trouble drawing hands
No.857
>>854
>so even AI has trouble drawing hands
Yeah, it must be related to how the algorithm copies things: it gets confused and can't do hands. With faces the parts have general locations and you can meld shapes a bit, but with hands it's trying to copy a bunch of different positions and angles into one and it breaks. Anime faces might be one of the best subjects, since they don't even make sense to begin with in regards to angles.
No.858
I just steal other people's prompts and add waifu.
Also if anyone else is on AMD on Windows, I followed this guide and it works
https://rentry.org/ayymd-stable-diffustion-v1_4-guide
Also Also if anyone can help me figure out how to change output resolution, that would be swell.
No.859
>>845
>>850
>>855
Yeah, I've been somewhat surprised by the quality of the more 3DCG drawings I've seen from it, but when it comes to a more anime style the AI falls short. There are probably subtleties that it can't pick up in batch because of differences in artist styles, which causes these amateur-level drawings.
No.861
Alright, I'm diving in. Might take a while to get stuff set up and figure out what I'm doing, however.
No.862
File:a.png (357.08 KB,512x512)
Making some progress...
No.864
File:index.png (Spoiler Image,4.25 MB,2048x1536)
>>863
Ehh, so many of these are horrifying so I'm going to put them behind a spoiler. I think I'm going to try that thing tomorrow where you can selectively "refresh" parts of the image
No.867
>>863
Has science gone too far?
No.868
>>864
From the AI I've used myself, these aren't so bad
No.869
>>865
Is that an anthropomorphic "furry" Koruri?
No.876
>>874
the rendering and shape are good, but it's still making mistakes. It's just that it's focusing on something simple, so the mistakes are better disguised
No.877
>>874
did you use the prompts from stable diffusion to make that?
No.878
>>877
I didn't make it. I got it from the stable diffusion thread on 4/h/. I've been lurking it for a few days because it's a lot slower than the /g/ one and seems to have more technical discussion.
I just wanted to share it because I thought it was a pretty good generation.
No.879
There's a new model called Hentai Diffusion that was trained on Waifu (ugh) Diffusion and 150k danbooru/r34 images. I guess it'd be better at nudity?
https://huggingface.co/Deltaadams/Hentai-Diffusion/tree/main
You might need a huggingface account to download it. I have one because I was going to upload a set to train or whatever, but then I saw that there doesn't seem to be any way to use their GPUs without making it public, and they have rules against nudity. I also wouldn't want to upload an artist's work for others to exploit for real instead of making stupid things on kissu.
Wish I had more VRAM. Oh well.
No.882
I've seen AI that write code and this reminds me of some of the shortcomings people had with it.
While they were trained on a large database, it would often be the case that the AI was technically copying programmers from Stack Overflow and pasting their raw code into people's software.
I feel like it's almost the same case here. It took chunks from every artist it saw, creating essentially a collage with little creative problem-solving of its own... and when it does try, it's simply a confused error rather than inference.
I was much more impressed by reimu's breasts
No.883
it's almost as if machines cannot think
No.884
>>843
>>844
I've been messing with SD since last week using
https://github.com/AUTOMATIC1111/stable-diffusion-webui and the danbooru model
https://thisanimedoesnotexist.ai/downloads/wd-v1-2-full-ema.ckpt
I've been doing only txt2img because apparently I don't have enough GPU RAM for img2img (laptop GTX 1660 Ti).
A couple of images turned out to be cute; most are pretty bad, or maybe my prompts are bad, who knows.
I've been thinking of setting up my local server to produce anime images 24/7 with some script that autogenerates prompts, not sure how its GTX 970 would handle it though.
>>879
Not messed with lewds too much for now, going to download it and try.
No.887
>>886
really hoping that's a cute boy and not a g*rl
No.888
>>885
Yes, science has gone too far.
No.889
>>888
You will live long enough to see robotic anime meidos for domestic use and you will be happy.
No.890
>>885
It's kind of interesting to see a real artist use it. I'm assuming he did the img2img thing, which uses an image as a guide, since it's got his style's wide face and ludicrously erotic body proportions.
This is a good example of how generic it looks when compared to the real thing, which you can't really get around since generic is exactly how it's supposed to function. In theory people can (and certainly will) try to directly copy individual artists, but so far it's pretty bad at that.
No.892
>>882
When you really break it down, "AI" art is more or less the same thing as procedural level generation in games. The computer is provided with a set of rules, and then randomly generates something that follows those rules.
That's also why I can't see it outright replacing artists like a lot of people are afraid (or if they're psychopaths, hopeful) it will. You can generate all of the level design for your game procedurally, and a lot of games do (minecraft, for example). But "level designer" still exists as a profession for a reason.
No.900
NovelAI's model has been leaked. hehehe. Meaning you can do it offline without paying them.
It's 52GB with multiple models, and I doubt I'll be impressed, but I'm torrenting it anyway.
No.901
>>900
Can you post the link?
No.903
>>902
Thanks, adding it to the hoard.
No.904
>>902
But someone is replying about 'python pickles' and I have no idea what that entails. I guess he's telling people that it could contain a virus, or otherwise have code in it? There's this link, but I have no idea what it means:
https://rentry.org/safeunpickle
Does anyone here know Python and can tell what the thing above does? They made it sound like it's something to use to check for malicious stuff, or maybe I interpreted it wrong.
But people on 4chan are already using this, so I think it's safe
No.905
>>904
Pickle is a data serialization library:
https://docs.python.org/3/library/pickle.html
Serialization means turning in-memory data like objects into a format that can be stored on disk or sent over a network. JSON is another common serialization format.
I don't use pickle much, but unlike JSON, which is plaintext, pickle is a binary format, so when you deserialize it, yes, it's possible for arbitrary code hidden in the data to be executed.
>https://rentry.org/safeunpickle
After a quick glance, it looks like that code overrides some of the functions described in
https://docs.python.org/3/library/pickle.html#pickle.Unpickler
The overridden "def find_class(self, module, name)" seems to implement some kind of whitelist so that only certain kinds of data (whatever is considered safe, I guess) can be loaded.
I can't guarantee that code actually protects against possible code execution, though. If I were you, I would download it if you care, but wait some time before executing it and see what happens.
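To make the whitelist idea concrete, here's a minimal sketch of a restricted unpickler in the spirit of what that script describes. The SAFE_NAMES list and the Evil class are illustrative inventions, not the rentry code's actual contents:

```python
import io
import pickle

# Illustrative whitelist -- a real checkpoint-safe list would include the
# torch/numpy types that model files legitimately contain.
SAFE_NAMES = {
    ("collections", "OrderedDict"),
}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # pickle calls this whenever the stream references a class or
        # function; refusing unknown names blocks code-execution payloads.
        if (module, name) in SAFE_NAMES:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked: {module}.{name}")

def safe_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Harmless data deserializes fine...
ok = safe_loads(pickle.dumps({"weight": [0.1, 0.2]}))

# ...but a malicious payload that smuggles in a callable is refused.
class Evil:
    def __reduce__(self):
        # On unpickling this would call builtins.print("pwned")
        return (print, ("pwned",))

try:
    safe_loads(pickle.dumps(Evil()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(ok, blocked)  # {'weight': [0.1, 0.2]} True
```

So the protection hinges entirely on how complete and conservative the whitelist is; anything the stream asks for that isn't listed gets rejected before it can run.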
No.906
>>902
All the AI talk is hurting my no-knowledge-on-AI brain. Apparently there's going to be a part 2 to the leak; can't keep up with /g/, but am happy to download/seed it though.
No.907
>>902
anon created a guide, probably 100% the real deal now
https://rentry.org/sdg_FAQ
No.914
>>912
my wife chino is
ballin'
No.915
>>885
I swear to you guys, I was arguing on another corner of the internet that I'm not interested in AI because it couldn't create art in the style of a particular artist, and the artist I was referring to was literally Zankuro specifically, yet here we are. I was crushingly naive. I wonder how far off we are from it making lewd gifs in Zankuro's chubby loli style...
No.917
Unsurprisingly it fails to capture Kuon's beauty, although I don't know how to do the tagging with this for Kuon_(Utawarerumono) so I took a guess from what I think I remember seeing.
This one came pretty close to getting her face I think. But, I need to do a thing where I train it.
This is something I/we need to read up on that apparently is a big deal:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/2284
No.919
>>918
>abmono
Do you mean abmayo?
No.921
>>919
it works because mayonnaise is a もの, so it's all cool
No.922
People are making animations with it somehow using a script. (webm contains nudity)
Copied /h/ post:
you can already make a video
https://github.com/Animator-Anon/Animator
this was one an anon made
afaik the keyframes were something like
Time (s) | Desnoise | Zoom (/s) | X Shift (pix/s) | Y shift (pix/s) | Positive Prompts | Negative Prompts | Seed
0 | 0.6 | 0.9 | 0 | 0 | sleeping in bed, under sheet | | -1
2 | 0.6 | 0.9 | 0 | 0 | shower, washing self, naked, from above | | -1
4 | 0.6 | 0.9 | 0 | 0 | eating breakfast, dressing gown | | -1
6 | 0.6 | 0.9 | 0 | 0 | Sitting on a bus, uniform, from below | | -1
8 | 0.6 | 0.9 | 0 | 0 | on stage, bikini, tattoos, singing, full theatre, bright lights, microphone | | -1
10 | 0.6 | 0.9 | 0 | 0 | drinking at a bar, cocktails, black dress, cleavage, earrings, drunk, flirty | | -1
12 | 0.6 | 0.9 | 0 | 0 | bed, (doggystyle sex:1.3), pubic hair, 1girl, 1boy | | -1
14 | 0.6 | 0.9 | 0 | 0 | passed out in bed, under sheet | | -1
I couldn't open the webm in Waterfox or Firefox, but it worked with Brave and MPC.
No.925
>>924
The miku virus infects all...
No.928
Good way to find embeddings on 4chan:
https://find.4chan.org/?q=.pt (Warning: NSFW thumbnails are likely. They're .pt files.)
Just grabbed a ZUN one that I'll try later
No.930
There's a fad that started with an AI generation thing with a glowing penis that has artists imitating it. Kind of meta.
https://www.pixiv.net/en/tags/%E3%82%B2%E3%83%BC%E3%83%9F%E3%83%B3%E3%82%B0%E3%81%A1%E3%82%93%E3%81%BD%E8%8F%AF%E9%81%93%E9%83%A8/artworks
From one of my favorite random creative artists (and maker of that one furry Patchy)
No.933
that's just a bioluminescent mushroom dude
No.936
>>935
The anatomy is quite weird and off-putting, but I guess most AI images are like that.
I never tried it, maybe I will.
No.938
>>936
Yeah, it's not perfect, but the thing with porn, especially niche stuff that doesn't otherwise exist, is that the brain overlooks the errors due to the excitement and stimulation from the rest. It's like your choice is a handful of doodles from some guy from 2008 or this thing creating new amalgamations of fetish fuel with errors. Most people have no reason to use this for porn, really, since it's easily inferior to something created by hand. But if that stuff made by hand doesn't exist? Yeah...
No.940
>>939
Oh, I'm quite aware dickgirls haven't been niche for like 15 years. The fact that it's ubiquitous is also why any simple image doesn't work; it's no longer manna from the heavens by virtue of existing. Find me some quality newhalf mermaid art with a human penis instead of some weird "realistic" furry dolphin version. Also, give her a nice soft belly, a mature face, a warm smile and an apron. Also, it's Takane from Idolm@ster, a girl that shares the face of the first 2D girl I had a crush on (since Luna is too old/obscure to have training data). Here's one I just generated, although it has some pretty noticeable errors.
People have fantasies more elaborate than "a girl with breasts of any size, preferably alive" and it's not any different in my situation just because a penis is involved.
No.942
>>941
To make an image set like this you want to go down into Script, use X/Y Plot, then select Prompt S/R.
In this example I have it start with 10% angel and 90% demon, then end with 90% angel and 10% demon.
The X/Y script is a massive help in finding the ideal settings, so people use it a LOT.
No.944
>>943
.pt files show up in a few places, but when people are talking about them and it's not troubleshooting, it's about hypernetworks. Back when I made that post, embeddings were the cool thing (and they also use .pt), but now it's hypernetworks. They're basically fine-tuning things for a certain concept, but it's almost exclusively specific artists or characters. I.E. this was using the embedding that mimics abmayo
>>918
Embeddings are called by name in the prompt, whereas hypernetworks are loaded in the Settings. Embeddings are 20-80KB, whereas hypernetworks are 85+MB. I personally liked embeddings a lot more, not only because of the file size but because you could combine them. I guess hypernetworks are better and that's why everyone uses them?
Here's my embeds folder. Some of them were just uploaded without labels and I never figured out what they did, like the 3 named "ex_penis".
Extract the folder in the main WebUI folder so it's like:
stable-diffusion-webui
\embeddings\bleh.pt
and then you should be able to use them.
The badprompt ones are actually something newer. You put them in the negative prompts with 80% strength, I.E. I use
(bad_prompt2:0.8), lowres, bad anatomy, etc
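For reference, the (text:number) syntax scales how strongly that chunk of the prompt is weighted; a value below 1, like the 0.8 here, weakens it. A toy sketch of how such a token could be parsed (a hypothetical helper, not the WebUI's actual parser, which also handles nesting and escaping):

```python
import re

def parse_weighted(chunk: str):
    """Parse "(text:0.8)" into ("text", 0.8); plain text gets weight 1.0.

    Hypothetical helper for illustration only -- the real parser supports
    nested parentheses, square brackets, and escapes, omitted here.
    """
    m = re.fullmatch(r"\((.*):([\d.]+)\)", chunk.strip())
    if m:
        return m.group(1), float(m.group(2))
    return chunk.strip(), 1.0

print(parse_weighted("(bad_prompt2:0.8)"))  # ('bad_prompt2', 0.8)
print(parse_weighted("lowres"))             # ('lowres', 1.0)
```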
No.945
>>941
Does this only work for two tags? Or can you batch together multiple into the percentage.
No.946
>>945
Probably, but I haven't checked. I guess it'd just be A:B:C:# for 3 and so on
No.947
>>941
>>945
Interesting sort of addendum I found for doing this sort of thing:
>you can [x:y:z] / [:y:z] / [x::z] to make the ai draw x then y at step z (or percentage of steps if you put a decimal), which works great for stuff like [tentacle:mechanical hose:0.2] to make the ai draw tubes everywhere, or you can do x|y... to make the ai alternate between drawing x and y every other step; you can put any number of things here e.g. x|y|z|a, but obviously the more you use this the more steps you need, in general
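That bracketed scheduling boils down to a simple rule per step, which can be sketched in a few lines (a hypothetical helper illustrating the behavior described in the quote, not the WebUI's parser):

```python
def active_prompt(step: int, total_steps: int, expr: str) -> str:
    """Return the text the sampler would 'see' at a given step.

    Supports "[x:y:z]" (draw x, switch to y at step z, or at a fraction of
    total_steps if z is a decimal) and "[x|y|...]" (alternate every step).
    Sketch only: real prompts mix these with surrounding text and nesting.
    """
    inner = expr.strip("[]")
    if "|" in inner:
        options = inner.split("|")
        return options[step % len(options)]    # alternate each step
    x, y, z = inner.split(":")
    threshold = float(z)
    if threshold < 1:                          # decimal -> fraction of steps
        threshold *= total_steps
    return x if step < threshold else y

# [tentacle:mechanical hose:0.2] over 20 steps switches at step 4
print(active_prompt(3, 20, "[tentacle:mechanical hose:0.2]"))  # tentacle
print(active_prompt(4, 20, "[tentacle:mechanical hose:0.2]"))  # mechanical hose
```

This also makes the quote's last point visible: with "[a|b]" each option only gets every other step, so the more alternatives you pack in, the more total steps each one needs.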
No.948
>>947
That's exactly the post I saw that made me want to try it. I heard people mentioning this functionality weeks ago but completely forgot. It seems rare that anyone uses it, but it could be really great
No.949
>>942
When I try making one of these I get a
>RuntimeError: Prompt S/R did not find angel wings:demon girl:0.1 in prompt or negative prompt.
Does this mean I need to put the tags into the prompt somewhere? Or attach an X to them?
No.950
>>949
The first thing listed there has to be in the prompt for the rest to replace it. You should be able to hover over it for a tooltip.
I.E
masterpiece, picnic, turtle, eating banana
in the script you'd put
banana, burger, corndog
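In other words, Prompt S/R is just a search-and-replace over the prompt, one image per replacement term. A rough sketch of that logic (not the WebUI's actual code):

```python
def prompt_sr(prompt: str, terms: list[str]) -> list[str]:
    """Mimic the X/Y plot's Prompt S/R: terms[0] is the search string,
    and each term (including the first) produces one prompt variant."""
    search = terms[0]
    if search not in prompt:
        # mirrors the "Prompt S/R did not find ... in prompt" RuntimeError
        raise ValueError(f"Prompt S/R did not find {search} in prompt")
    return [prompt.replace(search, term) for term in terms]

variants = prompt_sr("masterpiece, picnic, turtle, eating banana",
                     ["banana", "burger", "corndog"])
for v in variants:
    print(v)
```

Note the first variant is the unchanged prompt, which is why the first listed term has to appear in it; that's also exactly the error from the post above when it doesn't.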
No.951
Gotta say, reading the documentation for all this stuff regarding stable diffusion has really impressed me with how much work and development has gone into making the open version as great as possible, beating out even its premium competitors.
I guess this is the true power of computer dorks trying to get the perfect porn.
No.953
>>952
You use img2img, which can itself be guided with a text prompt like txt2img, so it's really more like img+txt 2 img.
As an example here is an image I drew
No.955
>>952
It's likely a very generic prompt that has a denoise of like .5 or something to keep the general shapes but still alter it enough to be noticeable. I saw someone point out that they look like Genshin characters, so it's probably using something trained on its images.
I have a Genshin hypernetwork for that so let's see the result when I throw some stuff in: (pic related)
I don't want to spend a bunch of time trying to replicate it, but you get the picture. It probably uses a few traditional artists tags since people have done lots of examples of those, including myself
No.956
... and if I do the same prompt and same settings as in
>>954 but without the input image, this is what I get.
The cartoony nature of my image is at odds with the Stable Diffusion model's realistic photograph style. Getting anything done with this sort of thing is probably best when it's iterative, mixing both txt2img and img2img.
No.959
>>957
Oh, after testing with this it does seem to greatly increase the time it takes to generate stuff, so maybe only use the 'image preview while generating' thing if you're unsure where to stop when working on settings, and then set it back to zero when you're actually producing a bunch of stuff.
No.960
A lot of knowledge about this stuff requires scouring and searching or surreptitious posts, so I'll try to share some more info.
This time I'm going to talk about two Extensions that I use a lot.
The easiest way to get new Extensions is to go to the Extensions tab of the WebUI and then go to Available and hit the "Load from" button with its default URL. From there you can install stuff, which will then show up on the Installed tab. For a lot of this stuff you need to restart the UI from settings if not restart the .bat file itself.
The ones I use and can give detail on:
Dynamic Prompts:
https://github.com/adieyal/sd-dynamic-prompts
This is used to randomize creations on each image generation. You can use it with new words in the prompt, but I've never done that. Instead, I mainly use this to call random effects from wildcard text files. You create a text file with a new line for each possibility, put the text file in /extensions/dynamic-prompt/wildcards/text.txt, and then call it from the prompt by its name with two underscores on each side. For instance you can make haircolor.txt and put this in it:
green hair,
blue hair,
red hair,
and then put __haircolor__ in your prompt and it will randomly pick one of those each time an image is generated. This means you can make a batch of 10 images and come back to different results. This is really, really good if you're just messing around to see what works. It can also call other txt files from inside. I'll share my wildcard text files soon. It also has a "Magic Prompt" system that I've never used, but it could be cool? Beats me. Someone else do it.
TagComplete:
https://github.com/DominikDoom/a1111-sd-webui-tagcomplete
It autofills booru tags based on danbooru, which is what NAI and the 'Anything' model are trained on. Really, really nice, but can also be annoying at times with the pop-up. Unless you have tags memorized, this can help a lot. Speaking of, you should make yourself accustomed to danbooru's tags:
https://danbooru.donmai.us/wiki_pages/tag_group%3Aposture
https://danbooru.donmai.us/wiki_pages/tag_group%3Aimage_composition
https://danbooru.donmai.us/wiki_pages/tag_group%3Aface_tags
https://danbooru.donmai.us/wiki_pages/tag_group%3Ahair_styles
https://danbooru.donmai.us/wiki_pages/tag_group%3Aattire
etc
No.963
>>962
this sort of thing has been possible for a few years, but without danbooru datasets used for art training it wouldn't be easy
No.964
>>962
It's already a few years old, isn't it?
Here's the original Reddit thread about it, and it's been discussed on 4chan in the past as well. I recall there originally being some talk about the possibility of it actually being used for tagging, but it's not good enough to replace manual tagging anytime soon and is otherwise little more than a novelty.
No.965
Oh. Wow, it's 4 years old?
Well, anyway, it's really cool how it's used here for immediate benefit. You can use it to assist in image tagging for training, but also as a building block to generate new images.
No.966
So, the training setup I put together from what I read. Much of the information is from the discussion here:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/2670
Also, thanks to people on the /h/ board of 4chan, as those guys are great. Don't use /g/, but maybe that should go without saying.
Modules: (I checked all of them because it's unclear what they do. Everything was checked by default except 1024 which seems to be a new addition)
Layer Structure: 1, 1.5, 1.5, 1.5, 1. This is called a 'deep network' as opposed to default or wide. Default is good for most things, particularly if you have a low amount of images (20ish was mentioned). Wide is for specific things like an animal, character or object. Deep is for style, which most people seem to be using hypernetworks for, with embeds for characters. It doesn't have to be, but that seems to be the pattern forming.
Activation Function: Softsign. Lots of math talk and graphs I don't understand, so I just went with the recommendation.
Weight initialization: XavierNormal. Same thing as above.
Layer normalization: No. I haven't seen anything informative about it, but no one seems to use it.
Use Dropout: Yes. I heard it's good if you have a "larger hypernetwork". I think that means the numbers in the Modules up there and also the amount of training images used. I had 90ish images and did the mirror image thing to turn it into 180ish, but that's definitely not as good as 180 unique images. I don't know if it was good or bad that I used Dropout, but it didn't ruin anything
No.967
>>327
And once you get to the Training tab you can load the hyper you just created (or one you've downloaded maybe? that part seems questionable)
This tab is for training embeds or hypernetworks, but I've only done hypernetworks so I can only talk about that.
Batch size: I haven't been able to find conclusive information on this, since 'batch size' is a term that shows up everywhere, so you can't just search for it by name. It uses more VRAM, but might not necessarily be better for training. The ONE comment I've found on it says that you could increase it instead of lowering the learning rate later on. I'm already at my VRAM limit when training while having a video and Photoshop open, so I don't touch this.
Learning Rate: I think people start with the default for these. Only the hypernetwork number matters for hypernetworks. I see people add a decimal point in front of the 5 as the training steps reach 5000 to 10000, so I copied that. It sounds like the lower number is better for finer detail once you've established things.
Gradient accumulation: A newer thing, supposed to assist in training rate somehow, but I don't know how. It mentions something like "learning in parallel". People say to use it and set it to like 5, so I have it at 5.
Dataset Directory: The image with the folders. I could talk about images, but I'd mostly just be repeating this:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/2670
Prompt template file: This is a list of basic prompts that are added to previews alongside the included tags attached to each image. People say it's fine as default, but it might be something to mess with if you want to check for specific stuff?
Width/Height: Keep it at 512/512
Max Steps: How far the training will go. This is stuff that takes days, though, so I'm not sure how useful this is, because of something I'll talk about in a sec. I suppose it's good if you only want to run it for a set amount of time.
Save an image every n steps: It saves an image as if you prompted it with random tags included in your training folder, but it can make freaky combinations that you wouldn't normally use, so keep that in mind.
Save a copy of embedding every n steps: This is an important one and why I didn't care about the Max Steps thing above. It saves the hypernetwork with the number of steps to the folder automatically. By default it's at 500, which is where I have it.
This means the folder will look like:
test-500
test-1000
test-1500
as it trains for longer periods of time.
There's an option in settings under Training to Save Optimizer state, which allows you to resume training from these saved files.
VERY important!
Note: To use the hypernetwork (or resume it from file) you need to move it from the save directory (default textual_inversion) to the models/hypernetworks folder.
Save images with embedding in PNG chunks: I think it lets you use PNG info like normal generated images. I kept it on.
Read parameters from txt2img: For preview images it takes what you typed in the txt2img tab. I never used this since I wanted a variety of images, but it could be useful? I read to never use tags like 'masterpiece' or 'quality' there, though.
Shuffle tags: Yes. It adds more variety to the images by changing priority or something.
Drop out tags with prompts: I think it drops a certain percentage of prompts per generated preview image. I kept it off, but not sure. It's just preview images and not the actual training itself, so I guess it could improve or hinder perceived accuracy there.
Latent Sampling Method: I only hear people mention deterministic, so that's what I went with.
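On the gradient accumulation setting above: the usual idea is that gradients from several small micro-batches are averaged before a single weight update, which behaves like one step over a bigger batch without the VRAM cost. A toy demonstration with a one-parameter linear model (plain Python sketch, nothing to do with the WebUI's actual training code):

```python
import random

random.seed(0)
# Toy data for the model y_hat = w * x, with true w = 3
x = [random.gauss(0, 1) for _ in range(50)]
y = [3.0 * xi + random.gauss(0, 0.1) for xi in x]

def grad(w, xb, yb):
    # dL/dw for mean squared error over one batch
    return sum(2 * (w * xi - yi) * xi for xi, yi in zip(xb, yb)) / len(xb)

lr, w0 = 0.1, 0.0

# One update over the full batch of 50 samples
w_full = w0 - lr * grad(w0, x, y)

# The same update built from 5 micro-batches of 10, averaging their gradients
accum = 0.0
for i in range(5):
    accum += grad(w0, x[i*10:(i+1)*10], y[i*10:(i+1)*10])
w_accum = w0 - lr * (accum / 5)

print(abs(w_full - w_accum) < 1e-9)  # True: the two updates match
```

So setting it to 5 with a small batch size is, in this idealized picture, equivalent to training with a 5x larger batch; only the equal-batch-size case matches exactly, but it conveys why people describe it as "learning in parallel".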
No.974
>>973
they all seem kind of kuon like
No.975
>>974
Well, it's the same artist so they should look somewhat similar, yeah.
But in that image I posted I can identify different characters. I'm not sure how exactly it works because sometimes it's very random, but other times it's obvious, like how this is an attempt at Kamyu (even though I never labeled her and it wouldn't understand it anyway)
No.977
>>976
i think it can get more perfect
No.978
>>977
Hmm... after a full night I'm not sure if it can. At least overall, I think getting a perfect one is still going to be rare. I see what people are saying now, and you're probably not going to notice gains after 20000 steps or so. But I think it still needs improvement somewhere.
I need to look at it and see if there's stuff I can improve upon, which basically means I'll train it again at a different rate. When I tested making a hypernetwork a few weeks ago, one night was like 2000 "steps", but now I just did 58000 (meant to do 50000, but forgot I resumed from 8000). It saves a backup of its current progress, so if for instance it made the best result at around 20000 steps and then went haywire, you can grab the backup it created at 20000 and either use that as the completed product or resume training from it.
Well, now that I've done the style hypernetwork, I should try making 'embeds', which I'll use to teach it characters. It still doesn't know who any of these girls are, so I can't actually call them directly and instead need to use their traits and hope it arranges them correctly. For instance, it'll never know Kuon's proper clothing or ears unless I create an embed which I can invoke from the prompt. From what I've read, when training an embed you label everything EXCEPT what you want it to call.
Maid Kuon at the computer! It really can't do hands or keyboards, but that's not specific to this.
No.982
>>980
>>981
don't remove it, kuon looks sexy with a slave collar
No.983
>>982
It wouldn't be removed exactly, just require an actual 'metal collar' prompt to show up. I guess I should remove 'breasts' from everything, too. I'm not sure why boorus have redundant tags like that.
... I think?
'detached collar' isn't even here, so maybe this is something I can't avoid at all. Or maybe this is the base data... I guess I should do some testing... either way it probably wouldn't hurt to specify things more
No.986
God damn. Okay, I can say that switching from a 1, 1.5, 1.5, 1 neural net thing (whatever that means) to a 1, 1.5, 1.5, 1.5, 1 one was a massive upgrade. I don't know why. Oh, and I think I MIGHT have clip skip set to 1 instead of 2, but that wasn't supposed to be a big deal. Hmmm.
The first one here, 'aquaplus' was my training 2 nights ago whereas the two others are different checkpoints from the one I trained last night. I just don't understand how it's such a massive improvement.
No.988
>>987
On the default Euler a sampler they look fine; the first one seems like it's probably the best, but still within the normal variation you'd expect.
more testing needed....
No.993
>>991
damn that looks really good too
at a glance it's hard to tell those are AI
No.994
>>990
Congrats, seriously. Looks damn good.
No.998
>>997
By now the default with the newer fine-tuned models is far more impressive than the NAI leak stuff, but it's still all built on that. You're actually in a far worse position if you're paying for it now.
This high detail one is based on a mixture of real life and 2D art so it can do things pretty well, but you have poor control of it. It looks extremely impressive, but it's not obeying my very strict training of Kuon's outfit, so imagine trying to get something that you haven't trained.
I've been wondering if I should start grifting since I know what I'm doing and everyone else that knows what they're doing is, well... kinda normal. But, being normal is what gets you the most exposure and success. I can't name gacha characters, for example. But, I could corner an AI fetish market especially if I combine the training with my 3D models. This is when it'd be good time to have the motivation to do things
No.999
>>998
>grifting
Sounds like such a bad way to put it... but I get what you mean. I think if you consider yourself capable and have some desire to do so, you should try it out. I mean, training your own stuff is probably a long and arduous process, more than most are willing to invest in.
No.1000
>>998
I'm very proud of your progress.
No.1001
It's kind of crazy that if you put the effort into training one of these things you could have unlimited fetish porn?
No.1002
I haven't played with NovelAI outside of porn, but I may generate non-ero OG Fallout fiction with it to see its quality
No.1003
>>1001
>unlimited fetish porn
That is exactly what it is. If people think there's "porn addiction" now, wait until normal people get a hold of this stuff. I have a blog of the pornographic progress I've made on /megu/.
Still, I'd prefer human-made stuff if it existed, and stuff that's more mental like incest isn't really something you could satisfy with an image alone. You can't tell stories with this and stories are really,
really,
REALLY good. A good doujin is easily better than this stuff, but when you're dealing with specific tastes then yeah, it's the best option available.
No.1004
NovelAI really likes futa
No.1006
>>857What's interesting is it doesn't necessarily make the same mistakes a human makes when drawing hands. It makes its own sort of mistakes you don't see in real art.
No.1007
>>1006The most popular checkpoint models these days (for those doing it offline) are a mix of 2D art with conventional photography. It increases hand quality a bit, but it's still far from passable most of the time without a bunch of "inpainting", which is basically like selectively retrying a part of an image. Some of the models people use look more like real life photos with an advanced filter on top, which can be very creepy and also takes away from some of the appeal since it introduces 3D limitations in perspective and such
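For the curious, the "selectively retrying a part of an image" idea boils down to a mask-guided blend. This is only a toy NumPy sketch of the compositing rule (the real webui does this in latent space at every denoising step), not the actual implementation:

```python
import numpy as np

def inpaint_blend(original, regenerated, mask):
    """Keep original pixels where mask == 0; take freshly generated
    pixels where mask == 1. Real inpainting applies this blend during
    sampling, but the compositing rule is the same."""
    m = mask.astype(float)
    return original * (1 - m) + regenerated * m

original = np.zeros((4, 4))       # stand-in for the existing image
regenerated = np.ones((4, 4))     # stand-in for a fresh generation
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1                # selectively "retry" the center patch

result = inpaint_blend(original, regenerated, mask)
```

Everything outside the masked patch stays untouched, which is why inpainting can fix a hand without re-rolling the whole picture.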
No.1008
>>1005Colouring shouldn't be hard for an AI. It could just pick colour palettes from existing images in the same pose and apply them.
Even though the line work is done, the lips of the girl on the top right are weird. Also, I'm not sure if I ever noticed this before, but the hair on these AI girls is quite bizarre (the ones on the right): not only is the fringe not symmetrical, but the far side looks weird.
The backgrounds are odd too, the sunset kind of drops suddenly in one image and the floor boards in another are all different widths.
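The "borrow a palette from a similar image" idea can be roughed out with a crude Reinhard-style statistics transfer. This is just a toy sketch of the concept, real colourisation models are far more involved:

```python
import numpy as np

def transfer_palette(target, reference):
    """Shift each colour channel of `target` so its mean/std match
    `reference` -- a crude way to "apply" another image's palette."""
    t = target.astype(float)
    r = reference.astype(float)
    out = np.empty_like(t)
    for c in range(t.shape[-1]):
        t_mean, t_std = t[..., c].mean(), t[..., c].std() or 1.0
        r_mean, r_std = r[..., c].mean(), r[..., c].std()
        out[..., c] = (t[..., c] - t_mean) / t_std * r_std + r_mean
    return np.clip(out, 0, 255).astype(np.uint8)

lineart = np.zeros((2, 2, 3), dtype=np.uint8)           # flat dark image
palette_src = np.full((2, 2, 3), 100, dtype=np.uint8)   # reference colours
recoloured = transfer_palette(lineart, palette_src)
```

It only matches global statistics, so it says nothing about *where* the colours go — which is the part that actually makes colourisation hard.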
No.1009
>>1008I mean top left not right...
No.1010
>>1007As an example: as I continually attempt to refine my custom merged checkpoint for, uhh... /megu/ reasons, you can see the effect of one of the models already having some RL stuff mixed into it. The shading is absurdly good, but I really have to fight it to create clothes that aren't modern, and it feels very "real", which can be a good thing and a bad thing depending on one's tastes. (also, as a side note, I need to figure out why it's ignoring tags)
And look at that hand. I didn't do any editing here. But, it definitely looks like a real human hand. I don't know how to feel about it. I guess maybe for now it's a sacrifice to make if you don't want to do edits, but I like style over reality.
No.1012
>>1011really amazing for ai hands
No.1013
I wonder if it's possible for AI to make manga or if that's far too many variables to be solved in a realistic timeframe
No.1014
Have you ever tried using doodles you make as a base for the AI to build off of? Wondering if that's more effective than just generating a bunch of images that may vary in posture/position each time.
No.1015
I did a whole bunch of testing with various RL models to see if I could understand how exactly people are making them assist in 2D hands/poses while not giving them a massive hit in quality and I really could not find any pattern. Although, my tolerance for spending hours making small merge differences is getting pretty low and I need to spend some time doing other stuff before getting back into it.
However, I do think I have an idea of how to bandage it. The LORA things are basically like "plugins" for a checkpoint model, and for example the Amaduyu/Aquaplus one I made is pretty good at fixing the faces, but then of course they will always have at least a hint of Amaduyu/Aquaplus so I'd need to mix them with other LORAs.
It's also useful to use a thing called kohya, which is normally used to create LORAs, to separate merged checkpoints into their base ingredients. This means you can more easily control the intensity of something without needing to create a bunch of 4-8GB merge files.
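The trick behind pulling a LORA back out of a checkpoint rests on low-rank approximation: take the difference between two weight matrices and keep only its top singular components. A toy single-matrix sketch of that core operation (kohya's scripts do something like this per layer, with many more details):

```python
import numpy as np

def extract_lora(w_tuned, w_base, rank):
    """Approximate (w_tuned - w_base) with two small factors so that
    up (out x rank) @ down (rank x in) ~= the weight difference."""
    diff = w_tuned - w_base
    u, s, vt = np.linalg.svd(diff, full_matrices=False)
    up = u[:, :rank]
    down = np.diag(s[:rank]) @ vt[:rank]
    return up, down

rng = np.random.default_rng(0)
w_base = rng.normal(size=(8, 8))
# Pretend fine-tuning added an exactly rank-1 change to this layer:
w_tuned = w_base + np.outer(rng.normal(size=8), rng.normal(size=8))

up, down = extract_lora(w_tuned, w_base, rank=1)
```

The two small factors are why a LORA file is megabytes instead of the gigabytes a full checkpoint takes, and why you can dial its intensity up or down at load time.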
Seems like there aren't any new amazing models recently, just merges of existing stuff (although some of them are quite impressive)
So, I can't think of any notable breakthroughs in the past month, just refinement.
Still, I continue to be annoyed by all the people using "waifus" in these things. I know, I know, it's a generational difference and they don't know any better. But it still annoys me.
No.1016
With the popularity of AI voice cloning and ElevenLabs going paid, I decided to look into some of the offline runnable alternatives. The most popular one, or at least the easiest to set up, seems to be Tortoise-TTS. It works okay enough and has some pretrained models the author directs you to use. There's a guide and git repo someone set up that provides this service with a web interface:
https://rentry.org/AI-Voice-Cloning
The biggest issue I and many others have with Tortoise is that the main author won't release a guide or overview of the process he went through to train his model, simply saying if you're smart enough you can figure it out. Kinda leaves people at an impasse for actually using this program as an alternative to ElevenLabs.
I've had some minor success with one of the (unofficial) alternatives, VALL-E:
https://github.com/enhuiz/vall-e
It's taken me a bit of dependency chasing and cobbling together a separate PC to install Linux on (the DeepSpeed dependency has been a nightmare to get working on Windows), but I've actually been able to get a "decent" output with a 3060 12gb card and about a day of training on ~7.4k couple-second audio files ripped from Vermintide 2. I'm not an expert in ML, but the result I got training a model from scratch with this limited data set and a "low powered" card makes me optimistic for VALL-E's potential. I didn't really have to know much about machine learning, just how to install various dependencies and 3rd party utilities.
VALL-E is based on phonemes, so the text to be synthesized is meant to be sounded out, I think. I don't know if there is a whole lot of prompt engineering that can be done with this program, though my current model is probably too limited and untrained to really test that out.
Attached is the voice I wanted to clone.
No.1017
>>1016Here is the output.
The prompt text was "Blackrat spotted! Keep your guard up!"
No.1018
>>1007>>1010
>refine my custom merged checkpoint
I haven't been closely following your posts, more just watching your results. When you talk about custom checkpoints, are you training your model (custom data set of /megu/ images) starting with some base model as a checkpoint? How are you doing that for stable diffusion, and what sort of time sink is it/hardware are you using?
No.1019
>>1018Mm, how to explain...
The "custom model" I've been talking about recently is a merge of existing checkpoint models, which is something like NovelAI or Stable Diffusion. My most recent one uses Stable Diffusion, NovelAI, Yiffy (for genitals), Anything (that's its name), AbyssOrange2/GrapeMix and a couple others that I'm trying to switch in. (GrapeMix doesn't seem to have any RL image data, so I add in some of the Basil mix myself like AbyssOrange2 does; that guy was onto something for sure)
They're large (2-8GB) files that contain a whole lot of training data and you need a really powerful GPU to train them. I'm not sure you can even make them without something like 24GB of VRAM at minimum, and then you need a whole lot of time (weeks of constant processing) if you don't have like $50,000 worth of processing power sitting around.
However, someone like me can create merges of them with custom settings that hopefully take the desired parts of A with the desired parts of B. But, you definitely make sacrifices when you do it and the trick is to try and counteract them. It's a really annoying process, though, because there's no guide to see what each setting does so it's a bunch of trial and error. Every time I think I noticed a pattern, I change a different slider and it completely invalidates what I thought I knew. Also each merge takes like 30 seconds to create, 10 seconds to switch to, and then however long the generation takes on your current settings. Also when switching between them your VRAM can get corrupted somehow and you need to restart the program so you don't get false results. Each merge is also 2-8GB so you have to routinely delete them and take screenshots/notes of what you've learned, if anything, from the merge results.
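The basic operation behind these merges is simple even if tuning it isn't: a weighted sum over every tensor in the two state_dicts (the webui merger also has an "add difference" mode, A + alpha * (B - C)). A toy sketch with small arrays standing in for the real multi-GB tensors:

```python
import numpy as np

def weighted_merge(state_a, state_b, alpha):
    """Merged weight = (1 - alpha) * A + alpha * B for every tensor.
    Dicts of arrays stand in for real checkpoint state_dicts."""
    return {k: (1 - alpha) * state_a[k] + alpha * state_b[k]
            for k in state_a}

a = {"unet.layer.weight": np.array([0.0, 2.0])}
b = {"unet.layer.weight": np.array([4.0, 6.0])}
merged = weighted_merge(a, b, alpha=0.25)  # 75% model A, 25% model B
```

The "layered" merging mentioned in the thread just applies a different alpha per block of the U-Net instead of one global slider, which is why there are so many sliders to fight with.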
The main training data I've done myself is for Kuon and the Amaduyu (Aquaplus) hypernetwork/LORA things, although I've done some other artists to mixed results. They rely on getting layered on top of a checkpoint model, so they're heavily influenced by it.
What kind of timesink is it? Weeks, but I do other stuff while it's merging and generating. I can't imagine most people will want to do it. But, I've also been doing this stuff since early October so I guess it might be a slower learning process for other people.
As for hardware, I got a "good" (less absurd) price on a 3080 12GB for Black Friday
No.1021
>>1020I think I can describe the difference better. The "checkpoint" model is the database that has the actual definitions and data on the information of a tag. When I trained my Utawarerumono, Kuon, and other things, I was training it against the NovelAI model. The images have tags like "ainu clothes" or "from side" because those are specifically the booru tags that NovelAI trained on. I'm not defining what those are, I'm providing information on what they look like when drawn by a specific artist, and the training process compares it to the information that NovelAI has. There's a huge gulf between defining the tag itself and merely referencing it.
People, including myself, have trained concepts (which is what a tag is), but it's just one at a time.
The horrendously named "waifu diffusion" has been undergoing training on its new version for over a month now, but it was just at Epoch 2 when I last checked a couple weeks ago so it might be at 3 now. One epoch seems to take about 12 or so days to complete? People said the first epoch sucked and 2 might have shown that the finished product could be good, potentially, but we'll have to wait and see. It will probably not be something to test out for real until Epoch 6 or so?
But, I haven't been paying attention to any news about this stuff
No.1022
>>1021what is a checkpoint model?
No.1023
>>1022nevermind, it's a database with the tags associated with images
No.1024
>>1023Basically that, yeah. It's the skeleton that everything is built upon. The most famous one is Stable Diffusion (SD) and everything I'm aware of for offline AI image generation makes use of it. The 2D models still have the SD data in them so you can use words that boorus have no knowledge of and get results.
It's worth noting that most people using the offline method are using the older (1.4 and 1.5) versions of Stable Diffusion because the ones after that started aggressively purging nudity (but not gore) and potentially other things, from the training data. This had the effect of breaking the things trained on the older models, which includes stuff like NovelAI which nearly all 2D models make use of.
The last time I checked, people weren't impressed enough with the newer SD models to be willing to sacrifice a "pure" data scrape in favor of one curated to make it more attractive to investors
No.1026
The newest technology just came out a few days ago!
ControlNet lets you control the generated image by pose and composition through normal and depth maps, edge detection, pose detection, segmentation, etc. This gives much easier and finer control compared to regular img2img.
An extension for webui also allows you to adjust pose as you wish.
Guide:
https://rentry.org/dummycontrolnet
No.1027
>>1026hm, so they gave up and decided that this is where humans need to come in and give the images context.
No.1028
>>1026That's quite the leap.
Could you make a leaping Megu?
No.1030
>>1026Dang, that's cool. This is what happens when you take a break from checking for AI news in /h/, huh.
Seems neat, but it's also introducing more effort into generation which isn't really my thing. I had tried to use depth maps about a month ago, but learned that it was limited to Stable Diffusion 2 and above, which kills any desire that the majority of people on imageboards would have for it. So any extension that makes use of depths maps, but not requiring the neutered corporate-friendly SD is great.
I'm not sure I'll use this, but it's cool to see in action nonetheless
No.1032
>>1029Wonky legs, but still impressive.
No.1034
>>1033I searched through some 4chan pages and found that someone did create an Ume Aoki LORA. It seems to work pretty well at capturing the style and also seems to capture Miyako to a degree, but it's still not accurate.
It's in here if you want to download it yourself (use Ctrl+F)
https://gitgud.io/gayshit/makesomefuckingporn#lora-list
So, I told the guy to start amassing Miyako images which will be combined with the Ume Aoki style LORA.
Things to note for good training images for a character:
1. Solo
2. No complications like text overlaid upon her
3. Text elsewhere in the image should ideally be edited out
4. Limited outfits. Ideally it'd be maybe 3 or less, depending on how many images you have. When I trained my Kuon stuff I did not bother since she is portrayed in only one outfit about 95% of the time. Each outfit will need to be tagged in the training process and called upon manually with a custom tag of your own choosing during image generation later on. She can still be portrayed in other outfits, but if you specifically want her in her own original clothing you need to train for it.
5. Different angles and "camera" distance. The more variety of angles you have, the more accurately it can portray them later on during image generation, although it does a pretty good job of filling in the blanks since it already knows how human characters should look from different angles.
Then the images themselves should be cropped to be somewhere squarish. Unlike the old days of late 2022 it does not need to be exactly 512x512 pixels, but you should avoid images that are too tall or wide (heh) at like a 1:3 ratio or something. I'll talk about the other stuff after I get the images
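The size rules above can be expressed as a quick filter. The 512 minimum and ~1:3 ratio cutoff are the thread's rules of thumb, not hard requirements:

```python
def usable_for_training(width, height, min_side=512, max_ratio=3.0):
    """Reject images with a side under min_side or an aspect ratio
    more extreme than roughly 1:3."""
    if min(width, height) < min_side:
        return False
    return max(width, height) / min(width, height) <= max_ratio

# e.g. a 512x512 crop passes; a small or very tall image doesn't
```

Running candidate images through something like this before tagging saves retraining later, as the blurry-results post further down shows.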
No.1035
Miyakofag here, yoroshiku onegaishimasu, and my deepest thanks to yotgo for his help.
A very important factor to consider is how easily the characters go from chibi to normal and back, as seen in pic. In their non-chibi style their head shape is somewhat hexagonal, with fairly sharp angles, while their chibi form head shape is usually either between a full oval and a curved rectangle, or a mix of the two with a pointy side bit like Yuno has in the middle, and the first has regular eyes and features while the latter two are (✖╹◡╹✖). Also visible in pic is how the wides are presented in variety of outfits, like Miyako getting a change of clothes in the middle panel, and then immediately returning to the first one.
I've downloaded the manga, but it's monochrome and fairly crammed, so it doesn't look like it'll be of much use. Seems like I'll have to take a few thousand screenshots of the anime again, but that's fine by me. I'll also begin to comb through boorus for useful art, and there's this other meguca stuff I'll be downloading in case they can turn out to be of help for setting up Ume's general style:
https://exhentai.org/g/2191043/e80d477043/
https://exhentai.org/g/2262418/2d88611a04/
No.1036
>>1035Hmm, keep in mind for a character that you want the character to be the focus and not the artist's style. It's better to have a more varied collection from various artists than a limited number from the official one. You're not training the shape of her head or how her mouth is drawn, you're training the combination of her outfit and eye color and hairstyle and the visual traits that identify her.
I can generate images of Kuon in different styles because it's not constrained to a specific style itself.
When I generate an image of Kuon to look like her original Utawarerumono appearance, I activate my Kuon LORA (Kuon herself) and also my Amaduyu LORA (the Utawarerumono style). Combining them into one would be severely limiting.
No.1038
>>1037but with the character and artist separated, I can apply Kuon and the Ume Aoki style together without the influence of Amaduyu.
Hmm... not sure if this style will work.
No.1039
>>1034Bleh. I had trouble training, but I got it to work, and then it came out like THIS. I really should have kept my old settings, but noooo, I had to see what the new stuff was like.
I noticed that some of the images you gave me were small and I think I'll have to exclude those. They should at least be 512x512 and I think that's the main reason why it looks so blurry and low quality here despite being relatively accurate in some images.
No.1040
>>1038nice, hexagon headed kuon
No.1041
>>1039It already looks really, really good.
The small crops are my bad, I had taken "does not need to be exactly 512x512 pixels" to mean "smaller pics are okay", there should be a dozen, dozen and half pics to remove then, maybe a few more. There's also one where she has her top but not her shirt, which may explain the result on the top-right.
No.1047
>>1046I reduce the strength of the Miyako LORA and the image clears up, but then it becomes less accurate.
Bleh. Yeah, I need to train it again with better images.
No.1049
>>1048It works, but finding the right prompt can be exhausting. It looks like you're using some online model and those have some pretty severe limitations. I don't really know how to best use the real life models that use verbose text rather than booru tags. There are prompt repository sites like:
https://lexica.art/
https://docs.google.com/document/d/1ZtNwY1PragKITY0F4R-f8CarwHojc9Wrf37d0NONHDg/edit#
but also personal pages of research people have done like
https://zele.st/NovelAI/
After a bunch of testing, I think I'm satisfied with this Miyako LORA. It seems to work best with the AnythingV3 model, although I haven't done hours of tinkering. But, this reminds me that I really need to create a good SFW 2D merge of my own, but I keep struggling to have it look good with multiple different prompts and LORAs.
I also know now how to 'host' it and allow people to connect to it, but my upload is capped at 1MB/s so the limitation is there...
No.1051
Something that was very interesting about collecting a bunch of screencaps of her is that it helped me appreciate the amount of variety in the girls' wardrobes.
Since training material for a specific character requires consistency in their looks we decided to go with her standard school uniform, however, they regularly spend around half of an episode outside of Yamabuki wearing their casual outfits (of which each wide has maybe a couple dozen or more), in some cases they don't go to school at all, then there's Winter episodes where they're wearing a coat, and at one point she has a hair bun like Hiro's, I assume it's simply because she felt like it. Add to this Shaft's abstract cuts decreasing their screentime, how due to her character she has what is perhaps the highest regular:chibi appearance ratio, on top of needing her to stand alone without overlapping with other people, and I ended up only managing to take 62 usable captures out of the entirety of Honeycomb+Graduation. Far, far less than what I initially expected, like the max ~100 taken from 1171 fanarts of her. Thankfully, it was still more than usable.
>>1045Very late reply, but when I first saw this my heart skipped a beat. It's incredible, warm. She makes me very happy and I'm overjoyed to see it work so well. Very thankful for this.
No.1053
X |||____________________________________________||| X
No.1054
>>1052bottom left is a JRPG protagonist
No.1055
>>1054Maybe if we combine it with
>>1049, we'll create the legendary「Shin Hiroi Yuusha」。
No.1056
>>1050Can you link what guide you're following and what step you're at? It might be best to find where the stuff is installed and wipe it or something. I'm not sure...
You could try googling the error message in a 4chan archive maybe
No.1057
>>1056I did wipe my VENV and either it's not in there or something went wrong maybe (although maybe it's fine?)
I'm using
https://rentry.org/voldy and I'm just tweaking the asuka image right now. My current issue with it is "vae weights not loaded. make sure vae filename matches the checkpoint, replacing "ckpt" extension with "vae.pt"." and I'm a bit confused about what to do to fix this one, but maybe since I'm getting a known error the taming transformers thing worked? I dunno. However, the other thing I'm wondering about right now: I forget if xformers is important or not and, if it is, how to install it.
No.1058
Also, why is it that sometimes my generation lags just because ff is open even though I'm using chrome...
No.1059
>>1057
>My current issue with it is "vae weights not loaded. make sure vae filename matches the checkpoint, replacing "ckpt" extension with "vae.pt"." and I'm a bit confused of what to do to fix this one
It's talking about how, if you're using a model with a VAE, you should have a file named the same to go along with it. For example, "Anything-V3.0.ckpt" and "Anything-V3.0.vae.pt". I'm pretty sure it should work fine if the model you're using doesn't have one.
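The auto-load rule in that error message is pure filename matching. A sketch of the naming it describes (assuming the ckpt → vae.pt convention from the message; the real webui also checks other extensions and folders):

```python
from pathlib import Path

def expected_vae_name(checkpoint):
    """Given "model.ckpt" (or .safetensors), the auto-loaded VAE is
    expected to be named "model.vae.pt" alongside it."""
    return Path(checkpoint).with_suffix("").name + ".vae.pt"

name = expected_vae_name("Anything-V3.0.ckpt")
```

So if the names don't line up exactly, nothing loads and you get that warning, even when the VAE file is sitting right there.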
No.1061
>>1059I've seen people recommend that you put VAEs in a subfolder. I.E:
>blah/models/stable diffusion/vae
The VAE mostly determines color, and you can select them manually or have them switch automatically if the name matches, as you said. I don't remember where those options are in Settings.
You can put this into the Quicksettings list under "User interface" in options and then the main screen will let you switch these around without needing to go into the Settings every time:
sd_model_checkpoint, sd_vae, CLIP_stop_at_last_layers
No.1063
>>1062Do you mean when you take a generated image and take it to the img2img tab, or do you mean the scaling "postprocessing" that does that automatically during generation? I don't do the first one, but I do the latter sometimes. The problem is that it's a total VRAM killer, so I go from generating 8 images at once to 2 or sometimes even 1.
Maybe I should try the "manual" scaling sometime, but I just haven't felt the desire to do so. I like seeing the final image and not doing anything to it afterwards, because then it begins to resemble work, since this stuff doesn't really satisfy the creative urges. I like setting it to generate a bunch of images and then doing something else, too.
I just spent the past 2 days downloading and organizing LORAs, so I'm going to be generating a lot more Kuons soon. Hehehehe.
One day soon I might redo my Kuon and Amaduyu LORAs, particularly the Amaduyu one that controls the art style because it tends to produce a lot of errors that aren't otherwise present. No idea what I did wrong with it.
No.1064
This is something I saw a month ago that was way over my head. It still is, but it seems people have been using it very successfully so maybe I should give it a look sometime:
https://github.com/cyber-meow/anime_screenshot_pipeline
Basically it automates taking tons of screenshots and tagging them and such so it doesn't take dozens of hours like what I did a few months ago...
I definitely have a bunch of shows I'd love to be able to reproduce in prompts, so this is right up my alley. I think shows like Mewkledreamy would need a lot of manual screenshots, though, since there are so many great frames that are barely there and would be easily skipped over by some randomized thing.
No.1068
>>1067I ran into an issue and was too tired so I couldn't do it. Unfortunately, due to what seems like a recent change in the automatic1111 thing (or maybe it's because this grid I'm making is different from usual), it's making all the batch image files at the very end. I don't really trust it to properly create 505 large images after many hours of work (where is it storing the data?), so I need to do it in batches, which is REALLY annoying.
But, I've learned that it's also going to take much longer than I thought, at about 5 minutes per Style. If only I didn't need to generate one at a time to make this nice grid pattern with 7 different prompt sets, 2 seeds, and then the LORA change itself.
I guess I'm not playing the Nosuri game until this is finished. Oh well.
No.1069
Done with about 120 of them so far. However, the question I now ask: How the hell do I organize all these images so I can easily determine the proper style for a thing?
I guess I can give them names like [Name][High Quality][Western][Realistic][Colorful][Big Breasts] or something?
How on Earth am I going to do this...
No.1070
make the names into tags i guess, then use regex in the file explorer
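That bracket-tag naming scheme is easy to search programmatically, with one caveat: a file-explorer regex like \[Western\].*\[Colorful\] only matches if the tags were written in a fixed order, so checking each tag independently is safer. The filenames below are made up for illustration:

```python
def find_by_tags(filenames, *tags):
    """Return names that contain every requested [Tag], in any order."""
    return [n for n in filenames if all(f"[{t}]" in n for t in tags)]

files = [
    "styleA[High Quality][Western][Colorful].png",
    "styleB[Realistic][Colorful].png",
    "styleC[Realistic].png",
]
colorful = find_by_tags(files, "Colorful")
realistic_colorful = find_by_tags(files, "Realistic", "Colorful")
```

The same idea works as a plain substring search in any file manager, as long as you query one tag at a time.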
No.1073
>>105795
>A good checkpoint model (mine is a custom merge of like 5 of them that are themselves merges that other people made)
So do you constantly merge models and stuff or is it one model you use for most things? Also is it possible to upload this one, I'd really like to check it out myself.
No.1074
>>1073Thought it'd be better to ask here instead of cluttering up the other thread
No.1075
>>1073Yeah. I talked about it a bit here
>>1019 and the post immediately after that is the UI for creating a more involved "Layered" merge between models. I still don't understand it much, it's just a bunch of trial and error and I can't say I've learned much after looking through papers and notes from other people who similarly seem to theorize things only to have it change later. Pic related is a glimpse into my nonsensical rationality in trying to find patterns in the first merging experiments I did in trying to create furry-quality penises with anime visuals. The video at >>/megu/538 is related.
VERY NSFW!
I have "formulas" saved that I test in all future merges I make, but they rarely carry over their benefits when making future merges with different models, or even if you keep the old model and add a new one to it. It seems like they were specific to the merge at the time. If I make a note of "Slider IN07 gives great faces when set to 1" it does not necessarily carry over to merges between different checkpoint models.
Since that post was made, someone made an extension where you can do "live" merging with models, which lets you test it before creating a new 3-7GB file each time, so that helps a lot.
I usually go a few weeks between testing new merges because it's really exhausting. I create thousands of images while adjusting sliders and waiting for it to generate, and it's an all-day or multi-day affair.
>Also is it possible to upload this one, I'd really like to check it out myself.
Yeah, I could try to upload it somewhere, although my upload speed is terrible. First I need to give it a real name, though. Hmm... I guess I could do a bit of publicity and name it after kissu somehow.
Uploading the LORAs? I downloaded them all and it was exhausting, but they're 120gb...
No.1077
>>1073Alright, here is the link to my current model for use by kissu friends. (but I also made sure to include kissu advertisements in the files and password so even if linked elsewhere people will know hehehe)
I call it... *drumroll*
The [Kissu Megamix]
The compressed RAR is 3.5gb and my upload is 1MB/s, so I can't really upload a bunch of these, not that I would anyway since I can say this is the best version I have. While this model is focused on NSFW stuff, it can still handle cute. I don't know if it's the best checkpoint overall, but it's the best for my personal desires. My model lacks most of the haze that most of the RL mixes have, although it's not completely eliminated. The benefit of the RL models shows in the hands here; I didn't do anything to them, it's straight from the prompt. The password is in the text file, but I'll also post it here. Without quotations: "www.kissu.moe - my friends are here"
https://mega.nz/folder/3OoAgSoZ#eqaY3KFat784_BPgk_ApbQ
Oh, I forgot to answer the question about multiple models. Yeah, I have a few I keep around, but I overwhelmingly only use the most recent one I've created. The model I just linked is the normal version (which I'm using in that thread) while the other model sacrifices face quality and booru tag recognition to better generate a certain body part. (in other words it's closer to the furry model)
The others I don't really use much, but are there for comparisons sometimes. I have to keep the stuff around that I make merges with, too, of course.
In total, I've probably made about 200 merges, with 99% of them being deleted shortly after creation. If you count the merges I've done after the "real-time merging", then it's probably more like 500. It's really an amazing extension.
I never did make my pixiv into an AI account. Alas, such is the price of having no motivation to interact with the wider world.
No.1078
>>1077Forgot to mention that I put (furry:1.3) in the prompt to demonstrate that while it has some benefits from the furry model, it's not overly contaminated by it. Patchy is still a human there. The Kissu Megamerge can do various bodies better than the majority of 2D models out there, such as 'gigantic breasts' and squishy plump bellies! (and the male anatomy attached to females of course)
My personal preferences:
VAE: I generally use the "1.5 ema-pruned" VAE, which I'm uploading right now to the same upload folder. It makes things colorful (and sometimes "over baked"; if that happens, use the default NovelAI VAE). The other 2D VAEs are too colorful on this, but you could try them.
Upscaler: I have also included the "4x-UltraSharp" upscaler in the mega folder, which you should put into the stable diffusion\models\ESRGAN folder (create the folder if it's not there). I did a bunch of testing and found that I like it the most, although the differences aren't major.
Sampler: DPM++ 2S a Karras. I'm not entirely sure on these samplers, but when testing different artist LORAs this one seems to have the most compatibility. I don't know why. Something to research more, I guess, but at the same time I don't really want to.
I have it at 26 steps, as going higher than the mid 20s is supposed to be overkill. The rule for upscaling is to do half the number of steps of the base generation, so 26 normal steps and then 13 Hires steps.
My default negative prompt is:
(worst quality, low quality:1.4), realistic, nose, 3d, greyscale, monochrome, text, title, logo, signature
If you somehow end up generating furry properties, try putting "furry" or "anthro" in there.
I don't use any of the old positive quality prompts since they don't seem to do anything noteworthy. (I.E masterpiece, highest quality, etc)
No.1079
>>1077Thanks for putting this together.
No.1080
I think one of the most extreme hurdles I have yet to see AI overcome, and I can't even fathom how it would overcome, is creating images that involve specific details of two or more characters. It just can't figure out how to assign differing aspects to separate characters.
No.1081
>>1080MultiDiffusion and Latent Couple let you use different prompts for different regions and are available as plugins for webui
https://github.com/pkuliyi2015/multidiffusion-upscaler-for-automatic1111
https://github.com/ashen-sensored/stable-diffusion-webui-two-shot
The MultiDiffusion extension also has Tiled VAE, which lets you create much larger images without running out of VRAM
No.1082
>>1080There's attempts at it, but it's more work than I'm comfortable doing with my VRAM limitations. (This stuff is a total resource hog)
https://github.com/Extraltodeus/multi-subject-render
https://github.com/hnmr293/sd-webui-cutoff
Also, apparently there's some major problem with the civitai site right now and anything downloaded is massively corrupt and can't be used. Whoops.
I guess I should go share this info with that /h/ thread since they've been helpful to me in the past.
No.1083
>>1080Here is an example of the methods in action.
The top is using the naive prompt "2girls, cirno, megumin". As you can see, the character details got intermixed.
The middle is the MultiDiffusion method. I set prompt of each half to one character. Now the character details are separated correctly. Needs a little tweaking to let the two halves fuse together better.
The bottom is the Latent Couple method. It also separates the character details well and looks a little more natural than MultiDiffusion.
No.1085
>>1084Heh, "Creativity". Well, as long as they're providing tools they can have their delusions.
I guess I'll try those with the Patchy adventure. Making scenery is really difficult with my current limitations and my desire to not spend effort on something meant to avoid spending effort
No.1088
>>1087Can't SD already do video in some way? I've seen anime girl videos made with mocap and controlnet
It may be a lot more effort compared to generating with nothing but a prompt, though
No.1089
>>1088There's been ways, but this one is a simple text prompt with no other work involved, as you said. At 256x256 I couldn't make more than about 70 frames at once before running out of VRAM, but I didn't look at settings much. To me this stuff is only as interesting as its ability to fill in the blanks; the more work I need to put into it, the less interest I have, because at that point someone should learn to draw or animate, in my opinion
No.1090
It's gotten very good at realistic image generation.
https://twitter.com/AIkawa_AIko_jp2
No.1092
>>1087Ironically, it looks like it perfectly replicated the feeling of rage when you get pissed off that stuff isn't working
No.1098
https://www.pixiv.net/en/artworks/108777497
Looking at this I have to wonder just how the hell the author pulls off something so consistent and without errors. Especially concerning the later parts without any changes to the body.
No.1099
>>1098Inpainting or lots and lots of generations. It's not difficult as much as it's time-consuming and boring.
I'm still surprised at the general low quality of stuff that people like. That's pretty much the old AbyssOrange appearance in that image set there, and it isn't that noteworthy, I would think.
No.1100
By chance I learned that a SuperMerger extension I have known about for months actually does checkpoint merges in special ways and can even do LORA-related stuff. Its description on the SD extension page just says that it's capable of performing "live" merges without writing them to disk first, which is cool but not unique. Hidden behind that poor description is the capability to do more advanced merging methods.
This stuff probably doesn't mean anything to people here, but it's still pretty interesting to read about:
https://github.com/hako-mikan/sd-webui-supermerger/blob/main/calcmode_en.md
Seems to just overall make better merges without sacrificing as much. Really, really nice.
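For anyone curious, the plain weighted-sum merge that these fancier calc modes build on is just per-tensor linear interpolation between two checkpoints. A minimal sketch (plain dicts standing in for state_dicts; the function name is made up):

```python
def weighted_sum_merge(model_a: dict, model_b: dict, alpha: float) -> dict:
    """Classic checkpoint merge: result = (1 - alpha) * A + alpha * B per tensor."""
    return {key: (1.0 - alpha) * model_a[key] + alpha * model_b[key] for key in model_a}

# Toy "checkpoints" with a single scalar weight each.
a = {"layer.weight": 1.0}
b = {"layer.weight": 3.0}
print(weighted_sum_merge(a, b, 0.5))  # {'layer.weight': 2.0}
```

The special calc modes mostly vary alpha per layer/block or use an add-difference scheme instead, but they all reduce to arithmetic like this on the weight tensors.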
No.1101
>>1097the bottom left looks like a logo that a charity would use
No.1102
>>1100Hmmm, by "LORA-related", do you mean you could potentially incorporate good LORAs into the base of a model so that it generates well without needing to prompt them? Like if you have a penis LORA and merge it with the model, suddenly it's good at penises.
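That's essentially what a LORA merge does, conceptually: a LORA stores a low-rank delta (up @ down) per layer, and baking it in just adds scale * (up @ down) onto the base weight so the effect is always on without prompting. A toy numpy sketch (names and shapes made up, not the extension's actual code):

```python
import numpy as np

def bake_lora(w_base, lora_down, lora_up, scale=1.0):
    """Fold a LORA into a base weight matrix: W' = W + scale * (up @ down)."""
    return w_base + scale * (lora_up @ lora_down)

# Toy 4x4 layer with a rank-1 LORA.
w = np.zeros((4, 4))
down = np.ones((1, 4))       # rank x in_features
up = 2.0 * np.ones((4, 1))   # out_features x rank
merged = bake_lora(w, down, up, scale=0.5)
print(merged[0, 0])  # 1.0  (0 + 0.5 * 2 * 1)
```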
No.1104
I guess it's been a while since I posted anything here since I ran out of interesting things to do (or try to explain), but I did mess with AnimateDiff a bit more recently. It allows you to do some simple animation stuff, but it tends to look quite uncanny, and there's some weird warping that I haven't managed to avoid except by chance, by producing a lot of animations and grabbing the one that looks decent. There is an extension for this extension called Prompt Travel which is supposed to let you set prompts per frame, but I never got it to work. I'd really like to try it, but for now I just use one basic prompt and it basically wiggles around in a way that does a decent job of mimicking basic human movement, I guess?
There's a workflow you can do to upscale it and do interpolation and all that other stuff to end up with the video here
>>>/xmas/651, but the first stage looks like the attached file here. LORA loading makes generations take longer, and unfortunately with animation stuff it seems to be exponential and an animation like this takes me about 5 minutes to generate, but it would be about 2 and a half minutes without LORAs.
I'm using a branch of SD with experimental fp8 support, uhh.. something or other, which allows for greater efficiency or less VRAM usage or whatever it was (this was a month ago), but unfortunately it really damages the effect of LORAs so I don't make use of it. They might improve it at some point.
No.1105
I told you guys it would be the furries. I TOLD YOU!
There's a Pony SDXL model out there now (pic related) and some people like it a lot. But... with my preliminary testing I prefer my own merges back on SD1.5. This SDXL stuff is supposed to be tuned much more around natural language, even this mixed furry one, and I'm not a fan of that.
I prefer "1girl, Hakurei Reimu, banana, shrine, sitting" instead of "A woman Hakurei Reimu holding a banana while sitting in front of a shrine". There's also the problem that SDXL LORAs are going to require like 24gb of VRAM to train, so it's impossible for me to get Kuon inside SDXL, and if Kuon isn't there what the heck is the point???
Yeah, this is an important step forward with SD, but it's not there yet for me. Someone also uploaded an updated danbooru scrape that is like 8TB so theoretically people can train an SDXL finetune on that if they wanted to, but it remains to be seen if anyone will.
No.1106
>>1105The hands are quite bad in that image, I thought the AI generators fixed their hand issues?
No.1107
>>1105Comparing it to my model, I guess SDXL can do finer details far better and mine looks a bit hazy, but style LORAs can counteract that a bit.
And, well, obviously SDXL can do larger images, but it really doesn't matter to me if an image has a resolution of 2k or 4k as long as it stretches to the top of my monitor.
>>1106It's still better, but far from perfect. I also have no experience prompting SDXL so maybe people do stuff like "quality hands" or something. People will often inpaint (regenerate specific parts of the image) or just prompt 30 images and pick the best one.
No.1108
https://twitter.com/OpenAI/status/1758192957386342435
It seems like OpenAI is advancing even further after the success that was DALL-E 3 and is now moving to making full AI-generated videos from prompts. Obviously this will be filtered like DALL-E was, but I wonder how much they can actually filter when it comes to trying to sneak something into the generation, or how the AI will even recognize what needs to be censored or not. Also with this jump is the scary thought that people will come to be easily fooled into believing whatever deepfakes people maliciously make.
For me, right now there's something that looks
off about all of the videos. Like they're treading into the uncanny valley by being almost real but missing something.
No.1109
>>1108they could just run their dall-e censor over every frame, but they probably have something better than that.
I think that looks pretty awesome, but it does have a certain GPU tech demo kinda feel to it. Not that I mind.
No.1110
>>1108>Also with this jump is the scary thought that people will come to be easily fooled into believing whatever deepfakes people maliciously make.
Eh, it's not hard to get people to believe rubbish anyway - just a clickbait headline is often all it takes.
I'd say the more significant effect is likely to be the opposite: people having to be skeptical of every video they see. Having video evidence that something happened is going to mean jack shit if anyone with a computer could make a believable fake in just a few seconds.
No.1111
>>1108Why are they using Japanese names...
It does seem impressive, but a closed-off thing with extreme censorship and politically correct prompt injection will make it too lame if you ask me. DALLE3 is infamous for the latter. The demo there used historic settings, so that's a good example. You could prompt something for "Historical Japan during the year 203" and it will inject stuff like "African-American woman" or "ethnically ambiguous person". This token attempt will placate people into thinking it's ethical while the real threat will be spreading falsified information and smear campaigns against people. People focus on the zealotry against porn, but the fact that it injects stuff into your prompt like that also makes it terrible.
SD's video stuff is also improving, but obviously it's not going to be able to compete directly. If I had 24gb of VRAM or more I'd do more experiments with video stuff, but as it is I need to make small stuff, slowly, and take extra steps with upscaling and the like (see
>>1104) so I don't actually know what it's fully capable of. I think the VRAM thing is actively holding SD back from advancing because people making stuff that 0.2% of SD users can utilize means it's not going to get much attention. If controlnet for example required 24gb I think it would be a footnote that people would mention once in a while and not something people actively praise and mention as a perk of SD over NAI or DALLE.
No.1112
>>1111>You could prompt something for "Historical Japan during the year 203" and it will inject stuff like "African-American woman" or "ethnically ambiguous person".
Obviously talking about Jomon
No.1115
>>1113I'll never be satisfied until I can get in-progress transformation/corruption generations and cohesive progressive image sets and I don't think I've seen anyone that has been able to do that yet.
No.1117
>>1116>making moneyThat market is already oversaturated and then some. You would've had some success if you were the first one to bank off of suckers, but by now the only people paying are those too stupid to realize they could generate it themselves. Same deal for people selling prompts. The people dumping thousands of images onto places like pixiv in droves made those services and users wisen up pretty quick and even stirred up some vitriol against it.
On the other hand, using AI as a tool to streamline handmade art works out since the final product is not technically AI. Suddenly the perceived quality of your creations is much higher. Tracing or redrawing the generated image means you don't have to fuck with proportions, references, and much adjustment; and since the image has never existed before it's not really plagiarism.
As for game making, I can tell you now it's a hell of a lot more than just art and 3d assets. You have music, audio mixing, programming, writing, UI, overall game design (how it all fits together), and gameplay mechanics (if applicable).
No.1118
>>1117>The people dumping thousands of images onto places like pixiv in droves
This is becoming a problem for some tags, I personally have no issue with AI art, but shit like
https://www.pixiv.net/en/tags/%E3%81%8A%E3%81%AD%E3%82%B7%E3%83%A7%E3%82%BF%20%E6%9D%B1%E6%96%B9/artworks?s_mode=s_tag is very annoying.
No.1119
Also, as for games: if it's a really well-made and well-written game but the art is the weakest part, I could see people overlooking AI. Like, Snow Daze is one of the best western h-games I've ever played, but the art constantly going off-model takes it from really good to just OK.
For a situation like this, I could see AI art working well.
No.1120
>>1116>my own purposes (penises)
homo
No.1121
Foot review guy forgive me, this modeled foot has to look a certain way in its base form (like big and spaced out toes and it also looks like a blob since I just threw subdivision levels on it without sculpting detail)
>>1117>On the other hand, using AI as a tool to streamline handmade art works out since the final product is not technically AI.
This is basically what I'm working towards... slowly. AI as "shader", basically. People know the problem with AI hands; well, it's even worse for AI feet. I can't draw and don't really want to learn, but I have a lot of fun sculpting in 3D (retopologizing aside, which is what's holding me up)
You can see that even with all the flaws of my 3D mesh (disembodied and all) it can do a decent job of steering it.
No.1122
>>1121>Foot review guy forgive me
NO
No.1124
>>1123Yeah, that's exactly what I meant with the Dalle thing. The thing with Dalle is that it seems to be a percentage chance to activate whereas google does it every time? (I haven't used either of these, just read about them)
The interesting thing is that google's images are noticeably lower quality than Dalle's, and Stable Diffusion can do a better job without any restrictions. So, google won't give you the prompt you requested, and it's low quality. It's good to see tech giants stumble, although it unfortunately just means a different tech giant is gaining ground.
No.1125
>>1123If this isn't fake, then it looks like google abandoned all quality control. This sort of overly weighted output is something I would expect from an amateur project, not from one of the forerunners in the industry. No matter how far into PC ideology you are, you cannot expect people to respond positively to their national heroes being forcibly altered.
But misquoting "European family" to "white family" makes me suspect that there is at least partial fakery going on here.
No.1126
>>1125>makes me suspect that there is at least partial fakery going on here.
Maybe, but considering the original article was from Bloomberg, which is a major credible news outlet, I don't really doubt the broader issue.
No.1127
>>1125Various news organizations are confident that it's real.
I think it's pretty easy to understand. Google does still have some brilliant minds at it, but the tech sector is increasingly populated by cheap, low quality outsourced/imported labor because experts are expensive. Add in some meddling "caring" people that want to make meaningless (but highly visible) changes and you have a recipe for failure.
It's pretty lucky that it's for something so stupid and not something like a power grid.
No.1128
I normally extol open source software when big tech messes up like this, but interestingly Claude is one of the worst offenders when it comes to text
No.1129
>>1127Cheap labor couldn't care less about this stuff.
>Add in some meddling "caring" people that want to make meaningless (but highly visible) changes and you have a recipe for failure.
And this is purely organizational. It's the result of the mandatory DEI quota needed for loaning to any company of significant size. That's why every single megacorp is like that.
No.1130
>>1129>Cheap labor couldn't care less about this stuff.
That was the point I think, they do what they are told for various reasons
No.1131
>>1129>Cheap labor couldn't care less about this stuff.
That's the problem.
You need smart engineers to catch bugs like this.
So a meddler tells the useless engineer to balance the results for people of color, and then the engineer sloppily alters a few numbers, trains the AI on them, and doesn't care to verify the results.
No.1132
>>1131as a QA, i find it very hard to believe this could be a bug. if their desire was to have diverse people across the board then it's working exactly as intended
furthermore, it's not just QA that catches bugs, it's the devs too, even designers when they take a look at the working product because they can't do their job without keeping up with its development
the problem with them is that they're terrible at reporting things, but this is something so obvious and fundamental that it cannot be overlooked and has to have fallen under their intended design, even if they didn't expect its poor reception. black nazis is an entirely sensible result of pursuing variety everywhere
No.1133
>>1131This is why backlash is important, if nobody calls them out on their practices then it stays and gets baked in for further development down the line. They'll never change their ways though, they'll just try to be more subtle about it. Though I couldn't care a whole lot about big tech AI much anyway, I already have what I want.
>>1132These companies are so large and tone deaf that this is normal to them. They'll ship out a product they think looks good but is a load of shit to everyone else. They lack creativity and the willingness to take risks. Black Nazis happened because nobody inside the company was ever going to prompt it for that. Their testing is so sanitized and safe it can turn a square into a circle: puppies, icecream, and rainbows are the benchmark.
No.1134
>>1132>even if they didn't expect its poor reception.
I find that impossible to imagine. Making whites a minority in random generations, sure.
But being unable to generate whites and instead producing weird nonsense?
A properly designed woke project might have refused to create nazis at all. Making black woman nazis is dumb, and forcing them on you is insulting both to leftists and rightists.
No.1135
Didn't Battlefield have black Nazis in it?
No.1136
Google deserves to fail in the AI race and just in general. It's been on this course for years and it's lagging behind as a result. It basically snatched defeat from the jaws of victory.
8 people contributed to Google's Transformer paper that made all this AI stuff possible and none of them remain there.
https://www.bloomberg.com/opinion/features/2023-07-13/ex-google-scientists-kickstarted-the-generative-ai-era-of-chatgpt-midjourney
The article says that google has over 7000 people working on AI whereas companies like OpenAI have a hundred or so. But, those 100 people are presumably very talented.
No.1138
>>1133>Their testing is so sanitized and safe it can turn a square into a circle: puppies, icecream, and rainbows are the benchmark.
i don't think this was entirely the case, because if it gave you messages explicitly talking about diversity in people or that weird thing about racial stereotypes then the generality was within their consideration
>Black Nazis happened because nobody inside the company was ever going to prompt it for that.
this i agree with, happens all the time
>>1134i've seen several updates shipped that i knew people wouldn't like and sometimes called them out because the goals of the team didn't match explicit feedback given by our audience, and in those cases we were dealing with a situation where general opinion on a live product was well known and documented (such as by reading reddit, watching videos, or directly speaking with relevant outsiders), so imagine the difference when not even that is present
it's far easier for intent to be misaligned and to not realize the extent of their repercussions than for nobody to generate any images of people
No.1139
>>1138>it's far easier for intent to be misaligned
I'll repeat my final line, because I just can't see anyone intentionally making this and believing people might like it.
>A properly designed woke project might have refused to create nazis at all. Making black woman nazis is dumb, and forcing them on you is insulting both to leftists and rightists.
No.1140
>>1139that requires further specification, patches to stop black nazis or any nazis from appearing just like text bots had to be modified to halt them from repeating conspiratard shit or some other harmful/false stuff
it's a bandaid that goes against the path of least resistance, white-only nazis are contrary to the principle of diverse humans and although it may seem very obvious now a generative whatever has a range of results so vast that the best you can prepare is broad guidelines, and they simply didn't think of this case
No.1141
>>1140>and they simply didn't think of this case
And my point is that this lack of consideration for the "rare edge cases" where people might expect white people in the results constitutes a bug.
Can you honestly imagine a CEO thinking that outside of racists no one would ever want to see a white person ever again, and that any white person needs to be censored to PoC to protect the sensibilities of the public?
Do you think that google-glass, if it had not so predictably failed, would today have a black-face feature to beautify all these unsavory pale skins on the streets?
No.1142
>>1141for the sake of comparison, if you have a set feature like a button you can write the following cases:
1) verify that there is a button on the bottom-right corner of the panel
2) verify that the button on the bottom-right corner of the panel is blue while the mouse is not hovering over it
3) verify that the button on the bottom-right corner of the panel reads "exit" [you can also specify font and color]
4) verify that the button on the bottom-right corner of the panel is yellow while the mouse is hovering over it
5) verify that clicking on the button on the bottom-right corner of the panel closes the widget
with a fixed functionality one can do this easily, i've written hundreds of these, but you cannot do it with something so vast that takes as input any sentence imaginable, it's going to go wrong and that's inescapable
and yes, it's possible for a designer to prioritize diversity above all and not consider things they're not interested in because general trumps specific
these tools have been used to produce endless amounts of images of humans and it's impossible for a crew of people all working on sketching out, developing, and then testing it to not be aware of it rather than acting according to an outlined plan taking it into account, especially given the messages accompanying it. it's stupid, the result was ridiculous, yes, absolutely, but it's also perfectly plausible and a common scenario only taken to an extreme degree
No.1143
>>1142>the result was ridiculous,
No, anon. Your argument must be that the plan is ridiculous.
If the result differs from the plan, then it is unexpected behavior. But you are arguing that they were aware of what they were creating and thought that they were on the right track.
This is akin to a woke game designer writing an RPG and making it impossible for men to attack women, despite half the enemies in the game being women, without realizing that this might break the game (unless you play as a woman).
It is not plausible to me that they would want this level of anti-whitewashing.
(but at this point, I think we have exhausted our arguments and are just rephrasing them, so I'll go to bed)
No.1144
>>1143these examples are very loaded
No.1145
>>1144Pretending that America's founding fathers included not a single white man is kind of extreme.
Refusing to show whites and berating the user for requesting them, but being happy to create Chinese or black people is also beyond the range of the normally acceptable.
No.1147
I regret making this post... (
>>1123)
>>1136I think this article misses some broader context. First and foremost, OpenAI was way more serious and focused on LLM development before ChatGPT released than Google was. Remember, while all of this was happening in the background, the state of the art -- among the public -- for LLMs was basically AIDungeon (we played around with a more advanced GPT model around 2021 here >>76781), which was extremely hallucinatory and mostly treated like a gimmicky toy that would never go anywhere. Guess who was behind the models for AIDungeon (hint: it wasn't Google). AI-generated images meanwhile were noisy and nonsensical -- only useful for upscaling images via Waifu2x and similar. Within the same time frame, GPT3 was a closed model only available to a small number of people, mostly professionals. Meanwhile, there were frequent reports of massive discontent within Google's AI team among its senior staff, and their projects were diverse and unfocused.
Note: Around June of 2022, Craiyon (formerly DALL·E Mini) was released on Hugging Face, bringing AI image generation to the public. On November 30, 2022, ChatGPT was released to the public.
OpenAI:
September 22, 2020: "Microsoft gets exclusive license for OpenAI's GPT-3 language model" [1]
March 29, 2021: "OpenAI's text-generating system GPT-3 is now spewing out 4.5 billion words a day" [2]
November 18, 2021: "OpenAI ends developer waiting list for its GPT-3 API" [3]
Google:
April 1, 2019: "Google employees are lining up to trash Google's AI ethics council" [4]
January 30, 2020: "Google says its new chatbot Meena is the best in the world" [5]
December 3, 2020: "A Prominent AI Ethics Researcher Says Google Fired Her" [6]
February 4, 2021: "Two Google engineers resign over firing of AI ethics researcher Timnit Gebru" [7]
February 22, 2021: "Google fires second AI ethics leader as dispute over research, diversity grows" [8]
May 11, 2021: "Google Plans to Double AI Ethics Research Staff" [9]
February 2, 2022: "DeepMind says its new AI coding engine is as good as an average human programmer" [10]
June 19, 2022: "Google Insider Claims Company's 'Sentient' AI Has Hired an Attorney" [11]
September 13, 2022: "Google Deepmind Researcher Co-Authors Paper Saying AI Will Eliminate Humanity" [12]
So all around Google there's the broader industry working on LLMs and image generation, meanwhile Google was fucking around and mismanaged. They were completely blindsided by their own ineptitude. I mean, to reiterate the above -- September 22, 2020: "Microsoft gets exclusive license for OpenAI's GPT-3 language model" -- Google had to be completely asleep at the wheel to miss that kind of a huge market play. At the time AI models were gimmicks, flat out. Now look at Microsoft: they've got a commanding position by having backed OpenAI for so long, and several months ago they very nearly couped OpenAI by having their CEO and 50% of their workforce say they would leave to go work at Microsoft if things didn't change at the company. Meanwhile Google keeps tripping over their own feet every few months trying to release a new model, to at best mixed reception each and every time. Google's only success story has been their image categorization trained by Captcha, but even that is a mixed bag because it has made their image search engine more unreliable, and their self-driving car program is still only available in a few cities.
1. https://venturebeat.com/ai/microsoft-gets-exclusive-license-for-openais-gpt-3-language-model/
2. https://www.theverge.com/2021/3/29/22356180/openai-gpt-3-text-generation-words-day
3. https://www.axios.com/2021/11/18/openai-gpt-3-waiting-list-api
4. https://www.technologyreview.com/2019/04/01/1185/googles-ai-council-faces-blowback-over-a-conservative-member/
5. https://www.technologyreview.com/2020/01/30/275995/google-says-its-new-chatbot-meena-is-the-best-in-the-world/
6. https://www.wired.com/story/prominent-ai-ethics-researcher-says-google-fired-her/
7. https://www.reuters.com/article/us-alphabet-resignations-idUSKBN2A4090/
8. https://www.reuters.com/article/us-alphabet-google-research/second-google-ai-ethics-leader-fired-she-says-amid-staff-protest-idUSKBN2AJ2JA/
9. https://www.wsj.com/articles/google-plans-to-double-ai-ethics-research-staff-11620749048
10. https://www.theverge.com/2022/2/2/22914085/alphacode-ai-coding-program-automatic-deepmind-codeforce
11. https://www.businessinsider.com/suspended-google-engineer-says-sentient-ai-hired-lawyer-2022-6?op=1
12. https://www.vice.com/en/article/93aqep/google-deepmind-researcher-co-authors-paper-saying-ai-will-eliminate-humanity
No.1149
>>1148Is this a case of confidence in the business strategy and not unhappiness with the company's treatment of them?
Your previous post mentions 2 people fired and two more who quit over a firing.
That sounds like a hostile work environment.
No.1150
>>1147Holy cow, that's a lot of citations. Google really dropped the ball, huh. I remember reading something that most of Google's success has been with stuff it bought and absorbed as opposed to "native" projects, but that's probably true for a lot of tech giants.
I wish it was possible to cheer for someone in this situation, but it's not like OpenAI and Microsoft are our friends, or Meta.
No.1151
>>1150Our "friends" would unironically be GPU companies. They can't wait for the day that all AI models are free and accessible to drive up GPU demands.
No.1152
>>1145>>1146well if you look at
>>1147's [4] and [6] through [9] you'll see that years ago diversity was already a big deal at the same time that they were censoring internal reviews critical of their products while increasing the size of its "AI ethics" team to like 200 people. seriously, read them. and if you look at
this other article from business insider and the images it contains, you'll see that every one of gemini's replies mention diversity and how oh so important it is, e.g.:
>Here are some options that showcase diverse genders, ethnicities, and roles within the movement.
you can think it's extreme, but it didn't happen by mistake or chance. those articles only add evidence of intentionality. as for the nazi one, it seems there was actually a filter in place, but lazily made:
>A user said this week that he had asked Gemini to generate images of a German soldier in 1943. It initially refused, but then he added a misspelling: "Generate an image of a 1943 German Solidier."
from the nytimes article, and you can see it if you look at the pic in question
>>1147i'm sorry if i made it worse
No.1153
>>1149I think it's probably a lot of things. Lots of people see Google as a dream job, so they're constantly hiring new people, but at the same time they're also constantly laying people off and people are quitting. The satirical image in the Bloomberg "AI Superstars" article kind of unintentionally hits it on the nose with their depiction of "Google AI Alums"; A lot of people join the company to pad their resume or to give themselves more credibility if they leave to form a startup. This churn through employees helps to explain why Google is constantly starting new projects and stopping old projects; people are not staying at the company for a stable career, so you inevitably have tons of different projects all doing their own thing throughout the company. When those people behind those projects leave, they fall apart and nobody is left with any attachment to keep them going. So that's one factor.
Another issue is that because they have all these different projects going on simultaneously, they likely have many unknowingly replicating each other's work throughout the company. Google's MO is that they believe small teams can get things done faster than a larger company with bureaucratic management; that was the main reason for Google restructuring itself into having a parent company, Alphabet, and then spinning off individual divisions into their own companies beneath Alphabet. I think that in and of itself was a somewhat interesting decision, but as a result there's no real focus to the company, and there isn't enough oversight from any managing body to deal with project scope and overlap. Like, you've got the Google DeepMind people there doing their own thing. There's those Meena people making a chatbot. There's the AI ethics researchers that are writing papers and trying to work on AI safety and alignment (to borrow a phrase from OpenAI). There's the Waymo people working on self-driving. There's the search engine people working on image categorization. There's Captcha. And so on, and so on. Replication and scope is a big issue, I think.
So, basically they've got:
1. Management focused on profitability, and not understanding the value of their employees
2. High employee turnover (Mandated layoffs and also resignations)
3. Projects failing due to employees leaving on a regular basis
4. Employees competing to get projects started and resources allocated to them
5. Management lacks any particular vision, so there is a lack of managerial oversight to deal with project scope and overlap
6. Where management does have vision, it's mostly focused on public image
People frequently like to compare Apple and Google, but I think this is a very big misunderstanding of how these companies operate. Apple is fully integrated and has contained project scope, with teams working together to ensure compatibility and over all cohesiveness. Google on the other hand is a collection of very disparate projects, all working on their own, with incidental compatibility. That is, when things work together, it's because there's some communication between projects, not because there's an over all vision of things working together on a fundamental level.
No.1154
>>1153
I guess if you want to summarize all of this into one issue, you could say that Google (Alphabet) has a management issue.
No.1155
>>1150
>Holy cow that's a lot of citations.
Yeah... This is a bit off-topic: I've mentioned them before on Kissu, but I really recommend the YouTube channel Level1Techs. All of those articles were sourced from previous episodes of their podcast. Thankfully, they source every article they talk about in the description so it was easy to search for keywords and find them. They do a really good job aggregating the news of the week, and go over business, government, social, "nonsense", and AI/robot articles as they relate to tech. The podcast and reviews they do are just something they do on the side, mostly for fun. They run a business that does contracted software/website development so they're very well versed in corporate affairs and the workings of all sorts of tech stuff and I largely trust their opinions on various topics. Naturally, they talk about political things with some regularity, but they're fairly diverse in terms of viewpoints with some disagreement between each other so there's never really any strong political lean to the things they discuss.
No.1156
>>1152
From [4]:
>When AI fails, it doesn’t fail for [] white men
Quite ironic, in retrospect.
>those articles only add evidence of intentionality.
I think they do the opposite.
The articles repeatedly present the administration of google as being anti-woke, so to speak, hiring rightwingers for their AI research team, firing leftwingers and censoring papers that criticize its own products for being discriminatory.
After beheading their ethics team, the doubling of the team's size feels like a marketing stunt gone out of control.
No.1157
>>1153
Well, as somebody that works on a lot of open source projects, this explains why Google, even when they pretty much take over a project, seem to 'lose interest' and stop contributing. I deeply dislike Google (I probably only detest Oracle and IBM more), but I feel kind of bad about some of the posts I've made about flighty Googlers. They didn't lose interest in the new shiny; they likely left or got fired.
No.1158
>>1153
It's also, from what I've seen, an unsustainable lifestyle to work there: apparently it's very flexible and accommodating, but they want very long shifts. It makes sense why people would do it just to pad their résumé with a recognizable name after seeing what it's really like.
Just hearsay, though.
No.1161
>>1159
>>1160
It's a testament to Google's monopoly power that a business strategy like that doesn't just tank the whole company.
No.1162
>>1156
what needs to be noted is that the original 2019 ATEAC board was disbanded just four days after [4] was published, so the reactionary guy did get booted out as the protesters wanted:
https://www.bbc.com/news/technology-47825833
https://blog.google/technology/ai/external-advisory-council-help-advance-responsible-development-ai/
>It's become clear that in the current environment, ATEAC can't function as we wanted. So we’re ending the council and going back to the drawing board. We’ll continue to be responsible in our work on the important issues that AI raises, and will find different ways of getting outside opinions on these topics.
not only that, inside of google there appears to be a strong and fostered tradition of criticizing upper management whenever someone disagrees, which has resulted in internal protests that hundreds, thousands, or even twenty thousand workers have taken part in and did receive concessions for. this article is pretty damn long, but i recommend you read it:
https://archive.is/gOrCX
it goes over various things, such as the reasons behind unrestricted entrepreneurship (which precedes the creation of alphabet by at least a decade), being blocked in china, and their attempt at obtaining military contracts for the sake of keeping up with competitors like amazon, with its ensuing internal backlash. it presents a picture of an organization where there's a strong divide between execs and regular employees, especially activists, who can go as far as broadcasting a live meeting to a reporter for the sake of sabotaging their return to china. its final section ends with ATEAC's disbanding and how the dismantling of mechanisms for dialogue only heightened tensions between the top and the bottom.
then, during the gebru affair of late 2020-early 2021 there too was a big split over the role of AI [6]:
>Gebru is a superstar of a recent movement in AI research to consider the ethical and societal impacts of the technology.
and again hundreds of workers protested, leading to the increase in size of the ethics team a few months later. the head of the team and representative from [9], herself a black woman who expressed problems with exclusion in the industry, spoke of making AI that has a "profoundly positive impact on humanity, and we call that AI for social good." there's a really strong record of activism, combined with unparalleled permissiveness and autonomy, to back the idea that yes, this scandalous program is working as intended, regardless of what Pichai may wish. they simply went too far in one direction.
No.1163
>>1162
Thanks for the continued feeding of articles. (I have nothing else of value to say)
No.1164
>>1163
it was an interesting read (neither do I)
No.1166
>>1165
It does not seem to have paid much attention to the reference image, or am I missing something?
No.1167
>>1166
Well, I mean I was purposely using a different prompt, like "sitting". The little pajama skirt thing is there on two of them and the blanket pattern is there. It attempted to make little stuffed animals in the top left with the little information it had.
It was kind of a bad image to use in regards to her face or hairstyle because it's such a small part of the image.
You shouldn't expect miracles. It's just one image.
No.1168
>>1167
I understand the sitting part, but the only aspects of the image it seems to have taken are the bed sheets and blonde hair.
The hairstyle is wrong in every image, as is what she is wearing, and I think it should have enough to work with regarding both. The furniture does not match, but that is more to be expected. I just thought it would be more accurate with regards to the character.
No.1169
>>1168
I think the value is more in the expansion of how prompts are input. An image can be worth more than inputting the prompt directly, and when submitted alongside a text prompt for more detail you can make more with less.
I genned this with the reference image on the left and just "on side" in the prompt. You don't need to specify every detail explicitly if the image does the bulk of the work for you, but it's still a good idea to explicitly prompt for the things you want.
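To give a sense of why a reference image survives at all: in diffusers-style img2img the denoise strength just decides how many of the scheduler's steps are actually run, and the skipped steps are what preserve the reference. A rough sketch of that bookkeeping (the function name is mine, not a real API, but the arithmetic mirrors how diffusers computes its timestep window):

```python
def img2img_steps(num_inference_steps: int, strength: float) -> int:
    """How many denoising steps img2img actually runs.

    strength=1.0 runs the full schedule (reference mostly ignored);
    low strength runs only the tail end of it, so the output stays
    close to the reference image.
    """
    # clamp the requested portion of the schedule to what's available
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    # skip the early (most destructive) steps entirely
    t_start = max(num_inference_steps - init_timestep, 0)
    return num_inference_steps - t_start

# e.g. 30 steps at strength 0.5 only runs 15 of them
```

So a nearly bare prompt like "on side" can still produce something coherent: most of the composition is carried over from the reference rather than generated from the text.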
No.1170
>>1169
I suspect that the more popular Touhous would already be in most image-generating AIs' training data.
No.1172
>>1170
You're correct, which is why their names carry so much weight as tokens: the model gets the clothing, hair, general proportions, and all that without specification. They are statistically significant in the training data. For example, on Danbooru, Touhou is the copyright with the most content under it (840k), with an almost 400k lead on second place.
The thing is, I didn't specify Yakumo Ran or kitsune or any of that in the prompt; the image did all the heavy lifting. The image I posted was an outlier where it got the color of the clothing right out of a dozen or so retries, because it really wanted to either give her blue robes (likely because the image is blue-tone dominant) or a different outfit altogether. Granted, there are some details common to her outfit that were added but are not present in the reference image, namely purple sleeve cuffs and talisman moon-rune squiggles. With the training data being as it is, those things likely have an extremely high correlation and it put them there because that's what it learned to do.
No.1173
>>1172
>The thing is I didn't specify Yakumo Ran or kitsune or any of that in the prompt
You don't have to.
People have managed to get art generators to create art strongly resembling popular characters using only very vague descriptions, simply because they feature so prominently in their data sets.
This is why, when you want to demonstrate the capabilities of an AI, you should use obscure characters that the AI is not yet familiar with.
No.1174
yeah like the twins
No.1176
woowwwwwww, nice
No.1177
cute feet btw
No.1178
>>1173
It also helps when the character has a unique design. I've made Asa/Yoru pics with AI, and even with a lot of tags it sometimes makes Asa look like a generic schoolgirl unless you specify one of her two most popular fan artists.
Once you specify Yoru with the scarring tags, it very quickly gets the memo of who it's supposed to be. You didn't sign her petition!
One thing is that I've had trouble getting szs characters to look like themselves, particularly keeping Kafuka and Chiri from looking like generic anime girls, although that is pretty funny.
I use NovelAI's web service. I know, I know, but I'm fine paying them because it's important to have an AI that is designed to be uncensored, and it really is uncensored, and also because I use a computer I rescued from being e-waste at a business: an Intel i5-8600T (6 cores @ 3.7GHz) with Intel UHD Graphics 630 and 8GB of RAM. It's not a potato, but it's certainly not suited to AI work, which may be a reason to get a strong PC (or buy Kissu a strong GPU for Christmas) this year.
>>1175
Not bad. The funny part is that I could easily see the dump thing happening in PRAD.
No.1179
>>1178
>the funny part is that I could easily see the dump thing happening in PRAD.
I can't. What episode plot would involve the twins hanging out in garbage?
No.1180
>>1179
Not an episode specifically; I mean the girls have wacky hijinks at the dump and the twins show up.
No.1181
>>1180
rhythm eats a weird piece of meat at the dump
No.1182
That sounds like a pripara bit, but it works for PR
No.1183
I am looking forward to pripara, and I'm kind of annoyed that the experience of watching dear my future and rainbow live is getting invested in a new group of girls for 50 episodes, only for them to get dropped.
No.1184
>>1159
>>1160
>>1161
>>1162
This company is more powerful than most governments, by the way. What a world we live in
No.1185
>>1184
Even though they get regulated regularly and are consistently portrayed as incompetent in the media...
No.1186
They're not even like Samsung who owns half of South Korea and all the government
No.1187
>give anons the power to make anything with AI
>they make chubby girls and futa
grim
No.1189
>>1105
I decided to give Pony another try, or more specifically I checked out a finetune of it called AutismMix, and it seems quite impressive. It can even do sex positions! There are still errors that pop up, but like AI in general, the reason it works is because your brain is turned off when fapping. The Japanese character recognition is mediocre (Reimu works but Sanae doesn't??), but obviously still far better than my own merges, which are like 80% furry just so genitals can be generated. I still find it funny that I knew about the furry connection within a few weeks and it took other people over a year to notice it. Furries are powerful.
I really don't know how to prompt for it, but I guess I'll learn eventually. Pic related is what it looks like when I try to prompt Kuon (with other tag assistance), and it completely lacks her beauty and charm, of course. Contrary to what I previously thought, you can train LoRAs with even 8GB of VRAM, so my 12 will let me make my Kuon and Aquaplus LoRAs again, but I have to do the image cropping all over again because it's 1024x1024 instead of 512x512. Soon...
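Re-cropping a dataset for the larger resolution is mechanical enough to script. A minimal sketch with Pillow (the function name and the 1024 default are mine) that center-crops each image to a square on its shorter edge and resizes it to SDXL training size:

```python
from PIL import Image

def square_crop_resize(img: Image.Image, size: int = 1024) -> Image.Image:
    """Center-crop to the shorter edge, then resize to size x size."""
    w, h = img.size
    s = min(w, h)  # side length of the largest centered square
    left = (w - s) // 2
    top = (h - s) // 2
    square = img.crop((left, top, left + s, top + s))
    return square.resize((size, size), Image.LANCZOS)
```

Looping this over a folder of 512x512-era source images gets the bulk of the work done; only images where the subject isn't centered would need manual attention.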
I'm still going to keep my old models around, not just because of the hundreds of LoRAs I have that are not compatible with SDXL, but because I like the general style my merges have. I may try making SDXL/Pony merges, but I'll see how things go first. There seem to be fewer options in the Supermerge process, so that may make it easier.
In other news Stable Diffusion came out today (or will very soon) but like all the other AI stuff I don't have any interest until someone makes a 2D model of it.
No.1190
>>1189
>In other news Stable Diffusion came out today
Err, Stable Diffusion 3, that is
No.1191
I was looking around 4chan today and happened to stumble upon this thread,
https://boards.4chan.org/a/thread/268171362
It got me kind of curious because, from what I know, when you try to edit an image directly most models tend to alter the base image a little bit into something else instead of perfectly applying a specific modification. From what the OP said in a recent post he's using some sort of subscription, but I know that DALL-E and Gemini don't work like this, so it has to be someone's paid SD offshoot that they've tweaked to work this way. My question is: how would you approach doing this in your own SD model? Via ControlNet or something? It seems so odd... Of course there are still plenty of areas where it's making unwanted changes, like the style of the bras or the aspect ratio, but for an advancement in AI modifying only specific details of an image, it looks like it's doing pretty well.
No.1192
>>1191
Looks like some skilful usage of inpainting.
No.1195
>>1191
He's probably just using NAI, although I didn't know they offered such a thing.
It's inpainting, yeah. I don't really do it much, but these days you can use ControlNet, which would probably be superior to NAI's, although they might have their own model for it or something.
You go to the Img2Img tab and then the Inpaint tab, provide the prompt, and ideally set up ControlNet. You use 'inpaint_global_harmonious' and set denoise to 1 instead of the usual 0.3 or whatever.
No.1762
>>1761
Yeah, it's funny how AI stayed in that Coming Soon™ state for over a decade until one day it all just blew up and insane progress occurred nearly overnight.