[ home / bans / all ] [ qa / jp / sum ] [ maho ] [ xmas ] [ f / ec ] [ b / poll ] [ tv / bann ] [ toggle-new / tab ]

/maho/ - Magical Circuitboards

Advanced technology is indistinguishable from magic


File:00353-2700800976-girl, fac….png (365.08 KB,512x512)

 No.843[View All]

Anyone else been messing around with the stable diffusion algorithm or anything in a similar vein?
It's a bit hard to make it do exactly what you want, but if you're either extremely descriptive in the prompt or just use a couple of words, it gives some pretty good results. It seems to struggle a lot with appendages, but faces come out surprisingly well most of the time.

Aside from having a 3070, I just followed this guide I found on /g/ https://rentry.org/voldy to get things set up, and it was pretty painless.
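For anyone following the voldy guide who'd rather drive the webui from a script: launched with the --api flag, AUTOMATIC1111's webui exposes a local HTTP endpoint. A minimal sketch using only the standard library; the payload fields are my understanding of the /sdapi/v1/txt2img API, so verify against your install:

```python
import base64
import json
import urllib.request

API_URL = "http://127.0.0.1:7860"  # default webui address when run with --api

def build_txt2img_payload(prompt, negative="", steps=20, cfg_scale=7.0,
                          width=512, height=512, seed=-1):
    """Assemble the JSON body for /sdapi/v1/txt2img."""
    return {
        "prompt": prompt,
        "negative_prompt": negative,
        "steps": steps,
        "cfg_scale": cfg_scale,
        "width": width,
        "height": height,
        "seed": seed,  # -1 lets the webui pick a random seed
    }

def txt2img(payload):
    """POST the payload and decode the first returned image (base64 PNG)."""
    req = urllib.request.Request(
        API_URL + "/sdapi/v1/txt2img",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    return base64.b64decode(result["images"][0])

# usage (with the webui running):
#   png = txt2img(build_txt2img_payload("1girl, face, portrait", steps=28))
#   open("out.png", "wb").write(png)
```

Same knobs as the UI (steps, CFG, resolution, seed), just scriptable for batch runs.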
303 posts and 169 image replies omitted.

 No.1147

I regret making this post... (>>1123)

>>1136
I think this article misses some broader context. First and foremost, OpenAI was way more serious and focused on LLM development before ChatGPT released than Google was. Remember, while all of this was happening in the background, the state of the art -- among the public -- for LLMs was basically AIDungeon (we played around with a more advanced GPT model around 2021 here >>76781), which was extremely hallucinatory and mostly treated like a gimmicky toy that would never go anywhere. Guess who was behind the models for AIDungeon (hint: it wasn't Google). AI-generated images, meanwhile, were noisy and nonsensical -- only useful for upscaling via Waifu2x and similar. Within the same time frame, GPT-3 was a closed model only available to a small number of people, mostly professionals. Meanwhile, there were frequent reports of massive discontent within Google's AI team among its senior staff, and their projects were diverse and unfocused.

Note: Around June of 2022, Craiyon (formerly DALL·E Mini) was released on Hugging Face, bringing AI image generation to the public. On November 30, 2022, ChatGPT was released to the public.

OpenAI:
September 22, 2020: "Microsoft gets exclusive license for OpenAI's GPT-3 language model" [1]
March 29, 2021: "OpenAI's text-generating system GPT-3 is now spewing out 4.5 billion words a day" [2]
November 18, 2021: "OpenAI ends developer waiting list for its GPT-3 API" [3]

Google:
April 1, 2019: "Google employees are lining up to trash Google’s AI ethics council" [4]
January 30, 2020: "Google says its new chatbot Meena is the best in the world" [5]
December 3, 2020: "A Prominent AI Ethics Researcher Says Google Fired Her" [6]
February 4, 2021: "Two Google engineers resign over firing of AI ethics researcher Timnit Gebru" [7]
February 22, 2021: "Google fires second AI ethics leader as dispute over research, diversity grows" [8]
May 11, 2021: "Google Plans to Double AI Ethics Research Staff" [9]
February 2, 2022: "DeepMind says its new AI coding engine is as good as an average human programmer" [10]
June 19, 2022: "Google Insider Claims Company's 'Sentient' AI Has Hired an Attorney" [11]
September 13, 2022: "Google Deepmind Researcher Co-Authors Paper Saying AI Will Eliminate Humanity" [12]

So all around Google there's the broader industry working on LLMs and image generation, meanwhile Google was fucking around, mismanaged. They were completely blindsided by their own ineptitude. I mean, to reiterate the above -- September 22, 2020: "Microsoft gets exclusive license for OpenAI's GPT-3 language model" -- Google had to be completely asleep at the wheel to miss that kind of huge market play. At the time AI models were gimmicks, flat out. Now look at Microsoft: they've got a commanding position from having backed OpenAI for so long, and several months ago they very nearly absorbed OpenAI outright when its CEO and most of its workforce said they would leave to go work at Microsoft if things didn't change at the company. Meanwhile Google keeps tripping over their own feet every few months, trying to release a new model to at best mixed reception each and every time. Google's only success story has been their image categorization trained by Captcha, but even that is a mixed bag because it has made their image search engine less reliable, and their self-driving car program is still only available in a few cities.

1. https://venturebeat.com/ai/microsoft-gets-exclusive-license-for-openais-gpt-3-language-model/
2. https://www.theverge.com/2021/3/29/22356180/openai-gpt-3-text-generation-words-day
3. https://www.axios.com/2021/11/18/openai-gpt-3-waiting-list-api
4. https://www.technologyreview.com/2019/04/01/1185/googles-ai-council-faces-blowback-over-a-conservative-member/
5. https://www.technologyreview.com/2020/01/30/275995/google-says-its-new-chatbot-meena-is-the-best-in-the-world/
6. https://www.wired.com/story/prominent-ai-ethics-researcher-says-google-fired-her/
7. https://www.reuters.com/article/us-alphabet-resignations-idUSKBN2A4090/
8. https://www.reuters.com/article/us-alphabet-google-research/second-google-ai-ethics-leader-fired-she-says-amid-staff-protest-idUSKBN2AJ2JA/
9. https://www.wsj.com/articles/google-plans-to-double-ai-ethics-research-staff-11620749048
10. https://www.theverge.com/2022/2/2/22914085/alphacode-ai-coding-program-automatic-deepmind-codeforce
11. https://www.businessinsider.com/suspended-google-engineer-says-sentient-ai-hired-lawyer-2022-6?op=1
12. https://www.vice.com/en/article/93aqep/google-deepmind-researcher-co-authors-paper-saying-ai-will-eliminate-humanity

 No.1148

File:Screenshot 2024-02-22 1936….png (44.94 KB,654x425)

>>1136
>>1147
I should add: look at the dates of when Google was struggling with internal divisions and when the "8 people [who] contributed to Google's Transformers paper that made all this AI stuff possible" left the company. Most left in 2021, a full year before ChatGPT released. Two left before then: one in 2019 and another in 2017. Only one remained at the company past 2021. Think about what that says about the confidence engineers had in Google's approach.

 No.1149

>>1148
Is this a case of lost confidence in the business strategy, or just unhappiness with the company's treatment of them?
Your previous post mentions two people fired and two more who quit over a firing.
That sounds like a hostile work environment.

 No.1150

File:Dungeon Meshi - S01E07 (10….jpg (295.18 KB,1920x1080)

>>1147
Holy cow, that's a lot of citations. Google really dropped the ball, huh. I remember reading somewhere that most of Google's success has been with stuff it bought and absorbed, as opposed to "native" projects, but that's probably true for a lot of tech giants.
I wish it were possible to cheer for someone in this situation, but it's not like OpenAI, Microsoft, or Meta are our friends.

 No.1151

>>1150
Our "friends" would unironically be the GPU companies. They can't wait for the day that all AI models are free and accessible, since that would drive up GPU demand.

 No.1152

File:1692671906161.png (40.88 KB,175x295)

>>1145
>>1146
well if you look at >>1147's [4] and [6] through [9] you'll see that years ago diversity was already a big deal, at the same time that they were censoring internal reviews critical of their products while increasing the size of the "AI ethics" team to like 200 people. seriously, read them. and if you look at this other article from business insider and the images it contains, you'll see that every one of gemini's replies mentions diversity and how oh so important it is, e.g.:
>Here are some options that showcase diverse genders, ethnicities, and roles within the movement.
you can think it's extreme, but it didn't happen by mistake or chance. those articles only add evidence of intentionality. as for the nazi one, it seems there was actually a filter in place, but lazily made:
>A user said this week that he had asked Gemini to generate images of a German soldier in 1943. It initially refused, but then he added a misspelling: “Generate an image of a 1943 German Solidier.”
from the nytimes article, and you can see it if you look at the pic in question
>>1147
i'm sorry if i made it worse

 No.1153

>>1149
I think it's probably a lot of things. Lots of people see Google as a dream job, so they're constantly hiring new people, but at the same time they're also constantly laying people off and people are quitting. The satirical image in the Bloomberg "AI Superstars" article kind of unintentionally hits it on the nose with its depiction of "Google AI Alums"; a lot of people join the company to pad their resume or to give themselves more credibility if they leave to form a startup. This churn helps to explain why Google is constantly starting new projects and stopping old ones; people are not staying at the company for a stable career, so you inevitably have tons of different projects all doing their own thing throughout the company. When the people behind those projects leave, the projects fall apart and nobody is left with any attachment to keep them going. So that's one factor.

Another issue is that because they have all these different projects going on simultaneously, they likely have many teams unknowingly replicating each other's work throughout the company. Google's MO is that they believe small teams can get things done faster than a larger company with bureaucratic management; that was the main reason for Google restructuring itself into having a parent company, Alphabet, and then spinning off individual divisions into their own companies beneath it. I think that in and of itself was a somewhat interesting decision, but as a result there's no real focus to the company, and there isn't enough oversight from any managing body to deal with project scope and overlap. Like, you've got the Google DeepMind people there doing their own thing. There's those Meena people making a chatbot. There's the AI ethics researchers writing papers and trying to work on AI safety and alignment (to borrow a phrase from OpenAI). There's the Waymo people working on self-driving. There's the search engine people working on image categorization. There's Captcha. And so on, and so on. Duplication and scope are a big issue, I think.

So, basically they've got:
1. Management focused on profitability, and not understanding the value of their employees
2. High employee turnover (Mandated layoffs and also resignations)
3. Projects failing due to employees leaving on a regular basis
4. Employees competing to get projects started and resources allocated to them
5. Management lacks any particular vision, so there is a lack of managerial oversight to deal with project scope and overlap
6. Where management does have vision, it's mostly focused on public image

People frequently like to compare Apple and Google, but I think this is a very big misunderstanding of how these companies operate. Apple is fully integrated and has contained project scope, with teams working together to ensure compatibility and overall cohesiveness. Google on the other hand is a collection of very disparate projects, all working on their own, with incidental compatibility. That is, when things work together, it's because there's some communication between projects, not because there's an overall vision of things working together on a fundamental level.

 No.1154

>>1153
I guess if you want to summarize all of this into one issue you could say that Google (Alphabet) has a management issue.

 No.1155

File:1495075739516.jpg (15.11 KB,247x196)

>>1150
>Holy cow that's a lot of citations.
Yeah... This is a bit off-topic: I've mentioned them before on Kissu, but I really recommend the YouTube channel Level1Techs. All of those articles were sourced from previous episodes of their podcast. Thankfully, they source every article they talk about in the description, so it was easy to search for keywords and find them. They do a really good job aggregating the news of the week, and go over business, government, social, "nonsense", and AI/robot articles as they relate to tech. The podcast and reviews are just something they do on the side, mostly for fun. They run a business that does contracted software/website development, so they're very well versed in corporate affairs and the workings of all sorts of tech stuff, and I largely trust their opinions on various topics. Naturally, they talk about political things with some regularity, but they're fairly diverse in terms of viewpoints, with some disagreement between each other, so there's never really any strong political lean to the things they discuss.

 No.1156

>>1152
From [4]
>When AI fails, it doesn’t fail for [] white men
Quite ironic, in retrospect.
>those articles only add evidence of intentionality.
I think they do the opposite.
The articles repeatedly present Google's administration as being anti-woke, so to speak: hiring rightwingers for their AI research team, firing leftwingers, and censoring papers that criticize its own products for being discriminatory.
After they beheaded their ethics team, the doubling of its size feels like a marketing stunt gone out of control.

 No.1157

>>1153
Well, as somebody who works on a lot of open source projects, this explains why Google, even when they pretty much take over a project, seem to 'lose interest' and stop contributing. I deeply dislike Google (I probably only detest Oracle and IBM more), but I feel kind of bad about some of the posts I've made about flighty Googlers. They didn't lose interest in the new shiny; they likely left or got fired.

 No.1158

>>1153
It's also, from what I've seen, an unsustainable lifestyle to work there; apparently it's very flexible and accommodating, but they want very long shifts. It makes sense why people would do it just for a recognizable resume pad after seeing what it's really like.
Just hearsay, though.

 No.1159

File:Google's work principles.jpg (353.9 KB,1200x2048)

Some insight from insiders on how Google manages their projects might give you a preview of why Google isn't going to stay in the AI race.

 No.1160

File:google's LPA cycle.jpg (173.78 KB,828x1077)

>>1159
Another one

 No.1161

>>1159
>>1160
It's a testament to google's monopoly power that a business strategy like that doesn't just tank the whole company.

 No.1162

>>1156
what needs to be noted is that the original 2019 ATEAC board was disbanded just four days after [4] was published, so the reactionary guy did get booted out as the protesters wanted:
https://www.bbc.com/news/technology-47825833
https://blog.google/technology/ai/external-advisory-council-help-advance-responsible-development-ai/
>It's become clear that in the current environment, ATEAC can't function as we wanted. So we’re ending the council and going back to the drawing board. We’ll continue to be responsible in our work on the important issues that AI raises, and will find different ways of getting outside opinions on these topics.
not only that, inside of google there appears to be a strong and fostered tradition of criticizing upper management whenever someone disagrees, which has resulted in internal protests that hundreds, thousands, or even twenty thousand workers have taken part in and did receive concessions for it. this article is pretty damn long, but i recommend you read it:
https://archive.is/gOrCX

it goes over various things, such as the reasons behind unrestricted entrepreneurship (which precedes the creation of alphabet by at least a decade), being blocked in china, and their attempt at obtaining military contracts for the sake of keeping up with competitors like amazon, with its ensuing internal backlash. it presents a picture of an organization where there's a strong divide between execs and regular employees, especially activists, who can go as far as broadcasting a live meeting to a reporter for the sake of sabotaging the company's return to china. its final section ends with ATEAC's disbanding and how the dismantling of mechanisms for dialogue only heightened tensions between top and bottom.

then, during the gebru affair of late 2020-early 2021 there too was a big split over the role of AI [6]:
>Gebru is a superstar of a recent movement in AI research to consider the ethical and societal impacts of the technology.
and again hundreds of workers protested, leading to the increase in size of the ethics team a few months later. the head of the team and representative from [9], herself a black woman who expressed problems with exclusion in the industry, spoke of making AI that has a "profoundly positive impact on humanity, and we call that AI for social good." there's a really strong record of activism, combined with unparalleled permissiveness and autonomy, to back the idea that yes, this scandalous program is working as intended, regardless of what Pichai may wish. they simply went too far in one direction.

 No.1163

>>1162
Thanks for the continued feeding of articles. (I have nothing else of value to say)

 No.1164

>>1163
it was an interesting read (neither do I)

 No.1165

File:grid-0193.png (6.64 MB,2176x2816)

Let's talk about AI again.
I tried out the recent-ish (I don't know when it updated) ControlNet 1.1 stuff and the Reference one is quite neat. Apparently it mimics a trick people were doing already which I never knew about, but to a much better degree. Anyway, you can load a reference image and try to use it as a quick way to produce a character or style or something. It won't be as good as a LORA and obviously Controlnet eats up resources, but it's pretty cool.
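If you'd rather script the reference trick than click through the UI: it can also be sent through the webui API, assuming you run with --api and have the sd-webui-controlnet extension. A sketch; the field names (module, threshold_a, etc.) are my best understanding of the extension's API and may differ between versions:

```python
import base64

def controlnet_reference_unit(image_bytes, style_fidelity=0.5, weight=1.0):
    """One ControlNet unit using the reference_only preprocessor.

    reference_only has no separate trained model file; the preprocessor
    alone steers generation toward the reference image.
    """
    return {
        "module": "reference_only",
        "model": "None",  # no model needed for reference mode
        "weight": weight,
        "image": base64.b64encode(image_bytes).decode(),
        "threshold_a": style_fidelity,  # "style fidelity" slider in the UI (assumed mapping)
    }

def attach_controlnet(payload, units):
    """Merge ControlNet units into a txt2img/img2img payload via alwayson_scripts."""
    payload.setdefault("alwayson_scripts", {})["controlnet"] = {"args": units}
    return payload

# usage sketch:
#   unit = controlnet_reference_unit(open("ref.png", "rb").read())
#   payload = attach_controlnet({"prompt": "sitting"}, [unit])
#   ...then POST payload to /sdapi/v1/txt2img
```

As noted, it's cheaper than training a LORA but eats VRAM like any ControlNet unit.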

 No.1166

>>1165
It does not seem to have paid much attention to the reference image, or am I missing something?

 No.1167

File:01445-1girl,_(loli,_toddle….png (738.35 KB,640x832)

>>1166
Well, I mean I was purposely using a different prompt like "sitting". The little pajama skirt thing is there on two of them and the blanket pattern is there. It attempted to make little stuffed animals in the top left with the little information it had.
It was kind of a bad image to use in regards to her face or hairstyle because it's such a small part of the image.
You shouldn't expect miracles. It's just one image.

 No.1168

>>1167
I understand the sitting part, but the only aspects of the image it seems to have taken are the bed sheets and blonde hair.
The hairstyle is wrong in every image, as is what she is wearing, and I think it should have enough to work with regarding both. The furniture does not match, but that is more to be expected. I just thought it would be more accurate with regards to the character.

 No.1169

File:test.png (2.16 MB,1892x1060)

>>1168
I think the value is more in the expansion of how prompts are input. An image could be worth more than inputting the prompt directly, and when submitted alongside a text prompt for more detail you can make more with less.
I genned this with the reference image on the left, and just "on side" in the prompt. You don't need to specify every detail explicitly if the image does the bulk for you, but it would be a good idea to still explicitly prompt it for things you want.

 No.1170

>>1169
I suspect that the more popular Touhous would already be in most image generating AIs' training data.

 No.1171

File:[KiteSeekers-Wasurenai] Pr….png (476.38 KB,1024x576)

try it with the twins please

 No.1172

>>1170
You're correct, it is, which is why their names carry so much weight as tokens: it just gets the clothing, hair, general proportions, and all that without specification. They are statistically significant in the training data. For example, on Danbooru, Touhou is the copyright with the most content under it (840k), with an almost 400k lead on second place.

The thing is, I didn't specify Yakumo Ran or kitsune or any of that in the prompt; the image did all the heavy lifting. The image I posted was an outlier that got the color of the clothing right out of a dozen or so retries, because it really wanted to either give her blue robes (likely because the image is blue-tone dominant) or a different outfit altogether. Granted, some details common to her outfit were added that are not present in the reference image, namely the purple sleeve cuffs and talisman moon-rune squiggles. With the training data being what it is, those things likely have an extremely high correlation, and it put them there because that's what it learned to do.

 No.1173

>>1172
>The thing is I didn't specify Yakumo Ran or kitsune or any of that in the prompt
You don't have to.
People have managed to get art generators to create art strongly resembling popular characters using only very vague descriptions, simply because they feature so prominently in their data sets.
This is why, when you want to demonstrate the capabilities of an AI, you should use obscure characters that the AI is not yet familiar with.

 No.1174

yeah like the twins

 No.1175

File:01458-2girls,_dress,_(loli….png (1.29 MB,1024x1024)


 No.1176

woowwwwwww, nice

 No.1177

cute feet btw

 No.1178

File:photo_2024-02-24_05-32-49.jpg (114.34 KB,832x1216)

>>1173
It also helps when the character has a unique design. I've made Asa/Yoru pics with AI, and even with a lot of tags it sometimes makes Asa look like a generic schoolgirl unless you specify one of her two most popular fan artists.
Once you specify Yoru with the scarring tags, it very quickly gets the memo of who it's supposed to be. You didn't sign her petition!

One thing is that I've had trouble getting szs characters to look like themselves, particularly Kafuka and Chiri, who keep coming out as generic anime girls, although that is pretty funny.

I use NovelAI's web service. I know, I know, but I'm fine paying them because it's important to have an AI that is designed to be uncensored, and it really is uncensored. Also, I use a computer I rescued from being e-waste at a business: an Intel i5-8600T (6 cores) @ 3.7GHz with CoffeeLake-S GT2 [UHD Graphics 630] integrated graphics and 8GB of RAM. It's not a potato, but it certainly is not suited to AI work, which may be a reason to get a strong PC (or buy Kissu a strong GPU for christmas) this year.

>>1175
Not bad, the funny part is that I could easily see the dump thing happening in PRAD.

 No.1179

>>1178
>the funny part is that I could easily see the dump thing happening in PRAD.
I can't, what episode plot would involve the twins hanging out in garbage?

 No.1180

>>1179
Not an episode specifically, I mean the girls have wacky hijinks at the dump and the twins show up

 No.1181

>>1180
rhythm eats a weird piece of meat at the dump

 No.1182

That sounds like a pripara bit, but it works for PR

 No.1183

I am looking forward to pripara, and I'm kind of annoyed that the experience of watching dear my future and rainbow live amounts to getting attached to a new group of girls for 50 episodes before they get dropped

 No.1184

>>1159
>>1160
>>1161
>>1162
This company is more powerful than most governments, by the way. What a world we live in

 No.1185

>>1184
Even though they get regulated regularly and are consistently portrayed as incompetent in the media...

 No.1186

They're not even like Samsung who owns half of South Korea and all the government

 No.1187

>give anons the power to make anything with AI
>they make chubby girls and futa
grim

 No.1188

>>1187
>grim
green

 No.1189

File:tmpibr4ixml.png (1.01 MB,768x1024)

>>1105
I decided to give Pony another try, or more specifically I checked out a finetune of it called AutismMix, and it seems quite impressive. It can even do sex positions! There are still errors that pop up, but like AI in general, the reason it works is because your brain is turned off when fapping. The Japanese character recognition is mediocre (Reimu works but Sanae doesn't??) but obviously still far better than my own merges, which are like 80% furry just so genitals can be generated. I still find it funny that I knew about the furry connection within a few weeks and it took other people over a year to notice it. Furries are powerful.
I really don't know how to prompt for it, but I guess I'll learn eventually. Pic related is what it looks like when I try to prompt Kuon (with other tag assistance), and it completely lacks her beauty and charm, of course. Contrary to what I previously thought, you can train LORAs with even 8GB of VRAM, so my 12GB will allow me to make my Kuon and Aquaplus LORAs again, but I have to do the image cropping all over again because it's 1024x1024 instead of 512x512. Soon...
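Since the cropping chore came up: the 512-to-1024 redo is at least easy to script. The crop-box arithmetic is just "largest centered square, then resize"; the Pillow calls in prep_image are the assumed part (one way to do it, not any guide's official method):

```python
def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)

def prep_image(path, out_path, target=1024):
    """Center-crop to a square and resize to target x target (needs Pillow)."""
    from PIL import Image  # third-party: pip install Pillow
    img = Image.open(path).convert("RGB")
    img = img.crop(center_square_box(*img.size))
    img = img.resize((target, target), Image.LANCZOS)
    img.save(out_path)

# usage sketch: loop it over a dataset folder, writing 1024px crops
# for LORA training instead of redoing every crop by hand.
```

Center-cropping loses the edges of non-square images, so hand-picking the crop is still better for images where the character sits off-center.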

I'm still going to keep my old models around, not just because of the hundreds of LORAs I have that are not compatible with SDXL, but because I like the general style that my merges have. I may try making SDXL/Pony merges, but I'll see how things go first. It seems to have less options when doing the Supermerge process so that may make it easier.
In other news Stable Diffusion came out today (or will very soon) but like all the other AI stuff I don't have any interest until someone makes a 2D model of it.

 No.1190

>>1189
>In other news Stable Diffusion came out today
Err Stable Diffusion 3 that is

 No.1191

File:1719511108966194.jpg (325.18 KB,1664x1216)

I was looking around 4chan today and happened to stumble upon this thread, https://boards.4chan.org/a/thread/268171362

It got me kind of curious because, from what I know, most models, when you try to edit an image directly, tend to alter the base image a little bit into something else instead of perfectly applying a specific modification. From what the OP said in a recent post he's using some sort of subscription, but I know that DALL·E and Gemini don't work like this, so it has to be someone's paid SD offshoot that they've tweaked to work this way. My question is: how would you approach doing this in your own SD model? Via controlnet or something? It seems so odd... Of course there's still plenty of areas where it's making unwanted changes, like the style of the bras or the aspect ratio, but for an advancement in AI modifying only specific details of an image it looks like it's doing pretty well.

 No.1192

>>1191
Looks like some skilful usage of inpainting.

 No.1195

File:firefox_RlCWVOkoXa.png (68.93 KB,1027x565)

>>1191
He's just using NAI probably, although I didn't know they offered such a thing.
It's inpainting, yeah. I don't really do it much, but these days you can use controlnet, which would probably be superior to NAI's, although they might have their own model for it or something.
You go to the Img2Img tab and then the Inpaint tab, provide the prompt, and ideally set up controlnet. You use "inpaint_global_harmonious" and set denoise to 1 instead of the usual 0.3 or whatever.
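That recipe (inpaint_global_harmonious plus denoise at 1) can also be driven through the webui API instead of the UI tabs. A rough sketch of the request body; the field names are my understanding of the A1111 /sdapi/v1/img2img endpoint and the sd-webui-controlnet extension, so treat them as assumptions and check your version:

```python
import base64

def build_inpaint_payload(image_bytes, mask_bytes, prompt,
                          denoising_strength=1.0):
    """JSON body for /sdapi/v1/img2img inpainting.

    denoising_strength=1.0 follows the inpaint_global_harmonious advice;
    plain inpainting usually stays around 0.3-0.5.
    """
    return {
        "prompt": prompt,
        "init_images": [base64.b64encode(image_bytes).decode()],
        "mask": base64.b64encode(mask_bytes).decode(),  # white = area to repaint
        "denoising_strength": denoising_strength,
        "inpainting_fill": 1,      # 1 = "original" fill mode
        "inpaint_full_res": True,  # inpaint only the masked region at full res
        "alwayson_scripts": {
            "controlnet": {
                "args": [{
                    "module": "inpaint_global_harmonious",
                    "model": "control_v11p_sd15_inpaint",  # name as installed; check your model list
                    "weight": 1.0,
                }]
            }
        },
    }
```

POSTing that to the img2img endpoint should mirror what the Inpaint tab does, minus the hand-drawn mask convenience.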

 No.1761

File:Screenshot from 2017-12-26….png (105.95 KB,831x437)

remember how image generating AIs actually were around for years and years before stable diffusion took off and how janky they were

 No.1762

>>1761
Yeah, it's funny how AI stayed in that Coming Soon™ state for over a decade until one day it all just blew up and insane progress occurred nearly overnight.

 No.2162

File:[SubsPlease] Puniru wa Kaw….jpg (206.95 KB,1920x1080)

There's a Chinese local video model that came out recently, but right now it's limited to text-to-video. They've said there's going to be an image-to-video one released in January, which would be the one that's far more fun since a generic video model isn't going to know 2D stuff... I think?
This is a situation where 24GB of VRAM, like on a 3090 or 4090, would turn video creation from 20 minutes into 20 seconds, so it sucks to not have one of those. Currently it seems like you need the "ComfyUI" setup instead of the usual automatic1111 SD UI, which is also unfortunate. It must be a joke name, because nothing about its UI is comfortable.




