Stable Diffusion 2 is here!

By | November 24, 2022

Stable Diffusion 2.0 is here already! New depth-to-image, text-to-image, upscaling and inpainting models are now available – along with an updated codebase too. Lots of new toys for a nerd to play with 🙂

Yes, it feels like we've only just had Stable Diffusion 1.5, but now it is time for the Stable Diffusion 2.0 release. We'll just have a quick look at some of the new features before we dive into getting this installed. Here we have the new text-to-image diffusion models, which are basically 768×768 resolution, so we've got a new model there. We've also got a new super-resolution upscaler diffusion model, so here we can upscale things by four times; they've got an example there of taking a 128×128 image and upscaling it to 512×512. There is also a new depth-to-image diffusion model. Now, it doesn't show it in this picture, but basically it takes a depth map of an image, which it can then use to create new images. If we scroll down a little bit, we've got this little animation that switches between the pictures, and eventually you do see the depth map. There it is, so there you go: you've got the depth map and all the new images created from it. There is also an updated inpainting diffusion model as well, so we've got inpainting 2.0 along with everything else. Right, so if we head on over to the GitHub repository (here it is, Stable Diffusion 2.0) we can get all this downloaded and installed. Before we move on, let's just have a couple of quick notes about my environment.

I am using Ubuntu 22.04 with an Nvidia GPU, and I am also using Anaconda for my virtual Python environment management. The first thing I'm going to do, of course, is git clone that repo. There it is; I can just copy and paste. There we go, I have the repository, and of course I cd into it as well. I'm also going to make a couple of new directories: a models directory and a midas_models directory too. Then there are a whole load of things to download. Remember all those features? Well, they've all got their own models. Yes, there's a new 512×512 model, a 768×768 model, a depth model, a MiDaS model, the upscaling model and the new inpainting model. Download all of those into your new models directory, apart from the MiDaS model, which goes into the midas_models directory. For the initial setup you'll want to conda env create -f environment.yaml and then conda activate ldm. I am doing it ever so slightly differently, of course, because that's my whim and my way: I am doing conda create --name sd2 (yes, I already have an environment called ldm) and I'm going to use Python 3.10. After I've created that conda environment, I am going to activate it, and after activation I'm going to run pip install -r requirements.txt as normal.
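
The setup just described can be sketched as the commands below. The Hugging Face download URLs and the MiDaS release URL are my assumptions about where each checkpoint lives at the time of writing; check the model cards if a download fails.

```shell
# Clone the Stable Diffusion 2.0 repo and make the model directories.
git clone https://github.com/Stability-AI/stablediffusion.git
cd stablediffusion
mkdir -p models midas_models

# Download the checkpoints (several GB each; URLs are my assumption).
wget -P models https://huggingface.co/stabilityai/stable-diffusion-2-base/resolve/main/512-base-ema.ckpt
wget -P models https://huggingface.co/stabilityai/stable-diffusion-2/resolve/main/768-v-ema.ckpt
wget -P models https://huggingface.co/stabilityai/stable-diffusion-2-depth/resolve/main/512-depth-ema.ckpt
wget -P models https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler/resolve/main/x4-upscaler-ema.ckpt
wget -P models https://huggingface.co/stabilityai/stable-diffusion-2-inpainting/resolve/main/512-inpainting-ema.ckpt
# The MiDaS weights for the depth pipeline go in their own directory.
wget -P midas_models https://github.com/intel-isl/DPT/releases/download/1_0/dpt_hybrid-midas-501f0c75.pt

# Either the repo's own environment...
#   conda env create -f environment.yaml && conda activate ldm
# ...or, as I'm doing here, a fresh one:
conda create --name sd2 python=3.10
conda activate sd2
pip install -r requirements.txt
```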

Normal but I’m also going to run an Extra command as there seems to be a few Things missing from that Requirements.txt namely the Transformers Diffusers invisible Watermark and dim It’s a very good idea to install the Optional xformus package as well option A very very easy you can just conduct Install that of course you will need to Be on Linux the other option is to bip Install it and this should work on Windows as well maybe I don’t know it Definitely works on Linux the downside Of this is it does take 20 minutes to Compile but then you’ll feel all cool Because you’ve just compiled some code You will need app to install build Essential as well because you’re going To need a compiler but basically you can Just pip install ninja and then pip Install from the GitHub repo now we’re Just going to test each one of these Components bit by bit and the first one Is text to image so here we can see Various examples of text to image Pictures on the new 768×768 and the clip Vit h14 as well that all looks very nice They provide a reference sampling script And that is the one that I’m going to Use here so it is text to image and I’m Going to do a professional photograph of A giant steampunk rodent riding a sad Cat now I have already tried this of Course and unfortunately it doesn’t

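The text-to-image run above looks roughly like this; the flags follow the repo's reference sampling script, and the checkpoint and config paths assume the layout from my setup.

```shell
# Reference sampling script with my prompt, on the new 768 v-model.
python scripts/txt2img.py \
  --prompt "a professional photograph of a giant steampunk rodent riding a sad cat" \
  --ckpt models/768-v-ema.ckpt \
  --config configs/stable-diffusion/v2-inference-v.yaml \
  --H 768 --W 768
```
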
It's not riding a cat and it's not that sad, but let's have a quick run of this and see how those images come out. And there we go: we've got eight example images in our outputs/txt2img-samples directory. We get a grid here too, so rather than looking at them all individually, there they are, all on one grid. They are some very nice-looking steampunk rodents; the cat isn't particularly sad, and it isn't particularly being ridden, but it is still very nice output. The next example they have on their page is image modification with Stable Diffusion, that is, depth-conditional Stable Diffusion. For this they have provided a Gradio interface, and they've got an example there, so let's have a quick look. Here I'm going to run this Python Gradio script, and I'm going to use the 512-depth-ema checkpoint that I put into that models directory. After you've clicked on the link, you'll be presented with a fairly familiar web interface if you're used to using AUTOMATIC1111. You can see you can just drop an image there or click to upload, so let's drop an image in. I want a prompt as well, so let's have "a woman wearing glasses". Now, you can just click Run, but let's have a quick look at the advanced features first: number of images, steps, guidance scale, strength, seed and DDIM eta.
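
For reference, launching that depth-conditional demo looks something like this (the script and config names are my assumption of the repo's layout):

```shell
# Depth-to-image Gradio demo, using the depth checkpoint from my models dir.
python scripts/gradio/depth2img.py \
  configs/stable-diffusion/v2-midas-inference.yaml \
  models/512-depth-ema.ckpt
```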

We can play with any of those. I'm just going to pick a random seed, click Run and see what comes out. There we go: we've got a woman wearing glasses, very similar to the original image, and we've got the nice depth map there as well. That's good. Let's see what happens if we turn the strength up to one. Is that any better? Is it any worse? I'm not sure, because I'm still testing this as well. There we go, it's gone a little bit black and white, but it's still very, very similar, because it's using the depth map. Very nice; I am very impressed by that. I can certainly see lots of uses for it, maybe some animations and things in the future, as we've got a nice depth map. I can't wait to see what comes out of this. Okay, let's have a quick look at the next feature. This is more of a sub-feature, but you can also do the classic image-to-image, and they give you an example command. The example command I'm going to run is this one here; let's just pop it into my terminal while it runs. There we can see we've got "a woman wearing steampunk glasses, trending on octane engine". It's taking that initial image we've already seen, it's using a strength of 0.8, and it's using that 512-base-ema checkpoint. As expected, this will appear in the outputs/img2img-samples directory.
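
That image-to-image command, as a sketch; the init image path is a placeholder for whatever picture you dropped in earlier.

```shell
# Classic img2img via the repo's script (init-img path is a placeholder).
python scripts/img2img.py \
  --prompt "a woman wearing steampunk glasses, trending on octane engine" \
  --init-img inputs/woman.png \
  --strength 0.8 \
  --ckpt models/512-base-ema.ckpt
```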

It's empty at the moment, so let's just wait a second... and there we can see the grid has once again been produced, and there we have her with her steampunk glasses. Very good; I like that. Standard image-to-image. The next feature is upscaling with Stable Diffusion. Once again they give you an example, and I have my own example here, because their example isn't quite right: it's python, the super-resolution Gradio script, the x4-upscaling yaml that you want to use for upscaling, and of course the upscaling checkpoint. So let's copy and paste this one in, and we will get a new Gradio interface again, much like the depth model one. Here we have the Stable Diffusion upscaling interface; once again, you'll be fairly familiar with it. Let's just drop that same image in there and have a look at the advanced options, which are also fairly similar. You've also got noise augmentation this time (for synthetic images you may want to increase the noise), plus the number of samples, DDIM steps, scale and seed, so play with those as usual. Let's have a different seed, and we'll also pop a prompt in there: "a professional photo of a pretty face with finely detailed hair". And now we have a nicely upscaled image. Let's just open that in a new tab so we can have a little compare.
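
The corrected upscaler launch command I'm using looks roughly like this (script and config names assume the repo layout from my setup):

```shell
# Stable Diffusion x4 upscaler Gradio demo.
python scripts/gradio/superresolution.py \
  configs/stable-diffusion/x4-upscaling.yaml \
  models/x4-upscaler-ema.ckpt
```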

There is the original image, and there is the old one, so it's smaller, and the upscaled one. As you can see, there are some slight upscaling artifacts; you'll probably be used to those, where it smooths things out a little bit, but that's pretty good. That is a pretty nice upscaler. The final part of this GitHub repo is image inpainting with Stable Diffusion 2.0, and once again they've got an example command. The one I am using is exactly like that, except of course I've got my models in the models directory, so I'm using models/512-inpainting-ema.ckpt. This is also a Gradio interface, so it will open in just a second. And now we have the Stable Diffusion inpainting interface. I'll just drop my same picture in and pop in the prompt, "an anime style female face", but I will need to do a little bit of inpainting here. You can select the brush size; I'm going to select a fairly large brush and just paint out a random portion of her face, then generate. Let's have a quick look at the advanced options while that is generating: number of images (this is going to make four), number of steps (45), guidance scale (10) and a random seed. And there we have some rather beautiful anime-style faces. So that about covers everything on the repo.
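
And the inpainting demo launch, again assuming my models directory and the repo's script and config names:

```shell
# Inpainting Gradio demo with the 2.0 inpainting checkpoint.
python scripts/gradio/inpainting.py \
  configs/stable-diffusion/v2-inpainting-inference.yaml \
  models/512-inpainting-ema.ckpt
```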

Hopefully we'll see this in the AUTOMATIC1111 web interface at some point soon, but until then, this is the way to play with it. If you need to learn about more nerdy rodent geekery, then do click on this video.