V2
This model was finetuned from the Stable Diffusion 1.4 release with a batch size of 3 and a
learning rate of 5e-6 on a single Radeon Pro W6800 GPU.
Captions use space-separated tags. Replace any spaces within a tag with underscores.
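As a rough illustration of the caption format described above, here is a minimal sketch of turning a list of tags into a V2-style prompt. The function name `tags_to_v2_caption` and the example tags are hypothetical and are not part of the training code. Collapsing runs of whitespace into a single underscore is also an assumption.

```python
def tags_to_v2_caption(tags):
    """Join tags into a space-separated caption, using underscores inside each tag."""
    cleaned = []
    for tag in tags:
        # Assumption: any run of whitespace inside a tag becomes one underscore.
        normalized = "_".join(tag.split())
        if normalized:  # skip empty or whitespace-only tags
            cleaned.append(normalized)
    return " ".join(cleaned)


# Hypothetical example tags, chosen only for illustration.
print(tags_to_v2_caption(["safe", "solo", "looking at you"]))
# prints: safe solo looking_at_you
```

The resulting string can be used directly as the prompt when sampling from a V2 checkpoint.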
The autoencoder was finetuned for ~120,000 steps before finetuning the rest of the model.
Full finetune release
The global step count in the filename doesn't match the one in the state_dict because of the way I resumed training at one point.
TODO: Sample images, release more checkpoints later.
Short finetune / Beta (2022-10-11) Model
This checkpoint was captured after ~26,000 steps (~80,000 images) of training, which took 12 hours. This isn't even a full epoch.
This shows that finetuning to a reasonable quality level does not need a huge cluster of expensive datacenter GPUs.
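The figures above can be sanity-checked with some quick arithmetic. This sketch assumes the beta checkpoint used the batch size of 3 stated for V2 and that the 12 hours is wall-clock training time. The variable names are only illustrative.

```python
steps = 26_000   # approximate optimizer steps for the beta checkpoint
batch_size = 3   # assumption: same batch size as stated for V2
hours = 12       # assumption: wall-clock training time

images_seen = steps * batch_size
seconds = hours * 3600
steps_per_second = steps / seconds
images_per_second = images_seen / seconds

print(f"{images_seen} images")                # prints: 78000 images
print(f"{steps_per_second:.2f} steps/s")      # prints: 0.60 steps/s
print(f"{images_per_second:.2f} images/s")    # prints: 1.81 images/s
```

78,000 images is consistent with the rounded ~80,000 figure, and works out to a little under two images per second on the single GPU.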
V1 (2022-10-01) Model
This model was finetuned for ~500,000 steps at several different batch sizes and learning rates. It currently performs better than the V2 model because it was trained for longer, but I expect V2 to surpass it soon.
The training captions used comma-separated tags.
sd1.4 finetuned derpibooru safe suggestive e=0002 gs=287000.ckpt
sd1.4 finetuned derpibooru safe suggestive e=0007 gs=460000.ckpt
Training Code
These models were trained using github:LunNova/translunar-diffusion.
Please see the Stable Diffusion training notes for more details.
Model Licenses
These releases are all licensed under the CreativeML Open RAIL-M License, as required for derivatives of the original Stable Diffusion model.