Introducing Plane Helper! A multi-concept LyCORIS (trained as a LoHa) for v2.1. Plane Helper is trained on around 2300 images of 71 different planes. I would like to share my training process for anyone interested in learning about training a multi-concept LoHa on complex concepts. This is by no means meant to be scientific, but to help others who might be facing some of the same roadblocks I did.
Dataset:
I've created a data table that contains all 71 plane types in this model along with their respective key tokens, which can be viewed here:
https://docs.google.com/spreadsheets/d/1N7o4pc9mGyYSYIoeD4fU_WpO_2tL_JfKDETOJPVQw18/edit?usp=sharing
The project started after realizing how bad 2.1 was with planes. They would always come out with wings in the wrong places, or extra wings, and not much detail; sometimes nothing more than a blob of pointy geometric shapes. I aimed to do something about it, so I started out by grabbing about 130 images of cool-looking planes and running them through a batch interrogator for captioning. I trained with Kohya and realized that no matter how much I trained, I would still get the same issues. I felt the problem was that I had too many different objects and was confusing the LoHa by not differentiating them, just calling them all planes or jets. So I decided to identify each type of plane and inject their names into their respective captions.
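If you want to script that batch-interrogation step yourself, a minimal stand-in using BLIP via the Hugging Face transformers library might look like the sketch below. This is not necessarily the backend my interrogator used, and the folder path is just an example:

# Minimal stand-in for batch caption interrogation, assuming BLIP via the
# Hugging Face transformers library; the actual interrogator may differ.
from pathlib import Path
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

for img_path in Path("dataset/raw_planes").glob("*.jpg"):  # example folder
    image = Image.open(img_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=50)
    caption = processor.decode(out[0], skip_special_tokens=True)
    # Write a sidecar .txt caption next to the image, kohya-style.
    img_path.with_suffix(".txt").write_text(caption, encoding="utf-8")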
I didn't know much about planes, but I wanted 2.1 to be able to make awesome images of them, so running each image through images.google.com and identifying them all was a tedious but necessary exercise. I ended up with 71 different plane types. Instead of picking two and moving on, I thought I would see what this LoHa model type was made of and continued to build my dataset with at least 30 images of each plane type, training them all into one LoHa. There were some plane types I could not get 30 decent images of, but I decided to leave them in the dataset because the ultimate purpose was to train a model on planes in general, and keeping them in might help. By the time I achieved this goal, I had about 2300 images total for 71 different plane types, all organized neatly in their own folders. Then came the captioning process.
After running these through batch interrogation using Captionr, I googled each plane type and made an Excel table with the name and a brief but detailed description of each aircraft. I then brought each folder into the Dataset Tag Editor extension one by one, cleaned up the captions, and injected the tokens from the Excel table into each one. I also added a debug "CHV3CPlane" token to every caption, plus a "CHV3CTiltRotor" token for the tiltrotor planes (because they are very different shapes). These debug tokens work great for making unique planes as well. I also enabled weighted captioning, so the debug tokens, along with the plane-type identifier tokens, were weighted. For example, a caption might look something like this:
"(Hawker Hunter), transonic jet-powered fighter aircraft, Hawker Aircraft, Royal Air Force (RAF), Avon turbojet engine, swept wing, (CHV3CPlane), a plane is flying in the air, swiss, low level, man, viewed in profile from far away, switzerland, illustration, overhead".
After this long and patience-testing process, I had an acceptably decent dataset to begin training with.
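For anyone scripting a similar token-injection step, a rough sketch of the idea is below: read a table mapping each plane type to its tokens, then prepend those tokens (plus the shared debug token) to every caption file in that plane's folder. The CSV schema and folder layout here are assumptions for illustration, not my actual tooling:

# Illustrative sketch: prepend weighted identifier tokens to caption files.
# The CSV columns ("folder", "tokens") and dataset layout are assumptions.
import csv
from pathlib import Path

DATASET_ROOT = Path("dataset")    # one subfolder per plane type (assumed)
TABLE = Path("plane_tokens.csv")  # e.g. HawkerHunter -> "(Hawker Hunter), transonic jet-powered fighter aircraft, ..."

with TABLE.open(newline="", encoding="utf-8") as f:
    tokens_by_folder = {row["folder"]: row["tokens"] for row in csv.DictReader(f)}

for folder, tokens in tokens_by_folder.items():
    for caption_file in (DATASET_ROOT / folder).glob("*.txt"):
        original = caption_file.read_text(encoding="utf-8").strip()
        # Put the weighted plane-type tokens and the debug token in front of
        # the interrogator's generic caption.
        caption_file.write_text(f"{tokens}, (CHV3CPlane), {original}", encoding="utf-8")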
Repeats:
I initially set the repeats for each folder to 5, but some plane types had fewer images than others, so I did a rough balancing: folders with a higher number of images ran at 5 repeats, whereas a set with few images might have run at 10-15.
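Kohya reads repeats from a numeric prefix on each concept folder name (e.g. 5_HawkerHunter). A balancing pass like the one above could be scripted roughly as follows; the per-epoch target is an illustrative number, not my exact math:

# Hedged sketch: rename concept folders to "<repeats>_<name>" so each plane
# is seen a roughly equal number of times per epoch. The target is illustrative.
from pathlib import Path

DATASET_ROOT = Path("dataset")   # assumed layout: one folder per plane type
TARGET_SEEN_PER_EPOCH = 150      # illustrative target (images * repeats)

for folder in list(DATASET_ROOT.iterdir()):
    if not folder.is_dir():
        continue
    n_images = len(list(folder.glob("*.jpg")) + list(folder.glob("*.png")))
    if n_images == 0:
        continue
    # Clamp to the 5-15 repeat range mentioned above.
    repeats = max(5, min(15, round(TARGET_SEEN_PER_EPOCH / n_images)))
    folder.rename(folder.with_name(f"{repeats}_{folder.name}"))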
Training method:
I went about this project in a somewhat unorthodox way. With a dataset this big, I naturally had issues with overfitting. For a LoHa, a network dim of 8 is recommended, but I changed it to 32 to allow for more training space. I went through a few phases trying to find the settings that would let me train for more epochs before getting too burnt up. The original source model was v2-1_768-ema-pruned; however, with each training run, I would merge the LoHa from the previous run into v2-1_768-ema-pruned and use that merge as the source model for the next run. The theory is that the next run would gain some extra knowledge from the previous run and get a little bit of a head start. This theory needs more testing, preferably on a smaller dataset. After each training run, I would also merge the old source model into the new merged source model and use that as the new source, and I would use Extract LyCORIS LoCON to extract a LoRA from it. This particular model is 4 LoHas from 4 different training runs (all on the same 71-plane dataset) merged together and extracted.
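The source-model merging step is effectively a weighted average of two checkpoints' weights. Below is a generic sketch using safetensors; it stands in for whichever merge tool you prefer, and the 50/50 ratio and file names are assumptions:

# Generic 50/50 checkpoint average, assuming both models are .safetensors
# files with identical keys. The ratio and file names are assumed examples.
from safetensors.torch import load_file, save_file

old_source = load_file("v2-1_768_source_run1.safetensors")  # hypothetical names
new_source = load_file("v2-1_768_source_run2.safetensors")

merged = {}
for key, old_tensor in old_source.items():
    new_tensor = new_source[key]
    # Interpolate each weight tensor; 0.5 gives an even blend of the two runs.
    merged[key] = (old_tensor.float() * 0.5 + new_tensor.float() * 0.5).to(old_tensor.dtype)

save_file(merged, "v2-1_768_source_next_run.safetensors")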
Training Specs:
I won't get too in the weeds with training specs. Instead, I will post the settings I found best for this particular project and point you to three other guides that go through everything in much better detail.
THE OTHER LoRA TRAINING RENTRY
https://rentry.org/59xed3
LoRA Training Guide
https://rentry.org/lora_train
RFKTR's in-depth guide to Training high quality models
https://civitai.com/articles/397
My Specs:
{
"LoRA_type": "LyCORIS/LoHa",
"adaptive_noise_scale": 0,
"additional_parameters": "",
"block_alphas": "",
"block_dims": "",
"block_lr_zero_threshold": "",
"bucket_no_upscale": true,
"bucket_reso_steps": 64,
"cache_latents": true,
"cache_latents_to_disk": false,
"caption_dropout_every_n_epochs": 0.0,
"caption_dropout_rate": 0,
"caption_extension": ".txt",
"clip_skip": "1",
"color_aug": false,
"conv_alpha": 1,
"conv_alphas": "",
"conv_dim": 4,
"conv_dims": "",
"decompose_both": false,
"dim_from_weights": false,
"down_lr_weight": "",
"enable_bucket": true,
"epoch": 20,
"factor": -1,
"flip_aug": true,
"full_fp16": false,
"gradient_accumulation_steps": 2.0,
"gradient_checkpointing": true,
"keep_tokens": "0",
"learning_rate": 0.0001,
"logging_dir": "",
"lora_network_weights": "",
"lr_scheduler": "cosine",
"lr_scheduler_num_cycles": "",
"lr_scheduler_power": "",
"lr_warmup": "10",
"max_data_loader_n_workers": "4",
"max_resolution": "768,768",
"max_timestep": 1000,
"max_token_length": "150",
"max_train_epochs": "",
"mem_eff_attn": true,
"mid_lr_weight": "",
"min_snr_gamma": 5,
"min_timestep": 0,
"mixed_precision": "bf16",
"model_list": "custom",
"module_dropout": 0,
"multires_noise_discount": 0,
"multires_noise_iterations": 0,
"network_alpha": 4,
"network_dim": 32,
"network_dropout": 0,
"no_token_padding": false,
"noise_offset": 0,
"noise_offset_type": "Original",
"num_cpu_threads_per_process": 4,
"optimizer": "AdamW8bit",
"optimizer_args": "\"decouple=True\" \"weight_decay=0.01\" \"d_coef=2\" \"use_bias_correction=True\" \"safeguard_warmup=True\"",
"output_dir": "",
"output_name": "",
"persistent_data_loader_workers": false,
"pretrained_model_name_or_path": "",
"prior_loss_weight": 1.0,
"random_crop": false,
"rank_dropout": 0,
"reg_data_dir": "",
"resume": "",
"sample_every_n_epochs": 0,
"sample_every_n_steps": 0,
"sample_prompts": "",
"sample_sampler": "euler_a",
"save_every_n_epochs": 1,
"save_every_n_steps": 0,
"save_last_n_steps": 0,
"save_last_n_steps_state": 0,
"save_model_as": "safetensors",
"save_precision": "bf16",
"save_state": true,
"scale_v_pred_loss_like_noise_pred": true,
"scale_weight_norms": 0,
"sdxl": false,
"sdxl_cache_text_encoder_outputs": false,
"sdxl_no_half_vae": false,
"seed": "",
"shuffle_caption": true,
"stop_text_encoder_training": 0,
"text_encoder_lr": 5e-05,
"train_batch_size": 2,
"train_data_dir": "",
"train_on_input": false,
"training_comment": "",
"unet_lr": 8e-05,
"unit": 1,
"up_lr_weight": "",
"use_cp": false,
"use_wandb": false,
"v2": true,
"v_parameterization": true,
"vae_batch_size": 0,
"wandb_api_key": "",
"weighted_captions": true,
"xformers": true
}
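For a sense of scale, total optimizer steps can be estimated from these settings. Assuming ~2300 images at an average of 5 repeats (actual per-folder repeats varied from 5 to 15, as described above):

# Back-of-envelope step count from the config above; the average repeat
# count is an assumption since real per-folder values ranged from 5 to 15.
images = 2300
avg_repeats = 5    # assumption
batch_size = 2     # train_batch_size
grad_accum = 2     # gradient_accumulation_steps
epochs = 20        # epoch

steps_per_epoch = images * avg_repeats // (batch_size * grad_accum)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 2875 per epoch, 57500 total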