3. About ControlNet

3.1 What is ControlNet?

ControlNet is a neural network architecture that controls and guides image generation models by accepting different types of conditional inputs to achieve more precise and diverse image generation. Simply put, it allows customization of generated results by adding conditional images.

3.2 The Function of ControlNet

ControlNet gives more precise control over the elements of the whole image, including the subject, background, and style, so you no longer need a large number of random attempts to land on a usable result. You can refine the control further with ControlNet functions such as Canny (hard edges), Softedge (soft edges), and Depth (depth map).

3.3 Main Functions of ControlNet

3.3.1 Image Reference

This is applicable when you want the generated image to have a similar style or structure to a specific reference image.

For example, you can generate an image in a style similar to a particular work of art.

Original Image

Generated Image

Image Reference Parameter Settings:

Control Weight: Determines the intensity of the control effect displayed in the image, with a default value of 1
Start Step: The step from which this ControlNet starts to affect the image generation content (the starting timing)
End Step: The step from which this ControlNet stops affecting the image generation content (the ending timing)
Steps: By default, the range from 0 to 1 means this ControlNet is effective throughout the entire process. The larger the start step, the later the control takes effect; the smaller the end step, the earlier the control ends, and the higher the degree of freedom of the generated image.
Style Fidelity: The higher the parameter, the higher the degree of reference to the image.
Reference Weight:
- Reference_adain: References only the colors of the input image, using Adaptive Instance Normalization (AdaIN)
- Reference_adain+attn: References the input image using Adaptive Instance Normalization together with an attention connection
- Reference_attn (only): References the input image through the attention connection only
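The Control Weight, Start Step, and End Step settings above can be pictured as a window over the sampling process. The following is a minimal sketch, not the product's implementation: the function names are hypothetical, and it assumes that start and end are fractions of the total step count and that the control is simply switched off outside the window.

```python
def control_strength(step, total_steps, weight=1.0, start=0.0, end=1.0):
    """Return the control weight applied at one sampling step (0-based).

    start and end are fractions of the whole process (0 to 1), mirroring
    the Start Step / End Step settings. Outside that window the control
    is assumed to be off (weight 0).
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = step / total_steps  # fraction of the process already done
    if start <= progress < end:
        return weight
    return 0.0


def active_steps(total_steps, start=0.0, end=1.0):
    """List the sampling steps on which the control has any effect."""
    return [s for s in range(total_steps)
            if control_strength(s, total_steps, 1.0, start, end) > 0]
```

Under these assumptions, with 20 sampling steps the default range of 0 to 1 keeps the control active on every step, while a start of 0.25 and an end of 0.75 limits it to steps 5 through 14, leaving the first and last quarters of the process to the prompt alone.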

3.3.2 Hard Edges

Using edge detection, this function extracts a fairly fine line drawing of the outlines of people or objects in the image (it captures most of the detail in the original), so that the AI generates images that follow those contour lines.

Original Image

Process Image

Generated Image

Note: Use a reference image with a clean composition and as little clutter around the main subject as possible, so that the extracted contour lines are clearer.

Hard Edge Parameter Settings:

Control Weight: Determines the intensity of the control effect displayed in the image, with a default value of 1
Start Step: The step from which this ControlNet starts to affect the image generation content (the starting timing)
End Step: The step from which this ControlNet stops affecting the image generation content (the ending timing)
Steps: By default, the range from 0 to 1 means this ControlNet is effective throughout the entire process. The larger the start step, the later the control takes effect; the smaller the end step, the earlier the control ends, and the higher the degree of freedom of the generated image.
Low Threshold: The lower the value, the finer and more scattered the extracted lines, and the more detail is kept; raising the low threshold leaves only a general outline.
High Threshold: The higher the value, the cleaner the extracted line drawing.

Note:
To retain more detail in the image, lower the low threshold; to reduce detail, raise the low threshold. In general, keeping the default value is recommended.
If the generated image is too similar to the reference image, you can consider reducing the control weight or enriching the prompt.
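The effect of the two thresholds can be illustrated with the double-threshold (hysteresis) stage of Canny edge detection. This is a simplified sketch under stated assumptions: it covers only the thresholding stage, works on a small hand-made grid of gradient magnitudes (the `GRADIENT` values are invented for illustration), and does not claim to match how the product's Hard Edges preprocessor is implemented.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Keep strong edges (>= high) plus weak edges (>= low) touching them."""
    rows, cols = len(magnitude), len(magnitude[0])
    keep = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Strong edges are always kept and act as seeds.
    for r in range(rows):
        for c in range(cols):
            if magnitude[r][c] >= high:
                keep[r][c] = True
                queue.append((r, c))
    # Grow outwards from the seeds through 8-connected weak edges.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not keep[nr][nc] and magnitude[nr][nc] >= low):
                    keep[nr][nc] = True
                    queue.append((nr, nc))
    return keep


def count_edges(magnitude, low, high):
    """Count the pixels that survive thresholding."""
    return sum(cell for row in hysteresis(magnitude, low, high) for cell in row)


# Hypothetical gradient magnitudes, for illustration only.
GRADIENT = [
    [200, 120,  60,   0],
    [  0,   0,   0,   0],
    [ 90,   0,   0, 150],
]
```

On this grid, thresholds of 100 and 180 keep 2 edge pixels. Lowering the low threshold to 50 keeps 3, because a fainter pixel connected to the strong edge now survives. Lowering the high threshold to 140 instead also keeps 3, because a second pixel now counts as a strong edge.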

3.3.3 Soft Edges

Soft Edges picks up slightly less detail than Hard Edges. It lets the AI redraw some of the content, giving it more creative freedom, and suits images with soft transitions and blurred edges.

Original Image

Process Image

Generated Image

3.3.4 Depth

This function extracts the depth information in the image and distinguishes the front-to-back relationship of its content: the whiter an area, the closer it is to the viewer. It then generates an image with the same depth structure as the original, which better preserves the front-to-back relationship of people or objects. It is suitable for applications such as portrait photography, 3D modeling, 3D posters, architectural design, and background drawing.

Original Image

Process Image

Generated Image

Simply put, if you are very satisfied with the composition of an image but not the content, or want to generate an image with the same structure as a reference sample but different content, you can use the Depth function.
Note: Depth extraction loses some details of the original image and only identifies the general outline and composition.
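The "whiter means closer" convention of a depth map can be sketched as a simple normalization. This is an illustrative example only: it assumes the distance of every pixel from the camera is already known, whereas a real depth preprocessor has to estimate it from a single image, and the function name is hypothetical.

```python
def depth_to_gray(distances):
    """Map distances from the camera to 0-255 gray values, nearest = whitest."""
    flat = [d for row in distances for d in row]
    nearest, farthest = min(flat), max(flat)
    span = farthest - nearest
    if span == 0:
        # Everything is at the same distance: no depth structure to show.
        return [[255 for _ in row] for row in distances]
    return [[round(255 * (farthest - d) / span) for d in row]
            for row in distances]


# A subject 1 m and 2 m away in front of a wall 6 m away.
gray = depth_to_gray([[1.0, 2.0], [6.0, 6.0]])
# gray == [[255, 204], [0, 0]]
```

Only relative distance survives this mapping, which is consistent with the note above: fine detail is lost, while the overall composition and the front-to-back ordering are kept.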

Depth Parameter Settings:

Control Weight: Determines the intensity of the control effect displayed in the image, with a default value of 1.
Start Step: The step from which this ControlNet starts to affect the image generation content (the starting timing).
End Step: The step from which this ControlNet stops affecting the image generation content (the ending timing).
Steps: By default, the range from 0 to 1 means this ControlNet is effective throughout the entire process. The larger the start step, the later the control takes effect; the smaller the end step, the earlier the control ends, and the higher the degree of freedom of the generated image.

3.3.5 Sketch/Scribble

Sketch/Scribble discards many details. It suits rough drafts and scribbles, or cases where much of the original image should be redrawn while its general shape is retained, letting the AI generate a more refined and detailed image.

Original Image

Process Image

Generated Image

Sketch/Scribble Parameter Settings:

Control Weight: Determines the intensity of the control effect displayed in the image, with a default value of 1
Start Step: The step from which this ControlNet starts to affect the image generation content (the starting timing)
End Step: The step from which this ControlNet stops affecting the image generation content (the ending timing)
Steps: By default, the range from 0 to 1 means this ControlNet is effective throughout the entire process. The larger the start step, the later the control takes effect; the smaller the end step, the earlier the control ends, and the higher the degree of freedom of the generated image.

3.3.6 Human Pose (Real-Person Photos)

Using pose recognition, this function identifies the pose of the person in the image and generates new images in the same pose. It is suitable for character design and other fields that require images of people in specific poses.

Original

Process

Generated 1

Generated 2

Human Pose (Real-Person Photos) Parameter Settings:

Control Weight: Determines the intensity of the control effect displayed in the image, with a default value of 1
Start Step: The step from which this ControlNet starts to affect the image generation content (the starting timing)
End Step: The step from which this ControlNet stops affecting the image generation content (the ending timing)
Steps: By default, the range from 0 to 1 means this ControlNet is effective throughout the entire process. The larger the start step, the later the control takes effect; the smaller the end step, the earlier the control ends, and the higher the degree of freedom of the generated image.

3.3.7 Line Art

This function converts the image into stylized lines and generates images from them. The biggest difference between an image generated under Line Art control and the original is usually the color scheme. It is suitable for any scenario that requires generating detailed images from line art, as well as 2D to 3D conversion and line art coloring.

Original Image

Process Image

Generated Image

If you only like the line art composition of the image but are not very satisfied with the color matching, you can use the Line Art control to process your image.

3.3.8 Anime Line Art

It generates line drawings in anime style. Compared with regular line art, it extracts fewer details but is better at generating anime characters. It is suitable for the creation of comics, animations, and related artworks.

Original Image

Process Image

Generated Image

Anime Line Art Parameter Settings:

Control Weight: Determines the intensity of the control effect displayed in the image, with a default value of 1
Start Step: The step from which this ControlNet starts to affect the image generation content (the starting timing)
End Step: The step from which this ControlNet stops affecting the image generation content (the ending timing)
Steps: By default, the range from 0 to 1 means this ControlNet is effective throughout the entire process. The larger the start step, the later the control takes effect; the smaller the end step, the earlier the control ends, and the higher the degree of freedom of the generated image.

3.3.9 Straight Lines (MLSD)

This function analyzes the straight-line structure and geometric shapes in the image to reconstruct the appearance of buildings, and suits fairly regular scenes such as interior and architectural design.

Original Image

Process Image

Generated Image

Straight Lines (MLSD) Parameter Settings:

Control Weight: Determines the intensity of the control effect displayed in the image, with a default value of 1
Start Step: The step from which this ControlNet starts to affect the image generation content (the starting timing)
End Step: The step from which this ControlNet stops affecting the image generation content (the ending timing)
Steps: By default, the range from 0 to 1 means this ControlNet is effective throughout the entire process. The larger the start step, the later the control takes effect; the smaller the end step, the earlier the control ends, and the higher the degree of freedom of the generated image.
Score Threshold: Filters lines by how strongly they are detected as straight. The larger the value, the more lines are filtered out, leaving only the clearest straight lines.
Distance Threshold: Filters lines by length. The larger the value, the more short lines are filtered out, and the cleaner the lines extracted from the image.
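The two thresholds above can be read as two independent filters applied to the detected line segments. The sketch below is an assumption-laden illustration: the `(x1, y1, x2, y2, score)` tuple layout, the 0 to 1 score scale, the pixel units, and the sample `SEGMENTS` are all hypothetical and are not taken from the product or from any specific detector.

```python
import math


def filter_segments(segments, score_threshold, distance_threshold):
    """Keep segments that are both confident enough and long enough.

    Each segment is (x1, y1, x2, y2, score); coordinates are in pixels and
    score is the detector's confidence that the segment is a straight line.
    """
    kept = []
    for x1, y1, x2, y2, score in segments:
        length = math.hypot(x2 - x1, y2 - y1)
        if score >= score_threshold and length >= distance_threshold:
            kept.append((x1, y1, x2, y2, score))
    return kept


# Hypothetical detections: a long confident wall edge, a short confident
# fragment, and a long but doubtful line.
SEGMENTS = [
    (0, 0, 100, 0, 0.9),
    (0, 0, 3, 4, 0.8),
    (10, 10, 70, 90, 0.2),
]
```

With both thresholds near zero all three segments pass. Raising the score threshold to 0.5 drops the doubtful line, and additionally raising the distance threshold to 20 drops the 5-pixel fragment, leaving only the long, confident edge.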

3.3.10 Semantic Segmentation

This function controls the composition and content of the generated image by labeling the different regions of the image with color blocks according to what they contain. It is suitable for changing the style of large-scene images.

Original

Process

Generated 1

Generated 2

3.3.11 Normal Map

This function generates images from the normal vectors extracted from the original image. The normal map records the bumps and hollows of each surface, which lets the AI generate finer detail and simulate complex lighting and texture effects. It is suitable for 3D modeling, where it improves the detail and realism of model surfaces.

Original Image

Process Image

Generated Image
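To show how a normal map records bumps and hollows, the sketch below derives per-pixel normal vectors from a small height map and encodes them as RGB colors. It is only an illustration under assumptions: a real Normal Map preprocessor estimates normals from an ordinary image rather than from known heights, the function names are hypothetical, and axis and sign conventions differ between tools.

```python
import math


def normals_from_height(height, strength=1.0):
    """Estimate a unit surface normal per pixel from a 2D height map."""
    rows, cols = len(height), len(height[0])
    normals = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Finite differences, clamped at the image border.
            left = height[r][max(c - 1, 0)]
            right = height[r][min(c + 1, cols - 1)]
            up = height[max(r - 1, 0)][c]
            down = height[min(r + 1, rows - 1)][c]
            nx = -(right - left) * strength
            ny = -(down - up) * strength
            nz = 1.0
            norm = math.sqrt(nx * nx + ny * ny + nz * nz)
            row.append((nx / norm, ny / norm, nz / norm))
        normals.append(row)
    return normals


def normal_to_rgb(normal):
    """Encode a unit normal with components in [-1, 1] as 0-255 RGB."""
    return tuple(round((v + 1.0) * 127.5) for v in normal)
```

A perfectly flat area has the normal (0, 0, 1), which this encoding turns into (128, 128, 255), the bluish tone typical of normal maps. Sloped areas shift the red and green channels, which is how the map carries surface relief.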

3.4 Reference for Single ControlNet Application Methods

Line Art Coloring (Hard Edges, Soft Edges, Line Art, Anime Line Art)
Architectural/Interior Design (Straight Lines)
Background Replacement (Depth)
Controlling Human Movements (Human Pose)

3.5 Reference for Multiple ControlNet Application Methods

Controlling Characters and Background: Human Pose + Depth. Human Pose controls the posture, and Depth controls the background content. Set the Human Pose weight higher than the Depth weight so that the character's pose takes priority over depth.
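One way to picture how the weights of several ControlNets interact is as a weighted sum of their guidance signals. This is a simplified sketch, not a description of this product's internals: it assumes each ControlNet contributes an additive signal scaled by its control weight, and the function name, the dictionary keys, and the numbers are hypothetical.

```python
def combine_controls(signals, weights):
    """Sum per-ControlNet guidance values, each scaled by its control weight.

    signals maps a ControlNet name to a list of guidance values, and
    weights maps the same names to their control weights.
    """
    if not signals:
        return []
    length = len(next(iter(signals.values())))
    combined = [0.0] * length
    for name, values in signals.items():
        if len(values) != length:
            raise ValueError("all signals must have the same length")
        for i, value in enumerate(values):
            combined[i] += weights[name] * value
    return combined


# Pose weighted higher than depth, as recommended above.
guidance = combine_controls(
    {"pose": [1.0, 0.0], "depth": [0.0, 1.0]},
    {"pose": 1.0, "depth": 0.5},
)
# guidance == [1.0, 0.5]
```

Under this assumption, lowering the Depth weight relative to Human Pose reduces how strongly the background structure competes with the pose.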
