inference_realesrgan.py
Command-line script for upsampling images using Real-ESRGAN models.
Usage
Arguments
Input image or folder path
Model name to use for inference. Available models:
  RealESRGAN_x4plus - General 4x upsampling
  RealESRNet_x4plus - 4x upsampling without GAN
  RealESRGAN_x4plus_anime_6B - Anime images 4x
  RealESRGAN_x2plus - General 2x upsampling
  realesr-animevideov3 - Anime video 4x
  realesr-general-x4v3 - General purpose 4x with denoise control
Output folder path
Denoise strength. Range: 0 (weak denoise, keep noise) to 1 (strong denoise ability). Only used for the realesr-general-x4v3 model.
The final upsampling scale of the image
Optional custom model path. Usually not needed as models are downloaded automatically
Suffix of the restored image filename
Tile size for processing. 0 means no tiling. Use tiling to avoid out-of-memory errors with large images
Tile padding size to reduce border artifacts
Pre-padding size at each border
Use GFPGAN to enhance faces in the image (flag, no value needed)
Use fp32 (full) precision during inference. Default is fp16 (half precision)
The upsampler for alpha channels (transparency). Options: realesrgan | bicubic
Output image extension. Options: auto | jpg | png. auto uses the same extension as the input
GPU device to use. Can be 0, 1, 2, etc. for multi-GPU systems
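The tile-size and tile-padding arguments described above can be pictured with a small sketch. It is illustrative rather than the script's actual code: it assumes the image is cut into a grid of tile-size squares and that each tile is read with extra padding on every side, clamped to the image bounds. The names `tile_boxes`, `tile`, and `pad` are hypothetical.

```python
import math


def tile_boxes(width, height, tile, pad):
    """Return (core, padded) pixel boxes for each tile as (x0, y0, x1, y1).

    Illustrative only: assumes a simple grid of tile x tile squares whose
    padded read area is clamped to the image bounds.
    """
    if tile <= 0:
        # Tile size 0 means no tiling: the whole image is a single box.
        full = (0, 0, width, height)
        return [(full, full)]

    boxes = []
    for ty in range(math.ceil(height / tile)):
        for tx in range(math.ceil(width / tile)):
            x0, y0 = tx * tile, ty * tile
            x1, y1 = min(x0 + tile, width), min(y0 + tile, height)
            # Padding gives each tile some surrounding context, which is
            # what reduces visible seams at tile borders.
            padded = (max(x0 - pad, 0), max(y0 - pad, 0),
                      min(x1 + pad, width), min(y1 + pad, height))
            boxes.append(((x0, y0, x1, y1), padded))
    return boxes


boxes = tile_boxes(width=1000, height=600, tile=400, pad=10)
print(len(boxes))  # number of tiles in the grid
print(boxes[0])    # core and padded box of the top-left tile
```

A smaller tile size yields more boxes, each covering fewer pixels, which is why lowering it trades speed for a lower peak memory footprint.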
Examples
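As a hedged illustration, the helper below assembles a command line for the image script from Python. The `--outscale` and `--tile` options are named on this page; the short flags `-n`, `-i`, and `-o` follow the upstream Real-ESRGAN README and are not spelled out in the surviving text here, so treat them as an assumption. The function name `build_image_command` and the `inputs` and `results` folder names are placeholders.

```python
import shlex


def build_image_command(input_path, model_name, output_dir, outscale=None, tile=0):
    """Assemble an argv list for inference_realesrgan.py (illustrative helper)."""
    cmd = ["python", "inference_realesrgan.py",
           "-n", model_name,   # model name (assumed short flag)
           "-i", input_path,   # input image or folder (assumed short flag)
           "-o", output_dir]   # output folder (assumed short flag)
    if outscale is not None:
        cmd += ["--outscale", str(outscale)]
    if tile:
        # Only pass --tile when tiling is requested; 0 means no tiling.
        cmd += ["--tile", str(tile)]
    return cmd


cmd = build_image_command("inputs", "RealESRGAN_x4plus", "results", outscale=2, tile=400)
print(shlex.join(cmd))
# To run it from the repository root: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when paths contain spaces.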
Memory Management:
If you encounter CUDA out-of-memory errors, try the --tile option with a smaller tile size (e.g., -t 200 or -t 400). Tiling processes the image in smaller chunks at the cost of slightly slower processing.
File Formats:
RGBA images (with transparency) are automatically saved as PNG regardless of the --ext setting to preserve the alpha channel.
inference_realesrgan_video.py
Command-line script for upsampling videos using Real-ESRGAN models. Optimized for anime videos.
Usage
Arguments
Input video, image, or folder path
Model name to use for inference. Available models:
  realesr-animevideov3 - Optimized for anime videos (default)
  RealESRGAN_x4plus_anime_6B - Anime images/videos 4x
  RealESRGAN_x4plus - General 4x upsampling
  RealESRNet_x4plus - 4x upsampling without GAN
  RealESRGAN_x2plus - General 2x upsampling
  realesr-general-x4v3 - General purpose 4x with denoise control
Output folder path
Denoise strength. Range: 0 (weak denoise, keep noise) to 1 (strong denoise ability). Only used for the realesr-general-x4v3 model.
The final upsampling scale of the video
Suffix of the restored video filename
Tile size for processing. 0 means no tiling. Use tiling to avoid out-of-memory errors
Tile padding size to reduce border artifacts
Pre-padding size at each border
Use GFPGAN to enhance faces in the video (flag, no value needed). Note: Automatically disabled for anime models
Use fp32 (full) precision during inference. Default is fp16 (half precision)
FPS of the output video. If not specified, uses the input video’s FPS
Path to the ffmpeg binary
Extract frames to disk before processing (flag, no value needed). Can be useful for certain workflows
Number of processes to spawn per GPU for parallel processing
The upsampler for alpha channels (transparency). Options: realesrgan | bicubic
Image extension when processing image folders. Options: auto | jpg | png
Examples
Performance Warning:
If you are generating videos larger than 4K resolution, processing will be very slow due to I/O speed limitations. It is highly recommended to decrease the --outscale value.
Multi-GPU Processing:
The script automatically detects available GPUs and can process video segments in parallel. Use --num_process_per_gpu to control how many processes run per GPU. For example, with 2 GPUs and --num_process_per_gpu 2, a total of 4 processes will run in parallel.
FLV Format:
If the input is a .flv file, the script automatically converts it to .mp4 using ffmpeg before processing.
Audio Preservation:
The script automatically preserves the audio track from the input video in the output video.
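The arithmetic in the multi-GPU note above can be sketched as follows. This is a minimal illustration, not the script's implementation: it assumes the video is divided into contiguous, near-equal frame ranges and that worker processes are assigned to GPUs round-robin. The function name `plan_segments` and its return shape are hypothetical.

```python
def plan_segments(total_frames, num_gpus, num_process_per_gpu):
    """Split a frame range into one contiguous segment per worker process.

    Assumptions (not taken from the script): near-equal contiguous segments
    and round-robin assignment of workers to GPUs.
    """
    num_workers = num_gpus * num_process_per_gpu
    base, extra = divmod(total_frames, num_workers)
    plan, start = [], 0
    for worker in range(num_workers):
        # The first `extra` workers take one additional frame each.
        length = base + (1 if worker < extra else 0)
        plan.append({"worker": worker,
                     "gpu": worker % num_gpus,
                     "frames": (start, start + length)})  # end is exclusive
        start += length
    return plan


plan = plan_segments(total_frames=1000, num_gpus=2, num_process_per_gpu=2)
for entry in plan:
    print(entry)
```

With 2 GPUs and 2 processes per GPU this produces 4 segments, matching the example in the note.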
Helper Classes
The video inference script includes two helper classes:
Reader
Handles reading frames from videos, images, or folders. Supports streaming from video files using ffmpeg. Methods:
  get_resolution() - Returns (height, width) of the input
  get_fps() - Returns the FPS of the video
  get_audio() - Returns the audio stream
  get_frame() - Returns the next frame
  close() - Closes the stream reader
Writer
Handles writing frames to output video using ffmpeg with H.264 encoding. Methods:
  write_frame(frame) - Writes a frame to the output video
  close() - Finalizes and closes the output video
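The way the two helpers fit together can be sketched as a read-enhance-write loop. The sketch below uses only the method names listed above and makes one labeled assumption: that get_frame() returns None once the input is exhausted. The protocol classes, `process_stream`, and the in-memory `ListReader` and `ListWriter` stand-ins are hypothetical and exist only so the example runs without ffmpeg or a GPU.

```python
from typing import Callable, Optional, Protocol


class FrameReader(Protocol):
    def get_frame(self) -> Optional[object]: ...
    def close(self) -> None: ...


class FrameWriter(Protocol):
    def write_frame(self, frame: object) -> None: ...
    def close(self) -> None: ...


def process_stream(reader: FrameReader, writer: FrameWriter,
                   enhance: Callable[[object], object]) -> int:
    """Read frames until exhausted, enhance each one, and write it out."""
    count = 0
    try:
        while True:
            frame = reader.get_frame()
            if frame is None:  # assumed end-of-stream signal
                break
            writer.write_frame(enhance(frame))
            count += 1
    finally:
        # Close both ends even if enhancement raises.
        reader.close()
        writer.close()
    return count


class ListReader:
    """In-memory stand-in for Reader."""
    def __init__(self, frames):
        self._frames = iter(frames)
        self.closed = False

    def get_frame(self):
        return next(self._frames, None)

    def close(self):
        self.closed = True


class ListWriter:
    """In-memory stand-in for Writer."""
    def __init__(self):
        self.frames = []
        self.closed = False

    def write_frame(self, frame):
        self.frames.append(frame)

    def close(self):
        self.closed = True


reader, writer = ListReader(["f0", "f1", "f2"]), ListWriter()
n = process_stream(reader, writer, enhance=str.upper)
print(n, writer.frames)
```

Closing both objects in a finally block matters for a real ffmpeg-backed writer, since an unfinalized stream can leave a truncated output file.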