2023-12-04 AIGC - Stable Diffusion and SadTalker - Setup and Usage
Summary:
2023-12-04 AIGC - SadTalker - Setup and Usage
Code repositories:
https://github.com/Stability-AI/stablediffusion
https://github.com/camenduru/stable-diffusion-webui-colab
https://github.com/OpenTalker/SadTalker
https://github.com/adofsauron/SadTalker-Video-Lip-Sync
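Documentation:
Stable Diffusion one-click installer package, Windows version - Stable Diffusion中文网 (Stable Diffusion Chinese site)
Setup tutorial and demo of SadTalker, a free open-source virtual digital human tool similar to D-ID - Bilibili
SadTalker parameter tuning experiments - Zhihu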
https://github.com/OpenTalker/SadTalker/blob/main/docs/best_practice.md
SadTalker parameters:
(sadtalker) PS D:\sd\SadTalker> python inference.py --help
usage: inference.py [-h] [--driven_audio DRIVEN_AUDIO] [--source_image SOURCE_IMAGE] [--ref_eyeblink REF_EYEBLINK] [--ref_pose REF_POSE] [--checkpoint_dir CHECKPOINT_DIR]
                    [--result_dir RESULT_DIR] [--pose_style POSE_STYLE] [--batch_size BATCH_SIZE] [--size SIZE] [--expression_scale EXPRESSION_SCALE]
                    [--input_yaw INPUT_YAW [INPUT_YAW ...]] [--input_pitch INPUT_PITCH [INPUT_PITCH ...]] [--input_roll INPUT_ROLL [INPUT_ROLL ...]] [--enhancer ENHANCER]
                    [--background_enhancer BACKGROUND_ENHANCER] [--cpu] [--face3dvis] [--still] [--preprocess {crop,extcrop,resize,full,extfull}] [--verbose] [--old_version]
                    [--net_recon {resnet18,resnet34,resnet50}] [--init_path INIT_PATH] [--use_last_fc USE_LAST_FC] [--bfm_folder BFM_FOLDER] [--bfm_model BFM_MODEL]
                    [--focal FOCAL] [--center CENTER] [--camera_d CAMERA_D] [--z_near Z_NEAR] [--z_far Z_FAR]

optional arguments:
  -h, --help            show this help message and exit
  --driven_audio DRIVEN_AUDIO
                        path to driven audio
  --source_image SOURCE_IMAGE
                        path to source image
  --ref_eyeblink REF_EYEBLINK
                        path to reference video providing eye blinking
  --ref_pose REF_POSE   path to reference video providing pose
  --checkpoint_dir CHECKPOINT_DIR
                        path to output
  --result_dir RESULT_DIR
                        path to output
  --pose_style POSE_STYLE
                        input pose style from [0, 46)
  --batch_size BATCH_SIZE
                        the batch size of facerender
  --size SIZE           the image size of the facerender
  --expression_scale EXPRESSION_SCALE
                        the batch size of facerender
  --input_yaw INPUT_YAW [INPUT_YAW ...]
                        the input yaw degree of the user
  --input_pitch INPUT_PITCH [INPUT_PITCH ...]
                        the input pitch degree of the user
  --input_roll INPUT_ROLL [INPUT_ROLL ...]
                        the input roll degree of the user
  --enhancer ENHANCER   Face enhancer, [gfpgan, RestoreFormer]
  --background_enhancer BACKGROUND_ENHANCER
                        background enhancer, [realesrgan]
  --cpu
  --face3dvis           generate 3d face and 3d landmarks
  --still               can crop back to the original videos for the full body aniamtion
  --preprocess {crop,extcrop,resize,full,extfull}
                        how to preprocess the images
  --verbose             saving the intermedia output or not
  --old_version         use the pth other than safetensor version
  --net_recon {resnet18,resnet34,resnet50}
                        useless
  --init_path INIT_PATH
                        Useless
  --use_last_fc USE_LAST_FC
                        zero initialize the last fc
  --bfm_folder BFM_FOLDER
  --bfm_model BFM_MODEL
                        bfm model
  --focal FOCAL
  --center CENTER
  --camera_d CAMERA_D
  --z_near Z_NEAR
  --z_far Z_FAR
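
As a rough illustration of how the options listed above fit together, here is a minimal Python sketch that assembles an argument list for inference.py. The helper name build_sadtalker_command, the default result directory, and the file paths in the example are assumptions made for illustration; only the flag names, the --preprocess choices, and the [0, 46) range for --pose_style come from the help output.

```python
import sys
from typing import List, Optional

# Choices copied from the --preprocess entry in the help output.
PREPROCESS_CHOICES = ("crop", "extcrop", "resize", "full", "extfull")


def build_sadtalker_command(
    driven_audio: str,
    source_image: str,
    result_dir: str = "./results",  # assumed default, not taken from the help text
    pose_style: int = 0,
    preprocess: str = "crop",
    still: bool = False,
    enhancer: Optional[str] = None,
) -> List[str]:
    """Build an argument list for inference.py (hypothetical helper)."""
    # The help output documents --pose_style as a value from [0, 46).
    if pose_style not in range(0, 46):
        raise ValueError("pose_style must be in [0, 46)")
    if preprocess not in PREPROCESS_CHOICES:
        raise ValueError(f"preprocess must be one of {PREPROCESS_CHOICES}")

    cmd = [
        sys.executable, "inference.py",
        "--driven_audio", driven_audio,
        "--source_image", source_image,
        "--result_dir", result_dir,
        "--pose_style", str(pose_style),
        "--preprocess", preprocess,
    ]
    if still:
        cmd.append("--still")
    if enhancer is not None:
        # The help output lists gfpgan and RestoreFormer as face enhancers.
        cmd += ["--enhancer", enhancer]
    return cmd


if __name__ == "__main__":
    # Hypothetical input files; replace with real paths.
    example = build_sadtalker_command(
        "input/speech.wav", "input/portrait.png",
        preprocess="full", still=True, enhancer="gfpgan",
    )
    print(" ".join(example))
```

The function only builds the list and does not launch anything. To actually run it, the list could be passed to subprocess.run(cmd, cwd=..., check=True) with cwd pointing at the local SadTalker checkout, inside the activated sadtalker environment.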