[rk3588 debian] Fixing a CPU deadlock
1 Problem
Running the customer's program below on an rk3588 machine triggers "BUG: spinlock recursion on CPU#0":
./rtsp RtspWrapper.xml
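What the application does: an IP camera pushes a stream, and the program pulls it over RTSP, then crops and scales the video. First, it processes 25 images per second, matching the video frame rate. Second, every image goes through color conversion and scaling. Finally, depending on the number of people detected in the image, it runs color conversion, cropping, and scaling in parallel. Overall pipeline: stream pull, VPU decoding, GPU image processing, NPU algorithm analysis.
The log produced when the bug occurs is shown below.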
[WARNING] fastdeploy/runtime/backends/rknpu2/rknpu2_backend.cc(420)::InitRKNNTensorMemory The input tensor type != model's inputs type.The input_type need FP16,but inputs[0].type is UINT8
[WARNING] fastdeploy/runtime/backends/rknpu2/rknpu2_backend.cc(420)::InitRKNNTensorMemory The input tensor type != model's inputs type.The input_type need FP16,but inputs[0].type is UINT8
[ 529.440796] BUG: spinlock recursion on CPU#0, Pose/3919
[ 529.440830] lock: 0xffffff81058a98b8, .magic: dead4ead, .owner: Pose/3919, .owner_cpu: 0
[ 529.440840] CPU: 0 PID: 3919 Comm: Pose Tainted: G W O 5.10.160 #22
[ 529.440845] Hardware name: Rockchip RK3588 IR88MX01 LP4X V10 Board (DT)
[ 529.440852] Call trace:
[ 529.440863] dump_backtrace+0x0/0x1a8
[ 529.440871] show_stack+0x1c/0x24
[ 529.440880] dump_stack_lvl+0xc4/0xf0
[ 529.440886] dump_stack+0x14/0x2c
[ 529.440893] spin_bug+0x8c/0xac
[ 529.440899] do_raw_spin_lock+0x5c/0xd4
[ 529.440906] _raw_spin_lock+0x14/0x1c
[ 529.440915] rknpu_job_subcore_commit.isra.0+0x3c/0x254
[ 529.440921] rknpu_job_commit+0x54/0xa4
[ 529.440927] rknpu_job_next+0xf0/0xf4
[ 529.440934] rknpu_irq_handler.constprop.0+0x6c/0x2d4
[ 529.440941] rknpu_core0_irq_handler+0x1c/0x24
[ 529.440948] __handle_irq_event_percpu+0xd0/0x200
[ 529.440954] handle_irq_event_percpu+0x34/0x84
[ 529.440959] handle_irq_event+0x4c/0x8c
[ 529.440966] handle_fasteoi_irq+0xa8/0x124
[ 529.440973] generic_handle_irq_desc+0x10/0x18
[ 529.440979] __handle_domain_irq+0xb8/0xc0
[ 529.440986] gic_handle_irq+0x2b0/0x300
[ 529.440992] el1_irq+0xc8/0x180
[ 529.440998] rknpu_job_subcore_commit.isra.0+0x1b0/0x254
[ 529.441004] rknpu_job_commit+0x54/0xa4
[ 529.441010] rknpu_job_next+0xf0/0xf4
[ 529.441017] rknpu_job_schedule+0x210/0x220
[ 529.441023] rknpu_submit_ioctl+0x308/0x68c
[ 529.441029] __rknpu_submit_ioctl+0x44/0x64
[ 529.441037] drm_ioctl_kernel+0xa8/0xf8
[ 529.441042] drm_ioctl+0x2f4/0x33c
[ 529.441050] vfs_ioctl+0x2c/0x48
[ 529.441056] __arm64_sys_ioctl+0x64/0x94
[ 529.441063] el0_svc_common.constprop.0+0xd4/0x184
[ 529.441069] do_el0_svc+0x20/0x28
[ 529.441076] el0_svc+0x1c/0x28
[ 529.441082] el0_sync_handler+0xc8/0x14c
[ 529.441088] el0_sync+0x158/0x180
[ 529.995156] rkvdec2_ccu_timeout_work:1380: fdc38100.rkvdec-core, task 746 state 0xf timeout
[ 530.025518] rga3_reg: RGA3 core[1] soft reset complete.
[ 530.025542] rga_job: rga request[1351] commit failed!
RgaBlit(1465) RGA_BLIT fail: Device or resource busy
[ 530.025546] rga: request[1351] submit failed!
RgaBlit(1466) RGA_BLIT fail: Device or resource busy
[ 530.025554] rga3_reg: RGA3 core[2] soft reset complete.
fd-vir-phy-hnd-format[0, (nil), (nil), 0, 0]
[ 530.025600] rga_job: rga request[1353] commit failed!
rect[286, 0, 314, 988, 1920, 1088, 2560, 0]
[ 530.025604] rga: request[1353] submit failed!
f-blend-size-rotation-col-log-mmu[0, 0, 0, 0, 0, 0, 0]
fd-vir-phy-hnd-format[0, (nil), (nil), 0, 0]
rect[0, 0, 192, 256, 192, 256, 1792, 0]
f-blend-size-rotation-col-log-mmu[0, 0, 0, 0, 0, 0, 0]
This output the user patamaters when rga call blit fail
srect[x,y,w,h] = [286, 0, 314, 988] src[w,h,ws,hs] = [314, 988, 1920, 1088]
drect[x,y,w,h] = [0, 0, 192, 256] dst[w,h,ws,hs] = [192, 256, 192, 256]
usage[0x80000]
RgaBlit(1465) RGA_BLIT fail: Device or resource busy
RgaBlit(1466) RGA_BLIT fail: Device or resource busy
fd-vir-phy-hnd-format[0, (nil), (nil), 0, 0]
rect[0, 0, 1920, 1088, 1920, 1088, 2560, 0]
f-blend-size-rotation-col-log-mmu[0, 0, 0, 0, 0, 0, 0]
fd-vir-phy-hnd-format[0, (nil), (nil), 0, 0]
rect[0, 0, 640, 640, 640, 640, 1792, 0]
f-blend-size-rotation-col-log-mmu[0, 0, 0, 0, 0, 0, 0]
This output the user patamaters when rga call blit fail
srect[x,y,w,h] = [0, 0, 1920, 1088] src[w,h,ws,hs] = [1920, 1088, 1920, 1088]
drect[x,y,w,h] = [0, 0, 640, 640] dst[w,h,ws,hs] = [640, 640, 640, 640]
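Reading the call trace above: the ioctl path (rknpu_submit_ioctl) was inside rknpu_job_subcore_commit when el1_irq fired, and the NPU interrupt handler then entered rknpu_job_subcore_commit again on the same CPU, where do_raw_spin_lock found the lock already owned by Pose/3919 on CPU 0. The sketch below is one way to spot this pattern automatically: it lists functions that appear on both sides of the el1_irq frame. The function name reentered_functions is hypothetical, and the script assumes the usual "symbol+0xoffset/0xsize" frame format seen in this log.

```shell
# Print functions that appear both above and below the el1_irq frame in a
# kernel call trace read from stdin. Frames above el1_irq ran in interrupt
# context; frames below belong to the interrupted code path.
# NOTE: reentered_functions is an illustrative helper, not part of any tool.
reentered_functions() {
    awk '
        {
            # Find the "symbol+0xoff/0xsize" token on this line, if any.
            sym = ""
            for (i = 1; i <= NF; i++)
                if ($i ~ /\+0x[0-9a-f]+\/0x[0-9a-f]+$/) sym = $i
            if (sym == "") next
            sub(/\+0x.*/, "", sym)          # drop the offset/size suffix
            if (sym == "el1_irq") { below = 1; next }
            if (!below)
                irq[sym] = 1                # seen in interrupt context
            else if ((sym in irq) && !(sym in seen)) {
                seen[sym] = 1
                print sym                   # also on the interrupted path
            }
        }
    '
}

# Usage on the target board (commented out so the block has no side effects):
#   dmesg | reentered_functions
```

Any function printed was running when the interrupt arrived and was entered again from the interrupt handler, which is where to look first for a lock taken on both paths.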
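2 Solution
The system originally shipped rknpu driver version 0.82, which is too old. Updating the rknpu driver to version 0.96 resolves the problem. The updated source files are listed below.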
zwzn2064@zwzn2064-CVN-Z690D5-GAMING-PRO:~/sda1/work/rknpu_096/rk3588-linux$ git status
Refresh index: 100% (199723/199723), done.
HEAD detached at origin/develop-ir88mx01-lianyun
Changes not staged for commit:
(use "git add/rm <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: kernel/drivers/firmware/rockchip_sip.c
modified: kernel/drivers/rknpu/Makefile
modified: kernel/drivers/rknpu/include/rknpu_drv.h
modified: kernel/drivers/rknpu/include/rknpu_gem.h
modified: kernel/drivers/rknpu/include/rknpu_ioctl.h
modified: kernel/drivers/rknpu/include/rknpu_iommu.h
modified: kernel/drivers/rknpu/include/rknpu_job.h
modified: kernel/drivers/rknpu/include/rknpu_mem.h
modified: kernel/drivers/rknpu/rknpu_debugger.c
modified: kernel/drivers/rknpu/rknpu_drv.c
modified: kernel/drivers/rknpu/rknpu_gem.c
modified: kernel/drivers/rknpu/rknpu_iommu.c
modified: kernel/drivers/rknpu/rknpu_job.c
modified: kernel/drivers/rknpu/rknpu_mem.c
modified: kernel/drivers/rknpu/rknpu_reset.c
modified: kernel/drivers/soc/rockchip/rockchip_opp_select.c
modified: kernel/include/linux/rockchip/rockchip_sip.h
modified: kernel/include/linux/version_compat_defs.h
modified: kernel/include/soc/rockchip/rockchip_opp_select.h
Untracked files:
(use "git add <file>..." to include in what will be committed)
kernel/drivers/rknpu/include/rknpu_devfreq.h
kernel/drivers/rknpu/rknpu_devfreq.c
no changes added to commit (use "git add" and/or "git commit -a")
zwzn2064@zwzn2064-CVN-Z690D5-GAMING-PRO:~/sda1/work/rknpu_096/rk3588-linux$
To query the NPU driver version, use the command:
dmesg | grep -i rknpu
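To check the result after flashing the new kernel, the version can be pulled out of that output and compared against the expected release. This is a minimal sketch under two assumptions that are not confirmed by the text above: that the kernel log contains a line of the form "Initialized rknpu <version> ...", and that the version is printed in dotted form (for example 0.9.6 for the release called 0.96 here). Both helper names are hypothetical.

```shell
# Extract the rknpu driver version from kernel log text read on stdin.
# ASSUMPTION: the log has a line like "... Initialized rknpu 0.9.6 ...".
rknpu_version() {
    sed -n 's/.*Initialized rknpu \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeed if version $1 is greater than or equal to version $2.
# Uses "sort -V" (version sort, available in GNU coreutils).
version_at_least() {
    [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Usage on the target board (commented out so the block has no side effects):
#   v=$(dmesg | rknpu_version)
#   if version_at_least "$v" 0.9.6; then echo "rknpu $v: updated"; fi
```

If rknpu_version prints nothing, the log line format differs from the assumed one and the raw grep output should be inspected by hand.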