Error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
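
This message generally means the installed binary was not compiled for the GPU's compute capability. The sketch below is one way to check that, not a definitive diagnosis. The helper `is_arch_supported` is hypothetical (it is not part of PyTorch) and models only the usual compatibility rules: an `sm_XY` image runs on devices with the same major version and an equal or higher minor version, and a `compute_XY` PTX image can be JIT-compiled for any device of equal or higher capability. The only PyTorch calls used are `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list`, and they are skipped if PyTorch is not installed.

```python
import re


def is_arch_supported(capability, arch_list):
    """Return True if a build listing `arch_list` should hold a usable kernel image.

    capability: (major, minor) tuple, e.g. (8, 6)
    arch_list:  strings such as "sm_80" or "compute_80"
    Hypothetical helper for illustration; it ignores suffixed entries like "sm_90a".
    """
    major, minor = capability
    for arch in arch_list:
        m = re.fullmatch(r"(sm|compute)_(\d+)(\d)", arch)
        if m is None:
            continue  # entry not modelled by this sketch
        kind, a_major, a_minor = m.group(1), int(m.group(2)), int(m.group(3))
        # Compiled binary (cubin): same major version, minor not above the device's.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX: forward-compatible with any device of equal or higher capability.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None  # PyTorch not installed; nothing to inspect

    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        archs = torch.cuda.get_arch_list()
        print("device capability:", cap)
        print("architectures in this build:", archs)
        print("kernel image expected:", is_arch_supported(cap, archs))
```

If the last line prints `False`, the likely fix is installing a PyTorch build (or rebuilding the extension) that targets the device's architecture. Setting `CUDA_LAUNCH_BLOCKING=1` only makes the stack trace accurate; it does not change which kernels are available.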