amdgpu gpu_recovery is impressive but required too often

amdgpu keeps hanging when using ROCm
by Luna Nova

AMDGPU keeps impressing me by recovering from GPU crashes without interrupting my X session.

It keeps letting me down by crashing whenever I run mixed compute and graphics loads on the same GPU.

Very mixed feelings on this. Can I trade the awesome crash recovery for not crashing in the first place, please?

See the AMD section of the Stable Diffusion post for more details.
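If you want to check whether recovery is switched on for your own setup, here is a minimal sketch. It assumes the `gpu_recovery` module parameter is exposed at `/sys/module/amdgpu/parameters/gpu_recovery` and that -1, 0 and 1 mean auto, disabled and enabled. Both are assumptions about your kernel version, and the helper name `describe_gpu_recovery` is made up for illustration.

```python
from pathlib import Path

# Assumed sysfs location of the amdgpu module parameter; may differ on your kernel.
GPU_RECOVERY_PARAM = Path("/sys/module/amdgpu/parameters/gpu_recovery")

# Assumed meanings of the common values; newer kernels may define more.
MEANINGS = {-1: "auto", 0: "disabled", 1: "enabled"}


def describe_gpu_recovery(param_path=GPU_RECOVERY_PARAM):
    """Return a human-readable description of the gpu_recovery setting."""
    try:
        raw = Path(param_path).read_text().strip()
    except OSError:
        # Missing file usually means the amdgpu module is not loaded.
        return "unknown (amdgpu not loaded or parameter not readable)"
    try:
        value = int(raw)
    except ValueError:
        return f"unrecognised value {raw!r}"
    return MEANINGS.get(value, f"unrecognised value {value}")


if __name__ == "__main__":
    print("amdgpu gpu_recovery:", describe_gpu_recovery())
```

This only reports the setting. It says nothing about why the GPU hung in the first place, which is the part I'd actually like fixed.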


Cite as BibTeX
@misc{amdgpu-rocm-pain,
    author = {Luna Nova},
    title = {amdgpu gpu_recovery is impressive but required too often},
    year = {2022},
    url = {https://lunnova.dev/articles/amdgpu-rocm-pain/},
    howpublished = {https://lunnova.dev/articles/amdgpu-rocm-pain/},
    urldate = {2022-10-01},
    note = {lunnova.dev - amdgpu keeps hanging when using ROCM}
}

tagged amdgpu rocm linux