Then the second, related patch:
When you destroy a scheduler, what if there are still jobs pending or running? The logical thing to do would be to abort pending jobs and signal an error on running ones, or do some similar cleanup. What does the current code do? Nothing... it just crashes.
I fixed it so that tearing down the scheduler gracefully aborts all jobs and detaches the hardware callbacks (it can't abort the underlying hardware jobs, but it can decouple them from the scheduler side). In my driver's case, that all works beautifully because my driver internals are basically reference counted everywhere, so while the scheduler and high-level queue can be destroyed, any currently running jobs continue to run to completion or failure and their underlying driver resources get cleaned up then, asynchronously.
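To make the reference-counting point concrete, here is a minimal userspace sketch of the idea, not the actual driver code. All names (`JobResources`, `Scheduler`, `RunningJob`, `teardown_then_complete`) are hypothetical, and plain `std::sync::Arc` stands in for whatever reference counting the real driver uses. It only shows that dropping the scheduler does not free resources a running job still references.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the driver-side resources backing one job.
struct JobResources {
    freed: Arc<AtomicUsize>,
}

impl Drop for JobResources {
    fn drop(&mut self) {
        // Runs only when the last reference to these resources goes away.
        self.freed.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hypothetical scheduler: keeps one reference per submitted job.
struct Scheduler {
    jobs: Vec<Arc<JobResources>>,
}

/// Hypothetical handle for a job that is still executing on the hardware.
struct RunningJob {
    _resources: Arc<JobResources>,
}

impl Scheduler {
    fn submit(&mut self, freed: Arc<AtomicUsize>) -> RunningJob {
        let resources = Arc::new(JobResources { freed });
        // The scheduler and the in-flight job each hold their own reference.
        self.jobs.push(Arc::clone(&resources));
        RunningJob { _resources: resources }
    }
}

/// Returns the number of freed resources observed
/// (after scheduler teardown, after the job finishes).
fn teardown_then_complete() -> (usize, usize) {
    let freed = Arc::new(AtomicUsize::new(0));
    let mut sched = Scheduler { jobs: Vec::new() };
    let running = sched.submit(Arc::clone(&freed));

    // Tear down the scheduler while the job is still "running".
    drop(sched);
    let after_teardown = freed.load(Ordering::SeqCst);

    // The job completes (or fails) later; only now are its resources released.
    drop(running);
    let after_completion = freed.load(Ordering::SeqCst);

    (after_teardown, after_completion)
}
```

Calling `teardown_then_complete()` from a `main` or a test shows the resources surviving scheduler teardown and being released only once the in-flight job lets go of them.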
The maintainer rejected the patch, and said it was the driver's job to ensure that the scheduler outlives job execution.
But the scheduler owns the jobs' lifetimes once you submit them, so how would that work? It doesn't. If you try to introduce a job->scheduler reference, you're creating a reference loop again, and the scheduler deadlocks when it frees a job and tries to tear itself down from within.
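The loop can be sketched in a few lines of userspace Rust. This is a rough illustration under assumptions, not kernel code: `Scheduler`, `Job` and `scheduler_freed_with_job_backref` are hypothetical names. With `Arc`, the cycle shows up as a scheduler that never gets torn down rather than as a deadlock; in the kernel case described above, the equivalent problem is that the last reference would be dropped from inside the scheduler's own job-freeing path.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical scheduler that owns its submitted jobs.
struct Scheduler {
    jobs: Mutex<Vec<Job>>,
    torn_down: Arc<AtomicBool>,
}

impl Drop for Scheduler {
    fn drop(&mut self) {
        // Only reached once no strong references to the scheduler remain.
        self.torn_down.store(true, Ordering::SeqCst);
    }
}

/// Hypothetical job holding a strong back-reference to its scheduler,
/// i.e. the job->scheduler reference discussed above.
struct Job {
    _sched: Arc<Scheduler>,
}

/// Returns whether the scheduler was torn down after the driver
/// released its own (last external) reference.
fn scheduler_freed_with_job_backref() -> bool {
    let torn_down = Arc::new(AtomicBool::new(false));
    let sched = Arc::new(Scheduler {
        jobs: Mutex::new(Vec::new()),
        torn_down: Arc::clone(&torn_down),
    });

    // Scheduler owns the job, job references the scheduler: a cycle.
    sched.jobs.lock().unwrap().push(Job {
        _sched: Arc::clone(&sched),
    });

    // The driver lets go of the scheduler...
    drop(sched);

    // ...but the job inside it still keeps it alive, so Drop never ran.
    torn_down.load(Ordering::SeqCst)
}
```

Here `scheduler_freed_with_job_backref()` reports that the scheduler was not torn down: the only thing that could release it is a job that the scheduler itself owns.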
So now we're back at having to introduce an asynchronous cleanup workqueue or similar, just to deal with the DRM scheduler's incredibly poor lifetime design choices.