
Hoshino Lina (星乃リナ) 🩵 3D Yuri Wedding 2026!!!

npub1cx…z6wtz

2024-08-30 15:29:20 UTC

in reply to nevent1q…l5sm

Then the second, related patch:

When you destroy a scheduler, what if there are still jobs pending or running? The logical thing to do would be to abort pending jobs and signal an error on running jobs, or do some sort of cleanup like that. What does the current code do? Nothing... it just crashes.

I fixed it so that tearing down the scheduler gracefully aborts all jobs and detaches the hardware callbacks (it can't abort the underlying hardware jobs, but it can decouple them from the scheduler side). In my driver's case, that all works beautifully because my driver internals are basically reference counted everywhere, so while the scheduler and high-level queue can be destroyed, any currently running jobs continue to run to completion or failure and their underlying driver resources get cleaned up then, asynchronously.
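To make the teardown behaviour described above concrete, here is a minimal userspace sketch in Rust. It is not the DRM scheduler API or the actual patch: `Scheduler`, `JobResources`, `JobState` and `teardown_demo` are hypothetical names, and the "hardware" is just the caller holding its own reference. The only point is that, with reference-counted job resources, dropping the scheduler can abort pending jobs and let go of running ones, which then finish and get cleaned up later.

```rust
use std::sync::{Arc, Mutex};

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum JobState {
    Pending,
    Running,
    Aborted,
    Done,
}

/// Hypothetical driver-side job resources, shared by reference count.
pub struct JobResources {
    pub state: Mutex<JobState>,
}

/// Toy scheduler: holds one reference to every job submitted to it.
pub struct Scheduler {
    jobs: Vec<Arc<JobResources>>,
}

impl Scheduler {
    pub fn new() -> Self {
        Scheduler { jobs: Vec::new() }
    }

    /// Submit a job in the given state and hand a second reference back,
    /// standing in for the reference the hardware side would hold.
    pub fn submit(&mut self, state: JobState) -> Arc<JobResources> {
        let job = Arc::new(JobResources { state: Mutex::new(state) });
        self.jobs.push(Arc::clone(&job));
        job
    }
}

impl Drop for Scheduler {
    fn drop(&mut self) {
        for job in self.jobs.drain(..) {
            let mut st = job.state.lock().unwrap();
            if *st == JobState::Pending {
                // Pending jobs never reached the hardware: abort them.
                *st = JobState::Aborted;
            }
            // Running jobs cannot be aborted here; the scheduler only
            // releases its own reference and the job keeps running.
        }
    }
}

/// Tear the scheduler down while one job is pending and one is running.
pub fn teardown_demo() -> (JobState, JobState, usize) {
    let mut sched = Scheduler::new();
    let pending = sched.submit(JobState::Pending);
    let running = sched.submit(JobState::Running);

    drop(sched); // scheduler goes away first

    // The "hardware" completes the running job after the scheduler is gone.
    *running.state.lock().unwrap() = JobState::Done;

    let p = *pending.state.lock().unwrap();
    let r = *running.state.lock().unwrap();
    (p, r, Arc::strong_count(&running))
}
```

In this sketch `teardown_demo()` ends with the pending job aborted, the running job completed, and a single remaining reference to the running job, which is released whenever its last holder drops it.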

The maintainer rejected the patch, and said it was the driver's job to ensure that the scheduler outlives job execution.

But the scheduler owns the jobs lifetime-wise after you submit them, so how would that work? It doesn't. If you try to introduce a job->scheduler reference, you're creating a loop again, and the scheduler deadlocks when it frees a job and tries to tear itself down from within.
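The ownership loop can be sketched in a few lines of Rust. Again, this is a toy model under assumed names (`Scheduler`, `Job`, `cycle_demo`), not kernel code, and it does not reproduce the deadlock itself, only the loop behind it: once the scheduler owns the job and the job holds a strong reference back, neither goes away until something breaks the loop by hand, and the final scheduler teardown then happens as a side effect of freeing a job.

```rust
use std::sync::{Arc, Mutex, Weak};

/// Toy scheduler that owns its jobs after submission.
pub struct Scheduler {
    jobs: Mutex<Vec<Arc<Job>>>,
}

/// Toy job with a strong back-reference to its scheduler.
/// This back-reference is the problematic edge.
pub struct Job {
    _sched: Arc<Scheduler>,
}

/// Returns (scheduler alive after the driver drops it,
///          scheduler alive after the jobs are freed by hand).
pub fn cycle_demo() -> (bool, bool) {
    let sched = Arc::new(Scheduler { jobs: Mutex::new(Vec::new()) });
    // Weak probe so we can observe whether the scheduler still exists.
    let probe: Weak<Scheduler> = Arc::downgrade(&sched);

    // Submitting a job creates the loop: scheduler -> job -> scheduler.
    let job = Arc::new(Job { _sched: Arc::clone(&sched) });
    sched.jobs.lock().unwrap().push(job);

    // The driver drops its handle, but the loop keeps everything alive.
    drop(sched);
    let alive_after_drop = probe.upgrade().is_some();

    // Break the loop manually. The jobs are moved out of the list first
    // and freed only after the lock is released, so no scheduler lock is
    // held while job references (and their back-references) are dropped.
    if let Some(s) = probe.upgrade() {
        let drained: Vec<Arc<Job>> = s.jobs.lock().unwrap().drain(..).collect();
        drop(drained);
    }
    let alive_after_break = probe.upgrade().is_some();

    (alive_after_drop, alive_after_break)
}
```

Here `cycle_demo()` reports that the scheduler is still alive after the driver lets go of it, and only disappears once the jobs have been freed explicitly. In a design where the job-free path runs inside the scheduler's own worker, that last step is exactly where the scheduler would end up tearing itself down from within.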

So now we're back at having to introduce an asynchronous cleanup workqueue or similar, just to deal with the DRM scheduler's incredibly poor lifetime design choices.
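For illustration, an asynchronous cleanup queue of that kind might look roughly like the following, using a plain thread and channel in userspace Rust rather than a kernel workqueue. All names (`CleanupWorker`, `QueueResources`, `defer_drop`, `deferred_cleanup_demo`) are made up for this sketch. The idea is that whoever releases the last reference does not run the teardown itself, but hands the object to a separate worker, so the teardown never runs inside the scheduler's own job-free path.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{channel, Sender};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical object whose teardown must not run in the job-free path.
pub struct QueueResources {
    freed: Arc<AtomicUsize>,
}

impl Drop for QueueResources {
    fn drop(&mut self) {
        // Stand-in for the real teardown; just count how many were freed.
        self.freed.fetch_add(1, Ordering::SeqCst);
    }
}

/// Worker thread that drops whatever is sent to it.
pub struct CleanupWorker {
    tx: Option<Sender<QueueResources>>,
    handle: Option<JoinHandle<()>>,
}

impl CleanupWorker {
    pub fn new() -> Self {
        let (tx, rx) = channel::<QueueResources>();
        let handle = thread::spawn(move || {
            // Runs until every sender is gone, dropping each item here,
            // on the worker thread, not in the caller's context.
            for res in rx {
                drop(res);
            }
        });
        CleanupWorker { tx: Some(tx), handle: Some(handle) }
    }

    /// Hand an object over for deferred teardown.
    pub fn defer_drop(&self, res: QueueResources) {
        if let Some(tx) = &self.tx {
            // If the worker is already gone, the object comes back in the
            // error and is dropped right here instead.
            let _ = tx.send(res);
        }
    }

    /// Close the channel and wait for all queued teardowns to finish.
    pub fn shutdown(mut self) {
        drop(self.tx.take());
        if let Some(h) = self.handle.take() {
            h.join().unwrap();
        }
    }
}

/// Queue three objects for deferred teardown and count how many were freed.
pub fn deferred_cleanup_demo() -> usize {
    let freed = Arc::new(AtomicUsize::new(0));
    let worker = CleanupWorker::new();
    for _ in 0..3 {
        worker.defer_drop(QueueResources { freed: Arc::clone(&freed) });
    }
    worker.shutdown();
    freed.load(Ordering::SeqCst)
}
```

This works, but it is extra machinery: a worker, a queue, and a shutdown ordering of its own, all of which exist only to move a drop from one context to another.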

Author Public Key

npub1cxqje22r8ewnr30rt625d58cr8gpdjnw405cq5765m2qsvghnmpquz6wtz
