Summary
PostSharp.Patterns.Caching.Backends.Redis can deadlock on dispose (unbounded, no timeout) when the Redis pub/sub notification-processing thread was never started. A transient Redis connect/subscribe hiccup is enough to trigger it. The stuck thread is the (single) finalizer thread, which then prevents the host process from exiting.
This is the product root cause behind the CI test-target hang reported in #22 (the TimeSensitiveTest target runs the Caching/Redis tests, and a hung finalizer keeps the test host alive until the build is killed). The same source code passes or hangs depending only on the runtime Redis condition, so this is not a code regression but a latent unbounded-wait bug.
Affected component
Patterns/Caching/PostSharp.Patterns.Caching.Backends.Redis/RedisNotificationQueue.cs
(also the sync-over-async dispose in RedisCachingBackend.cs)
Root cause
In RedisNotificationQueue.InitAsync the channel-subscription loop runs before the processing thread is started:
- RedisNotificationQueue.cs:86-114: SubscribeAsync + while (!IsConnected(channel)) (with a connectionTimeout) + PingAsync, all of which can throw on a transient hiccup or cancellation.
- RedisNotificationQueue.cs:116: notificationProcessingThread.Start(...) only runs after the loop succeeds.
notificationProcessingThreadCompleted (a TaskCompletionSource) is signalled only in the processing thread's finally block (RedisNotificationQueue.cs:254-262). If the subscription loop throws, the thread never starts, so that TCS is never completed.
Any subsequent dispose then blocks forever:
- RedisNotificationQueue.cs:352 (sync Dispose): notificationProcessingThreadCompleted.Task.Wait(), with no timeout and no cancellation.
- RedisNotificationQueue.cs:353 / :426: notificationProcessingThread.Join(), unbounded.
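The ordering problem described above can be reproduced in isolation. The following is a minimal sketch, not the actual PostSharp source: SketchQueue, HangDemo, and the subscribeFails flag are hypothetical names introduced for illustration. The probe uses a bounded wait so the sketch itself cannot hang; the real Dispose uses the unbounded Task.Wait().

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the queue; not the actual PostSharp source.
public sealed class SketchQueue
{
    private readonly TaskCompletionSource<bool> threadCompleted = new TaskCompletionSource<bool>();
    private readonly Thread processingThread;

    public SketchQueue()
    {
        this.processingThread = new Thread(this.ProcessNotifications) { IsBackground = true };
    }

    // Mirrors the ordering described above: subscribe first, start the thread last.
    public void Init(bool subscribeFails)
    {
        if (subscribeFails)
        {
            // Simulated transient failure: the thread is never started.
            throw new TimeoutException("simulated transient subscribe failure");
        }

        this.processingThread.Start();
    }

    private void ProcessNotifications()
    {
        try
        {
            // Real notification processing would happen here.
        }
        finally
        {
            // The only place the completion source is signalled.
            this.threadCompleted.TrySetResult(true);
        }
    }

    // Bounded probe: returns false if the completion source was not signalled in time.
    public bool WaitForThread(TimeSpan timeout) => this.threadCompleted.Task.Wait(timeout);
}

public static class HangDemo
{
    public static bool CompletesAfterFailedInit()
    {
        var queue = new SketchQueue();
        try { queue.Init(subscribeFails: true); } catch (TimeoutException) { }
        return queue.WaitForThread(TimeSpan.FromMilliseconds(200));
    }

    public static bool CompletesAfterSuccessfulInit()
    {
        var queue = new SketchQueue();
        queue.Init(subscribeFails: false);
        return queue.WaitForThread(TimeSpan.FromSeconds(5));
    }
}
```

In this sketch, CompletesAfterFailedInit returns false because nothing ever signals the completion source, which is the condition under which an unbounded wait would never return.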
Because a failed RedisCachingBackend.Create / CreateAsync does not dispose the partially-initialized backend (RedisCachingBackend.cs:240, :280), the orphaned RedisNotificationQueue is collected and its finalizer (~RedisNotificationQueue -> Dispose(false)) hits the unbounded wait on the finalizer thread. A permanently blocked finalizer thread stops all finalization and prevents clean process exit.
Impact
- A transient Redis pub/sub connect failure can hang the disposing/finalizing thread indefinitely.
- In a host that depends on clean shutdown (e.g. a test host, or a short-lived process), this manifests as a process that never exits.
- Customer-facing risk: applications that create Redis caching backends under flaky network conditions can leak a wedged finalizer thread.
Fix direction
- Always complete notificationProcessingThreadCompleted even when the processing thread never starts (e.g. set it on the init-failure path, or only wait when the thread was actually started).
- Bound every dispose wait (Task.Wait/Thread.Join) with a timeout; log and proceed on timeout rather than blocking forever.
- Dispose the partially-initialized backend/queue when Init/InitAsync throws, so failures clean up deterministically instead of relying on the finalizer.
- Add a deterministic regression test (init fails before thread start -> dispose must return promptly).
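The four points above could be combined roughly as follows. This is a sketch under stated assumptions, not a proposed patch: BoundedDisposeQueue, QueueFactory, RegressionCheck, the subscribe delegate, and the 5-second timeout are all hypothetical, and console logging stands in for whatever logging the product uses.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical illustration of the fix direction; not the actual PostSharp source.
public sealed class BoundedDisposeQueue : IDisposable
{
    private static readonly TimeSpan DisposeTimeout = TimeSpan.FromSeconds(5); // assumed value

    private readonly TaskCompletionSource<bool> threadCompleted = new TaskCompletionSource<bool>();
    private readonly ManualResetEventSlim stopRequested = new ManualResetEventSlim();
    private readonly Thread processingThread;
    private int threadStarted; // 0 = never started, 1 = started

    public BoundedDisposeQueue()
    {
        this.processingThread = new Thread(this.ProcessNotifications) { IsBackground = true };
    }

    public void Init(Action subscribe)
    {
        try
        {
            subscribe(); // may throw on a transient failure
            this.processingThread.Start();
            Volatile.Write(ref this.threadStarted, 1);
        }
        catch
        {
            // Init-failure path: complete the source if the thread will never do it.
            if (Volatile.Read(ref this.threadStarted) == 0)
            {
                this.threadCompleted.TrySetResult(false);
            }

            throw;
        }
    }

    private void ProcessNotifications()
    {
        try { this.stopRequested.Wait(); }
        finally { this.threadCompleted.TrySetResult(true); }
    }

    public void Dispose()
    {
        this.stopRequested.Set();

        // Only wait when the thread was actually started, and bound both waits.
        if (Volatile.Read(ref this.threadStarted) == 1
            && (!this.threadCompleted.Task.Wait(DisposeTimeout) || !this.processingThread.Join(DisposeTimeout)))
        {
            Console.Error.WriteLine("Notification thread did not stop in time; continuing dispose.");
        }
    }
}

public static class QueueFactory
{
    // Disposes the partially-initialized queue instead of leaving it to the finalizer.
    public static BoundedDisposeQueue Create(Action subscribe)
    {
        var queue = new BoundedDisposeQueue();
        try { queue.Init(subscribe); return queue; }
        catch { queue.Dispose(); throw; }
    }
}

public static class RegressionCheck
{
    // Init fails before thread start; Dispose must return promptly.
    public static bool DisposeReturnsPromptlyAfterFailedInit()
    {
        var queue = new BoundedDisposeQueue();
        try { queue.Init(() => throw new TimeoutException("simulated")); } catch (TimeoutException) { }
        Task dispose = Task.Run(() => queue.Dispose());
        return dispose.Wait(TimeSpan.FromSeconds(1));
    }
}
```

Running Dispose on a separate task and bounding the wait lets the regression check fail quickly instead of hanging the test host if the defect reappears.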
Relation
#22 (PostSharp.Patterns TimeSensitiveTest target deadlocks on startup: reproducible hang in threading tests, regression on 2026-06-10) is the CI symptom; this issue is the underlying product defect.