Distributed Locking Mechanisms: DB, Redis, and ZooKeeper Comparisons

Understanding Distributed Locking Requirements

In distributed systems, concurrent access to shared resources must be controlled—similar to how locks prevent race conditions in single-process multi-threaded environments. A distributed lock ensures safe coordination across multiple nodes and clients. Ideally, such a mechanism should provide:
  • Mutual Exclusion: Only one client may hold the lock for a given resource at any time.
  • Tolerance to Failures: The lock service should remain available even if some nodes fail (favoring AP in CAP terms).
  • No Deadlock: Locks must eventually be released—even if a client crashes or loses connectivity during use.
Additional desirable features include reentrancy, blocking acquisition (waiting threads are parked and woken when the lock frees up, as with Java's AQS), and high performance.
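To make the first and third requirements concrete, here is a minimal in-memory sketch of a lease-based lock table. It is a single-process model only, not a distributed implementation. The class name LeaseLockTable, its method names, and the explicit nowMs parameter are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory model of a lease-based lock table (illustrative, single process). */
class LeaseLockTable {
    private static final class Lease {
        final String owner;
        final long expiresAtMs;

        Lease(String owner, long expiresAtMs) {
            this.owner = owner;
            this.expiresAtMs = expiresAtMs;
        }
    }

    private final Map<String, Lease> leases = new HashMap<>();

    /** Grants the lock if it is free or the previous lease has expired. */
    synchronized boolean tryAcquire(String key, String owner, long nowMs, long ttlMs) {
        Lease current = leases.get(key);
        if (current != null && current.expiresAtMs > nowMs) {
            return false; // mutual exclusion: a live lease already exists
        }
        // No deadlock: an expired lease left by a crashed client is overwritten.
        leases.put(key, new Lease(owner, nowMs + ttlMs));
        return true;
    }

    /** Releases the lock only if the caller is the current owner. */
    synchronized boolean release(String key, String owner) {
        Lease current = leases.get(key);
        if (current == null || !current.owner.equals(owner)) {
            return false;
        }
        leases.remove(key);
        return true;
    }
}
```

This model is deliberately non-reentrant: a second tryAcquire by the same owner fails while the lease is live. Each backend discussed below enforces the same two rules with different storage and expiry mechanisms.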

Distributed Lock via Relational DBMS

A simple table-based locking strategy can be implemented by leveraging uniqueness constraints. Example schema:
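    CREATE TABLE system_lock (
      -- BIGINT: a SMALLINT auto-increment id is exhausted after 65,535 inserts
      id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
      -- VARCHAR: matches the String key passed by the Java code
      lock_key   VARCHAR(128)    NOT NULL,
      owner_id   VARCHAR(64)     NOT NULL,
      created_at TIMESTAMP       NOT NULL DEFAULT CURRENT_TIMESTAMP,
      PRIMARY KEY (id),
      UNIQUE KEY uk_lock_key (lock_key)
    ) ENGINE=InnoDB;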
Each unique lock_key represents a protected resource. Inserting a record attempts to acquire the lock; deletion attempts to release it. Core logic:
    // Acquire: the unique key on lock_key rejects a second insert for the same resource.
    boolean acquireLock(String key, String clientId) {
        try {
            jdbcTemplate.update(
                "INSERT INTO system_lock(lock_key, owner_id) VALUES (?, ?)",
                key, clientId);
            return true;
        } catch (DuplicateKeyException e) {
            return false; // another client holds the lock
        }
    }

    // Release: delete only the row this client owns.
    boolean releaseLock(String key, String clientId) {
        int affected = jdbcTemplate.update(
            "DELETE FROM system_lock WHERE lock_key = ? AND owner_id = ?",
            key, clientId);
        return affected > 0;
    }
To simulate blocking, clients retry in a loop, sleeping between attempts with exponential backoff.
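A retry loop of this kind might look like the sketch below. The class BackoffLockAcquirer, the method acquireWithBackoff, and its delay and timeout parameters are hypothetical names chosen for illustration. The attempt itself is passed in as a BooleanSupplier, so the sketch does not depend on any particular lock backend.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

class BackoffLockAcquirer {
    /**
     * Calls tryAcquire until it succeeds or timeoutMs elapses, doubling the
     * sleep between attempts up to maxDelayMs. initialDelayMs should be > 0.
     */
    static boolean acquireWithBackoff(BooleanSupplier tryAcquire,
                                      long initialDelayMs,
                                      long maxDelayMs,
                                      long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        long delayMs = initialDelayMs;
        while (true) {
            if (tryAcquire.getAsBoolean()) {
                return true;
            }
            long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (remainingMs <= 0) {
                return false; // give up rather than block forever
            }
            try {
                Thread.sleep(Math.min(delayMs, remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
            delayMs = Math.min(delayMs * 2, maxDelayMs);
        }
    }
}
```

With the table-based lock above, a call could look like acquireWithBackoff(() -> acquireLock(key, clientId), 50, 2000, 10000). In practice, adding random jitter to each delay helps keep competing clients from retrying in lockstep.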
Caveats:
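  • Deadlock prevention: Requires periodically cleaning stale entries (e.g., delete records older than 2 minutes). The threshold must exceed the longest expected hold time, otherwise the cleanup removes a lock that is still in use.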
  • Throughput limitation: Typically restricted to ~1,000 ops/sec due to I/O latency and single-database constraint.
  • High availability: Achieved via master-slave replication and failover using virtual IPs.

Distributed Lock via Redis

Redis executes each command atomically, which makes it a good fit for lock implementation. The canonical acquire command:
SET lock-key client-id NX PX 30000
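  • NX: Set only if the key is absent.
  • PX: Set the expiry in milliseconds.
Acquire and release logic (wrapped in Lua scripts for atomicity):
    // Assumes a Jedis-style client where eval(script, keys, args) returns Object.
    String acquire(String key, String id, long ttlMs) {
        Object result = redis.eval(
            "return redis.call('set', KEYS[1], ARGV[1], 'NX', 'PX', ARGV[2])",
            Arrays.asList(key),
            Arrays.asList(id, String.valueOf(ttlMs)));
        return "OK".equals(result) ? id : null;
    }

    boolean release(String key, String id) {
        // Compare-and-delete must run as one script: with a separate GET and DEL,
        // the lock could expire and be re-acquired by another client in between,
        // and this client would then delete the other client's lock.
        Object deleted = redis.eval(
            "if redis.call('get', KEYS[1]) == ARGV[1] then "
                + "return redis.call('del', KEYS[1]) else return 0 end",
            Arrays.asList(key),
            Arrays.asList(id));
        return Long.valueOf(1L).equals(deleted);
    }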
Stale locks and lease renewal: The TTL removes the lock of a crashed client automatically. To keep a lock alive through a long critical section, the holder renews the lease in the background, for example at intervals of one-third of the TTL.
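One way to schedule such renewals is sketched below. The class LeaseRenewer and its renew callback are hypothetical. The callback stands in for whatever extends the lease, for instance a Lua script that calls PEXPIRE only if the stored value still matches the client id. Only the one-third interval comes from the text above.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Renews a lease every ttl/3 until closed or until a renewal fails. */
class LeaseRenewer implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "lock-lease-renewer");
                t.setDaemon(true); // do not keep the JVM alive just for renewals
                return t;
            });
    private volatile boolean leaseLost = false;

    LeaseRenewer(BooleanSupplier renew, long ttlMs) {
        long intervalMs = Math.max(1, ttlMs / 3);
        scheduler.scheduleAtFixedRate(() -> {
            boolean renewed;
            try {
                renewed = renew.getAsBoolean();
            } catch (RuntimeException e) {
                renewed = false; // conservative: treat an error as a lost lease
            }
            if (!renewed) {
                leaseLost = true;
                scheduler.shutdown(); // cancels all further periodic runs
            }
        }, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
    }

    /** True once a renewal has failed; the caller should stop its critical work. */
    boolean isLeaseLost() {
        return leaseLost;
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

The holder would create the renewer right after acquiring the lock and close it just before releasing. The critical section should check isLeaseLost() before any irreversible step, since a failed renewal means another client may already hold the lock.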
Drawbacks:
  • Data stored in memory risks loss on crash—even with replication, async replication may lose recent writes before sync completes.
  • Not ideal when strict consistency is required.

Distributed Lock via ZooKeeper

ZooKeeper implements ordering semantics via ephemeral sequential nodes. To lock on path /resource/lock:
  1. Create /resource/lock/lock- with Ephemeral + Sequential flags.
  2. Retrieve all children of /resource/lock; identify the smallest node.
  3. Check if this client created the smallest node—if yes, lock acquired.
  4. Otherwise, register a watcher on the immediate predecessor node. The watcher fires only when that predecessor is deleted, after which the client repeats from step 2.
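Steps 2 through 4 reduce to a small decision over the list of child names, sketched below without any ZooKeeper client calls. The class SequentialLockRecipe and the method names are hypothetical. The sketch relies on ZooKeeper appending a zero-padded 10-digit counter to sequential node names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class SequentialLockRecipe {
    /** Parses the counter ZooKeeper appends, e.g. "lock-0000000007" gives 7. */
    static long sequenceOf(String nodeName) {
        return Long.parseLong(nodeName.substring(nodeName.lastIndexOf('-') + 1));
    }

    /**
     * Returns empty if ownNode has the lowest sequence number (lock acquired),
     * otherwise the name of the immediate predecessor to watch.
     */
    static Optional<String> nodeToWatch(List<String> children, String ownNode) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingLong(SequentialLockRecipe::sequenceOf));
        int index = sorted.indexOf(ownNode);
        if (index < 0) {
            // Our ephemeral node is gone, most likely because the session expired.
            throw new IllegalStateException("own node not found: " + ownNode);
        }
        return index == 0 ? Optional.empty() : Optional.of(sorted.get(index - 1));
    }
}
```

Watching only the immediate predecessor, rather than the lowest node or the parent, means each deletion wakes exactly one waiter instead of all of them.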
Curator client simplifies usage:
    CuratorFramework client = CuratorFrameworkFactory.builder()
        .connectString("zk-host:2181")
        .retryPolicy(new ExponentialBackoffRetry(1000, 3))
        .build();
    client.start();

    InterProcessMutex mutex = new InterProcessMutex(client, "/locks/resource1");
    mutex.acquire();
    try {
        // critical section
    } finally {
        mutex.release();
    }
Advantages:
  • Ephemeral nodes are removed automatically when the owning client's session ends, whether because the client crashed or because a network partition outlasted the session timeout.
  • Strong consistency via ZAB consensus protocol ensures no split-brain.
  • Watchers provide real-time lock-granted notifications (no polling needed).
Limitations:
  • Network timeouts can misclassify transient failures as permanent disconnects, potentially leading to premature lock loss.
  • Higher latency than Redis due to disk-based persistence and consensus overhead.

Comparative Overview

Dimension         | DB                                    | Redis                        | ZooKeeper
------------------|---------------------------------------|------------------------------|-----------------------------------
Performance       | Low (~1k ops/s)                       | High (memory-only)           | Moderate (disk + consensus)
Deadlock Safety   | TTL-based cleanup (application layer) | TTL + renewal                | Ephemeral nodes auto-cleanup
High Availability | Sync/multi-master + VIP               | Cluster or Redis Sentinel    | ZAB quorum-based clusters
Consistency       | Strong (ACID)                         | Eventual (async replication) | Strong (ZAB ensures quorum writes)
Lock Notification | Poll-based                            | Poll-based                   | Watcher-driven

Practical Considerations

Distributed locks cannot match the safety guarantees of local primitives (such as Java's synchronized) under failure, because they depend on external components, and on the network in particular. Best practices include:
  • Deploy clients and lock service within the same low-latency network zone.
  • Ensure monitoring and alerting detect lock acquisition failures.
  • Choose wisely: Redis suits high-throughput scenarios where near-real-time availability matters more than absolute correctness; ZooKeeper suits high-safety use cases like payment workflows.

Tags: DistributedSystems LockMechanism Redis zookeeper RelationalDB

Posted on May 26 at 17:50