Distributed Locking Mechanisms: DB, Redis, and ZooKeeper Comparisons

Understanding Distributed Locking Requirements

In distributed systems, concurrent access to shared resources must be controlled—similar to how locks prevent race conditions in single-process multi-threaded environments. A distributed lock ensures safe coordination across multiple nodes and clients. Ideally, such a mechanism should provide:

Mutual Exclusion: Only one client may hold the lock for a given resource at any time.
Tolerance to Failures: The lock service should remain available even if some nodes fail (an emphasis on AP in CAP terms).
No Deadlock: Locks must eventually be released—even if a client crashes or loses connectivity during use.

Additional desirable features include reentrancy, blocking acquisition (waiting clients are parked and woken when the lock frees up, much as Java's AbstractQueuedSynchronizer, AQS, does for threads), and high performance.

Distributed Lock via Relational DBMS

A simple table-based locking strategy can be implemented by leveraging uniqueness constraints. Example schema:


CREATE TABLE system_lock (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  lock_key VARCHAR(128) NOT NULL,
  owner_id VARCHAR(64) NOT NULL,
  created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (id),
  UNIQUE KEY uk_lock_key (lock_key)
) ENGINE=InnoDB;

Each unique lock_key represents a protected resource. Inserting a record attempts to acquire the lock; deletion attempts to release it. Core logic:


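import org.springframework.dao.DuplicateKeyException;
import org.springframework.jdbc.core.JdbcTemplate;

// Assumes an injected Spring JdbcTemplate field named jdbcTemplate.

// Tries once to take the lock; the unique index rejects a second holder.
boolean acquireLock(String key, String clientId) {
  try {
    jdbcTemplate.update(
      "INSERT INTO system_lock(lock_key, owner_id) VALUES (?, ?)",
      key, clientId
    );
    return true;
  } catch (DuplicateKeyException e) {
    // Another client already holds the lock for this key.
    return false;
  }
}

// Deletes the row only if this client owns it, so one client cannot release another's lock.
boolean releaseLock(String key, String clientId) {
  int affected = jdbcTemplate.update(
    "DELETE FROM system_lock WHERE lock_key = ? AND owner_id = ?",
    key, clientId
  );
  return affected > 0;
}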

To simulate blocking, clients retry periodically in a loop with exponential backoff.
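The retry loop can be sketched as follows. This is a minimal illustration, not part of the original implementation: the class name `BackoffAcquirer`, its parameters, and the added jitter are assumptions, and `tryAcquire` stands in for a single non-blocking attempt such as `acquireLock(key, clientId)`.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.BooleanSupplier;

final class BackoffAcquirer {
  /**
   * Retries tryAcquire until it succeeds or timeoutMs elapses.
   * The delay doubles after each failed attempt, capped at maxDelayMs, and is
   * randomized so that waiting clients do not retry in lockstep.
   * Assumes initialDelayMs >= 1. Returns false on timeout or interruption.
   */
  static boolean acquireWithBackoff(BooleanSupplier tryAcquire, long initialDelayMs,
                                    long maxDelayMs, long timeoutMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    long delay = initialDelayMs;
    while (true) {
      if (tryAcquire.getAsBoolean()) {
        return true;
      }
      long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
      if (remainingMs <= 0) {
        return false;
      }
      // Sleep a random duration in [delay/2, delay], never past the deadline.
      long jittered = ThreadLocalRandom.current().nextLong(delay / 2, delay + 1);
      try {
        Thread.sleep(Math.min(jittered, remainingMs));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
      delay = Math.min(delay * 2, maxDelayMs);
    }
  }
}
```

A caller might write `BackoffAcquirer.acquireWithBackoff(() -> acquireLock(key, clientId), 50, 2000, 10000)`. Every failed attempt still costs one database round trip, so the cap on the delay is a trade-off between acquisition latency and database load.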

Caveats:

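Deadlock prevention: Requires periodically cleaning stale entries (e.g., delete records older than 2 minutes). A holder whose work runs longer than that threshold loses its lock while still using it, so the threshold must exceed the longest expected critical section.
Throughput limitation: Typically restricted to roughly 1,000 ops/sec, since every acquire and release is a disk-backed write against a single database. The exact figure depends on hardware and configuration.
High availability: Achieved via primary-replica replication and failover using virtual IPs.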
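To make the stale-entry cleanup concrete, here is a small in-memory model of the lock table. It is only a sketch under assumptions: `LockTableModel` is a hypothetical stand-in for the `system_lock` table, not a database client, and the injected clock exists so the behavior can be exercised without waiting two minutes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** In-memory stand-in for the system_lock table (hypothetical, for illustration). */
final class LockTableModel {
  private static final class Row {
    final String ownerId;
    final long createdAtMs;
    Row(String ownerId, long createdAtMs) {
      this.ownerId = ownerId;
      this.createdAtMs = createdAtMs;
    }
  }

  private final Map<String, Row> rows = new ConcurrentHashMap<>();
  private final LongSupplier clockMs;

  LockTableModel(LongSupplier clockMs) {
    this.clockMs = clockMs;
  }

  /** Models the INSERT: succeeds only if no row exists for the key (unique constraint). */
  boolean acquire(String key, String ownerId) {
    return rows.putIfAbsent(key, new Row(ownerId, clockMs.getAsLong())) == null;
  }

  /** Models DELETE ... WHERE lock_key = ? AND owner_id = ?. */
  boolean release(String key, String ownerId) {
    Row row = rows.get(key);
    return row != null && row.ownerId.equals(ownerId) && rows.remove(key, row);
  }

  /**
   * Models a cleanup such as
   * DELETE FROM system_lock WHERE created_at < NOW() - INTERVAL 2 MINUTE.
   * Returns the number of rows removed.
   */
  int purgeOlderThan(long maxAgeMs) {
    long cutoff = clockMs.getAsLong() - maxAgeMs;
    int removed = 0;
    for (Map.Entry<String, Row> e : rows.entrySet()) {
      if (e.getValue().createdAtMs < cutoff && rows.remove(e.getKey(), e.getValue())) {
        removed++;
      }
    }
    return removed;
  }
}
```

Note what the model makes visible: once the cleanup removes an old row, another client can take the lock even if the original holder is still working. The owner check in `release` at least prevents the original holder from deleting the new holder's row afterwards.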

Distributed Lock via Redis

Redis executes each command atomically, so a lock can be acquired with a single command. The canonical form:


SET lock-key client-id NX PX 30000

- NX: Set only if the key is absent.
- PX: Set expiry in milliseconds.

Acquire and release logic (Lua scripts for atomicity):


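// Assumes a Jedis-style client named redis whose eval(script, keys, args) returns Object.
String acquire(String key, String id, long ttlMs) {
  // SET with NX and PX is a single atomic command; it returns OK or nil.
  Object result = redis.eval(
    "return redis.call('set', KEYS[1], ARGV[1], 'NX', 'PX', ARGV[2])",
    Arrays.asList(key), Arrays.asList(id, String.valueOf(ttlMs)));
  return "OK".equals(result) ? id : null;
}

boolean release(String key, String id) {
  // Compare and delete in one script. A separate GET followed by DEL could
  // delete another client's lock if ours expired between the two calls.
  Object deleted = redis.eval(
    "if redis.call('get', KEYS[1]) == ARGV[1] then return redis.call('del', KEYS[1]) else return 0 end",
    Arrays.asList(key), Arrays.asList(id));
  return Long.valueOf(1L).equals(deleted);
}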

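Stale locks and lease renewal: The TTL guarantees that a crashed client's lock eventually expires, but a healthy client whose work outlasts the TTL would lose the lock mid-operation. Use lease renewal to avoid this. For example, schedule background renewals at intervals equal to one-third of the TTL to keep the lock alive for as long as the client still holds it.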
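A renewal task along these lines could look like the sketch below. The class `LeaseRenewer` is hypothetical, and `renew` stands in for whatever call extends the expiry and reports whether the client still owns the key. With Redis that would plausibly be a script such as `if redis.call('get', KEYS[1]) == ARGV[1] then return redis.call('pexpire', KEYS[1], ARGV[2]) else return 0 end`.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Renews a lease every ttl/3 until closed or until a renewal reports the lock as lost. */
final class LeaseRenewer implements AutoCloseable {
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "lease-renewer");
        t.setDaemon(true); // do not keep the JVM alive just for renewals
        return t;
      });
  private final ScheduledFuture<?> task;
  private volatile boolean lost = false;

  LeaseRenewer(BooleanSupplier renew, long ttlMs) {
    long periodMs = Math.max(1, ttlMs / 3);
    task = scheduler.scheduleAtFixedRate(() -> {
      // Once a renewal fails the lock belongs to nobody or to someone else,
      // so stop calling renew and only remember the loss.
      if (!lost && !renew.getAsBoolean()) {
        lost = true;
      }
    }, periodMs, periodMs, TimeUnit.MILLISECONDS);
  }

  /** True once a renewal attempt has reported that the lock is no longer ours. */
  boolean isLost() {
    return lost;
  }

  @Override
  public void close() {
    task.cancel(false);
    scheduler.shutdownNow();
  }
}
```

The critical section should check `isLost()` before any irreversible step and close the renewer before releasing the lock. Renewal narrows the window in which a live client loses its lock, but a long pause of the client process can still outlast the TTL.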

Drawbacks:

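Data is held in memory and can be lost on a crash. Even with replication, Redis replicates asynchronously, so a failover can promote a replica that never received the most recent lock write, and a second client can then acquire the same lock.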
Not ideal when strict consistency is required.

Distributed Lock via ZooKeeper

ZooKeeper implements ordering semantics via ephemeral sequential nodes. To lock on path /resource/lock:

Create /resource/lock/lock- with Ephemeral + Sequential flags.
Retrieve all children of /resource/lock; identify the smallest node.
Check if this client created the smallest node—if yes, lock acquired.
Otherwise, register a watcher on the immediate predecessor node. The watcher fires only when that predecessor is deleted, at which point the client re-reads the children and repeats the check.
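The decision in the last three steps (am I the smallest node, and if not, whom do I watch?) can be sketched as a pure function. The class and method names here are illustrative assumptions; the only ZooKeeper behavior relied on is that sequential nodes get a zero-padded numeric suffix such as `lock-0000000007`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class SequentialLockOrder {
  /** Parses the numeric suffix of a sequential node, e.g. "lock-0000000007" gives 7. */
  static long sequenceOf(String nodeName) {
    return Long.parseLong(nodeName.substring(nodeName.lastIndexOf('-') + 1));
  }

  /**
   * Returns the node this client should watch, or empty if the client's own
   * node has the smallest sequence number and therefore holds the lock.
   */
  static Optional<String> nodeToWatch(List<String> children, String ownNode) {
    List<String> sorted = new ArrayList<>(children);
    sorted.sort(Comparator.comparingLong(SequentialLockOrder::sequenceOf));
    int index = sorted.indexOf(ownNode);
    if (index < 0) {
      throw new IllegalArgumentException("own node not among children: " + ownNode);
    }
    // Watch only the immediate predecessor, not the lock holder.
    return index == 0 ? Optional.empty() : Optional.of(sorted.get(index - 1));
  }
}
```

Watching only the immediate predecessor means that a release wakes exactly one waiter. If every waiter watched the smallest node instead, each release would notify all of them at once.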

Curator client simplifies usage:


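import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

// Retry policy: base sleep of 1000 ms, at most 3 retries.
CuratorFramework client = CuratorFrameworkFactory.builder()
  .connectString("zk-host:2181")
  .retryPolicy(new ExponentialBackoffRetry(1000, 3))
  .build();
client.start();

// The lock path is the parent of the ephemeral sequential nodes.
InterProcessMutex mutex = new InterProcessMutex(client, "/locks/resource1");
mutex.acquire(); // blocks until the lock is held; declared to throw Exception
try {
  // critical section
} finally {
  mutex.release();
}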

Advantages:

Ephemeral nodes are deleted automatically when the owning client's session ends, so a crashed or partitioned client releases its lock once its session times out.
Strong consistency via the ZAB consensus protocol ensures no split-brain.
Watchers provide real-time lock-granted notifications (no polling needed).

Limitations:

A session timeout cannot distinguish a transient network failure from a dead client, so a slow but healthy client can be treated as disconnected and lose its lock prematurely.
Higher latency than Redis due to disk-based persistence and consensus overhead.

Comparative Overview

Dimension	DB	Redis	ZooKeeper
Performance	Low (~1k ops/s)	High (memory-only)	Moderate (disk + consensus)
Deadlock Safety	TTL-based cleanup (application layer)	TTL + renewal	Ephemeral nodes auto-cleanup
High Availability	Primary-replica replication + VIP failover	Cluster or Redis Sentinel	ZAB quorum-based clusters
Consistency	Strong (ACID)	Eventual (async replication)	Strong (ZAB ensures quorum writes)
Lock Notification	Poll-based	Poll-based	Watcher-driven

Practical Considerations

Distributed locks cannot match the safety guarantees of local primitives such as Java's synchronized, because they depend on external components that can fail independently, especially the network. Best practices include:

Deploy clients and lock service within the same low-latency network zone.
Set up monitoring and alerting that detect lock acquisition failures.
Choose wisely: Redis suits high-throughput scenarios where low latency and availability matter more than absolute correctness; ZooKeeper suits high-safety use cases like payment workflows.

Tags: DistributedSystems LockMechanism Redis zookeeper RelationalDB

Posted on May 26 at 10:50

Thành phố Cuồng loạn