Listening to the Redis replication stream

Posted on December 23, 2018 by wjwh

Redis is an in-memory data structure store with a variety of uses. You can use it as a database, as a cache, as a job queue, as a message broker and for many other purposes. It is useful whenever many different processes, which may or may not be on the same physical machine, need to communicate in a safe, low-overhead way. However, it does make a number of tradeoffs to maintain a simple and fast interface. For speed, it stores its data entirely in memory (though with optional persistence to disk for disaster recovery). Mutations to the data set happen in a single-threaded fashion. This makes complicated locking logic unnecessary and makes reasoning about the behaviour straightforward, but it does cause long-running commands to block all other clients from executing their own commands.

One of the available features is replication, where you can have multiple replicas of a single “master” server. Any changes to the data set on the master instance will be transmitted to each of the replicas. This way, you can run heavier queries against one of the replicas and not worry about blocking the master instance. However, it is not easy to hook up other types of systems to this replication stream, which could be desirable if you want to stream Redis data into (for example) a data warehouse for analytics or into log files for auditing purposes. In this post, we will take a short look at how to hook into the replication stream in a simple way. In an upcoming post, we will build on this foundation to do it the “proper” way.

How normal Redis replication works

Any replication client can connect in one of two ways: with full or with partial synchronisation. Partial synchronisation is a subset of full synchronisation, so we’ll look at the full version first.

When a Redis replica connects, it will send a SYNC command to the server. The server responds by first sending over the full .rdb file representing the current dataset, and from then on it will pass on any commands that mutate the state of the server.
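As a rough illustration of that first step, the sketch below splits the bytes received after SYNC into the .rdb payload and whatever follows it. It assumes the length-prefixed form (`$<length>` followed by the raw bytes, with no trailing line break after the payload) and that the whole reply is already in memory; the function name `split_rdb_payload` is made up for this example.

```python
from typing import Tuple


def split_rdb_payload(data: bytes) -> Tuple[bytes, bytes]:
    """Split a SYNC reply into (rdb_bytes, remaining_replication_stream)."""
    # Assumption: the master may send bare newlines before the dump is ready,
    # so any leading "\n" bytes are skipped.
    data = data.lstrip(b"\n")
    if not data.startswith(b"$"):
        raise ValueError("expected a bulk header starting with '$'")
    header_end = data.index(b"\r\n")
    length = int(data[1:header_end])  # e.g. b"178" -> 178
    start = header_end + 2
    if len(data) < start + length:
        raise ValueError("payload is not complete yet")
    # Everything after the payload is the ordinary command stream.
    return data[start:start + length], data[start + length:]


rdb, rest = split_rdb_payload(b"$5\r\nREDIS*1\r\n$4\r\nPING\r\n")
```

A real client would read from a socket incrementally rather than from one complete buffer, but the framing is the same.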

Partial synchronisation works in a similar way, but is more efficient because it does not need to transfer the whole data set. Obviously, this can only take place if the replica has previously completed a full synchronisation, because otherwise it would miss part of the data set. It can be used if the client has only lost its connection for a few seconds, for example. Partial synchronisation uses the PSYNC command in conjunction with the global replication offset that every Redis server maintains internally. The master server retains a certain amount of the replication stream in memory, as specified by the repl-backlog-size configuration variable. If a replica wants to re-synchronise and its replication offset is still ‘covered’ by the part of the replication stream that the master server has retained, it will just receive the part of the stream that it has missed and no transfer of the .rdb file is required. This is much more efficient than sending the entire data set over each time, especially if the data set is very large.
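The "covered" check can be pictured with a small, deliberately simplified model. This is not how the Redis server implements it. It only captures the idea that the replica must be following the same replication history and must not have missed more bytes than the backlog holds. All names here (`can_partial_resync` and its parameters) are hypothetical.

```python
def can_partial_resync(
    master_replid: str,
    master_offset: int,
    backlog_size: int,
    replica_replid: str,
    replica_offset: int,
) -> bool:
    """Simplified model: is the replica's offset still covered by the backlog?"""
    if replica_replid != master_replid:
        return False  # different replication history, a full sync is needed
    if replica_offset > master_offset:
        return False  # the replica claims data the master never produced
    missed_bytes = master_offset - replica_offset
    # Assumption: the backlog holds exactly the last `backlog_size` bytes.
    return missed_bytes <= backlog_size


# A replica that missed 500 bytes against a 1000-byte backlog can catch up.
print(can_partial_resync("abc", 10_500, 1_000, "abc", 10_000))
```

With a larger backlog, a replica can stay disconnected for longer (or the master can absorb more writes in the meantime) before a full synchronisation becomes unavoidable.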

After the dataset has been transmitted, the server will send out every command that alters the state of the server over the replication connection, using the usual Redis protocol. In addition, every ten seconds by default (configurable with the repl-ping-replica-period variable), the server will also send out a PING command. This is done so that the client can always tell if there are connection problems, even if the data set does not change very often. The client, in turn, sends a REPLCONF ACK message every second, indicating how much of the stream it has processed so far. The master can use this to determine how many of its replicas are “up to date”.
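To make the wire format concrete, here is a minimal sketch of encoding a command as an array of bulk strings, which is the form used for both the master's PING and a replica's REPLCONF ACK. The helper name `encode_command` is an assumption for this example, and the offset 42 is an arbitrary illustrative value.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a Redis protocol array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]  # array header: number of arguments
    for arg in args:
        raw = arg.encode("utf-8")
        # Each argument is a bulk string: "$<length>", then the bytes themselves.
        parts.append(b"$%d\r\n%s\r\n" % (len(raw), raw))
    return b"".join(parts)


print(encode_command("PING"))
print(encode_command("REPLCONF", "ACK", "42"))
```

A client that wanted to appear up to date would write the second message to the socket roughly once per second, with the number of bytes of the replication stream it has processed so far.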

Pretending to be a replication client

As mentioned, the replication stream can be obtained by sending a SYNC command to any Redis server. To gain the maximum amount of clarity about what is being transmitted, we will use the netcat program to do our communications with the Redis server, rather than the usual Redis command line client.

Below is the annotated output of connecting to an otherwise empty Redis server and sending a SYNC command. From another connection we then send SETEX a 3 1 to the master instance.

wjwh:~$ nc localhost 6379
SYNC
$178        // `.rdb` dump of 178 bytes starts here
REDIS0009�	redis-ver4.9.105�
redis-bits�@�ctime³�\used-mem��I�repl-stream-db��
repl-id(001eaa53cce137820e653344dd254522bfe0995b�
                    repl-offset���
                                  aof-preamble���#�U ^*1
$4          // PING command from server
PING
*2          // SELECT database number zero
$6
SELECT
$1
0
*4          // SETEX a 3 1
$5
SETEX
$1
a
$1
3
$1
1
*2          // Three seconds later the key expires. Expiry of keys on the
$3          // master triggers a synthetic DEL command on the replication
DEL         // stream
$1
a
*1          // PING command from server (10 seconds after the previous PING)
$4
PING

In this short snippet we can see several things:

Everything is encoded using the Redis protocol. This makes it both easy to parse and easy to read for humans (though it does take up quite a bit of vertical space).
The .rdb file dump of an empty Redis instance is still 178 bytes long, containing, among other things, the Redis version and the replication ID of the instance.
Expiration of keys on the master instance is modeled as a DEL command on the replication stream.
The PING commands from the master come every repl-ping-replica-period seconds, regardless of any commands that have been sent in between.
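Because the stream after the .rdb dump consists only of arrays of bulk strings, turning it back into commands takes little code. The sketch below assumes a complete buffer that contains nothing but such arrays; `parse_commands` is a hypothetical helper, not part of any Redis client library.

```python
from typing import List


def parse_commands(stream: bytes) -> List[List[str]]:
    """Parse a buffer of Redis protocol arrays into lists of arguments."""
    commands = []
    pos = 0
    while pos < len(stream):
        line_end = stream.index(b"\r\n", pos)
        if stream[pos:pos + 1] != b"*":
            raise ValueError("expected an array header starting with '*'")
        arg_count = int(stream[pos + 1:line_end])
        pos = line_end + 2
        args = []
        for _ in range(arg_count):
            line_end = stream.index(b"\r\n", pos)
            if stream[pos:pos + 1] != b"$":
                raise ValueError("expected a bulk string header starting with '$'")
            length = int(stream[pos + 1:line_end])
            start = line_end + 2
            args.append(stream[start:start + length].decode("utf-8", errors="replace"))
            pos = start + length + 2  # skip the payload and its trailing CRLF
        commands.append(args)
    return commands


# The commands from the transcript above, as they appear on the wire.
wire = (
    b"*1\r\n$4\r\nPING\r\n"
    b"*2\r\n$6\r\nSELECT\r\n$1\r\n0\r\n"
    b"*4\r\n$5\r\nSETEX\r\n$1\r\na\r\n$1\r\n3\r\n$1\r\n1\r\n"
    b"*2\r\n$3\r\nDEL\r\n$1\r\na\r\n"
)
for command in parse_commands(wire):
    print(" ".join(command))
```

Arguments are decoded as UTF-8 here for readability; since keys and values can be arbitrary binary data, a more careful client would keep them as bytes.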

Looking at the master a bit later, we can see the following snippet in the output of the INFO command:

# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=0,state=online,offset=0,lag=872

While the master (correctly) thinks it has one connected replica, it incorrectly thinks that the replica is lagging far behind: the reported offset is still 0 and the lag keeps growing. We have seen our little netcat replica print out all the commands we have sent to the master almost instantly, so from the client side there is very little lag. The discrepancy arises because our client never sends any REPLCONF ACK messages back to the master.
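If you want to keep an eye on these numbers from a script, the replica line is easy to take apart. The following sketch parses one line in the format shown above; `parse_replica_line` is a made-up helper, and it assumes the fields are comma-separated `key=value` pairs.

```python
def parse_replica_line(line: str) -> dict:
    """Parse a line like 'slave0:ip=...,port=...,state=...,offset=...,lag=...'."""
    name, _, fields = line.partition(":")
    info = {"name": name}
    for field in fields.split(","):
        key, _, value = field.partition("=")
        # Purely numeric values (port, offset, lag) become integers.
        info[key] = int(value) if value.isdigit() else value
    return info


replica = parse_replica_line("slave0:ip=127.0.0.1,port=0,state=online,offset=0,lag=872")
print(replica["offset"], replica["lag"])
```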

Aside: other methods to check what’s going on

You can also listen in on the replication stream by using redis-cli --slave, which will discard the initial bulk transfer for you. It will also automatically parse the incoming Redis commands into single strings like SETEX a 3 1, rather than print out the raw Redis protocol stream. It will not send anything back to the server though, so from the master’s point of view this replica will lag indefinitely. For educational purposes I have decided to use netcat in the article, in order to better look at the data that is actually being sent.

Alternatively, it is possible to get updates for only some keys or some commands by using keyspace notifications. These are published on pubsub channels that you can subscribe to in the same way as normal pubsub channels. The number of options for these events is pretty large, so we won’t go into detail here.

All the methods mentioned so far only allow you to listen in on commands that actually change the dataset, so any read-only commands like GET will not show up. For most use cases this is not a problem, but if you absolutely must see them (for example to measure the read/write balance of your system), the MONITOR command can give you this information. Be careful though, since it can have a big impact on the total throughput of the Redis server.

Conclusion

In this article we had a short look at how to listen in on the replication stream of a Redis server. If we simply send the SYNC command over a TCP socket, the Redis server will gladly consider us a replica and send over any commands that change the state of the dataset. However, unless we send something back, the master will see the lag of our replica as ever increasing. This may or may not be a problem for your use case. The received data is also still in the shape of the Redis protocol, which is unlikely to be native to your application and will need parsing. In a future article we’ll create a simple client to remedy those problems.