Syed Jafer K

It's all about Trade-Offs

Redis: Read-Through Cache

It is application-level caching where the application reads data from the cache. If the data is not present in the cache, the cache itself reads it from the database, updates the cache, and returns the result to the application.

Steps:

  1. The application (FastAPI) checks whether the key is present in Redis.
  2. If the key exists in Redis, its value is returned.
  3. If the key is not present, Redis itself queries the primary database (here MongoDB) for the value.
  4. The result is then set into Redis.
  5. Finally, the value is returned to the application (see the sketch below).
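
To make the flow concrete, here is a tiny, illustrative Python sketch of the same logic (the dict-based cache and db are hypothetical stand-ins for Redis and MongoDB). In the actual implementation below, steps 3-5 run inside Redis via RedisGears rather than in the application:

cache = {}
db = {"1": {"id": "1", "name": "Alice", "description": "demo row"}}

def read_through(key):
    value = cache.get(key)   # steps 1-2: check the cache first
    if value is not None:
        return value
    value = db.get(key)      # step 3: fall back to the primary database
    cache[key] = value       # step 4: populate the cache
    return value             # step 5: return to the caller

print(read_through("1"))  # miss: fetched from db, then cached
print(read_through("1"))  # hit: served from cache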

Implementation

GitHub: https://github.com/syedjaferk/redis-cache

We are going to create a mock todo application. Let's spin up Redis and MongoDB using Docker.

For Redis,

docker run -p 6379:6379 --name redis-container --rm redislabs/redismod:latest

For MongoDB (publishing port 27017 so the scripts below can reach it from the host),

docker run -p 27017:27017 --name mongo_container --rm mongo
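
To confirm both containers are up, a quick sanity check (mongosh ships with recent mongo images; older ones ship the mongo shell instead):

docker exec -it redis-container redis-cli ping
docker exec -it mongo_container mongosh --eval 'db.runCommand({ ping: 1 })'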

To load the data into Mongo, you can either use the script below or restore from the dump file.

from pymongo.mongo_client import MongoClient
from faker import Faker  

fake = Faker()  

# Mongo Db Connection
MONGO_CONN_STR = "mongodb://localhost:27017"
mongo_client = MongoClient(MONGO_CONN_STR)
database = mongo_client['test']
collection = database["todos"]

# Seed fake todo documents with string ids "1" ... "999999"
for itr in range(1, 1000000):
    data = {
        "id": str(itr),
        "name": fake.name(),
        "description": fake.text(100)
    }
    collection.insert_one(data)
    print(data)
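
Inserting a million documents one at a time is slow. If seeding speed matters, batching with PyMongo's insert_many is much faster; a sketch, reusing fake and collection from the script above:

# Batch inserts instead of one insert_one per document
batch = []
for itr in range(1, 1000000):
    batch.append({
        "id": str(itr),
        "name": fake.name(),
        "description": fake.text(100)
    })
    if len(batch) == 1000:
        collection.insert_many(batch)
        batch = []
if batch:
    collection.insert_many(batch)  # flush the remainder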

or

docker exec -i mongo_container sh -c 'mongorestore --archive' < 2l_data.dump
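
Either way, you can verify the data landed (assuming the data lives in the test database's todos collection, matching the script above):

docker exec -it mongo_container mongosh --quiet --eval 'db.getSiblingDB("test").todos.countDocuments({})'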

The Redis Docker image used here (redismod) ships with RedisGears. With it, we can register a Python script (or a module) that is triggered when certain commands are executed.

For example, when GET is executed, it can call a particular Python function.

As of now, Python and Java plugins are supported: https://oss.redis.com/redisgears/configuration.html#plugin

Below is the Python script (read_script.py) triggered when JSON.GET is executed. Note that execute and GB are built-ins available inside the RedisGears runtime, and override_reply replaces the reply the client receives:

from pymongo.mongo_client import MongoClient
import json

def find_data(key):
    # Look the record up in the primary database (MongoDB)
    MONGO_CONN_STR = "mongodb://host.docker.internal:27017"
    mongo_client = MongoClient(MONGO_CONN_STR)
    database = mongo_client['test']
    collection = database["todos"]
    db_data = collection.find_one({'id': key}, {"_id": 0})
    return db_data

def read_cache(event):
    key = event['key']
    # Cache hit: the key is already in Redis, return it as-is
    data = execute('JSON.GET', key)
    if data:
        return data
    # Cache miss: fetch from MongoDB, populate Redis, override the reply
    data = find_data(key)
    if data is None:
        return None  # not in the database either; leave the nil reply alone
    json_str = json.dumps(data)
    execute('JSON.SET', key, '.', json_str)
    override_reply(json_str)
    return json_str

# Run read_cache synchronously whenever JSON.GET is executed
GB('KeysReader').map(read_cache).register(commands=['JSON.GET'], mode='sync')

Now that the script is ready, we need to send it to the Redis server with RG.PYEXECUTE (https://oss.redis.com/redisgears/commands.html#rgpyexecute). We can achieve this using Node.

Below is the Node script (script.js) that sends it to the Redis server,

import fs from "fs";
import { createClient } from "redis";

const redisConnectionUrl = "redis://127.0.0.1:6379";
const pythonFilePath = "./read_script.py";

const runReadThroughRecipe = async () => {
    const requirements = ["pymongo==3.12.0"];
    const readThroughCode = fs.readFileSync(pythonFilePath).toString();
    const client = createClient({ url: redisConnectionUrl });
    await client.connect();
    const params = ["RG.PYEXECUTE", readThroughCode,
        "REQUIREMENTS", ...requirements];
    try {
        await client.sendCommand(params);
        console.log("RedisGears ReadThrough set up completed.");
    }
    catch (err) {
        console.error("RedisGears ReadThrough setup failed !");
        console.error(JSON.stringify(err, Object.getOwnPropertyNames(err), 4));
    }
    process.exit();
};
runReadThroughRecipe();

Run it using,

node script.js
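
Before wiring up the app, you can verify the registration and exercise the recipe straight from redis-cli. RG.DUMPREGISTRATIONS lists active RedisGears registrations, and a JSON.GET on a key that exists only in MongoDB should now return the document and cache it (id 1 assumes the seed data above):

docker exec -it redis-container redis-cli RG.DUMPREGISTRATIONS
docker exec -it redis-container redis-cli JSON.GET 1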

Now let’s design the app,

from typing import Optional

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from pymongo.mongo_client import MongoClient
import redis


r = redis.Redis(host='localhost', port=6379, decode_responses=True)
app = FastAPI()

# Mongo Db Connection
MONGO_CONN_STR = "mongodb://localhost:27017"
mongo_client = MongoClient(MONGO_CONN_STR)
database = mongo_client['test']
collection = database["todos"]

class Todo(BaseModel):
    id: str
    name: str
    description: Optional[str] = None

@app.post("/todos/")
def create_todo(todo: Todo):
    collection.insert_one(todo.dict())
    return {} 

@app.get("/todos/{todo_id}", response_model=Todo)
def read_item(todo_id: str):
    res = r.json().get(todo_id)
    if res:
        return res
    raise HTTPException(status_code=404, detail="Item not found")
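
Run the app with uvicorn (assuming the code above is saved as main.py) and hit the endpoint. The first request for a given id goes through to MongoDB; repeats are served straight from Redis:

uvicorn main:app --reload
curl http://localhost:8000/todos/1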

The flow, end to end: the app issues JSON.GET to Redis; on a hit Redis replies directly, and on a miss the registered gears function fetches the document from MongoDB, caches it with JSON.SET, and overrides the reply.

Drawbacks

  1. Initially, every request is a cache miss. If the requested items aren't repeated often enough, the extra hop only adds latency.
  2. What happens if a record is updated in the database but not in Redis? Do we need to write through on every update? (See the sketch after this list.)
  3. In a distributed system, if a node fails, it is replaced by a new, empty node, which increases latency until the cache warms up. (This can be overcome by replicating the cached data.)
  4. All the drawbacks of cache-aside also apply here.
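
For drawback 2, a common mitigation is to invalidate (or overwrite) the cached entry on every write, so the next read goes through to MongoDB again. A minimal sketch, as a hypothetical update endpoint added to the app above:

@app.put("/todos/{todo_id}")
def update_todo(todo_id: str, todo: Todo):
    collection.update_one({"id": todo_id}, {"$set": todo.dict()})
    # Drop the cached copy; the next JSON.GET repopulates it from MongoDB
    r.delete(todo_id)
    return {}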

When to use?

  1. If your application has a read-heavy workload.
  2. If your data is not highly dynamic.