Finding High Memory Keys in Redis Cache with Python

May 07, 2025

Note: This post was generated with the assistance of AI.

Redis is a popular in-memory data structure store used as a database, cache, and message broker. When working with Redis in production, it's important to monitor memory usage, especially when dealing with large datasets. In this post, I'll share a simple Python script to identify the keys consuming the most memory in your Redis instance.

The Problem

As Redis databases grow, certain keys can consume disproportionate amounts of memory. Identifying these "heavy" keys is crucial for:

  • Preventing out-of-memory errors
  • Optimizing cache efficiency
  • Planning capacity requirements
  • Implementing better data structures or expiration policies

Dependencies

  1. Python 3.x
  2. redis-py (the official Python client): pip install redis
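
To confirm the client is installed and can reach your server before running the full scan, a quick connectivity check (localhost and 6379 are the defaults; adjust for your setup):

import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)
print(r.ping())  # prints True if the connection succeeds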

Implementation


from collections import Counter
import redis

# Connect to Redis (StrictRedis is just a legacy alias for Redis in redis-py 3+)
r = redis.Redis(host='localhost', port=6379, decode_responses=True)

# Initialize variables
cursor = 0
key_sizes = Counter()
LIMIT = 500

while True:
    cursor, keys = r.scan(cursor=cursor, count=1000)
    for key in keys:
        # Get memory usage for each key in bytes
        size = r.memory_usage(key)
        if size is not None:  # Only add keys with valid size
            key_sizes[key] = size
    
    if cursor == 0:
        break

# Get the top LIMIT keys by size
largest_keys = key_sizes.most_common(LIMIT)

# Print results
print(f"Top {LIMIT} keys by size:")
print("Size (bytes) | Key")
print("-" * 50)
for key, size in largest_keys:
    print(f"{size:11,d} | {key}")

print(f"\nTotal keys analyzed: {len(key_sizes)}")
        
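
As a side note, redis-py also provides scan_iter(), which manages the SCAN cursor for you. The while loop above could be written more compactly like this (a sketch with the same behavior, reusing the r and key_sizes objects from the script):

# Equivalent loop using scan_iter(), which handles cursor bookkeeping internally
for key in r.scan_iter(count=1000):
    size = r.memory_usage(key)
    if size is not None:
        key_sizes[key] = size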

How It Works

The script performs the following operations:

  1. Connects to a local Redis instance
  2. Uses the SCAN command to iterate through the keyspace incrementally (more production-friendly than KEYS *, which blocks the server while it enumerates every key)
  3. Gets each key's footprint in bytes with the MEMORY USAGE Redis command
  4. Stores the results in a Counter collection, whose most_common() method makes ranking trivial (and makes per-namespace aggregation easy, as shown below)
  5. Displays the top 500 keys by memory consumption
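
Because the sizes live in a Counter keyed by name, the same data can answer a broader question: which namespace is heaviest overall? A small sketch, assuming your keys follow the colon-delimited convention (cache:homepage, session:active_users) seen in the sample output below:

# Aggregate memory usage by key prefix (assumes "namespace:rest" key names)
prefix_sizes = Counter()
for key, size in key_sizes.items():
    prefix = key.split(':', 1)[0] if ':' in key else '(no prefix)'
    prefix_sizes[prefix] += size

for prefix, total in prefix_sizes.most_common(10):
    print(f"{total:12,d} | {prefix}:*")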

Sample Output


Top 500 keys by size:
Size (bytes) | Key
--------------------------------------------------
    1,048,576 | large_hash:user_data
      524,288 | session:active_users
      262,144 | cache:product_catalog
      131,072 | analytics:daily_metrics
       65,536 | queue:pending_jobs
       32,768 | cache:homepage
       16,384 | user:preferences:1001
        8,192 | config:system_settings
        4,096 | lock:inventory_update
        2,048 | counter:daily_visitors

Total keys analyzed: 12,543

Optimizations and Considerations

  • Performance Impact: The MEMORY USAGE command can be resource-intensive on busy Redis servers. Consider running this script during off-peak hours or throttling the loop between calls (see the sketch after this list).
  • Sampling: For very large Redis instances, you may want to analyze a random sample of keys rather than all of them (also shown below).
  • Authentication: Pass credentials to the client if your Redis instance requires them (also shown below).
  • Clustering: SCAN only iterates the node it's sent to, so for Redis Cluster you'll need to repeat the scan on each primary (redis-py's RedisCluster client can handle the node discovery).
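
Here is a minimal sketch combining the first three points: an authenticated connection, probabilistic sampling, a per-key SAMPLES limit, and a short sleep to throttle the loop. The password, sampling rate, and sleep interval are placeholder assumptions to tune for your environment:

import random
import time
from collections import Counter

import redis

# Placeholder credentials -- replace with your own, or omit if auth is disabled
r = redis.Redis(host='localhost', port=6379, password='your-password',
                decode_responses=True)

SAMPLE_RATE = 0.10  # analyze roughly 10% of keys (an assumption; tune as needed)
key_sizes = Counter()

for key in r.scan_iter(count=1000):
    if random.random() > SAMPLE_RATE:
        continue  # skip keys outside the sample
    # SAMPLES caps how many nested elements MEMORY USAGE inspects
    # (the Redis default is 5; 0 samples everything)
    size = r.memory_usage(key, samples=5)
    if size is not None:
        key_sizes[key] = size
    time.sleep(0.001)  # brief pause to ease pressure on a busy server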

Taking Action

Once you've identified the largest keys, you can:

  • Set appropriate TTL (Time To Live) values (see the sketch after this list)
  • Consider using more efficient data structures
  • Implement key eviction policies
  • Shard large hashes or lists
  • Move infrequently accessed data to disk-based storage
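
For the first point, one concrete follow-up is to flag the largest keys that never expire. A sketch building on r and largest_keys from the script above (the 24-hour TTL is an arbitrary example, not a recommendation):

# Flag the largest keys with no expiration (a TTL of -1 means the key never expires)
for key, size in largest_keys:
    if r.ttl(key) == -1:
        print(f"No TTL set on {key} ({size:,} bytes)")
        # Uncomment to apply one -- 86400 seconds (24 hours) is just an example:
        # r.expire(key, 86400)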

Conclusion

Monitoring and managing memory usage in Redis is essential for maintaining a healthy and efficient cache system. This simple script provides a starting point for identifying memory-intensive keys that might need optimization.

#Python #Redis #Cache #Performance #Memory #AI-Generated