Finding High Memory Keys in Redis Cache with Python
Note: This post was generated with the assistance of AI.
Redis is a popular in-memory data structure store used as a database, cache, and message broker. When working with Redis in production, it's important to monitor memory usage, especially when dealing with large datasets. In this post, I'll share a simple Python script to identify the keys consuming the most memory in your Redis instance.
The Problem
As Redis databases grow, certain keys can consume disproportionate amounts of memory. Identifying these "heavy" keys is crucial for:
- Preventing out-of-memory errors
- Optimizing cache efficiency
- Planning capacity requirements
- Implementing better data structures or expiration policies
Dependencies
- Python 3.x
- redis-py:
pip install redis
Implementation
from collections import Counter

import redis

# Connect to Redis
r = redis.Redis(host='localhost', port=6379, decode_responses=True)

# Initialize variables
cursor = 0
key_sizes = Counter()
LIMIT = 500

while True:
    cursor, keys = r.scan(cursor=cursor, count=1000)
    for key in keys:
        # Get memory usage for each key in bytes
        size = r.memory_usage(key)
        if size is not None:  # Only count keys with a valid size
            key_sizes[key] = size
    if cursor == 0:
        break

# Get the top 500 keys by size
largest_keys = key_sizes.most_common(LIMIT)

# Print results
print(f"Top {LIMIT} keys by size:")
print("Size (bytes) | Key")
print("-" * 50)
for key, size in largest_keys:
    print(f"{size:11,d} | {key}")

print(f"\nTotal keys analyzed: {len(key_sizes):,}")
How It Works
The script performs the following operations:
- Connects to a local Redis instance
- Uses the SCAN command to iterate through all keys (more production-friendly than KEYS *, which blocks the server while it returns every key at once)
- For each key, gets its memory usage via the MEMORY USAGE Redis command
- Stores the results in a Counter collection
- Displays the top 500 keys by memory consumption
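Incidentally, redis-py also provides a scan_iter helper that hides the cursor bookkeeping. As a minimal sketch, the collection loop above could be written as:

from collections import Counter

import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

key_sizes = Counter()
# scan_iter manages the SCAN cursor internally, so no manual while loop is needed
for key in r.scan_iter(count=1000):
    size = r.memory_usage(key)
    if size is not None:
        key_sizes[key] = size

Both versions do the same work; the explicit cursor loop just makes the SCAN mechanics visible.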
Sample Output
Top 500 keys by size:
Size (bytes) | Key
--------------------------------------------------
1,048,576 | large_hash:user_data
524,288 | session:active_users
262,144 | cache:product_catalog
131,072 | analytics:daily_metrics
65,536 | queue:pending_jobs
32,768 | cache:homepage
16,384 | user:preferences:1001
8,192 | config:system_settings
4,096 | lock:inventory_update
2,048 | counter:daily_visitors
Total keys analyzed: 12,543
Optimizations and Considerations
- Performance Impact: The MEMORY USAGE command can be resource-intensive on busy Redis servers. Consider running this script during off-peak hours.
- Sampling: For very large Redis instances, you might want to sample keys rather than analyzing all of them (see the sketch after this list).
- Authentication: Add authentication parameters if your Redis instance requires it.
- Clustering: For Redis clusters, you'll need to modify the script to connect to each node.
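As a rough sketch of the sampling idea, redis-py exposes the RANDOMKEY command, and authentication is just a matter of passing a password when connecting (the credentials and sample size below are placeholders):

from collections import Counter

import redis

# Placeholder credentials; omit password if your server doesn't require AUTH
r = redis.Redis(host='localhost', port=6379, password='your-password',
                decode_responses=True)

SAMPLES = 10_000  # inspect this many random keys instead of scanning everything
key_sizes = Counter()
for _ in range(SAMPLES):
    key = r.randomkey()  # RANDOMKEY returns a random key, or None on an empty DB
    if key is None:
        break
    size = r.memory_usage(key)
    if size is not None:
        key_sizes[key] = size

print(key_sizes.most_common(10))

Duplicate picks are harmless here, since assigning to the Counter simply overwrites that key's size.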
Taking Action
Once you've identified the largest keys, you can:
- Set appropriate TTL (Time To Live) values (see the sketch after this list)
- Consider using more efficient data structures
- Implement key eviction policies
- Shard large hashes or lists
- Move infrequently accessed data to disk-based storage
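For instance, here is a minimal sketch of applying a TTL with redis-py's expire command (the key name and the 24-hour TTL are placeholder values):

import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

# Expire the placeholder key 24 hours from now; EXPIRE returns True if the key exists
r.expire('cache:homepage', 60 * 60 * 24)

# PERSIST removes the TTL again if you change your mind
# r.persist('cache:homepage')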
Conclusion
Monitoring and managing memory usage in Redis is essential for maintaining a healthy and efficient cache system. This simple script provides a starting point for identifying memory-intensive keys that might need optimization.