Paoding (Data Analysis)
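1 Project Overview
1.1 Background and Goals
Perform data analysis on the Redis service
Big key analysis
What harm do big keys cause?
Redis blocking: because Redis executes commands on a single thread, a slow operation on a big key blocks every request queued behind it.
Uneven memory usage: in a Redis cluster, big keys leave memory unevenly distributed across nodes.
Possible blocking on expiry: if a big key has an expiration time, it is deleted once it expires. Unless the asynchronous expiry deletion introduced in Redis 4.0 is in use, this deletion can block Redis, and it does not show up in the slow log (because the deletion happens inside Redis's internal event loop rather than as a client command).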
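As a minimal sketch of how one might check whether asynchronous expiry deletion is switched on, the helper below reads the `lazyfree-lazy-expire` setting (available since Redis 4.0) through `CONFIG GET`. The function name `lazy_expire_enabled` is hypothetical, and `client` is assumed to be any object with a redis-py style `config_get` method that returns a dict.

```python
def lazy_expire_enabled(client) -> bool:
    """Return True if expired keys are freed in a background thread.

    Assumption: `client` exposes a redis-py style `config_get(pattern)`
    that returns a dict such as {"lazyfree-lazy-expire": "yes"}.
    """
    settings = client.config_get("lazyfree-lazy-expire")
    # An empty reply (setting unknown to the server) is treated as "no".
    value = settings.get("lazyfree-lazy-expire", "no")
    # Some clients hand back bytes values unless response decoding is enabled.
    if isinstance(value, bytes):
        value = value.decode()
    return value.lower() == "yes"
```

With redis-py this could be called as `lazy_expire_enabled(redis.Redis(...))`. On servers that do not know the setting, `CONFIG GET` returns nothing and the helper reports False.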
Hot key analysis
1.2 Glossary
1.3 Roadmap
2 Requirements Analysis
2.1 Functional Requirements
2.2 Non-functional Requirements
2.3 Research
2.3.1 big key
2.3.1.1 Redis cli
Use the SCAN command to scan for big keys
redis-cli --bigkeys -i 0.1
output
# Scanning the entire keyspace to find biggest keys as well as
# average sizes per key type. You can use -i 0.1 to sleep 0.1 sec
# per 100 SCAN commands (not usually needed).
[00.00%] Biggest set found so far 'redisorm:xxx2_monit:object:cHjyJazUyOi3ygSD:tags' with 2 members
[00.00%] Biggest string found so far 'redisorm:xxx_redis_matrix:object:vxFtvpeJBFFkBWr2' with 1489 bytes
[00.00%] Biggest set found so far 'redisorm:0AAB5C3FE11F0D59:tags:time:2020-11-23 20:25:49' with 6 members
[00.01%] Biggest hash found so far 'rq:job:c16a763b-4f1b-4544-a410-dd45245d96da' with 10 fields
[00.02%] Biggest set found so far 'redisorm:xxx_conf:object:ZNtmrqCWGoUKIvt4:tags' with 10 members
[00.07%] Biggest string found so far 'redisorm:xxx_conf:object:cTSSZrWoo3vastEW' with 1553 bytes
[00.14%] Biggest set found so far 'redisorm:xxx2_status:object:2u69AiuZR06kNv2p:tags' with 17 members
[00.32%] Biggest string found so far 'redisorm:xxx_replication:object:4DypIgMprlbRnPRF' with 11471 bytes
[00.56%] Biggest set found so far 'redisorm:xxx_replication:tags:time:2021-04-12 16:44' with 59 members
[01.22%] Biggest set found so far 'redisorm:xxx_replication:tags:time:2021-04-12 16:28' with 64 members
[01.24%] Biggest set found so far 'redisorm:xxx_conf:tags:adapter_bin_hash:OK' with 2217 members
[11.60%] Biggest zset found so far 'rq:finished:xxx_service' with 27 members
[12.98%] Biggest zset found so far 'redisorm:xxx_qos:__expire__' with 779 members
[29.24%] Biggest hash found so far 'rq:worker:f6a94ae9bdd0402e872386354b9a4c35' with 11 fields
[47.73%] Biggest set found so far 'redisorm:xxx_redis_matrix:__all__' with 4120 members
[66.18%] Biggest list found so far 'info' with 6 items
[86.92%] Biggest hash found so far 'test' with 18 fields
[96.71%] Biggest zset found so far 'rq:finished:default' with 2277 members
-------- summary -------
Sampled 91031 keys in the keyspace!
Total key length in bytes is 7497297 (avg len 82.36)
Biggest string found 'redisorm:xxx_replication:object:4DypIgMprlbRnPRF' has 11471 bytes
Biggest list found 'info' has 6 items
Biggest set found 'redisorm:xxx_redis_matrix:__all__' has 4120 members
Biggest hash found 'test' has 18 fields
Biggest zset found 'rq:finished:default' has 2277 members
23609 strings with 8878680 bytes (25.94% of keys, avg size 376.07)
1 lists with 6 items (00.00% of keys, avg size 6.00)
65031 sets with 225006 members (71.44% of keys, avg size 3.46)
2377 hashs with 23698 fields (02.61% of keys, avg size 9.97)
13 zsets with 3182 members (00.01% of keys, avg size 244.77)
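Drawbacks:
Production impact: SCAN walks the keyspace incrementally with a cursor, and in production the command can be run against a replica, but it is still an operation on a live service.
For the set, zset, list, and hash types, it only reports the number of elements, and a key with many elements does not necessarily occupy much memory.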
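One way to work around the element-count limitation is to rank keys by the byte size reported by `MEMORY USAGE` (Redis 4.0+) while iterating with SCAN. The sketch below is an illustration under assumptions, not part of redis-cli: `biggest_keys_by_memory` and its parameters are hypothetical names, and `client` is assumed to expose redis-py style `scan_iter` and `memory_usage` methods. For aggregate types `MEMORY USAGE` samples nested values by default, so the figures are estimates.

```python
import heapq
import time


def biggest_keys_by_memory(client, top_n=10, scan_count=100, pause=0.0):
    """Return up to top_n (bytes, key) pairs, largest first.

    Assumption: `client` exposes redis-py style `scan_iter(count=...)`
    and `memory_usage(key)`.
    """
    if top_n <= 0:
        return []
    heap = []  # min-heap holding the largest entries seen so far
    for seen, key in enumerate(client.scan_iter(count=scan_count), start=1):
        size = client.memory_usage(key)
        if size is None:
            # The key expired or was deleted between SCAN and MEMORY USAGE.
            continue
        if len(heap) < top_n:
            heapq.heappush(heap, (size, key))
        elif size > heap[0][0]:
            heapq.heapreplace(heap, (size, key))
        if pause and seen % scan_count == 0:
            # Throttle the scan, similar in spirit to redis-cli's -i option.
            time.sleep(pause)
    return sorted(heap, reverse=True)
```

Like `--bigkeys`, this still touches every key on a live instance, so pointing it at a replica and setting a non-zero `pause` is the cautious choice.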
2.3.2 hot key
2.3.2.1 Facebook redis-faina (based on MONITOR)
Use MONITOR to capture information about recently accessed hot keys
Overall Stats
========================================
Lines Processed 117773
Commands/Sec 11483.44
Top Prefixes
========================================
friendlist 69945
followedbycounter 25419
followingcounter 10139
recentcomments 3276
queued 7
Top Keys
========================================
friendlist:zzz:1:2 534
followingcount:zzz 227
friendlist:zxz:1:2 167
friendlist:xzz:1:2 165
friendlist:yzz:1:2 160
friendlist:gzz:1:2 160
friendlist:zdz:1:2 160
friendlist:zpz:1:2 156
Top Commands
========================================
SISMEMBER 59545
HGET 27681
HINCRBY 9413
SMEMBERS 9254
MULTI 3520
EXEC 3520
LPUSH 1620
EXPIRE 1598
Command Time (microsecs)
========================================
Median 78.25
75% 105.0
90% 187.25
99% 411.0
Heaviest Commands (microsecs)
========================================
SISMEMBER 5331651.0
HGET 2618868.0
HINCRBY 961192.5
SMEMBERS 856817.5
MULTI 311339.5
SADD 54900.75
SREM 40771.25
EXEC 28678.5
Slowest Calls
========================================
3490.75 "SMEMBERS" "friendlist:zzz:1:2"
2362.0 "SMEMBERS" "friendlist:xzz:1:3"
2061.0 "SMEMBERS" "friendlist:zpz:1:2"
1961.0 "SMEMBERS" "friendlist:yzz:1:2"
1947.5 "SMEMBERS" "friendlist:zpz:1:2"
1459.0 "SISMEMBER" "friendlist:hzz:1:2" "zzz"
1416.25 "SMEMBERS" "friendlist:zhz:1:2"
1389.75 "SISMEMBER" "friendlist:zzx:1:2" "zzz"
This requires running the MONITOR command against Redis, so the Redis client-output-buffer issue has to be taken into account
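The counting idea behind a MONITOR-based tool can be sketched in a few lines. This is not redis-faina's code: `parse_monitor_line` and `top_keys` are hypothetical helpers, the line format is assumed to be `<timestamp> [<db> <client>] "CMD" "arg" ...`, and the first argument is treated as the key, which is a simplification that does not hold for every command. Capping the number of lines read (`max_lines`) is one way to keep the MONITOR session short.

```python
import re
import shlex
from collections import Counter

# Assumed line shape: 1339518083.107412 [0 127.0.0.1:60866] "get" "user:1"
_MONITOR_LINE = re.compile(r'^(\d+\.\d+) \[(\d+) (\S+)\] (.+)$')


def parse_monitor_line(line):
    """Return (command, args) for one MONITOR line, or None if it does not match."""
    match = _MONITOR_LINE.match(line.strip())
    if match is None:
        return None
    try:
        parts = shlex.split(match.group(4))
    except ValueError:
        # Unbalanced quotes, for example in a truncated line.
        return None
    if not parts:
        return None
    return parts[0].upper(), parts[1:]


def top_keys(lines, n=10, max_lines=100_000):
    """Count the first argument of each command and return the n most frequent."""
    counts = Counter()
    for index, line in enumerate(lines):
        if index >= max_lines:
            break
        parsed = parse_monitor_line(line)
        if parsed is None:
            continue
        _command, args = parsed
        if args:
            # Simplification: assume the first argument is the key.
            counts[args[0]] += 1
    return counts.most_common(n)
```

`lines` can be any iterable of text lines, for example a file object holding captured MONITOR output.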
2.3.2.2 aof-selector (based on AOF)
https://github.com/hongliuliao/aof-selector
Requires AOF to be enabled on the production Redis
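To illustrate what an AOF-based approach has to do, the sketch below walks a plain-text AOF, which stores each command in the Redis protocol (RESP) form `*<argc>`, `$<len>`, `<bytes>`, and counts how often each key is written. It is not aof-selector's implementation: `iter_aof_commands` and `count_written_keys` are hypothetical names, and the file is assumed to contain only RESP commands with no RDB preamble. Because the AOF records write commands only, this view covers write-hot keys and misses read-hot ones.

```python
import io
from collections import Counter
from typing import BinaryIO, Iterator, List

# Commands whose first argument is not a key (or that take no arguments).
_NON_KEY_COMMANDS = {b"SELECT", b"MULTI", b"EXEC"}


def iter_aof_commands(stream: BinaryIO) -> Iterator[List[bytes]]:
    """Yield each command in a RESP-encoded AOF as a list of byte strings."""
    while True:
        header = stream.readline()
        if not header:
            return  # end of file
        if not header.startswith(b"*"):
            raise ValueError(f"unexpected AOF line: {header!r}")
        argc = int(header[1:].strip())
        args = []
        for _ in range(argc):
            length_line = stream.readline()
            if not length_line.startswith(b"$"):
                raise ValueError(f"unexpected AOF line: {length_line!r}")
            length = int(length_line[1:].strip())
            args.append(stream.read(length))
            stream.read(2)  # consume the trailing CRLF
        yield args


def count_written_keys(stream: BinaryIO) -> Counter:
    """Count writes per key, assuming the first argument is the key."""
    counts = Counter()
    for args in iter_aof_commands(stream):
        if len(args) >= 2 and args[0].upper() not in _NON_KEY_COMMANDS:
            counts[args[1]] += 1
    return counts


if __name__ == "__main__":
    sample = io.BytesIO(b"*3\r\n$3\r\nSET\r\n$1\r\na\r\n$1\r\n1\r\n")
    print(count_written_keys(sample).most_common(1))
```

Reading a copy of the AOF file rather than the live one keeps the analysis off the serving path.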
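2.3.2.3 Alibaba Cloud hot key discovery
Hot data discovery (statistics are collected on the Redis side)
Request statistics
Hot spot location
Hot spot feedback
Requires modifying the Redis kernel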
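The three steps above (request statistics, hot spot location, hot spot feedback) can be sketched as a sliding-window counter. This is only an illustration of the general idea, not Alibaba Cloud's implementation: the class `HotKeyDetector`, its parameters, and the threshold values are all hypothetical, and a server-side version would need a far more memory-efficient structure than one timestamp per request.

```python
from collections import defaultdict, deque


class HotKeyDetector:
    """Flag keys that receive at least `threshold` requests per time window."""

    def __init__(self, window_seconds=1.0, threshold=1000):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._hits = defaultdict(deque)  # key -> timestamps of recent requests

    def _evict(self, key, now):
        """Drop timestamps that have fallen out of the window."""
        hits = self._hits[key]
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if not hits:
            del self._hits[key]

    def record(self, key, now):
        """Request statistics: count one access to `key` at time `now`."""
        self._hits[key].append(now)
        self._evict(key, now)

    def hot_keys(self, now):
        """Hot spot location: keys at or above the threshold inside the window."""
        for key in list(self._hits):
            self._evict(key, now)
        return sorted(
            key for key, hits in self._hits.items() if len(hits) >= self.threshold
        )
```

The list returned by `hot_keys` stands in for the feedback step: a caller could hand it to clients or a proxy so that they cache or split the affected keys.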
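3 Overall Design
The overall design focuses on design and trade-offs
3.1 System Architecture
Typically there is a simple architecture diagram, accompanied by a brief textual explanation of the architecture;
3.2 Module Overview
If the architecture diagram contains many modules, briefly describe what each module does;
3.3 Design and Trade-offs
Design and trade-offs are the most important part of the overall design;
3.4 Potential Risks
4 Detailed Design
The detailed design focuses on being "detailed"
4.1 Module xx
(With the database design, the interfaces, and the flows in hand, a colleague who picks up the detailed design document should largely be able to implement it)
4.1.1 Interaction Flow
Simple interactions can be described in text; for complex interactions, flowcharts, interaction diagrams, or other diagrams are recommended
4.1.2 Database Design
4.1.3 Interface Format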