1.原始数据
2. 需求
3.思路
- 首先考虑需要拆分code,分成三个字段device_code(设备编码),kpi_code(指标code), shift(班次)
- 再根据以上三个字段分组,再每个设备,当天当班次的指标到一个字典里。
4.操作
db.datas.aggregate([{
'$match': {
'code': {
'$in': ['dev_plc_LS_01_001~zb_r_cl_bc1', 'dev_plc_LS_01_001~zb_r_cl_bc2', 'dev_plc_LS_01_001~zb_r_bhgs_bc1','dev_plc_LS_01_001~zb_r_bhgs_bc2', 'dev_plc_LS_01_001~zb_r_sbjgsj_bc1', 'dev_plc_LS_01_001~zb_r_sbjgsj_bc2','dev_plc_LS_01_001~zb_r_sbsjsj_bc1', 'dev_plc_LS_01_001~zb_r_sbsjsj_bc2', 'dev_plc_LS_01_001~zb_r_sbsbsj_bc1','dev_plc_LS_01_001~zb_r_sbsbsj_bc2', 'dev_plc_LS_01_001~zb_r_sbrlsj_bc1', 'dev_plc_LS_01_001~zb_r_sbrlsj_bc2','dev_plc_LS_01_001~zb_r_sbyxsj_bc1', 'dev_plc_LS_01_001~zb_r_sbyxsj_bc2', 'dev_plc_LS_01_002~zb_r_cl_bc1','dev_plc_LS_01_002~zb_r_cl_bc2', 'dev_plc_LS_01_002~zb_r_bhgs_bc1', 'dev_plc_LS_01_002~zb_r_bhgs_bc2','dev_plc_LS_01_002~zb_r_sbjgsj_bc1', 'dev_plc_LS_01_002~zb_r_sbjgsj_bc2', 'dev_plc_LS_01_002~zb_r_sbsjsj_bc1','dev_plc_LS_01_002~zb_r_sbsjsj_bc2', 'dev_plc_LS_01_002~zb_r_sbsbsj_bc1', 'dev_plc_LS_01_002~zb_r_sbsbsj_bc2','dev_plc_LS_01_002~zb_r_sbrlsj_bc1', 'dev_plc_LS_01_002~zb_r_sbrlsj_bc2', 'dev_plc_LS_01_002~zb_r_sbyxsj_bc1','dev_plc_LS_01_002~zb_r_sbyxsj_bc2', 'dev_plc_LS_01_003~zb_r_cl_bc1', 'dev_plc_LS_01_003~zb_r_cl_bc2','dev_plc_LS_01_003~zb_r_bhgs_bc1', 'dev_plc_LS_01_003~zb_r_bhgs_bc2', 'dev_plc_LS_01_003~zb_r_sbjgsj_bc1','dev_plc_LS_01_003~zb_r_sbjgsj_bc2', 'dev_plc_LS_01_003~zb_r_sbsjsj_bc1', 'dev_plc_LS_01_003~zb_r_sbsjsj_bc2','dev_plc_LS_01_003~zb_r_sbsbsj_bc1', 'dev_plc_LS_01_003~zb_r_sbsbsj_bc2', 'dev_plc_LS_01_003~zb_r_sbrlsj_bc1','dev_plc_LS_01_003~zb_r_sbrlsj_bc2', 'dev_plc_LS_01_003~zb_r_sbyxsj_bc1', 'dev_plc_LS_01_003~zb_r_sbyxsj_bc2','dev_plc_LS_01_004~zb_r_cl_bc1', 'dev_plc_LS_01_004~zb_r_cl_bc2', 'dev_plc_LS_01_004~zb_r_bhgs_bc1','dev_plc_LS_01_004~zb_r_bhgs_bc2', 'dev_plc_LS_01_004~zb_r_sbjgsj_bc1', 'dev_plc_LS_01_004~zb_r_sbjgsj_bc2','dev_plc_LS_01_004~zb_r_sbsjsj_bc1', 'dev_plc_LS_01_004~zb_r_sbsjsj_bc2', 'dev_plc_LS_01_004~zb_r_sbsbsj_bc1','dev_plc_LS_01_004~zb_r_sbsbsj_bc2', 'dev_plc_LS_01_004~zb_r_sbrlsj_bc1', 'dev_plc_LS_01_004~zb_r_sbrlsj_bc2','dev_plc_LS_01_004~zb_r_sbyxsj_bc1', 'dev_plc_LS_01_004~zb_r_sbyxsj_bc2']},'timestamp': {
'$gte': ISODate("2020-01-01T00:00:00.000+0000"), '$lte': ISODate("2020-12-31T00:00:00.000+0000")},'value': {
'$ne': 0}}},{
'$addFields': {
"d_c": {
"$split": ["$code", "~"]}}},{
'$addFields': {
'device_code': {
"$arrayElemAt": ["$d_c", 0]},'kpi_code': {
"$arrayElemAt": ["$d_c", 1]}}},{
'$addFields': {
"length": {
'$strLenCP': "$kpi_code"}}},{
'$addFields': {
"kpi_length": {
'$toInt': {
'$subtract': ["$length", 4]}}}},{
'$project': {
'_id': 0,'device_code': '$device_code','timestamp': '$timestamp','value': '$value','kpi_code': {
'$substr': ["$kpi_code", 0, "$kpi_length"]},'shift': {
'$substr': ["$kpi_code", {
'$toInt': {
'$add': ["$kpi_length", 1]}}, 3]}}},{
'$group': {
'_id': {
'device_code': '$device_code', 'timestamp': '$timestamp', 'shift': '$shift'},"kpi_dict": {
"$push": {
"k": "$kpi_code", "v": "$value"}}}},{
'$project': {
"_id": 0,'shift': '$_id.shift','device_code': '$_id.device_code','kpi_dict': {
"$arrayToObject": "$kpi_dict"},'timestamp': '$_id.timestamp'}},{
'$sort': {
'device_code': 1, 'timestamp': 1, 'shift': 1}}])
5.解析
5.1 '$match'
操作符先匹配出满足条件的数据,缩小数据操作范围
{
'$match': {
'code': {
'$in': ['dev_plc_LS_01_001~zb_r_cl_bc1', 'dev_plc_LS_01_001~zb_r_cl_bc2']},'timestamp': {
'$gte': ISODate("2020-01-01T00:00:00.000+0000"), '$lte': ISODate("2020-12-31T00:00:00.000+0000")},'value': {
'$ne': 0}}
}
5.2 '$split'
拆分code字段分成两部分,设备编码和带班次的指标code. '$addFields'
增加字段d_c,内容是个数组。如['dev_plc_LS_01_001', 'zb_r_cl_bc1']
{
'$addFields': {
"d_c": {
"$split": ["$code", "~"]}}
}
5.3 '$arrayElemAt'
按下标取数组的数据区分出设备编码和指标code,再'$addFields'
增加以上两个字段,分别命名device_code,kpi_code
{
'$addFields': {
'device_code': {
"$arrayElemAt": ["$d_c", 0]},'kpi_code': {
"$arrayElemAt": ["$d_c", 1]}}
}
5.4 接下来要拆分kpi_code字段,可以确定的是班次名长度是3,所以先'$strLenCP'
统计每个kpi_code总的长度'length
’,再统计每个kpi_code使用'$subtract'
操作符减去’_bc1’后缀以后的’指标长度"kpi_length"
{
'$addFields': {
"length": {
'$strLenCP': "$kpi_code"}}},
{
'$addFields': {
"kpi_length": {
'$toInt': {
'$subtract': ["$length", 4]}}}
},
5.5 由于上述操作引入了很多中间过渡字段,所以这里'$project'
过滤出实际需要的字段。并使用'$substr'
操作符取’kpi_code’下标0开始长度为kpi_length长度即为拆分后的真正指标code,班次shift取’kpi_code’下标kpi_length+1,长度为3得到shift
{
'$project': {
'_id': 0,'device_code': '$device_code','timestamp': '$timestamp','value': '$value','kpi_code': {
'$substr': ["$kpi_code", 0, "$kpi_length"]},'shift': {
'$substr': ["$kpi_code", {
'$toInt': {
'$add': ["$kpi_length", 1]}}, 3]}}},
5.6 对数据进行分组,按设备编码、时间、班次分组,并使用"$push"
操作符将每个设备每天每个班次的指标汇总起来。
{
'$group': {
'_id': {
'device_code': '$device_code', 'timestamp': '$timestamp', 'shift': '$shift'},"kpi_dict": {
"$push": {
"k": "$kpi_code", "v": "$value"}}}
},
5.7 数据最终显示与排序"$arrayToObject"
将数组转为字典(文档)格式
{
'$project': {
"_id": 0,'shift': '$_id.shift','device_code': '$_id.device_code','kpi_dict': {
"$arrayToObject": "$kpi_dict"},'timestamp': '$_id.timestamp'}},{
'$sort': {
'device_code': 1, 'timestamp': 1, 'shift': 1}}
至此,本次mongo聚合操作结束,大功告成!水平有限,上述如有优化还请不吝赐教!