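Problem description:
While training on my own data with the R-FCN object detection API, I made some changes to the original code. In the original rfcn_resnet101_coco.config file, batch_size is 1, meaning one image is processed per step. Changing it to any other value (2, 8, 16, 32, ...) produces the following error: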
InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [1,600,750,3] vs. shape[2] = [1,600,692,3]
[[node concat (defined at /home/ubuntu/luanbo/VOCMaker/VOCMaker/object_detection/legacy/trainer.py:190) = ConcatV2[N=64, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Preprocessor/sub, Preprocessor_1/sub, Preprocessor_2/sub, Preprocessor_3/sub, Preprocessor_4/sub, Preprocessor_5/sub, Preprocessor_6/sub, Preprocessor_7/sub, Preprocessor_8/sub, Preprocessor_9/sub, Preprocessor_10/sub, Preprocessor_11/sub, Preprocessor_12/sub, Preprocessor_13/sub, Preprocessor_14/sub, Preprocessor_15/sub, Preprocessor_16/sub, Preprocessor_17/sub, Preprocessor_18/sub, Preprocessor_19/sub, Preprocessor_20/sub, Preprocessor_21/sub, Preprocessor_22/sub, Preprocessor_23/sub, Preprocessor_24/sub, Preprocessor_25/sub, Preprocessor_26/sub, Preprocessor_27/sub, Preprocessor_28/sub, Preprocessor_29/sub, Preprocessor_30/sub, Preprocessor_31/sub, Preprocessor_32/sub, Preprocessor_33/sub, Preprocessor_34/sub, Preprocessor_35/sub, Preprocessor_36/sub, Preprocessor_37/sub, Preprocessor_38/sub, Preprocessor_39/sub, Preprocessor_40/sub, Preprocessor_41/sub, Preprocessor_42/sub, Preprocessor_43/sub, Preprocessor_44/sub, Preprocessor_45/sub, Preprocessor_46/sub, Preprocessor_47/sub, Preprocessor_48/sub, Preprocessor_49/sub, Preprocessor_50/sub, Preprocessor_51/sub, Preprocessor_52/sub, Preprocessor_53/sub, Preprocessor_54/sub, Preprocessor_55/sub, Preprocessor_56/sub, Preprocessor_57/sub, Preprocessor_58/sub, Preprocessor_59/sub, Preprocessor_60/sub, Preprocessor_61/sub, Preprocessor_62/sub, Preprocessor_63/sub, Loss/RPNLoss/Loss/huber_loss/assert_broadcastable/is_valid_shape/has_valid_nonscalar_shape/has_invalid_dims/x)]]
[[{ {node assert_equal_3/y/_3989}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_23268_assert_equal_3/y", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Workaround:
Changing batch_size back to 1 makes the error go away.
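Cause analysis:
I suspected the 'num_clones' and 'worker_replicas' parameters in train.py, but after setting them to the same value as batch_size I hit a different error (I forgot to take a screenshot; I will add it later if I get the chance). Setting them back to the original value of 1 fixed that. I do not understand why yet, so I am leaving it like this for now and will come back to fill in the explanation once I figure it out.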
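One possible explanation, offered as a guess and not as a confirmed diagnosis: the two shapes in the error message, [1,600,750,3] and [1,600,692,3], differ only in width, which is what you would expect if each image is resized with its aspect ratio preserved before the images are concatenated into a batch. The sketch below is a simplified pure-Python model of such a resizer, not the API's actual code. The values min_dimension=600 and max_dimension=1024 and the two input image sizes are assumptions chosen so that the output reproduces the shapes in the log.

```python
def keep_aspect_ratio_shape(height, width, min_dimension=600, max_dimension=1024):
    """Simplified model: scale so the short side equals min_dimension,
    unless that would push the long side past max_dimension."""
    short_side, long_side = min(height, width), max(height, width)
    scale = min_dimension / short_side
    if round(long_side * scale) > max_dimension:
        scale = max_dimension / long_side
    return round(height * scale), round(width * scale)


def can_concat_along_batch(shapes):
    """Concatenation on axis 0 needs every other dimension to match."""
    return len({shape[1:] for shape in shapes}) == 1


# Hypothetical (height, width) of two training images, for illustration only.
image_sizes = [(400, 500), (333, 384)]

# Shape of each preprocessed image as a [1, H, W, 3] tensor.
batch_shapes = [(1, *keep_aspect_ratio_shape(h, w), 3) for h, w in image_sizes]
print(batch_shapes)
print(can_concat_along_batch(batch_shapes))

# If every image were instead padded to max_dimension x max_dimension,
# all shapes would agree and the concat would be valid.
padded_shapes = [(1, 1024, 1024, 3) for _ in image_sizes]
print(can_concat_along_batch(padded_shapes))
```

If this is indeed the cause, then with batch_size 1 there is only one image per step and nothing to mismatch, which would be consistent with the workaround above. Options worth trying, untested here, are switching the config's image_resizer to fixed_shape_resizer, or setting pad_to_max_dimension: true inside keep_aspect_ratio_resizer.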
Here are a few write-ups of similar errors, though I do not know whether they are related to my problem:
"Dimensions of inputs should match" error with bidirectional_dynamic_rnn
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match