MXNet 1.6.0 fails at runtime with the following error:
...
cuda error: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading CUDA: unknown error....
Traceback (most recent call last):
File "/home/user1/anaconda3/lib/python3.x/site-packages/mxnet/symbol/symbol.py", line 1488, in simple_bind
ctypes.byref(exe_handle)))
File "/home/user1/anaconda3/lib/python3.x/site-packages/mxnet/base.py", line 146, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/user1/mxnetproject/new_scene.py", line 90, in <module>
mod_score = fit(new_sym, new_args, aux_params, train, val, batch_size, num_gpus=1)
File "/home/user1/mxnetproject/new_scene.py", line 84, in fit
eval_metric='acc')
File "/home/user1/anaconda3/lib/python3.6/site-packages/mxnet/module/base_module.py", line 460, in fit
for_training=True, force_rebind=force_rebind)
File "/home/user1/anaconda3/lib/python3.6/site-packages/mxnet/module/module.py", line 428, in bind
state_names=self._state_names)
File "/home/user1/anaconda3/lib/python3.6/site-packages/mxnet/module/executor_group.py", line 237, in __init__
self.bind_exec(data_shapes, label_shapes, shared_group)
File "/home/user1/anaconda3/lib/python3.6/site-packages/mxnet/module/executor_group.py", line 333, in bind_exec
shared_group))
File "/home/user1/anaconda3/lib/python3.6/site-packages/mxnet/module/executor_group.py", line 611, in _bind_ith_exec
shared_buffer=shared_data_arrays, **input_shapes)
File "/home/user1/anaconda3/lib/python3.6/site-packages/mxnet/symbol/symbol.py", line 1494, in simple_bind
raise RuntimeError(error_msg)
...
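Check: run $ nvidia-smi
See whether one of the GPUs is reporting an error, since a faulted card is what causes the bind failure. If one is, try rebooting the machine. If the problem persists after the reboot, try running on the remaining GPUs only.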
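
The check and the fallback to the remaining GPUs can be sketched in Python as below. This is only a rough sketch under several assumptions: the helper names healthy_gpu_indices and restrict_to_healthy_gpus are made up for illustration, the choice of query fields is arbitrary, and treating any row that contains "err" or "lost" as a faulted card is a heuristic, because the exact text nvidia-smi prints for a failed GPU may differ between driver versions.

```python
import os
import subprocess

# nvidia-smi query used by this sketch. The fields are chosen for illustration;
# any per-GPU fields would do as long as the index comes first.
QUERY = ["nvidia-smi",
         "--query-gpu=index,temperature.gpu,memory.used",
         "--format=csv,noheader"]


def healthy_gpu_indices(csv_text):
    """Return the indices of GPUs whose row shows no error marker.

    Assumption: a faulted GPU shows up with text such as "ERR!",
    "[Unknown Error]" or "GPU is lost" in its row.
    """
    healthy = []
    for line in csv_text.strip().splitlines():
        fields = [f.strip() for f in line.split(",")]
        if not fields or not fields[0].isdigit():
            continue  # no usable GPU index on this row
        row = line.lower()
        if "err" in row or "lost" in row:
            continue  # heuristic: this card is reporting a fault
        healthy.append(int(fields[0]))
    return healthy


def restrict_to_healthy_gpus():
    """Hide faulted GPUs from CUDA. Call this before importing mxnet."""
    result = subprocess.run(QUERY, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, universal_newlines=True)
    ids = healthy_gpu_indices(result.stdout)
    # Make CUDA number devices in the same order as nvidia-smi.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in ids)
    return ids
```

Both environment variables have to be set before CUDA is initialized, so restrict_to_healthy_gpus() should run before mxnet is imported. The visible GPUs are then renumbered from 0, so the num_gpus value passed to fit should not exceed the number of indices returned.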