Failed to create session. CUDA_ERROR_INVALID_DEVICE
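This error can occur when an older GPU is installed in the machine. In the log below, TensorFlow finds device 0 (a GeForce GTX 1070) but then fails while initializing CUDA device ordinal 1, presumably the older card.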
$ cd /usr/local/lib/python3.5/dist-packages/tensorflow/models/image/mnist
$ python3 convolutional.py
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX 1070
major: 6 minor: 1 memoryClockRate (GHz) 1.7465
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 7.79GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x314aa00
E tensorflow/core/common_runtime/direct_session.cc:135] Internal: failed initializing StreamExecutor for CUDA device ordinal 1: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_INVALID_DEVICE
Traceback (most recent call last):
  File "convolutional.py", line 339, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "convolutional.py", line 284, in main
    with tf.Session() as sess:
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1186, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 551, in __init__
    self._session = tf_session.TF_NewDeprecatedSession(opts, status)
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.
Running the script as follows makes only GPU 0 visible to it, so it runs on that GPU alone.
$ CUDA_VISIBLE_DEVICES=0 python3 convolutional.py
Exporting the variable first and then running the script also works.
$ export CUDA_VISIBLE_DEVICES=0
$ python3 convolutional.py
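The same restriction can probably be applied from inside a Python script instead of the shell. The following is a minimal sketch, not something taken from convolutional.py: the helper name restrict_visible_gpus is hypothetical, and it assumes the variable is set before TensorFlow is imported, since CUDA reads CUDA_VISIBLE_DEVICES when it initializes.

```python
import os


def restrict_visible_gpus(device_ids):
    """Hypothetical helper: expose only the given CUDA device ordinals.

    Sets CUDA_VISIBLE_DEVICES for the current process and returns the
    value that was written.
    """
    value = ",".join(str(i) for i in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Make only GPU 0 visible, mirroring the shell commands above.
restrict_visible_gpus([0])

# Assumption: TensorFlow must be imported only after the variable is set.
# The import is left commented out so the sketch runs without TensorFlow.
# import tensorflow as tf
```

Setting the variable in the shell stays the simpler choice when the script itself should not be modified.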
The following environment variables also appear to be necessary.
$ export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
$ export CUDA_HOME=/usr/local/cuda
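To check whether these settings are actually in effect, a small script along these lines may help. It is only a sketch: the function name missing_cuda_paths is hypothetical, the two directories are the ones listed above, and it only inspects the LD_LIBRARY_PATH string without verifying that the libraries themselves can be loaded.

```python
import os

# Directories taken from the export line above.
REQUIRED_DIRS = [
    "/usr/local/cuda/lib64",
    "/usr/local/cuda/extras/CUPTI/lib64",
]


def missing_cuda_paths(required, ld_library_path=None):
    """Return the required directories that are absent from LD_LIBRARY_PATH."""
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    # Split on ":" and drop empty entries left by a leading or trailing colon.
    entries = [p for p in ld_library_path.split(":") if p]
    return [p for p in required if p not in entries]


missing = missing_cuda_paths(REQUIRED_DIRS)
if missing:
    print("Not in LD_LIBRARY_PATH:", ", ".join(missing))
else:
    print("LD_LIBRARY_PATH contains the CUDA directories.")
print("CUDA_HOME =", os.environ.get("CUDA_HOME", "(not set)"))
```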