TensorBoard on Docker
TensorBoard was developed by Google to speed up debugging of TensorFlow programs and to visualize the training process. However, I did not get a chance to use this tool until very recently. This post documents the most basic procedure for viewing, from your local computer, a TensorBoard instance that runs inside a Docker container on a remote server.
TensorBoard does not run on its own. You need to add new components to your TensorFlow code to make it record the tensor values that TensorBoard will display.
Please check the official TensorBoard Tutorial for how to add such components.
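To reach TensorBoard from outside the container, a port on the Docker container has to be connected to a port on the server. This is usually done via the -p argument of the docker run command. TensorBoard uses port 6006 by default, so we connect port 6006 (0.0.0.0:6006) on the Docker container to port 5001 (0.0.0.0:5001) on the server.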
$ nvidia-docker run -it --name leimao-speech-instance -v /home/leimao/workspace:/workspace -p 5000:8888 -p 5001:6006 leimao/speech
To exit the Docker container while keeping it running in the background, press Ctrl+P followed by Ctrl+Q. Then run docker container ls in the server terminal to check whether the port has been connected successfully.
$ docker container ls
To connect the local port to the server port, in our local terminal:
$ ssh -L 127.0.0.1:16006:0.0.0.0:5001 username@server
We write out the full address and port on both sides (127.0.0.1:16006 locally, 0.0.0.0:5001 on the server) because the server terminal sometimes prints warnings if we do not.
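The chain of ports above (container 6006, server 5001, local 16006) is easy to mix up. The following is a minimal Python sketch that only builds the two command fragments as strings so the order of the numbers is explicit. The helper names docker_publish_flag and ssh_tunnel_command are made up for this illustration and are not part of Docker or SSH.

```python
def docker_publish_flag(server_port, container_port):
    """Return the -p argument for docker run: server port first, container port second."""
    return f"-p {server_port}:{container_port}"


def ssh_tunnel_command(local_port, server_port, user_host,
                       local_bind="127.0.0.1", remote_bind="0.0.0.0"):
    """Return the ssh -L command that forwards a local port to a server port."""
    return (f"ssh -L {local_bind}:{local_port}:"
            f"{remote_bind}:{server_port} {user_host}")


# Port numbers taken from this post: container 6006 -> server 5001 -> local 16006.
print(docker_publish_flag(5001, 6006))
print(ssh_tunnel_command(16006, 5001, "username@server"))
```

In both commands the side closer to you comes first: server:container for docker run -p, and local:server for ssh -L.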
To restart the docker container, in our server terminal:
$ docker start -i 05ee0d5a5a0e
To start the TensorBoard service, in our docker container terminal:
$ tensorboard --logdir ./graphs/rnn/ &
We have to specify the directory that holds the TensorBoard records in the --logdir argument. There might be leftover records from earlier runs in that directory. If we want to monitor a new training run, it is better to clean the directory before the new records are written.
The & sign is used to run TensorBoard in the background.
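Cleaning the record directory can be done by hand, but it can also be scripted at the start of training. Below is a minimal sketch using only the Python standard library. The function name reset_logdir is hypothetical, and the sketch assumes it is acceptable to delete everything under the given directory.

```python
import shutil
from pathlib import Path


def reset_logdir(logdir):
    """Delete everything under logdir and leave an empty directory behind."""
    path = Path(logdir)
    if path.exists():
        # Remove old event files so TensorBoard only shows the new run.
        shutil.rmtree(path)
    path.mkdir(parents=True)
    return path


# Example call (commented out because it deletes files):
# reset_logdir("./graphs/rnn/")
```

Calling this before the training code starts writing records means the next TensorBoard session shows only the new run. If old runs should be kept for comparison, writing each run to its own subdirectory is the safer choice.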
After TensorBoard starts successfully, we will see a message like this:
TensorBoard 1.8.0 at http://05ee0d5a5a0e:6006 (Press CTRL+C to quit)
To kill the TensorBoard process running in the background if necessary, we first look up the PID (Process ID) of TensorBoard with a process-listing command such as ps.
In our case, the PID of TensorBoard is 56. We can then kill the process in the terminal:
$ kill -9 56
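Looking up the PID can also be scripted. The sketch below is one possible way to do it, assuming a ps that accepts -eo pid,args (as on most Linux systems). The function names pids_matching and tensorboard_pids are made up for this example.

```python
import subprocess


def pids_matching(ps_output, keyword):
    """Return PIDs from `ps -eo pid,args` output whose command contains keyword."""
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 1)  # split into PID and the full command line
        if len(parts) == 2 and parts[0].isdigit() and keyword in parts[1]:
            pids.append(int(parts[0]))
    return pids


def tensorboard_pids():
    """List PIDs of running processes whose command line mentions tensorboard."""
    result = subprocess.run(["ps", "-eo", "pid,args"],
                            capture_output=True, text=True, check=True)
    return pids_matching(result.stdout, "tensorboard")
```

Inside the container, tensorboard_pids() should return a list containing the PID to pass to kill. An empty list means no matching process was found.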
Run the TensorFlow program in Docker container terminal:
$ python main.py
If necessary, press Ctrl+P followed by Ctrl+Q to exit the Docker container while keeping it running in the background.
Open a web browser, such as Chrome, and go to the URL http://127.0.0.1:16006. We can now see TensorBoard and keep track of the training process on the remote server from our local computer.
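If the page does not load, a quick first check is whether anything is listening on the local end of the tunnel. The following is a minimal sketch with Python's socket module. The function name port_is_open is made up for this example. Note that a successful connection only shows that the SSH tunnel is listening locally, not that TensorBoard is answering on the server.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 16006 is the local end of the SSH tunnel used in this post.
    print(port_is_open("127.0.0.1", 16006))
```

A result of False usually means the ssh -L session is not running. A result of True with a blank page points at the server side instead, for example TensorBoard not being started in the container.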