0490 - How to Compile TensorFlow 1.8 and 1.12 with CUDA 9.2 for a GPU Environment

Author: 李继武

1

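Purpose of This Document

CDSW has supported GPUs since version 1.1.0. For details, see Fayson's earlier article:

"How to Use GPUs for Deep Learning in CDSW"

The latest CDSW GPU support page lists the matching NVIDIA driver, CUDA, and TensorFlow versions.

Note that the CUDA version listed there is 9.2, whereas the officially released prebuilt TensorFlow binaries are still built against CUDA 9.0. To run TensorFlow on GPUs in the CDSW environment, CUDA 9.2 must be used, so TensorFlow has to be compiled manually from source.

This document uses TensorFlow 1.8 and TensorFlow 1.12 as examples, building them against CUDA 9.2 and cuDNN 7.2.1.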

2

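Install the Packages and Environment Required for Compilation

The steps in this section are the same for both versions.

1. Add JDK 1.8 to the environment variables.

2. Run the following commands to install the dependency packages:

yum -y install numpy
yum -y install python-devel
yum -y install python-pip
yum -y install python-wheel
yum -y install epel-release
yum -y install gcc-c++
pip install --upgrade pip enum34
pip install keras --user
pip install mock

If a package is not available during installation, download it from the address below and set up a local yum repository: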

3. Download and install CUDA 9.2

Download the CUDA 9.2 installer from the address below:

;target_arch=x86_64&target_distro=RHEL&target_version=7&target_type=runfilelocal

Select the runfile (local) version:

Upload it to the server:

Make the file executable and run it:

chmod +x cuda_9.2.148_396.37_linux.run
./cuda_9.2.148_396.37_linux.run

Add CUDA to the environment variables:

export PATH=/usr/local:$PATH
export LD_LIBRARY_PATH=/usr/local:$LD_LIBRARY_PATH

The following commands should then print the CUDA version:

source /etc/profile
nvcc -V
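The export lines above point at /usr/local as printed. As a hedged sketch, assuming the toolkit was installed under /usr/local/cuda (the directory used in the cuDNN step that follows), with binaries in bin and libraries in lib64, the snippet below adds those directories only once, so sourcing the profile repeatedly does not keep growing the variables. The helper name prepend_path and the CUDA_HOME variable are illustrative and do not come from the original article.

```shell
# Prepend DIR to a colon-separated LIST unless it is already an entry.
# Usage: prepend_path DIR LIST   (prints the resulting list)
prepend_path() {
    dir=$1
    list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;            # already present: unchanged
        *)          printf '%s\n' "${dir}${list:+:$list}" ;;  # prepend, no stray colon
    esac
}

# ASSUMPTION: the CUDA toolkit lives under /usr/local/cuda.
CUDA_HOME=${CUDA_HOME:-/usr/local/cuda}

PATH=$(prepend_path "$CUDA_HOME/bin" "$PATH")
LD_LIBRARY_PATH=$(prepend_path "$CUDA_HOME/lib64" "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```

If these lines are placed in /etc/profile, running source /etc/profile and then nvcc -V as described above should report the toolkit release, provided the assumed install directory is correct.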

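4. Download and install cuDNN v7.2.1

Download cuDNN v7.2.1 from the address below. Registration is required before downloading:

Upload it to the CUDA installation directory /usr/local/cuda on the server and extract it there: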

tar -zxvf cudnn-9.2-linux-x64-v7.2.1.38.tgz

In that directory, run the following commands to add cuDNN to the CUDA libraries:

sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
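Once the cuDNN files are in place, it can be worth confirming that the header really is version 7.2.1 before configuring TensorFlow. The sketch below reads the CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL macros that cuDNN 7 headers define. The function name cudnn_version is illustrative, and the default header path assumes the /usr/local/cuda layout used in this article.

```shell
# Print the cuDNN version (major.minor.patch) found in a cudnn.h header.
# Usage: cudnn_version [HEADER]   (default: /usr/local/cuda/include/cudnn.h)
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END {
             if (major == "") exit 1     # no version macros found
             print major "." minor "." patch
         }' "${1:-/usr/local/cuda/include/cudnn.h}"
}

# Example call (commented out so the snippet has no side effects):
# cudnn_version
```

The value printed should match the cuDNN version entered later at the ./configure prompt.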

Go to the lib64 directory and create a symbolic link:

cd /usr/local/cuda/lib64
ln -s stub libcuda.

3

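Install the Bazel Build Tool

Different TensorFlow versions require different Bazel versions, and a Bazel release that is too new sometimes causes build errors.

A. TensorFlow 1.12 uses Bazel 0.19.2:

1. Download bazel-0.19.2: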

wget

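2. Add execute permission and run it:

chmod +x bazel-0.19.2-in
./bazel-0.19.2-in --user

The --user flag installs Bazel into the $HOME/bin directory and sets the .bazelrc path to $HOME/.bazelrc. Run the installer with --help to see other installation options.

The following prompt indicates that the installation succeeded:

If you ran the Bazel installer with the --user flag as above, the Bazel executable is installed in $HOME/bin. It is a good idea to add this directory to your default path, as follows: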

export PATH=$HOME/bin:$PATH

B. TensorFlow 1.8 uses Bazel 0.13.0:

1. Download bazel-0.13.0:

wget https://github.com/bazelbuild/bazel/releases/download/0.13.0/bazel-0.13.0-in

The remaining steps are the same as for installing bazel-0.19.2 above.
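Because a mismatched Bazel is an easy mistake when switching between the two source trees, a small guard can be run before ./configure. The sketch below encodes only the two pairings stated in this section (TensorFlow 1.12 with Bazel 0.19.2, TensorFlow 1.8 with Bazel 0.13.0) and reads the "Build label:" line that bazel version prints. The function names are illustrative.

```shell
# Print the Bazel version this article pairs with a TensorFlow release.
bazel_for_tf() {
    case "$1" in
        1.12|1.12.*) echo "0.19.2" ;;
        1.8|1.8.*)   echo "0.13.0" ;;
        *) echo "no Bazel pairing recorded for TensorFlow $1" >&2; return 1 ;;
    esac
}

# Read `bazel version` output on stdin and print the "Build label" value.
parse_bazel_label() {
    sed -n 's/^Build label: *//p' | head -n 1
}

# Compare the installed Bazel with the one expected for a TensorFlow release.
# Usage: check_bazel 1.12
check_bazel() {
    want=$(bazel_for_tf "$1") || return 1
    have=$(bazel version 2>/dev/null | parse_bazel_label)
    if [ "$have" = "$want" ]; then
        echo "OK: bazel $have matches TensorFlow $1"
    else
        echo "MISMATCH: TensorFlow $1 expects bazel $want, found '${have:-none}'" >&2
        return 1
    fi
}
```

Running check_bazel 1.8 or check_bazel 1.12 from a shell whose PATH includes $HOME/bin reports whether the expected installer was used.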

4

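Download the TensorFlow Source Code

A. Download the latest TensorFlow:

git clone --recurse-submodules

This command creates a tensorflow directory under the current directory and downloads the latest TensorFlow source code into it:

At the time of writing, the latest TensorFlow version was 1.12.

B. Download TensorFlow 1.8: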

wget /archive/v1.8.0.tar.gz

Extract it into the current folder:

tar -zxvf v1.8.0.tar.gz

5

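Configure TensorFlow

The configuration differs slightly between the two versions.

A. TensorFlow 1.12

Go to the TensorFlow 1.12 source directory, run ./configure, and answer the prompts as follows: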

[root@cdh4 tensorflow]# ./configure
Extracting Bazel installation...
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
INFO: Invocation ID: cc8b0ee2-5e84-4995-ba12-2c922ee3646b
You have bazel 0.19.2 installed.
Please specify the location of python. [Default is /usr/bin/python]:

Found possible Python library paths:
  /usr/lib
  /usr/lib64
Please input the desired Python library path to use. Default is [/usr/lib]

Do you wish to build TensorFlow with XLA JIT support? [Y/n]: n
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 9.0]: 9.2

Please specify the location where CUDA 9.2 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]: 7.2.1

Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Do you wish to build TensorFlow with TensorRT support? [y/N]: n
No TensorRT support will be enabled for TensorFlow.

Please specify the locally installed NCCL version you want to use. [Default is to use ]:

Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: .
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.5,7.0]:

Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:

Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
    --config=mkl             # Build with MKL support.
    --config=monolithic      # Config for mostly static monolithic build.
    --config=gdr             # Build with GDR support.
    --config=verbs           # Build with libverbs support.
    --config=ngraph          # Build with Intel nGraph support.
    --config=dynamic_kernels # (Experimental) Build kernels into separate shared objects.
Preconfigured Bazel build configs to DISABLE default on features:
    --config=noaws           # Disable AWS S3 filesystem support.
    --config=nogcp           # Disable GCP support.
    --config=nohdfs          # Disable HDFS support.
    --config=noignite        # Disable Apacha Ignite support.
    --config=nokafka         # Disable Apache Kafka support.
    --config=nonccl          # Disable NVIDIA NCCL support.
Configuration finished

B. TensorFlow 1.8

Go to the TensorFlow 1.8 source directory, run ./configure, and answer the prompts as follows:

[root@cdh2 ]# ./configure
WARNING: Running Bazel server needs to be killed, because the startup options are different.
You have bazel 0.13.0 installed.
Please specify the location of python. [Default is /usr/bin/python]:

Found possible Python library paths:
  /usr/lib
  /usr/lib64
Please input the desired Python library path to use. Default is [/usr/lib]

Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: y
jemalloc as malloc support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
No Google Cloud Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
No Hadoop File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
No Amazon S3 File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: n
No Apache Kafka Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [y/N]: n
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with GDR support? [y/N]: n
No GDR support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]: n
No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]: 9.2

Please specify the location where CUDA 9.2 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 7.2.1

Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Do you wish to build TensorFlow with TensorRT support? [y/N]: n
No TensorRT support will be enabled for TensorFlow.

Please specify the NCCL version you want to use. [Leave empty to default to NCCL 1.3]:

Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: .
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.5,5.2]

Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:

Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tool for more details.
    --config=mkl         # Build with MKL support.
    --config=monolithic  # Config for mostly static monolithic build.
Configuration finished

6

Compile TensorFlow

Both versions are compiled with the following command:

bazel build --config=opt --config=cuda --config=monolithic //tensorflow/tools/pip_package:build_pip_package

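Note: run this command from the TensorFlow source directory.

The build starts:

Wait for the build to finish. It takes a long time. The following prompt indicates that the build succeeded.

After the build finishes, run the following command: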

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

When it completes, the built TensorFlow installation package can be found in /tmp/tensorflow_pkg:
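The exact wheel file name depends on the TensorFlow and Python versions, so scripts that install the package right after the build may prefer to discover it instead of hard-coding it. The sketch below is one way to do that. The function name newest_wheel is illustrative, and it relies on the -nt file test, which bash and dash provide.

```shell
# Print the most recently modified .whl file in DIR.
# Usage: newest_wheel [DIR]   (default: /tmp/tensorflow_pkg)
# Returns non-zero when the directory holds no wheel.
newest_wheel() {
    dir=${1:-/tmp/tensorflow_pkg}
    newest=
    for f in "$dir"/*.whl; do
        [ -e "$f" ] || continue              # glob matched nothing
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Example call (commented out so the snippet has no side effects):
# sudo pip install "$(newest_wheel /tmp/tensorflow_pkg)"
```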

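Note: insufficient disk space or memory during compilation will cause the build to fail. Insufficient memory may produce the error below, which can be resolved by setting up swap space.

Set up the swap space:

sudo mkdir -p /var/cache/swap
sudo dd if=/dev/zero of=/var/cache/swap/swap0 bs=1M count=1024
sudo chmod 0600 /var/cache/swap/swap0
sudo mkswap /var/cache/swap/swap0
sudo swapon /var/cache/swap/swap0

After the build finishes, remove the swap space:

swapoff /var/cache/swap/swap0
rm -rf /var/cache/swap/swap0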
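To see whether a swap file has taken effect, or how much memory the build machine has in total, the MemTotal and SwapTotal fields of /proc/meminfo can be added up. The sketch below does that. The function name mem_plus_swap_mib is illustrative, and the article does not state how much memory the build needs, so no threshold is assumed here.

```shell
# Print MemTotal + SwapTotal in MiB, reading a meminfo-style file.
# Usage: mem_plus_swap_mib [FILE]   (default: /proc/meminfo)
mem_plus_swap_mib() {
    # /proc/meminfo reports these fields in kB; sum them and convert to MiB.
    awk '/^(MemTotal|SwapTotal):/ { kb += $2 }
         END { printf "%d\n", kb / 1024 }' "${1:-/proc/meminfo}"
}

# Example call (commented out so the snippet has no side effects):
# echo "RAM + swap: $(mem_plus_swap_mib) MiB"
```

With the 1024 MiB swap file from the commands above enabled, the reported figure should be about 1024 MiB higher than before swapon was run.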

7

Verification

The verification below uses one of the two TensorFlow builds as an example:

1. Install the built TensorFlow package:

sudo pip install /tmp/tensorflow_pkg/-cp27-none-linux_x86_64.whl

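2. After the installation succeeds, open the Python interactive shell, import tensorflow, and check the version and path:

Note: when testing, do not import tensorflow from inside the tensorflow source directory, because Python may pick up the package directory in the source tree instead of the installed one.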
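The check described above can also be scripted. The sketch below refuses to run when the current directory contains a tensorflow/ directory (the situation the note warns about) and otherwise prints the version, the package path, and whether a GPU is visible. The function names are illustrative, and the tf.test.is_gpu_available() call is an addition beyond the version and path check described in the article. It is available in TensorFlow 1.x.

```shell
# Succeed if DIR (default: current directory) contains a tensorflow/
# directory that could shadow the installed package on import.
shadows_tensorflow() {
    [ -d "${1:-.}/tensorflow" ]
}

# Print TensorFlow version, install path, and GPU visibility.
verify_tf() {
    if shadows_tensorflow "$PWD"; then
        echo "refusing to run: $PWD contains a tensorflow/ directory" >&2
        return 1
    fi
    python -c 'import tensorflow as tf
print(tf.__version__)
print(tf.__path__)
print(tf.test.is_gpu_available())'
}

# Example call (commented out; requires the wheel to be installed):
# cd /tmp && verify_tf
```

The printed path should point at the Python site-packages directory, not at the source checkout.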


Original article. Reposting is welcome. When reposting, please credit the WeChat official account Hadoop实操.
