[CANN Training Camp] CANN: AICPU Operator Development

Goal: create a LessEqual operator equivalent to torch.le
https://pytorch.org/docs/1.5.0/torch.html?highlight=torch%20le#torch.le
Download the no-install version of MindStudio
https://www.hiascend.com/software/mindstudio/download
Clone canndev

cd ~
git clone https://gitee.com/ascend_wuyongkang/canndev.git
cd canndev
./build.sh --aicpu -u -j100

The build fails with this error:
CMake 3.14 or higher is required. You are running version 3.10.2

sudo apt remove --purge cmake 
hash -r 
sudo snap install cmake --classic
 
cmake --version

export ASCEND_CUSTOM_PATH=$HOME/Ascend/ascend-toolkit/latest

Re-run ./build.sh --aicpu -u -j100
find …/…/…/ -name "*"

Operator development is still too hard for now.
So let's first follow this guide and do a single-operator call:
https://gitee.com/ascend/samples/wikis/%E8%AE%AD%E7%BB%83%E8%90%A5/CANN%E8%AE%AD%E7%BB%83%E8%90%A5–%E5%8D%95%E7%AE%97%E5%AD%90%E8%B0%83%E7%94%A8

Single-operator call

conv2d operator verification
https://gitee.com/ascend/samples/tree/master/cplusplus/level1_single_api/4_op_dev/2_verify_op/acl_execute_conv2d

cd samples/cplusplus/level1_single_api/4_op_dev/2_verify_op/acl_execute_conv2d
export DDK_PATH=$HOME/Ascend/ascend-toolkit/latest
export NPU_HOST_LIB=$DDK_PATH/acllib/lib64/stub
# Set the library path for python3.7.5
export LD_LIBRARY_PATH=/usr/local/python3.7.5/lib:$LD_LIBRARY_PATH
# If several python3 versions are installed, select python3.7.5
export PATH=/usr/local/python3.7.5/bin:$PATH
cd run/out/
atc --singleop=test_data/config/conv2d_tik_op.json --soc_version=Ascend310 --output=op_models

Then atc fails with:

EZ3003: No supported Ops kernel and engine are found for [Conv2DTik], optype [Conv2DTik].

I searched around and found nothing, so I set this aside for now and moved on to the next step.

I skipped a step earlier and there are no acl files. Could it be that aclLite was never built?

cd ${HOME}/samples/cplusplus/common/acllite
make 
make install

It does look like this was the first build; otherwise there would not be such a long stream of output.

In the end it still failed, and I was out of ideas.

Undeterred, let's try the high-resolution image inpainting sample below, which uses the matmul_27648.json operator:
https://gitee.com/ascend/samples/tree/master/python/level2_simple_inference/6_other/imageinpainting_hifill

First check the versions; ours meet the requirements.

Install the third-party dependencies:
https://gitee.com/ascend/samples/tree/master/python/environment

This time the operator converted successfully.

Unlike in the official documentation, the sample can be run directly with python3.7.5 here.

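The result is quite good and really is high-resolution, though I wonder how the algorithm recognized that the figure in the top-right corner is the main subject and kept it.
Back to the earlier problem: could the operator name have changed while the documentation was not updated?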

cd samples/cplusplus/level1_single_api/4_op_dev/2_verify_op/acl_execute_conv2d
cd run/out/

cp test_data/config/conv2d_tik_op.json  test_data/config/Conv2D.json
vi test_data/config/Conv2D.json

atc --singleop=test_data/config/Conv2D.json --soc_version=Ascend310 --output=op_models

Finally, the conversion succeeded.

But the earlier problem is still unsolved.

Reference: https://www.hiascend.com/document/detail/zh/mindstudio/50RC1/msug/msug_000215.html

Implement the LessEqual operator using the AICPU operator development flow, equivalent to torch.le

torch.le: https://pytorch.org/docs/stable/generated/torch.le.html
Operator prototype definition

/**
 * Copyright (C)  2020. Huawei Technologies Co., Ltd. All rights reserved.

 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the Apache License Version 2.0.You may not use this file except in compliance with the License.

 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * Apache License for more details at
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * @file add_dsl.h
 *
 * @brief
 *
 * @version 1.0
 *
 */

#ifndef GE_OPS_OP_PROTO_ADDDSL_H_
#define GE_OPS_OP_PROTO_ADDDSL_H_
#include "graph/operator_reg.h"
namespace ge {
    REG_OP(AddDsl)
    .INPUT(x1, TensorType({DT_FLOAT, DT_INT32, DT_INT64, DT_FLOAT16, DT_INT16,
    DT_INT8, DT_UINT8, DT_DOUBLE, DT_COMPLEX128,
    DT_COMPLEX64, DT_STRING}))
    .INPUT(x2,
    TensorType({DT_FLOAT, DT_INT32, DT_INT64, DT_FLOAT16, DT_INT16,
    DT_INT8, DT_UINT8, DT_DOUBLE, DT_COMPLEX128,
    DT_COMPLEX64, DT_STRING}))
    .OUTPUT(y,
    TensorType({DT_FLOAT, DT_INT32, DT_INT64, DT_FLOAT16, DT_INT16,
    DT_INT8, DT_UINT8, DT_DOUBLE, DT_COMPLEX128,
    DT_COMPLEX64, DT_STRING}))
    .OP_END_FACTORY_REG(AddDsl)
}

#endif //GE_OPS_OP_PROTO_ADDDSL_H_
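
Besides registering input and output names and dtypes as in the header above, a prototype for a comparison operator also has to describe what the output looks like. The sketch below is not taken from the canndev sources. It assumes PyTorch-style broadcasting, as documented for torch.le, and as a simplification it requires both inputs to share one dtype. DType, TensorDesc, and InferLessEqual are hypothetical names standing in for the framework's real types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the framework's dtype enum (assumption, not the real API).
enum class DType { kFloat, kInt32, kBool };

// Hypothetical minimal tensor descriptor: shape plus element type.
struct TensorDesc {
    std::vector<int64_t> shape;
    DType dtype;
};

// Infers the output descriptor of an element-wise comparison.
// Shapes are aligned from the trailing dimension; two dimensions are compatible
// when they are equal or when one of them is 1.
// Returns false when dtypes differ or the shapes cannot be broadcast.
bool InferLessEqual(const TensorDesc& x1, const TensorDesc& x2, TensorDesc* y) {
    if (y == nullptr || x1.dtype != x2.dtype) {
        return false;
    }
    const size_t r1 = x1.shape.size();
    const size_t r2 = x2.shape.size();
    const size_t rank = std::max(r1, r2);
    std::vector<int64_t> out(rank, 1);
    for (size_t i = 0; i < rank; ++i) {
        // Missing leading dimensions behave as size 1.
        const int64_t d1 = i < r1 ? x1.shape[r1 - 1 - i] : 1;
        const int64_t d2 = i < r2 ? x2.shape[r2 - 1 - i] : 1;
        if (d1 != d2 && d1 != 1 && d2 != 1) {
            return false;
        }
        out[rank - 1 - i] = (d1 == 1) ? d2 : d1;
    }
    y->shape = out;
    y->dtype = DType::kBool;  // a comparison always yields booleans
    return true;
}
```

Under these assumptions, shapes [2, 3] and [3] infer a boolean output of shape [2, 3], while [2, 3] and [2] are rejected.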

lessequal.cc implementation

/**
 * Copyright (C)  2020. Huawei Technologies Co., Ltd. All rights reserved.

 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the Apache License Version 2.0.You may not use this file except in compliance with the License.

 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * Apache License for more details at
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * @file add_dsl.h
 *
 * @brief
 *
 * @version 1.0
 *
 */

#ifndef GE_OPS_OP_PROTO_ADDDSL_H_
#define GE_OPS_OP_PROTO_ADDDSL_H_
#include "graph/operator_reg.h"
namespace ge {
    REG_OP(AddDsl)
    .INPUT(x1, TensorType({DT_FLOAT, DT_INT32, DT_INT64, DT_FLOAT16, DT_INT16,
    DT_INT8, DT_UINT8, DT_DOUBLE, DT_COMPLEX128,
    DT_COMPLEX64, DT_STRING}))
    .INPUT(x2,
    TensorType({DT_FLOAT, DT_INT32, DT_INT64, DT_FLOAT16, DT_INT16,
    DT_INT8, DT_UINT8, DT_DOUBLE, DT_COMPLEX128,
    DT_COMPLEX64, DT_STRING}))
    .OUTPUT(y,
    TensorType({DT_FLOAT, DT_INT32, DT_INT64, DT_FLOAT16, DT_INT16,
    DT_INT8, DT_UINT8, DT_DOUBLE, DT_COMPLEX128,
    DT_COMPLEX64, DT_STRING}))
    .OP_END_FACTORY_REG(AddDsl)
}

#endif //GE_OPS_OP_PROTO_ADDDSL_H_

Operator code implementation

cpukernel/impl/lessequal_kernel.h
cpukernel/impl/lessequal_kernel.cc
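
The contents of the two kernel files are not shown here, so the following is only a minimal sketch of the compute logic such a kernel might contain. It assumes equally sized inputs (no broadcasting) and uses only standard C++. DType, kOk, kParamInvalid, LessEqualCompute, and LessEqualKernel are hypothetical names and not part of the CANN API; a real kernel would read its buffers and dtypes from the context object the framework passes in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical dtype tag and status codes, for illustration only.
enum class DType { kFloat, kInt32, kInt64 };
constexpr uint32_t kOk = 0;
constexpr uint32_t kParamInvalid = 1;

// Element-wise y[i] = (x1[i] <= x2[i]) for two buffers of the same length.
template <typename T>
void LessEqualCompute(const T* x1, const T* x2, bool* y, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        y[i] = x1[i] <= x2[i];
    }
}

// Dispatches on the runtime dtype, because the kernel only sees untyped buffers.
uint32_t LessEqualKernel(DType dtype, const void* x1, const void* x2, bool* y, size_t n) {
    if (x1 == nullptr || x2 == nullptr || y == nullptr) {
        return kParamInvalid;
    }
    switch (dtype) {
        case DType::kFloat:
            LessEqualCompute(static_cast<const float*>(x1),
                             static_cast<const float*>(x2), y, n);
            return kOk;
        case DType::kInt32:
            LessEqualCompute(static_cast<const int32_t*>(x1),
                             static_cast<const int32_t*>(x2), y, n);
            return kOk;
        case DType::kInt64:
            LessEqualCompute(static_cast<const int64_t*>(x1),
                             static_cast<const int64_t*>(x2), y, n);
            return kOk;
    }
    return kParamInvalid;  // unknown dtype value
}
```

Keeping the typed loop in a template and the dtype switch in one place means supporting another dtype only takes one more case label.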

Operator information library definition

cpukernel/op_info_cfg/aicpu_kernel/reshape_cust.ini

Operator adaptation plugin implementation

framework/tf_plugin/tensorflow_lessequal_plugin.cc

UT tests

I hit a problem here: following the documentation, the right-click menu had no New Cases > AI CPU UT Case entry.

But even without an auto-generated template, we can write the tests ourselves:
testcases/ut/aicpu_test/lessequal/test_lessequal_impl.cc
testcases/ut/aicpu_test/lessequal/test_lessequal_proto.cc
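
As a rough idea of what hand-written cases for the implementation test could cover, here is a table-driven sketch in plain C++ rather than in the project's test framework. LessEqualRef, Case, and RunLessEqualCases are hypothetical names; LessEqualRef is a stand-in reference so the example is self-contained, and the real test would call the kernel under test instead. The cases target the boundaries of <=: equal values, signed zero, and NaN, for which any comparison is false.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Stand-in reference implementation (assumption: inputs have equal length).
std::vector<bool> LessEqualRef(const std::vector<float>& x1, const std::vector<float>& x2) {
    assert(x1.size() == x2.size());
    std::vector<bool> y(x1.size());
    for (size_t i = 0; i < x1.size(); ++i) {
        y[i] = x1[i] <= x2[i];
    }
    return y;
}

// One test case: two inputs and the expected boolean output.
struct Case {
    std::vector<float> x1;
    std::vector<float> x2;
    std::vector<bool> expected;
};

// Runs all cases and returns the number of failures.
int RunLessEqualCases() {
    const float kNaN = std::numeric_limits<float>::quiet_NaN();
    const std::vector<Case> cases = {
        {{1.0f, 2.0f, 3.0f}, {2.0f, 2.0f, 2.0f}, {true, true, false}},  // less, equal, greater
        {{-1.0f, 0.0f}, {-1.0f, -0.0f}, {true, true}},                  // equality and signed zero
        {{kNaN, 1.0f}, {1.0f, kNaN}, {false, false}},                   // NaN never compares <=
    };
    int failures = 0;
    for (const Case& c : cases) {
        if (LessEqualRef(c.x1, c.x2) != c.expected) {
            ++failures;
        }
    }
    return failures;
}
```

The same table can be reused for other dtypes by templating Case on the element type.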

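Building the operator project

After connecting to the remote cloud server successfully, build the project.
Older versions required adding code after line 130; with the newer version used here, that is no longer needed.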

Open the operator project folder on its own, then build.

My Ascend toolkit is installed at /home/HwHiAiUser/Ascend/ascend-toolkit, so I set the environment variables as follows:

ASCEND_OPP_PATH=/home/HwHiAiUser/Ascend/ascend-toolkit/latest/opp;
TOOLCHAIN_DIR=/home/HwHiAiUser/Ascend/ascend-toolkit/latest/toolkit/toolchain/hcc; 
ASCEND_TENSOR_COMPILER_INCLUDE=/home/HwHiAiUser/Ascend/ascend-toolkit/latest/include;
ASCEND_AICPU_PATH=/home/HwHiAiUser/Ascend/ascend-toolkit/latest

It seems the project was under a path containing Chinese characters, which does not work, so I moved it to the root of drive I.

With that, the operator functionality is largely implemented.
