OpenACC constant parameters

Question

I would like to know the proper way to handle constants in OpenACC kernels.

For example, in the following code

module vecaddmod

  implicit none

  integer,parameter :: n = 100000
  !$acc declare create(n)

contains
  subroutine vecaddgpu(r,a,b)
    real,dimension(:) :: r,a,b
    integer :: i
    !$acc update self(n)
    !$acc data present(n)
    !$acc kernels loop copyin(a(1:n),b(1:n)) copyout(r(1:n))
    do i = 1,n
       r(i) = a(i) + b(i)
    enddo
    !$acc end data
  end subroutine vecaddgpu
end module vecaddmod

program main
  use vecaddmod
  implicit none
  integer :: i,errs,argcount
  real,dimension(:),allocatable :: a,b,r,e
  character*10 :: arg1

  allocate( a(n),b(n),r(n),e(n) )
  do i = 1,n
     a(i) = i
     b(i) = 1000*i
  enddo
  ! compute on the GPU
  call vecaddgpu( r,a,b )
  ! compute on the host to compare
  do i = 1,n
     e(i) = a(i) + b(i)
  enddo
  ! compare results
  errs = 0
  do i = 1,n
     if( r(i) /= e(i) )then
        errs = errs + 1
     endif
  enddo
  print *, errs, ' errors found'
  if( errs /= 0 ) call exit(errs)
end program main

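n is declared as a constant on the CPU in the module and is used as the loop bound. nvfortran warns me: Constant or Parameter used in data clause. Is the example above the right way to handle this? Can I take advantage of constant memory on the GPU, so that n does not have to be copied from the CPU to the GPU on every kernel launch?

Thanks.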

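Answer

Because n is a Fortran parameter, the compiler replaces it with its literal value at compile time, so there is no need to put it in a data region. The declare create, update self and data present directives on n can simply be removed.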

module vecaddmod

  implicit none

  integer,parameter :: n = 100000

contains
  subroutine vecaddgpu(r,a,b)
    real,dimension(:) :: r,a,b
    integer :: i
    !$acc kernels loop copyin(a(1:n),b(1:n)) copyout(r(1:n))
    do i = 1,n
       r(i) = a(i) + b(i)
    enddo
  end subroutine vecaddgpu
end module vecaddmod

...

% nvfortran -acc -Minfo=accel test.f90
vecaddgpu:
     11,Generating copyin(a(:100000)) << "n" is replaced with 100000
         Generating copyout(r(:100000)) 
         Generating copyin(b(:100000)) 
     12,Loop is parallelizable
         Generating Tesla code
         12,!$acc loop gang,vector(128) ! blockidx%x threadidx%x
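
For contrast, here is a minimal sketch of a case where a declare directive is appropriate: a module variable whose value is only known at run time and which is read inside a device routine. The names scalemod, scale, set_scale, scaled_sum and vecscalegpu are hypothetical and are not taken from the question or the answer. The sketch assumes that the device copy created by declare create has to be refreshed with update device after the host changes the value.

```fortran
! Sketch only: all names below are hypothetical illustrations.
module scalemod
  implicit none
  real :: scale = 1.0            ! run-time value, NOT a parameter
  !$acc declare create(scale)    ! keep a device copy of the module variable
contains
  subroutine set_scale(s)
    real, intent(in) :: s
    scale = s                    ! change the host copy
    !$acc update device(scale)   ! then refresh the device copy
  end subroutine set_scale

  real function scaled_sum(x, y)
    !$acc routine seq
    real, intent(in) :: x, y
    scaled_sum = scale*(x + y)   ! reads the device copy when run on the GPU
  end function scaled_sum

  subroutine vecscalegpu(r, a, b)
    real, dimension(:) :: r, a, b
    integer :: i, m
    m = size(r)
    !$acc kernels loop copyin(a(1:m),b(1:m)) copyout(r(1:m))
    do i = 1, m
       r(i) = scaled_sum(a(i), b(i))
    enddo
  end subroutine vecscalegpu
end module scalemod

program main
  use scalemod
  implicit none
  integer, parameter :: m = 1000
  real :: a(m), b(m), r(m)
  integer :: i
  do i = 1, m
     a(i) = i
     b(i) = 1000*i
  enddo
  call set_scale(2.0)
  call vecscalegpu(r, a, b)
  ! compare against the same expression evaluated on the host
  if (any(r /= 2.0*(a + b))) error stop 'mismatch'
  print *, 'results match'
end program main
```

A parameter such as n never needs this treatment, because no storage is involved: its value is already baked into the generated code. Without -acc the directives are treated as comments, so the sketch also builds and runs on the host alone.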
