linux-drm.ko missing for CUDA 6.5 / Ubuntu 14.04 / AWS EC2 GPU instance g2.2xlarge

I am trying to install CUDA 6.5 on Ubuntu 14.04.1 LTS on an AWS EC2 g2.2xlarge instance, using either the .deb file or the .run file:
sudo ./cuda_6.5.14_linux_64.run --kernel-source-path=/usr/src/linux-headers-3.13.0-34-generic

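Either way, I always get the same error about a missing drm.ko, even though the kernel module appears to compile successfully. The log is below. (I rebooted before installing.)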

Kernel module compilation complete.

Unable to determine if Secure Boot is enabled: No such file or directory

Kernel module load error: No such file or directory

Kernel messages:

[ 3.595939] type=1400 audit(1408809902.911:5): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=492 comm="apparmor_parser"

[ 3.595942] type=1400 audit(1408809902.911:6): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/lib/connman/scripts/dhclient-script" pid=492 comm="apparmor_parser"

[ 3.596140] type=1400 audit(1408809902.915:7): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/lib/connman/scripts/dhclient-script" pid=492 comm="apparmor_parser"

[ 4.696067] init: failsafe main process (833) killed by TERM signal

[ 4.793261] type=1400 audit(1408809904.107:8): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/sbin/dhclient" pid=952 comm="apparmor_parser"

[ 4.793267] type=1400 audit(1408809904.107:9): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=952 comm="apparmor_parser"

[ 5.036249] init: plymouth-upstart-bridge main process ended, respawning

[ 6.589233] init: udev-fallback-graphics main process (1203) terminated with status 1

[ 136.367014] nvidia: module license 'NVIDIA' taints kernel.

[ 136.367019] Disabling lock debugging due to kernel taint

[ 136.370281] nvidia: module verification failed: signature and/or required key missing - tainting kernel

[ 136.370383] nvidia: Unknown symbol drm_open (err 0)

[ 136.370393] nvidia: Unknown symbol drm_poll (err 0)

[ 136.370404] nvidia: Unknown symbol drm_pci_init (err 0)

[ 136.370449] nvidia: Unknown symbol drm_gem_prime_handle_to_fd (err 0)

[ 136.370462] nvidia: Unknown symbol drm_gem_private_object_init (err 0)

[ 136.370474] nvidia: Unknown symbol drm_gem_mmap (err 0)

[ 136.370478] nvidia: Unknown symbol drm_ioctl (err 0)

[ 136.370486] nvidia: Unknown symbol drm_gem_object_free (err 0)

[ 136.370496] nvidia: Unknown symbol drm_read (err 0)

[ 136.370509] nvidia: Unknown symbol drm_gem_handle_create (err 0)

[ 136.370515] nvidia: Unknown symbol drm_prime_pages_to_sg (err 0)

[ 136.370550] nvidia: Unknown symbol drm_pci_exit (err 0)

[ 136.370563] nvidia: Unknown symbol drm_release (err 0)

[ 136.370565] nvidia: Unknown symbol drm_gem_prime_export (err 0)

The driver installation is unable to locate the kernel source. Please
make sure that the kernel source packages are installed and set up
correctly.
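
Every unresolved symbol in the log above starts with drm_, which points at a missing drm module rather than a compilation problem. As a rough sketch, the symbol names can be pulled out of kernel log text like this. The function name unknown_symbols is made up for illustration, and the pattern assumes the "Unknown symbol NAME (err N)" wording shown in the log.

```shell
#!/bin/sh
# Hypothetical helper (not part of any NVIDIA or Ubuntu tool):
# reads kernel log text on stdin and prints each unresolved
# symbol name once, sorted.
unknown_symbols() {
    sed -n 's/.*Unknown symbol \([A-Za-z0-9_]*\).*/\1/p' | sort -u
}

# Example with two lines copied from the log above.
printf '%s\n' \
    '[ 136.370383] nvidia: Unknown symbol drm_open (err 0)' \
    '[ 136.370393] nvidia: Unknown symbol drm_poll (err 0)' \
    | unknown_symbols
```

On a live system the same function could be fed from dmesg, for example: dmesg | unknown_symbols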

Solution

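The error is caused by the missing drm module, which the NVIDIA driver requires.
By default, the Ubuntu AMI installs a minimal generic Linux kernel (linux-image-virtual), which does not include the drm module.
To fix it, install the full generic kernel, linux-image-generic.
Installing linux-image-extra-virtual would also work, as it is just a transitional package for linux-image-generic. I recommend installing linux-generic so that both the headers and the kernel image are included.
To summarize:
sudo apt-get install linux-generic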
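
Before re-running the CUDA installer, it may help to confirm that a drm module file is actually present for the running kernel. The following is a minimal sketch, not an official check. The function name has_drm_module is hypothetical, and it assumes drm is shipped as a loadable module under /lib/modules (on kernels where drm is built in, no drm.ko file would exist and this check would report a false negative).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a drm module file exists
# somewhere under the given kernel modules directory.
has_drm_module() {
    moddir="$1"
    # Matches drm.ko and compressed variants such as drm.ko.xz.
    [ -n "$(find "$moddir" -name 'drm.ko*' 2>/dev/null | head -n 1)" ]
}

# Check the modules directory of the currently running kernel.
moddir="/lib/modules/$(uname -r)"
if has_drm_module "$moddir"; then
    echo "drm module found under $moddir"
else
    echo "drm module not found under $moddir; try: sudo apt-get install linux-generic"
fi
```

Because the check uses the running kernel version, a reboot into the newly installed generic kernel is presumably needed before it reports the module as found.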

A similar issue has also been reported on the AWS forum.
