Problems installing Nvidia drivers for CentOS 7

miney
Posts: 10
Joined: 2019/04/13 11:55:06

Post by miney » 2019/04/13 12:32:31

Hello,

I recently switched to CentOS 7 and am having problems installing and using the appropriate Nvidia drivers. I am using a Lenovo ThinkPad T470p with integrated Intel graphics and an Nvidia GeForce 940MX (Nvidia Optimus).

Current kernel:

Code:

uname -r
3.10.0-957.10.1.el7.x86_64

lspci -v entries for both cards:

Code:

00:02.0 VGA compatible controller: Intel Corporation Device 591b (rev 04) (prog-if 00 [VGA controller])
	Subsystem: Lenovo Device 505e
	Flags: bus master, fast devsel, latency 0, IRQ 138
	Memory at f0000000 (64-bit, non-prefetchable) [size=16M]
	Memory at e0000000 (64-bit, prefetchable) [size=256M]
	I/O ports at e000 [size=64]
	Expansion ROM at <unassigned> [disabled]
	Capabilities: [40] Vendor Specific Information: Len=0c <?>
	Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
	Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
	Capabilities: [d0] Power Management version 2
	Capabilities: [100] Process Address Space ID (PASID)
	Capabilities: [200] Address Translation Service (ATS)
	Capabilities: [300] Page Request Interface (PRI)
	Kernel driver in use: i915
	Kernel modules: i915
	
02:00.0 3D controller: NVIDIA Corporation GM108M [GeForce 940MX] (rev a2)
	Subsystem: Lenovo Device 505e
	Flags: bus master, fast devsel, latency 0, IRQ 142
	Memory at f1000000 (32-bit, non-prefetchable) [size=16M]
	Memory at c0000000 (64-bit, prefetchable) [size=256M]
	Memory at d0000000 (64-bit, prefetchable) [size=32M]
	I/O ports at d000 [size=128]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Endpoint, MSI 00
	Capabilities: [100] Virtual Channel
	Capabilities: [250] Latency Tolerance Reporting
	Capabilities: [258] L1 PM Substates
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900] #19
	Kernel driver in use: nvidia
	Kernel modules: nouveau, nvidia_drm, nvidia

My first attempt was to install the Nvidia driver with the classic .run installer from the Nvidia website, version 418.56. The installation completed successfully, except that the installer always reported that the existing libglvnd installation was incomplete and asked whether to overwrite it with a new one. I always answered yes, but when I re-ran the installer after a "successful" installation, the same prompt appeared again. Unfortunately, no process was using the Nvidia card. When I ran

Code:

nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce 940MX       Off  | 00000000:02:00.0 Off |                  N/A |
| N/A   37C    P8    N/A /  N/A |      0MiB /  2004MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

it didn't show any running processes using the card. Since I had blacklisted the nouveau driver, I assumed that the processes were using the on-board Intel card with the i915 driver. When I also blacklisted that driver, the resolution changed, but nvidia-smi still showed no change. I did not know how to determine which graphics driver was being used at that point.
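
One way to see which kernel driver is bound to each card is to read the "Kernel driver in use" lines from lspci output like the listing above. The following is only a minimal sketch: the function name `drivers_in_use` is made up for illustration, and the sample text is a shortened copy of the lspci output in this post. Note that it reports the bound kernel driver only, not which GPU actually renders a given application.

```python
import re

def drivers_in_use(lspci_text):
    """Map each PCI slot (e.g. '02:00.0') to its 'Kernel driver in use' value."""
    result = {}
    current = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # A non-indented line starts a new device entry.
            current = line.split(" ", 1)[0]
            result[current] = None
        elif current is not None:
            m = re.match(r"\s*Kernel driver in use:\s*(\S+)", line)
            if m:
                result[current] = m.group(1)
    return result

# Shortened copy of the lspci -v output quoted in the post (hypothetical sample).
sample = """00:02.0 VGA compatible controller: Intel Corporation Device 591b (rev 04)
\tKernel driver in use: i915
\tKernel modules: i915
02:00.0 3D controller: NVIDIA Corporation GM108M [GeForce 940MX] (rev a2)
\tKernel driver in use: nvidia
\tKernel modules: nouveau, nvidia_drm, nvidia
"""

print(drivers_in_use(sample))
```

On a live system the text would come from running lspci -v (or lspci -k) and passing its output to the function.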

After a while I found advice about installing the Nvidia driver from the ELRepo repository. Before I could do so, I had to remove the ocl-icd package because of conflicts (I read that dependent packages such as VLC can be reinstalled via nux). After that the installation completed successfully. Unfortunately, after a reboot the system gets stuck while starting X. When I switch to a console, nvidia-smi shows processes running on the card (X and gnome-shell), so I think I might be on the right track, but I cannot get X to start. When I remove the file /etc/X11/xorg.conf, startup is normal, but the Nvidia card is still not used. The file contains the following:

Code:

Section "Device"
    Identifier  "Videocard0"
    Driver      "nvidia"
EndSection
After looking at the manpage of xorg.conf I added the following lines:

Code:

    BusID       "PCI:2:0:0"
    Screen      0

But this did not change anything. I also looked through the /var/log/Xorg.x.log files but could not find anything suspicious. If it helps, I can provide log files for both cases, with and without the xorg.conf file.

I also read that with more than one card you have to use Bumblebee, Prime or nvidia-xrun to manage which card is used. If I understood correctly, nvidia-xrun and Prime are not available for CentOS, and Bumblebee is out of date and should not be used? I also checked whether I can disable the on-board card in the BIOS, but that is not possible.

I would be grateful for any advice and help.

Kind regards, miney.

Post by miney » 2019/04/17 14:47:50

Since nobody could help, I tried installing Bumblebee and its related packages myself using these instructions. I am wondering about three things:
  • If I run "optirun glxgears" I get the error described below; if I run "primusrun glxgears" the program starts, but on the integrated Intel card.
  • The Nvidia directories specified for LibraryPath and XorgModulePath are empty or non-existent, although the Nvidia driver packages are installed. I therefore added several directories to satisfy the respective error messages. This is my bumblebee.conf so far:

    Code:

    [bumblebeed]
    VirtualDisplay=:8
    KeepUnusedXServer=false
    ServerGroup=bumblebee
    TurnCardOffAtExit=false
    NoEcoModeOverride=false
    Driver=nvidia
    XorgConfDir=/etc/bumblebee/xorg.conf.d
    
    [optirun]
    Bridge=primus
    VGLTransport=proxy
    PrimusLibraryPath=/usr/lib/primus:/usr/lib64/primus
    AllowFallbackToIGC=false
    
    [driver-nvidia]
    KernelDriver=nvidia
    Module=nvidia
    PMMethod=bbswitch
    LibraryPath=/usr/lib64/nvidia:/usr/lib64/vdpau:/usr/lib/nvidia:/usr/lib/vdpau:/usr/lib:/usr/lib64:/usr/lib64/primus
    XorgModulePath=/usr/lib64/xorg/modules/extensions/nvidia,/usr/lib64/xorg/modules/drivers,/usr/lib64/xorg/modules,/usr/lib,/usr/lib64
    XorgConfFile=/etc/bumblebee/xorg.conf.nvidia
    
    [driver-nouveau]
    KernelDriver=nouveau
    PMMethod=auto
    XorgConfFile=/etc/bumblebee/xorg.conf.nouveau
    
  • The error I get with "optirun -b primus glxgears" or "optirun glxgears" is:

Code:

[ERROR]Cannot access secondary GPU - error: [XORG] (EE) kbd: <default keyboard>: failed to set us as foreground pgrp (Inappropriate ioctl for device)

I had a similar error about mouse drivers and solved it by installing xorg-x11-drv-mouse. I cannot solve the current problem in the same way: xorg-x11-drv-kbd does not exist, and xorg-x11-drv-keyboard is already installed.
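
To see which of the directories listed under LibraryPath and XorgModulePath actually exist and contain files, a small check along the following lines may help. This is a sketch under assumptions: the function name `check_dirs` is hypothetical, the separators (colon for LibraryPath, comma for XorgModulePath) are taken from the bumblebee.conf posted above, and the sample text is a shortened stand-in for the real file.

```python
import configparser
import os

def check_dirs(conf_text, section="driver-nvidia"):
    """Return {key: [(directory, status), ...]} with status 'missing', 'empty' or 'ok'."""
    # Only '=' is treated as a delimiter so values such as ':8' are parsed intact.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep key case as written in the file
    parser.read_string(conf_text)
    report = {}
    for key, sep in (("LibraryPath", ":"), ("XorgModulePath", ",")):
        entries = []
        for directory in filter(None, parser.get(section, key, fallback="").split(sep)):
            if not os.path.isdir(directory):
                status = "missing"
            elif not os.listdir(directory):
                status = "empty"
            else:
                status = "ok"
            entries.append((directory, status))
        report[key] = entries
    return report

# Shortened stand-in for /etc/bumblebee/bumblebee.conf (hypothetical sample).
sample = """
[driver-nvidia]
LibraryPath=/usr/lib64/nvidia:/usr/lib64/vdpau
XorgModulePath=/usr/lib64/xorg/modules/extensions/nvidia,/usr/lib64/xorg/modules
"""

for key, entries in check_dirs(sample).items():
    for directory, status in entries:
        print(f"{key}: {directory}: {status}")
```

The result depends on the machine it runs on. To check the real configuration, read the text of /etc/bumblebee/bumblebee.conf and pass it to the function instead of the sample.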

Any help would be welcome...

Post by miney » 2019/04/17 20:21:21

Strangely, the keyboard error above only occurs when the notebook is plugged into the docking station and uses an external keyboard, mouse, screen etc. When I run the notebook on its own (without the docking station) I get further. Applications that I run with "optirun -b primus glxgears" start, but the window that opens stays black. The same happens with glxspheres64 etc. The processes are listed by nvidia-smi. I get errors like this:

Code:

X Error of failed request:  BadMatch (invalid parameter attributes)
  Major opcode of failed request:  152 (GLX)
  Minor opcode of failed request:  11 (X_GLXSwapBuffers)
  Serial number of failed request:  37
  Current serial number in output stream:  38

My current bumblebee.conf:

Code:

[bumblebeed]
VirtualDisplay=:8
KeepUnusedXServer=false
ServerGroup=bumblebee
TurnCardOffAtExit=false
NoEcoModeOverride=false
Driver=nvidia
XorgConfDir=/etc/bumblebee/xorg.conf.d

[optirun]
Bridge=primus
VGLTransport=proxy
PrimusLibraryPath=/usr/lib/primus:/usr/lib64/primus
AllowFallbackToIGC=false

[driver-nvidia]
KernelDriver=nvidia
Module=nvidia
PMMethod=bbswitch
LibraryPath=/usr/lib64/nvidia:/usr/lib64/vdpau:/usr/lib/nvidia:/usr/lib/vdpau:/usr/lib:/usr/lib64:/usr/lib64/primus
XorgModulePath=/usr/lib64/xorg/modules/extensions/nvidia,/usr/lib64/xorg/modules/drivers,/usr/lib64/xorg/modules,/usr/lib,/usr/lib64
XorgConfFile=/etc/bumblebee/xorg.conf.nvidia

[driver-nouveau]
KernelDriver=nouveau
PMMethod=auto
XorgConfFile=/etc/bumblebee/xorg.conf.nouveau

My current xorg.conf.nvidia:

Code:

Section "ServerLayout"
    Identifier      "Layout0"
    Screen          0  "Screen0"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
    Option          "AutoAddDevices" "false"
    Option          "AutoAddGPU" "false"
EndSection

Section "Files"
    FontPath        "/usr/share/fonts/default/Type1"
EndSection

Section "InputDevice"
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/input/mice"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier  "DiscreteNvidia"
    Driver      "nvidia"
    VendorName  "NVIDIA Corporation"
    BusID       "PCI:2:0:0"
    Option "ProbeAllGpus" "false"
    Option "NoLogo" "true"
    Option "UseEDID" "false"
    Option "UseDisplayDevice" "none"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection
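
One detail a reader may notice in the file above: the Screen section refers to Device "Device0", while the only Device section is identified as "DiscreteNvidia". Whether this contributes to the problems described is not established in this thread. The sketch below shows one way to look for such dangling references; the function name `undefined_references` is hypothetical, the sample is a shortened copy of the posted file, and names are compared literally, which is a simplification of how the X server matches identifiers.

```python
import re

def undefined_references(xorg_conf):
    """Return (kind, name) pairs referenced by Screen/ServerLayout sections
    for which no section of that kind declares a matching Identifier."""
    defined = set()    # (section kind, identifier)
    referenced = []    # (section kind, identifier)
    section = None
    for raw in xorg_conf.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        m = re.match(r'(?i)Section\s+"([^"]+)"', line)
        if m:
            section = m.group(1).lower()
            continue
        if re.match(r"(?i)EndSection\b", line):
            section = None
            continue
        m = re.match(r"(\w+)\s+(.*)", line)
        if not m or section is None:
            continue
        keyword = m.group(1).lower()
        names = re.findall(r'"([^"]*)"', m.group(2))
        if not names:
            continue
        if keyword == "identifier":
            defined.add((section, names[0]))
        elif section == "screen" and keyword in ("device", "monitor"):
            referenced.append((keyword, names[0]))
        elif section == "serverlayout" and keyword in ("screen", "inputdevice"):
            referenced.append((keyword, names[0]))
    return [ref for ref in referenced if ref not in defined]

# Shortened copy of the xorg.conf.nvidia posted above (hypothetical sample).
sample = '''
Section "ServerLayout"
    Identifier "Layout0"
    Screen 0 "Screen0"
EndSection
Section "Device"
    Identifier "DiscreteNvidia"
    Driver "nvidia"
    BusID "PCI:2:0:0"
EndSection
Section "Monitor"
    Identifier "Monitor0"
EndSection
Section "Screen"
    Identifier "Screen0"
    Device "Device0"
    Monitor "Monitor0"
EndSection
'''

print(undefined_references(sample))  # [('device', 'Device0')]
```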

hachi
Posts: 13
Joined: 2019/04/14 21:52:16

Post by hachi » 2019/04/17 21:00:57

I recently installed CentOS for the first time, on my oldest ThinkPad still in use, a T430s, to see whether Nvidia would work, and it does.

Just follow this guide:
https://elrepo.org/tiki/bumblebee

The only thing I had to do was add the device's BusID to the Bumblebee Xorg config,
see viewtopic.php?f=49&t=70267#p295334
but even this is already documented in the file that gets created for you.

So I would suggest getting rid of what you downloaded and changed, resetting to a clean state, and following the howto from ELRepo.

Post by miney » 2019/04/18 04:52:28

Hi hachi,

Thank you for your help. Unfortunately, I tried both sets of instructions you linked and it is still not working.

Post by miney » 2019/04/18 05:56:24

I want to do a fresh, clean install of the whole thing, so I have uninstalled the following packages so far:

Code:

elrepo-release
nvidia-detect
yum-plugin-nvidia
bumblebee-selinux
kmod-nvidia
bumblebee
kmod-bbswitch
primus

Which other packages would you recommend uninstalling and reinstalling? What about all the GLX, Mesa, OpenGL and libglvnd packages? I am no longer sure whether I installed them correctly in the first place, and I don't know what can and should be removed and reinstalled.

Post by hachi » 2019/04/18 16:50:25

I do not understand what you mean by 'does not work'. A black screen? An error message from optirun? Something else?

Your config looks quite complex. My xorg.conf.nvidia is the simple default, with only the BusID added by me:

Code:

Section "ServerLayout"
    Identifier  "Layout0"
    Option      "AutoAddDevices" "false"
    Option      "AutoAddGPU" "false"
EndSection

Section "Device"
    Identifier  "DiscreteNvidia"
    Driver      "nvidia"
    VendorName  "NVIDIA Corporation"
    BusID "PCI:01:00:0"
    Option "ProbeAllGpus" "false"
    Option "NoLogo" "true"
    Option "UseEDID" "false"
    Option "UseDisplayDevice" "none"
EndSection

Everything else is as in the ELRepo documentation. However, I had not done anything with the graphics card before following the ELRepo guide, whereas you installed the Nvidia driver and wrote configs by hand. Could there be something left over from that?

Post by miney » 2019/04/22 21:47:59

By "does not work" I mean that I get the error described above:

Code:

[ERROR]Cannot access secondary GPU - error: [XORG] (EE) kbd: <default keyboard>: failed to set us as foreground pgrp (Inappropriate ioctl for device)

I created the xorg.conf.nvidia I posted earlier by combining the default one with the output of nvidia-xconfig. But even with the default config (the one you posted) I get the same error.

Post by hachi » 2019/04/24 15:53:39

This is exactly the error I had, see the thread I linked to.

I had to add

Code:

BusID "PCI:01:00:0"
to my config file

You need to adapt this to the BusID of your device,
which might be "PCI:2:0:0" as you have written above, but please double check.
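
For double checking, it may help to recall that lspci prints bus, device and function in hexadecimal, while the BusID in the Xorg config is written in decimal. The two only differ for values above 9, so 02:00.0 maps to "PCI:2:0:0" either way. A minimal conversion sketch, with the function name `lspci_to_busid` made up for illustration:

```python
import re

def lspci_to_busid(slot):
    """Convert an lspci slot such as '02:00.0' (hex) to an Xorg BusID such as 'PCI:2:0:0' (decimal)."""
    # An optional four-digit PCI domain prefix (e.g. '0000:') is accepted and ignored.
    m = re.fullmatch(r"(?:[0-9a-fA-F]{4}:)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])", slot)
    if m is None:
        raise ValueError(f"unrecognised PCI slot: {slot!r}")
    bus, device, function = (int(part, 16) for part in m.groups())
    return f"PCI:{bus}:{device}:{function}"

print(lspci_to_busid("02:00.0"))  # PCI:2:0:0
print(lspci_to_busid("0a:00.0"))  # PCI:10:0:0
```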

Post by miney » 2019/04/25 17:24:39

Thank you for the answer.

First of all my lspci gives me this:

Code:

02:00.0 3D controller: NVIDIA Corporation GM108M [GeForce 940MX] (rev a2)

Therefore my xorg.conf.nvidia looks like this:

Code:

Section "ServerLayout"
    Identifier      "Layout0"
    Option          "AutoAddDevices" "false"
    Option          "AutoAddGPU" "false"
EndSection

Section "Device"
    Identifier      "DiscreteNvidia"
    Driver          "nvidia"
    VendorName      "NVIDIA Corporation"
    BusID           "PCI:2:0:0"
    Option          "ProbeAllGpus" "false"
    Option          "NoLogo" "true"
    Option          "UseEDID" "false"
    Option          "UseDisplayDevice" "none"
EndSection

However, this error (failed to set us as foreground pgrp) does not appear every time. It mainly shows up when the notebook is plugged into the docking station, has been in sleep mode, has had its lid closed, or has had the power cable unplugged or plugged in. I will try to solve that problem once I can run optirun properly. When the error does not appear, sudo optirun -vv glxinfo | grep -i nvidia gives me this:

Code:

[113425.977302] [DEBUG]Reading file: /etc/bumblebee/bumblebee.conf
[113425.977483] [INFO]Configured driver: nvidia
[113425.978414] [DEBUG]optirun version 3.2.1 starting...
[113425.978425] [DEBUG]Active configuration:
[113425.978432] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf
[113425.978439] [DEBUG] X display: :8
[113425.978446] [DEBUG] LD_LIBRARY_PATH: /usr/lib64:/usr/lib:/usr/lib64/vdpau
[113425.978453] [DEBUG] Socket path: /var/run/bumblebee.socket
[113425.978460] [DEBUG] Accel/display bridge: primus
[113425.978467] [DEBUG] VGL Compression: proxy
[113425.978473] [DEBUG] VGLrun extra options: 
[113425.978483] [DEBUG] Primus LD Path: /usr/lib64/primus
[113428.079577] [INFO]Response: Yes. X is active.

[113428.079591] [INFO]Running application using primus.
[113428.079707] [DEBUG]Process glxinfo started, PID 19877.
server glx vendor string: NVIDIA Corporation
OpenGL vendor string: NVIDIA Corporation
OpenGL core profile version string: 4.6.0 NVIDIA 418.56
OpenGL core profile shading language version string: 4.60 NVIDIA
OpenGL version string: 4.6.0 NVIDIA 418.56
OpenGL shading language version string: 4.60 NVIDIA
[113428.186245] [DEBUG]SIGCHILD received, but wait failed with No child processes
[113428.186266] [DEBUG]Socket closed.
[113428.186291] [DEBUG]Killing all remaining processes.

This, and the fact that I can start sudo optirun -vv nvidia-settings -c :8.0, seems quite promising. But when I start glxgears or glxspheres64, I only get the program's window filled with black and the following output, and nothing else happens until I close the program:

Code:

[113612.390634] [DEBUG]Reading file: /etc/bumblebee/bumblebee.conf
[113612.390813] [INFO]Configured driver: nvidia
[113612.391676] [DEBUG]optirun version 3.2.1 starting...
[113612.391687] [DEBUG]Active configuration:
[113612.391694] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf
[113612.391700] [DEBUG] X display: :8
[113612.391707] [DEBUG] LD_LIBRARY_PATH: /usr/lib64:/usr/lib:/usr/lib64/vdpau
[113612.391714] [DEBUG] Socket path: /var/run/bumblebee.socket
[113612.391721] [DEBUG] Accel/display bridge: primus
[113612.391727] [DEBUG] VGL Compression: proxy
[113612.391737] [DEBUG] VGLrun extra options: 
[113612.391748] [DEBUG] Primus LD Path: /usr/lib64/primus
[113614.472881] [INFO]Response: Yes. X is active.

[113614.472899] [INFO]Running application using primus.
[113614.472997] [DEBUG]Process glxspheres64 started, PID 28085.
Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
Visual ID of window: 0x141
Context is Direct
OpenGL Renderer: GeForce 940MX/PCIe/SSE2
X Error of failed request:  BadMatch (invalid parameter attributes)
  Major opcode of failed request:  152 (GLX)
  Minor opcode of failed request:  11 (X_GLXSwapBuffers)
  Serial number of failed request:  37
  Current serial number in output stream:  38
primus: warning: timeout waiting for display worker
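
When comparing runs, it can be handy to pull the "Active configuration" block out of the optirun -vv output, for example to see which LD_LIBRARY_PATH and bridge were actually in effect. The following is a sketch only: the function name `active_configuration` is hypothetical, and the line format (timestamp, level, entries indented by one space) is inferred solely from the output quoted above.

```python
import re

# Matches lines such as "[113612.391700] [DEBUG] X display: :8".
LINE = re.compile(r"^\[\s*\d+\.\d+\]\s*\[(?P<level>[A-Z]+)\](?P<msg>.*)$")

def active_configuration(log_text):
    """Collect the 'key: value' pairs printed after 'Active configuration:'."""
    config = {}
    in_block = False
    for line in log_text.splitlines():
        m = LINE.match(line)
        if not m:
            continue
        msg = m.group("msg")
        if msg.strip() == "Active configuration:":
            in_block = True
            continue
        if in_block:
            # Entries of the block start with a space in the quoted output.
            if m.group("level") == "DEBUG" and msg.startswith(" ") and ":" in msg:
                key, value = msg.strip().split(":", 1)
                config[key] = value.strip()
            else:
                in_block = False
    return config

# A few lines copied from the output above (hypothetical sample input).
sample = """[113612.391687] [DEBUG]Active configuration:
[113612.391700] [DEBUG] X display: :8
[113612.391707] [DEBUG] LD_LIBRARY_PATH: /usr/lib64:/usr/lib:/usr/lib64/vdpau
[113612.391721] [DEBUG] Accel/display bridge: primus
[113614.472881] [INFO]Response: Yes. X is active.
"""

for key, value in active_configuration(sample).items():
    print(f"{key} = {value}")
```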
