kernel: e1000e: eth0 NIC Link is Down

This error has hit me a few times over the past years. Sometimes the network suddenly freezes and comes back after a few seconds; at other times it freezes and the only way to work remotely on the server again is to reboot the box. This is the exact error message I get: kernel: e1000e: eth0 NIC Link is Down

While inspecting the logs, this is what /var/log/messages looked like:

May 29 09:10:35 server kernel: e1000e: eth0 NIC Link is Down
May 29 09:10:35 server kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
May 29 09:10:35 server kernel: e1000e: eth0 NIC Link is Down
May 29 09:10:35 server kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
May 29 09:10:35 server kernel: e1000e: eth0 NIC Link is Down
May 29 09:10:35 server kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
May 29 09:10:35 server kernel: e1000e: eth0 NIC Link is Down
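
To get a feel for how often the link drops, the Down events can be counted per minute. This is only a minimal sketch, assuming the log lines look exactly like the excerpt above; count_link_flaps is a hypothetical helper name, and on Ubuntu the kernel messages usually end up in /var/log/syslog or /var/log/kern.log instead of /var/log/messages.

```shell
# Count "NIC Link is Down" events per minute in a syslog-style log file.
# Assumes the message format shown above; other kernels may word it differently.
count_link_flaps() {
    local logfile="$1" iface="${2:-eth0}"
    # Keep month, day and HH:MM of each matching line, then count duplicates.
    grep -F "e1000e: ${iface} NIC Link is Down" "$logfile" |
        awk '{ print $1, $2, substr($3, 1, 5) }' |
        sort | uniq -c
}

# Example: count_link_flaps /var/log/messages eth0
```

Many events within the same minute, as in the excerpt, suggest the link is flapping rather than going down once.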

After rebooting the server, connectivity was back to normal; however, a reboot is not a permanent fix for production servers. These are the solutions I’ve used in this situation to fix the “eth0 NIC Link is Down” error on Linux servers.

One of the first things to do is to check whether the error counters increase between successive runs of the ifconfig command. Growing counters are a way to detect possible NIC driver issues; if they stay flat, the cause could be a hardware problem (NIC, cable, port).
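
The same counters that ifconfig prints are also exposed under /sys/class/net, which makes it easy to sample them twice and compare. The sketch below is one possible way to do that, not the only one; error_counters, watch_errors and the NET_SYSFS variable are names made up for this example.

```shell
# Sysfs directory with per-interface counters; overridable for testing.
NET_SYSFS="${NET_SYSFS:-/sys/class/net}"

# Print selected error counters for an interface as name=value lines.
error_counters() {
    local iface="$1" name
    for name in rx_errors tx_errors rx_crc_errors tx_carrier_errors; do
        printf '%s=%s\n' "$name" "$(cat "$NET_SYSFS/$iface/statistics/$name")"
    done
}

# Sample the counters twice and report whether anything changed in between.
watch_errors() {
    local iface="$1" interval="${2:-10}" before after
    before=$(error_counters "$iface")
    sleep "$interval"
    after=$(error_counters "$iface")
    if [ "$before" = "$after" ]; then
        echo "no new errors on $iface"
    else
        echo "counters changed on $iface:"
        echo "$after"
    fi
}

# Example: watch_errors eth0 30
```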

How can I solve this e1000e: eth0 NIC Link is Down error?

There are a few reasons why this could be happening. I’ve seen this happen on both CentOS and Ubuntu servers, and it has always come down to one of the following:

1. Bad ethernet cables

This is one of the easiest fixes: replace the ethernet cables and start monitoring again. Bad cables can cause exactly this kind of issue, and swapping the old network cables for new ones takes around a minute.

2. Failing e1000e network drivers

The e1000e driver sometimes fails on CentOS Linux, so make sure you have the latest e1000e NIC drivers. You can update your drivers by following these steps:

For Ubuntu – Try this handy script.
For CentOS/RHEL: Try this guide by Intel, you can also try this tiny script by Ioflood.com that works on CentOS 6 & 7:

# Copyright 2014 Input Output Flood LLC
# IOFLOOD.com -- We Love Servers
# This script may be freely distributed so long as this copyright notice remains intact
#
# this is a pre-requisite for our nifty nic upgrade script
 yum -y install pciutils
 
 # update this network driver for the appropriate RHEL release and the appropriate driver (e1000e and igb supported)
 NIC=`lspci -nv | egrep "e1000e$|igb$" | sed 's/\tKernel driver in use: //g' | sed 's/\tKernel modules: //g' | uniq`
 if grep -q -i "release 5" /etc/redhat-release
 then
   RPM="http://elrepo.org/elrepo-release-5-5.el5.elrepo.noarch.rpm"
 elif grep -q -i "release 6" /etc/redhat-release
 then
   RPM="http://elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm"
   if [[ "$NIC" == "e1000e" ]]
   then
     grubby --update-kernel=ALL --args="pcie_aspm=off e1000e.IntMode=1,1 e1000e.InterruptThrottleRate=10000,10000 acpi=ht"
   fi
 elif grep -q -i "release 7" /etc/redhat-release
 then
   RPM="http://elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm"
   if [[ "$NIC" == "e1000e" ]]
   then
     grubby --update-kernel=ALL --args="pcie_aspm=off e1000e.IntMode=1,1 e1000e.InterruptThrottleRate=10000,10000 acpi=ht"
   fi
 fi
 if [[ -n "$RPM" && -n "$NIC" ]] 
 then
   rpm --import http://elrepo.org/RPM-GPG-KEY-elrepo.org
   rpm -Uvh $RPM
   yum -y install kmod-$NIC
 fi
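
After running the script it is worth confirming which driver version is actually in use; keep in mind that kernel arguments set with grubby only take effect on the next boot. The snippet below is a small sketch: ethtool -i and modinfo are standard tools, while driver_info is a hypothetical helper that simply trims the ethtool output.

```shell
# Reduce `ethtool -i <iface>` output (read from stdin) to "<driver> <version>".
driver_info() {
    awk -F': *' '
        $1 == "driver"  { driver = $2 }
        $1 == "version" { version = $2 }
        END { print driver, version }
    '
}

# Examples (run before and after the update and compare):
#   ethtool -i eth0 | driver_info
#   modinfo -F version e1000e    # version of the module found on disk
#   cat /proc/cmdline            # shows pcie_aspm=off if the script added it and you rebooted
```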

3. Failing NIC

Swap the NIC for a new one. If the NIC is built into the motherboard, you will need a full motherboard swap, which can mean up to 30 minutes of downtime for your website.

4. Failing Switch Port

The last thing to try is to move the server to a different switch port.

One more suggestion:

The ethtool command can help you make sure that your Linux network settings match the ones configured on your switch; a mismatch can be another cause of this kind of networking error.
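
For the comparison with the switch, the relevant lines of the ethtool output are speed, duplex and auto-negotiation. Here is a rough sketch of how to pull them out; link_settings is a hypothetical helper, and the field names are the ones ethtool normally prints for a copper gigabit port.

```shell
# Extract the negotiated link settings from `ethtool <iface>` output on stdin.
link_settings() {
    sed -n -E 's/^[[:space:]]*(Speed|Duplex|Auto-negotiation|Link detected): */\1=/p'
}

# Example: ethtool eth0 | link_settings
# Compare the result with the port configuration on the switch; a port forced
# to a fixed speed or duplex while the server auto-negotiates is a classic mismatch.
```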

5. Update your BIOS & enable ASPM mode

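If disabled, the ASPM mode can be the cause of this issue. Note, however, that the Ioflood script in solution 2 takes the opposite approach and boots with pcie_aspm=off, so it is worth testing both settings on your hardware. Apart from that, make sure you are running the latest BIOS version, and update if you are not.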
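
Before changing anything in the BIOS, you can check from Linux what the current state is. This is a sketch under the assumption that your kernel exposes the ASPM policy in /sys/module/pcie_aspm/parameters/policy (not every kernel does); active_aspm_policy is a name invented for this example.

```shell
# Print the active ASPM policy, i.e. the bracketed word in a line such as
# "default performance [powersave] powersupersave".
active_aspm_policy() {
    sed -n 's/.*\[\([a-z]*\)].*/\1/p' "${1:-/sys/module/pcie_aspm/parameters/policy}"
}

# Examples (the last three need root):
#   active_aspm_policy
#   lspci -vv | grep -i aspm           # per-device ASPM capability and state
#   dmidecode -s bios-version          # BIOS version currently installed
#   dmidecode -s bios-release-date     # and its release date
```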

6. Disable flow control

Having flow control enabled has caused this weird network error a few times. Try disabling it and keep monitoring your logs to see whether it was the cause of the issue:

ethtool -A eth0 rx off tx off

Now, check whether the change has been applied:

ethtool -a eth0

Pause parameters for eth0:

Autonegotiate:  on
RX:             off
TX:             off

If both RX and TX show off, flow control is disabled.
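
If you want to run this check from a monitoring script instead of reading the output by eye, something like the following could work. It is a sketch only: flow_control_disabled is a hypothetical helper that expects ethtool -a output in the layout shown above. Also keep in mind that a setting made with ethtool -A does not survive a reboot, so it has to be reapplied through your distribution's network configuration.

```shell
# Succeed (exit 0) only if both RX and TX pause are reported as "off"
# in `ethtool -a <iface>` output read from stdin.
flow_control_disabled() {
    awk '
        $1 == "RX:" || $1 == "TX:" { seen++; if ($2 != "off") bad++ }
        END { exit !(seen == 2 && bad == 0) }
    '
}

# Example:
#   if ethtool -a eth0 | flow_control_disabled; then
#       echo "flow control is off on eth0"
#   fi
```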

7. Replace the motherboard (with its onboard NIC) and the CPU

I once saw an E3-1230v2 with constant issues; the only way to fix it was to migrate the disks to a new E3-1231v3 with a different motherboard and onboard NIC.

What about you? Were you able to fix this e1000e: eth0 NIC Link is Down error?
