[BBLISA] 10GBE NICS

John Stoffel john at stoffel.org
Wed Oct 10 09:27:25 EDT 2012


>>>>> "Daniel" == Daniel Feenberg <feenberg at nber.org> writes:

Daniel> We want to PXE boot a dozen compute and file servers over 10GbE
Daniel> Ethernet. All of them boot fine with the motherboard NICs. We have
Daniel> a Brocade Ironport switch and a dozen direct attach cables. We also
Daniel> have samples of 3 brands of 10GbE NICs to test.

Why are you booting file servers via PXE?  Shouldn't they be your core
servers which stay up all the time and provide services?
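
If you do stick with PXE for everything, the server side is just dhcpd
pointing clients at a TFTP server.  A minimal sketch, assuming ISC
dhcpd and pxelinux; the subnet, addresses and filename here are just
placeholders for your setup:

    # /etc/dhcp/dhcpd.conf -- hypothetical PXE boot stanza
    subnet 192.168.10.0 netmask 255.255.255.0 {
      range 192.168.10.100 192.168.10.150;
      next-server 192.168.10.1;       # TFTP server address
      filename "pxelinux.0";          # boot loader served over TFTP
    }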

Daniel> 1) The HP card boots correctly when PXE is enabled on the NIC.

Daniel> 2) Chelsio N320E gives a "PXE-E61 - Media Failure" error and
Daniel>     the NIC link light never comes on.

It could be a bad card; how many have you tested?

Daniel> 3) Brocade 1010 - says "Adaptor 1/0 link initialization failed.
Daniel>     Disabling BIOS" or "No target devices or link down or init
Daniel>     failed" depending on the NIC BIOS setting. Again, the NIC link
Daniel>     light does not come on.

Daniel> We discount a bad cable (it works with the HP, and we have tried
Daniel> several) or a motherboard incompatibility (if we boot RHEL from
Daniel> a local drive and enable eth2 we can use the Chelsio or Brocade
Daniel> cards). Is there some configuration issue we are missing? Chelsio
Daniel> support did not offer a solution; we haven't contacted Brocade yet.

I'm not sure you can discount motherboard issues, since PXE is
directly tied into the BIOS: how the BIOS initializes the card, and
how the on-card option ROM handles PXE support.
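
Since you can boot RHEL locally, one thing worth checking is whether
the BIOS is even mapping the card's expansion ROM.  Something like
this (the PCI address is just an example; find yours with plain
lspci first):

    # Look for the card's expansion ROM mapping.  "[disabled]" just
    # means the ROM isn't mapped right now, which is normal after
    # boot, but no ROM line at all would be suspicious.
    lspci -v -s 04:00.0 | grep -i 'expansion rom'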

Daniel> The motherboard is a Gigabyte GA-P55M-UB2. All the cards have the
Daniel> latest firmware. 

Is the motherboard at the latest level of firmware? 
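
From a running Linux install you can read the BIOS version straight
out of the DMI tables and compare it against what Gigabyte ships:

    # Report the motherboard BIOS version and release date
    dmidecode -s bios-version
    dmidecode -s bios-release-date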

Daniel> Since the failure occurs before any packets are sent it can't
Daniel> be a dhcpd or tftp problem. Is the problem that some cards
Daniel> offer less than full support for direct attach? Or is direct
Daniel> attach not fully standardized? Should we try fiber optic
Daniel> cables? The documentation for all the cards suggests that
Daniel> their primary purpose is ethernet SANs. Perhaps the vendors
Daniel> don't care about other uses?

Daniel> We only need a single port per server, while the HP offers
Daniel> 2. Because of heat, power and cost reasons, we would prefer a
Daniel> single port card.

How much have you spent already in terms of your time debugging this?
I'd just go with the HP cards and get on with life.  Get the systems
up and running, and then keep playing with other cards in the lab.
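
For that lab testing, a quick way to confirm the failing cards really
never get a packet out (as you suspect) is to watch the boot server
while a client tries to PXE.  Roughly, with the interface name as a
placeholder:

    # Watch for DHCP (ports 67/68) and TFTP (port 69) traffic
    # from PXE clients; silence means the card never transmitted.
    tcpdump -ni eth0 'port 67 or port 68 or port 69'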

I don't think you're going to notice the heat and power from a second
port that isn't used.  It will be in the noise, I suspect.

Daniel> Any wisdom greatly appreciated. This is our first experience
Daniel> with 10GbE.  We could boot over 1Gb and then switch to 10GbE
Daniel> for file service, but we wish to reduce the amount of cabling.

Are you also running a management network on these systems for IPMI or
remote console stuff?  Could you boot off that?

Daniel> Two odd details - even booting from the local drive the
Daniel> Chelsio card fails on intensive use if IRQPOLL is enabled. An
Daniel> additional advantage of the HP card is that brief network
Daniel> interruptions do not affect it, while the Chelsio card will
Daniel> hang the computer if the switch reboots or a cable is moved
Daniel> from one port to another.

Sounds like you need to toss that Chelsio card out the window; it's
not reliable, and you'll just end up with all kinds of black magic
voodoo.

Now of course it could just be a motherboard interaction issue.  Have
you tried another brand of motherboard?  
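
Before you swap motherboards, it might be worth watching what the
kernel sees while you provoke one of those link flaps.  Run something
like this in one window while you move the cable:

    # Stream link-state change events as they happen
    ip monitor link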

John


