[BBLISA] 10GBE NICS

John Stoffel john at stoffel.org
Wed Oct 10 11:20:10 EDT 2012


>>>>> "Daniel" == Daniel Feenberg <feenberg at nber.org> writes:

Daniel> On Wed, 10 Oct 2012, John Stoffel wrote:

>>>>>>> "Daniel" == Daniel Feenberg <feenberg at nber.org> writes:
>> 
>>> We want to PXE boot a dozen compute and file servers over 10GBE
>>> ethernet. All of them boot fine with the motherboard NICs. We have
>>> a Brocade Ironport switch and a dozen direct attach cables. We also
>>> have samples of 3 brands of 10GBE NICs to test.
>> 
>> Why are you booting file servers via PXE?  Shouldn't those be your
>> core servers which stay up all the time and provide services?

Daniel> Until now we haven't had any trouble booting PXE, and it makes
Daniel> it easy to keep the systems up to date and consistent. For
Daniel> FreeBSD we posted some notes at
Daniel> http://www.nber.org/sys-admin/FreeBSD-diskless.html

Makes sense I guess.  I just never plan on bringing down my file
servers if I can help it, since so much depends on them.  *grin*
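For anyone on the list following along, the moving parts here are just a
DHCP server pointing the PXE ROM at a TFTP server.  A minimal ISC dhcpd
stanza looks roughly like this (the subnet, addresses, and filename below
are made-up placeholders, not Daniel's actual setup):

```
# dhcpd.conf -- hypothetical PXE boot stanza
subnet 10.0.0.0 netmask 255.255.255.0 {
    range 10.0.0.100 10.0.0.200;
    next-server 10.0.0.1;       # TFTP server holding the boot loader
    filename "pxelinux.0";      # loader the NIC's PXE ROM fetches
}
```

The NIC's option ROM does the DHCP/TFTP dance itself, which is exactly
why a card whose ROM never brings the link up fails before any of this
even starts.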

>>> 1) The HP card boots correctly when PXE is enabled on the NIC.
>> 
>>> 2) Chelsio N320E gives a "PXE-E61 - Media Failure" error and
>>> the NIC link light never comes on.
>> 
>> It could be a bad card, how many have you tested?
>> 

Daniel> We actually tried several of the Chelsio cards.

Ugh... what a pain!

>>> 3) Brocade 1010 - says "Adaptor 1/0 link initialization failed.
>>> Disabling BIOS" or "No target devices or link down or init
>>> failed" depending on the NIC BIOS setting. Again, the NIC link
>>> light does not come on.
>> 
>>> We discount a bad cable (it works with the HP, and we have tried
>>> several) or a motherboard incompatibility (if we boot RHEL from
>>> a local drive and enable eth2 we can use the Chelsio or Brocade
>>> cards). Is there some configuration issue we are missing? Chelsio
>>> support did not offer a solution, we haven't contacted Brocade yet.
>> 
>> I'm not sure you can discount motherboard issues, since the PXE stuff
>> is directly tied into the BIOS, how the BIOS initializes the card,
>> and how the on-card BIOS handles PXE support.
>> 

Daniel> Something to try, but we tried the Chelsio card in many
Daniel> machines, and the Brocade in 2.

Ah ok, you didn't mention that before.  I hope this isn't a repeat of
the fiasco that we went through when GigE first came out and the Sun
hme (Happy Meal Ethernet) cards couldn't reliably negotiate with some
switches.  

>> I don't think you're going to notice heat and power by just having a
>> second port on the system which isn't used.  It will be in the noise I
>> suspect.
>> 

Daniel> The HP card is too hot to touch.

Hmm... but it works, right?  That would count for a lot in my book.
Since heat is related to power draw, could you put a Kill-A-Watt on the
unit and see how much it draws with and without the card?  Or just
looking at the spec sheet might give a clue.  But in any case, having
it work reliably the way you want is the key thing.

Quick question though: are you running the compute nodes without any
local disk at all?  We've got a bunch of dual-CPU, quad-core boxes
with 144G of RAM each running RHEL5.5/6, and we ran into a problem with
/tmp mounted on tmpfs, which is pulled from swap.  One of our EDA tools
generates god-awfully big files in /tmp/.SCIxxxxx/.tmpfile
(don't you just love the double hiding?  What moron thought *that*
up?) and we ran into problems with /tmp filling up.  We only have dual
mirrored 147G disks on these systems, so it basically sucks and
crashes jobs when /tmp fills.  And then the job doesn't clean up
either!
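For what it's worth, one band-aid we could have used is capping the
tmpfs size in fstab so a runaway job hits ENOSPC before it eats all of
swap.  A sketch (the 16g figure is arbitrary, tune it to your RAM/swap):

```
# /etc/fstab -- cap a tmpfs-backed /tmp (size value is arbitrary)
tmpfs   /tmp    tmpfs   size=16g,mode=1777   0 0
```

It doesn't stop the tool from writing huge files, but at least the
failure is contained to that job instead of the whole box.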

Anyway, what I'm getting at is that for HPC systems, writing temp
files back to the file server might be a big performance hit if
these systems are completely diskless.  We saw a 2x slowdown when I
finally got them to put stuff into a scratch space on NFS.

Now I admit, I'm only running gigE on these systems, we have zero
10gigE networking, though that might change in time of course.  

So I'm really interested in your experience here, especially since I'll
want to PXE-kickstart my systems down the road if they do have 10gigE
installed.
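When I get there, I expect the pxelinux side to look roughly like this
(the paths and kickstart URL below are placeholders, not a tested
config):

```
# pxelinux.cfg/default -- hypothetical RHEL kickstart entry
default rhel
label rhel
    kernel vmlinuz
    append initrd=initrd.img ks=http://10.0.0.1/ks.cfg ksdevice=eth2
```

Which of course assumes the 10gigE NIC's PXE ROM actually gets the link
up in the first place -- exactly the problem you're describing.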

>>> Any wisdom greatly appreciated. This is our first experience
>>> with 10GBE.  We could boot over 1Gb and then switch to 10GBE
>>> for file service, but we wish to reduce the amount of cabling.
>> 
>> Are you also running a management network on these systems for IPMI or
>> remote console stuff?  Could you boot off that?

Daniel> We probably could if we knew how.
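IPMI itself doesn't carry the boot traffic -- the BIOS still PXEs off
whichever NIC it tries first -- but ipmitool can at least force the next
boot to PXE remotely and power-cycle the box, which is handy for
reinstalls.  Something like this (host and credentials are obviously
placeholders):

```
# Force PXE on the next boot, then power-cycle the machine.
# BMC address and credentials below are placeholders.
ipmitool -I lanplus -H 10.0.1.42 -U admin -P secret chassis bootdev pxe
ipmitool -I lanplus -H 10.0.1.42 -U admin -P secret chassis power cycle
```

If your BMCs share a port with a 1GbE data NIC, that combination would
let you boot off the management network and only use the 10GBE links
once the OS is up.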



>>> Two odd details - even booting from the local drive the
>>> Chelsio card fails on intensive use if IRQPOLL is enabled. An
>>> additional advantage of the HP card is that brief network
>>> interruptions do not affect it, while the Chelsio card will
>>> hang the computer if the switch reboots or a cable is moved
>>> from one port to another.
>> 
>> Sounds like you need to toss that Chelsio card out the window, it's
>> not reliable and you'll just end up with all kinds of black magic
>> voodoo.
>> 
>> Now of course it could just be a motherboard interaction issue.  Have
>> you tried another brand of motherboard?
>> 

Daniel> We have tried multiple motherboards. I expect it is a
Daniel> configuration issue, but not one that support is willing to
Daniel> reveal. Or maybe Brocade will fess up.

Daniel> I found that someone with an Emulex card has the same problem
Daniel> we see (no packets move for PXE, otherwise works) -
Daniel> http://www.mail-archive.com/xcat-user@lists.sourceforge.net/msg01574.html

Daniel> So this seems to be a generic problem with most 10GBE NICs. Mysterious.

Damn frustrating for sure!  Good luck Dan.

John


