[BBLISA] Guidelines for giving full root access to DBAs

John Stoffel john at stoffel.org
Tue Aug 22 11:23:54 EDT 2006


I want to chime in here a bit about the difference between devel, test
and production machines.  I can see both sides of the fence here,
since it's nice to give the engineers full access to the devel and
test machines and let them play with them to their hearts' content
before you roll out to production.

But there's the rub for me.  If a test box is what you stage to
before you go to production, then the test box *has* to mirror the
production environment; otherwise it's useless for testing.

Sure, let the DBAs and engineers go hog wild on devel boxes, but lock
down test and production, so that they are forced to a) articulate
what they need in production for system settings, and b) make sure
those requirements are met and tested.
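
To make "test mirrors production" something you can actually check,
here is a minimal sketch in Python.  It assumes you have saved the
system settings from each box as plain "key = value" text (the style
that `sysctl -a` prints).  The function names and the sample values
are made up for illustration; adapt them to whatever settings matter
at your site.

```python
def parse_settings(text):
    """Parse 'key = value' lines into a dict, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(test, prod):
    """Return {key: (test_value, prod_value)} for every setting that differs.

    A key that is missing on one side shows up with None for that side.
    """
    return {
        key: (test.get(key), prod.get(key))
        for key in sorted(set(test) | set(prod))
        if test.get(key) != prod.get(key)
    }


if __name__ == "__main__":
    # Made-up sample dumps, for illustration only.
    test_dump = "kernel.shmmax = 68719476736\nvm.swappiness = 10\n"
    prod_dump = (
        "kernel.shmmax = 4294967296\n"
        "vm.swappiness = 10\n"
        "fs.file-max = 65536\n"
    )
    differences = diff_settings(parse_settings(test_dump), parse_settings(prod_dump))
    for key, (test_value, prod_value) in differences.items():
        print(f"{key}: test={test_value} prod={prod_value}")
```

An empty result means the two boxes agree on everything in the dumps;
anything else is a list of settings the DBAs need to articulate and
get tested before it goes to production.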

Now back to Sharon's request for help.   You're in a tough boat here,
because your problem is completely political and not technical.   I'd
go back to management and ask them to change it so that the DBAs have
full root on devel boxes, but that they need to work with you on
test/prod boxes because otherwise you can't make sure they are the
same anymore.

Be a little flexible: set up Kickstart, JumpStart or whatever, and put
your system setup into it, so that you don't generally hack systems by
hand; you just update the image and re-flash the box.  This gives
you the best of both worlds.  

One, a documented and tested process for deploying systems to
production with a known config.

Two, an easy way to deploy test/devel boxes to the DBAs to play with
and tweak.  And if they screw up a system, you just re-image it
instead of trying to fix it, because you want to start from a known
state.  
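
If you want a quick way to tell whether a box has wandered away from
that known state, something like the following sketch would do.  It
assumes you record a SHA-256 checksum for each config file right after
imaging and compare against it later.  The function names are
hypothetical and the list of files to track is up to you; this is not
a replacement for a real integrity checker.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def build_manifest(paths):
    """Record the known state: map each path to its current digest."""
    return {str(path): file_digest(path) for path in paths}


def drifted(baseline):
    """Return the sorted list of baseline paths that changed or disappeared."""
    changed = []
    for path, digest in sorted(baseline.items()):
        current = Path(path)
        if not current.is_file() or file_digest(current) != digest:
            changed.append(path)
    return changed
```

Build the manifest once on a freshly imaged box, keep it somewhere the
DBAs can't write to, and run `drifted()` against it later.  A non-empty
list is your cue to re-image rather than debug.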

In any case, good luck!  It's all in the politics.  

Oh yeah, another thing to do is to bow gracefully to the pressure, and
then document every problem you are forced to fix because the DBAs
mucked up a system configuration.  Publicize this widely in weekly
status emails to your management and to the DBAs' management, noting
the time taken to solve each issue.  Offer constructive solutions
which let you keep control, but which give the DBAs the ability to get
their work done.

And take some time to set up a lunch/dinner with the DBAs and yourself
to get to know and trust each other, and maybe you can provide a
united front to management where the DBAs say "you know, we really
don't need root because Sharon has addressed all our needs so we can
get work done..." and presto, you look really good, and your systems
are better managed.

This might take some time to accomplish, but building the trust
between you and the DBAs in the trenches will be key.  

<long story>

I may or may not have told this story before, but I was once at a
networking company that made high-end WAN switches; let's call it "L"
for short.  *grin*  We had a problem where a NetApp which held users'
home directories and data (a production machine) was located in a
test group's lab and on the test group's network.  The lab manager
for the group gave all the engineers access to the router so they
could add/delete/modify routes while they did testing.  They routinely
knocked their NetApp off the network, which caused all kinds of
problems, mostly because the first time it happened we spent hours
trying to figure out the problem.

Once we corrected the route, all was set.  Then it happened again, but
was fixed faster because I knew where to look for problems.  I kept
asking him and his management to lock down the router, but he was lazy
and didn't want to do it.  So one day, after yet another failure due
to an engineer making a change, I went in and fixed the router, then
changed the enable password so that only I knew it.

Then I wrote up a *long* diatribe to him, his management and my
management explaining the history, why it was bad, what I had done,
etc.  

Having this lab manager yelling and cursing at me in the cubicles at
the top of his voice was not pleasant.  But I got my point across and
got the problem fixed because I was willing to stick to my guns, and
because I had worked to solve the problem in a cooperative manner at
first, and only became a jerk when the situation had devolved
completely.  

We ended up moving that NetApp to our production network, and getting
their production servers off their test network.  I'm still not a fan
of this lab manager, but hey, I don't work there any more.  *grin*

</long story>

Good luck Sharon!

John



