Re: kickstart problems






Paolo Supino wrote:

On Thu, Sep 4, 2008 at 8:27 AM, Romeo Ninov <rninov@xxxxxxxxx> wrote:



    Paolo Supino wrote:



        On Wed, Sep 3, 2008 at 5:52 PM, Marco Fretz
        <mailinglist@xxxxxxx> wrote:

           hi,

           we had the same problem with newer HP PCs and servers
           (Broadcom NICs). PXE works well on Broadcom, the install does
           not. it doesn't matter if you're using kickstart or a manual
           install.

           the problem was in CentOS 4.2. after updating the install
           environment to 4.5 the problem was gone... so it was a driver
           issue! the install kernel is not exactly the normal Linux
           kernel, i think.

           if anaconda just says that it cannot find the install image,
           etc., the system has no connectivity at that point.

           hope this is helpful...

           bests
            marco

           Paolo Supino wrote:
           >
           >
           > On Tue, Sep 2, 2008 at 3:07 PM, Romeo Ninov <rninov@xxxxxxxxx>
           > wrote:
           >
           >     Paolo Supino wrote:
           >
           >         On Tue, Sep 2, 2008 at 2:17 PM, Romeo Ninov
           >         <rninov@xxxxxxxxx> wrote:
           >
           >            Paolo Supino wrote:
           >
           >                On Tue, Sep 2, 2008 at 8:14 AM, nate
           >                <centos@xxxxxxxxxxxxxxxx> wrote:
           >
           >                   Paolo Supino wrote:
           >                   > Hi Nate
           >                   >
           >
           >                   > 3: After the error comes up I get the HTTP setup
           >                   > configuration screen with the source website (in
           >                   > IP) and CentOS directory as I entered them in the
           >                   > PXE configuration file and as it appears in the
           >                   > kickstart configuration file, and all I have to do
           >                   > is press the 'OK' button to continue the
           >                   > installation to a successful completion.
           >
           >                   If that's the case the next most likely culprit is
           >
           >                   > url --url http://192.168.11.1/source
           >
           >
           >                   Just because the PXE boot loader can download the
           >                   kickstart config does not mean that the
           >                   installation process will work with that NIC.
           >
           >                   Also I've had lots of Broadcom systems not work
           >                   with kickstart over the years; it's not uncommon
           >                   for newer systems to have newer revs of the
           >                   chipsets and those revs not being supported by the
           >                   installer.
           >
           >                   But it sounds like in your case it does work, so I
           >                   would look at the url above, as it likely is the
           >                   cause of the problem. Check the http access logs
           >                   on the server for 404s and similar errors.
           >
           >                   nate
           >
           >                   _______________________________________________
           >                   CentOS mailing list
           >                   CentOS@xxxxxxxxxx
           >                   http://lists.centos.org/mailman/listinfo/centos
           >
           >
           >
           >                Hi Nate
           >
           >                 After figuring out what I was doing wrong (see
           >                previous reply ...) I started going through each of
           >                my systems in order to boot them and install CentOS
           >                5.2 on each. For the most part it works, but only
           >                for the most part, because once every few boots (not
           >                machine specific) anaconda stops and either asks me
           >                which interface it needs to configure or fails to
           >                load 'stage2.img' from the web server on
           >                192.168.11.1 ... All cables are good cables, the
           >                network switch is a Cisco 3750G (with no
           >                configuration) and all the NICs are Broadcom with
           >                firmware 3.8.9. Can you throw a guess where the
           >                problem might be lying (I hate inconsistencies)?
           >
           >
           >            Have you checked the Apache logs for anything? Check
           >            the server messages as well.
           >
           >
           >
           >         Hi Romeo
           >
           >          Yes I did, and nothing shows up in either access_log or
           >         error_log :-(
           >         I just had a node that stopped to ask me for IP
           >         configuration (twice), and only the second time (checked
           >         on the server using tcpdump) did it actually try to
           >         contact the server to retrieve its network configuration.
           >         It then continued and successfully retrieved 'stage2.img'
           >         from the web server :-(
           >
           >     Paolo, what about the DHCP or BOOTP servers? Check their logs
           >     and flush the ARP cache on the server(s).
           >
           >
           >
           > Hi Romeo
           >
           >   The more systems I boot, the more I'm starting to feel that
           > it's a hardware-related problem ... I just booted a system in
           > which the ELOM says that NIC0 has one MAC address, but when I
           > booted the system I saw a different MAC address altogether on
           > the network ...
           >   I'm checking at the lowest level: on the wire (using tcpdump),
           > so if nothing shows in the capture I'm sure I won't find
           > anything in the logs :-(
           >
           >
           >
           >
           > --
           > TIA
           > Paolo
           >
           >
           >



        Hi Marco

         Thanx for the email. I've been debugging this problem for a
        few days. A few installs before I posted the first email in
        this thread I started sniffing the network interface on the
        server (dhcp, tftp, http are all on the same computer) and I
        noticed that no communication reaches the server between the
        PXE load and the retrieval error (and I think I wrote about it
        in my original post). Some people suggested that Linux might
        be confusing the interfaces (the Sun X2200 M2 has 4 NICs),
        which I find hard to believe (the Linux kernel is old enough
        and probably got rid of this kind of bug a long time ago). In
        some of the failures the kernel loaded, retrieved the
        kickstart configuration file and then failed to retrieve
        'stage2.img' (again, nothing appeared on the wire). I have a
        sneaking feeling that the kickstart process assumes a lot of
        basic facts and doesn't do any/enough sanity checking. Right
        now I need to get this cluster up and running (I'm already 2
        weeks behind schedule). After it's up I will try to debug the
        process.
         The situation got me so aggravated that I was contemplating
        resurrecting my old private distro (not going to do that) that
        does things in a much simpler way.


    Paolo
    Unfortunately CentOS/RHEL really have a problem with the order in
    which modules are loaded, especially with two identical NICs: the
    device names change in a random way. I personally mitigate the
    problem like this: in /etc/modprobe.conf I put the entry for the
    first NIC in first place and the entry for the second NIC in last
    place in the file, and after a reboot the NIC->eth? relation
    always stays in place.




Hi Marco

I didn't finish testing the way Nate asked me to, so right now I don't have any conclusive answers about what exactly is going on. But, as I wrote in my original email (the one that started this thread), what I see happening is this: anaconda prints an error message that it failed to retrieve 'stage2.img' from the HTTP server. I press 'OK' in the error message screen. The screen that comes after it is the HTTP setup screen, with the information given by the 'ks' directive from pxelinux already filled in, so the only thing left for me to do is press the 'OK' button. When I press it, anaconda successfully retrieves 'stage2.img' from the HTTP server and goes on to finish the unattended install successfully (take a look at my original post). The only explanation that makes sense is that the network configuration hadn't finished yet when anaconda tried to retrieve 'stage2.img'.

Along the way I changed the configuration several times and got every possible failure (or at least it feels like it): failed to retrieve the kickstart config file, failed to retrieve 'stage2.img' no matter how many times I pressed the 'OK' button in the HTTP setup screen, and probably a few more scenarios that I'm trying very hard to forget ;-)

One thing I noticed is that anaconda reconfigures the network interface after the kernel has already configured it and successfully retrieved the kickstart config file from the web server (proved by sniffing the network). The question that goes through my mind when I see it is: why is it doing that??? It makes me feel that something is wrong in the assumptions of the install process ...

Maybe you're right about the module loading issue (though it doesn't explain what I wrote in the original post): I resurrected my old distro (a heavily modified Slackware) to test the issue, and what I found is that with a module-less kernel (all needed drivers statically compiled in) and no initrd to mess things up, the issue simply didn't happen (tested 10 times).

On the other hand, if you were right about it, then a RHEL/CentOS/Fedora installation would be unsuitable for any multihomed configuration, because it would map the eth devices differently (albeit only once in a while), which means one would have to switch the cables because of the network device remapping!!! And that isn't something users and corporations that use RHEL (and there are many of those) would be willing to live with :-)
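
Editor's note: since the thread keeps coming back to whether the 'stage2.img' request ever reaches the web server, here is a minimal Python sketch for cross-checking a tcpdump capture against the Apache access log. It assumes the log is in the common/combined log format. The function names and the request path in the sample are illustrative only and do not come from the thread.

```python
import re

# Common/combined log format:
#   host ident user [time] "METHOD path PROTO" status size ...
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def requests_for(lines, needle="stage2.img"):
    """Return (host, time, path, status) for each request whose path contains needle."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if m and needle in m.group("path"):
            hits.append(
                (m.group("host"), m.group("time"), m.group("path"), int(m.group("status")))
            )
    return hits


def failed(hits):
    """Keep only the requests the server answered with a 4xx or 5xx status."""
    return [h for h in hits if h[3] >= 400]
```

To use it, open the access log (on a stock CentOS Apache install that is usually /var/log/httpd/access_log), pass the file object to requests_for(), and compare the timestamps with the capture. An empty result for a node that showed the retrieval error would agree with the observation that nothing appeared on the wire.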

Paolo, this problem occurs only in RHEL/CentOS/other RH-based distros and not in Slack, SuSE, Debian, etc. I have not gone deeper into the problem, but that is the reality. BTW: you can play with the MAC address in the ifcfg files, but this is applicable only on an already installed machine. About your remark on corporations and RH: you are right, but how often are servers restarted? :-)
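
Editor's note: a minimal sketch of the MAC-address approach mentioned above, assuming the files meant are the per-interface ifcfg-ethN files under /etc/sysconfig/network-scripts and that the HWADDR key is what ties a device name to one card. The helper name render_ifcfg and the sample MAC address are made up for illustration. As noted, this only helps on an already installed machine, not during the kickstart itself.

```python
import re

# Six colon-separated hex octets, e.g. 00:1b:24:aa:bb:cc
MAC_RE = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")


def render_ifcfg(device, mac, bootproto="dhcp"):
    """Build the text of an ifcfg-<device> file that pins the device name to one MAC."""
    if not MAC_RE.match(mac):
        raise ValueError("not a MAC address: %r" % mac)
    lines = [
        "DEVICE=%s" % device,
        "HWADDR=%s" % mac.upper(),  # binds the device name to this card
        "BOOTPROTO=%s" % bootproto,
        "ONBOOT=yes",
    ]
    return "\n".join(lines) + "\n"
```

The returned text would be written to ifcfg-eth0, ifcfg-eth1 and so on, one file per NIC, using the MAC addresses read from the hardware (for example the ones the ELOM reports, once those are confirmed to match what is seen on the wire).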
_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos
