onbar hangs - Informix

  1. #1

    onbar hangs

    IDS 9.21.UC4
    Solaris 2.6

    EDS takes our backups using onbar. We have observed that onbar
    hangs quite frequently. By "hangs" I mean that the onbar processes
    keep running indefinitely. Since the backup is taken daily, all
    subsequent backups also hang until the earlier one is killed.

    I have a script that looks for an onbar process 15 hours after it
    starts and, if it finds one, sends a mail like this:

    =========================================
    onbar is still running on flx10
    root 15709 15703 0 00:50:14 ? 0:00 /bin/sh /opt/informix/bin/onbar -b -L 0
    root 15878 15711 0 00:51:14 ? 0:00 /opt/informix/bin/onbar_d -b -L 0
    root 15879 15711 0 00:51:14 ? 0:00 /opt/informix/bin/onbar_d -b -L 0
    root 15873 15711 0 00:51:14 ? 0:00 /opt/informix/bin/onbar_d -b -L 0
    root 15872 15711 0 00:51:14 ? 0:00 /opt/informix/bin/onbar_d -b -L 0
    root 15871 15711 0 00:51:14 ? 0:00 /opt/informix/bin/onbar_d -b -L 0
    root 15874 15711 0 00:51:14 ? 0:00 /opt/informix/bin/onbar_d -b -L 0
    root 15711 15709 0 00:50:14 ? 0:01 /opt/informix/bin/onbar_d -b -L 0
    =========================================

    The sysadmin then kills the processes indicated above.

    Once they kill the client process, we get this message in online.log:
    =======================================
    15:04:10 Archive on logs02 ABORTED.
    15:04:10 Aborted by client.
    15:04:10 Archive on logs03 ABORTED.
    15:04:10 Aborted by client.
    15:04:10 Archive on logs04 ABORTED.
    15:04:10 Aborted by client.
    15:04:10 Archive on physdbs ABORTED.
    15:04:10 Aborted by client.
    15:04:11 Archive on fdbs04 ABORTED.
    15:04:11 Aborted by client.
    15:04:11 Archive on gdbs01 ABORTED.
    15:04:11 Aborted by client.
    =======================================
    In other words, the backup is useless.

    Earlier on, this used to happen once or twice a month. Now it is once or twice a week.
    What could be wrong? Why does onbar hang for 15 hours without doing
    anything?
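For anyone who wants to reproduce the check, here is a minimal sketch in Python of how the `ps -ef` listing in the mail above could be parsed into a parent/child tree of onbar processes. It is not the script mentioned in the post. The function names are hypothetical, and it assumes the usual `ps -ef` column order (UID, PID, PPID, C, STIME, TTY, TIME, CMD).

```python
from collections import defaultdict

# Sample lines copied from the mail in the post above.
SAMPLE_PS = """\
root 15709 15703 0 00:50:14 ? 0:00 /bin/sh /opt/informix/bin/onbar -b -L 0
root 15878 15711 0 00:51:14 ? 0:00 /opt/informix/bin/onbar_d -b -L 0
root 15879 15711 0 00:51:14 ? 0:00 /opt/informix/bin/onbar_d -b -L 0
root 15873 15711 0 00:51:14 ? 0:00 /opt/informix/bin/onbar_d -b -L 0
root 15872 15711 0 00:51:14 ? 0:00 /opt/informix/bin/onbar_d -b -L 0
root 15871 15711 0 00:51:14 ? 0:00 /opt/informix/bin/onbar_d -b -L 0
root 15874 15711 0 00:51:14 ? 0:00 /opt/informix/bin/onbar_d -b -L 0
root 15711 15709 0 00:50:14 ? 0:01 /opt/informix/bin/onbar_d -b -L 0
"""


def parse_ps_line(line):
    """Split one `ps -ef` style line into a dict, or return None if it does not parse."""
    parts = line.split(None, 7)  # the 8th field keeps the whole command line
    if len(parts) < 8 or not parts[1].isdigit() or not parts[2].isdigit():
        return None
    return {"pid": int(parts[1]), "ppid": int(parts[2]),
            "stime": parts[4], "cmd": parts[7]}


def onbar_processes(ps_text):
    """Return {pid: process dict} for every line whose command mentions onbar."""
    procs = {}
    for line in ps_text.splitlines():
        proc = parse_ps_line(line)
        if proc and "onbar" in proc["cmd"]:
            procs[proc["pid"]] = proc
    return procs


def children_by_parent(procs):
    """Group the PIDs by their parent PID, sorted for stable output."""
    tree = defaultdict(list)
    for proc in procs.values():
        tree[proc["ppid"]].append(proc["pid"])
    return {ppid: sorted(pids) for ppid, pids in tree.items()}


if __name__ == "__main__":
    procs = onbar_processes(SAMPLE_PS)
    for ppid, pids in sorted(children_by_parent(procs).items()):
        print(ppid, "->", pids)
```

Run against the sample, this groups the six `onbar_d` workers under the parent `onbar_d` process 15711, which itself sits under the `/bin/sh` wrapper 15709.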




    rkusenet Guest

  2. #2

    Re: onbar hangs






    Hi

    Were there any unusual messages in the BAR_ACT_LOG file? If you have
    configured a BAR_DEBUG_LOG file, that may also contain some useful
    information.

    If those BAR files do not indicate anything unusual, email me your BAR_*
    configurations from ONCONFIG, along with the pertinent portions of
    BAR_ACT_LOG and BAR_DEBUG_LOG (if applicable), and online.log.

    thanks. Davis.




    ----- Original Message -----
    From: "rkusenet" <rkusenetsympatico.ca>
    To: informix-listiiug.org
    Sent by: owner-informix-listiiug.org
    Date: 08/06/2003 01:23 PM
    Subject: onbar hangs
    Please respond to: "rkusenet"










    sending to informix-list
    Davis Kwong Guest

  3. #3

    Re: onbar hangs






    Hi, I'm not sure which manual is for 9.21, but here's a cut & paste of the
    9.4 IBM Informix Backup and Restore Guide which talks about controlling
    parallel backup and restore using onbar.

    The BAR_MAX_BACKUP parameter specifies the maximum number of
    parallel processes that are allowed for each onbar command. Both UNIX and
    Windows support parallel backups. Although the database server default
    value for BAR_MAX_BACKUP is 4, the onconfig.std value is 0.

    To perform a serial backup or restore, set BAR_MAX_BACKUP to 1.
    ON-Bar ignores the BAR_MAX_BACKUP parameter for a whole-system
    backup because they are always done serially.

    onconfig.std value     0
    if value not present   4
    units                  onbar processes
    range of values        0 = maximum number of processes allowed on system
                           1 = serial backup or restore
                           n = specified number of processes spawned
    takes effect           When onbar starts

    To specify parallel backups and restores, set BAR_MAX_BACKUP to a value
    other than 1. For example, if you set BAR_MAX_BACKUP to 5 and execute an
    ON-Bar command, the maximum number of processes that ON-Bar will
    spawn concurrently is 5. Configure BAR_MAX_BACKUP to any number up to
    the maximum number of storage devices or the maximum number of streams
    available for physical backups and restores.
    If you set BAR_MAX_BACKUP to 0, the system creates as many ON-Bar
    processes as needed. The number of ON-Bar processes is limited only by the
    number of storage spaces or the amount of memory available to the database
    server, whichever is less.
    The amount of memory available is based on SHMTOTAL. ON-Bar performs
    the following calculation where N is the maximum number of ON-Bar
    processes that are allowed:
    N = SHMTOTAL / (# transport buffers * size of transport buffers / 1024)
    If SHMTOTAL is 0, BAR_MAX_BACKUP is reset to 1. If N is greater than
    BAR_MAX_BACKUP, ON-Bar uses the BAR_MAX_BACKUP value. Otherwise,
    ON-Bar starts N backup or restore processes.
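
To make the calculation quoted above concrete, here is a rough sketch in Python. It is only one reading of the manual text, not ON-Bar's actual code. The function and argument names are hypothetical, and the units (SHMTOTAL in KB, buffer size in bytes) are an assumption taken from the division by 1024. Treating BAR_MAX_BACKUP = 0 as "limited by the number of storage spaces or by N, whichever is less" is likewise my interpretation of the paragraph.

```python
def effective_onbar_processes(bar_max_backup, shmtotal_kb,
                              xport_buffers, xport_buffer_bytes,
                              storage_spaces):
    """Estimate how many parallel ON-Bar processes would be started.

    All argument names are illustrative; they are not ONCONFIG names.
    """
    # "If SHMTOTAL is 0, BAR_MAX_BACKUP is reset to 1."
    if shmtotal_kb == 0:
        return 1

    # N = SHMTOTAL / (# transport buffers * size of transport buffers / 1024)
    n = int(shmtotal_kb // (xport_buffers * xport_buffer_bytes / 1024))

    # Assumption: 0 means "as many as needed", bounded by the number of
    # storage spaces or by memory (N), whichever is less.
    if bar_max_backup == 0:
        return min(n, storage_spaces)

    # "If N is greater than BAR_MAX_BACKUP, ON-Bar uses the BAR_MAX_BACKUP
    # value. Otherwise, ON-Bar starts N backup or restore processes."
    return min(n, bar_max_backup)


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the thread.
    print(effective_onbar_processes(5, 1_000_000, 10, 6144, 6))
    print(effective_onbar_processes(0, 1_000_000, 10, 6144, 6))
    print(effective_onbar_processes(1, 1_000_000, 10, 6144, 6))
```

Under this reading, setting BAR_MAX_BACKUP to 1 always yields a single process, which is the serial backup described above.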

    thanks, Davis.




    ----- Original Message -----
    From: "rkusenet" <rkusenetsympatico.ca>
    To: informix-listiiug.org
    Sent by: owner-informix-listiiug.org
    Date: 08/06/2003 02:03 PM
    Subject: Re: onbar hangs
    Please respond to: "rkusenet"





    Which manual describes onbar commands for 9.21?

    It seems it has something to do with parallelism: "current request
    exceeds parallelism".

    The Solaris log file shows this error:
    > XBSA-1.0.1 6.0.Build.153 15872 Wed Aug 6 00:52:17 2003
    > _nwbsa_is_retryable_error: received a retryable network error
    > (Severity 0 Number -13): current request exceeds parallelism
    > XBSA-1.0.1 6.0.Build.153 15873 Wed Aug 6 00:52:17 2003
    > _nwbsa_is_retryable_error: received a retryable network error
    > (Severity 0 Number -13): current request exceeds parallelism
    > XBSA-1.0.1 6.0.Build.153 15871 Wed Aug 6 00:52:17 2003
    > _nwbsa_is_retryable_error: received a retryable network error
    > (Severity 0 Number -13): current request exceeds parallelism
    > XBSA-1.0.1 6.0.Build.153 15874 Wed Aug 6 00:52:19 2003
    > _nwbsa_is_retryable_error: received a retryable network error
    > (Severity 0 Number -13): current request exceeds parallelism
    I would like to see whether there are options in onbar to turn off
    parallelism.




    Davis Kwong Guest

  4. #4

    Fw: onbar hangs







    ----- Forwarded by Davis Kwong/Menlo Park/IBM on 08/06/2003 05:18 PM -----

    From: Davis Kwong/Menlo Park/IBMIBMUS
    To: "rkusenet" <rkusenetsympatico.ca>
    cc: informix-listiiug.org, owner-informix-listiiug.org
    Date: 08/06/2003 05:19 PM
    Subject: Re: onbar hangs (Document link: Davis Kwong)




    Davis Kwong Guest

  5. #5

    Re: onbar hangs


    "Davis Kwong" <dkwongus.ibm.com> wrote in message news:bgs75m$grj$1terabinaries.xmission.com...
    > Hi, I'm not sure which manual is for 9.21, but here's a cut & paste of the
    > 9.4 IBM Informix Backup and Restore Guide which talks about controlling
    > parallel backup and restore using onbar.
    >
    > The BAR_MAX_BACKUP parameter specifies the maximum number of
    > parallel processes that are allowed for each onbar command. Both UNIX and
    > Windows support parallel backups. Although the database server default
    > value for BAR_MAX_BACKUP is 4, the onconfig.std value is 0.
    Thanks. I will try this.

    rk-
    --

    my email id is bogus


    rkusenet Guest

  6. #6

    Re: onbar hangs

    Are you running parallel backups and if yes do you have enough
    tape heads in the storage device? What else is using the storage
    device at the time this is running?

    --
    Paul Watson #
    Oninit Ltd # Growing old is mandatory
    Tel: +44 1436 672201 # Growing up is optional
    Fax: +44 1436 678693 #
    Mob: +44 7818 003457 #
    www.oninit.com #
    Paul Watson Guest

  7. #7

    Re: onbar hangs


    "Brice Avila" <briceavilahotmail.com> wrote in message
    news:c46e3ec5.0308071049.50bec301posting.google.com...
    > What is happening in the BAR_ACT_LOG, and with the storage manager?
    > The multiple onbar processes for a parallel backup are normal, one for
    > each dbspace. You'll notice that 15709 spawned 15711, and then 15711
    > spawned several dbspace onbar backup processes. The online.log you
    > provided shows a list of the dbspace backups that are killed when
    > the onbar processes are killed: logs02, logs03, logs04, physdbs,
    > fdbs04, gdbs01. From the information you provided, things appear OK.
    >
    > If 15 hours is too long for a backup (there are some customers I know
    > that would like a backup as short as 15 hours) then you'll need help
    > from your storage manager vendor to see where the bottlenecks are.
    >
    > Hope this information helps.
    >
    > Brice Avila
    > Minneapolis, Minnesota
    Brice, there is definitely some problem. onbar never completes, and it also
    messes up future onbar backups unless it is killed.
    The problem seems to be with multiple onbar processes.

    XBSA-1.0.1 6.0.Build.153 15872 Wed Aug 6 00:52:17 2003
    _nwbsa_is_retryable_error: received a retryable network error
    (Severity 0 Number -13): current request exceeds parallelism
    XBSA-1.0.1 6.0.Build.153 15873 Wed Aug 6 00:52:17 2003
    _nwbsa_is_retryable_error: received a retryable network error
    (Severity 0 Number -13): current request exceeds parallelism
    XBSA-1.0.1 6.0.Build.153 15871 Wed Aug 6 00:52:17 2003
    _nwbsa_is_retryable_error: received a retryable network error
    (Severity 0 Number -13): current request exceeds parallelism
    XBSA-1.0.1 6.0.Build.153 15874 Wed Aug 6 00:52:19 2003
    _nwbsa_is_retryable_error: received a retryable network error
    (Severity 0 Number -13): current request exceeds parallelism

    I have reduced BAR_MAX_BACKUP to 4. Yesterday the backup went fine,
    though slowly. I will wait a few more weeks to see whether it
    solves the problem.
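
As a way to check the parallelism theory, here is a small Python sketch that counts the "exceeds parallelism" records per process in the XBSA log excerpt above and compares them with the `onbar_d` PIDs from the `ps` listing in the first post. It assumes that the third field of each `XBSA-...` header line is a process ID. That is an inference from the fact that 15871 to 15874 also appear as `onbar_d` PIDs, not something the log format is documented to guarantee here. The function name is hypothetical.

```python
import re
from collections import Counter

# Log excerpt copied from this post.
SAMPLE_LOG = """\
XBSA-1.0.1 6.0.Build.153 15872 Wed Aug 6 00:52:17 2003
_nwbsa_is_retryable_error: received a retryable network error
(Severity 0 Number -13): current request exceeds parallelism
XBSA-1.0.1 6.0.Build.153 15873 Wed Aug 6 00:52:17 2003
_nwbsa_is_retryable_error: received a retryable network error
(Severity 0 Number -13): current request exceeds parallelism
XBSA-1.0.1 6.0.Build.153 15871 Wed Aug 6 00:52:17 2003
_nwbsa_is_retryable_error: received a retryable network error
(Severity 0 Number -13): current request exceeds parallelism
XBSA-1.0.1 6.0.Build.153 15874 Wed Aug 6 00:52:19 2003
_nwbsa_is_retryable_error: received a retryable network error
(Severity 0 Number -13): current request exceeds parallelism
"""

# onbar_d worker PIDs (children of 15711) from the ps listing in the first post.
ONBAR_D_WORKERS = {15871, 15872, 15873, 15874, 15878, 15879}

# Assumed header layout: "XBSA-<version> <build> <pid> <timestamp>".
HEADER = re.compile(r"^XBSA-\S+\s+\S+\s+(\d+)\s+.*$")


def parallelism_errors(log_text):
    """Return a Counter mapping PID -> number of 'exceeds parallelism' records."""
    counts = Counter()
    current_pid = None
    for raw in log_text.splitlines():
        line = raw.lstrip("> ").strip()  # tolerate mail quote markers
        match = HEADER.match(line)
        if match:
            current_pid = int(match.group(1))
        elif "current request exceeds parallelism" in line and current_pid is not None:
            counts[current_pid] += 1
    return counts


if __name__ == "__main__":
    counts = parallelism_errors(SAMPLE_LOG)
    print("workers with the error:   ", sorted(set(counts) & ONBAR_D_WORKERS))
    print("workers without the error:", sorted(ONBAR_D_WORKERS - set(counts)))
```

On this excerpt, four of the six workers logged the error and two (15878 and 15879) did not, which would fit a storage manager that accepted fewer parallel sessions than onbar started.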

    Ravi


    rkusenet Guest

  8. #8

    Re: onbar hangs

    Ravi,

    Now that you have provided some Legato information, you perhaps have a
    network bottleneck (?). You'll need to investigate the logs along these
    lines for further solutions.

    Lowering BAR_MAX_BACKUP only reduces the amount of information
    passing over the network. While the Informix logs show normal
    processing, the Legato logs show some errant info. Hope this
    information helps.

    Brice Avila
    Minneapolis, Minnesota

    Brice Avila Guest

  9. #9

    Re: onbar hangs

    How about something simple: maybe the storage manager is waiting for a
    tape mount?
    When onbar is hanging, don't just kill it; go into your storage
    manager and see what it thinks is happening.


    Doug McAllister Guest

  10. #10

    Fw: onbar hangs


    You could also try increasing the debug level for onbar to see what it's
    doing when it's stuck.

    ----- Original Message -----
    From: "Doug McAllister" <doug.mcallisternospam.fmr.com>
    To: <informix-listiiug.org>
    Sent: Wednesday, August 13, 2003 16:45
    Subject: Re: onbar hangs

    Mark Denham Guest
