Australia - Updated: 24-SEP-2003

IMPORTANT NOTICE

The online distribution of OpenVMS and related product patches is being migrated to the HP ITRC (Information Technology Resource Center) patch distribution site. The new ITRC patch server will allow OpenVMS customers to take advantage of many enhanced features for patch searching and distribution.

Beginning August 1, 2003, publicly available patches for OpenVMS and related layered products will be available from the HP ITRC web site at

http://itrc.hp.com/service/patch/mainPage.do

The same patches will still be available from the existing patch server in Colorado Springs (http://www.support.compaq.com/patches/) through the end of October 2003, to give customers sufficient time to update their bookmarks and make the transition to the HP ITRC web site.

ECO kits will also be available by raw FTP from ftp://ftp.itrc.hp.com/.

PLEASE UPDATE YOUR BOOKMARKS AND REGISTER ON THE NEW SITE NOW

Note: If you have trouble connecting to the ITRC site, delete any cookies for "itrc.hp.com" from your browser and try again. Report any difficulties or suggestions to MrVMS.

Sydney Customer Support Centre OpenVMS ECO information
    Updated: 24-SEP-2003 (Use your browser's Reload button to ensure you're viewing the most recent version)

VMS73_MEM_CHAN-V0200 Alpha V7.3 Memory Channel ECO Summary

To obtain this kit please call the Customer Support Centre or use the FTP site


    
      
    
    [OpenVMS] VMS73_MEM_CHAN-V0200 Alpha V7.3 Memory Channel ECO Summary
    
    New Kit Date:       22-SEP-2003
    Modification Date:  None
    Modification Type:  NEW KIT
    
    Copyright (c) Hewlett-Packard Company 2002,2003.  All rights reserved.
      
    OP/SYS:     OpenVMS Alpha V7.3
    
    COMPONENT:  Memory Channel
    
    SOURCE:     Hewlett-Packard Company
    
    ECO INFORMATION:
    
         ECO Kit Name:  VMS73_MEM_CHAN-V0200
                        DEC-AXPVMS-VMS73_MEM_CHAN-V0200--4.PCSI
         ECO Kits Superseded by This ECO Kit: Yes
         ECO Kit Approximate Size: 720 Blocks
         Kit Applies To:  OpenVMS Alpha V7.3
         System/Cluster Reboot Necessary: Yes
         Rolling Re-boot Supported:  Yes
         Installation Rating:  INSTALL_2
                                 2 : To be installed by all customers using the following
                                 feature(s):
    
                                     Memory Channel
      
         Kit Dependencies:
    
           The following remedial kit(s), or later, must be installed BEFORE
           installation of this, or any required kit:
    
    	VMS73_PCSI-V0100
    	VMS73_UPDATE-V0100
    
           In order to receive all the corrections listed in this
           kit, the following remedial kits should also be installed:
    
             None
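
The ECO Kit Name entry above gives both the short kit name and the full PCSI file name. As a rough illustration only, the sketch below splits that file name into its hyphen-separated fields and rebuilds the /PROD, /BASE and /VER values that appear in the scripted installation example later in this summary. The function names, the field labels, and the reading of "V0200" as "V2.0" (two-digit major, two-digit minor) are assumptions made for this example; only the file name and the /VER=V2.0 value come from this summary.

```python
def parse_pcsi_kit_name(filename):
    """Split a PCSI kit file name into labelled fields (labels are illustrative)."""
    stem, _, extension = filename.rpartition(".")
    fields = stem.split("-")
    if len(fields) != 6:
        raise ValueError(f"expected 6 hyphen-separated fields, got {len(fields)}")
    producer, base, product, version, update, kit_type = fields
    # Assumption: 'V0200' is a letter, a 2-digit major and a 2-digit minor number.
    readable = f"{version[0]}{int(version[1:3])}.{int(version[3:5])}"
    return {
        "producer": producer,
        "base": base,
        "product": product,
        "version": readable,
        "update": update,      # empty for this kit
        "kit_type": kit_type,  # kept as the raw digit from the file name
        "extension": extension,
    }


def install_qualifiers(info):
    """Build the qualifier string in the abbreviated form used in this summary."""
    return f"/PROD={info['producer']}/BASE={info['base']}/VER={info['version']}"


info = parse_pcsi_kit_name("DEC-AXPVMS-VMS73_MEM_CHAN-V0200--4.PCSI")
print(info["product"], install_qualifiers(info))
```

For the kit in this summary, the sketch produces the same qualifier values that the installation instructions give.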
    
    
    ECO KIT SUMMARY:
    
    FILES PATCHED OR REPLACED:
    
    
          o  [SYS$LDR]SYS$MCDRIVER.EXE (new image)
    
             Image Identification Information
    
             image name: "SYS$MCDRIVER"
             image file identification:  "X-59"
             image file build identification:  "X91Y-0060010000"
             link date/time: 14-MAY-2002 07:04:33.89
             linker identification:  "A11-50"
    
          o  [SYS$LDR]SYS$PMDRIVER.EXE (new image)
    
             Image Identification Information
    
             image name: "SYS$PMDRIVER"
             image file identification:  "X-31"
             image file build identification:  "X91Y-0060010010"
             link date/time: 12-MAY-2003 15:51:28.65
             linker identification:  "A11-50"
    
    
    
    
    PROBLEMS ADDRESSED IN THIS KIT
    
    New problems addressed in the VMS73_MEM_CHAN-V0200 kit
    
    
         o  PMDRIVER FORK_THREAD TQE DOUBLE-INSERT FIX
    
      
                   After installation of the VMS73_MEM_CHAN-V0100 ECO kit,
                   systems may hang when using the Memory Channel SCS-port.
                   The system will hang and not crash, requiring manual
                   intervention and a system-HALT (Console ^P) to recover.
                   This hang only occurs if there is high SCS-data-transfer
                   activity (MSCP/TMSCP disk/tape serving) with high IPL-8
                   fork latency on the Memory Channel target node.
    
                   A forced operator crash-dump and analysis will reveal the
                   OpenVMS EXEC looping within the following routines:
    
                   + SYSTEM_PRIMITIVE*.EXE: EXE$SWTIMER_FORK
                     Primary SMP CPU stuck scanning EXE$GL_TQFL
                     TQE-queue; check PCs on CPU-0 stack.

                   + SYS$PMDRIVER.EXE:      PM$COMQ_RETRY
                     V7.2-2: TQE$L_FPC: SYS$PMDRIVER+13CC0
                     SDA> FORMAT/TYPE=TQE @EXE$GL_TQFL
                     SDA> FORMAT/TYPE=TQE @.
                     SDA> REPEAT ..........
    
    
                   The OpenVMS EXE$GL_TQFL TQE-timer-queue will be
                   corrupted, typically with the first TQE linked back to
                   itself:
    
                   + SDA> VAL QUE EXE$GL_TQFL
    
    
                   Occasionally, there will be an ACCVIO within
                   TIMESCHDL_xxx (SYSTEM_PRIMITIVES) while servicing
                   TQE-queue.
    
                   Images Affected:[SYS$LDR]SYS$PMDRIVER.EXE
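
To make the "TQE linked back to itself" symptom concrete, the following is a minimal model of a doubly linked queue, not the OpenVMS data structures. It shows how inserting the same entry twice at the tail, with no check that it is already queued, leaves that entry pointing at itself, so a forward scan never gets back to the queue header. The class, the function names and the tail-insert policy are hypothetical; the real timer queue and the actual driver fix are not described in this summary.

```python
class QueueEntry:
    """One element of a circular doubly linked queue (illustrative model)."""

    def __init__(self, name):
        self.name = name
        self.flink = self  # forward link
        self.blink = self  # backward link


def insert_tail(header, entry):
    """Link entry in front of the header, with no guard against re-insertion."""
    tail = header.blink
    entry.flink = header
    entry.blink = tail
    tail.flink = entry
    header.blink = entry


def validate_queue(header):
    """Walk the forward links and check each back link; never loops forever."""
    seen = set()
    prev, node = header, header.flink
    while node is not header:
        if id(node) in seen:
            return False, f"loop at {node.name}"
        if node.blink is not prev:
            return False, f"bad back link at {node.name}"
        seen.add(id(node))
        prev, node = node, node.flink
    if header.blink is not prev:
        return False, "bad back link at header"
    return True, "ok"


header = QueueEntry("header")
tqe = QueueEntry("tqe")
insert_tail(header, tqe)
print(validate_queue(header))
insert_tail(header, tqe)      # the double insert
print(tqe.flink is tqe, validate_queue(header))
```

In this model the second insert makes the entry its own successor, which is why an unguarded scan of the queue would spin rather than fail.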
    
    
    
    
    
    
    
    
    Problems addressed in the VMS73_MEM_CHAN-V0100 kit
    
         o  Memory Channel virtual-hub (VHUB) can fail to come "ONLINE"
    
      
                   1.  A Memory Channel virtual-hub (VHUB) will fail to come
                       "ONLINE" and form an SCS virtual circuit (link-up) if
                       the Memory Channel VHUB VH0/Master node is not booted
                       before the VHUB VH1/Slave MC-node.

                   2.  If a VH0/Master Memory Channel node crashes and/or
                       reboots while the VH1/Slave Memory Channel node
                       remains running, the Memory Channel link will fail
                       and both VHUB Memory Channel nodes MCA0 (and MCB0 if
                       applicable) will remain "OFFLINE".
    
                   This MCx0 "OFFLINE" problem may also occur during
                   MCA0/MCB0 adapter/link error-handling/recovery.
    
                   The following symptoms are manifestations of this MC VHUB
                   BOOT "OFFLINE" problem:
    
                   OPA0: console errors:
                   --------------------
    
                   %MCA0 CPU00:  19-SEP-2000 04:17:50  Slave but adapter_ok
                                 off, retrying.
                   %MCA0 CPU00:  19-SEP-2000 04:17:50 MC re-init 5 second timer.
                   %MCA0 CPU00:  19-SEP-2000 04:17:55 Slave but adapter_ok
                                 off, retrying.
                   %MCA0 CPU00:  19-SEP-2000 04:17:55 MC re-init 5 second timer.
                                 .
                                 .
                                 .
                   ON REMOTE NODE ATTEMPTING MC SW INIT .........
                   MCA0 CPU00:  19-SEP-2000 04:27:50 node state retries exceeded
    
                   DCL SHOW DEVICE command output:
                   -------------------------------
    
    
                   SHOW DEVICE reports MCA0: and PMA0: (and MCB0:/PMB0:, if
                   applicable) as OFFLINE:
    
                   $ SHOW DEVICE MC
                    Device                  Device           Error
                     Name                   Status           Count
                     MCA0:                  Offline           2
                     MCB0:                  Offline           16
      
                   $ SHOW DEVICE PM
                    Device                  Device           Error
                     Name                   Status           Count
                     PMA0:                  Offline            0
                     PMB0:                  Offline            0
    
    
                   Images Affected:[SYS$LDR]SYS$MCDRIVER.EXE
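
For sites that capture SHOW DEVICE output to a file, a check for the "Offline" symptom described above could look roughly like the sketch below. It only parses text in the three-column layout shown in this summary (device name, status, error count); the function name is hypothetical, and other status values or column layouts are not handled.

```python
SAMPLE_OUTPUT = """\
 Device                  Device           Error
  Name                   Status           Count
  MCA0:                  Offline           2
  MCB0:                  Offline           16
"""


def offline_devices(show_device_output):
    """Return (device, error_count) pairs for lines whose status is Offline."""
    offline = []
    for line in show_device_output.splitlines():
        fields = line.split()
        # A device line looks like: NAME:  STATUS  COUNT
        if len(fields) < 3 or not fields[0].endswith(":"):
            continue
        if fields[1].lower() == "offline" and fields[2].isdigit():
            offline.append((fields[0], int(fields[2])))
    return offline


for device, errors in offline_devices(SAMPLE_OUTPUT):
    print(f"{device} is offline (error count {errors})")
```

The header lines are skipped because their first field does not end in a colon.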
    
    
    
         o  MC_INCONSTATE (SYS$MCDRIVER) bugcheck
    
      
                   An MC_INCONSTATE (SYS$MCDRIVER) bugcheck may occur during
                   local/remote Memory Channel node reboot or Memory Channel
                   adapter/link error-recovery.  This bugcheck can occur
                   regardless of the Memory Channel hub configuration:  VHUB
                   or real-HUB.  The MC_INCONSTATE bugcheck will typically
                   occur when a "nested error (MCDRIVER-internal or
                   MC-adapter HW-error)" is encountered while recovering
                   from a Memory Channel link error or local/remote Memory
                   Channel node crash/reboot.
    
                   The "MC_INCONSTATE" bugcheck is obvious, and is nearly
                   always caused by this "nested error-handling" bug.  A
                   typical MCx0:  error-log event sequence, and SDA> crash
                   summary are shown below:
    
                   MCx0: ERROR-LOG SUMMARY: Unsuccessful events:
                   ---------------------------------------------
                   MCB0 - Hardware error, reinitializing.
                   MCB0 -
                           Node 0:     State:  Uninitialized
                           Node 1:     State:  Uninitialized
                   MCB0 - Memory channel link online failure 2
                   MCB0 - We shouldn't be here.
                           CRASH - MC_INCONSTATE
    
                   Crashdump Summary Information:
                   ------------------------------
                   Bugcheck Type:     MC_INCONSTATE, Fatal error
                                      detected by Memory Channel
                   Failing PC:        FFFFFFFF.E2983A44  SYS$MCDRIVER+0BA44
                   Failing PS:        30000000.00000804
                   Module:            SYS$MCDRIVER (Link Date/Time:
                                      29-DEC-1999 04:09:37.99)
                   Offset:            0000BA44
    
    
                   Images Affected:[SYS$LDR]SYS$MCDRIVER.EXE
    
    
    
         o  Memory Channel Receive channel (RX_MESS_CHAN) message
            processing may hang
      
                   Memory Channel Receive channel (RX_MESS_CHAN) message
                   processing may hang after processing 512 RX_MESS_CHAN
                   messages during a single fork-thread
                   ([MEM_CHAN]MC$HANDLE_MESS_CHAN_INT routine).  This could
                   occur with heavy Memory Channel SCS-traffic and high
                   IPL-8 fork-thread scheduling latency.  A Memory Channel
                   RX_MESS_CHAN message-handling hang will lead to
                   CNXMGR/LOCK_MGR stalls (and potential cluster hangs) as
                   well as SCS "virtual-circuit timeouts".
    
                   OPA0: CONSOLE PM/MC ERROR MESSAGES:
                   -----------------------------------
                   %PMA0 CPU00:  ... MC$_CHAN_QUE_EMPTY
                                     channel = 541C8  ppd = 83DD4CC0
                   %PMA0 CPU00:  ... stall state CLEAR
                                     channel = 541C8  ppd = 83DD4CC0
                   %MCA0 CPU00:  ... Timeslice exceeded while in workque
                                     for node RM763A
                   %MCA0 CPU00:  ... Timeslice exceeded while in workque
                                     for node RM763A
                   %MCA0 CPU00:  ... Timeslice exceeded while in workque
                                     for node RM763A
                   %PMA0, Virtual Circuit Timeout - REMOTE PORT xxxx
    
    
                   SCS VC-TIMEOUT ERRLOG ENTRY:
                   ----------------------------
                                     .
                                     .
                                     .
                   Error Type/SubType     x4009    Signaled via Packet, Virtual
                                                   Circuit Timeout.
    
                   The "...  Timeslice exceeded" error may continue to occur
                   after this fix is applied.  However, MC RX_MESS_CHAN
                   processing will no longer hang after this event.
    
                   Images Affected:[SYS$LDR]SYS$MCDRIVER.EXE
    
    
    
         o  MCDRIVER enters an infinite Hardware/Software
            initialization error-retry loop
    
      
                   Following a boot-time Memory Channel C
                   unit-init/self-test "LOOPBACK WRITE TEST" failure, which
                   indicates a Memory Channel adapter PCI-DMA error, the
                   MCDRIVER will enter an infinite HW/SW initialization
                   error-retry loop.  The following OPA0:/console errors
                   will be issued at 5 second intervals, changing to 10
                   minute intervals after 20 retries:
    
                   %MCA0 CPU00:  ... MC loopback write interrupt test failed.
                   %MCA0 CPU00:  ... Couldn't get mgmt lock.
                   %MCA0 CPU00:  ... ERR - ucb offline and adapter not crashing .
                   %MCA0 CPU00:  ... Couldn't get mgmt lock.
                   %MCA0 CPU00:  ... ERR - ucb offline and adapter not crashing .
                   %MCA0 CPU00:  ... Couldn't get mgmt lock.
                   %MCA0 CPU00:  ... ERR - ucb offline and adapter not crashing .
    
    
                   Note:  The first error message occurs on the first pass
                   only.
    
                   Images Affected:[SYS$LDR]SYS$MCDRIVER.EXE
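
The message timing described above (5 second intervals, changing to 10 minute intervals after 20 retries) can be modelled as a simple two-stage schedule, which may help when estimating how long a node has been looping from the number of console messages. This is a sketch of the observed timing only, not driver code; the function names are hypothetical, and treating the 21st retry as the first slow one is an assumption.

```python
def retry_delay_seconds(retry_number, fast_retries=20, fast_delay=5, slow_delay=600):
    """Delay before the given retry (1-based), per the intervals in this summary."""
    if retry_number < 1:
        raise ValueError("retry_number starts at 1")
    # Assumption: the first 20 retries use the short delay, later ones the long delay.
    return fast_delay if retry_number <= fast_retries else slow_delay


def elapsed_seconds(retries):
    """Total time spent waiting across the given number of retries."""
    return sum(retry_delay_seconds(n) for n in range(1, retries + 1))


for count in (1, 20, 23):
    print(count, "retries:", elapsed_seconds(count), "seconds")
```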
    
    
    
         o  System crashes with a CPUSPINWAIT, CPU spinwait timer
            expired bugcheck.
    
      
                   CPUSPINWAIT bugchecks may occur on any GSxxx AlphaServer
                   platform (GS140, GS80/160/320) with a Memory Channel
                   adapter.  The bugchecks occur due to an error in the
                   SYS$MCDRIVER "MC$ALLOCATE_MESSAGE" routine performing
                   Memory Channel message free-queue-header "loopback
                   WRITE", and an incorrect timer implementation.  The
                   CPUSPINWAIT bugcheck will always involve an SMP$TIMEOUT
                   acquiring the SCS-spinlock while another SMP-CPU is
                   holding the SCS-spinlock within the SYS$MCDRIVER /
                   [MEM_CHAN]MCCHANNELS.C MC$ALLOCATE_MESSAGE routine.
    
                   Crashdump Summary Information:
                   ------------------------------
                   Bugcheck Type:     CPUSPINWAIT, CPU spinwait timer expired
                   Failing PC:        FFFFFFFF.8007A384    SMP$TIMEOUT_C+00064
                   Failing PS:        28000000.00000804
                   Module:            SYSTEM_SYNCHRONIZATION_MIN
                   Offset:            00000384
    
    
                   NOTE:  The "MC loopback write interrupt test failed"
                   error is typically due to a leftover/stale Memory Channel
                   adapter PCI-logic error-state that will only clear with a
                   CONSOLE >>> INIT operation (to perform PCI-bus RESET).
                   Users who frequently reboot without using the CONSOLE >>>
                   BOOT_RESET = ON switch (Environment Variable) or without
                   performing a CONSOLE >>> INIT command are susceptible to
                   this "MC loopback write test" error.
    
                   Images Affected:[SYS$LDR]SYS$MCDRIVER.EXE
    
    
    
         o  System can crash with an INVPTEFMT, Invalid page table
            entry format
    
      
                   Any SCS data transfer of "0-length" using the Memory
                   Channel (MC) SCS-port will result in an "INVPTEFMT,
                   Invalid page table entry format" bugcheck.  The bugcheck
                   occurs within IOC_STD$PTETOPFN, as a result of a call to
                   IOC_STD$FILSPT from PMDRIVER.C/SETUP_COPY.
    
                   Crashdump Summary Information:
                   ------------------------------
                   Bugcheck Type:     INVPTEFMT, Invalid page table
                                      entry format
                   Current Process:   NULL
                   Current Image:     <not available>
                   Failing PC:        FFFFFFFF.800B88FC
                                      IOC_STD$PTETOPFN_C+0008C
                   Failing PS:        38000000.00000804
                   Module:            IO_ROUTINES (Link Date/Time:
                                      13-DEC-2000 00:39:37.49)
                   Offset:            000048FC
    
    
                   Images Affected:[SYS$LDR]SYS$PMDRIVER.EXE
    
    
    
         o  SCS "SEND MESSAGE" and SCS data transfer commands can
            stall or hang
    
                   SCS "SEND MESSAGE" (typically LOCK_MGR and MSCP disk
                   commands) and SCS data transfer commands, issued over a
                   PM/MC SCS virtual circuit (VC), can stall or hang
                   following exhaustion of Memory Channel
                   "channel-free-queue" entries.  The duration of this stall
                   or hang is entirely dependent on SCS-sysap traffic and
                   flow-control (SCS "credit") patterns and will persist
                   until one of the following occurs:
    
                    o  SCS VC timeout error closes the VC
    
                    o  SCS-sysap sends a message that breaks the stalemate
    
                    o  SCS VC timeout mechanism sends a message that breaks
                       the stalemate
    
                    o  PMx0:  SCS-port timeout occurs, crashing the MC port
    
    
                   This SYS$PMDRIVER MC-SCS-command processing hang/stall
                   can occur under the following two conditions:
    
                    -  HANG:  Under heavy and primarily unidirectional
                       loads;
    
                    -  STALL:  Under more bi-directional loads, stalls will
                       create low performance over the Memory Channel VC,
                       drastically reducing Memory Channel performance under
                       load.
    
    
                   Because this hang/stall will block internode SCS-sysap
                   cluster communications, symptoms can be obscure and
                   numerous, or may manifest as:
    
                    o  Performance degradation over Memory Channel based SCS
                       VCs
    
                    o  An SCS VC-timeout
    
                    o  A LOCK_MGR stall/hang or performance loss
    
                    o  MSCP served disk command timeouts or disk I/O
                       slowdowns
    
                    o  Customer LOCK_MGR-dependent application stalls,
                       hangs, or slowdowns
    
    
                   Images Affected:[SYS$LDR]SYS$PMDRIVER.EXE
    
    RELATED ARTICLES:
    
    Detailed articles describing the problems listed above may exist in
    the OPENVMS database(s).  To view these articles,
    open the appropriate product database and perform a query using either
    of the following search strings: 'VMS73_MEM_CHAN-V0200' or
    'VMS73_MEM_CHAN'.
      
    
    ECO KIT ORDERING INSTRUCTIONS:
    
    If after an evaluation you wish to obtain this kit, request it
    electronically using the appropriate Advanced Electronic Services
    (AES) Service Tool.  If you are not familiar with how to request
    kits electronically, open the DIA, WIS or DSNLINK database and
    review the article entitled:
    
         [AES] How To Electronically Request ECO Kits Using Service Tools
    
    
    
    INSTALLATION INSTRUCTIONS:
    
         Install this kit with the POLYCENTER Software Installation utility
         by logging in to the SYSTEM account and typing the following at the
         DCL prompt:
    
         PRODUCT INSTALL VMS73_MEM_CHAN /SOURCE=[location of Kit]
    
         The kit location may be a tape drive, CD, or a disk directory that
         contains the kit.
    
         Additional help on installing PCSI kits can be found by typing
         HELP PRODUCT INSTALL at the system prompt.
    
    
    
         o  Scripting of Answers to Installation Questions
    
         During installation, this kit will ask and require user response to
         several questions.  If you wish to automate the installation of
         this kit and avoid having to provide responses to these questions,
         you must create a DCL command procedure that includes the following
         definitions and commands:
    
          -  $ DEFINE/SYS NO_ASK$BACKUP TRUE
    
          -  $ DEFINE/SYS NO_ASK$REBOOT TRUE
    
          -  Add the following qualifiers to the PRODUCT INSTALL command and
             add that command to the DCL procedure.
    
               /PROD=DEC/BASE=AXPVMS/VER=V2.0
    
    
          -  De-assign the logicals assigned
    
         For example, a sample command file to install the VMS73_MEM_CHAN
         kit would be:
    
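         $! Define the logicals that answer the installation questions
         $ DEFINE/SYS NO_ASK$BACKUP TRUE
         $ DEFINE/SYS NO_ASK$REBOOT TRUE
         $!
         $! Install the kit
         $ PROD INSTALL VMS73_MEM_CHAN/PROD=DEC/BASE=AXPVMS/VER=V2.0
         $!
         $! Remove the logicals defined above
         $ DEASSIGN/SYS NO_ASK$BACKUP
         $ DEASSIGN/SYS NO_ASK$REBOOT
         $!
         $ EXIT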
    
    
    
    COPYRIGHT AND DISCLAIMER:
    
         (C) Copyright 2003 Hewlett-Packard Development Company, L.P.
    
         Confidential computer software.  Valid license from HP  and/or  its
         subsidiaries required for possession, use, or copying.
    
         Consistent  with  FAR  12.211  and  12.212,   Commercial   Computer
         Software,  Computer  Software Documentation, and Technical Data for
         Commercial  Items  are  licensed  to  the  U.S.   Government  under
         vendor's standard commercial license.
    
         Neither HP  nor  any  of  its  subsidiaries  shall  be  liable  for
         technical  or  editorial errors or omissions contained herein.  The
         information in this document is provided "as is"  without  warranty
         of  any  kind  and  is  subject  to  change  without  notice.   The
         warranties for HP products are set forth  in  the  express  limited
         warranty  statements  accompanying  such  products.  Nothing herein
         should be construed as constituting an additional warranty.
    
         DISCLAIMER OF WARRANTY AND LIMITATION OF LIABILITY
    
         THIS PATCH IS PROVIDED AS IS, WITHOUT WARRANTY OF  ANY  KIND.   ALL
         EXPRESS  OR  IMPLIED  CONDITIONS,  REPRESENTATIONS  AND WARRANTIES,
         INCLUDING ANY IMPLIED  WARRANTY  OF  MERCHANTABILITY,  FITNESS  FOR
         PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY EXCLUDED TO THE
         EXTENT PERMITTED BY APPLICABLE LAW.  IN NO  EVENT  WILL  COMPAQ  BE
         LIABLE  FOR  ANY  LOST REVENUE OR PROFIT, OR FOR SPECIAL, INDIRECT,
         CONSEQUENTIAL, INCIDENTAL OR PUNITIVE DAMAGES, HOWEVER  CAUSED  AND
         REGARDLESS  OF  THE  THEORY OF LIABILITY, WITH RESPECT TO ANY PATCH
         MADE AVAILABLE HERE OR TO THE USE OF SUCH PATCH.
    
    
    All trademarks are the property of their respective owners.
    
    
    
    