WO2001016746A3 - Method and apparatus for extracting first failure and attendant operating information from computer system devices

Info

Publication number
WO2001016746A3
Authority
WO
WIPO (PCT)
Prior art keywords
failure
information
storage
computer system
dedicated
Application number
PCT/US2000/024218
Other languages
French (fr)
Other versions
WO2001016746A2 (en)
Inventor
Garry M Tobin
Joseph P Coyle
Peter Nixon
Original Assignee
Sun Microsystems Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1999-08-31
Filing date
2000-08-31
Publication date
2001-10-04
Application filed by Sun Microsystems Inc
Priority to AU73462/00A (AU7346200A)
Priority to EP00961521A (EP1210663A2)
Publication of WO2001016746A2
Publication of WO2001016746A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0766 Error or fault reporting or storing
    • G06F11/0772 Means for error signaling, e.g. using interrupts, exception flags, dedicated error registers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0721 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment within a central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/26 Functional testing
    • G06F11/267 Reconfiguring circuits for testing, e.g. LSSD, partitioning

Abstract

Information regarding the operating conditions of a computer system is stored in a storage dedicated to a failure management system. The storage is updated with the current operating conditions either periodically or upon the occurrence of predetermined events. When a first failure identification mechanism identifies a failure in the computer system, a capture mechanism interrupts the updating of the storage, leaving in the storage the information regarding the operating conditions that contributed to the failure. This information can then be read out to aid in diagnosis of the failure. Since the operating condition information is stored in a dedicated storage, it is not modified by events that take place after the failure is identified.

In accordance with one embodiment, the computer system ordinarily holds state and other operating information in a set of storage devices, such as state registers. The dedicated storage device can be a shadow register or other shadow storage device that holds a separate, dedicated copy of at least a portion of the operating information so that it is readily available if a failure is detected. During operation, an updating mechanism continually transfers the information in the state registers to the shadow register until a first failure is detected. When a failure is detected, a capture mechanism controls the updating mechanism to cease transferring information from the state registers to the shadow register. The shadow register can then output its contents, e.g., for analysis, preferably under computer program control.
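The capture behaviour described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the patent concerns hardware registers, and the class name `ShadowCapture`, its methods, and the register name `status` are hypothetical names chosen for illustration, not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ShadowCapture:
    """Toy software model of the shadow-register scheme (hypothetical names)."""
    state: Dict[str, int] = field(default_factory=dict)   # live state registers
    shadow: Dict[str, int] = field(default_factory=dict)  # dedicated shadow copy
    frozen: bool = False                                  # set once a failure is seen
    first_failure: Optional[str] = None                   # cause of the first failure

    def write_state(self, name: str, value: int) -> None:
        """Normal operation: the system changes a state register."""
        self.state[name] = value

    def update(self) -> None:
        """Updating mechanism: copy state into the shadow unless frozen."""
        if not self.frozen:
            self.shadow = dict(self.state)

    def report_failure(self, cause: str) -> None:
        """Capture mechanism: only the first failure freezes the shadow."""
        if not self.frozen:
            self.frozen = True
            self.first_failure = cause

    def read_out(self) -> Dict[str, int]:
        """Return a copy of the captured contents for diagnosis."""
        return dict(self.shadow)


def demo() -> ShadowCapture:
    dev = ShadowCapture()
    dev.write_state("status", 0x01)
    dev.update()                      # shadow tracks the live state
    dev.write_state("status", 0x02)
    dev.update()
    dev.report_failure("parity error")
    dev.write_state("status", 0xFF)   # activity after the failure
    dev.update()                      # no effect: the shadow is frozen
    dev.report_failure("timeout")     # a later failure does not replace the first
    return dev
```

In this model, calling `demo()` leaves the shadow holding the last state copied before the first failure, while the live `state` continues to change afterwards.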

Priority Applications (2)

Application Number Publication Number Priority Date Filing Date Title
AU73462/00A AU7346200A (en) 1999-08-31 2000-08-31 Method and apparatus for extracting first failure and attendant operating information from computer system devices
EP00961521A EP1210663A2 (en) 1999-08-31 2000-08-31 Method and apparatus for extracting first failure and attendant operating information from computer system devices

Applications Claiming Priority (2)

Application Number Publication Number Priority Date Filing Date Title
US09/386,553 US6499113B1 (en) 1999-08-31 1999-08-31 Method and apparatus for extracting first failure and attendant operating information from computer system devices
US09/386,553 1999-08-31

Publications (2)

Publication Number Publication Date
WO2001016746A2 (en) 2001-03-08
WO2001016746A3 (en) 2001-10-04

Family

ID=23526083

Family Applications (1)

Application Number Publication Number Priority Date Filing Date Title
PCT/US2000/024218 WO2001016746A2 (en) 1999-08-31 2000-08-31 Method and apparatus for extracting first failure and attendant operating information from computer system devices

Country Status (4)

Country Link
US (1) US6499113B1 (en)
EP (1) EP1210663A2 (en)
AU (1) AU7346200A (en)
WO (1) WO2001016746A2 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6643796B1 (en) * 2000-05-18 2003-11-04 International Business Machines Corporation Method and apparatus for providing cooperative fault recovery between a processor and a service processor
DE10157188A1 (en) * 2001-11-22 2003-05-28 G I N Mbh Programmable data logger and classifier for CAN systems
US7047462B2 (en) * 2002-01-04 2006-05-16 Hewlett-Packard Development Company, Lp. Method and apparatus for providing JTAG functionality in a remote server management controller
US6884224B2 (en) * 2003-04-04 2005-04-26 Norfolk Medical Needle protection device
EP1577773A1 (en) * 2004-03-15 2005-09-21 Siemens Aktiengesellschaft Multiprocessor system with a diagnosis processor for saving the system state and method for operating a multiprocessor system
US20060195731A1 (en) * 2005-02-17 2006-08-31 International Business Machines Corporation First failure data capture based on threshold violation
ATE533288T1 (en) * 2005-06-10 2011-11-15 Nokia Corp RECONFIGURING THE STANDBY SCREEN OF AN ELECTRONIC DEVICE
US7805634B2 (en) * 2006-09-16 2010-09-28 International Business Machines Corporation Error accumulation register, error accumulation method, and error accumulation system
CN100555240C (en) * 2007-01-16 2009-10-28 International Business Machines Corporation Method and system for diagnosing an application program
US9389940B2 (en) * 2013-02-28 2016-07-12 Silicon Graphics International Corp. System and method for error logging
US9231595B2 (en) * 2013-06-12 2016-01-05 International Business Machines Corporation Filtering event log entries
CN106569557A (en) * 2016-11-01 2017-04-19 深圳市亿威尔信息技术股份有限公司 Intelligent board card Bypass control system and method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4559596A (en) * 1982-06-30 1985-12-17 Fujitsu Limited History memory control system
US4660198A (en) * 1985-04-15 1987-04-21 Control Data Corporation Data capture logic for VLSI chips
US4661953A (en) * 1985-10-22 1987-04-28 Amdahl Corporation Error tracking apparatus in a data processing system
US5072450A (en) * 1989-07-27 1991-12-10 Zenith Data Systems Corporation Method and apparatus for error detection and localization
US5383201A (en) * 1991-12-23 1995-01-17 Amdahl Corporation Method and apparatus for locating source of error in high-speed synchronous systems
US5440729A (en) * 1990-04-18 1995-08-08 Nec Corporation Method for handling error information between channel unit and central computer

Family Cites Families (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2075310A (en) 1980-04-30 1981-11-11 Hewlett Packard Ltd Bus extender circuitry for data transmission
US4375051A (en) 1981-02-19 1983-02-22 The Perkin-Elmer Corporation Automatic impedance matching between source and load
DE3584751D1 (en) 1984-09-21 1992-01-09 Amt Holdings DATA TRANSFER SYSTEM.
US4716525A (en) 1985-04-15 1987-12-29 Concurrent Computer Corporation Peripheral controller for coupling data buses having different protocol and transfer rates
US4797815A (en) 1985-11-22 1989-01-10 Paradyne Corporation Interleaved synchronous bus access protocol for a shared memory multi-processor system
JPH0820978B2 (en) * 1987-01-26 1996-03-04 日本電気株式会社 Failure analysis information edit output method
US4864496A (en) 1987-09-04 1989-09-05 Digital Equipment Corporation Bus adapter module for interconnecting busses in a multibus computer system
US4881165A (en) 1988-04-01 1989-11-14 Digital Equipment Corporation Method and apparatus for high speed data transmission between two systems operating under the same clock with unknown and non constant skew in the clock between the two systems
US4965717A (en) * 1988-12-09 1990-10-23 Tandem Computers Incorporated Multiple processor system having shared memory with private-write capability
KR930001922B1 (en) 1989-08-28 1993-03-20 가부시기가이샤 히다찌세이사꾸쇼 Data processor
JPH0540682A (en) * 1990-06-08 1993-02-19 Internatl Business Mach Corp <Ibm> High available trouble-resistant relocation of storage device having atomicity
EP0525221B1 (en) 1991-07-20 1995-12-27 International Business Machines Corporation Quasi-synchronous information transfer and phase alignment means for enabling same
JPH05327376A (en) 1992-05-20 1993-12-10 Fujitsu Ltd Digital control variable gain circuit
US5291123A (en) 1992-09-09 1994-03-01 Hewlett-Packard Company Precision reference current generator
US5574866A (en) 1993-04-05 1996-11-12 Zenith Data Systems Corporation Method and apparatus for providing a data write signal with a programmable duration
US5654653A (en) 1993-06-18 1997-08-05 Digital Equipment Corporation Reduced system bus receiver setup time by latching unamplified bus voltage
US5634014A (en) 1993-06-18 1997-05-27 Digital Equipment Corporation Semiconductor process, power supply voltage and temperature compensated integrated system bus termination
US5479123A (en) 1993-06-18 1995-12-26 Digital Equipment Corporation Externally programmable integrated bus terminator for optimizing system bus performance
US5657456A (en) 1993-06-18 1997-08-12 Digital Equipment Corporation Semiconductor process power supply voltage and temperature compensated integrated system bus driver rise and fall time
US5406147A (en) 1993-06-18 1995-04-11 Digital Equipment Corporation Propagation speedup by use of complementary resolver outputs in a system bus receiver
US5687330A (en) 1993-06-18 1997-11-11 Digital Equipment Corporation Semiconductor process, power supply and temperature compensated system bus integrated interface architecture with precision receiver
US5461330A (en) 1993-06-18 1995-10-24 Digital Equipment Corporation Bus settle time by using previous bus state to condition bus at all receiving locations
US5359235A (en) 1993-06-18 1994-10-25 Digital Equipment Corporation Bus termination resistance linearity circuit
US5534811A (en) 1993-06-18 1996-07-09 Digital Equipment Corporation Integrated I/O bus circuit protection for multiple-driven system bus signals
US5600824A (en) 1994-02-04 1997-02-04 Hewlett-Packard Company Clock generating means for generating bus clock and chip clock synchronously having frequency ratio of N-1/N responsive to synchronization signal for inhibiting data transfer
KR0128271B1 (en) * 1994-02-22 1998-04-15 윌리암 티. 엘리스 Remote data duplexing
US5592658A (en) 1994-09-29 1997-01-07 Novacom Technologies Ltd. Apparatus and method for computer network clock recovery and jitter attenuation
DE4445846A1 (en) 1994-12-22 1996-06-27 Sel Alcatel Ag Method and circuit arrangement for the termination of a line leading to an integrated CMOS circuit
EP0721162A2 (en) * 1995-01-06 1996-07-10 Hewlett-Packard Company Mirrored memory dual controller disk storage system
US5761410A (en) * 1995-06-28 1998-06-02 International Business Machines Corporation Storage management mechanism that detects write failures that occur on sector boundaries
US5790775A (en) * 1995-10-23 1998-08-04 Digital Equipment Corporation Host transparent storage controller failover/failback of SCSI targets and associated units
US5974562A (en) * 1995-12-05 1999-10-26 Ncr Corporation Network management system extension
US5705937A (en) 1996-02-23 1998-01-06 Cypress Semiconductor Corporation Apparatus for programmable dynamic termination
US5828889A (en) * 1996-05-31 1998-10-27 Sun Microsystems, Inc. Quorum mechanism in a two-node distributed computer system
US5819053A (en) 1996-06-05 1998-10-06 Compaq Computer Corporation Computer system bus performance monitoring
US5781028A (en) 1996-06-21 1998-07-14 Microsoft Corporation System and method for a switched data bus termination
US6363497B1 (en) * 1997-05-13 2002-03-26 Micron Technology, Inc. System for clustering software applications
US6145098A (en) * 1997-05-13 2000-11-07 Micron Electronics, Inc. System for displaying system status
US6275953B1 (en) * 1997-09-26 2001-08-14 Emc Corporation Recovery from failure of a data processor in a network server
US6078979A (en) 1998-06-19 2000-06-20 Dell Usa, L.P. Selective isolation of a storage subsystem bus utilzing a subsystem controller
US6233177B1 (en) 2000-06-22 2001-05-15 Xilinx, Inc. Bitline latch switching circuit for floating gate memory device requiring zero volt programming voltage

Also Published As

Publication number Publication date
AU7346200A (en) 2001-03-26
US6499113B1 (en) 2002-12-24
EP1210663A2 (en) 2002-06-05
WO2001016746A2 (en) 2001-03-08

Similar Documents

Publication Publication Date Title
CN101464819B (en) Hardware driven processor state storage prior to entering low power mode
US10482045B2 (en) Data communication interface for processing data in low power systems
CN100412865C (en) Read-copy update system and method
US20070028144A1 (en) Systems and methods for checkpointing
WO2001016746A3 (en) Method and apparatus for extracting first failure and attendant operating information from computer system devices
US6308318B2 (en) Method and apparatus for handling asynchronous exceptions in a dynamic translation system
US7496787B2 (en) Systems and methods for checkpointing
EP0495165A3 (en) Overlapped serialization
ATE216098T1 (en) MULTI-PROCESSOR SYSTEM BRIDGE WITH ACCESS CONTROL
US20060064517A1 (en) Event-driven DMA controller
US10579455B2 (en) Image forming apparatus and method for controlling image forming apparatus
CN108664655A (en) The log storing method and system of embedded system
US20070209039A1 (en) Message queue control program and message queuing system
EP0982664A3 (en) Coupling host processor to memory subsystem
CN1963768A (en) Processing method for interruption and apparatus thereof
CN110018921B (en) Event recording controller and electronic device
JP2993731B2 (en) Control method of hardware trace information
US11386277B2 (en) Magnetic recording medium processing apparatus and magnetic field generation method
JPS603223B2 (en) Central processing unit error collection method
KR20220037859A (en) Iot service flash code update system and iot service flash code update method
JP3191282B2 (en) Failure information data collection method
JPH0232885A (en) Printing apparatus
JPH05120096A (en) File overflow controlling system
JPH03100736A (en) Patrol diagnostic device
JP3199990B2 (en) Information processing device

Legal Events

Date Code Title Description
AK Designated states; Kind code of ref document: A2; Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW
AL Designated countries for regional patents; Kind code of ref document: A2; Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG
121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
AK Designated states; Kind code of ref document: A3; Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW
AL Designated countries for regional patents; Kind code of ref document: A3; Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG
WWE WIPO information: entry into national phase; Ref document number: 2000961521; Country of ref document: EP
WWP WIPO information: published in national office; Ref document number: 2000961521; Country of ref document: EP
REG Reference to national code; Ref country code: DE; Ref legal event code: 8642
WWW WIPO information: withdrawn in national office; Ref document number: 2000961521; Country of ref document: EP
NENP Non-entry into the national phase; Ref country code: JP