On January 21, 2004, PICMG adopted the first revision to PICMG 3.0 R1.0, the Advanced Telecom Computing Architecture (AdvancedTCA) specification. This Engineering Change Notice (ECN) (officially: ECN 3.0-1.0-001) reflects a year of experience with the adopted specification.
The four PICMG-sponsored AdvancedTCA Interoperability Workshops (AIWs) in 2003 were a key factor in identifying potential improvement areas for the specification, which was adopted on the last day of 2002. These events typically include 15 to 20 organizations, each of which brings one or more independently implemented AdvancedTCA products for testing. Participants have built a collection of more than 25 test plans for exercising interoperability. During an AIW, these test plans are executed systematically with combinations of participant products.
This process has been very useful in uncovering areas where independent implementers interpreted specification requirements differently or where the specification did not sufficiently cover corner cases. Most of the current test plans focus on management functionality, but mechanical and backplane signal integrity topics are also covered. These events are continuing in 2004, with AIW #6 held in January and AIW #7 planned for June.
PICMG platform management is based on the Intelligent Platform Management Interface (IPMI) specification, which is widely used for this purpose in the server and PC industry. Figure 1 shows the management aspects of an example AdvancedTCA shelf, including potential areas to use Pigeon Point Systems’ IPM Sentry off-the-shelf management components.
Figure 1. PICMG Platform Management Aspects Of An Example AdvancedTCA Shelf
In the shelf management area, most of the hundreds of change requests processed during the intensive ECN development process were editorial improvements or simple clarifications. The remainder of this article covers some of the more substantive changes for the following areas:
IPMB-0: The dual redundant Intelligent Platform Management Bus (IPMB), which is the main in-shelf communication path between the Shelf Manager (itself likely dual redundant) and the intelligent Field Replaceable Units (FRUs) in the shelf.
IPM Controllers: These represent each of the managed FRUs in the shelf, monitoring the operation of those FRUs and reporting to the Shelf Manager(s).
Shelf Managers: These supervise the operation of the shelf and represent the shelf to a higher level system manager.
IPMI entity facility: This can be used to allow the system manager to construct an accurate picture of the component relationships in a shelf.
Specification Refinements For IPMB-0
First, the IPMB-0 fault detection responsibilities were clarified. In the revised requirements, the Shelf Manager takes responsibility for fault detection for all segments of IPMB-0 that are not behind the required bus isolators associated with each IPM Controller. Each IPM Controller is responsible for fault detection behind its local isolator. The original specification did not clearly delineate these responsibility zones. The result was potential ambiguity and redundancy when IPMB-0 faults were detected. Figure 2 shows these fault detection responsibilities.
Figure 2: Fault Responsibility Zones For Each IPM Controller
Second, the ECN fully supports the possibility that IPMB-0 may be implemented in a radial rather than a bused fashion. In the radial approach, a separate isolated IPMB-0 segment could reach each board slot. This option was already present in the original specification but was not fully developed. For instance, there was no way for the IPMB-0 sensors and corresponding controls at the Shelf Manager to distinguish among the different radial segments to each slot. The revised specification addresses this omission by providing for distinct sensors for each radial IPMB-0 segment.
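To picture what distinct sensors per radial segment give the Shelf Manager, here is a minimal Python sketch. The class and field names (`SegmentSensor`, `ipmb_a_ok`, `ipmb_b_ok`) are hypothetical and are not taken from the specification. The sketch also assumes that each slot's radial segment carries both halves of the dual redundant IPMB-0, labeled A and B here.

```python
from dataclasses import dataclass


@dataclass
class SegmentSensor:
    """Hypothetical per-slot sensor for one radial IPMB-0 segment."""
    slot: int
    ipmb_a_ok: bool = True  # state of the "A" half of the redundant pair
    ipmb_b_ok: bool = True  # state of the "B" half of the redundant pair


def faulty_segments(sensors):
    """Return (slot, bus) pairs for every segment half reporting a fault."""
    faults = []
    for sensor in sensors:
        if not sensor.ipmb_a_ok:
            faults.append((sensor.slot, "A"))
        if not sensor.ipmb_b_ok:
            faults.append((sensor.slot, "B"))
    return faults


# One sensor per slot lets a fault be pinned to a specific segment.
sensors = [SegmentSensor(slot) for slot in range(1, 4)]
sensors[1].ipmb_a_ok = False  # simulate a fault on the slot 2 segment, bus A
print(faulty_segments(sensors))
```

With a single shared sensor, the same fault could only be reported as "somewhere on IPMB-0"; the per-segment list is what allows the fault to be attributed to one slot.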
Finally, the electrical requirements for IPMB-0 were refined, based on experience at the AIWs. For instance, the original specification allowed much more implementation freedom in where the terminations for the open collector IPMB-0 signal lines could be located in the shelf. This freedom created some confusion among implementers in different parts of the AdvancedTCA ecosystem, so the ECN places the responsibility for this termination squarely with the Shelf Management Controllers (ShMCs).
Refinements in IPM Controller Operations
The first example in this category can be summarized in the context of the state transition diagram that governs every managed FRU in AdvancedTCA. A simplified version of this diagram is shown in Figure 3.
Figure 3. Simplified State Transition Diagram That Governs Every Managed FRU In AdvancedTCA
The diagram itself is unchanged in the revised specification. The differences concern the definitions of some state transition conditions. In particular, the Extraction Criteria Met condition that triggers entry into the Deactivation Request state (M5) is now based on a state variable for the FRU. This variable is set when the FRU is activated and cleared by the act of opening an extraction handle or by explicit software action. This resolves an ambiguity in the specification that showed up in AIW testing and also provides a standard way for software to request deactivation.
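The role of the state variable can be sketched in a few lines of Python. This is an illustration only: it models just the transition from the Active state (M4) to Deactivation Request (M5), and the names `ManagedFru` and `deactivation_blocked` are hypothetical rather than the specification's own terms.

```python
from enum import Enum


class FruState(Enum):
    ACTIVE = "M4"
    DEACTIVATION_REQUEST = "M5"


class ManagedFru:
    """Models only the M4 -> M5 edge; all other states are omitted."""

    def __init__(self):
        # Activation sets the state variable (hypothetical name).
        self.state = FruState.ACTIVE
        self.deactivation_blocked = True

    def _clear_variable(self):
        self.deactivation_blocked = False
        # A cleared variable is treated as Extraction Criteria Met.
        if self.state is FruState.ACTIVE:
            self.state = FruState.DEACTIVATION_REQUEST

    def open_handle(self):
        """Opening the extraction handle clears the variable."""
        self._clear_variable()

    def software_request_deactivation(self):
        """Explicit software action clears the same variable."""
        self._clear_variable()


fru = ManagedFru()
fru.open_handle()
print(fru.state)
```

Because both triggers clear the same variable, the handle and the software path reach M5 through one condition, which is the uniformity the revised definition is after.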
The second example in this category relates to the bused E-Keying facility in the specification, which supports shared use of the bused connections on an AdvancedTCA backplane, such as the synchronization clock buses, among independent applications in a shelf. When the AIW test plan for bused E-Keying was developed, some ambiguities and oversights in how the original specification defined this facility became obvious. The revised specification clarified these issues.
Refinements In Shelf Manager Operations
The Shelf FRU information, typically stored in nonvolatile memory attached to the backplane, describes a wide variety of critical information about the shelf, such as its power handling capabilities and backplane interconnects. AdvancedTCA provides wide latitude in how this storage is implemented; essentially any IPM controller in the shelf can represent it. But there are likely to be two or more copies for redundancy, and this information may change during shelf operation (for instance, to accurately reflect the maximum current available from the external battery plant). How should these changes be coordinated among the various redundant copies and between redundant Shelf Managers? In the revised specification, the Shelf Managers are responsible for finding copies of this information. The active Shelf Manager provides a single point of read/write access to the up-to-date version, coordinated with the backup Shelf Manager as necessary.
The active Shelf Manager also implements the IPMI LAN interface, which is the one required mechanism for an overall system manager to interact with the shelf. The LAN interface implements IPMI messaging over an Internet Protocol (IP) based connection, typically Ethernet. The default IP address for this communication is part of the Shelf FRU information so that a shelf can be configured to start up with this address automatically accessible. The ECN adds provisions for a corresponding default gateway IP address and subnet mask.
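One reason the three values belong together is that they must be mutually consistent for the shelf to be reachable at startup. The following Python sketch checks that a default gateway lies on the subnet implied by the default IP address and mask. The function name and the addresses are made-up examples; how the values are actually encoded in the Shelf FRU information is not shown.

```python
import ipaddress


def gateway_on_subnet(ip, mask, gateway):
    """Return True if the gateway lies on the subnet implied by ip and mask."""
    interface = ipaddress.ip_interface(f"{ip}/{mask}")
    return ipaddress.ip_address(gateway) in interface.network


# Illustrative values only; none of these come from the specification.
print(gateway_on_subnet("192.168.1.10", "255.255.255.0", "192.168.1.1"))
print(gateway_on_subnet("192.168.1.10", "255.255.255.0", "10.0.0.1"))
```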
The ECN also clarifies how the system manager can send IPMI messages to IPM Controllers in the shelf via the LAN interface to the Shelf Manager. There were ambiguities in this area even in the base IPMI 1.5 specification (in particular, an ill-defined format for the response to a Send Message command). These ambiguities led to incompatibilities between AdvancedTCA Shelf Managers, as demonstrated by AIW testing. The authors of IPMI issued a formal clarification, and the ECN simply references that clarification.
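For readers unfamiliar with this bridging mechanism, the Python sketch below builds the kind of IPMB request that a system manager would hand to the Shelf Manager inside a Send Message command. It follows the standard IPMB request framing with its two checksums. The helper names, the target address 0x82, and the requester address 0x20 are illustrative assumptions, as is the simplified Send Message payload (channel 0, no tracking bits). The response format, which is what the clarification addresses, is not modeled.

```python
def checksum(data: bytes) -> int:
    """Two's complement checksum: data plus checksum sums to 0 mod 256."""
    return (-sum(data)) & 0xFF


def ipmb_request(rs_sa, net_fn, rq_sa, rq_seq, cmd, data=b"", rs_lun=0, rq_lun=0):
    """Frame an IPMB request with its header and trailing checksums."""
    header = bytes([rs_sa, (net_fn << 2) | rs_lun])
    body = bytes([rq_sa, (rq_seq << 2) | rq_lun, cmd]) + data
    return header + bytes([checksum(header)]) + body + bytes([checksum(body)])


def send_message_payload(channel, inner):
    """Simplified Send Message request data: channel number, then the frame.

    Assumption: tracking bits in the channel byte are left at zero.
    """
    return bytes([channel & 0x0F]) + inner


# Get Device ID (NetFn App 0x06, command 0x01) aimed at a hypothetical
# IPM Controller at IPMB address 0x82, from a requester at 0x20.
frame = ipmb_request(rs_sa=0x82, net_fn=0x06, rq_sa=0x20, rq_seq=1, cmd=0x01)
payload = send_message_payload(0, frame)
print(frame.hex(" "))
```

The system manager would carry `payload` as the data of a Send Message command over the LAN interface; the Shelf Manager then forwards the inner frame onto IPMB-0.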
Using IPMI Entities To Capture Component Relationships
One of the most important management additions in the ECN concerns the IPMI entity concept. Entities provide a consistent way to indicate the type (for example, board, processor, memory module, or power supply) and the instance (for example, which processor on a given board) for components in the shelf. When properly and consistently applied, the entity attributes of IPMI objects in a shelf allow system manager software to construct an accurate picture of the relationships among components and between components and associated sensors. For instance, a temperature sensor can be associated with a particular processor on a particular board in the shelf.
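A small Python sketch shows how type and instance attributes let management software tie sensors to components. The classes, the `parent` containment field, and the sample names are hypothetical; real IPMI data uses numeric entity IDs and separate association records, which are not reproduced here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    kind: str                        # type, e.g. "board" or "processor"
    instance: int                    # which one of that type
    parent: "Entity | None" = None   # hypothetical containment link


@dataclass(frozen=True)
class Sensor:
    name: str
    entity: Entity                   # the component this sensor monitors


def sensors_by_entity(sensors):
    """Group sensor names under the entity each one is associated with."""
    grouped = defaultdict(list)
    for sensor in sensors:
        grouped[sensor.entity].append(sensor.name)
    return dict(grouped)


def path(entity):
    """Render an entity's position in the containment hierarchy."""
    parts = []
    while entity is not None:
        parts.append(f"{entity.kind} {entity.instance}")
        entity = entity.parent
    return " / ".join(reversed(parts))


board = Entity("board", 3)
cpu = Entity("processor", 1, parent=board)
sensors = [Sensor("CPU Temp", cpu), Sensor("CPU Vcore", cpu), Sensor("Inlet Temp", board)]
print(path(cpu), sensors_by_entity(sensors)[cpu])
```

Here the temperature sensor resolves to processor 1 on board 3, which is the kind of association the text describes.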
The entity concept already exists in the IPMI specification but is covered there primarily in the context of traditional server systems, which tend to have a static population of FRUs. In contrast, the component population in AdvancedTCA is much more dynamic, with hot swappable boards and even potentially hot swappable mezzanine modules based on the AdvancedMC specifications, now in development within PICMG. The original AdvancedTCA specification did not provide any guidelines for applying this concept; the ECN repairs this omission, being careful to respect the inherently dynamic properties of AdvancedTCA shelves.
One key consumer of ECN-compliant entity data is system management software based on the interfaces defined by the Service Availability Forum (SAF). The SAF Hardware Platform Interface (HPI) is well suited to be layered above the IPMI LAN interface, perhaps as the lowest layer of the shelf-external system manager (see Figure 4). Visit www.sa-forum.org for further information.
Figure 4: SAF Hardware Platform Interface (HPI) Layered Above IPMI LAN Interface
Using ECN-compliant entity data accessed over the IPMI LAN interface, HPI can present a well-organized view of the components of a shelf to its client application(s) in the system manager.
Conclusion
The recently adopted ECN-001 for PICMG 3.0, the AdvancedTCA specification, captures numerous management-related improvements based on the first year’s experience with the specification. The PICMG-sponsored AdvancedTCA Interoperability Workshops have proven very useful in uncovering potential areas for specification improvement, as well as in accelerating the development of a strong ecosystem of interoperable AdvancedTCA components.
Developers of AdvancedTCA products should strongly consider the use of off-the-shelf management components, such as the IPM Sentry products, to take quickest advantage of these specification improvements and the interoperability testing that helped to drive them.
Mark Overgaard founded Pigeon Point Systems (PPS) in 1997 to focus on products and services supporting the adoption of open modular platforms to replace proprietary architectures, with an initial focus on the telecommunications market and CompactPCI. He is a leader in the technical subcommittees of PICMG (including the management aspects of AdvancedTCA and the corresponding CompactTCA specification, now in development). The current PPS product focus is the IPM Sentry line of platform management components, including the first available AdvancedTCA shelf- and board-level management components. Previously Mark was vice president of engineering at Lynx Real-Time Systems (a UNIX-compatible RTOS supplier) and TeleSoft (a major supplier of embedded development solutions for Ada). He earned an MS in Computer Science from UC San Diego and a BS in Physics from Geneva College.