Should we formalize architectural specifications of climate models?

For a little over a decade, software engineering research has shown that formalizing software architecture descriptions can yield several benefits: reducing the ambiguity (leniency of interpretation) of architectural specifications, helping developers understand the possible behaviors of individual components and of configurations of components without building a full implementation, and allowing software architects to perform architectural analyses before the system is built or before components are assembled. Whether formalizing the architecture is worth the effort depends on the kind of software you are writing, and by and large it seems that semantically rigorous architectural descriptions are more likely to be found in an academic setting than anywhere else.

One question we might ask is whether we can gain any significant benefit from formalizing the architecture of scientific software such as numerical climate models and other high-performance computer simulations. In my experience with the climate modeling community, even informal architectural descriptions are rare, let alone descriptions with enough semantic precision to enable meaningful architectural analyses.

There are a number of different architecture description languages (ADLs) out there; [1] is a good survey paper. Each ADL enables certain kinds of analyses and makes other analyses impossible or difficult. Most ADLs support notions of components, connectors, and ways to configure instances of these into systems. Most ADLs also sit on top of a formal conceptual framework (e.g., CSP, finite state machines) that provides semantics for the system elements and their interactions. This is what allows us to do some analysis of a system at the architectural level.

Wright Architectural Description Language

One popular ADL is called Wright (it is 10+ years old, but it looks like much of the hard-core ADL work petered out after 2000 or so; no offense to people named Peter). The Wright ADL has a formal semantics based on CSP. Wright's architectural descriptions are built from components, which have interaction points called ports, and connectors, which have interaction points called roles. Ports and roles are defined by providing abstract behavioral descriptions; these are basically the possible sequences of events that the port or role can participate in. Wright allows you to specify which events are initiated by a port or role and which events are merely observed. When putting together a system, connector roles are filled by compatible component ports. In what follows I will give an informal treatment of how to describe an ESMF-like architecture using Wright. If you want to know more about the details of Wright, you should read [2] to familiarize yourself with the language.

Here is an attempt at a Wright description of an ESMF-like architecture:

 Style ESMF

//events prefixed with "do" are initiated by that process (not merely observed)
 Connector ESMF
     Role Driver = doRegister -> doInit -> DriverRunPhase
         Where {
             DriverRunPhase = doRun -> DriverRunPhase |~| doFinalize -> TICK
         }
     Role Coupler = register -> init -> CouplerRunPhase
         Where {
             CouplerRunPhase = run -> CouplerRunPhase [] finalize -> TICK
         }
     Role CompA = register -> init -> CompARunPhase
         Where {
             CompARunPhase = run -> CompARunPhase [] finalize -> TICK
         }
     Role CompB = register -> init -> CompBRunPhase
         Where {
             CompBRunPhase = run -> CompBRunPhase [] finalize -> TICK
         }

     Glue = Driver.doRegister -> RegisterPhase ; Glue []
            Driver.doInit -> InitPhase ; Glue []
            Driver.doRun -> RunPhase ; Glue []
            Driver.doFinalize -> FinalPhase ; Glue []
            TICK
                Where {
                    RegisterPhase = ; c:{Coupler, CompA, CompB} @ c.register -> TICK
                    InitPhase = ; c:{Coupler, CompA, CompB} @ c.init -> TICK
                    RunPhase = ; c:{Coupler, CompA, CompB} @ c.run -> TICK
                    FinalPhase = ; c:{Coupler, CompA, CompB} @ c.finalize -> TICK
                }

 // no constraints
End Style

A few things to notice:

  • I am describing ESMF architectures as a style (note the first line of the Wright description above). As you might suspect, styles represent an entire class of architectures, not just one specific architecture.
  • For better or worse, Wright does not give much guidance on how to determine what should be a component, what should be a connector, and which interaction fragments should be called ports and which should be called roles. Figuring out how to break down a software artifact into these structures is left to the writer (Wrighter?) of the architectural description. Since connectors are the locus of interaction of two or more components, I have treated ESMF itself as a connector (the Connector ESMF block in the listing above).
  • The ESMF connector has four roles: Driver, Coupler, CompA, and CompB.  This represents a coupled model with two components, a coupler in between, and a driver sitting on top.  You might visualize it like this:

[Figure: ESMF-like architecture with two components, a coupler, and a driver]

  • Each role is defined by an abstract process description.  Think of this as a series of events that the role participates in.  Some events may be initiated by the role.  (Normally this is shown with an overbar, but here I have prefixed initiated events with “do” such as “doRegister” and “doInit.”)   Notice that the Driver role has four different events that it initiates, doRegister, doInit, doRun, and doFinalize. The way to interpret the Driver role behavioral description is: The Driver first initiates the doRegister event followed by the doInit event and then it behaves like the process DriverRunPhase.  The DriverRunPhase process, then, makes an internal choice (|~|) between two options.  The first option is to initiate the doRun event and then to behave like the DriverRunPhase process (recursive). The second option is to initiate the doFinalize event followed by TICK.  TICK (think of it like a check mark) means successful completion.
  • The Coupler, CompA, and CompB roles can be interpreted in a similar way. A major distinction is that all of their events (register, init, run, finalize) are observed, meaning that each of these roles waits on some other process to initiate the events (in this case the Driver; well, technically the Glue does this, but more on that below). Also, the Coupler, CompA, and CompB roles all feature an external (deterministic) choice (via the [] operator), meaning that an external process makes the decision (e.g., whether the coupler should run again or should finalize).
  • The last part of the ESMF connector description is the process Glue. Every connector has a Glue process which describes how all of the roles work together to form a coherent interaction. The Glue is the most complex of the process descriptions in the ESMF connector, so I will briefly describe how the “register” interactions are coordinated; the others follow in like manner. The Glue process contains five external-choice alternatives. The first alternative observes the Driver.doRegister event and then behaves like the RegisterPhase process. The RegisterPhase process (described in the Where clause) then initiates a sequence of three events: Coupler.register, CompA.register, and CompB.register. The ; operator is the sequence operator and, in this quantified form, means that the three events may occur in any order. This indicates that when the Driver initiates the doRegister event, the Coupler, CompA, and CompB will all respond (in some order) with their own register event. This is an abstract representation of the ESMF component registration process. Notice in the first line of the Glue definition that after the RegisterPhase process completes, the Glue process makes a recursive call to itself, thereby waiting on the next event to be initiated by the Driver.
  • The whole ESMF connector terminates successfully after the Driver initiates the doFinalize event, the Coupler, CompA, and CompB finalize, and then all processes synchronize on the TICK event.
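
To make the event ordering described in the bullets above a little more concrete, here is a small Python sketch that generates one possible trace of the connector. It is only an illustration and not a Wright or CSP tool: the function name simulate_esmf_connector, the run_steps parameter (which stands in for the Driver's internal choice between doRun and doFinalize), and the random shuffle used to model "in some order" are all my own assumptions.

```python
import random

# The three roles that merely observe events (names taken from the listing).
COMPONENT_ROLES = ["Coupler", "CompA", "CompB"]


def simulate_esmf_connector(run_steps, seed=None):
    """Return one possible event trace of the ESMF connector.

    run_steps is a hypothetical stand-in for the Driver's internal choice:
    how many times it picks doRun before it picks doFinalize.
    """
    rng = random.Random(seed)
    trace = []

    def phase(driver_event, role_event):
        # The Driver initiates an event; the Glue observes it and then
        # triggers the matching event on every other role, in some order.
        trace.append("Driver." + driver_event)
        order = COMPONENT_ROLES[:]
        rng.shuffle(order)
        trace.extend(role + "." + role_event for role in order)

    phase("doRegister", "register")
    phase("doInit", "init")
    for _ in range(run_steps):
        phase("doRun", "run")
    phase("doFinalize", "finalize")
    trace.append("TICK")  # every process terminates successfully together
    return trace


if __name__ == "__main__":
    for event in simulate_esmf_connector(run_steps=2, seed=0):
        print(event)
```

Each call yields a single trace; the formal description, by contrast, characterizes every trace the connector allows, which is what makes automated analysis possible.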

Given the quick and dirty ESMF example above, we see some of the benefits of using Wright to describe a climate model architecture:

  • A set of distinct roles is identified and made explicit, and their behaviors are described in an abstract way. This gives model builders a precise understanding of the expected behavior of any components fulfilling the roles.
  • The use of overbars (or the less preferable doXXX as I have done above) shows which roles are responsible for initiating events and which roles are waiting for events. This is an important distinction as the current set of coupling technologies differ with respect to which parts of the coupled model are driving (controlling) and which parts of the coupled model are passive (waiting to be called).
  • The internal vs. external choice operators make it clear who is deciding on the next action.  At the same time, the reason for choosing an action is abstracted.  For example, from this particular description above, we have no idea why the driver would choose doFinalize instead of doRun.  We’d have to look at a different representation to know that.
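
The difference between internal and external choice in the last point can be sketched in a few lines of Python. This is a rough illustration under my own assumptions, not CSP semantics in full: the helper names stable_offers and may_deadlock are hypothetical, and I use the same event names on both sides (ignoring the doXXX renaming that the Glue handles in the listing).

```python
def stable_offers(kind, events):
    """List the event sets a choice process may end up offering.

    external: the environment picks, so every event stays on offer.
    internal: the process picks, so it may commit to any single event.
    """
    events = list(events)
    if kind == "external":
        return [frozenset(events)]
    if kind == "internal":
        return [frozenset([e]) for e in events]
    raise ValueError("kind must be 'external' or 'internal'")


def may_deadlock(offers_p, offers_q):
    """True if some pair of commitments shares no event to synchronize on."""
    return any(not (p & q) for p in offers_p for q in offers_q)


# The Driver decides for itself; a component accepts whatever is decided.
driver = stable_offers("internal", ["run", "finalize"])
component = stable_offers("external", ["run", "finalize"])

print(may_deadlock(driver, component))  # False: the component follows the driver
print(may_deadlock(driver, driver))     # True: two deciders can disagree
```

The second result is the reason the distinction matters: if two parts of a coupled model both believe they are in control, they can commit to different next actions and get stuck.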

These are all nice advantages, but the real power of Wright comes into play through its support of some automated architectural analyses.  The formal underpinnings of Wright are based on CSP.  This means that Wright specifications can be translated into CSP and analyzed using CSP processing tools, the most popular of which is FDR2.  Some of the more interesting analyses made possible by this approach are:

  • Automated port/role compatibility checking.  Compatibility can be determined using the notion of process refinement.  Informally, a process P is refined by a process Q if Q respects all of P’s obligations to the environment and therefore Q can be safely substituted for P. In other words, we can automatically check which component ports are going to fulfill which connector roles. So, we might use this to answer questions like:  Can I plug in physics package X with atmosphere Y?  Of course, a number of steps have to be taken before we can answer that question automatically. The real work here is in describing the abstract behaviors of the physics process and the atmosphere process in a way that enables the analysis.
  • Deadlock freedom.  This is a kind of sanity check that verifies that when all of the components are hooked together into a system, there is a viable path of events that will terminate successfully.  It is possible that this kind of analysis could be used to guarantee that the set of models that participate in a coupled simulation indeed fulfill each other’s data dependencies. Events might correspond to specific field data exchanges and only a non-deadlocking set of processes (models) will mutually satisfy each other’s field requirements.
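
To give a feel for the two analyses above, here is a toy Python sketch. It is a much simplified stand-in for what a tool like FDR2 does, under several assumptions of mine: processes are encoded as small deterministic state machines in plain dicts, compatibility is approximated by trace refinement only (a weaker check than the refinement Wright actually uses), and the field names "wind" and "sst" are hypothetical examples of data exchanges.

```python
from collections import deque

# A process is a dict: state -> {event: next_state}. "DONE" marks success.
ROLE = {  # Role CompA from the listing above
    "start": {"register": "registered"},
    "registered": {"init": "running"},
    "running": {"run": "running", "finalize": "DONE"},
    "DONE": {},
}
PORT_BAD = {  # hypothetical port that initializes before registering
    "start": {"init": "s1"},
    "s1": {"register": "running"},
    "running": {"run": "running", "finalize": "DONE"},
    "DONE": {},
}
ATM = {"start": {"wind": "a1"}, "a1": {"sst": "DONE"}, "DONE": {}}
OCN_OK = {"start": {"wind": "o1"}, "o1": {"sst": "DONE"}, "DONE": {}}
OCN_BAD = {"start": {"sst": "o1"}, "o1": {"wind": "DONE"}, "DONE": {}}


def alphabet(proc):
    """All events a process can ever perform."""
    return {event for moves in proc.values() for event in moves}


def trace_refines(spec, impl):
    """True if every trace of impl is also a trace of spec."""
    seen, todo = set(), deque([("start", "start")])
    while todo:
        s, i = todo.popleft()
        if (s, i) in seen:
            continue
        seen.add((s, i))
        for event, i_next in impl[i].items():
            if event not in spec[s]:
                return False  # impl does something spec forbids here
            todo.append((spec[s][event], i_next))
    return True


def find_deadlock(p, q):
    """Return a reachable stuck state of p in parallel with q, else None."""
    shared = alphabet(p) & alphabet(q)
    seen, todo = set(), deque([("start", "start")])
    while todo:
        state = todo.popleft()
        if state in seen:
            continue
        seen.add(state)
        a, b = state
        successors = []
        for event, a_next in p[a].items():
            if event not in shared:
                successors.append((a_next, b))          # p moves alone
            elif event in q[b]:
                successors.append((a_next, q[b][event]))  # both synchronize
        for event, b_next in q[b].items():
            if event not in shared:
                successors.append((a, b_next))          # q moves alone
        if not successors and state != ("DONE", "DONE"):
            return state
        todo.extend(successors)
    return None


if __name__ == "__main__":
    print(trace_refines(ROLE, ROLE), trace_refines(ROLE, PORT_BAD))
    print(find_deadlock(ATM, OCN_OK), find_deadlock(ATM, OCN_BAD))
```

In the last example the atmosphere wants to exchange wind first while the second ocean insists on sst first, so the pair is stuck in its initial state. That is the kind of unmet data dependency a deadlock check could flag before any model code is run.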

Architectural Abstraction ≠ Low Resolution

A parting thought. Computer simulations are scientific tools. The interactions among components of a coupled model produce hard numbers about the physical processes under study. It is important to remember that the architectural perspective is not a kind of “low resolution” representation of the full model, at least not in the way that scientists think about low resolution. You actually have to run the full code to get low resolution output. You just change the grid size to meet your resource constraints. (In some cases, you might choose to ignore some physical processes that the coarser grid would not resolve, simply by not calling certain subroutines. But this is an optimization…) Instead, the architectural view is a high-level, abstract view of how components interact, with most of the details about the internals of the components completely ignored. So, at this time, it does not seem to me that an architectural representation of a coupled model could be used to produce some kind of scientifically meaningful output. Instead, it can be used to enable other kinds of analyses that gauge properties and qualities of the software system.

[1] Nenad Medvidovic and Richard N. Taylor. “A Classification and Comparison Framework for Software Architecture Description Languages.” IEEE Transactions on Software Engineering, 26(1), 2000.

[2] Robert J. Allen and David Garlan. “A Formal Basis for Architectural Connection.” ACM Transactions on Software Engineering and Methodology, July 1997.


About rsdunlapiv

Computer science PhD student at Georgia Tech
