Flexibly-Packaged Climate Model Components
In a previous post I pointed out the distinction between infrastructure code and scientific code in coupled models. I admit that the line can be blurry, but one way of making the distinction is that infrastructure code can be a target of software reuse, while the scientific code is highly specific to the particular physical processes being modeled. The advent of reusable coupling infrastructure is a good thing (e.g., the OASIS coupler, the Model Coupling Toolkit, and the Earth System Modeling Framework), and the proliferation of these technologies in the major climate models shows that there are multiple ways to implement the coupling infrastructure in a scientific model. Coupling infrastructures are strongly linked to software architectures; that is, each of the major coupling technologies embodies certain architectural choices. The result is that we see a large amount of architectural variability in climate models at the software level.
While diversity at the scientific level is generally seen as a good thing, architectural variability is not necessarily desirable because it impedes interoperability among scientific components. Work has been done in the software engineering community on dealing with architectural mismatch. I have recently applied some of those concepts to climate model components to see what it would take to separate the infrastructure code from the scientific code. In its present state, my approach is heavily inspired by DeLine's notion of Flexible Packaging, although it is already clear that some adaptations must be made for the approach to be viable in a high-performance setting. Very briefly, Flexible Packaging is based on the idea of separating a software component's "ware" from its "packaging." Informally, the "ware" is the functionality of the component, i.e., what it does. The "packaging" is the way that the component interacts with the outside world (e.g., as an ActiveX component, a browser plugin, a standalone app, etc.). DeLine argues that packaging mismatch is a problem that arises because engineering decisions are made too early. Most software components today are pre-packaged in a certain way, and changing the packaging is not feasible in most cases. "Flexible" packaging means that a given ware (piece of functionality) can be packaged in multiple ways, and moreover, that the decision about which packaging to use can be delayed until shortly before the software is deployed. For more details, see the DeLine paper cited below.
Before getting into the details, a few words about motivation. Why think about climate modeling components in terms of a ware and a packaging? Here is what I’m thinking:
- The first advantage is the obvious one: interoperability. This is the primary reason given for DeLine's Flexible Packaging work. Interoperability in the climate modeling context means the ability of a diverse set of scientific components to "talk" to each other because they share the same infrastructure layer. Architectural mismatch of climate modeling components is an artificial barrier to scientific progress.
- The ware/packaging dichotomy forces a separation of concerns. That is, all the code in a component needs to go into one of these two boxes. This is hard because of the intimate connection between the science and the infrastructure code. For example, the order in which the component models are called is a scientific choice (because it affects the numerics and the output), but the mechanism for actually making the call is an implementation decision. It is not clear how to separate these two so that decisions about them can be made independently.
- The separation of packaging (which is largely an architectural artifact) from functionality enables studying the impact of different architectures on software qualities, such as modularity and performance, because the same scientific code can be "outfitted" with different architectures.
What would a flexibly packaged climate model component look like? A first step is to identify the "ware" and the "packaging" of a climate modeling component. One way of thinking about this is: ware = science code, packaging = coupling infrastructure.
The DeLine work is part of a group of related work in which separation of concerns is achieved via the use of separate processes–that is, each concern is coded in a separate process (which may be implemented as a thread) and the processes communicate when there is a data dependency between them. In the DeLine work, the “ware” process and the “packaging” process each exhibit “channel signatures” (described using CSP notation) that describe the required communication protocol. Wares and packagers must have compatible channel signatures in order to be compiled together into a working component.
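To make the thread-and-channel idea concrete, here is a minimal sketch, in Python for brevity. The `fp_out`/`fp_in` helpers and the channel names are purely illustrative stand-ins for the channel abstraction described above, not DeLine's API or the actual Fortran implementation:

```python
# Illustrative sketch: a "ware" thread and a "packaging" thread, each a
# separate flow of control, synchronizing on named blocking channels.
import threading
import queue

channels = {}  # channel name -> queue; one blocking channel per named datum

def fp_out(name, value):
    # Send a value on a named channel (creating the channel on first use).
    channels.setdefault(name, queue.Queue()).put(value)

def fp_in(name):
    # Receive from a named channel, blocking until the other side sends.
    return channels.setdefault(name, queue.Queue()).get()

def ware_init():
    # The "ware": pure functionality, ignorant of any framework.
    fp_out("rank", 2)
    fp_out("minIndex", [1, 1])

def packaging_init():
    # The "packaging": mediates between the ware and the outside world.
    t = threading.Thread(target=ware_init)
    t.start()
    rank = fp_in("rank")          # rendezvous with the ware
    min_index = fp_in("minIndex")
    t.join()
    return rank, min_index

print(packaging_init())  # -> (2, [1, 1])
```

The point of the sketch is that the two concerns meet only at the channel operations; neither function calls the other directly.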
I want to get right down to some nitty-gritty details to show one way that a flexibly-packaged climate model component might be implemented. There are two Fortran modules below: simple_ware and simple_pack. Taking a look at the ware, you will notice that it is a standard Fortran module with two public subroutines, init() and run(). Note the subroutine calls of the form fp_out_*() and fp_in_*(). These calls are communication channels between the ware and the packager: the fp_out_*() calls send data and the fp_in_*() calls receive data. They are rudimentary versions of the channel abstraction in the DeLine work.
```fortran
module simple_ware
  implicit none
  private
  public init, run
contains

  subroutine init()
    print *, "Inside ware init()"
    call fp_out_int("rank", 2)
    call fp_out_int_array("minIndex", (/1,1/), 2)
    call fp_out_int_array("maxIndex", (/100,150/), 2)
    call fp_out_int_array("regDecomp", (/1,1/), 2)
    print *, "Leaving ware init()"
  end subroutine

  subroutine run()
    real(8) :: pi
    real(8), pointer :: farrayPtr(:,:)
    integer :: i, j
    integer :: arraySize, dim1, dim2

    print *, "Inside ware run()"
    call fp_in_int("arraySize", arraySize)
    call fp_in_int("dim1", dim1)
    call fp_in_int("dim2", dim2)
    allocate(farrayPtr(dim1,dim2))
    pi = 3.14159d0

    ! Fill source Array with data
    do j = lbound(farrayPtr, 2), ubound(farrayPtr, 2)
      do i = lbound(farrayPtr, 1), ubound(farrayPtr, 1)
        farrayPtr(i,j) = 10.0d0                          &
          + 5.0d0 * sin(real(i,8)/100.d0*pi)             &
          + 2.0d0 * sin(real(j,8)/150.d0*pi)
      enddo
    enddo
    call fp_out_double_array("arrayData", farrayPtr, arraySize)
  end subroutine

end module simple_ware
```
Below is another Fortran module that serves as a packaging for the above ware. This packaging is ESMF-based. (ESMF is one particular coupling infrastructure.) The packaging has the customary interface for an ESMF component, including register, init, run, and final subroutines. It also contains calls to fp_in_*() and fp_out_*() subroutines for communicating with the ware.
How do the ware and the packaging interact? First, notice that the packaging references the ware's init and run subroutines (see the use statement near the top of the packaging module). Ideally, these references would be added automatically later (by a code generator) so that the packaging does not reference any particular ware, but for now I have hard-coded a direct link to one particular ware. Also notice that the packaging's init and run subroutines contain calls to pthread_create_f(...). When an external component calls the packaging's init and run subroutines, the corresponding init and run subroutines from the ware are invoked in separate threads. The packaging and ware threads then run concurrently, synchronizing on the fp_in_*() and fp_out_*() communication calls.
```fortran
module simple_pack
  use ESMF_Mod
  use simple_ware, only: ware_init=>init, ware_run=>run
  implicit none
  public init, run, final, register
contains

  subroutine register(comp, rc)
    type(ESMF_GridComp) :: comp
    integer, intent(out) :: rc

    ! Initialize return code
    rc = ESMF_SUCCESS
    print *, "User Comp1 Register starting"

    ! Register the callback routines.
    call ESMF_GridCompSetEntryPoint(comp, ESMF_SETINIT, userRoutine=init, rc=rc)
    call ESMF_GridCompSetEntryPoint(comp, ESMF_SETRUN, userRoutine=run, rc=rc)
    call ESMF_GridCompSetEntryPoint(comp, ESMF_SETFINAL, userRoutine=final, rc=rc)
  end subroutine

  subroutine init(comp, importState, exportState, clock, rc)
    type(ESMF_GridComp) :: comp
    type(ESMF_State) :: importState, exportState
    type(ESMF_Clock) :: clock
    integer, intent(out) :: rc

    type(ESMF_ArraySpec) :: arrayspec
    type(ESMF_DistGrid) :: distgrid
    type(ESMF_Array) :: array
    type(ESMF_VM) :: vm
    integer :: petCount
    integer :: rank
    integer, dimension(2) :: minIndex, maxIndex, regDecomp
    integer :: theThread

    print *, "Inside simple_pack init()"
    call fp_init()
    call pthread_create_f(theThread, ware_init)

    ! Determine petCount
    call ESMF_GridCompGet(comp, vm=vm, rc=rc)
    call ESMF_VMGet(vm, petCount=petCount, rc=rc)

    call fp_in_int("rank", rank)
    call fp_in_int_array("minIndex", minIndex, 2)
    call fp_in_int_array("maxIndex", maxIndex, 2)
    call fp_in_int_array("regDecomp", regDecomp, 2)

    call ESMF_ArraySpecSet(arrayspec, typekind=ESMF_TYPEKIND_R8, rank=rank, rc=rc)
    distgrid = ESMF_DistGridCreate(minIndex=minIndex, maxIndex=maxIndex, &
                                   regDecomp=regDecomp, rc=rc)
    array = ESMF_ArrayCreate(arrayspec=arrayspec, distgrid=distgrid, &
                             indexflag=ESMF_INDEX_GLOBAL, rc=rc)
    call ESMF_ArraySet(array, name="array data", rc=rc)
    call ESMF_StateAdd(exportState, array, rc=rc)
    rc = ESMF_SUCCESS

    call pthread_join_f(theThread)
    print *, "Leaving pack init()"
  end subroutine

  subroutine run(comp, importState, exportState, clock, rc)
    type(ESMF_GridComp) :: comp
    type(ESMF_State) :: importState, exportState
    type(ESMF_Clock) :: clock
    integer, intent(out) :: rc

    type(ESMF_Array) :: array
    real(8), pointer :: farrayPtr(:,:)  ! matching F90 array pointer
    integer :: i
    integer :: theThread
    integer :: arraySize

    print *, "Inside simple_pack run()"
    call pthread_create_f(theThread, ware_run)
    print *, "User Comp1 Run starting"

    ! Get the source Array from the export State
    call ESMF_StateGet(exportState, "array data", array, rc=rc)

    ! Gain access to actual data via F90 array pointer
    call ESMF_ArrayGet(array, localDe=0, farrayPtr=farrayPtr, rc=rc)

    arraySize = (ubound(farrayPtr, 2) - lbound(farrayPtr, 2) + 1) &
              * (ubound(farrayPtr, 1) - lbound(farrayPtr, 1) + 1)
    call fp_out_int("arraySize", arraySize)
    call fp_out_int("dim1", ubound(farrayPtr, 1) - lbound(farrayPtr, 1) + 1)
    call fp_out_int("dim2", ubound(farrayPtr, 2) - lbound(farrayPtr, 2) + 1)
    call fp_in_double_array("arrayData", farrayPtr, arraySize)

    call pthread_join_f(theThread)
    print *, "User Comp1 Run returning"
  end subroutine

  subroutine final(comp, importState, exportState, clock, rc)
    type(ESMF_GridComp) :: comp
    type(ESMF_State) :: importState, exportState
    type(ESMF_Clock) :: clock
    integer, intent(out) :: rc

    type(ESMF_DistGrid) :: distgrid
    type(ESMF_Array) :: array

    print *, "Inside simple_pack final()"
    rc = ESMF_SUCCESS
    call ESMF_StateGet(exportState, "array data", array, rc=rc)
    call ESMF_ArrayGet(array, distgrid=distgrid, rc=rc)
    call ESMF_ArrayDestroy(array, rc=rc)
    call ESMF_DistGridDestroy(distgrid, rc=rc)
    call fp_final()
  end subroutine

end module simple_pack
```
Okay, given the example code above, you should notice a few things.
First, notice that there are no references to ESMF anywhere in the "ware" module. The ware is a framework-agnostic module (well, almost; more on that below) that must be paired with a packaging in order to interact with the outside world. Of course, the output calls from the ware must match input calls in the packager (and vice-versa). This is formalized in the DeLine work by channel signature matching, but the basic idea is that the ware must provide what the packager needs and vice-versa.
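As a rough illustration of what signature matching might look like, here is a sketch in Python. This is a drastic simplification of DeLine's CSP-based channel signatures, and the data representation is entirely hypothetical: a signature is modeled as an ordered list of (direction, channel) events, and a ware and a packaging are compatible when their events are pairwise complementary:

```python
# Hypothetical signature representation: an ordered list of
# (direction, channel-name) events, one per fp_in/fp_out call.
WARE_INIT_SIG = [("out", "rank"), ("out", "minIndex"), ("out", "maxIndex")]
PACK_INIT_SIG = [("in", "rank"), ("in", "minIndex"), ("in", "maxIndex")]

def compatible(ware_sig, pack_sig):
    """True when the two signatures are complementary, event by event:
    every send in one lines up with a receive of the same channel in
    the other."""
    flip = {"in": "out", "out": "in"}
    return len(ware_sig) == len(pack_sig) and all(
        (flip[d1], c1) == (d2, c2)
        for (d1, c1), (d2, c2) in zip(ware_sig, pack_sig))

print(compatible(WARE_INIT_SIG, PACK_INIT_SIG))  # -> True
print(compatible(WARE_INIT_SIG, WARE_INIT_SIG))  # -> False (two senders)
```

Real CSP signatures also express choice and repetition, which a flat event list cannot; the sketch only captures the "provide what the other side needs" intuition.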
Also notice that the ware is shorter than the packaging. This is because the example ware doesn't really do anything interesting except fill an array with some dummy data (the nested loops in run()). A "real" component would have actual science code there.
Currently, the fp_in_*() and fp_out_*() routines are based on data copies. This introduces a significant overhead if the array sizes are large and/or if the number of iterations is large (both of which are true for a legitimate model). A smarter implementation would use pointer manipulation so that the ware and the packaging reference the same memory addresses. This is possible because the two are implemented as threads in a shared address space.
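The pointer-based alternative can be sketched as follows (again in Python, purely illustrative): the packaging sends the ware a reference to the framework-owned buffer, the ware fills it in place, and only a small completion token crosses the channel, so no array data is ever copied:

```python
# Illustrative zero-copy sketch: share a reference through the channel
# instead of copying array contents.
import threading
import queue

to_ware = queue.Queue()  # packaging -> ware
to_pack = queue.Queue()  # ware -> packaging

def ware_run():
    buf = to_ware.get()          # receive a reference, not a copy
    for i in range(len(buf)):
        buf[i] = 10.0 + i        # "science" writes straight into the
                                 # packaging's memory
    to_pack.put("done")          # completion signal; no data moved

framework_buffer = [0.0] * 4     # stands in for the framework-owned array
t = threading.Thread(target=ware_run)
t.start()
to_ware.put(framework_buffer)    # share the pointer, not the data
to_pack.get()                    # wait for the ware to finish
t.join()
print(framework_buffer)          # -> [10.0, 11.0, 12.0, 13.0]
```

This works only because both threads live in one address space; a process-based or distributed implementation of the channels would force copies (or shared memory segments) back into the picture.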
Also, remember I mentioned that the ware is "almost" framework-agnostic. Even though there is no ESMF code inside the ware (e.g., ESMF_Mod does not have to be imported), something more subtle is going on: the ware still has init and run subroutines, which are derived from the ESMF interface standard. So, using this ware with a non-ESMF packaging still requires some explicit knowledge about when to invoke the init and run threads in the ware. It is not clear how to make the ware truly architecture-neutral; indeed, the ware will have to adhere to some architecture. The question is, which architecture is best suited for pairing with different kinds of packagings?
DeLine, Robert. "Avoiding Packaging Mismatch with Flexible Packaging." IEEE Transactions on Software Engineering, 27(2), February 2001.