Chapter 1
Many would argue that future breakthroughs in software productivity
will depend on our ability to combine existing pieces of software
to produce new applications. Less than 15% of the software developed
is innovative (i.e. "unique, novel, and specific to individual
applications") [Jones84]. The large bulk of the code in
our software systems implements solutions to routine design problems,
such as sorting, searching, and manipulating well-known data structures.
In mature engineering disciplines, such as chemical or civil engineering,
routine design problems are rapidly solved by reusing existing
designs [Kogut94a]. The knowledge for routine design is captured
in standardized descriptions, organized in design handbooks, and
shared within the engineering community. Thus, designers are able
to rapidly construct large, high-quality systems out of well-tested,
existing parts.
The potential for reuse certainly exists in the software development
domain. In fact, over the past decade, there has been a very significant
amount of work in the area of software reuse (see [Biggerstaff89,
Krueger92]). However, despite all this effort, software production
is still dominated by build-from-scratch techniques.
This work is broadly motivated by the question: Why is it so
hard to build software applications out of existing components?
One difficulty lies in locating the desired components.
In contrast to more mature disciplines, software engineering is
still lacking comprehensive component libraries and design handbooks.
However, the convergence of the industry on a small number of dominant
designs, together with the easy access to software repositories offered
by technologies such as the Internet [Obraczka93], is already
making the process of finding appropriate components relatively
easy.
But even when the components are at hand, composing them into
new applications often requires so much additional effort that
in many cases designers still decide to build their applications
from scratch.
Problems arise because the chosen parts "do not fit together
well". Designers typically have to either modify the code
of existing components, or write additional coordination software
that bridges mismatches among components. Examples of such mismatches
include:
low-level interoperability mismatches, such as differences
in expected and provided procedure names, parameter orderings,
data types, and calling conventions.
more fundamental architectural mismatches, that is, different assumptions about the structure of the application in which components are to appear. Such mismatches include differences in expected and provided communication protocols, different assumptions about resource ownership and sharing, different assumptions about the presence or absence of particular operating system and hardware capabilities, etc.
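A low-level interoperability mismatch of the first kind can be illustrated with a minimal sketch. All names below (find_substring, search_adapter, contains_keyword) are hypothetical and chosen only for illustration; they do not come from any component discussed in this work. The existing component and the client disagree on procedure name, parameter ordering, and return type, and a small piece of coordination code bridges the three differences without modifying either side.

```python
# Hypothetical existing component: expects (pattern, text) and returns an index.
def find_substring(pattern: str, text: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    return text.find(pattern)


# Hypothetical coordination ("glue") code. The client expects a procedure
# with the shape search(text, pattern) -> bool, so the adapter bridges the
# name, parameter-order, and return-type mismatches.
def search_adapter(text: str, pattern: str) -> bool:
    return find_substring(pattern, text) >= 0


# Hypothetical client, written against the expected interface only.
def contains_keyword(search, document: str, keyword: str) -> bool:
    return search(document, keyword)
```

Even in this trivial case the adapter must be written by hand. Architectural mismatches of the second kind generally cannot be bridged by a wrapper this small.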
Most research efforts to facilitate the process of bridging component
mismatches have focused on limited classes of interoperability
mismatches, such as interface mismatches [Wileden91, Beach92,
Purtilo94] or data type mismatches [Lamb87]. The importance of
architectural mismatches has only recently been highlighted [Garlan95].
To date, there has been no unified framework for describing
the various kinds of component mismatches, nor a systematic set
of rules for dealing with them. Designers still have to rely on
their intuition and experience, and the problem of component composition
is still being confronted in a largely ad-hoc fashion.
The practical objectives of this work are:
to better understand why it is currently difficult to build software
applications by composing existing components, and
to propose a novel software system representation and a design
framework that aim to reduce the effort of integrating software
components into new applications.
This work adopts an architectural perspective on software applications,
viewing them as interdependent collections of software components.
At the architectural level, components and their interdependencies
are represented as two distinct, equally important entities. Software
components represent the core functional pieces of an application
and deal with concepts specific to the application domain. Interdependencies
relate to concepts orthogonal to the problem domain of
most applications, such as transportation and sharing of resources,
and synchronization constraints among components.
However, as design moves closer to implementation, current programming
tools increasingly focus on representing components. At the implementation
level, software systems are sets of source and executable modules
in one or more programming languages. Although modules come under
a variety of names and flavors (procedures, packages, objects,
clusters, etc.) they are all essentially abstractions for components.
By failing to provide separate abstractions for specifying and
implementing interconnection protocols among software components,
current programming languages force programmers to distribute
such protocols among the interdependent components. As a consequence,
code-level components encode (apart from their ostensible function)
fragments of interconnection protocols from their original development
environments. These fragments translate into a set of undocumented
assumptions about their dependencies with the rest of the system.
When attempting to reuse components in new applications, such
assumptions have to be manually identified and modified, in order
to match the new interdependency patterns at the target environment.
This often requires extensive modifications of existing code,
or the development of additional coordination software.
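The entanglement described above can be sketched in a few lines. The example is an illustration under stated assumptions, not code from any system examined in this thesis: the function names are hypothetical, and a word counter stands in for an arbitrary component. In the first variant the core function carries an undocumented assumption that its input and output are streams. In the second, the same function is free of such assumptions and the interconnection protocol is expressed separately.

```python
import io


# Variant A: the core function (counting words) is entangled with a
# fragment of a stream-based interconnection protocol.
def count_words_filter(instream, outstream) -> None:
    total = sum(len(line.split()) for line in instream)
    outstream.write(f"{total}\n")


# Variant B: the core function alone, with no interconnection assumptions.
def count_words(text: str) -> int:
    return len(text.split())


# The interconnection protocol, expressed separately from the component.
def stream_coordinator(component, instream, outstream) -> None:
    outstream.write(f"{component(instream.read())}\n")


# Both variants produce the same result in a stream-based environment.
out_a, out_b = io.StringIO(), io.StringIO()
count_words_filter(io.StringIO("a b\nc\n"), out_a)
stream_coordinator(count_words, io.StringIO("a b\nc\n"), out_b)
```

Reusing Variant A in an environment without streams requires editing its body, whereas Variant B only needs a different coordinator.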
This thesis argues that many of the practical difficulties associated
with the above process are due to our current failure to recognize
the problem of component interconnection as a separate design
problem, orthogonal to the problem of implementing a component's
core functionality. We need to develop programming languages and
tools with support for abstractions that localize and separate
the definition of interconnection relationships from that of the
interdependent components. Furthermore, we need to understand
the patterns of component interdependencies encountered in software
systems, and develop frameworks that guide designers in selecting
appropriate coordination processes for managing them.
If we can satisfy these requirements, we can develop improved
processes for component-based application development. Such processes
will center around architectural descriptions of applications,
where software activities and their interdependency patterns will
be explicitly represented by distinct entities. They will offer
the following set of practical benefits:
Reuse of code-level components. Designers will be able
to generate new applications simply by selecting existing components
to implement activities, and coordination processes to manage
dependencies, independently of one another. The development of
successful frameworks for mapping dependencies to coordination
processes will reduce the step of managing dependencies to a routine
one, and enable it to be assisted, or even automated, by design
tools. Overall, the objective is to be able to generate new applications
with minimal or no need for user-written coordination software.
Reuse of software architectures. Designers will be able
to reuse the same architectural description, in order to reconstruct
applications after one or more activities have been replaced by
alternative implementations. They will simply have to semi-automatically
re-manage the dependencies of the affected activities with the
rest of the system. Furthermore, they will be able to reuse the
same set of components in different execution environments by
managing the dependencies of the same architectural description
using different coordination processes, appropriate for each environment.
Insight into software organization alternatives. The description
of software applications as sets of components interconnected
through abstract dependencies will help designers structure their
thinking about how to best integrate the components together.
A design space of coordination processes for managing interdependency
patterns will assist them in exploring alternative ways of organizing
the same set of components, in order to select the one which exhibits
optimal design properties.
The main body of the thesis proposes a concrete implementation
of the previous requirements and demonstrates how they can be
integrated into a methodology for component-based application
development. The following are the principal deliverables of this
research:
SYNOPSIS: A software architecture description language
SYNOPSIS supports graphical descriptions of software application
architectures at both the specification and the implementation
level. It provides separate language entities for representing
software activities and dependencies. Activities
represent the main functional pieces of an application, while
dependencies describe their interconnection relationships. An
important attribute of dependencies is their coordination process.
Coordination processes represent interconnection protocols that
manage the relationships and constraints specified by their
associated dependency. SYNOPSIS language elements are connected
together through ports. Ports provide a general mechanism
for representing abstract component interfaces. All elements of
the language can contain an arbitrary number of attributes. Attributes
encode additional properties of the element, as well as compatibility
criteria that constrain its connection to other elements.
SYNOPSIS provides two mechanisms for abstraction: Decomposition
allows new entities to be defined as patterns of simpler ones.
It enables the naming, storage, and reuse of designs at the architectural
level. Specialization allows new entities to be defined
as variations of other existing entities. Specialized entities
inherit the decomposition and attributes of their parents and
can differentiate themselves by modifying any of those elements.
Specialization enables the incremental generation of new designs
from existing ones, as well as the organization of related designs
in concise hierarchies. Finally, it enables the representation
of reusable software architectures at various levels of abstraction
(from very generic to very specific).
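SYNOPSIS itself is a graphical language, so the following is only a rough textual analogy of the concepts just described, not SYNOPSIS syntax. The class names, field names, and the example entities ("flow", "pipe flow", "UNIX pipe") are assumptions introduced for illustration. The sketch models entities with attributes, ports, and a decomposition, and shows specialization as copying a parent's elements and then overriding some of them.

```python
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass
class Entity:
    """Hypothetical base for all language elements."""
    name: str
    attributes: dict = field(default_factory=dict)
    ports: list = field(default_factory=list)
    decomposition: list = field(default_factory=list)
    parent: Optional["Entity"] = None

    def specialize(self, name, **attribute_overrides):
        """Return a variation of this entity that inherits its elements
        and overrides the given attributes."""
        return replace(
            self,
            name=name,
            attributes={**self.attributes, **attribute_overrides},
            ports=list(self.ports),
            decomposition=list(self.decomposition),
            parent=self,
        )


@dataclass
class Activity(Entity):
    """A functional piece of an application."""


@dataclass
class Dependency(Entity):
    """An interconnection relationship, optionally with a coordination process."""
    coordination_process: Optional[str] = None


# A generic dependency and a more specific variation of it.
flow = Dependency("flow", attributes={"resource": "any"},
                  ports=["producer", "consumer"])
pipe_flow = flow.specialize("pipe flow", resource="byte stream")
pipe_flow.coordination_process = "UNIX pipe"
```

The parent link is what would allow related designs to be organized into a hierarchy, from generic entries such as flow down to specific ones such as pipe_flow.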
A design handbook of software component interconnection
To assist the design task of specifying application interdependencies,
as well as the design of corresponding coordination processes,
this work proposes a standardized, but extensible, vocabulary
of dependency types and a design space of associated coordination
processes. The vocabulary is based on the observation that most
software interconnection relationships can be specified using
a relatively narrow set of concepts orthogonal to the problem
domain of most applications, such as resource flows, resource
sharing, and timing dependencies. The implementation of interconnection
protocols involves a similarly orthogonal set of coordination
concepts, such as shared events, invocation mechanisms, and communication
protocols. The development of an application-independent framework
that captures the most useful patterns of interdependencies and
the ways of managing them can form the basis for a design
handbook for integrating software components. The development
of such a handbook aims to reduce the specification and implementation
of software component interdependencies to a routine design problem,
capable of being assisted, or even automated, by computer tools.
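The idea of a vocabulary of dependency types paired with a design space of coordination processes can be pictured as a simple lookup. The table below is purely illustrative: its entries are assumptions loosely based on the concepts named in this section and do not reproduce the contents of the handbook. The function name candidates and the notion of an "environment" set are likewise hypothetical.

```python
# Hypothetical, illustrative design space: each dependency type maps to
# candidate coordination processes. The entries are assumptions.
DESIGN_SPACE = {
    "resource flow": ["procedure call", "pipe", "shared file"],
    "resource sharing": ["mutual exclusion lock", "replication"],
    "timing": ["shared event", "polling"],
}


def candidates(dependency_type: str, environment: set) -> list:
    """Return the coordination processes listed for a dependency type that
    the target environment supports, in table order."""
    return [
        process
        for process in DESIGN_SPACE.get(dependency_type, [])
        if process in environment
    ]


# Example: an environment that offers pipes and shared files only.
options = candidates("resource flow", {"pipe", "shared file"})
```

Once the choices for a dependency are enumerated in this way, selecting among them becomes a routine step that a tool can assist with.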
SYNTHESIS: A component-based application development environment
SYNTHESIS provides an integrated environment for developing software
applications by combining existing components. The system provides
support for:
Creating and editing software architectural diagrams written in
the SYNOPSIS language.
Maintaining repositories of SYNOPSIS entities (activities, dependencies,
coordination processes, ports), organized as specialization hierarchies.
Assisting, and in some cases automating, the process of generating
executable applications by successive transformations of SYNOPSIS
architectural diagrams.
The prototype implementation of SYNTHESIS that has been developed
for this thesis contains a version of our component interconnection
handbook encoded as a SYNOPSIS specialization hierarchy of dependency
types. Using this repository, the SYNTHESIS design assistant is
able to semi-automate the process of generating applications from
their SYNOPSIS diagrams, by managing dependencies with coordination
processes stored in the repository. This results in minimal, and
often no, need for user-written coordination software to "glue"
the components together.
In order to demonstrate the thesis brought forth in Section 1.2.2,
we need to provide positive evidence about:
the feasibility of describing applications as collections
of orthogonal subcomponents, representing core activities and
interdependencies
the usefulness of such a representation in facilitating
both code-level and architectural software reuse, as well as in
providing insight into software component organization alternatives.
The feasibility and usefulness of our claims have been demonstrated
by building a prototype implementation of SYNTHESIS, and using
it to perform a set of four experiments. Each experiment consisted
of:
describing a test application as a SYNOPSIS diagram
selecting a set of components exhibiting various mismatches to
implement activities
using SYNTHESIS and its repository of dependencies in order to
integrate the selected components into an executable system
exploring alternative executable implementations based on the
same set of components
The test applications include:
a File Viewer system, which integrates heterogeneous components
written in C and Visual Basic.
a Key Word In Context index system, built by assembling components
with mismatching architectural assumptions (UNIX filters and servers).
an Interactive TeX system, which integrates the components of the TeX
document typesetting system in a WYSIWYG (what-you-see-is-what-you-get)
ensemble.
a Collaborative Editor architecture, which extends the functionality
of existing single user editors with group editing capabilities.
This experience has demonstrated that the proposed architectural
ontology and vocabulary of dependencies were capable of accurately
and completely expressing the architecture of all four test applications.
Furthermore, it has provided positive evidence for the principal
practical claims of the approach. The evidence can be summarized
as follows:
Support for code-level software reuse: SYNTHESIS was able
to resolve a wide range of interoperability and architectural
mismatches and successfully integrate independently developed
components into all four test applications, with minimal or no
need for user-written coordination software.
Support for reuse of software architectures: SYNTHESIS
was able to reuse a configuration-independent SYNOPSIS description
of a collaborative editor and the source code of an existing single
user editor, in order to generate collaborative editor executables
for two different execution environments (UNIX and Windows).
Insight into alternative software architectures: SYNTHESIS
was able to suggest a variety of alternative overall architectures
for integrating each test set of code-level components into its
corresponding application, thus helping designers explore alternative
designs.
Both the problem of representing software application architectures,
and that of developing applications from existing components have
received attention in a variety of different research areas. This
section briefly discusses the three research areas that are most
closely related to this work. Chapter 7 is devoted to a detailed
discussion of other related research.
Coordination Theory
Coordination theory [Malone94] focuses on the interdisciplinary
study of coordination. Research in this area uses and extends
ideas about coordination from disciplines such as computer science,
organization theory, operations research, economics, linguistics,
and psychology. It defines coordination as the process of managing
dependencies among activities. Its research agenda includes characterizing
different kinds of dependencies and identifying the coordination
processes that can be used to manage them.
This work is directly related to coordination theory, in that
it views the process of developing applications as one of specifying
architectures in which patterns of dependencies among software
activities are eventually managed by coordination processes.
It makes two principal contributions to coordination theory:
The development of SYNOPSIS, an architectural language that is
able to support coordination theory research, with distinct abstractions
for activities and dependencies. Although developed for describing
software applications, SYNOPSIS can be used to describe other
complex systems as well, such as business processes.
The definition of a vocabulary of dependency types and associated
coordination processes for the domain of software systems.
This project grew out of the Process Handbook project [Malone93,
Dellarocas94] which applies the ideas of coordination theory to
the representation and design of business processes. The goal
of the Process Handbook project is to provide a firmer theoretical
and empirical foundation for such tasks as enterprise modeling,
enterprise integration, and process re-engineering. The project
includes (1) collecting examples of how different organizations
perform similar processes, and (2) representing these examples
in an on-line "Process Handbook" which includes the
relative advantages of the alternatives.
The Process Handbook relies on a representation of business processes
that distinguishes between activities and dependencies and supports
entity specialization. It builds repositories of alternative ways
of performing specific business functions, represented at various
levels of abstraction. SYNOPSIS has borrowed the ideas of separating
activities from dependencies and the notion of entity specialization
from the Process Handbook. It is especially concerned with (1)
refining the process representation so that it can describe software
applications at a level precise enough for code generation to
take place, and (2) populating repositories of dependencies and
coordination processes for the specialized domain of software
component integration.
Architecture Description Languages
Architecture Description Languages (ADLs) provide support for
representing software systems in terms of their components and
their interconnections [Kogut94b, Shaw94b]. They are used to model
domain-specific software architectures so that new application
systems can be built from existing solutions. They typically provide
separate abstractions for representing components and their interconnections.
SYNOPSIS shares many of the goals and principles of ADLs. However,
whereas previously proposed architectural languages only provide
support for implementation-level connector abstractions (such
as a pipe, or a client/server protocol), SYNOPSIS is the first
language which also supports specification-level abstractions
for encoding interconnection relationships (dependencies). It
is also the first system that proposes a design framework for
describing the most common interconnection relationships encountered
in software systems, and for associating each relationship to
appropriate implementations (coordination processes).
Operating, Concurrent, and Distributed System Design
Operating system research is concerned with developing algorithms
for the allocation of system resources and the management of interdependencies
among user and system processes. The study of concurrent and distributed
systems is concerned with essentially the same problems in the
special domains of systems with multiple processors or physically
distributed components [Andrews91, Bacon93].
This work provides a unifying framework for organizing the techniques
and algorithms developed in those areas, relates them to dependency
patterns, and uses them to populate the design space of coordination
processes for software systems.
1.4 Organization of the Thesis
The rest of the thesis is organized as follows:
Chapter 2 contains a detailed exposition of the underlying thesis
of this work. It explains why current technologies for reusing
software components in new applications often require significant
additional design effort to resolve component mismatches. It argues
for treating component interconnection as a separate design problem,
orthogonal to the problem of implementing a component's
core function. It outlines a set of practical requirements to
achieve this separation and a set of practical benefits that are
expected to result from it.
Chapter 3 is devoted to a description of SYNOPSIS, a software
architecture description language. Its main distinguishing feature
is a clear separation of an application's core functional
pieces from their interconnection relationships.
Chapter 4 introduces a vocabulary of dependencies for describing
software component interconnection needs. It also describes a
design space of associated coordination processes. The vocabulary
and design space can form the basis of a design handbook of software
component interconnection that facilitates the representation
and solution of component interconnection problems.
Chapter 5 describes an algorithm for generating executable applications
by successive transformations of their SYNOPSIS architectures.
The algorithm forms the basis of the SYNTHESIS design assistant.
Chapter 6 describes the prototype implementation of SYNTHESIS.
It also contains a detailed discussion of four experiments which
demonstrate the feasibility and usefulness of the ideas proposed
in this thesis.
Chapter 7 contains a detailed discussion of related research.
Finally, Chapter 8 summarizes the results of this work, distills
some lessons learned, and discusses some ideas for future research.