Software Design
Contents
Introduction
This paper presents the principles underlying software design and the major design practices.
Software design varies in formality according to an organization's culture and the size of a project. Sometimes the design work is very informal, and is done at the keyboard as the program code is typed in (McConnell, 1993). In this case the source code serves as the design. Other times, the design work is part of a more extensive software engineering process, where other activities may support the design work (e.g. requirements analysis, design inspections, etc.). In this case the design is written down using a design notation, possibly with a specialized program, and is usually put under version control.
In all cases software design is a "problem-solving task" and a "highly-creative process" (Budgen, 2003, p. 18, 32). Budgen makes a case that design is not an analytical process, and that it is not like the scientific approach to problem solving (ibid. p. 32). Consequently there may be several acceptable design solutions (ibid. p. 19).
On the other hand, scientific methods applied to physical processes can reduce a problem to smaller parts, and an analysis of the accumulated data will show a convergence that points to a single solution (ibid. p. 20). Software design rarely converges; in fact, the opposite is typical. This is because software design problems have the characteristics of "wicked" problems (ibid. p. 19). Some characteristics of 'wicked' problems are:
- A solution tends to change the problem
- No definitive problem description exists
- There are no definitive problem solution criteria
- No test of the solution exists
- The only solutions possible are good or bad, not correct or incorrect
Understanding that several acceptable design solutions may exist, the goal of design becomes a goal to find the best solution. The best solution for a software design is the least complex solution (Yourdon and Constantine, 1979). Such a design minimizes the greatest cost factors of software development: implementation, maintenance, and modification. Finding the "best solution" requires a way to quantify quality in design: i.e. we need the attributes of quality and a means of assessing them (Budgen p. 75).
Budgen argues that the underpinning principles of design quality have changed little since their introduction during the 1970's (ibid. p. 445). However, the pace of change in software technologies increased and the demand for faster delivery put pressure on software professionals to find faster ways of doing things. Budgen found this to be a major influence in the rise of agile methods (ibid. p. 443). Another influence on design practice is the idea of market strategy, in which competition for market space and other non-technical factors influence a design. He concludes, however, that the most important new idea in software design is the emergence of architectural style (ibid. p. 446).
Each design methodology includes a set of graphical symbols for representing and archiving design data. There is no theoretical model that describes design notations, nor are there any "empirical investigations" that can be used to determine when to use a particular form or how a particular form should be used (ibid. p. 445). Even the relatively new Unified Modeling Language (UML), which has wide support, leaves plenty of ambiguity in a design: you cannot guarantee that two programmers will produce the same code from a UML diagram.
Overview
This paper attempts to show what software design is and how it is done. We start by stating that design is simply "a problem-solving task" and that a software design describes how requirements (e.g. in a requirements specification) are to be met (ibid. p. 18).
Probably the best description of design as it applies to software is found in Budgen (ibid.). He assembles work from both inside and outside the software industry to show how design is used to solve problems, and how software design differs from design in other engineering disciplines.
Budgen does a good job of building a case showing how software design is different and more difficult than design in other engineering disciplines, which he summarized into four main characteristics (ibid. p. 16):
- The degree of specialization in software is lower than in other fields
- Software designers usually need to acquire the necessary domain knowledge
- A software design represents a process, which requires elements that describe behavior and functions.
- A software design cannot be expressed by one drawing, there must be several viewpoints.
Clearly, we need tools that can improve the situation. Budgen argues that the most practicable tools available are (ibid. p. 18):
- Design methods and patterns - Budgen shows that when domain knowledge is lacking, these are the two options (other than experience) to acquire it.
- Representations (e.g. models, prototypes)
- Abstraction (i.e. a logical model such as one created in a UML diagram)
Missing from the list is the CASE tool. While CASE tools may have the capability to check a design for "consistency and completeness," these tools are not yet capable of assessing the degree to which a design addresses the requirements of a system (ibid. p. 447).
Design Models
Design models have a fundamental role in software design (ibid. p. 23, paraphrased):
- They allow the designer to predict a system's behavior under different conditions
- They allow the design to be expanded in a systematic way towards a detailed, complete design
- They help in the transfer of knowledge
- Practices for their description, construction, and elaboration are fundamental to software design.
There are two design models. Each design method belongs to one of the two models:
- Graphical, a.k.a. systematic methods
- Mathematical, a.k.a. formal methods
Design Methods
A design method - whether graphical or mathematical - consists of two major components (ibid. p. 34):
- A representation - the symbols and diagrams that model and describe the structure of a solution. A representation usually includes both concrete and abstract properties of design objects
- A process - the steps or procedures required to implement a design. A software design process derives from theory, principles, and heuristics.
In addition to these two major components, most design methods include a "set of heuristics" that gude process activities for specific types of problems (ibid. p. 34).
Representations are useful in their ability to convey domain knowledge by illustrating concepts such as information flow and operations. Software design processes are useful in their ability to guide tasks (despite a lack of theoretical underpinnings that explain why this is). Such tasks include:
- Identification of the design actions to be performed
- Use of the representation forms
- Procedures for making transformations between representations
- Quality measures to aid in making choices
- Identification of particular constraints to be considered
- Verification/validation operations
Transfer of Knowledge
One of the main benefits of design representations is that they serve as a tool for knowledge transfer. Other uses for design documents are 1) they record design decisions and (ideally) the rationale behind them; 2) they serve as a reference during construction and maintenance. It is typical that software designs are written down for large, more formal projects but not for most smaller projects. This is the source of many problems.
When the design is not available in hard copy or electronic form, there is no vehicle for knowledge transfer. When new employees have questions about an application's design, they must rely on current employees for answers. There is a tendency to lose track of design decisions when they are not written down. This increases maintenance cost. When a system's design is not available, testing is hampered because there is no document specifying functionality that should be tested.
Recording of design decisions usually does not (but should) include the rationale behind the decision (ibid. p. 39). Budgen notes that the design audit (a.k.a. design inspection) encourages recording of design rationale.
Anticipating Change
Systems that evolve require more forethought in the design stage. Budgen cites the work of Lehman and Ramil (2002) who found that there are generally two kinds of systems (ibid. p. 58):
- Those that evolve - these are used to "mechanise a human or societal activity."
- Those that are static - these meet some "formal specification."
Even formal specifications can change from time to time. Parnas (1979) argues that a design should be based on the assumption that there will be a need to make changes to the product sometime.
Software Designers
A single person may be responsible for a design, or a team may share the responsibility. What does it take for a person or team to be successful, or even exceptional, at design? Budgen cites work by Curtis et. al. (1988) who found that exceptional designers possess three characteristics (ibid. p. 32):
- They are familiar with the application domain, which allows them to see solutions for problems more easily.
- They are good at communicating their technical vision, often by simply interacting.
- They identify with project performance.
One skill that is often missing in exceptional designers: good programming skills.
When design is performed by a team, coordination of design tasks and dividing the product are additional complications (Budgen p. 40). There are two kinds of team organization:
- Hierarchical - a chief designer makes the final decision on all design questions. Other team members perform a supporting role and may have specialized roles. This approach is seldom taken because of the lack of capable designers to fill the chief role.
- Non-hierarchical ("egoless peers") - different team members take the lead when the need for specific specialization arises.
Research in the psychology of teams has found:
- Productivity begins to suffer when team size grows above 10-12 members.
- Team members who possess the greatest domain knowledge exert more influence.
- There are organizational issues within a company that impact the quality of the design.
Elements of Design Quality
In order to measure design quality we need to know what the quality factors are and we need methods to measure them. Budgen described a group of quality factors (ibid. p. 70):
- reliability
- efficiency
- maintainability
- usability
- testability
Measures of design quality, to which the above quality factors are applied, include:
- Simplicity
- Modularity - coupling and cohesion
- Information hiding
Simplicity
Simplicity of design means it includes nothing more than what is essential. Simplicity is a measure of maintainability and testability, and to a lesser degree reliability and possibly efficiency (ibid. p. 76). Design simplicity is usually described in terms of complexity:
- control flow - the number of paths of code execution in a program unit (i.e. a block of code, a function, etc.)
- information flow - how complex the structure is with respect to information flow
- comprehension - indicated by the number of different identifiers and operators.
- complexity of relationships between structural elements
- relationships between system elements - another kind of structural complexity that is usually encountered in object-oriented forms.
These ideas are examined more thoroughly in other work. For example, Yourdon and Constantine (1979) describe factors of complexity, what they call "complexity in human terms" (p. 73): 1) size of a module; 2) number of decision-making statements; 3) span of data elements (e.g. between uses of a variable); 4) span of control flow (e.g. between an entry point and exit point). In summary, Yourdon and Constantine find that complexity is determined by:
- "the amount of information that must be understood correctly"
- "the accessibility of the information"
- "the structure of the information."
Yourdon and Constantine state that their discussion of complexity focuses on intermodular interfaces (ibid. p. 74). Furthermore, they confine their definition of "amount of information" to the number of arguments passed in a call to a routine. I.e., the intermodular interface.
Yourdon and Constantine describe "accessibility of the information" as applying to the use of the interface: what a programmer must know to interpret the code. They give four factors (ibid. p. 76):
- Direct vs. indirect access to information
- More complex: the information refers indirectly to some other data element.
- Less complex: the information can be accessed directly by the programmer.
- Local vs. remote presentation of data
- More complex: the information is remote from the interface statement.
- Less complex: the information is local (i.e. presented with) the interface statement.
- Standard vs. unexpected presentation of information
- More complex: the information is presented in an unexpected manner.
- Less complex: the information is presented in a standard manner.
- Nature of the interface
- More complex: the nature is obscure.
- Less complex: the nature is obvious.
Modularity
Modularity of design means the extent to which it is divided into components that
- are complete (i.e. can be used in another application without modification)
- relate to the problem solution
Modularity is measured in terms of coupling and cohesion. Modularity is a measure of maintainability, testability, and possibly usability and reliability (Budgen 2003, p. 77).
Coupling
Coupling is a measure of the type and strength of connections between modules. Highly coupled modules have strong interconnections, loosely coupled modules have weak interconnections. Uncoupled modules have no interconnections. A programmer who is coding, debugging or modifying one of two (or more) highly-coupled modules will encounter a higher probability that changes will be required in the other modules as well. Coupling is a relative measure indicated by four factors (Yourdon and Constantine 1979, p. 86); in decreasing magnitude of effect on coupling:
- The nature of the connection between modules (ibid. p. 88)
- Minimally connected (least coupling)
- Normally connected (somewhat higher coupling)
- Pathologically connected (highest coupling). These connections prevent us from understanding how a module works without understanding something about the program it's in.
- How complex the interface is (in human terms)
- The type of information flow along the connection
- Binding time of the connection.
From a quality perspective, we are concerned mostly with the type of information flow. Yourdon and Constantine (ibid.) describe forms of coupling in detail but their work is less accessible today because their examples use machinery like paper tape readers and punch cards as input, punched tape as output, and they refer to language constructs that are no longer mainstream. McConnell's work (1993) is much more accessible.
There is no standardized empirical measure of coupling, and names of coupling forms vary from one author to another. However, Budgen states that knowledge of the presence of particular forms of coupling is more useful to the designer than the extent of any particular form (2003, p. 78).
Table of coupling forms in decreasing order of desireability (Sources: Yourdon and Constantine (1979), Martin and McClure (1985), McConnell (1993), Budgen (2003)).
Yourdon and Constantine | Martin & McClure | McConnell | Budgen |
---|---|---|---|
Data coupling or input-output coupling | Data coupling | Simple-data coupling | Data coupling |
Not mentioned | Stamp coupling | Data-structure coupling | Stamp coupling |
Control coupling - activating | Control coupling | Control coupling | Control coupling - activating |
Control coupling - co-ordinating | Control coupling | Control coupling | Control coupling - co-ordinating |
Common-environment coupling | Common coupling | Global-data coupling | Common-environment coupling |
Not mentioned | Content coupling | Content coupling | Not mentioned |
Coupling Forms
McConnell (1993) gives a thorough as well as accessible description of coupling, from a programmer's point of view (p. 87). In order of decreasing coupling:
- Content coupling
- Global-data coupling
- Control coupling
- Data-structure coupling
- Simple-data coupling
Cohesion
Cohesion is a measure of the functional relatedness of the components of a module (Yourdon and Constantine 1979, p. 106; Budgen 2003, p. 78). Cohesion is directly related to a module's relation to the problem solution (Yourdon and Constantine 1979, p. 106). I.e. as cohesion increases, overall complexity decreases.
Table of cohesion forms in decreasing order of desireability (McConnell places temporal cohesion above procedural cohesion (1993 p. 83). Sources: Yourdon and Constantine (1979), Martin and McClure (1985), McConnell (1993), Budgen (2003)).
Yourdon and Constantine | Martin & McClure | McConnell | Budgen |
---|---|---|---|
Functional | Functional | Functional | Functional |
Sequential | Sequential | Sequential | Sequential |
Communicational | Communicational | Communicational | Communicational |
Procedural | Procedural | Temporal | Procedural |
Temporal | Temporal | Procedural | Temporal |
Logical | Logical | Logical | Logical |
Coincidental | Coincidental | Coincidental | Coincidental |
Cohesion Levels
Cohesion levels are determined by seven distinct associative principles. In order of increasing cohesion they are:
- Coincidental Cohesion
Elements (lines of code) in a coincidentally-cohesive module have no relationship. Typically occurs as the result of modularizing existing code, to separate out redundant code (Yourdon and Constantine 1979, p. 109).
Yourdon and Constantine (ibid.) take an example where a programmer following structured techniques ends up with duplicated code in many places. To reduce the duplication and create a more compact program, the programmer replaces the duplicate code with function calls, and puts the code in a separate module or modules. These modules and the functions within have no relationship to each other, and are coincidentally cohesive.
Why is this worse than leaving duplicated code alone? It depends - often the modules created to reduce duplicated code are not suited for reuse in other applications. Another reason is that it becomes a maintenance problem, e.g. when the functionality of the routine is not the same each time it is called.
- Logical Cohesion
While the elements of a logically-cohesive routine have some relationship to each other, the whole does not perform a function. The elements have a closer tie to the problem solution than coincidentally cohesive elements: they perform logically similar operations (ibid. p. 114).
McConnell (1993) describes an example of logical cohesion where the operations in a function are separated by some control structure, and the set that executes is determined by a control flag that's passed in the parameter list.
- Temporal Cohesion
Temporally-cohesive elements are related by time. E.g. all of the elements execute to perform some operation in a given time period, such as start up or shut down. This is also logical cohesion - the elements are related to each other.
Yourdon and Constantine found that whenever you have logical cohesion without temporal cohesion, such code is usually "tricky, obscure, or clumsy code which is difficult to maintain and modify" (1979, p. 116).
- Procedural Cohesion
Procedurally-cohesive operations are related by elements of processing such as looping or a decision process; a simple sequence of steps also relates elements of processing.
Yourdon and Constantine describe a design process where a flowchart of a process is used to determine where it should be separated into routines and modules. The results varied widely and often tended to produce lower cohesion. They discovered that higher forms of cohesion resulted when data relationships were separated from control features (ibid. p. 117).
Procedural cohesion typically results when using models of a process to determine code module structure. Modules that have procedural cohesion typically contain only part of a function, an entire function and parts of other functions, parts of several functions, or even a single function (ibid. p. 118). Sometimes a procedurally-cohesive module that performs a distinct task is acceptable, in terms of ease of maintenance and ties to problem structure.
- Communicational Cohesion
Communicational cohesion is the lowest type of cohesion where processing elements have a structural relationship to the problem. McConnell (1993) gives a description where all operations make use of the same data and the order in which those operations occur is not important. There is no single name that clearly describes the function.
- Sequential Cohesion
Sequential cohesion describes the situation where the flow of data proceeds sequentially through a set of functions; or, where operations on data proceed sequentially through a specific order and the data is shared from step to step, and the operations don't make a complete function.
All forms of cohesion discussed so far can result from flowchart design, which allows confusion of data flow with control flow (Yourdon and Constantine, 1979 p. 126). This happens because a single step on a flowchart may represent code from an entire function, part of a function, a separate program such as a DLL, etc.
- Functional Cohesion
Functional cohesion is the highest level of cohesion. It describes the case where a function does one thing: fulfilling what it's name says it does (ibid. p. 127). This brings up the subject of routine names. In the context of cohesion, a routine name should be a clear verb - and - object combination.
A problem with the structured concepts of coupling and cohesion is that they cannot be objectively assessed (Yourdon and Constantine p. 132; Budgen p. 78) (though work continues in an effort to change this - see for example Bieman and Kang, 1998, or Kramer and Kaindl, 2004).
Information Hiding
Information hiding means to hide details of data structures inside a module. Only routines that are part of the module have direct access to the data structures; routines that are not part of the module must call routines within the module to work with the data structures.
The term "information hiding" may have originated in the work of Parnas (1972), who describes a module as "a responsibility assignment" (p. 1054), which is "characterized by its knowledge of a design decision which it hides from all others. Its interface or definition was chosen to reveal as little as possible about its inner workings" (ibid. p. 1056).
Information hiding is related to the quality factors reliability and maintainability (Budgen p. 79).
Artifacts
The product of software design - also known as implementation design - is the technical specification. This document or set of documents usually includes flowcharts to present process model, data flow, and process flow cycles; and a form-level hierarchy (i.e. input forms and dialogs). The technical specification describes a program's operating environment, interfaces, functions, and modules.
Several documents fall into the category of design specifications. The particular documents you create will depend on the design method you use.
Diagrams
Diagrams give us a way to communicate complex ideas: what we are capable of thinking depends on the language we use for thinking (Martin and McClure, 1985 p. 109). Diagrams extend our vocabulary, and different kinds of diagrams extend our vocabulary in different ways.
Diagrams are tools that communicate information about a system and are thus particularly useful for (ibid., paraphrased):
- speeding up work on an application and improving the results
- communicating a design to team members
- enabling maintenance programmers to understand how a system works
- helping during debugging, as a reference when questions arise about how a system is supposed to work.
As better, more rigorous methods for specifying systems are created, new diagramming methods will be needed (ibid. p. 110).
Diagrams are used in four areas of design (ibid. p. 111):
- System overview
- Program architecture
- Program detail
- Data structure representation
Many of the early diagramming techniques have fallen into disuse. While some have been supplanted by newer UML techniques, others have no counterpart in UML. It is worth discussing these, if only briefly, because they are still useful. Martin and McClure's review of structured diagramming techniques concludes with a recommendation of the best of the techniques before UML and object-oriented methods appeared (ibid. p. 396). They recommend:
- Action diagrams
- HOS charts
- Decision tables
- Data flow diagrams
- Entity-relationship diagrams
- Data navigation diagrams
Methodologies
- Stepwise Refinement
- Incremental Design
- Structured Systems Analysis and Structured Design
- Object-Oriented Design and Object-Based Design
- Component-Based Design
Choosing a Method
All design methodologies have the same goals, but some methods work better on certain kinds of problems than other methods. Often you will choose a design method that complements the programming environment. Most languages in common use today have object-oriented (OO) capabilities.
Structured Design
Design principles developed in the area of structured design are not made obsolete by principles developed in the area of object-oriented (OO) design. Structured design concepts such as cohesion and coupling describe fundamental characteristics of all software that has routines or modules.
Structured design is based on theories developed by Bohm and Jacopini (1966; Yourdon and Constantine, 1979, p. 73). Yourdon and Constantine describe structured design as "a collection of guidelines for distinguishing between good designs and bad designs, and a collection of techniques, strategies, and heuristics that generally leads to good designs..." (p. 15).
In the quest for the least complex design, they present a body of evidence that shows that the least complicated design consists of small modules that are relatively independent of each other but that relate easily to the application (ibid. p. 29). Two principles of structured design help focus the design effort to achieve these characteristics:
- "Highly interrelated parts of the problem should be in the same piece of the system, i.e. things that belong together should go together.
- "Unrelated parts of the problem should reside in unrelated pieces of the system. That is, things that have nothing to do with one another don't belong together (ibid. p. 21).
The unit of structured design is the function. Structured methodologies apply strategies and tools that help you
- identify and isolate pieces of a system
- iteratively apply the techniques until the design is complete.
Structured design emphasizes two approaches:
- Top-down decomposition - starting with a drawing that depicts system functions as an upside-down tree of boxes. Work from the most general program function to the specific.
- Bottom-up composition - opposite of top-down decomposition, you start with specific lower-level functionality that you are familiar with and work to the general.
Applying the Principles
Structured design is concerned with relationships between pieces of a system. We refer to these pieces as modules, though the term 'module' does not necessarily indicate a group of functions. Some qualities of modularity are apparent only in the relationship between modules. Y&C introduce the concepts of coupling (p. 85) and cohesion (p. 106) for this purpose. These concepts are the theoretical basis of structured design.
References
Bieman, J. M. and Kang, B-K, 1998, Measuring Design-level Cohesion: IEEE Transactions on Software Engineering, v. 24, no. 2, pp. 111-124.
Bohm, Corrado and Jacopini, Giuseppe, 1966, Flow Diagrams, Turing Machines and Languages with Only Two Formation Rules: Communications of the ACM, v. 9, no. 5, pp. 366-371.
Budgen, David, 2003, Software Design (2nd ed.): Pearson Education Limited/Addison-Wesley, 468 pages.
ISBN 0-201-72219-4
Kramer, S., and Kaindl, H., 2004, Coupling and Cohesion Metrics for Knowledge-Based Systems Using Frames and Rules: ACM Transactions on Software Engineering and Methodology v. 13 no. 3, pp. 332-358
Martin, David and Estrin, Gerald, 1967, Models of Computations and Systems - Evalation of Vertex Probabilities in Graph Models of Computations: Journal of the ACM, v. 14, no. 2, pp. 281-299.
McConnell, Steve, 1993, Code Complete: A Practical Handbook of Software Construction: Microsoft Press, 880 pages.
ISBN 1-55615-484-4
Martin, J., and McClure, C., 1985, Structured Techniques For Computing: Prentice-Hall, 775 pages.
ISBN 0-13-855180-4
Meyer, Bertrand, 1997, Object-Oriented Software Construction: Prentice-Hall, 1254 pages.
ISBN 0-13-629155-4
Parnas, D. L., 1972, On The Criteria To Be Used In Decomposing Systems Into Modules: Communications ACM, v. 15, no. 12, pp. 1053-1058.
Yourdon, Edward and Constantine, Larry L., 1979, Structured Design: Fundamentals of a Discipline of Computer Program and Systems Design: Prentice-Hall, 473 pages.
ISBN 0-13-854471-9
Zhu, Hong, 2005, Software Design Methodology: From Principles to Architectural Styles: Butterworth-Heinemann, 368 pages.
ISBN 0-75-066075-9