Blogs
By Linda Parker Gates, Senior Member of the Technical Staff, Acquisition Support Program
The appeal of Agile or lightweight development methods has grown steadily in the software development community. Having spent a number of years investigating strategic planning approaches, I’ve recently been thinking about whether Agile principles can be—and should be—applied to strategic planning. This blog post examines the applicability of Agile principles to strategic planning.
Strategic planning is the process of defining an organization’s plans for achieving its mission. The purpose is to outline a broad approach to achieving mission-aligned goals derived through a process of analyzing the full organizational environment, taking into account the organization’s vision, goals, objectives, enablers, barriers, and values. Although descriptions and analysis of the present situation are included, a strategic plan doesn’t merely endorse the status quo; it is directional in nature and directs change of some kind. As such, strategic planning is a critical foundation for executing work. It sets the stage for division- and unit-level planning, as well as enterprise architecture, process improvement, risk management, portfolio management, and any other enterprise-wide initiatives.
Strategic planning typically follows this type of sequence:
Scope the strategic planning effort.
Build a foundation of organizational information.
Define goals and objectives in terms of the organizational need and desired outcome.
Identify potential strategies for achieving the objectives.
Develop action plans.
Identify project measures.
Execute the work.
Track progress.
My February 2011 blog post, Strategic Planning with Critical Success Factors and Future Scenarios, proposes the integration of the Critical Success Factor (CSF) Method and future scenario planning into the strategic planning process. CSFs are a group of factors that determine group or company success, including key jobs that must be done exceedingly well. Future scenarios are a tool for exploring multiple, possible "futures" and developing decisions or strategies that will serve the uncertain future well. Together, these two techniques help foster strategic thinking, which is a strong complement to strategic planning. They also provide some leverage points for making strategic planning more nimble.
The Agile Manifesto presents the following four core values:
individuals and interactions over processes and tools
working software over comprehensive documentation
customer collaboration over contract negotiation
responding to change over following a plan
The manifesto is careful to point out that a balance is required; that is, the counter values cannot be ignored. So a key question becomes: Can organizations benefit from pursuing strategic planning approaches that embody these Agile values? The rest of this blog posting addresses this question.
Individuals and Interactions over Processes and Tools
A strong process is vital to effective strategic planning. Nonetheless, placing value on interactions between leadership and staff and on face-to-face conversations certainly enhances the value of the strategic planning process. Both the CSF method and scenario planning rely heavily on individuals and interactions. These conversations themselves bring value to strategic planning.
Working Software/Tangible Results over Comprehensive Documentation
A common criticism of strategic planning is that it over-emphasizes deconstruction of the past and present while creating the illusion that we can anticipate the future. If we replace "working software" with "tangible results" (to accommodate the strategic planning domain), I contend it is productive to focus strategy efforts on short-term accomplishments over thoroughly documented commitments about the future.
Customer Collaboration over Contract Negotiation
While contract negotiation is not particularly pertinent to strategic planning, a strategic plan serves as an informal contract with the organization. The idea of involving customers is also intriguing. In my technical report, Strategic Planning with Critical Success Factors and Future Scenarios, published in November 2010, I recommend involving customers in scenario planning since they bring an external perspective that can be critical to getting quality results. There may even be a greater role for customers in the development of strategy.
Responding to Change over Following a Plan
This principle offers great potential for improving the way strategic planning is conducted and the results realized in implementation. A good strategic-planning process has always done more than just produce a plan—it supports ongoing strategic thinking, discussion, and behavior. Strategic thinking focuses on finding and developing organizational opportunities and creating dialogue about the organization’s direction—these are the foundation of an organization’s readiness to respond to change. Strategic planning is enhanced by strategic thinking, which makes planning adaptive and in sync with an evolving environment. So is the planning activity itself still needed? Yes! Effective strategic work cannot be accomplished without it. If nothing else, the divergent results of strategic thinking must be operationalized through a convergent planning activity.
So, the next question is, how do we adjust the strategic planning process to make it more agile?
The most effective adjustments to make are to apply the Agile values to steps 3 and 8 of the strategic planning process outlined above. Step 3 (Define goals and objectives) benefits from a focus on the first value (individuals and interactions) and the third value (customer collaboration). Step 8 (Track progress) is enhanced through a focus on the second value (which I am calling "tangible results") and the fourth value (responding to change).
Interestingly, the CSF and scenario planning methods provide opportunities to integrate all four Agile values into a strategic planning process as follows:
In terms of individuals and interactions, CSFs are derived through interviews with managers, a critical aspect of the technique that involves one-on-one conversations with people very closely acquainted with operational issues.
Scenario planning is a team-based exercise that relies heavily on interactions among a representative cross-section of organizational staff.
The interview-based method for developing scenarios that Kees van der Heijden describes in his 1996 book, Scenarios: The Art of Strategic Conversation, can enhance the role of individuals in setting strategy. As noted above, scenario planning provides a good opportunity for customer collaboration. Scenario planning relies heavily on monitoring for early warning signs, which are indicators that a particular future is unfolding. These indicators help planners make adjustments to strategies or their execution. Combined with shorter cycles, monitoring techniques are critical for delivering short-term, tangible results and responding to change.
It is important to understand the nature of planning. The fourth agile value emphasizes responding to change over following a plan. Without a strong plan, response to change is simply reaction, not agility. The Agile principles assert values in terms of preferences over the counter values. In the strategic planning domain, the ability to perform in accordance with Agile values requires significant strength in the counter values. Agility emerges from skill and strength in the counter values. It doesn’t replace them.
Agile strategic planning would best serve an organization that is applying agile methods. If development teams are already using lightweight methods, leadership should consider adopting agile processes to move the organization toward its goals. In general, agile strategic planning can offer value to organizations that are complex or self-organizing and that focus on adaptive, iterative delivery.
Additional Resources:
To read or download a copy of the SEI technical report, Strategic Planning with Critical Success Factors and Future Scenarios: An Integrated Strategic Planning Framework, please visit www.sei.cmu.edu/library/abstracts/reports/10tr037.cfm
To read the blog post, Strategic Planning with Critical Success Factors and Future Scenarios, please visit http://blog.sei.cmu.edu/post.cfm/strategic-planning-with-critical-success-factors-and-future-scenarios
By Bjorn Andersson, Senior Member of the Technical Staff, Research, Technology & System Solutions
Many DoD computing systems—particularly cyber-physical systems—are subject to stringent size, weight, and power requirements. The quantity of sensor readings and functionalities is also increasing, and their associated processing must fulfill real-time requirements. This situation motivates the need for computers with greater processing capacity. For example, to fulfill the requirements of nano-sized unmanned aerial vehicles (UAVs), developers must choose a computer platform that offers significant processing capacity and use its processing resources to meet its needs for autonomous surveillance missions. This blog post discusses these issues and highlights our research that addresses them.
To choose a computer platform that offers greater capacity, it is necessary to observe the major trends among chip makers. Historically, advances in semiconductor miniaturization (a.k.a., Moore's Law) periodically yielded microprocessors with significantly greater clock speeds. Unfortunately, microprocessor serial processing speed is reaching a physical limit due to excessive power consumption. As a result, semiconductor manufacturers are now producing chips without increasing the clock speed, but instead are increasing the number of processor cores on a chip, which results in multicore processors. For nearly a decade, the use of homogeneous multicore processors (which are chips with identical processing cores) gave us some headroom in terms of power consumption and allowed us to enjoy greater computing capacity. This headroom is diminishing, unfortunately, and is about to vanish, forcing semiconductor manufacturers to seek new solutions.
We are currently witnessing a shift among semiconductor manufacturers from homogeneous multicore processors with identical processor cores to heterogeneous multicore processors. The impetus for this shift is that processor cores tailored for a specific class of application behavior have the potential to offer much better power efficiency. AMD Fusion and NVIDIA Tegra 3 are examples of this shift, as is Intel Sandy Bridge, which integrates a graphics processor onto the same chip as the normal processor.
In a heterogeneous multicore environment, the execution time of a software task depends on which processor core it executes on. For example, a software task performing computer graphics rendering, simulating physics, or estimating trajectories of flying objects runs much faster on a graphics processor than on a normal processor. Conversely, some software tasks are inherently sequential and cannot benefit from the graphics processor; they execute much faster on a normal processor. For example, a software task with many branches and no inherent parallelism runs much faster on a normal processor than on a graphics processor. Ideally, each task would be assigned to the processor where it executes with the greatest speed, but unfortunately the workload is often not perfectly balanced to the types of processor cores available.
Efficient use of processing capacity in the new generation of microprocessors therefore requires that tasks are assigned to processors intelligently. In this context, "intelligently" means that the resources requested by the program are the ones possessed by the processor. Moreover, the desire for short design cycles, rapid fielding, and upgrades necessitates that task assignment be done automatically—with algorithms and associated tools.
THE TASK ASSIGNMENT PROBLEM
The problem of assigning tasks to processors can be described as follows: a task (such as computer graphics rendering, or a program determining whether the half-or-triple-plus-one process reaches one from a known starting value) is characterized by its processor utilization, but it has different utilizations on different processors. For example, a given task might have a utilization of 10 percent if assigned to a graphics processor but 70 percent if assigned to a normal processor. We are interested in assigning each task to exactly one processor such that, for each processor, the sum of the utilizations of all tasks assigned to it does not exceed 100 percent. If we can find such an assignment, then it is known that all deadlines will be met at run-time, provided that tasks have deadlines described by the implicit-deadline sporadic task model and that the Earliest-Deadline-First (EDF) scheduling algorithm is used (with a minor modification, Rate-Monotonic scheduling can be used as well).
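The feasibility condition above (every processor's total utilization at or below 100 percent) takes only a few lines to check. The following Python sketch is illustrative only, not SEI code, and the task and utilization numbers are made up:

```python
# Feasibility test for a heterogeneous task assignment: an assignment is
# feasible if no processor's total utilization exceeds 100 percent.

def assignment_feasible(util, assignment, num_processors):
    """util[i][p]: utilization (percent) of task i on processor p.
    assignment[i]: index of the processor chosen for task i."""
    totals = [0.0] * num_processors
    for task, proc in enumerate(assignment):
        totals[proc] += util[task][proc]
    return all(total <= 100.0 for total in totals)

# Two tasks, two processors (0 = graphics processor, 1 = normal processor).
util = [
    [10.0, 70.0],   # a rendering-style task: cheap on the graphics processor
    [120.0, 40.0],  # an inherently sequential task: infeasible on the GPU
]
print(assignment_feasible(util, [0, 1], 2))  # True: each task on its best fit
print(assignment_feasible(util, [1, 0], 2))  # False: the GPU is overloaded
```

Note how the mismatched assignment fails even though a feasible one exists; finding a feasible assignment automatically is the subject of the rest of the post.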
PREVIOUS APPROACHES FOR TASK ASSIGNMENT
The task assignment problem belongs to a class of problems that are computationally intractable, meaning it is highly unlikely that any algorithm exists that both always finds a good assignment and always runs fast. We must therefore settle for an algorithm that either always finds a good assignment or always runs fast. With the goal of always finding a good assignment, task assignment can be modeled as an integer linear program (ILP) as follows:
Minimize z
subject to the constraints that
for each processor p: x1,p*u1,p + x2,p*u2,p + … + xn,p*un,p <= z
for each task i: xi,1 + xi,2 + … + xi,m = 1
for each pair (i,p) of task i and processor p: xi,p is either 0 or 1

In the optimization problem above, n is the number of tasks, m is the number of processors, and ui,p is the utilization of task i if it were assigned to processor p. The decision variable xi,p is one if task i is assigned to processor p and zero otherwise.

Unfortunately, solving this integer linear program takes a long time.
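To make the formulation concrete, here is a sketch (mine, not the authors') that solves tiny instances by brute force: it enumerates all m^n possible assignments and picks the one minimizing z, the maximum per-processor utilization. The exponential enumeration is exactly why exact approaches do not scale; the utilization numbers are invented for illustration:

```python
from itertools import product

def best_assignment(util):
    """util[i][p] = utilization of task i on processor p.
    Returns (assignment, z) minimizing z = max per-processor load."""
    n, m = len(util), len(util[0])
    best_z, best = float("inf"), None
    for assignment in product(range(m), repeat=n):  # m**n candidates
        loads = [0.0] * m
        for i, p in enumerate(assignment):
            loads[p] += util[i][p]
        z = max(loads)
        if z < best_z:
            best_z, best = z, assignment
    return best, best_z

# Three tasks, two processors (0 = graphics processor, 1 = normal processor).
util = [[10.0, 70.0], [120.0, 40.0], [30.0, 25.0]]
print(best_assignment(util))  # ((0, 1, 0), 40.0): tasks 0 and 2 on the GPU
```

Even at 20 tasks and 4 processors this loop would visit about a trillion candidates, which is the intractability the paragraph above describes.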
To design an algorithm that always runs reasonably fast, several algorithms, described in a research paper by Sanjoy K. Baruah, transform the ILP into a linear program (LP) and then apply certain tricks. Although LP-based algorithms run faster than ILP-based ones, they still have to solve an optimization problem, which can be time-consuming. To design faster algorithms still, we would like to perform task assignment in a way that does not require solving an LP.
OUR APPROACH FOR TASK ASSIGNMENT
Previous work on task assignment for homogeneous multicore processors where all processor cores are identical is based on a framework called bin-packing heuristics. Such algorithms work approximately as follows:
1. Sort tasks according to some criterion.
2. for each task do
3.     for each processor do
4.         if the task has not yet been assigned and it is possible to assign the task to the processor so that the sum of utilization of tasks on the processor does not exceed 100 percent then
5.             Assign the task to the processor
6.         end if
7.     end for
8. end for
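The pseudocode above translates almost line-for-line into Python. This sketch extends it to the heterogeneous case by indexing utilization per processor; the sort criterion in step 1 (largest best-case utilization first) is just one plausible choice, since choosing the task and processor orders well is precisely the open research question discussed below:

```python
def assign_tasks(util):
    """util[i][p] = utilization of task i on processor p.
    Returns {task: processor}; tasks absent from the dict could not be placed."""
    m = len(util[0])
    loads = [0.0] * m
    assignment = {}
    # Step 1: sort tasks by some criterion (here: largest best-case utilization first).
    order = sorted(range(len(util)), key=lambda i: min(util[i]), reverse=True)
    for i in order:                             # step 2: for each task
        for p in range(m):                      # step 3: for each processor
            if loads[p] + util[i][p] <= 100.0:  # step 4: capacity check
                assignment[i] = p               # step 5: assign the task
                loads[p] += util[i][p]
                break                           # task placed; try the next one
    return assignment

# Three tasks, two processors (0 = graphics processor, 1 = normal processor).
util = [[10.0, 70.0], [120.0, 40.0], [30.0, 25.0]]
print(assign_tasks(util))  # {1: 1, 2: 0, 0: 0}
```

Unlike the brute-force search, this heuristic runs in roughly n log n + n*m time, but it can fail to place tasks that a smarter ordering would have accommodated.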
Our approach involves adapting bin-packing heuristics to heterogeneous multicore processors. We believe it is possible to modify the algorithm structure outlined above so we can also assign tasks to processors even when the utilization of a task depends on the processor to which it is assigned. One can show that the use of bin-packing can perform poorly if processors and tasks are not considered in any particular order. Specifically, for a set of tasks that could be assigned, such an approach can fail even when given processors that are "infinitely" faster. One of our main research challenges is therefore to determine how to sort tasks (step 1) and in which order we should consider processors (in step 3). We are evaluating our new algorithms in the following ways:
We plan to prove (mathematically) the performance of our new algorithms. Specifically, we are interested in proving that whenever it is possible to assign tasks to processors, our algorithm also succeeds, provided it is given processors that are x times as fast. This x is our performance metric: the lower its value, the better.
We also plan to evaluate the performance of our algorithms by applying the algorithms on randomly-generated task sets. This will demonstrate the "typical" behavior of the algorithms.
CONCLUSION
Most semiconductor manufacturers are shifting towards heterogeneous multicore processors to offer greater computing capacity while keeping power consumption sufficiently low. But using a heterogeneous multicore processor efficiently for cyber-physical systems with stringent size, weight, and power requirements requires that tasks are assigned properly. This blog post has discussed the state of the art and summarized our ongoing work in this area.
ADDITIONAL RESOURCES
To read the paper "Partitioning real-time tasks among heterogeneous multiprocessors" by Sanjoy Baruah, please visit http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1327956
To read the proceedings "Assigning Real-Time Tasks on Heterogeneous Multiprocessors with Two Unrelated Types of Processors", please visit http://www.cister.isep.ipp.pt/activities/seminars/%28S%28otidyq454nfyy255alnrdb3m%29%29/GetFile.aspx?File=%2Fspringseminars2011%2Frtss10_het2.pdf
By Mike Phillips, Principal Researcher, Acquisition Support Program
In my preceding blog post, I promised to provide more examples highlighting the importance of software sustainment in the US Department of Defense (DoD). My focus is on certain configurations of weapon systems that are no longer in production for the United States Air Force but are expected to remain a key component of our defense capability for decades to come; their software upgrade cycles must therefore refresh capabilities every 18 to 24 months. Throughout this series on efficient and effective software sustainment, I will highlight examples from each branch of the military. This second blog post describes effective sustainment engineering efforts in the Air Force, using examples from across the service’s Air Logistics Centers (ALCs).
A Brief History of Software Sustainment
From its earliest days, the military has provided facilities to maintain the functionality of its various weapon systems. The descriptive terms for these units have included "arsenals," "depots," and "logistics centers," to name a few. In the 1990s, the Air Force consolidated its depot maintenance capabilities into three centers: Warner-Robins ALC in Georgia, Oklahoma City ALC in Oklahoma, and Ogden ALC at Hill Air Force Base (AFB) in Utah. A 2012 initiative within the Air Force Materiel Command will centralize the leadership of the ALCs into a single entity, although the three sites will remain as sister units in a single "super ALC" headquartered at the Oklahoma City site.
Within this geographical framework, we can overlay the increased importance of software in our weapon systems. Lloyd Mosemann, the deputy assistant secretary of the Air Force for communications, computers, and support systems, was a visionary leader for software sustainment in the Air Force logistics arena in the 1980s. He recognized that the various ALCs would "inherit" responsibility for weapon systems as production waned and that "software sustainment" should be a well-developed capability within the organic structure of the ALCs. The first demand for an organization to achieve a maturity level came in a memorandum from Mosemann to the centers, directing them to achieve maturity level 2 against the Capability Maturity Model for Software and then level 3 a few years later (many think the "Mosemann letter" was aimed at industry, but it was not).
This blog posting highlights examples of effective sustainment of software-intensive systems across the three sites and recognizes that the successes achieved are the results of improvement efforts that extend well beyond the process domain. The workforce has grown in its technical competence, and a modern systems engineering environment has been developed. With that historical perspective, we can explore some of the examples evident today.
In the mid 1970s, the Air Force created a "block" strategy as a way to pursue modernization while maintaining a relatively stable production approach. In the past, letter identifiers after the aircraft number showed a change to a new capability, including major hardware changes. With increasing software content, a block upgrade represented major functionality changes with relatively modest hardware configuration changes. The block strategy has become a common practice across the DoD, where software updates are released as a block of changes at regular intervals to avoid too many variations being sent to the field. By applying a block strategy, the DoD assures that there is a regular, anticipated opportunity to have the freshest capabilities deployed.
The US Air Force F-16 program is an excellent example of the block strategy, with more than 4,400 F-16’s being produced and flown by 26 countries. A key program decision made early in the development of the F-16 was the deliberate strategy for evolving the F-16’s capabilities. This strategy was called the F-16 Multinational Staged Improvement Program (MSIP) and involved the continual enhancement of new aircraft and the retrofit of previous aircraft with system hardware and software capabilities. The Ogden ALC’s experience with the F-16 provides a good example of the success of the MSIP being applied in a sustainment environment.
From a sustainment perspective, the focus has been on the F-16C/D, Block 30, since the later blocks (40 and 50) have remained in production.
Continuous Improvement
Initially, the Ogden ALC used the CMM and then CMMI to guide its improvement efforts. More recently, it has joined its process improvement partners at the Naval Air Warfare Center at China Lake, Calif., in using the Team Software Process (TSP). The metrics produced by the Ogden team illustrate the following quality performance:
Major upgrades continue to enjoy higher and higher productivity due to the quality emphasis contained in the disciplined approach to software production.
Rework effort is down below 10 percent.
Efficient Upgrade Planning
Part of the effective planning highlighted above was an effort to provide reasonably sized updates on a regular schedule. For the F-16, the size has been about 500 thousand source lines of code (KSLOC), using an upgrade cycle aimed at 18 to 24 months. The upgrade team has been able to accommodate high-priority upgrades, such as a rapid introduction of the AIM-9X, by balancing sustainment and modernization loads effectively. The F-16 team has just begun to assimilate the transition of the later F-16 Blocks 40 and 50. The foundation established with the Block 30 experience should enable the ALC sustainment team to master the learning curve for the more complex and diverse fleet being transitioned.
The Airborne Early Warning & Control Systems (AWACS), which are supported by the Oklahoma City ALC, provide another example. These systems fit in a Boeing 707-based aircraft with a large saucer-shaped antenna housing radar systems that can be upgraded to improved hardware and software capabilities. Reflective of many DoD platforms, the AWACS system entered operations in 1976, and its critical mission systems are continually upgraded to extend its capabilities. The multi-source integration function that is part of the most recent AWACS upgrade used open systems and lean architecture approaches, which are advocated by the SEI and allow more rapid software updates to the fleet without requiring extensive hardware changes.
Another weapon system sustained and continuously improved upon at Oklahoma City is the venerable B-52 strategic bomber. If planned modernization efforts of this weapon system extend the life as currently proposed, the aircraft will have been in the active Air Force inventory for almost 90 years upon its retirement in 2040. The improvement to the software architecture expands its versatility with "smart weapons" and its network communications capability. In an era that demands a focus on affordability, extending the lifetime of the B-52—while enhancing its capability—demonstrates the power of software-focused modernization.
The Warner-Robins ALC has responsibility for electronic warfare and radar systems, as well as weapon systems, including the F-15, C-130 (almost as old as the B-52) and the C-5. Across this collection of software-reliant systems, the center continues to improve the quality and timeliness of its upgrades. Of 246 software releases in fiscal year 2010, the ALC delivered 99 percent at or below expected cost, and 98 percent on or ahead of schedule. Only three systems contained defects that were discovered in the field after release and required rework. As with its two sister ALCs, Warner-Robins has committed to continuous improvement of its software capability. While we at the SEI are pleased with the unit’s involvement with our CMMI models, the organization has complemented that effort with attention to Lean Six Sigma and the Air Force Smart Operations for the 21st Century (AFSO21). The results are notable.
The leadership of the software organization at Warner Robins has noted that the "old image" of software sustainment as small "bug fixes" has been replaced by recognition that with software-reliant systems, the opportunity arises to quickly develop innovative capabilities to solve the challenges facing their DoD—and allied nation—customers today.
Each of the services has a wide variety of software-reliant systems, and organic capabilities to complement the major contractors who create sturdy platforms like the B-52 that can last nearly a century. The next installment in this series on efficient and effective software sustainment will examine the Naval Air Weapons Station at China Lake.
Additional Resources
To read the SEI technical report, Sustaining Software Intensive Systems, please visit www.sei.cmu.edu/library/abstracts/reports/06tn007.cfm?DCSext.abstractsource=SearchResults
Experience from Financial Systems
By Bill Nichols, Senior Member of the Technical Staff, Software Engineering Process Management
In his book Drive, Daniel Pink writes that knowledge workers want autonomy, purpose, and mastery in their work. A big problem with any change in processes is getting the people who do the work to change how they work. Too often, people are told what to do instead of being given the information, autonomy, and authority to analyze and adopt the new methods for themselves. This posting—the first in a two-part series—describes a case study that shows how Team Software Process (TSP) principles allowed developers at a large bank to address challenges, improve their productivity, and thrive in an agile environment.
In 2009, Nedbank, one of the four largest banks in South Africa, launched a TSP pilot program in collaboration with the Johannesburg Centre for Software Engineering (JCSE) at the University of Witwatersrand and the SEI. After TSP pilot teams at Nedbank were given a toolkit—including training on how to measure, analyze, and communicate—their behaviors changed. Not only were the developers (a mixed team of elite and average programmers) able to work more autonomously, they implemented methods that resulted in higher quality work in less time.
The Nedbank TSP pilot involved two software-intensive projects: one maintenance project and one new development project. Nedbank had faced software process improvement challenges in the past. They therefore sought to address the issues of quality and productivity among their software engineering teams by implementing a methodology that would improve the teams' performance through planning, tracking their work, establishing goals, and providing teams with the tools to take responsibility for their processes and plans.
Engage the Team
The Nedbank TSP pilots began with separate training for everyone involved: senior managers, team leads, software developers, and non-developer team members. The groups learned how the process change would affect them and their relationship with others in the organization. They learned that working on a TSP project required them to behave differently in terms of reporting, data collection, expectations regarding other team members, and interactions with other team members. Developers learned the Personal Software Process with a focus on software engineering discipline, measurement, and planning. Non-developers, such as testers, business analysts, and documenters, learned to work within the new team and project structure. Throughout the courses, Nedbank team members learned what a measured software process looks like and how to measure the software process. They also learned to communicate with a focus on the data, project, and work, not on personal issues.
Launch the Project
TSP projects begin with a structured launch to plan the work, including nine meetings plus a post-mortem meeting to satisfy specific project planning needs. The Nedbank teams began by understanding project goals from a management standpoint and then identified supporting team goals and personal objectives for each participant. These goals helped frame their subsequent decisions on strategy, process, support, and work breakdown.
The project was then broken down into small pieces, and work was assigned among team members. Each member was then able to commit to the plan and his or her specific interim goals, and understand how those goals support the organization's larger goals for the project. The key to success was that the team had the autonomy to plan and manage the work and determine how it would be done. For example, early in the planning phase, the team working on the maintenance project realized that the schedules of the designers and the developers were not aligned. The designs wouldn't be ready when the developers needed them. Working together, the team approached the schedule problems in the following two ways:
The team lead went to management and negotiated priorities, using his team's progress data to support the schedule needs. The team lead was then able to convince the designers to prioritize their work so that they could supply the developers with the designs needed to proceed. The senior developers next took the load off the designers by taking on some of the design work.
In addition, the team identified modules that would not have to be changed as well as sets of programs that would be easy to update. This change in approach not only helped meet project needs, but also helped satisfy the management goal that teams gain a big picture view of the project.
Deal with Mid-Project Issues
Early in the development process, both projects fell behind schedule. Discipline and measurement became critical. By tracking their time and following their defined steps, team members were able to identify their status precisely and understand how they got there. The team lead made sure the measurements and data were discussed in the status meeting. The coach helped the team understand what the measurements and data meant and how these facts affected the work. With this data-driven feedback, the teams saw an increase in the number of task hours (direct time on project tasks) per week, as well as a sharp reduction in defect rates. With the coach, the teams periodically analyzed their data and improved their processes.
By the end of each pilot project, the quality of each team's work had improved significantly. They were able to find and remove defects earlier in the process. After the initial delivery, no further defects were found in system tests or production in the remaining three cycles. There were zero defects in deliveries two through four. This improvement required months of effort, training, management support, coaching, worker motivation and engagement, and meaningful data-based feedback.
See the Results
In the end, the TSP pilot teams at Nedbank made significant behavioral changes that not only improved the quality of the software but also improved team members' work lives by decreasing the need for evening and weekend overtime. The teams were able to make these improvements because they had project-specific measurements to guide their decisions, and they had the authority to implement those decisions. Based on the results of the pilots Nedbank decided to implement TSP throughout the organization.
To learn more about Nedbank's views on TSP, see their rollout video below:
This video requires the Adobe Flash Player Plug-in please go to the following URL to download: http://get.adobe.com/flashplayer/
Expanding and scaling any process comes with challenges. These topics will be discussed in the second post of this series: Achieving Quality and Speed with TSP Organization-Wide.
Additional Resources:
To read the SEI technical report Deploying TSP on a National Scale: An Experience Report from Pilot Projects in Mexico, please visitwww.sei.cmu.edu/library/abstracts/reports/09tr011.cfm
To read the Crosstalk article A Distributed Multi-Company Software Project by Bill Nichols, Anita Carleton, & Watts Humphrey, please visitwww.crosstalkonline.org/storage/issue-archives/2009/200905/200905-Nichols.pdf
To read the SEI book Leadership, Teamwork, and Trust: Building a Competitive Software Capability by James Over and Watts Humphrey, please visit www.sei.cmu.edu/library/abstracts/books/0321624505.cfm
To read the SEI book Coaching Development Teams by Watts Humphrey, please visitwww.sei.cmu.edu/library/abstracts/books/201731134.cfm
To read the SEI book PSP: A Self-Improvement Process for Engineers by Watts Humphrey please visitwww.sei.cmu.edu/library/abstracts/books/0321305493.cfm
SEI
.
Blog
.
<span class='date ' tip=''><i class='icon-time'></i> Jul 27, 2015 02:47pm</span>
|
By Robert Nord, Senior Member of the Technical StaffResearch, Technology, & System Solutions
New acquisition guidelines from the Department of Defense (DoD) aimed at reducing system lifecycle time and effort are encouraging the adoption of Agile methods. There is a general lack, however, of practical guidance on how to employ Agile methods effectively for DoD acquisition programs. This blog posting describes our research on providing software and systems architects with a decision making framework for reducing integration risk with Agile methods, thereby reducing the time and resources needed for related work.
The DoD chief information officer (CIO)’s office has recently released a 10 Point Plan to reform DoD information technology (IT). Point number 4 is "Enable Agile IT." The key tenets of Agile IT include
Emphasize incremental introduction of mature technology by delivering useful capability every 6 to 12 months to reduce risk through early validation to users.
Require tight integration of users, developers, testers, and certifiers throughout the project life cycle to meet Agile IT’s promise of rapid delivery in lieu of extensive up front planning.
Leverage common development, test and production platforms and enterprise products to deliver IT capabilities faster, cheaper, and more interoperable, without redundant infrastructure and documentation.
Establish a change-tolerant design environment enabled by discovery learning that promotes decisions based on facts rather than forecasts.
Program managers and acquisition executives are responding to this plan by applying industry practices, such as Agile methods. At the same time, related DoD guidelines encourage system development via a more open, modular software architectural style, such as loosely-coupled service-oriented architectures. The impact of these architectural decisions becomes more critical in an agile context because they promote or inhibit the ability to achieve agile goals, such as rapid feedback, responding to change, and team communication. Architectural decisions influence agile development properties, such as the length of an iteration, the allocation of stories to iterations, and the structure of the teams and their work assignments with respect to the architecture. What is needed, therefore, is research on eliciting and employing these properties and determining their relative importance in promoting rapid lifecycle development.
To address this need, I and other SEI technologists, Ipek Ozkaya, Stephany Bellomo, and Mary Ann Lapham, are conducting research that focuses on the implications of decisions made over the course of the software and systems lifecycle. We are examining when these decisions are made and the time when the implications surface to validate the following hypotheses:
The fundamental early decisions made during the pre-engineering and manufacturing development (pre-EMD) phase have an impact throughout the lifecycle. Acquisition programs lack the techniques needed to elicit and preanalyze the implications of which fundamental architectural decisions will enable or inhibit their ability to adopt an agile lifecycle. For example, architectural decisions that relate to decomposition and dependencies influence teaming structure, the capability to rapidly and confidently deliver features, and the ability to rapidly integrate new components among other factors.
The implications of the early decisions often surface in the final stages of the lifecycle, downstream from development. For example, unexpected rework related to correcting integration defects. When these problems are discovered further downstream from where they were injected in the lifecycle, they are more expensive to correct and this often causes cost overruns and delays project completion.
Identifying Critical Properties
A primary objective of our research is identifying critical properties of Agile software development influenced by architectural decisions that reduce lifecycle time. Identifying these properties is an important first step to give stakeholders the tools they need to make informed decisions that manipulate the settings of the properties to control for better outcomes.
Our approach involves examining the implications of possible change scenarios. One such scenario examines the impact of introducing an emerging multi-core technology to the mission processor software of the Apache helicopter, which is a US Army/Boeing program. Applications are increasingly favoring multi-core processors, a single integrated computing element with two or more independent processors, called "cores," allowing for greater processing capacity. The scenario explores questions that include
Will the multi-core technology change the architectural design (e.g., patterns, tactics, and architectural approaches)?
If so, which architectural decisions will change?
Are there engineering practices (such as Agile) that must be customized as a result of architectural decisions in support of a rapid lifecycle?
How will continuous integration be ensured?
How will team communication be impacted?
A key component of our work involves determining the critical decisions that will influence Agile practices. These decisions provide the rationale for how software and system architects design an architecture.
To expand our view, we will again collaborate with Philippe Kruchten, a professor of software engineering in the Department of Electrical and Computer Engineering of the University of British Columbia, who is active within Agile development and architecture decision making research communities. Another facet of our approach involves interviewing DoD programs and gathering data from members of the SEI Agile Collaboration Group.
Creating a Model
Rework must be considered early when reasoning about how to enable rapid lifecycle development. After we review the findings identified in the first phase of our work with collaborators, we will create an architectural decision model that allows software or systems architects to analyze the ramifications of their decisions. Based on our prior work, we anticipate that highly impactful architectural decisions will include
Decisions about interfaces and how the parts of the system are connected.
Decisions about structuring the systems to achieve quality attribute requirements. The impact of these decisions typically spans multiple areas within the system and is not localized within a single module.
We plan to use the Multiple Domain Matrix (MDM) to represent the decision model and to analyze the impact of architectural decisions on rapid lifecycle development. The MDM approach considers decision dependencies that provide visibility into how the ordering of decisions influences when development can be started and how changes propagate and may require rework of software elements. This approach will allow us to test our hypothesis that modeling architectural decisions during early stages of development (similar to pre-EMD) will reduce cycle time. Cycle time could be reduced upstream by enabling an earlier start of development, thus minimizing the time spent at the pre-EMD phase or reduced downstream by decreasing rework costs attributed to architectural decisions that affect integration.
Another component of our research will explore strategies for improving the relationship between architecture decision making and complex collaborative team interaction. A barrier that often arises is that decisions made by architects about partitioning the architecture are not aligned with the networks of agile teams at scale and for the kinds of systems relevant to the DoD. Another barrier is alignment with the teams that span the lifecycle beyond the development teams traditionally associated with agile and include system engineering, testing, validation and verification, certification and accreditation. We have observed the ramifications of this misalignment during the integration of components built by different teams, where incompatibilities lead to significant rework.
To map, analyze, and support the architectural decisions of industry collaborators, we plan to map our MDM approach into a conceptual model developed by Kevin Sullivan of the University of Virginia, who is working on creating a cyber-social conceptual model. Sullivan’s work focuses on the social networks and the value of relationships between decision makers in a system.
Challenges
A primary technical challenge that we face in this approach involves scaling. As a practice, Agile has been successful in helping to solidify the efficiency of development teams when projects involve small teams. Applying these same concepts to large-scale distributed systems— including the rest of the organization that has priorities larger than the development team—will be critical for success, but it will also present some of the greatest challenges. To address the challenge of scaling, we are looking at the influence architecture exerts on managing teams and how to provide practical guidance on what amount of architecture is needed and when. On the one hand, early or overproduction of architecture can create delay. On the other hand, not enough production of architecture can result in integration defects leading to rework. The focus of lean thinking on improving cycle time by eliminating waste, in the form of delay or unnecessary rework, in conjunction with architecture, has shown great potential for improving management of software development projects and increasing flow of value delivered to the customer.
Next Steps
We are in the process of conducting a survey of critical agile development properties and will write about the results in a future blog posting.
Additional Resources:
To read the SEI technical report, Agile Methods: Selected DoD Management and Acquisition Concerns, please visit www.sei.cmu.edu/library/abstracts/reports/11tn002.cfm
SEI
.
Blog
.
<span class='date ' tip=''><i class='icon-time'></i> Jul 27, 2015 02:47pm</span>
|
By Dean Sutherland Senior Member of the Technical StaffThe CERT Program
Many modern software systems employ shared-memory multi- threading and are built using software components, such as libraries and frameworks. Software developers must carefully control the interactions between multiple threads as they execute within those components. To manage this complexity, developers use information hiding to treat components as "black boxes" with known interfaces that explicitly specify all necessary preconditions and postconditions of the design contract, while using an appropriate level of abstraction to hide unnecessary detail. Many software component interfaces, however, lack explicit specification of thread-related preconditions. Without these specifications, developers must assume what the missing preconditions might be, but such assumptions are often incorrect. Failure to comply with the actual thread-related preconditions can yield subtle and pernicious errors (such as state corruption, deadlock, and security vulnerabilities) that are intermittent and hard to diagnose. This blog post, the first in a series, describes our ongoing research towards solving this problem for a variety of languages, including Java and C11.
Previous Work
In previous work, we introduced the concept of "thread usage policy," which is a group of often unspecified preconditions used to manage access to shared state by regulating which specific threads are permitted to execute particular code segments or to access particular data fields. The concept of thread usage policy is not language specific; similar issues arise in many programming languages, including Java, C11, C#, C++, Objective-C, and Ada. The preconditions contained in thread usage policies can be hard to identify, poorly thought out, unstated, poorly documented, incorrectly expressed, out of date, or simply hard to find. These problems inspired us to devise a means of specifying these preconditions in a form that developers would find both useful and acceptable.
We developed a simple, formal specification language for modeling thread usage policies, which we call the language of thread role analysis. We devised appropriate abstractions of the key semantic building blocks of thread usage policies (including thread identity, concrete code segments, and data fields) so that developers can build a model of the thread usage policy (hereafter referred to as simply "policy") by expressing preconditions as simple precise annotations in program code.
Current Work
My focus is on bringing thread role analysis to bear on programs written in C11, which is the current standard of the C programming language (the core analysis is also similar to the analysis for Java). To ensure relevance to practicing programmers, we integrate our analysis with widely used programming tools—in this case, the CLANG/LLVM open-source compiler suite. The C11 work is still in its early stages, so I’ll present an example from our previous work in Java.
Electric Had a Problem
The developers of the Electric VLSI design tool had re-implemented it in Java for their V8.0 release, in part to support a change to a multi-threaded architecture. Users of the new version of the tool encountered seemingly random internal errors and crashes. These crashes were rarely repeatable, however, and usually disappeared when developers tried to debug the program. Since they were experienced developers, they quickly realized that these symptoms were probably caused by concurrency errors. This diagnosis seemed especially likely, since the major new feature in the most recent release was the implementation of a multi-threaded architecture for performance. The developers needed to quickly identify the problem in their 140 KLOC program, and we were able to do that in less than 8 hours using thread role analysis. We analyzed version 8.0.1, which contained roughly 140 KLOC in 44 Java packages.
We describe the process in the form of a reverse-engineering exercise, partly because that’s how we addressed the problem and partly because it helps explain thread role analysis. Note, however, that this reverse-engineering-focused description makes thread role analysis sound relatively time-consuming and hard. In case studies, we found that developers working on their own code could easily identify policies, answer questions about intended thread usage, and locate key points for annotation.
Identifying Thread Usage Policies
The first step in thread role analysis is to identify relevant pre-existing policies. We’ll use the Electric tool from our example because the developers had a pre-existing written thread usage policy, which appears here in a slightly edited form:
There is one persistent user thread called the DatabaseThread, created in com.sun.electric.tool.Job. This thread acts as an in-order queue for Jobs, which are spawned in new threads. Jobs are mainly of two types: Examine and Change. These jobs interact with the Database, which are objects in the package hierarchy com.sun.electric.database
The Rules are
Jobs are spawned into new Threads in order of receipt.
Only one Change job can run at a time.
Many Examine jobs can run simultaneously.
Change and Examine jobs cannot run at the same time.
Examine jobs can only look at the Database.
Change jobs can look at and modify the Database.
Because only one Change job can run at a time, the Change job is run directly in the DatabaseThread. Examine jobs are spawned into their own threads, which terminate when the Examine job is done.
We thus identified one of the two key policies that needed to be expressed; the other is the policy for Java’s AWT/Swing GUI framework, used by Electric’s graphical user interface (GUI). In less than one day of effort, we expressed models of these two policies and used the findings produced by our analysis tool to diagnose a set of "seemingly random intermittent failures" experienced by the development team and their users.
Policy for the GUI
The Electric GUI uses the AWT and Swing frameworks, which share a single thread usage policy. The GUI framework implementation is not multi-threaded; rather, it executes in its own "event thread" and prescribes rules for how non-GUI threads may interact with it through the framework APIs. The salient points of the policy (according to Bowbeer) are as follows:
There is at most one "AWT thread" at a time per application.
There may be any number of separate "Compute threads."
A Compute thread is forbidden to paint or to handle AWT data structures or events. Failure to comply can lead to exceptions from within the AWT, because the AWT avoids both potential deadlock and data races by accessing its internal data structures from within a single thread, without the use of locks.
Extended computation on the AWT thread is forbidden; "brief" computation is acceptable. While the thread is computing, it cannot respond to events or repaint the display; this "freezes" the GUI until the computation finishes.
It is important to note that the AWT data structures mentioned above explicitly include any fields of user-defined classes that extend the library-defined AWT and Swing framework classes.
Thread Role Versus Thread Identity
The policies expressed in prose above allow us to highlight an important feature of thread role analysis—focusing on thread roles rather than thread identities. The GUI policy above speaks of the "AWT thread"; although it fails to mention that, in some implementations, its identity changes from time to time. We don’t care which specific thread is the AWT thread right now; we care only that it performs the role of "AWT thread." Similarly, Electric’s thread usage policy permits multiple Examine jobs, each with its own thread. We care about the "Examine" role but not about the identity of the various threads that perform it.
Expressing Thread Usage Policy
The first step is to declare any needed thread roles. The @ThreadRole annotation declares names for thread roles. Roles are opaque identifiers that have their own namespace. The Electric policy mentions two thread roles: one for examining the Database and one for changing it (n.b.: Electric’s "Database" is a large data structure that represents the circuit the user is designing; it is not a database in the sense of MySQL or other relational databases). Here are the declarations of the thread roles for Electric, which are described as Java comments:
/** **@ThreadRole DBChanger , DBExaminer **@MaxRoleCount DBChanger 1 **@IncompatibleRoles DBChanger, DBExaminer, AWT **/
Likewise, here are the declarations of the thread roles for the GUI:
/** * @ThreadRole AWT, Compute * @MaxRoleCount AWT 1 * @IncompatibleRoles AWT, Compute **/
We will use the DBChanger role for all change Jobs and the DBExaminer role for all Examine jobs. Similarly, in the GUI policy, we see two thread roles: the "AWT thread" and "Compute threads." We declare these roles using the annotation @ThreadRole AWT, Compute, thus capturing the roles described in rules 1 and 2 of the GUI policy. We use the AWT role for the AWT thread and the Compute role for all other threads.
Next, we declare global constraints on thread roles. For example, an important aspect of the GUI policy indicates that the "AWT thread" and the "Compute threads" are distinct; it is forbidden for any thread to perform both roles at once. Similarly, the Electric policy implies that the Change and Examine roles are distinct. In each case, this incompatibility property allows us to conclude that a thread performing one of these roles necessarily excludes all of the others. Incompatibility is one of the few postconditions in our language for thread role analysis.
We state this incompatibility for the GUI framework by writing @IncompatibleRoles AWT, Compute for its API. The @IncompatibleRoles annotation in the snippet above specifies the incompatibility for Electric’s thread roles, as well as for the GUI’s AWT thread. These annotations capture the third rule of the GUI policy and the analogous—but implicit—rule for Examine and Change jobs from the Electric policy. Note that if we had followed the Electric policy literally, our annotation might have incorrectly omitted the AWT thread.
Finally, both the GUI policy and the Electric policy identify thread roles that may be performed by, at most, one thread at a time. We document this with @MaxRoleCount AWT 1 for the GUI and the similar annotation in the snippet above. Because "any number" is the default, we omit @MaxRoleCount annotations for the other thread roles.
Wrap Up and Look Ahead
We’ve now reached the halfway point in the process of expressing the thread usage policies. So far, we’ve identified the relevant thread usage policies along with the thread roles needed to express those policies. We’ve written annotations to declare the thread roles and to express the few global constraints on those roles—most notably incompatibility. In the second installment in this series, we’ll finish the thread usage policy by associating thread roles and thread role constraints with code segments, and show how thread role analysis enabled us to diagnose in 8 hours a violation that took "multiple" Electric developers "weeks of time" to fix independently. In the third and final post, we’ll discuss bringing thread role analysis to C11, along with techniques to improve adoptability by reducing required programmer effort and supporting analysis of partially annotated code.
Additional Resources
To read the paper, Composable Thread Coloring (which was an earlier name for the technique we now call thread role analysis) by Dean Sutherland and Bill Scherlis, please go towww.fluid.cs.cmu.edu:8080/Fluid/fluid-publications/p233-sutherland.pdf.
SEI
.
Blog
.
<span class='date ' tip=''><i class='icon-time'></i> Jul 27, 2015 02:46pm</span>
|
By Randy Trzeciak, Senior Member of the Technical Staff The CERT Program
According to the 2011 CyberSecurity Watch Survey, approximately 21 percent of cyber crimes against organizations are committed by insiders. Of the 607 organizations participating in the survey, 46 percent stated that the damage caused by insiders was more significant than the damage caused by outsiders. Over the past 11 years, CERT Insider Threat researchers have collected incidents related to malicious activity by insiders obtained from a number of sources, including media reports, the courts, the United States Secret Service, victim organizations, and interviews with convicted felons. From these cases, four patterns of insider threat behavior have been identified: (1) information technology (IT) sabotage, (2) fraud, (3) national security/espionage, and (4) theft of intellectual property (IP). From those patterns, our researchers developed controls that combine technological tools with behavioral indicators to identify employees at risk for committing cyber crimes. These tools and indicators provide those who monitor networks a better warning of potential anomalous behavior. This blog posting—the first in a series highlighting controls developed by the CERT Insider Threat Center—explores controls developed to prevent, identify, or detect IP theft.
Motives and Behaviors
By analyzing more than 700 insider threat cases, CERT researchers have identified a series of patterns and behaviors based on the motive of the perpetrator and the impact to an organization. For example, of the documented insider threat cases that we analyzed, 84 incidents are categorized as theft of IP in which employees take information with them as they leave to go to work for a competing organization, use the information to get a job with a competitor, or start their own competing company. In approximately one third of the theft of IP cases in the CERT database, the insider took the information and gave it to a foreign organization or government.
An interesting finding emerged when the researcher analyzed these cases: the majority of the insiders (approximately 70 percent) who steal IP do so within 30 days of announcing their resignation. This window gives an organization an opportunity to detect potential malicious activity. Many organizations do not have the resources to alert and investigate everytime a document is sent off of the network, so this window may allow for focused attention during higher risk periods, thereby reducing the high volume of false positives that may be returned via continual data leakage identification. That finding is used when developing the theft-of-IP technical control outlined in this blog. Based on these observations, we constructed a model of employees who steal information. The model takes into account technological variables, social variables, and the relationships between them.
By studying the patterns in various cases, we observed how the crimes tend to evolve over time, and the trends we noticed provided the foundation for our model. After our researchers established the model, they narrowed their focus to portions of it where controls may be applied to prevent or detect information leaving the organization’s network. For example, they configured a tool alert on suspicious activity possibly indicating that a departing employee may be stealing intellectual property. An organization can then use an open source, log-aggregation tool to write rules to alert when potential suspicious activity is observed. For example, analysts can write a query in a log-aggregation tool, such as SPLUNK, to flag employees who meet these criteria:
Their system accounts were disabled or are scheduled to be disabled in the next 30 days.
They are sending email through the organization’s network.
Those emails include attachments.
Analysts can further refine the SPLUNK tool to focus on employees in that group who have resigned within the last 30 days and are sending emails with attachments from personal email accounts. (Much of the activity will probably be authorized, but the approach allows organizations investigating suspicious activity to gain a better idea of what activity warrants additional investigation.)
Our aim with this research is not to create new tools, but rather to allow organizations to configure their existing arsenal so it is more effective at preventing or detecting malicious insider activity. The controls developed by our researchers should be used in addition to existing tools that many organizations already own, including
Data loss prevention (DLP) tools. These tools allow organizations to prioritize critical assets, and observe when employees are accessing data and when that data is being sent through the network.
Digital rights management (DRM) tools. These tools allow organizations to require that critical data be validated or authenticated against data on its network. For example, if information was taken off a network, it could not be used on another network, and no one would be able to open it up and view it.
Security information and event management (SIEM) tools. These tools allow organizations to detect anomalies on networks and networked systems. One example of such an anomaly would be an employee’s login outside of normal working hours using a remote connection.
Telltale Signs
In the October 2011 SEI technical note titled Insider Threat Control: Using Centralized Logging to Detect Data Exfiltration Near Insider Termination, CERT researchers Michael Hanley and Joji Montelibano described the controls developed to prevent IP theft. They reported that the primary vehicles for data exfiltration over the network are corporate email systems or web-based personal email services. They therefore concluded that organizations should consider doing the following when trying to prevent, detect, or deter IP theft:
Monitor for misuse of web-based personal email services. This mode of exfiltration will be addressed in detail in future research.
Monitor for email to the organization’s competitors or the insider’s personal account. Corporate email accounts running on an enterprise-class service contain robust auditing and logging functionality available for use in an investigation or, in this case, a query to detect suspicious behavior.
Taking these factors into account, an organization can proceed on an implementation strategy for these conditions on a logging engine. Hanley and Montelibano defined the following implementation outline: If the mail is from the departing insider and the message was sent in the last 30 days and the recipient is not in the organization’s domain and the total bytes summed by day are more than a specified threshold then send an alert to the security operator
With the time element serving as the root of a query, any of the following could be used to verify the query:
an active directory
a lightweight directory access protocol (LDAP) directory service
partial human resources records
badge access status
After the query has narrowed the field to all mail sent within a certain timeframe (the 30-day window), the query will next identify mail traffic that has left the local domain namespace of the organization. This constraint flags email messages to recipients in a namespace that the organization has no control over. The next constraint examines the email byte count to identify exfiltrated data. Hanley and Montelibano set a reasonable per-day byte threshold of between 20 and 50 kilobytes to identify whether several attachments or large volumes of text pasted into the bodies of email messages have left an organization’s network on a given day.
Future Research
Our future research is focusing on verifying that control models are still applicable and on developing new controls to address other modes of insider crime. The next blog post in this series will examine research that developed controls to prevent, detect, or mitigate IT sabotage by insiders.
Additional Resources
To read the SEI technical note, Insider Threat Control: Using Centralized Logging to Detect Data Exfiltration Near Insider Termination, please visit www.cert.org/archive/pdf/11tn024.pdf.
To read the CERT Insider Threat blog, please visit www.cert.org/blogs/insider_threat/.
Jul 27, 2015 02:46pm
Part 2: Understanding Success Drivers
By Douglas C. Schmidt, Principal Researcher
Common operating platform environments (COPEs) are reusable software infrastructures that incorporate open standards; define portable interfaces, interoperable protocols, and data models; offer complete design disclosure; and have a modular, loosely coupled, and well-articulated software architecture that provides applications and end users with many shared capabilities. COPEs can help reduce recurring engineering costs, as well as enable developers to build better and more powerful applications atop a COPE, rather than wrestling repeatedly with tedious and error-prone infrastructure concerns. Despite technical advances during the past decade, however, building affordable and dependable COPE-based solutions for the DoD remains elusive. This blog posting—the second in a three-part series—builds upon the first posting to describe key success drivers for COPEs that proactively and intentionally exploit commonality across multiple DoD acquisition programs.
This blog is based on work by researchers at (and associated with) the SEI—including myself, Adam Porter, John Robert, and Mike McLendon—who are investigating how to create and govern COPEs successfully. We have identified the following three classes of drivers that DoD acquisition programs and system integrators must master to improve the odds of succeeding with COPEs:
Business drivers: Achieving effective governance and broad acceptance of the economic aspects of COPEs. When the DoD had the resources to acquire and sustain many redundant solutions, it was often hard to motivate the adoption of common software services and capabilities within the acquisition community. Adopting these services and capabilities was perceived as introducing program and technical risks and potentially impeding the ability of system integrators to offer more unique, custom solutions that were more expensive and perhaps perceived as more effective. Now that the status quo is no longer economically viable (which has been the case for many years and is now exacerbated in the shadow of sequestration), government and defense industry leadership has renewed interest in COPE initiatives. While this top-down support for COPEs is welcome and necessary, it is insufficient if program managers and system integrators do not fully accept the need to adopt new business models.
Because both government and industry are significantly affected by new business models, they must devise collaborative, socio-technical ecosystems where participants share the risks and rewards of COPEs. One promising approach is managed consortia, which provide a solid commercial and legal foundation for forming and coordinating COPE government and industry stakeholders. These consortia are even more effective when they yield interoperable, open standards that are implemented by multiple open- and closed-source suppliers.
Two often overlooked COPE strategy components are policy and governance, which are essential to the success of collaborative approaches. These components are often not addressed explicitly because they are not perceived as important and are "organizationally messy." DoD acquisition policy and guidance must emphasize COPE as an acquisition business and technical strategy at all levels within the acquisition and sustainment community.
The collaborative environment also demands that concepts, structures, and processes for governance be an integral component of the overall COPE strategy. DoD program offices also need to work closely with system integrators to ensure that proper contracting models are adopted to incentivize cost-effective, on-time delivery of innovative and integrated solutions. While COPE-based technical solutions may be feasible and desirable, they cannot be achieved unless the government first puts in place the proper contract models to incentivize technical and program behaviors consistent with the government's business goals. Effective contract models can also enable rapid delivery-order execution, which helps streamline technology insertion. In addition, the government must negotiate the necessary licenses and data rights, technical data on verification and validation facilities, and procedures to decrease total ownership costs over program life cycles by retaining access to key software and documentation artifacts throughout the development and sustainment phases.
Management drivers: Ensuring effective leadership and guidance of COPE initiatives. While it has become fashionable to pay lip service to the goals and benefits of COPEs, it is much harder to find program managers and senior acquisition executives who can successfully sell and defend the near-term investments in time and effort needed to achieve the long-term payoffs of COPEs. These leaders must not only recognize the strategic role of software in DoD systems, but also articulate this role in ways that resonate with congressional appropriators, authorizers, and their staffs.
Technical and acquisition leaders should also be savvy enough to avoid placing their bets on technological "silver bullets" and fads. They should likewise manage the application of modern iterative and incremental methods at scale, as opposed to traditional waterfall methods. COPEs are most effective when they are developed with strong feedback loops between developers of the reusable COPE infrastructure and developers of applications that use the COPE. Since COPE efforts rarely have the time or resources to please all customers, it is important for managers to be goal-directed—rather than exhaustive—when determining which common assets to develop and sustain. Without continual interaction with application developers, software artifacts produced by COPE developers rarely address core business problems and thus will not be reused effectively.
COPE technical managers must also know when to build and when to buy reusable software platforms and tools. Managers who cling tenaciously to particular platforms or tools, and who ignore all other options, typically trade short-term progress for long-term pain. A more effective, long-term approach involves working with open standards and establishing affiliations with industry standards groups to ensure continuity across the COPE life cycle.
Technical drivers: The foundations of COPE development. The operational and programmatic success drivers for COPEs often garner the most attention because they fundamentally depend on people, who represent the most complicated and demanding part of socio-technical ecosystems. Even if we could magically solve these vexing challenges, many technical drivers still influence the success or failure of COPEs.
To start with, developing a successful COPE requires a clear architectural vision. This vision should be codified and documented by experienced software and system architects who possess a deep understanding of the canonical patterns and architectural styles of the domain(s) associated with a COPE. Other key elements associated with achieving an architectural vision for COPEs include
developing open reference implementations for key parts of a COPE infrastructure to help DoD programs avoid getting locked into proprietary solutions
adopting effective licensing models to ensure broad adoption and commercialization of COPE components
ensuring a strong connection with R&D communities in software engineering and systems engineering to help mitigate technical risks
Having a strategy for mitigating technical risks is particularly important for new and planned systems. Although the DoD and software R&D communities have some knowledge about foundational patterns and architectures for legacy DoD systems, they are less aware of key patterns and architectural styles for emerging systems, particularly net-centric systems-of-systems. Unfortunately, there are too few designated software and system architect positions in the acquisition community and program offices to ensure that an architecture-focused vision is a driving foundational and life-cycle technical imperative.
DoD COPEs necessarily comprise a wide range of network, hardware, and software configurations; different algorithms; and different security profiles. This variation is a key driver of total ownership costs because it affects the time and effort required to assure, optimize, and manage unique system deployments and their many configurations throughout the life cycle. To manage this variation effectively, the SEI helped pioneer software product lines (SPLs), which have been applied in COPEs to manage software variation while reusing large amounts of code that implement common features within a particular domain. SPL-based COPEs help reduce software development and sustainment costs by maintaining and validating reusable components in a common repository.
Other technical drivers associated with successful COPEs include (but are by no means limited to) the following:
domain-engineering and use-case analysis methods that elicit and document COPE commonality and variability requirements and software architectures
iterative and incremental life cycle methods, processes, and toolkits that help developers better plan, measure, and improve software producibility so they have better confidence in COPE quality and cost estimates
software frameworks that codify the expertise needed to implement COPEs in the form of reusable algorithms and extensible and/or reusable component implementations
software patterns that codify expertise needed to design COPEs in the form of reusable architecture themes and styles, which can be reused even when algorithms, components implementations, or frameworks cannot
commercial-off-the-shelf component-based and service-oriented middleware that codifies expertise needed to develop COPEs in the form of portable open standard interfaces, interoperability protocols, and reusable building blocks
COPE-specific middleware components and services that provide APIs and data models via a simpler facade that shields applications from the powerful (and complex) capabilities of the underlying domain-specific middleware frameworks
higher-level languages, analysis tools, and model-driven engineering technologies that enhance the productivity of COPE application developers and support "correct-by-construction" generation of software artifacts
automated verification and validation methods, standards conformance test suites, and system execution modeling tools that leverage the powerful, commoditized computing resources in a distributed, continuous manner to improve persistent quality attributes of COPEs, ensure portability and interoperability, and assure key functional and performance attributes
Succeeding with COPEs
This blog posting just scratched the surface of the technical and non-technical issues associated with developing and sustaining COPEs for the DoD. In our experience working with many COPE initiatives over the past two decades, achieving success requires a multi-dimensional perspective to foster effective COPE ecosystems and leverage key linkages between the success drivers identified above. Organizations implementing COPE initiatives that address these drivers in a thorough and holistic manner thus have a fighting chance to succeed. Many challenges persist, however, as evidenced by the relatively few COPE success stories to date for the DoD. History shows that organizations that do not understand (or do not execute) these drivers properly will fail, often at great expense and great detriment to the warfighter.
My next blog posting describes our work at the SEI on a COPE maturity model to help military and commercial organizations assess and improve their progress in developing and adopting systematic software reuse approaches for DoD acquisition programs. We welcome your feedback in the comments section below with suggestions on how the DoD can improve the technologies and ecosystems needed to develop COPEs more effectively.
Additional Resource:
To read the SEI technical report, A Framework for Evaluating Common Operating Environments: Piloting, Lessons Learned, and Opportunities, please visit www.sei.cmu.edu/library/abstracts/reports/10sr025.cfm.
Jul 27, 2015 02:45pm
By Randy Trzeciak, Senior Member of the Technical Staff, The CERT Program
According to the 2011 CyberSecurity Watch Survey, approximately 21 percent of cyber crimes against organizations are committed by insiders. Of the 607 organizations participating in the survey, 46 percent stated that the damage caused by insiders was more significant than the damage caused by outsiders. Over the past 11 years, researchers at the CERT Insider Threat Center have documented incidents related to malicious insider activity. Their sources include media reports, the courts, the United States Secret Service, victim organizations, and interviews with convicted felons. From these cases, CERT researchers have identified four models of insider threat behavior: (1) information technology (IT) sabotage, (2) fraud, (3) national security/espionage, and (4) theft of intellectual property (IP). Using those patterns, our researchers have developed network monitoring controls that combine technological tools with behavioral indicators to warn network traffic analysts of potential malicious behavior. While these controls do not necessarily identify ongoing cyber crimes, they may identify behaviors of at-risk insiders that an organization should consider for further investigation. This blog posting, the second in a series highlighting controls developed by the CERT Insider Threat Center, explores controls developed to prevent, identify, or detect IT sabotage.
Existing technical tools can be better configured to prevent instances of IT sabotage. Many organizations deploy data loss prevention (DLP) and digital rights management (DRM) tools to try to stop theft of IP, or security information and event management (SIEM) tools to mitigate IT sabotage. These tools can detect and examine network traffic, but distinguishing anomalous from normal behavior remains hard.
Behavioral Indicators Prior to IT Sabotage
The CERT Program’s research has shown that employees who commit IT sabotage typically exhibit certain behavioral indicators prior to the crime. These usually begin with an employee’s unmet expectations of the organization, precipitated by a negative workplace event, such as being passed over for a promotion, failure to receive a raise or bonus, or demotion. Next, the employee becomes disgruntled and seeks revenge against the organization for a perceived injustice.
Some behavioral indicators that may be observable in IT sabotage cases are performance problems, conflicts with coworkers or supervisors, outbursts in the workplace, and tardiness. The situation escalates to a point where the disgruntled employee sets up an attack using technical means. If such insiders have been denied access to the organization’s network, they often find ways to regain access (such as exploiting an unknown access path) to deploy their malicious code and then leave the organization or are terminated. The impact to the organization tends to become visible only after the insider’s departure.
Using the Security Information and Event Management (SIEM) Signature
The following SIEM signature can be used to determine the identity of individuals engaging in behaviors that an organization should consider investigating further, what remote connection protocol they are using, and whether this activity is occurring outside normal working hours. The signature is based on the following key fields: username, VPN account name, hostname of the attacker, and whether the attacker is using SSH, Telnet, or RDP.
The characteristics of insider attacks include remote access to the organization's information systems outside normal working hours. Given these characteristics, we developed the following signature:
Detect <username> and/or <VPN account name> and/or <hostname> using <ssh> and/or <telnet> and/or <RDP> from <5:00 PM> to <9:00 AM>
Note: This signature should only be applied to individuals who warrant increased scrutiny. This signature should not be applied to all privileged users because it will generate inordinate false positives.
Two standards were used to create the SIEM signature: the Common Event Format (CEF) and the Common Event Expression (CEE):
The Common Event Format (CEF) is an event interoperability standard developed by ArcSight. The purpose of this standard is to improve the interoperability of infrastructure devices by instituting a common log output format for different technology vendors. It assures that an event and its semantics contain all necessary information. Using this standard and the key indicators identified during the database analysis, we developed two CEF-based SIEM signatures, for Microsoft and Snort products, to identify suspected attackers.
The Common Event Expression (CEE) architecture defines an open and practical event log standard developed by MITRE. Like CEF, the purpose of CEE is to improve the audit process and users’ ability to effectively interpret and analyze event log and audit data. It standardizes the event-log relationship by normalizing the way events are recorded, shared, and interpreted. Using the CEE format, we developed a signature based on the key indicators of insider IT sabotage. The signature identifies a suspected attacker who is using a remote connection to log onto the organization’s internal system outside normal working hours, and it also logs the time the event was recorded.
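For illustration, a CEF line consists of a pipe-delimited header (version, device vendor, device product, device version, signature ID, name, severity) followed by space-separated key=value extensions. The sketch below renders such a line; the vendor, product, and signature values are invented for the example:

```python
def cef_event(vendor, product, version, signature_id, name, severity, **ext):
    """Render an event line in ArcSight's Common Event Format (CEF).

    Header fields are pipe-delimited; extensions are key=value pairs.
    """
    header = f"CEF:0|{vendor}|{product}|{version}|{signature_id}|{name}|{severity}"
    extension = " ".join(f"{k}={v}" for k, v in ext.items())
    return f"{header}|{extension}"
```

A hypothetical after-hours login event might then be emitted as cef_event("ExampleCo", "SIEM", "1.0", "100", "After-hours remote login", "5", suser="jdoe", app="ssh"), where suser is CEF's standard source-user extension key.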
Recognizing these behaviors has allowed us to create rules for when to apply a SIEM signature to detect insiders at risk of committing IT sabotage. By applying a SIEM signature, network traffic analysts can detect changes in configuration and changes in timing of network connections and specifically look at people who log in to the network outside of normal working hours. Using nontechnical indicators with the signature also helps to minimize the number of false positives. By combining behavioral and technical aspects, the SIEM signature can be used to help organizations act proactively, not reactively, to protect themselves.
Future Research
We are not advocating that organizations "advertise" the controls in an attempt to dissuade disgruntled employees from harming the organization. Instead, we want to persuade organizations to improve the communication between human resources, managers, and co-workers to identify potentially disgruntled employees and apply additional IT controls (including the SIEM signature) to identify potentially suspicious changes to critical files.
Our future work includes enhancing the CERT insider threat database by collecting incidents, verifying that the behavioral model is still current and applicable, and customizing the model to create more controls. We will continue to use our insider threat lab to test tools, develop controls, and make better recommendations for existing or new configurations of tools to prevent, detect, or respond to malicious attacks on a network.
Additional Resources
To read the technical report Insider Threat Control: Using a SIEM Signature to Detect Potential Precursors to IT Sabotage, please visit www.cert.org/archive/pdf/SIEM-Control.pdf.
To read the technical report Using Centralized Logging to Detect Data Exfiltration Near Insider Termination, please visit www.cert.org/archive/pdf/11tn024.pdf.
To read more about the CERT Program’s Insider Threat research, please visit www.cert.org/insider_threat/.
To read about the new book The CERT Guide to Insider Threats by Dawn Cappelli, Andrew Moore, and Randy Trzeciak, please visit www.sei.cmu.edu/newsitems/insider-book.cfm.
To read the CERT blog post Insider Threat Control: Using an SIEM signature to detect potential precursors to IT sabotage, please visit www.cert.org/blogs/insider_threat/2012/01/insider_threat_control_using_a_siem_signature_to_detect_potential_precursors_to_it_sabotage.html.
Jul 27, 2015 02:45pm
By Marc Novakouski, Member of the Technical Staff, Research, Technology & System Solutions
Our modern data infrastructure has become very effective at getting the information you need, when you need it. This infrastructure has become so effective that we rely on having instant access to information in many aspects of our lives. Unfortunately, there are still situations in which the data infrastructure cannot meet our needs due to various limitations at the tactical edge, which is a term used to describe hostile environments with limited resources, from war zones in Afghanistan to disaster relief in countries like Haiti and Japan. This blog post describes our ongoing research in the Advanced Mobile Systems initiative at the SEI on edge-enabled tactical systems to address problems at the tactical edge.
At the tactical edge, the people that need the information the most—warfighters, first responders, or other emergency personnel—depend on timely and valuable information to perform their tasks, or even survive. Unfortunately, the access to the information they need can be extremely hard to achieve, for the following reasons:
information overload stemming from too much information, coupled with an inability to locate truly vital information
information obscurity due to a lack of awareness of the available information, aka "you don’t know what you don’t know"
resource scarcity manifested as insufficient bandwidth, central processing unit (CPU) power, battery power, or even attention span to get the needed information and continue to process, exploit, and disseminate it for as long as needed
The remainder of this posting describes how we are tackling the information overload and information obscurity aspects of this problem by developing context-aware mobile applications.
A Different Approach to Context-Aware Mobile Applications
Context awareness in the mobile environment is not a new field of research. Most mobile devices come preloaded with applications that use location or time to account for user context. There is certainly no shortage of similar applications available for download. We decided, therefore, to explore alternative sources of data that would not only push the limit of what could be done with user context, but also focus on the extremely challenging environment at the tactical edge.
Our "eureka" moment came when we realized that when warfighters or first responders are at the tactical edge, they are almost never operating alone. As a result, the most important contextual information to warfighters or first responders is the context of the people in the group, and how they relate to that context. This realization drove us to explore group context-aware mobile applications. These applications would, if built correctly, first consider individual user context and then relate that information to the group context, thereby helping users understand both their own state, as well as the state of the group in which they participate.
Group context-aware mobile applications clearly have value at the tactical edge. For example, warfighters are well served by having access to positions of friends and foes on the battlefield (position data being a simple case). They could also benefit from supportive applications that monitor resources, such as food, ammunition, or vital signs. With sufficient data and processing power, these applications could even use historical trends to determine dynamically if a squad is walking into a possible ambush situation.
In less deadly (yet still hostile) environments, such as tsunami disaster areas, the ability to share information about resource needs, dangerous situations, or health emergencies in a structured way can also be valuable. Such applications could tailor information to managers, construction workers, doctors, and other emergency personnel to help coordinate an effective emergency response.
An extensive literature review on context awareness yielded relatively little research on the topic of group context. Much of the prior work cites the basic context model developed by Anind Dey, but does not expand the model past the individual, choosing instead to tailor the model to a particular domain. Our research project, called Information Security to the Edge (ISE), explores the structure, applications, and implementation of a context model that includes group information. We have constructed a prototype application on the Android platform that implements the essential components needed by group context-aware mobile applications, as discussed next.
App Architecture - Logic and Data
The ISE prototype application follows the common model-view-controller (MVC) pattern, which decomposes an application into the following parts:
The model is the data. This data is the information processed by the application. For example, the words typed by the user into a word processing application are data.
The view is the user interface. In the case of a word processing application, the view would be the buttons, menus, scroll bars, and other visual effects provided by the application to help a user write a document.
The controller is the logic. In the case of a word processing application, the controller would be the rules the application uses to save, present, filter, and otherwise modify the text. The function provided by each button or menu item can also be part of the controller.
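The three parts above can be sketched in miniature; the toy classes below are purely illustrative (a word list standing in for a word processor), not part of the ISE prototype:

```python
class Model:
    """The data: just the words typed by the user."""
    def __init__(self):
        self.words = []

class View:
    """The user interface: renders the model for display."""
    def render(self, model):
        return " ".join(model.words)

class Controller:
    """The logic: rules for modifying the model in response to input."""
    def __init__(self, model):
        self.model = model
    def type_word(self, word):
        self.model.words.append(word.strip())  # e.g., a filtering rule
```

The value of the pattern is that each part can change independently: a new view can render the same model, and the controller's rules can evolve without touching either.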
Consistent with the MVC pattern, the ISE prototype has a central control mechanism (which forms the "brains" of the application) that manages data flow through the application. In practice, this means that the central controller coordinates data flow and processing through the following primary application elements:
The context engine is the central processor for all context information used by the application. As device sensors report new data and applications on external devices send data to the local application, all data is passed through the engine so that new events are detected as they occur. For example, if an external user sends GPS coordinates indicating they are within 100 feet of a warfighter, the device can alert the warfighter to their presence. Expanding on this concept, if a group task must be performed but everyone is working individually on their own tasks, the local device can monitor task status and user position and report to the leader when all group members are ready and close by, so the group task can be performed.
The sensor manager accepts data from sensors that reside upon the mobile device. A typical smart phone contains position sensors, movement sensors, and in some cases, light and proximity sensors. The application captures data from these sensors and passes it through the sensor manager. The sensor manager enables the sensors and controls their sample rate, so that the application can tailor usage to the situation and avoid overwhelming the system.
The communications manager acts as the gateway to all external communications within the system. This gateway currently includes Bluetooth and TCP/IP communications, but can be expanded to include other communication mechanisms that are available to the device. Any messages to and from users on other devices are passed through the communications manager.
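A hypothetical sketch of the context engine's proximity check described above follows; the flat-earth distance approximation, the names, and the 100-foot threshold (taken from the example in the text) are assumptions, not the prototype's actual implementation:

```python
import math

PROXIMITY_FEET = 100.0  # alert threshold from the example above

def feet_between(a, b):
    """Approximate distance between two (lat, lon) points in feet.
    A flat-earth approximation, adequate only for short distances."""
    dlat = (a[0] - b[0]) * 364_000   # ~feet per degree of latitude
    dlon = (a[1] - b[1]) * 288_200   # ~feet per degree of longitude (mid-latitudes)
    return math.hypot(dlat, dlon)

def proximity_alerts(local_position, external_positions):
    """Return users whose reported GPS position is within the threshold."""
    return [user for user, pos in external_positions.items()
            if feet_between(local_position, pos) <= PROXIMITY_FEET]
```

In the prototype's terms, the communications manager would deliver each external position report to the context engine, which would run a check like this and raise an alert through the UI.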
The sensor and communications manager architecture consolidates all sensor and communication concerns into a single location. This consolidation enabled us to build a standardized interface that simplifies integrating an arbitrary sensor (for example, a radiation sensor) or an arbitrary communication mechanism (for example, a line-of-sight radio that communicates with UAVs) with the application. We tested this feature through a collaboration with Joao Sousa of George Mason University, which resulted in an alternative communication mechanism that was integrated with the prototype in only a few weeks of effort, instead of months or years. We anticipate leveraging these standardized interfaces to collaborate with a variety of external groups and organizations as new sensor technologies and communication mechanisms become available.
App Architecture - User Interface (UI)
The ISE app, through the use of Android UI screens called Activities, reflects the view part of the MVC pattern. There are currently only three supported UIs in ISE:
User View: Allows users to look at the people with whom they are or can be connected, as well as the context data associated with each person.
Task View: Allows users to create their own tasks, receive updates about other users’ tasks, and mark their tasks complete or incomplete. We are expanding the task view to include hierarchical tasks, with main tasks and subtasks under them. Ultimately, we will develop a capability that displays complex missions in an intuitive manner.
Alerts View: As events occur, some will automatically appear in the alerts view along with a list of the considerations the context engine has identified as items of importance for users. The alerts presented will be tailored to the needs and context of individual users.
We are upgrading the ISE architecture to support any UI that subscribes to standardized updates from the data services.
Challenges
One challenge we face involves accounting for the lack of network infrastructure, in particular the limited bandwidth of the available communication channels. We are building atop communication capabilities that other organizations are field testing in Afghanistan to tailor our solution to practical field situations.
A second challenge involves providing warfighter access to backend data sources. Soldiers told us that important information is available in such sources, but they can’t readily find the relevant information. Moreover, they can’t access the database in the field. Other Advanced Mobile Systems work is investigating ways to provide access to critical data through the use of cloudlets.
A third challenge involves reducing the user’s cognitive load by limiting the amount of interaction and attention required of the user. Residents in a metropolitan area can use their smart phones without undue concern for being in a distracted state, as long as they are not engaging in tasks that demand undivided attention. A soldier in Haiti, on the other hand, must be cognizant of crumbling buildings, while a warfighter on the ground in Afghanistan might need to digest information while taking enemy fire. Our goal is to use hardware that allows the warfighter to capture and process information seamlessly, without sacrificing valuable time and resources.
We are also addressing the challenge of resource scarcity. Resources are limited at the tactical edge and warfighters are typically limited to the power and bandwidth of whatever devices they can carry. We are therefore exploring resource optimization based upon our expanded model of context. For example, if a warfighter’s assignment involves driving through a known safe area, it may not be necessary for the smartphone to activate the GPS capability. By optimizing the system to use sensors only when needed, warfighters can save battery power, CPU cycles, and communication bandwidth that can be used to support other mission-critical needs.
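Such context-driven sensor gating might be sketched as follows; the area names and sensor list are hypothetical, chosen to mirror the safe-area example above:

```python
# Areas the mission plan marks as known safe (hypothetical data).
SAFE_AREAS = {"route-green"}

def sensors_to_enable(current_area, base_sensors=("gps", "accelerometer")):
    """Disable the GPS while driving through a known safe area, saving
    battery, CPU cycles, and bandwidth for mission-critical needs."""
    if current_area in SAFE_AREAS:
        return tuple(s for s in base_sensors if s != "gps")
    return tuple(base_sensors)
```

The sensor manager described earlier, which already controls sensor enablement and sample rates, would be the natural place to apply a policy like this.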
Finally, our work will not have the desired impact if we cannot meet the challenge of relevance. Warfighters made it clear to us that if a device or application is not directly useful to their immediate task, it will be ignored. On any given day, a warfighter in Afghanistan may be asked to determine whether a particular individual is a threat, sweep a village to establish the identities of residents, deliver food to children, or check for a weapons cache. These different missions affect the type of information that interests soldiers and the type of information a software application should consider. Solving this problem requires a deep understanding of the needs of soldiers and the missions in which they engage. We are leveraging this domain knowledge so our ISE application can tailor information processing to a particular mission, thereby ensuring relevance to the current mission and the ability to change mission parameters as needed.
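Mission-tailored filtering of this kind can be sketched simply. The mission profiles and information tags below are invented for illustration; the actual ISE application's taxonomy is not public in this post.

```python
# Hypothetical mission profiles mapping a mission type to the information
# tags relevant to it. A real system would derive these from doctrine
# and soldier feedback rather than a hard-coded table.
MISSION_PROFILES = {
    "threat_assessment": {"biometrics", "known_associates", "incident_reports"},
    "identity_sweep":    {"biometrics", "census_records"},
    "aid_delivery":      {"population_density", "road_status"},
}

def relevant(items, mission):
    """Keep only information items tagged as relevant to the current mission."""
    wanted = MISSION_PROFILES.get(mission, set())
    return [item for item in items if item["tag"] in wanted]

items = [{"tag": "biometrics", "body": "..."},
         {"tag": "road_status", "body": "..."}]
print(relevant(items, "aid_delivery"))  # only the road_status item survives
```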
Looking Ahead
The ISE prototype is just one part of our strategy to address the problems of information overload, information obscurity, and resource scarcity. The Advanced Mobile Systems initiative is also engaged in the Edge-Enabled Programming project, as well as the Resource Optimization for Mobile Platforms at the Edge project. Each project attacks these three problems from a different perspective. We intend to integrate the projects after they have matured, thereby providing an end-to-end solution to warfighters and first responders at the tactical edge.
Additional Resources
For more information on the MVC pattern, consult the books Documenting Software Architectures: Views and Beyond and Pattern-Oriented Software Architecture, Volume 1.
SEI Blog | Jul 27, 2015 02:44pm
By Bill Scherlis, Chief Technology Officer (Acting), SEI
The extent of software in Department of Defense (DoD) systems has increased by more than an order of magnitude every decade. This is not just because there are more systems with more software; a similar growth pattern has been exhibited within individual, long-lived military systems. In recognition of this growing software role, the Director of Defense Research and Engineering (DDR&E, now ASD(R&E)) requested the National Research Council (NRC) to undertake a study of defense software producibility, with the purpose of identifying the principal challenges and developing recommendations regarding both improvement to practice and priorities for research. The NRC appointed a committee, which I chaired, that included many individuals well known to the SEI community, including Larry Druffel, Doug Schmidt, Robert Behler, Barry Boehm, and others. After more than three years of effort—which included an intensive review and revision process—we issued our final report, Critical Code: Software Producibility for Defense. In the year and a half since the report was published, I have been asked to brief it extensively to the DoD and the Networking and Information Technology Research and Development (NITRD) communities.
This blog posting, the first in a series, highlights several of the committee’s key findings, specifically focusing on three areas of identified improvements to practice—areas where the committee judged that improvements both are feasible and could substantially help the DoD to acquire, sustain, and assure software-reliant systems of all kinds. The "help" is in the form of reduced costs, greater productivity, improved schedules, and lower risks of program failure—and also in enabling the DoD to build systems with much greater levels of capability, flexibility, interlinking, and assurance. The next blog postings will cover some of the lessons learned since the report came out.
Practice Improvement 1: Process and Measurement
Success in developing software-dominated systems requires organizational processes that enable managers and developers to set achievable goals, analyze data, and guide decisions—and to succeed in these processes despite rapid change in operating context and in the technical and infrastructural environment. Advances related to process and measurement help facilitate broader and more effective use of incremental and iterative development methods, which have relatively short process feedback loops. These iterative approaches can better accommodate change and uncertainty. As a consequence, these approaches are commonplace in commercial and enterprise development. But for the DoD, advances in incremental and iterative development methods must account for the typical "arms-length" relationships, common in acquisition programs, that exist between contractor development teams and government stakeholders.
Incremental development practices enable continuous identification and mitigation of engineering risks during a system’s development process. Engineering risks pertain to the consequences of particular choices made within an engineering process—the risks are high when the outcomes of immediate project commitments are consequential, hard to predict, and apparent only well after the commitments are made. Engineering risks may relate to many different kinds of engineering decisions—most notably architecture, quality attributes, functional characteristics, and infrastructure choices.
When managed properly, incremental practices can enable successful innovative engineering without increasing the overall programmatic risk related to completing engineering projects, such as managing stakeholder expectations and triaging priorities for cost, schedule, capability, quality, and other attributes. Incremental practices help identify and mitigate engineering risks earlier in system lifecycles than traditional waterfall approaches—the feedback is sooner, and so the costs and consequences are lower. These practices are enabled through the use of diverse techniques, such as modeling, simulation, prototyping, and other means for early validation—coupled with extensions to earned-value models that measure and acknowledge the accumulating body of evidence in support of program feasibility. Incremental methods include iterative approaches (such as Agile), staged acquisition, evidence-based systems engineering, and other methods that explicitly acknowledge engineering risks and their mitigation.
The committee found that incremental and iterative methods are of fundamental significance to innovative, software-reliant engineering in the DoD, and they can be managed more effectively through improvements in practices and supporting tools. The committee recommended a diverse set of improvements related to advanced incremental development practice, supporting tools, and earned-value models.
Practice Improvement 2: Architecture
In software-reliant DoD systems, architecture represents the earliest and often most important design decisions—those that are the hardest to change and the most critical to get right. Architecture is the principal way we address requirements related to quality attributes such as performance, security, adaptability, and the like. Architectural design also embodies expectations regarding the various dimensions of variability and change for a system. When architecture design is successful, system quality is more predictable, and change is more likely to be accommodated through smaller increments of effort rather than through wholesale restructuring of systems.
Advances related to architecture practice thus contribute to our ability to build systems with demanding requirements related to quality attributes, interlinking, and planned-for flexibility.
Software architecture techniques and tools model the structures of a system, which comprise software components, the externally visible properties of those components, and the relationships among the components. Architecture thus has both structural and semantic aspects—it is not just about how components interconnect. Good architecture entails a minimum of engineering commitment that yields a maximum of business value. Architecture design is thus an engineering activity that is separate, for example, from standards-related policy setting and the certification of commercial ecosystems and components.
For complex innovative DoD systems, architecture definition embodies planning for flexibility—defining and encapsulating areas (such as common operating platform environments and cyber-physical systems) where innovation, change, and competition are anticipated. Architecture definition strongly influences diverse quality attributes, ranging from availability and performance to security and isolation. It also embodies planning for the interlinking of systems to form systems-of-systems and ultra-large-scale systems and for product line development enabling encapsulation of individual innovative elements of a system.
For many innovative DoD systems it is essential to consider architecture and quality attributes before making too many specific commitments to functionality. This may seem backwards compared to the usual model of putting functional requirements first. But the engineering reality is that architecture includes the earliest and typically the most important design decisions: those that are the most costly to change later. Early architectural commitment (and validation) can therefore often yield better project outcomes with less programmatic risk.
The committee found that in highly complex DoD systems with emphasis on quality attributes, architecture decisions may dominate functional capability choices in overall significance. The committee also noted that architecture practice in many areas of industry is sufficiently mature for the DoD to adopt. The committee recommended that the DoD more aggressively assert architectural leadership, with an early focus on architecture being essential for systems with innovative functional or demanding quality requirements.
Practice Improvement 3: Assurance and Security
A significant—and growing—challenge for DoD systems is software assurance, which encompasses diverse reliability, security, robustness, safety, and other quality-related and functional attributes. The weights given these various attributes are often determined by modeling hazards associated with operational context, including potential threats and the penalties of system failure. Software assurance is very expensive—the process of achieving assurance judgments, regardless of sector, is generally recognized to account for approximately half the total development cost for major projects. Advances related to assurance and security would therefore facilitate greater mission assurance for systems at greater degrees of scale and complexity.
Advances in assurance and security are particularly important to the rich supply chains and architectural ecosystems that are increasingly commonplace in modern software engineering. The DoD's growing reliance on software has brought increased functional capability in all kinds of systems, growth in the interconnectedness of those systems, and greater potential for their rapid adaptation. With this growth has come a dependence of DoD software-reliant systems on increasingly complex, diverse, and geographically distributed supply chains. These supply chains include not only custom components developed for specific mission purposes, but also commercial and open-source ecosystems and components, such as the widely used infrastructures for web services, cloud computing environments, mobile devices, and graphical user interaction. This places emphasis on composition and on localizing points of trust within a system.
In addition to managing overall costs, the DoD faces many challenges for assurance relating to technology, practices, and incentives, including:
The arms-length relationship between contractor development teams and government stakeholders complicates the creation and sharing of information necessary to make assurance judgments. This type of relationship can lead to approaches that focus excessively on post hoc acceptance evaluation, rather than on the emerging practice of building evidence in support of an overall assurance case.
Modern systems draw on components from diverse sources, implying that supply-chain and configuration-related attacks must be contemplated, with "attack surfaces" existing within an overall application, and not just at its perimeter. The consequence of this trend is that evaluative and preventive approaches should ideally be integrated throughout a complex supply chain. A particular challenge is managing black box components in a system (this issue is addressed in the full report).
The growing role of DoD software in warfighting, protection of national assets, and the safeguarding of human lives creates a diminishing tolerance for faulty assurance judgments. The Defense Science Board notes that there are profound risks associated with the increasing reliance on modern software-reliant systems: "this growing dependency is a source of weakness exacerbated by the mounting size, complexity, and interconnectedness of its software programs."
Losing the lead in the ability to evaluate software and prevent attacks can confer advantage to adversaries with respect to both offense and defense. It can also force the DoD to restrict functionality or performance to a level such that assurance judgments can be achieved more readily.
The Defense Science Board also found "it is an essential requirement that the United States maintain advanced capability for ‘test and evaluation’ of IT products. Reputation-based or trust-based credentialing of software (‘provenance’) needs to be augmented by direct, artifact-focused means to support acceptance evaluation." Achieving this capability is a significant challenge due to the rapid advance of software technology generally, as well as the increasing pace by which potential adversaries are advancing their capabilities. This challenge—coupled with the observations above regarding software innovation—provides an important part of the rationale for the committee’s recommendation that the DoD actively and directly address its software producibility needs.
The committee found that assurance is facilitated by advances in diverse aspects of software engineering practice and technology, including modeling, analysis, tools and environments, traceability and configuration management, programming languages, and process support. The committee also found that simultaneous creation of assurance-related evidence with ongoing development has high potential to improve the overall assurance of systems. The committee recommended enhancing incentives for software assurance practices and production of assurance-related evidence throughout the software lifecycle and through the software supply chain for both contractor and in-house developments.
Looking Ahead
The next blog posting in this series will focus on lessons learned in the many interactions subsequent to the publication of the NRC Critical Code report. I will also discuss what these lessons signify for developing software strategy for the DoD, in general, and the SEI, in particular.
Additional Resources
This posting is an excerpted, edited copy of an article that Bill Scherlis wrote for The Next Wave, "Critical Code: Software Producibility for Defense," which was published in Volume 19, No. 1 (2011). To request copies of the journal, please send an email to tnw@tycho.ncsc.mil.
To download a PDF of the report Critical Code: Software Producibility for Defense, go to www.nap.edu/catalog.php?record_id=12979.
To download a PDF of the Report of the Defense Science Board Task Force on Defense Software (2000), go to http://oai.dtic.mil/oai/oai?verb=getRecord&metadataPrefix=html&identifier=ADA385923
SEI Blog | Jul 27, 2015 02:42pm
By Dave Zubrow, Chief Scientist, Software Engineering Process Management Program
By law, major defense acquisition programs are now required to prepare cost estimates earlier in the acquisition lifecycle, including pre-Milestone A, well before concrete technical information is available on the program being developed. Estimates are therefore often based on a desired capability—or even on an abstract concept—rather than on a concrete technical plan for achieving the desired capability. Hence, the role and modeling of assumptions become more challenging. This blog posting outlines a multi-year project on Quantifying Uncertainty in Early Lifecycle Cost Estimation (QUELCE) conducted by the SEI Software Engineering Measurement and Analysis (SEMA) team. QUELCE is a method for improving pre-Milestone A software cost estimates through research designed to improve judgment regarding uncertainty in key assumptions (which we term program change drivers), the relationships among the program change drivers, and their impact on cost.
Our Approach
According to a February 2011 presentation by Gary Bliss, director of Program Assessment and Root Cause Analysis, to the DoD Cost Analysis Symposium, unrealistic cost or schedule estimates are a frequent causal factor for programs breaching a performance criterion. Steve Miller, director of the Advanced Systems Cost Analysis Division of OSD Cost Analysis and Program Evaluation, noted during his DoDCAS 2012 presentation that "Measuring the range of possible cost outcomes for each option is essential … Our sense is not that the cost estimates were poorly developed [but] rather key input assumptions didn’t pan out." For instance, an estimate might assume
It is possible to mature technology A from technology readiness level 4 to level 7 in three years.
The program will not experience any obsolescence of parts within the next five years.
Foreign military sales will support lower production costs.
An interdependent program will complete its development and deployment in time for this program to use the products.
We can reuse 70 percent of the code in the missile tracking system.
QUELCE addresses the challenge of getting the assumptions "right" by characterizing them as uncertain events rather than certain eventualities. As we’ve noted previously, modeling uncertainty on the input side of the cost model is a hallmark of the QUELCE method. By better representing uncertainty, and therefore risk, in the assumptions and explicitly modeling them, DoD decision makers, such as Milestone Decision Authorities (MDAs) and Service Acquisition Executives (SAEs), can make more informed choices about funding programs and portfolio management. QUELCE is designed to ensure that DoD acquisition programs will be funded at levels consistent with the magnitude of risk to achieving program success, fewer and less severe program cost overruns will occur due to poor estimates, and there will be less rework reconciling program and OSD cost estimates.
QUELCE relies on Bayesian Belief Network (BBN) modeling to quantify uncertainties among program change drivers as inputs to cost models. QUELCE then uses Monte Carlo simulation to generate a distribution (as opposed to a single point) for the cost estimate. In addition, QUELCE includes a DoD domain-specific method for improving expert judgment regarding the nature of uncertainty in program change drivers, their interrelationships, and eventual impact on program cost drivers. QUELCE is distinguished from other approaches to cost estimation by its ability to
allow subjective inputs based solely on expert judgment, such as the identification of program change drivers and the probabilities of state changes for those drivers, as well as empirically grounded ones based on historical data, such as estimated system size and likely growth in that estimate
visually depict influential relationships, scenarios, and outputs to aid team-based development and to support explicit description and documentation of the assumptions underlying an estimate
use scenarios as a means to identify program change drivers, as well as the impacts of alternative acquisition strategies, and
employ dependency matrix transformation techniques to limit the combinatorial effect of multiple interacting program change drivers for more tractable modeling and analysis
The QUELCE method consists of the following steps in order:
Identify program change drivers through an expert workshop and brainstorming.
Identify states of program change drivers.
Identify cause-and-effect relationships between program change drivers, represented as a dependency matrix.
Reduce the dependency matrix to a feasible number of inter-driver relationships for modeling, using matrix transformation techniques.
Construct a BBN using the reduced dependency matrix.
Populate BBN nodes with conditional probabilities.
Define scenarios representing nominal and alternative program execution futures by altering one or more program change driver probabilities.
Select a cost estimation tool and/or cost estimating relationships (CERs) for generating the cost estimate.
Obtain program estimates of size and/or other cost inputs that will not be computed by the BBN.
For each selected scenario, map BBN outputs to the input parameters for the cost estimation model and run a Monte Carlo simulation.
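To make the last step above concrete, here is a heavily simplified sketch of propagating sampled program change driver states through a toy cost-estimating relationship via Monte Carlo simulation. The drivers, probabilities, and cost multipliers are invented for illustration; QUELCE itself conditions driver probabilities on one another through a BBN and feeds a real cost model, neither of which this sketch reproduces.

```python
import random

# Invented example drivers: each has states with probabilities and a
# multiplicative effect on cost. A real QUELCE model conditions these
# probabilities on one another via a Bayesian belief network.
DRIVERS = {
    "technology_maturity": [("on_track", 0.6, 1.0), ("slips", 0.4, 1.3)],
    "reuse_achieved":      [("70_percent", 0.5, 1.0), ("40_percent", 0.5, 1.2)],
}

BASE_COST = 100.0  # nominal estimate, illustrative units

def sample_multiplier():
    """Sample one possible future: pick a state for each driver."""
    m = 1.0
    for states in DRIVERS.values():
        r, cum = random.random(), 0.0
        for _name, p, mult in states:
            cum += p
            if r <= cum:
                m *= mult
                break
    return m

def simulate(n=10_000):
    """Monte Carlo over driver states yields a cost *distribution*."""
    return sorted(BASE_COST * sample_multiplier() for _ in range(n))

costs = simulate()
median = costs[len(costs) // 2]
p90 = costs[int(len(costs) * 0.9)]
print(f"median ~ {median:.0f}, 90th percentile ~ {p90:.0f}")
```

The point of the exercise is the shape of the output: a decision maker sees a range of cost outcomes and their likelihoods rather than a single point estimate.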
Improving the Reliability of the Expert Opinion
Early cost estimates rely heavily on subject matter expert (SME) judgment, and improving the reliability of these judgments represents another focus of our research. Expert judgment can be idiosyncratic, and our aim is to try to make it more reliable. QUELCE draws upon the work of Dr. Douglas Hubbard, whose book How to Measure Anything describes a technique known as "calibrating your judgment" that we are adapting for our DoD cost estimation analysis.
For example, if you state you are 90 percent confident, you should be correct in your answers 90 percent of the time. If you state you are 80 percent confident, you would be correct 8 times out of 10. Performing in agreement with your statement of confidence is termed "being calibrated."
Hubbard’s technique operates by giving participants a series of questionnaires. The participants are asked to provide an upper and lower bound for the answer to each question such that they believe they will be correct 90 percent of the time. Hence, a participant should get 9 out of 10 answers right. If they answer all 10 correctly, they are being too conservative and providing too wide a range. If they get fewer than 9 correct, they are overconfident and providing too narrow a range. Hubbard’s approach provides feedback so that participants learn to be consistently correct 90 percent of the time. Through this cycle of testing and feedback, they learn to calibrate their judgment.
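The scoring at the heart of this feedback loop is simple to sketch. The questions, stated intervals, and true values below are made up for illustration; they are not from Hubbard's actual questionnaires.

```python
# Scoring a calibration exercise: each participant states a 90% confidence
# interval per question; we measure how often the truth falls inside.

def calibration_score(intervals, truths):
    """Fraction of true values falling inside the stated intervals."""
    hits = sum(lo <= t <= hi for (lo, hi), t in zip(intervals, truths))
    return hits / len(truths)

# A participant's stated 90% intervals for ten quantity questions
# (hypothetical data):
stated = [(1000, 5000), (10, 50), (1900, 1950), (100, 300), (2, 8),
          (50, 150), (0, 40), (300, 700), (5, 25), (1_000_000, 9_000_000)]
true_values = [3200, 70, 1942, 180, 5, 90, 55, 480, 12, 4_500_000]

score = calibration_score(stated, true_values)
print(f"hit rate: {score:.0%}")  # 80% here: overconfident relative to 90%
```

A participant scoring well below 90 percent is told to widen their intervals on the next round; one scoring 100 percent is told to narrow them.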
Applying that same approach to DoD cost estimation analysis means that when two calibrated judgments are applied to the same cost estimate, there is a more precise idea of what those judgments mean. Hubbard, who taught a class at the SEI, demonstrated that most people start off highly overconfident in their knowledge and judgment.
We plan to test Hubbard’s approach of calibrating judgment with questions specific to software estimating at several universities, including Carnegie Mellon University and the University of Arizona. To develop the materials for these experiments, we are mining information from open-source repositories, such as Ohloh.net. Our objective is to increase the consistency and repeatability of expert judgment as it is used in software cost estimation.
Addressing Challenges
A key challenge that our team faces in conducting our research is validating the QUELCE method. It can literally take years for a program to reach a milestone against which we can compare its actual costs to the estimate produced by QUELCE. We are addressing this challenge by validating pieces of the method through experiments, workshops, and retrospectives. We are currently conducting a retrospective on an active program that provided us access to its historical records. Key to this latter activity is the participation of team members from the SEI Acquisition Support Program (ASP). The ASP members are playing the role of program experts as we work our way through the retrospective.
Another challenge that our work on QUELCE must address is that insufficient access to DoD information and data repositories may significantly jeopardize our ability to conduct sufficient empirical analysis for the program change driver repository. To address this, we have been working with our sponsor and others in the Office of the Secretary of Defense to gain access to historical program data stored in a variety of repositories housed throughout the DoD. We plan to use this data to develop reference points and other information that will be used by QUELCE implementers as a decision aid when developing the BBN for their program. This data would also be included in the program change driver repository.
Developing a Repository
We are creating a program change driver repository that will be used as a support tool when applying the QUELCE method. The repository is envisioned as a source of program change drivers—what events occurred during the life of a program that directly or indirectly impacted its cost—along with their probability of occurrence. The repository will also include information that will be used as part of the method for improving the reliability of expert judgment such as reference points based on the history of Mandatory Procedures for Major Defense Acquisition Programs.
Developing the repository is a major task planned for FY13. We also plan to conduct additional pilots of the method including use of the repository and support tools. From those pilots, we will develop guidance for the use of the repository and make it available on a trial basis within the DoD. After the repository is adequately populated and developed, we intend it to become an operational resource for DoD cost estimating.
Transitioning to the Public
During the coming year, our SEMA team will work to
create guidance and procedures on how to mine program change relationships and related cost information from DoD acquisition artifacts for growth of the program change driver repository
collaborate with Air Force Cost Analysis Agency to include results from analyzing Software Resources Data Report data in the program change driver repository
assemble a catalog of calibrated mapping of BBN outputs to cost estimation models and make it available to the DoD cost community
continue discussions with Defense Acquisition University (DAU), Service Cost Centers, and the DoD cost community about research and collaboration opportunities (for example, discussions at the DoD Cost Analysis symposium)
Additional Resources
To read the SEI technical report Quantifying Uncertainty in Early Lifecycle Cost Estimation (QUELCE), please visit www.sei.cmu.edu/library/abstracts/reports/11tr026.cfm.
For more information about Milestone A, please see the Integrated Defense Life Cycle Chart for a picture and references in the "Article Library."
SEI Blog | Jul 27, 2015 02:41pm
By David Svoboda, Software Security Engineer, CERT Secure Coding Initiative
As security specialists, we are often asked to audit software and provide expertise on secure coding practices. Our research and efforts have produced several coding standards specifically dealing with security in popular programming languages, such as C, Java, and C++. This posting describes our work on the CERT Perl Secure Coding Standard, which provides a core of well-documented and enforceable coding rules and recommendations for Perl, a popular scripting language.
Perl is a relatively young language, only slightly older than Java. Perl became popular early in its lifetime because it was the first general-purpose scripting language available on many Unix platforms. Perl enjoyed a second burst of popularity as the web became prominent because it was especially well suited to writing Common Gateway Interface (CGI) scripts.
In recent years, Perl's popularity has been cemented by CPAN, a public repository of free software libraries written in Perl. Any computer with Perl installed provides straightforward mechanisms to install and use any software library from CPAN. This feature enables programmers to use libraries provided by the community easily and quickly. Several new features in Perl began life as CPAN modules before being integrated into the language. As a result of its popularity, many important software systems are written in Perl, such as the Request Tracker (RT) database, an open-source project for managing tickets or bugs for a help desk, maintained by Best Practical Solutions. Many websites, such as amazon.com, also rely on Perl code on their servers.
The CERT Perl Secure Coding standard is still young and growing. The C and Java standards have more than 200 rules in about 20 sections each. The Perl standard currently has slightly more than 30 rules in the following eight sections:
Input Validation and Data Sanitization - issues dealing with data provided by an attacker, such as XML injection and cross-site scripting (XSS).
Declarations and Initialization - issues dealing with securely declaring variables and functions, including package versus lexical variables, name clashes, and the dangers of uninitialized data.
Expressions - issues dealing with Perl’s expression syntax, including list versus scalar contexts, when to use the $_ variable, and when to use the various types of comparison operators.
Integers - issues dealing with numbers, such as how to specify octal numbers.
Strings - issues dealing with strings and regular expressions (regexes), including the danger of providing a string literal to a subroutine that expects a regex.
Object-Oriented Programming (OOP) - issues dealing with OOP, such as recognizing the convention of private variables.
File Input and Output - issues dealing with how to safely work with files, including safely working with Perl’s filehandles.
Miscellaneous - issues that don’t fall into other sections, such as handling dead code and unused variables.
Addressing Security Vulnerabilities in Perl
The Perl community has always prioritized practicality over theoretical elegance, and so Perl has always been considered an easy language to write code in, although Perl code is often considered ugly due to the tendency of some Perl developers to create "write only" programs. Perl was not designed as a secure programming language. However, problems relating to security in Perl programs have been discussed in security circles, and appear in databases such as the CERT vulnerability database. Moreover, companies that request software audits are just as likely to want Perl software audited as they are to request audits for C, C++, or Java. While the Perl community is interested in improving the language, the focus on security has historically tended to take a back seat to other priorities, such as new features and improved performance.
Our work on the CERT Perl Secure Coding Standard therefore centers on addressing issues in the Perl language and libraries that deal specifically with security. The standard covers issues such as XML injection, integer security, and proper input and output, as outlined above. By making the standard publicly accessible, we invite the Perl community to help us improve it.
The standard leverages several sources to provide relevant material on security. For example, it takes advantage of the US-CERT vulnerability database, which contains entries on several vulnerabilities that address the Perl language or applications written in Perl. It also leverages experience gained from the Source Code Analysis Lab (SCALe), which has been used to perform security audits on several pieces of Perl code, including the previously mentioned Request Tracker (RT) tool created by Best Practical Solutions. Other analysis tools, such as Perl::Critic, provide an automated audit of a Perl program by examining a codebase and producing a list of diagnostics. These diagnostics can range from insecure coding practices and bugs to stylistic issues. The SCALe project uses these tools to harvest the diagnostics that address security issues, while discarding diagnostics not relevant to security.
The CERT Perl standard can leverage the other CERT standards for security issues that are not bound to any particular language. For instance, many issues about securely opening files on a Unix machine are language-independent. As a result, these portions of the CERT standards apply to any software that runs on Unix systems, regardless of the language in which it is written.
While Perl has many of the same security issues that plague C and Java, several issues are unique to Perl. For example, Perl's
open()
function can take two arguments, with the latter argument being either a file name or a shell command. The
open()
function either opens the file or executes the command. If the argument begins or ends with a | (pipe) character, it is interpreted as a command to execute. Consequently, if an attacker can specify a filename to Perl's
open()
function—and that filename begins or ends with |—the attacker can cause Perl to execute the command for which the file is named. This issue is discussed further in rule
IDS31-PL
in the CERT Perl Secure Coding Standard.
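The difference between the two-argument and three-argument forms of open() can be sketched in a few lines. The filename below is hypothetical and chosen only to illustrate the pipe behavior; the unsafe call is left commented out:

```perl
use strict;
use warnings;

# Hypothetical attacker-supplied "filename"; the trailing pipe would make
# two-argument open() run the command instead of opening a file.
my $filename = 'echo pwned |';

# UNSAFE (two-argument form): Perl sees the trailing '|' and executes
# the command, printing "pwned" to the filehandle.
# open(my $fh, $filename) or die "open failed: $!";

# Safer (three-argument form): the mode is explicit, so the pipe character
# is just part of a literal filename, which here does not exist.
if (open(my $fh, '<', $filename)) {
    print while <$fh>;
    close $fh;
} else {
    print "refused: no file named '$filename'\n";
}
```

Because the three-argument form never interprets its filename argument as a command, it is the form recommended for any name that might be influenced by untrusted input.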
Perl has some technology that appears similar to other languages but presents unique problems when examined more closely. For example, C, Java, and Perl all share the concept of an
array, which is a continuous vector of items that can be accessed via an index. In C and Java, arrays are fixed-size, which means they are created to hold a specific number of elements and their size remains fixed until they are destroyed. Trying to refer to an element greater than the size of the array is illegal. For example, asking for the 11th element in a 10-element array in Java will cause an exception to be thrown, which usually causes the program to crash.
In contrast, Perl's arrays can grow over their lifetime. Assigning a value to the 11th element of a 10-element Perl array causes the array to grow in memory such that the array contains 11 elements, so the request becomes valid. This quality makes Perl an especially agreeable language to work with because it never reports that an array is too small. If you were to assign a value to the 1,000,000,000th element of an array, however, Perl would attempt to grow the array enough to accommodate the request and might exhaust memory.
Exhausting system memory, whether deliberate or unintentional, can lead to security vulnerabilities because a system that has run out of memory will refuse further allocation requests from any program. At the same time, many programs fail to check whether their memory requests succeeded. A machine with no free memory is therefore likely to have running programs crash, either unintentionally or by design, with some sort of "out of memory" error. Consequently, the CERT Perl Secure Coding standard contains rule
IDS32-PL, which forbids allowing untrusted users to provide an array index, lest they cause Perl to exhaust memory with an excessively large value.
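The distinction between merely reading past the end of an array and assigning past it can be seen in a short sketch (the variable names are illustrative):

```perl
use strict;
use warnings;

my @array = (1 .. 10);          # ten elements, indices 0 through 9

my $x = $array[10];             # reading past the end yields undef...
print scalar(@array), "\n";     # ...and the array still has 10 elements

$array[10] = 'eleventh';        # assigning past the end grows the array
print scalar(@array), "\n";     # now 11 elements

# This growth is why an untrusted index is dangerous (see IDS32-PL):
# $array[1_000_000_000] = 1;    # would try to allocate ~a billion slots
```

Only the assignment changes the array's size, which is why an attacker-controlled index in an assignment context can drive memory exhaustion.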
What’s Ahead for the CERT Perl Secure Coding Standard
We are adding several rules each week, and we expect the Perl secure coding standard to grow to roughly the same size as the C or Java standards, since it is comparable in scope. We welcome your assistance in helping us complete the standard.
Editor's Note: In response to feedback from our readers, this post has been edited. The post originally stated "Asking for the 11th element of a 10-element Perl array causes the array to grow in memory such that the array contains 11 elements, so the request becomes valid." As our readers pointed out in the comments section, "Simple asking is not enough." The post now states that "Assigning a value to the 11th element of a 10-element Perl array causes the array to grow in memory such that the array contains 11 elements, so the request becomes valid."
Additional Resources
The CERT Perl Secure Coding Standard may be viewed at
https://www.securecoding.cert.org/confluence/display/perl/CERT+Perl+Secure+Coding+Standard
SEI Blog | Jul 27, 2015 02:40pm
By Douglas C. Schmidt, Principal Researcher
While agile methods have become popular in commercial software development organizations, the engineering disciplines needed to apply agility to mission-critical software-reliant systems are not as well defined or practiced. To help bridge this gap, the SEI recently hosted the Agile Research Forum, which brought together researchers and practitioners from around the world to discuss when and how to best apply agile methods in the mission-critical environments found in government and many industries. This blog posting, the first in a multi-part series, highlights key ideas and issues associated with applying agile methods to address the challenges of complexity, exacting regulations, and schedule pressures that were presented during the forum.
Carleton’s Forum Introduction
When introducing the forum, Anita Carleton, director of the SEI’s Software Engineering Process Management Program, summarized how agile methods can provide customer value sooner and enable organizations to respond to change more quickly. Carleton started by highlighting the four key tenets presented in the Agile Manifesto, which forms the foundation for many agile methods, such as Scrum and Kanban:
people over processes and tools
working software over comprehensive documentation
customer collaboration over contract negotiations
responding to change over following a plan
Carleton explained that agility means the ability to move quickly and easily, to think and reason quickly, and to possess intellectual acuity. "In a business context this is the definition of agile that matters," Carleton said. Agility in business means having an organization that moves, thinks, and responds quickly to change, not only in the short term but over the lifetime of the system, product, or relationship. Agility is the ability to provide customer value sooner, to align development tempos with operational tempos, and to turn fast response into a competitive business advantage.
While agile methods have become popular with many information technology (IT) professionals as a means to replace perceived top-down bureaucracy with greater self-discipline and team discipline, Carleton noted in her presentation that the SEI emphasizes measurable performance value for DoD programs and contractors who apply agile practices. Agility is not a capability you achieve by accident: just as agility in sports requires teamwork, strategy, training, management, and discipline, so does agility in a software development organization.
Carleton explained that the SEI’s research and transition efforts have concentrated in recent years on enhancing lightweight approaches and processes to address what’s needed and required by our sponsors, partners, customers, and other stakeholders. Meeting these needs involves scaling up agile methods to the mission-critical software-reliant systems common in the Department of Defense (DoD), as well as mission-critical programs in other domains, such as finance, energy, telecommunications, space exploration, and aviation.
Carleton outlined the following areas where SEI work is focusing on applying agile methods at-scale:
defining and evaluating practical guidance for DoD project managers, systems engineers, and contracting officers who are considering the adoption of agile methods
defining metrics that can be used to measure and appraise the performance and success of programs that apply agile methods
working with DoD acquisition programs to pilot and roll-out agile methods in combat system development environments
developing an architecture-focused measurement framework for managing technical debt in agile projects
formulating a decision-making framework for reducing integration risk with agile methods
applying agile principles in strategic planning processes
Takai’s Keynote Presentation
In her keynote presentation during the forum, Teresa M. Takai, chief information officer (CIO) for the DoD, discussed how agile methods have been introduced into the DoD software acquisition and development environment. As the DoD CIO, Takai provides IT support for 2 million individuals across the globe: 1 million on the civilian side and 1 million on the military side. Providing IT support for the DoD means delivering reliable computing and communications capability for warfighters, regardless of their location. IT is an essential part of what the DoD does to enable service men and women to perform their duties.
Challenges that Motivate DoD Agility
The DoD faces many challenges—including budget pressures and rapid technology insertion—that motivate the need for agile methods, Takai said during her keynote. These challenges have resulted in two realities for IT development:
The DoD must be a good custodian of technology dollars, Takai commented during the forum, pointing out that the DoD spends at least $38 billion a year on IT.
The DoD also needs to speed up IT delivery. Men and women hired by the DoD now expect to do their jobs using their smart phones, the same way that they do in their private lives, Takai explained. Long development cycles don’t fit with user expectations. In the DoD, it can take up to 81 months to acquire or develop a new technology. "Typically, slow acquisition time has been blamed on the acquisition process, but that’s not always fair," Takai said. The challenge for the DoD is that the acquisition process encompasses front-end requirements gathering, recruiting industry partners, and testing and implementation. The mandate for the DoD to change is enormous.
Agile practices are not just a methodology; they involve a cultural change in the way business is conducted. Takai suggested to the audience that cultural change is the hardest part of adopting agile methods in the DoD. Nearly 33 percent of DoD IT programs are canceled during development because, as programs move through a process taking 81 months, they realize they can’t deliver the capability they had intended to deliver. Over 60 percent of DoD IT programs are late and/or over budget. Larger IT projects carry a much greater risk of running over budget and under-delivering.
Cultural and Process Changes Need to Support DoD Agility
As part of the DoD IT acquisition reform effort, Takai explained that DoD leaders are examining how to take agile best practices, continue to educate the IT technology workforce on the meaning of these best practices, and ensure that all involved understand the overall DoD culture to ensure IT developers can apply agile methods effectively. This cultural change involves enhancements to established DoD policies and practices. The DoD is part of a larger government-wide effort that, under the US CIO, published a 25-point plan, a portion of which prescribed transitioning to agile methods, as elaborated in Takai’s 10-point plan for IT modernization in the DoD.
Some segments within the DoD have already begun transitioning to agile methods, Takai said. Based on these experiences, the DoD has been establishing a framework and a handbook that program managers can use to guide their implementations of agile methods within their organizations. One of the DoD’s strategies has been to share best practices. An important best practice has involved establishing a governance process that promotes the successful adoption of agile IT methods.
DoD organizations are accustomed to developing DoD specific solutions, such as radio or weapons systems, that involve rigorous processes. Unfortunately, with this approach it’s hard for program managers to decompose software systems into smaller, more deliverable chunks necessary for agile development. "The DoD can no longer simply write and sign off on requirements and then turn it over to acquisition to manage delivery," Takai said. The DoD must find a better way to involve the user through the development process. For the DoD that’s not always been the norm, so making that move involves cultural changes.
Takai said that as the DoD considers large-scale IT development projects (some in the billion-dollar range), leaders need to learn how to divide them into chunks effectively. This decomposition process should start by specifically examining the ongoing requirements process, involving the user, and ensuring a much stronger governance process. The DoD must also examine how to manage agile development from a risk-mitigation standpoint.
One challenge of traditional waterfall methods is that the focus is often on avoiding risk. Risk avoidance is not what IT is about anymore, Takai said. Implementing agile methods will enable the DoD to mitigate risk and make the changes needed after a small increment of delivery, and then build on that concept to reach the next stages of delivery. That approach is a tough concept for the DoD, which has historically focused on ensuring that requirements and processes remain iron clad, with minimal risk involved. Such an approach eliminates the ability to bring in innovative technologies, as well as the ability to implement industry best practices. That’s not the way to move forward, she said, especially in an era of austerity in the DoD budget.
What's Next
Our next posts in this series will summarize presentations by four SEI researchers, including myself, who examined aspects of applying agile methods at-scale in mission-critical development environments at the Agile Research Forum:
Mary Ann Lapham highlighted the importance of collaboration with end users, as well as among cross-functional teams, to apply agile approaches to DoD acquisition programs successfully at-scale. She noted that effective agile DoD teams are flexible, experienced, and able to work fluidly between disciplines.
Ipek Ozkaya discussed the use of strategic, intentional decisions to incur architectural technical debt. The technical debt metaphor describes the tradeoff between taking shortcuts in software development to speed up product delivery and slower—but less risky—software development.
James Over noted that lack of teamwork can critically impede agility. He advocated, among other principles, the building of self-managed teams, planning and measuring project process, designing before building, and making quality the top priority for achieving agility at-scale.
Finally, I wrapped up the forum with a discussion on the importance of applying agile methods to crucial common operating platform environments (COPEs) at the DoD. I explained how agile methods can encourage more effective collaboration between users, developers, testers, and certifiers to help the DoD successfully build integrated, interoperable software systems.
In addition to providing you weekly updates of the latest research from our technologists, the SEI blog has also become a catalyst for sparking thoughtful discussions on the latest challenges facing commercial and DoD organizations. We therefore look forward to hearing your thoughts on applying agile at-scale in the comments section below.
Additional Resources
The slides and recordings from the SEI Agile Research Forum can be accessed at www.sei.cmu.edu/go/agile-research-forum/.
SEI Blog | Jul 27, 2015 02:39pm
By Douglas C. Schmidt, Principal Researcher
While agile methods have become popular in commercial software development organizations, the engineering disciplines needed to apply agility to mission-critical, software-reliant systems are not as well defined or practiced. To help bridge this gap, the SEI recently hosted the Agile Research Forum, which brought together researchers and practitioners from around the world to discuss when and how to best apply agile methods in mission-critical environments found in government and many industries. This blog posting, the second installment in a multi-part series, summarizes a presentation made during the forum by Mary Ann Lapham, a senior researcher in the SEI’s Acquisition Support Program, who highlighted the importance of collaboration with end users, as well as among cross-functional teams, to facilitate the adoption of agile approaches into DoD acquisition programs.
Lapham’s Talk on Agile Methods: Tools, Techniques, and Practices for the DoD Community
The broad—and rapidly expanding—threats the DoD must address necessitates an ability to develop software faster, Lapham told the audience. "In today’s environment people want results faster. They want to use information technology (IT) applications and infrastructure sooner; there is a real need," Lapham said. In the commercial and DoD domains, the prevailing question is, "How can IT be delivered faster?" The answer, Lapham told the audience, is an iterative approach that lends itself to agile methods.
Lapham noted that the SEI is focused on reducing the DoD information technology development cycle—which can currently take as long as 81 months—to short, incremental approaches that yield results more quickly. One complicating factor is that DoD acquisition programs (like other highly regulated commercial environments) have a prescribed vision of how IT systems are developed, Lapham explained. She referenced the DoD 5000 Series Acquisition Lifecycle, which has traditionally employed a waterfall approach that focuses largely on a sequential process of requirements analysis followed by design, implementation, and testing.
Lapham said that she and other SEI researchers are working on developing an approach that will allow the DoD to develop applications in a shorter period, ideally 18 to 24 months. One aspect of Lapham’s research focuses on helping the DoD transition from a traditional method to more iterative and incremental development methods, while still operating within the regulatory boundaries of the overarching DoD 5000 Series Acquisition Lifecycle model.
Implementing Agile Effectively in DoD Environments
Lapham said the SEI’s research in this field began in 2009 when a DoD client asked about using agile methods. "We reviewed the 5000 series to see if something in it would preclude us from using agile, and there wasn’t. From there, we’ve gone on to investigate different parts of the acquisition process so we can help program offices and contractors understand how to implement agile effectively in DoD environments."
Lapham said her team applied agile methods to study agile methods. Researchers started by interviewing practitioners who were experts with traditional waterfall methods about how those methods fit into the DoD acquisition lifecycle. Next, Lapham and her team identified the gaps that would occur if agile principles were applied by the DoD in the traditional acquisition lifecycle. After identifying the gaps, Lapham’s team consulted with DoD stakeholders to ensure they had identified the appropriate gaps. The researchers then characterized those gaps and built a model to form a complete overview of the lifecycle with agile principles.
The research conducted by Lapham’s team yielded a compendium of topics that addressed the barriers of adopting agile methods in the DoD. "We have a list of 30 topics, including such questions as: How do I do agile contracting? How do I do agile requirements management? How do I do agile cost estimation, testing, and system engineering?" Lapham explained. Next, the researchers consulted with DoD acquisition stakeholders to ensure that the topics they are addressing are relevant. The team is in the midst of piloting their agile approach with practitioners. "We’re applying a lot of the agile methods that have been used successfully in the commercial arena," Lapham said, adding that their research accounted for the fact that certain agile terms in the commercial world differ from those in a DoD environment. The published results—which will be released starting later this year—will be a set of validated tools, techniques, and practices.
Comparing and contrasting traditional and agile approaches to software development
Lapham noted the research results thus far have yielded the following findings about traditional versus agile development:
Characteristics of traditional approaches
an arms-length relationship between developers and acquirers
hierarchical, command-and-control-based teams
leader as keeper of the vision and primary source of authority to act
conventional, representational documents used by the program management office to oversee the progress of developers
a software development lifecycle model with separate teams, particularly for development and testing; some independent program teams involve multiple functions
Characteristics of agile approaches
collaborative relationships between developers, acquirers, and end users
strong team relationships, with collocated teams, or effective communication mechanisms with distributed teams
facilitated leadership, with the leader as champion and team advocate
"just enough" documentation to maintain a product and continue to use it and evolve it (documentation is highly dependent on product context)
cross-functional team relationships that include all roles throughout the lifecycle, where every member of the team performs their own function but does so together with, and reinforced by, the rest of the team
Lapham also described how SEI researchers are compiling a compendium of cultural issues that organizations need to consider when implementing agile, as described below.
Organizational Structure. Many DoD practitioners are content with traditional hierarchical structures where one person is in charge. Traditional DoD organizational structures are hard to change due to their command-and-control-based integrated-product teams that have formal responsibilities and roles. They meet on a prescribed schedule, usually once a month. Often, those teams work through certain issues as part of their charter.
Agile organizations, in contrast, are characterized by flexible and adaptive structures. Teams are cross-functional and small. An agile organization might have multiple teams working together in different locations, but still maintain constant communication. Teams will be self-organized, but that doesn’t mean they lack discipline, Lapham said. Instead, agile projects require developers with rigor across a core set of processes.
Leadership. In traditional DoD software development approaches, the leader is the keeper of the vision and the primary source of authority to act. In an agile DoD organization, in contrast, the goal is facilitative leadership, the leader is an advocate and champion for the team. This approach is a different style of leadership that requires a paradigm shift in management styles in DoD organizations.
Reward systems. A traditional DoD organization focuses on the individual and rewarding individuals for high performance. In an agile DoD environment, the team is the focus of the rewards system. Lapham commented that team members typically behave based on the activities for which they are incentivized. If developers are rewarded for being the hero, therefore, that may not create an environment conducive to team building.
Staffing model. A traditional DoD organization uses a lifecycle model with separate teams, particularly for development and testing. Different roles are active at defined points in the lifecycle and are not substantively involved, except at those defined times. An agile DoD environment, in contrast, employs cross-functional teams, including all roles across the lifecycle of the project. The teams contain an agile mentor or coach who explicitly attends to the team’s process and ensures that they work together cooperatively.
Communication and decision making. In organizations that employ a traditional approach to software development, top-down communication structures dominate. Likewise, external regulations drive the focus of the work while indirect communications, such as documented activities and processes, dominate over face-to-face dialogue. Program management office oversight tools focus on demonstrating compliance. In an Agile DoD environment, in contrast, teams usually hold 15-minute daily standup meetings in which three main topics are discussed:
What am I going to do today?
What did I do yesterday?
What problems did I have? (the goal is not to solve problems in this short meeting, but the agile coach determines whose responsibility it is to solve those problems at the end of the meeting)
In an agile environment, teams hold frequent retrospectives to improve practices, while information radiators are used to communicate critical project information and avoid surprises. Information radiators are entities (sometimes automated tools, sometimes just stickies on a board) that provide status and ensure an open and transparent flow of information. Documents serve to feed conversation among team members. Agile organizations produce just enough documentation to meet DoD acquisition regulations; how much is required depends heavily on product context.
What's Ahead
The first posting in this series summarized discussions by Anita Carleton, director of the SEI’s Software Engineering Process Management program, and Teri Takai, chief information officer for the DoD. Carleton provided an overview of the forum and discussed areas where SEI work is focused on applying agile methods at-scale. Takai then discussed how agile methods have been introduced into the DoD software acquisition and development environment.
Our next posts in this series will summarize presentations by three SEI researchers, including myself, who examined aspects of applying agile methods at-scale in mission-critical development environments at the Agile Research Forum:
Ipek Ozkaya discussed the use of strategic, intentional decisions to incur architectural technical debt. The technical debt metaphor describes the tradeoff between taking shortcuts in software development to speed up product delivery and slower—but less risky—software development.
James Over noted that lack of teamwork can critically impede agility. He advocated, among other principles, the building of self-managed teams, planning and measuring project process, designing before building, and making quality the top priority for achieving agility at-scale.
Finally, I wrapped up the forum with a discussion on the importance of applying agile methods to crucial common operating platform environments (COPEs) at the DoD. I explained how agile methods can encourage more effective collaboration between users, developers, testers, and certifiers to help the DoD successfully build integrated, interoperable software systems.
We look forward to hearing your thoughts on applying agile at-scale in the comments section below.
Additional Resources
The slides and recordings from the SEI Agile Research Forum can be accessed at www.sei.cmu.edu/go/agile-research-forum/.
To read Lapham’s SEI blog posting on Using Agile Effectively in DoD Environments, please visit http://blog.sei.cmu.edu/post.cfm/using-agile-effectively-in-dod-environments.
To read the SEI technical note, Considerations for Using Agile in DoD Acquisition, please visit www.sei.cmu.edu/library/abstracts/reports/10tn002.cfm.
To read the SEI technical report, Agile Methods: Selected DoD Management and Acquisition Concerns, please visit www.sei.cmu.edu/library/abstracts/reports/11tn002.cfm.
To read the SEI technical report, A Closer Look at 804: A Summary of Considerations for DoD Program Managers, please visit www.sei.cmu.edu/library/abstracts/reports/11sr015.cfm.
To read an article in Crosstalk by Lapham, DoD Agile Adoption: Necessary Considerations, Concerns, and Changes, please visit www.crosstalkonline.org/storage/issue-archives/2012/201201/201201-Lapham.pdf.
SEI Blog | Jul 27, 2015 02:39pm
By Douglas C. Schmidt, Principal Researcher
While agile methods have become popular in commercial software development organizations, the engineering disciplines needed to apply agility to mission-critical, software-reliant systems are not as well defined or practiced. To help bridge this gap, the SEI recently hosted the Agile Research Forum. The event brought together researchers and practitioners from around the world to discuss when and how to best apply agile methods in mission-critical environments found in government and many industries. This blog posting, the third installment in a multi-part series highlighting research presented during the forum, summarizes a presentation made during the forum by Ipek Ozkaya, a senior researcher in the SEI’s Research, Technology & System Solutions program, who discussed the use of agile architecture practices to manage strategic, intentional technical debt.
Ipek’s Talk on Strategic Management of Technical Debt
In her opening comments to the audience, Ozkaya noted that two decades ago Ward Cunningham coined the "technical debt" metaphor, which refers to the degraded quality resulting from overly hasty delivery of software capabilities to users. Cunningham stated that shipping code quickly is like going into debt. A little debt speeds up development, and can be beneficial as long as the debt is paid back promptly with a rewrite that reduces complexity and streamlines future enhancements. A delicate balance is needed between the desire to release new software capabilities rapidly to satisfy users and the desire to practice sound software engineering that reduces subsequent rework.
Increasingly, the software engineering community and those adopting agile techniques are interested in understanding how to quantify technical debt and manage debt pay-back strategies. Ozkaya observed that organizations are often driven to agile techniques after observing increasing technical debt in their software-reliant systems. Ironically, adopting agile practices at scale without considering their long-term implications can also easily lead to technical debt. Ozkaya’s talk emphasized the need to explicitly acknowledge the tradeoffs between taking shortcuts in software development to accelerate product delivery versus applying slower—but less risky—software development methods.
Ozkaya questioned whether it is possible to avoid technical debt altogether, especially given the increasing scale and complexity of software-reliant systems, coupled with trends in the DoD and other government agencies to sustain systems that are expected to operate for decades. Another factor impacting technical debt is workforce diversity and turnover, which often yields distributed teams that must be managed remotely. Given these factors, Ozkaya said, it is inevitable that technical debt will accumulate since environments, systems, and technologies will change, so technical debt has become an ongoing software engineering practice that must be understood and managed effectively.
Recognizing the Financial Implications of Technical Debt
Technical debt has financial implications, just like monetary debt. Developers can choose to pay interest on their technical debt in the form of additional time and effort required to understand and modify poorly structured code. Conversely, developers can pay down the debt by refactoring poorly designed code to reduce future effort. Ozkaya suggested that understanding the financial model implied by the "debt" metaphor can help establish the structural aspect of debt. These financial implications suggest the following questions that agile development teams must consider:
What is the "interest rate" that an organization signs up for when incurring technical debt?
Can this interest rate be controlled?
What is the period of the loan?
What are we borrowing? Time? Or other opportunities that we need to bring to bear when managing the timeline of the loan?
How do we create a realistic repayment strategy?
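The financial questions above can be made concrete with a toy model. The sketch below (in Python; the function name and all figures are hypothetical illustrations, not from Ozkaya's talk) compares the cumulative cost of carrying debt, paying "interest" on every iteration, against paying down the principal with a one-time refactoring:

```python
# Illustrative sketch of the "debt" metaphor: compare the cumulative cost of
# paying interest (extra effort on every change to poorly structured code)
# against paying down the principal (a one-time refactoring).
# All figures are hypothetical.

def cumulative_cost(iterations, interest_per_iteration, principal=0.0, payoff_at=None):
    """Total extra effort (in person-days) over a number of iterations.

    If payoff_at is given, the principal is paid in that iteration and
    interest stops accruing afterward.
    """
    total = 0.0
    for i in range(iterations):
        if payoff_at is not None and i == payoff_at:
            total += principal  # one-time refactoring cost pays off the loan
        if payoff_at is None or i < payoff_at:
            total += interest_per_iteration  # ongoing cost of working around the debt
    return total

# Carrying the debt for 12 iterations at 3 person-days of interest each:
carry = cumulative_cost(12, interest_per_iteration=3.0)

# Paying a 10 person-day refactoring cost in iteration 4 instead:
payoff = cumulative_cost(12, interest_per_iteration=3.0, principal=10.0, payoff_at=4)
```

Under these made-up numbers, carrying the debt costs 36 person-days over 12 iterations, while paying off the principal in iteration 4 costs 22, illustrating why the "period of the loan" matters to the repayment strategy.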
Identifying What Constitutes Technical Debt
Much of the existing literature on technical debt focuses on code-level issues, such as reducing the time needed to modify software functions, add new features, or fix bugs. Ozkaya said it’s also important for organizations to consider how to best describe technical debt from architecture- and system-level perspectives.
The SEI focuses on managing debt as an agile software architecture strategy. Specifically, SEI researchers are investigating the cost implications of architectural changes. Often, when a particular symptom in a system is described as technical debt, it’s not just that the code is bad; problems have also accumulated from architectural changes made throughout the system’s development.
To establish a common understanding of the term "technical debt," Ozkaya referenced the taxonomy of technical debt created by Steve McConnell. To date, much work has focused on McConnell’s "Type 1" debt, which is unintentional and non-strategic. This type of technical debt often results from poor design decisions and poor coding.
Ozkaya specifically drew attention to the second type of debt described by McConnell: intentional and strategic, optimized for the present and the future. This "Type 2" debt can occur in an agile software development lifecycle when trying to accelerate development from a perspective that requires optimizing for short-term goals, such as shipping a product with known shortcomings to gain or protect market share. What is crucial when incurring Type 2 debt is to have a process for revisiting and reworking these short-term shortcuts to ensure system longevity.
Ozkaya also highlighted Jim Highsmith’s prescription for managing technical debt. Highsmith focuses on understanding and monitoring the accumulating cost of change as a result of technical debt. As years or iterations go by and new functions are added or new technologies upgraded, the cost of change can start to increase dramatically.
Lastly, Ozkaya referenced Philippe Kruchten’s perspective on technical debt, which emphasizes a value perspective on system development. Value can include features that have immediate benefit to stakeholders. Value can also be negative, such as defects that must be resolved. Most of the time, however, the value that goes unrecognized lies in invisible aspects of the software: often architectural features that enhance the system when done well, but incur technical debt when done poorly.
Tracking and Analyzing Debt
Ozkaya presented three strategies for managing technical debt:
Do nothing. When using this approach, it’s important to understand the implications (both technical and economic) of "doing nothing."
Replace the whole system. In some cases, this approach might have high cost and risk associated with it; in others, it might be precisely what is needed.
Incremental refactoring (commitment to invest). An explicit focus on architectural agility becomes an instrument in this approach.
In large-scale software-reliant systems, the number of years spent before a system is launched is often detrimental to success since gaps in requirements and performance are not detected until very late in the lifecycle, when they are expensive to remedy. In such instances, using technical debt as a strategy and dividing the system delivery into chunks might be advantageous.
Ozkaya told the audience that eliciting and quantifying the impact of technical debt with pay-back strategies is not yet a repeatable engineering practice. Factors to consider in quantifying debt include tracking defects, changing velocity (what actually got done during an agile iteration versus what was planned), and the cost of rework. These indicators could be mapped into the cost of development, which yields a greater understanding of the value of paying back technical debt versus not paying it back.
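As a sketch of the kind of bookkeeping Ozkaya described, one could record planned versus actual velocity, rework effort, and defects per iteration, and map them into a rough carrying cost. This Python fragment is illustrative only; the rates and weights are assumptions, not a repeatable engineering practice:

```python
# Track the debt indicators named in the text (defects, velocity change,
# rework) per iteration and map them into a rough cost of carrying the debt.
# The hourly rate and hours-per-defect figures are hypothetical.

from dataclasses import dataclass

@dataclass
class Iteration:
    planned_points: int     # what was planned for the iteration
    completed_points: int   # what actually got done (velocity)
    rework_hours: float     # effort spent reworking existing code
    defects_found: int

def debt_carrying_cost(iterations, hourly_rate=100.0, hours_per_defect=4.0):
    """Return (estimated cost of carrying the debt, total velocity shortfall)."""
    cost = 0.0
    shortfall = 0
    for it in iterations:
        shortfall += max(0, it.planned_points - it.completed_points)
        cost += it.rework_hours * hourly_rate
        cost += it.defects_found * hours_per_defect * hourly_rate
    return cost, shortfall

history = [
    Iteration(planned_points=20, completed_points=18, rework_hours=6, defects_found=2),
    Iteration(planned_points=20, completed_points=15, rework_hours=10, defects_found=5),
]
cost, shortfall = debt_carrying_cost(history)
```

Mapping such indicators into development cost, as Ozkaya suggested, gives a basis for comparing the value of paying the debt back versus not paying it back.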
Ozkaya’s work focuses on quantifying technical debt, which can include code, though Ozkaya is most interested in quantifying debt early in the lifecycle, when code analysis alone may not provide enough direction. Specifically, Ozkaya’s research focuses on understanding the right set of architectural models that can be used seamlessly within agile software development methods to provide feedback to development teams and help them understand the impact of rework. Ozkaya stressed that this rework might not be planned for, but could resurface as a change of requirements or technology.
Technical Debt Tools and Analysis
Among those organizations interested in managing technical debt, there is an increasing focus on tools for conducting structural analysis. Trends show increasing sophistication, support for structural analysis in addition to code analysis, and the first steps toward analyzing the financial impact of technical debt by relating structural analysis to cost and effort for rework. Several architecture-related capabilities also exist, including
architecture visualization techniques, such as dependency structure matrix, conceptual architecture, architectural layers, and dependence growth
architecture quality analysis metrics, such as component dependencies, cyclicity, architectural rules compliance, and architectural debt
architecture compliance checking, such as defining design rules and ensuring that they are not broken, for example, disallowing communication between certain components
architecture sandboxing, such as providing features to enable easier discovery of the current architecture of the software, which may include navigating the file structure and moving components around easily
Deciding to Pay Down Debt
Ozkaya said the main motivation of structured technical debt analysis—and the emerging analysis tools—is to help organizations develop strategies for systematically paying down their debt. These strategies involve eliciting business indicators in a system that could be defined for a particular application domain and determining how those indicators are managed. Example indicators include
an increasing number of defects. While this indicator may seem obvious, in some systems with many stakeholders it can escalate into an inability to deliver the system.
a slowing rate of velocity. At the first sign of slowing velocity (the rate at which planned requirements for the iteration are being fulfilled), teams can analyze the implications (for example, during a sprint retrospective) and create an alternative strategy.
changing business and technology context. Often, organizations don’t realize that a change in business and technology could result in technical debt in a system that has been perfectly fine heretofore
a future business opportunity. The need to embrace new business opportunities could motivate the need to rework the system
time to market. Time to market is another key indicator to consider when deciding to pay down technical debt or not
For large-scale software-reliant systems, adding technical debt to the backlog and continuously monitoring that debt should be common practice. Development teams can determine strategies for addressing and monitoring technical debt that are appropriate for the organization. For example, strategies could involve amortizing the debt by 10 percent each iteration or conducting a dedicated iteration that focuses on paying back the debt.
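The amortization strategy can be sketched as a simple capacity split at iteration planning time. Everything below (item names, efforts, and the greedy selection) is a hypothetical illustration, not a prescribed method:

```python
# Sketch of the "amortize by 10 percent" strategy: reserve a fixed fraction
# of each iteration's capacity for debt items on the backlog, then fill the
# rest with feature work. Numbers and item names are made up.

def plan_iteration(feature_backlog, debt_backlog, capacity, debt_fraction=0.10):
    """Split capacity between features and debt pay-down, debt first."""
    debt_budget = capacity * debt_fraction
    selected_debt, spent = [], 0.0
    for item, effort in debt_backlog:
        if spent + effort <= debt_budget:
            selected_debt.append(item)
            spent += effort
    remaining = capacity - spent
    selected_features, used = [], 0.0
    for item, effort in feature_backlog:
        if used + effort <= remaining:
            selected_features.append(item)
            used += effort
    return selected_features, selected_debt

features = [("story-a", 8), ("story-b", 13), ("story-c", 5)]
debt = [("refactor parser", 3), ("untangle module deps", 6)]
plan = plan_iteration(features, debt, capacity=30)
```

Keeping debt items on the same backlog as features, as the text recommends, is what makes this kind of explicit trade-off visible at planning time.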
Ozkaya also pointed to a recent Crosstalk article she co-authored that highlights architectural tactics involved in strategically managing technical debt. The principles of both agile software development and software architecture improve the visibility of project status and offer better tactics for risk management. These principles help software teams develop higher-quality features on time and on budget. The article described three tactics: aligning feature-based development and system decomposition, creating an architectural runway, and using matrix teams and architecture. Harmonious use of these tactics is critical, especially in large-scale DoD systems that must be in service for several decades, are created by multiple contractor teams, and have changing scope due to evolving technology and emerging needs.
Future Areas of Research
In describing future areas of SEI research on the strategic management of architectural technical debt, Ozkaya pointed out that this topic succinctly communicates key issues observed in large-scale, long-term projects, including
solving optimization problems. In some cases, focusing on optimizing for the short term puts the long term into economic and technical jeopardy when debt is unmanaged
creating appropriate design shortcuts. Design shortcuts can give the perception of success, but teams cannot focus on the present alone; they must consider future iterations and plan accordingly
modeling software development decisions. Decisions concerning architecture should be continuously analyzed and actively managed since they incur cost, value, and debt. SEI research is focusing on opportunities for developing and quantifying effective payback strategies
In conclusion, Ozkaya recommended several immediate steps that organizations should take to manage technical debt:
Make technical debt visible, even if it’s just acknowledging that there is a problem
Differentiate strategic, structural technical debt from unintended debt incurred as a result of factors like low code quality, bad engineering, or practices that have not been followed
Bridge the gap between the business and technical sides
Associate technical debt with risk and track it explicitly
What's Ahead
The first posting in this series on the SEI Agile Research Forum summarized discussions by Anita Carleton, director of the SEI’s Software Engineering Process Management program, and Teri Takai, chief information officer for the DoD. Carleton provided an overview of the forum and discussed areas where SEI work is focused on applying agile methods at-scale. Takai then discussed how agile methods have been introduced into the DoD software acquisition and development environment. The second posting summarized discussions by Mary Ann Lapham, a senior researcher in the SEI’s Acquisition Support Program, who highlighted the importance of collaboration with end users, as well as among cross-functional teams, to facilitate the adoption of agile approaches into DoD acquisition programs.
Our next posts in this series will summarize discussions by two SEI researchers, including myself, who examined the following aspects of applying agile methods at-scale in mission-critical development environments at the Agile Research Forum:
James Over noted that lack of teamwork can critically impede agility. He advocated, among other principles, the building of self-managed teams, planning and measuring project process, designing before building, and making quality the top priority for achieving agility at-scale.
Finally, I wrapped up the forum with a discussion on the importance of applying agile methods to crucial common operating platform environments (COPEs) at the DoD. I explained how agile methods can encourage more effective collaboration between users, developers, testers, and certifiers to help the DoD successfully build integrated, interoperable software systems.
We look forward to hearing your thoughts on applying agile at-scale in the comments section below.
Additional Resources
To view the Crosstalk article, Architectural Tactics to Support Rapid and Agile Stability, please visit www.crosstalkonline.org/storage/issue-archives/2012/201205/201205-Bachmann.pdf.
To visit the International Workshop on Managing Technical Debt website, please visit www.sei.cmu.edu/community/td2012/.
To view the Hard Choices Board Game website, please visit www.sei.cmu.edu/architecture/tools/hardchoices/.
To view blog posts about technical debt by Ozkaya and other SEI researchers, please visit http://blog.sei.cmu.edu/archives.cfm/category/technical-debt.
Jul 27, 2015
By Douglas C. Schmidt, Principal Researcher
While agile methods have become popular in commercial software development organizations, the engineering disciplines needed to apply agility to mission-critical, software-reliant systems are not as well defined or practiced. To help bridge this gap, the SEI recently hosted the Agile Research Forum. The event brought together researchers and practitioners from around the world to discuss when and how to best apply agile methods in mission-critical environments found in government and many industries. This blog posting, the fourth installment in a multi-part series highlighting research presented during the forum, summarizes a talk by James Over, manager of the Team Software Process (TSP) initiative, who advocated the building of self-managed teams, planning and measuring project process, designing before building, and making quality the top priority, among other principles associated with applying agile methods at-scale.
Over’s Talk on Balancing Agility and Discipline
In his opening comments to the audience, Over shared his views on agility and discipline and stressed the importance of finding a balance between the two. Over said that his presentation and research on combining agility and discipline is based on his work with software teams and software projects, although it is applicable to other fields.
One of the reasons that agile methods are so popular today is their ability to respond to change. As evidence of this popularity, Over pointed out that about 40 percent of software developers use one or more agile methods, compared with 13 percent who use only traditional methods, according to a 2010 Dr. Dobb's survey on trends among global developers.
One reason that agile methods have become so popular is that the pace of change is accelerating. Organizations are seeking solutions that will allow them to become more responsive to change. Agile methods provide some key parts of that capability, Over said.

Balance is key for organizations seeking to implement agile methods. While organizations work to improve agility, they must do so in a disciplined way. Discipline is particularly important for organizations like DoD acquisition programs and other federal agencies developing large, mission-critical software-reliant systems at scale.
It’s important for organizations to understand what is meant by agility. While several definitions exist, Over cited one he likes: agility is responding rapidly and efficiently to change, with consistency. An agile business, he told the audience, should be able to respond quickly to change, make decisions quickly, and respond quickly to customers’ needs every single time, not just occasionally.
What Does Agile Look Like?
Many software organizations claim to be agile because they follow agile principles, but they often lack an understanding of what it means to be agile. To achieve a greater understanding of agile methods, Over recommended that organizations measure their agility and evaluate their success along the following factors:
Response time. Organizations should assess how quickly they respond to a customer’s needs. What sort of experiences are users having with the software they are producing in terms of response time?
Efficiency. Organizations should consider their ability to deliver software. Can they produce projects with the desired balance between cost and quality? Are processes performing efficiently? Can organizations respond to change quickly and efficiently or do costs balloon out of control when changes are made to the content or requirements of the system?
Consistency. Does every customer have the same experience? When customers interact with an organization, are they always seeing the same response time, the same types of efficiency, and the same types of behavior on each application?
Impediments to Agility in Software
In 2010, the authors of the Agile Manifesto reunited to hold a retrospective on Agile. Examining the state of the practice in the last decade, the authors identified 10 impediments to achieving agility. From that list, Over identified the following five impediments that he deemed critical for organizations to overcome when they seek to make agile methods work in practice:
Lack of a ready backlog, which is the list of features that developers prioritize to build the software, is a serious concern. The manifesto authors found that 80 percent of teams had a ready backlog, but within that backlog, only 10 percent of items were ready for implementation. As a consequence, delays would occur in projects because the backlog was not prepared.
Lack of being "done" at the end of a sprint means that as the project is nearing the end of the sprint and declaring the sprint completed, some work items are being deferred or skipped, which often causes delays. The most common cause is a poorly implemented practice or a deferred practice. For example, skipping unit testing can cause delays later in the project, as well as increase cost by at least a factor of 2.
Lack of teamwork surfaces when the team fails to come together and operate as a team and instead focuses on individual needs. In this type of setting, developers focus on their own user features or stories and don’t pay attention to what the team is doing. They attend their daily standups but fail to report problems that might affect the rest of the project. When the team gets to the end of a sprint and discovers that there is still a lot of work to do, sprint failure and project delay result.
Lack of good design often stems from organizational structures that affect the kinds of designs software teams produce. For example, an inflexible hierarchical structure often results in inflexible hierarchical designs, which in turn can yield code that is brittle and hard to use, excessively expensive, and prone to high failure rates.
Tolerating defects is an unfortunate reality for many teams as they near the end of a sprint: out of time, they defer defects, along with a set of unit tests or other issues, to later sprints. When issues are deferred (which is a form of technical debt, as discussed by Ipek Ozkaya in her Agile Research Forum presentation), the result is increased cost and higher failure rates later in the lifecycle because the defects aren’t addressed as they are discovered.
Avoiding the Impediments by Balancing Agility and Discipline
Over next identified work that he and other SEI researchers have conducted to remedy the impediments to agility described above. Over’s work with software teams has focused on identifying a set of principles that teams should adopt to improve agility and response time. Over highlighted the following five principles that help address impediments identified by the Agile community in their retrospective:
Build high-performance teams. Software is knowledge work. Build self-managed teams that make their own plans, negotiate their commitments, and know the project status precisely.
Plan every project. Never make a commitment without a plan. Use historical data. If the plan doesn’t fit the work, fix the plan. Change is not free.
Use a measured process to track progress. To measure the work, define its measures. To measure the process, define its steps. If the process doesn’t fit the work, fix the process. Tracking progress via a measured process need not require a complicated solution given the appropriate set of tools and a codified method for applying the tools effectively.
Design before you implement. Design is critical since it informs the software implementation. Good designs produce less code and simplify the downstream evolution of the code. Design what you know and explore the rest. When you know enough, finish the design and then implement it. To be clear, this principle isn’t espousing "big upfront design" (BUFD). It is BUFD without the BUF, or just D. Produce enough design to build the implementation. If the design is still too challenging, explore the problem first via prototyping to reduce design risk. Applying this principle effectively depends on various factors, including the size and complexity of the application, as well as familiarity with the problem, domain, methods, and tools.
Make quality software the top priority. Defects are inevitable. Poor quality wastes resources. The sooner a developer fixes a problem, the better. As Jeff Sutherland stated in his retrospective, tolerating defects, ignoring pair programming or code review practices, and inadequate unit testing will substantially reduce team velocity. Continuous integration tools and automated testing are also important, but testing alone is insufficient: poor-quality software always costs more to produce because finding and fixing defects is expensive, and the longer you wait to fix them, the greater the cost of repair.
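Over's "measured process" principle (define the work's measures and the process's steps) can be illustrated with a minimal earned-value-style tracker. The step names and hours below are invented for the example and are not from TSP itself:

```python
# Toy illustration of a measured process: define the process's steps with
# planned effort, log actual effort, and derive progress from the plan
# rather than from gut feel. Step names and numbers are hypothetical.

steps = {
    "design": {"planned": 10.0, "actual": 12.0, "done": True},
    "code":   {"planned": 20.0, "actual": 9.0,  "done": False},
    "test":   {"planned": 15.0, "actual": 0.0,  "done": False},
}

def earned_value(steps):
    """Percent of total planned effort represented by completed steps."""
    total_planned = sum(s["planned"] for s in steps.values())
    earned = sum(s["planned"] for s in steps.values() if s["done"])
    return 100.0 * earned / total_planned

progress = earned_value(steps)  # only "design" (10 of 45 planned hours) is done
```

Because progress is computed against the plan, a team knows its status precisely at any point, which is what enables the principle "if the plan doesn't fit the work, fix the plan."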
In summary, Over told the audience that in working on 20 different projects in 13 organizations to implement these disciplined agile principles, SEI researchers found that organizations delivered more functionality than originally planned or finished ahead of schedule. Among the other benefits, Over stated, was that projects realized test costs of less than 7 percent of the total cost with an average cost of quality of only 17 percent. Also, the projects delivered software with an average of only six defects in 100,000 lines of new and modified code.
Finally, discipline is the key to agility, Over explained, adding that agility can only be achieved when everyone in an organization acts professionally and uses a disciplined, measured approach to their work.
What's Ahead
The first posting in this series summarized discussions by Anita Carleton, director of the SEI’s Software Engineering Process Management Program, and Teri Takai, chief information officer for the DoD. Carleton provided an overview of the forum and discussed areas where SEI work is focused on applying Agile methods at-scale. Takai then discussed how Agile methods have been introduced into the DoD software acquisition and development environment. The second posting summarized discussions by Mary Ann Lapham, a senior researcher in the SEI’s Acquisition Support Program, who highlighted the importance of collaboration with end users, as well as among cross-functional teams, to facilitate the adoption of Agile approaches into DoD acquisition programs. The third posting highlighted the forum presentation by Ipek Ozkaya, a senior researcher in the SEI’s Research, Technology, and System Solutions Program, who discussed the use of strategic, intentional decisions to incur architectural technical debt. The technical debt metaphor describes the tradeoff between taking shortcuts in software development to speed up product delivery and slower—but less risky—software development.
In the next—and final—posting in this series I will summarize my presentation at the forum on the importance of applying agile methods to common operating platform environments (COPEs) that have become increasingly important for the DoD. I will explain how these methods can encourage more effective collaboration between users, developers, testers, and certifiers to help the DoD successfully build more integrated, interoperable, and affordable software systems.
We look forward to hearing your thoughts on applying agile at-scale in the comments section below.
Additional Resources
To learn more about the Team Software Process (TSP), please visit www.sei.cmu.edu/tsp.
The slides and recordings from the SEI Agile Research Forum can be accessed at www.sei.cmu.edu/go/agile-research-forum.
By Douglas C. Schmidt, Principal Researcher
While agile methods have become popular in commercial software development organizations, the engineering disciplines needed to apply agility to mission-critical, software-reliant systems are not as well defined or practiced. To help bridge this gap, the SEI recently hosted the Agile Research Forum. The event brought together researchers and practitioners from around the world to discuss when and how to best apply agile methods in mission-critical environments found in government and many industries. This blog posting, the fifth and final installment in a multi-part series highlighting research presented during the forum, summarizes a presentation I gave on the importance of applying agile methods to common operating platform environments (COPEs) that have become increasingly important for the Department of Defense (DoD).
The first half of my presentation motivated the need for COPEs that help collapse today’s stove-piped, software-reliant DoD system solutions to decrease costs, spur innovation, and increase acquisition and operational performance. Since this material has appeared in my first SEI blog posting on COPEs, I’ll skip it in this posting and focus on the second half of my presentation, which discussed how applying agile methods can encourage more effective collaboration between users, developers, testers, and certifiers of COPEs to help the DoD build integrated, interoperable, and affordable software-reliant systems more successfully.
What’s Taking So Long to Achieve the Promise of COPEs?
Decades of public and private research and development investments—coupled with globalization and ubiquitous connectivity—have enabled information technology to become a commodity, where common off-the-shelf (COTS) hardware and software artifacts get faster and cheaper at a remarkably predictable pace. During the past two decades, we've benefited from the commoditization of hardware and networking elements. More recently, the maturation and widespread adoption of object-oriented programming languages, operating environments, and middleware is helping to commoditize many software components and architectural layers. Despite these technology advances, however, developing affordable and dependable COPE-based solutions remains elusive for the DoD. There are a number of reasons for this, including
When DoD acquisition programs have tried to apply COPEs, they’ve tended to spend a great deal of time building common software infrastructure, which consists largely of various layers of middleware. This leads to the serialized phasing problem: building the infrastructure may take years (often hundreds or thousands of person-years), during which application developers sit idle, not knowing precisely what the characteristics of the infrastructure will be. As a result, by the time the infrastructure has matured enough to support application development, the teams often discover it provides inappropriate quality-of-service (QoS) or functionality. If this discovery occurs late in the lifecycle (e.g., during system integration, which is all too common in large DoD acquisition programs), it’s extremely costly to remedy.
A related problem faced by acquisition programs is the length of time it takes to get projects under contract via traditional contracting models. While this glacial contracting pace is not unique to COPEs, it greatly exacerbates the serialized phasing problem outlined above. In particular, if contract delays become excessive, infrastructure developers won’t have sufficient time to build and test the common software infrastructure thoroughly. If the common software infrastructure is not built and tested properly, application developers will end up wrestling with many defects and bottlenecks they are ill-equipped to handle. Problems stemming from serialized phasing that are caused by contract delays will result in even further delays in application delivery.
Classic DoD waterfall models assume that requirements can be largely defined early in the lifecycle. When some of these requirements change—as they certainly will as COPEs are reapplied in different contexts—it becomes quite expensive for the government to request the change orders needed to make the modifications. Without more streamlined and flexible lifecycle and contracting models, the inevitable changes to COPEs will be prohibitively expensive, thereby obviating the goal of cost savings achieved by sharing common software.
There’s a long-term trend in the DoD toward adopting COTS technologies for hardware and software. Although many COTS products are well-suited for mainstream commercial applications, they’re often not as well-suited for certain types of DoD systems, especially mission-critical weapons systems. The challenge for COPE developers is to ensure that they don’t base their common software infrastructure on COTS products that work well in contexts where an 85 percent solution is sufficient (where dropping an occasional call or manually refreshing a hanging web page is acceptable) but fall short in DoD weapons systems with more stringent QoS requirements. In these mission-critical environments—where the right answer delivered too late becomes the wrong answer—many COTS products are too big, too slow, or too unreliable to serve as the basis of mission-critical COPEs. If DoD common software infrastructure is naively built atop a house of sand, applications will be hard-pressed to provide the end-to-end QoS that’s expected of them in hostile settings.
Another challenge facing developers of COPEs is over-adherence to ossified standards and reference architectures that made sense a decade or so ago but have failed to keep pace with technology advances. Unlike mainstream commercial systems—where technologies are often refreshed every couple of years—a longer-term perspective is needed to develop COPEs for DoD weapons systems. Likewise, it’s essential to integrate new technologies into COPEs to keep pace with advances in hardware, software, platforms, and requirements. We also need software architectures that are flexible and resilient, rather than architectures that simply adhere to standards that become obsolete. The crucial issue here is not new standards, but the capability of devising new architectures and new standards that can themselves be refreshed and reapplied with minimal impact over long periods of time.
How Agility Helps Achieve the Promise of COPEs
At the heart of the problems described above is the lack of a holistic approach that balances key business, managerial, and technical drivers at scale. The last part of my presentation discussed key success drivers for COPE initiatives that proactively and intentionally exploit commonality across multiple DoD acquisition programs and outlined ways in which agility helps make those drivers more effective. In my experience working at the SEI, DARPA, and Vanderbilt University’s Institute for Software-Integrated Systems over the past several decades, successful COPE efforts require a balance of the following drivers:
Business drivers, which focus on achieving effective governance and broad acceptance of the economic aspects of COPEs. Examples include managed industry/government consortia, agile contracting models, and effective data rights and licensing models.
Management drivers, which focus on ensuring effective leadership and guidance of COPE initiatives. Examples include mastery of agile lifecycle methods, strong science and technology connections to reduce technical risk, and of course a solid understanding of the critical role of software for the DoD.
Technical drivers, which focus on the foundations of COPE development. Examples include systematic reuse expertise, agile architecture expertise, and automated conformance and regression test suites.
As outlined above, agility can be applied to help each of these drivers. For example, agility can be applied to expedite contracting, which is an important business driver. Getting task, delivery, and change orders in place quickly via agile contracting models (such as Indefinite-Delivery, Indefinite-Quantity (IDIQ) contract vehicles) helps to mitigate serialized phasing problems. Likewise, agile contracting methods can help ensure that the people who are building the systems and the people who are managing the acquisition triage and align their priorities so that the contract supports the needs and considerations of key stakeholders.
Agility can also be applied to manage COPE lifecycles more effectively, which is a key management driver. Common software infrastructure development efforts work best when there are close, interactive feedback loops between the people building the applications and the systems and software engineers building the infrastructure. Without these feedback loops, it’s easy to develop many reusable artifacts that aren’t useful and won’t be applied systematically. An agile approach is thus essential to
enable close cooperation between users, developers, testers, and certifiers throughout the lifecycle to ensure the necessary COPE capabilities are delivered rapidly and robustly
avoid integration "surprises" where things tend to break or underperform at unexpected times late in the lifecycle, when it’s more expensive to fix problems
We’ve observed in recent years that rolling out incremental deliveries of a COPE capability every 4 to 8 months helps application developers establish a battle rhythm of knowing when to upgrade and when to leverage what’s in the common software infrastructure. This tight spiral avoids long and fragile serialized phasing lifecycles, a problem that agile methods manage well.
Finally, agility can be applied to ensure architectural flexibility, which is a crucial technical driver. Long-lived, software-reliant DoD systems need software architectures that are change tolerant. Inevitably standards will evolve; hardware will get better, faster, and cheaper; and software programming languages, operating systems, and middleware will all evolve over time. It is therefore essential to devise ways of "future proofing" COPE architectures and using technologies, techniques, and methods (such as the technical debt work that Ipek Ozkaya discussed at the Agile Research Forum) when making decisions about when to reengineer and when to refactor. These decisions should be based on empirical data and evidence, rather than relying on forecasts or legacy commitments that become obsolete and ossified over time.
Equally important is the ability to select COTS technologies for use in COPEs that have appropriate QoS capabilities for the ways in which they are applied to particular missions. Some COTS- and standards-based products work well in mission-critical contexts, whereas others don’t. The choice of which ones to use should be driven as much as possible through trade studies, empirical analysis, and various types of quantitative analysis, as opposed to the latest techno-fad.
The following summary shows how agility helps resolve the COPE challenges discussed above:
Challenge: Serialized phasing of COPE infrastructure and application development postpones identifying design flaws that degrade system QoS until late in the lifecycle, i.e., during system integration.
How agility helps: Enables close cooperation of users, developers, testers, and certifiers throughout the lifecycle to rapidly deliver COPE capabilities and avoid integration "surprises" without needing extensive upfront planning and serialized phasing. Emphasizes incremental rollout of COPEs by delivering useful capability every 4 to 8 months to reduce risk via early validation by application developers and users.
Challenge: Glacial contracting processes don’t support timely delivery of COPE capabilities to meet mission needs.
How agility helps: Engages users and testers in developing COPE contract scope, evaluation criteria, incentives, and terms/conditions to ensure contracting supports all needs and considerations.
Challenge: Contracting models that assume COPE requirements can be defined fully up front are expensive when inevitable changes occur.
How agility helps: Expedites execution of COPE work packages via multiple-award Indefinite-Delivery, Indefinite-Quantity (IDIQ) contract vehicles, issuing Task/Delivery Orders for each release.
Challenge: QoS suffers when COPE initiatives attempt to use COTS products that are not suited for mission-critical DoD combat systems.
How agility helps: Leverages common development, test, and production platforms, and QoS-enabled standards-based COTS, to deliver COPE capabilities faster, cheaper, and more interoperably, without redundant ad hoc infrastructure.
Challenge: Rigid adherence to ossified standards and reference architectures impedes COPE technology refresh and limits application capabilities.
How agility helps: Establishes a change-tolerant architecture enabled by discovery learning that promotes decisions based on empirical data and evidence, rather than forecasts or legacy commitments.
The Road Ahead for COPE Agility
One reason for the spotty track record of success with COPE-based systems is that the DoD hasn’t taken a holistic view of the way these types of systems are built. Existing brittle and proprietary stovepiped approaches to acquisition systems do not address the cost efficiency and cyber security challenges the DoD is wrestling with, nor do they help to deploy software and new technologies to the field rapidly. Moreover, managing the production of these systems via waterfall processes is simply not an effective way forward, especially in the shadow of sequestration.
Developing COPEs is achievable and valuable, but it’s not easy. Agility in business, management, and technical dimensions is essential, but it’s also no panacea. Additional research and engineering investment is needed to devise the appropriate methods, tools, and techniques that will enable agility at-scale, which is a key theme that we’ve emphasized throughout the SEI Agile Research Forum.
Finally, we need your help. Achieving success with COPEs for the DoD isn’t something that can be done by any one group, institute, or government program. The SEI needs to help bring together researchers, developers, and managers from academia, industry, and government to conduct the appropriate work to help reduce risk and ensure the success of current and planned COPE initiatives. We also need to work closely with academia and industry to train the workforce and identify key requirements. Likewise, we need to work with government to ensure effective oversight and acquisition dynamics to reduce the cycle time needed to acquire new systems, insert new technology into legacy systems, and sustain software-reliant systems at a lower cost over their lifecycles and across the DoD enterprise. The stakes are high, and now is the time to make a difference!
Additional Resources
To learn more about the SEI’s work on common operating platform environments, please visit http://blog.sei.cmu.edu/archives.cfm/category/common-operating-platform-environments-copes
To learn more about the SEI’s work on agile methods at-scale, please visit the SEI Agile Research Forum webinar site at http://www.sei.cmu.edu/go/agile-research-forum/.
SEI Blog | Jul 27, 2015
By Robert Stoddard, Researcher
Software Engineering Measurement and Analysis Program
As part of our research related to early acquisition lifecycle cost estimation for the Department of Defense (DoD), my colleagues in the SEI’s Software Engineering Measurement & Analysis initiative and I began envisioning a potential solution that would rely heavily on expert judgment of future possible program execution scenarios. Prior to our work on cost estimation, many parametric cost models required domain expert input, but, in our opinion, they did not address alternative scenarios of execution that might occur from Milestone A onward. Our approach, known as Quantifying Uncertainty in Early Lifecycle Cost Estimation (QUELCE), asks domain experts to provide judgment not only on uncertain cost factors for a nominal program execution scenario, but also for the drivers of cost factors across a set of anticipated scenarios. This blog post describes our efforts to improve the accuracy and reliability of expert judgment within this expanded role of early lifecycle cost estimation.
Our work in cost estimation began two years ago, building upon a review of existing cost estimation and expert judgment research. As an example, we identified an industry consultant, Douglas Hubbard, whose book, How to Measure Anything, presents an approach known as "calibrating your judgment" (my colleague, Dave Zubrow, describes Hubbard’s technique in a recent blog post). Hubbard’s focus on calibrating expert judgment using "trivial pursuit" exercises led to our team’s decision to pursue research into the use of domain-specific reference points to further improve the accuracy and reliability of expert judgment within cost estimation.
Our research on early lifecycle cost estimation for the DoD consists of two tasks:
Development of the QUELCE method, in which the probability of changes occurring in program execution is separated from the final assessment of the effects of such changes on the program cost estimate
Critical thinking and designed experiments that would contribute to current research on expert judgment
We hypothesized that a domain-specific approach to calibration training and development of reference points would be necessary to reduce unwanted variation in judgments rendered by experts participating in the QUELCE method. We decided to take a two-pronged approach to improving expert opinion. The first part of the approach involved data mining of DoD program execution experience. The second part involved interviewing DoD experts about cost estimation and DoD program cost experience. One of our goals is to create an online repository of domain reference points that embodies the historical DoD program cost experience.
The repository will include a searchable database of reference points that helps domain experts exercise better judgment during cost estimation. Domain experts will be able to query the reference points by keyword. Search results will show the key reference points in relation to the domain and technology challenge. The domain expert(s) will then review those reference points before formulating judgment for the current cost estimation exercise. At this point in the project, we are mining reference points from DoD and other open-source data examples.
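To make the keyword-query idea concrete, here is a minimal sketch of such a repository lookup. The class, fields, program names, and summaries are all invented for illustration; they are not the actual SEI repository design or real program data.

```python
# Hypothetical sketch of a reference-point repository with keyword search.
# All names, fields, and entries are illustrative placeholders.
from dataclasses import dataclass, field

@dataclass
class ReferencePoint:
    program: str               # historical program name (invented)
    domain: str                # e.g., "communications", "avionics"
    summary: str               # why cost/schedule deviated
    keywords: set = field(default_factory=set)

def search(repo, query_terms):
    """Return reference points whose keywords overlap the query terms."""
    terms = {t.lower() for t in query_terms}
    return [rp for rp in repo if terms & {k.lower() for k in rp.keywords}]

repo = [
    ReferencePoint("Program X", "communications",
                   "Satellite payload integration slipped 14 months",
                   {"satellite", "integration"}),
    ReferencePoint("Program Y", "avionics",
                   "Subcontractor replacement drove 20% cost growth",
                   {"contractor", "avionics"}),
]

hits = search(repo, ["satellite", "communications"])
print([rp.program for rp in hits])  # -> ['Program X']
```

A production repository would of course use full-text search rather than exact keyword overlap; the point is only that an expert can retrieve domain-relevant history before rendering judgment.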
My colleague, James McCurley, has investigated DoD repositories for raw information that outlined why acquisition programs experience cost and schedule overruns. Our team compiled domain reference points from McCurley’s data that identify selected changes associated with cost and schedule overruns. We are categorizing these changes into a set of common change drivers that are rooted in the various sources we have accessed.
One of the first sources we accessed was the U.S. Navy’s Probability of Program Success (PoPS) program. The PoPS criteria came from studies of program performance used by the Navy to implement a step-by-step approval process for a program to continue, independent of—but aligned to—the DoD acquisition process. PoPS identified a number of categories of reasons for cost and schedule overruns in government programs.
PoPS was always seen as just one of many sources of programmatic factors that might provide information useful to QUELCE. The PoPS criteria are biased toward programmatic change issues (such as sponsorship, contractor performance, and program office performance) that are of primary concern to DoD sponsors and Program Executive Offices. As we expected when we started this project, we are finding the need to supplement PoPS with more technical change issues, such as those related to system engineering and integration factors.
Many technical change drivers may be seen in the Capability Based Assessment (CBA) activity performed by programs in preparation for the Milestone A decision. The CBA includes the Functional Area Analysis (FAA), Functional Needs Analysis (FNA), and Functional Solution Analysis (FSA). Another early source is the Analysis of Alternatives (AoA). These and other early documents often include information that identifies technical and programmatic uncertainties not captured in the cost estimation process, but which can be incorporated as program change drivers in the QUELCE method. Consequently, many technical change drivers are rooted in artifacts that proposed programs must draft prior to Milestone A.
Examples of change drivers we’ve identified from various sources include:
Interoperability - a program is affected by changes from a dependent program
Contractor Performance - a subcontractor must be replaced
Obsolescence - a part is made obsolete before a program is operational
Technical Performance - either a technology is not ready for use or a technology fails to achieve key performance goals
Scope - the source of many changes, including new users, additional delivery targets, and extra platforms, all of which fall outside the realm of "code growth"
Funding - funding may be increased or decreased in DoD programs, often with little warning
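As a hypothetical illustration, change drivers like those above and their pairwise influences can be captured in a simple matrix that experts later populate. The driver names come from the list above, but the ratings, the 0-3 scale usage, and the helper function are invented for this sketch, not part of the QUELCE specification.

```python
# Sketch of a change driver influence matrix (ratings are illustrative).
# matrix[a][b] holds an expert's 0-3 rating that driver `a` going off-nominal
# pushes driver `b` off-nominal (0 = no influence, 3 = strong influence).
drivers = ["Interoperability", "Contractor Performance", "Funding"]

matrix = {a: {b: 0 for b in drivers if b != a} for a in drivers}
matrix["Funding"]["Contractor Performance"] = 3  # invented: funding cuts stress contractors
matrix["Interoperability"]["Funding"] = 1        # invented: weak influence

def strongest_influences(matrix, threshold=2):
    """List (cause, effect) pairs rated at or above the threshold."""
    return [(a, b) for a, row in matrix.items()
            for b, rating in row.items() if rating >= threshold]

print(strongest_influences(matrix))  # -> [('Funding', 'Contractor Performance')]
```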
Applying Our Expert Opinion Approach
In the coming year, our goal is to develop a database with information that supports experts implementing the QUELCE method. We will publish our approach to improving expert judgment and increasing and structuring their involvement in cost estimation using procedures similar to Team Software Process (TSP) scripts. Our goal is to ensure that domain experts have more active involvement with the cost estimation activity. The problem today is that domain expert results are often loosely coupled with the cost estimates. In contrast, QUELCE will facilitate domain experts systematically discussing change drivers and then mapping the change drivers explicitly to the cost driver inputs of traditional cost estimation models and cost estimating relationships (CERs).
In our approach, the domain expert will be prompted at different points throughout QUELCE to access the reference point database. These activities will consist of just-in-time virtual training for calibration within a given domain. For example, a domain expert may be participating in QUELCE to develop a cost estimate for a new communication system that involves satellite technology. If the domain expert has not recently completed virtual calibration training for that domain, he or she may receive a refresher course consisting of a two- to four-hour online exercise.
Our approach to improving expert opinion will help domain experts during the following three different points in the QUELCE method that depend significantly on expert judgment:
Identifying pertinent change drivers. After completing the training, domain experts will be asked to participate in a workshop exercise that anticipates which change drivers will most likely be relevant to a particular program. In the workshop, domain experts will query for communication programs or specific technology names related to particular programs. In the example involving the new communication system above, the results should yield information related to historical communication programs or technologies and domain reference points, explaining why certain aspects went over budget or schedule.
Populating the change driver cause-and-effect matrix. The second judgment point involves a change driver cause-and-effect matrix. The domain expert will evaluate each change driver and rate the probability, on a scale of 0 to 3, that the change driver will cause any other change drivers on the list to switch from a nominal to an off-nominal condition, thereby signaling the danger of cost and schedule overruns. This exercise requires judgment about the relationships between change drivers. The domain expert will get information from querying our repository before rendering this type of judgment. For example, reference points might include historical information about a change driver going off nominal and subsequently causing three other change drivers to go off-nominal. The reference points therefore give the domain expert a basis to understand the relationships between change drivers and help make them more accurate.
Establishing probabilities for the Bayesian Belief Network (BBN). The BBN models the change drivers as nodes in a quantitative network, including probabilities that state changes in one node will create a state change in another node. Every change driver has a parent table that presents all the possible scenarios resulting from different combinations of its parent change driver states. For example, in our BBN, we have change driver A and change driver B, and both have an influence on change driver C. If change driver A has nominal and off-nominal states and change driver B has nominal and off-nominal states, there are four different combinations of parent change driver states, i.e., scenarios that may affect change driver C.
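The parent-scenario table in the last judgment point can be sketched in a few lines. This follows the A/B/C example above; the state names and the table layout are a minimal illustration, and the probabilities themselves would be elicited from experts, so none appear here.

```python
# Sketch of the parent-scenario table for one BBN node, per the example above:
# parents A and B each take nominal/off-nominal states, yielding four
# scenarios for child C. Probability values are left to expert elicitation.
from itertools import product

STATES = ("nominal", "off-nominal")
parents = ["A", "B"]

# One table row per combination of parent states; the value is the
# to-be-elicited P(C = off-nominal | parent states).
cpt = {combo: None for combo in product(STATES, repeat=len(parents))}

print(len(cpt))  # -> 4
for combo in cpt:
    print(dict(zip(parents, combo)))
```

With three two-state parents the table would grow to eight rows, which is why QUELCE keeps the number of parent drivers per node small.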
While our work to date has focused on calibrating expert judgment in the DoD cost estimation of program development, our approach could be applied to many situations beyond cost estimation. We envision this approach being used in domains such as portfolio management and strategic planning.
Summary
Our research into the QUELCE method for pre-Milestone A cost estimation represents a significant advance by enabling the modeling of uncertain program execution scenarios that are dramatically different from the traditional cost factor inputs of cost estimation models currently employed later in the DoD acquisition lifecycle. By synergizing the latest advancements in proven methods such as scenario planning workshops, cause-effect matrices, BBNs, and Monte Carlo simulation, we have created a novel and practical method for early DoD acquisition lifecycle cost estimation. If you’re interested in helping us succeed in these efforts, please let us know by leaving a comment below.
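To illustrate the Monte Carlo simulation step mentioned above, here is a toy cost roll-up: sample uncertain cost-factor distributions and report percentiles of the total. The distributions, cost elements, and dollar figures are invented for illustration and do not come from QUELCE or any DoD program.

```python
# Illustrative Monte Carlo cost roll-up: sample uncertain cost elements
# and summarize the resulting distribution. All numbers are invented.
import random

random.seed(1)  # reproducible illustration

def sample_total_cost():
    # triangular(low, high, mode) samples for three cost elements ($M)
    dev  = random.triangular(80, 150, 100)
    test = random.triangular(20, 60, 30)
    mgmt = random.triangular(10, 25, 15)
    return dev + test + mgmt

samples = sorted(sample_total_cost() for _ in range(10_000))
p50 = samples[len(samples) // 2]
p80 = samples[int(len(samples) * 0.8)]
print(f"median ≈ {p50:.0f} $M, 80th percentile ≈ {p80:.0f} $M")
```

Reporting a distribution rather than a point estimate is what lets decision makers see, for example, how much contingency is needed to be 80 percent confident of staying within budget.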
Additional Resources
To read the SEI technical report, Quantifying Uncertainty in Early Lifecycle Cost Estimation (QUELCE) please visit www.sei.cmu.edu/library/abstracts/reports/11tr026.cfm.
For more information about Milestone A, please see the Integrated Defense Life Cycle Chart for a picture and references in the "Article Library."
Second in a Two-Part Series
By Lisa Brownsword
Acquisition Support Program
Major acquisition programs increasingly rely on software to provide substantial portions of system capabilities. All too often, however, software is not considered when the early, most constraining program decisions are made. SEI researchers have identified misalignments between software architecture and system acquisition strategies that lead to program restarts, cancellations, and failures to meet important missions or business goals. This blog posting—the second installment in a two-part series—builds on the discussions in part one by introducing several patterns of misalignment—known as anti-patterns—that we’ve identified in our research and discussing how these anti-patterns are helping us create a new method for aligning software architecture and system acquisition strategies to reduce project failure.
Identifying Anti-Patterns
We used an interview-based approach to discover and document patterns of alignment among four key aspects: business and mission goals, architecture, quality attributes, and acquisition strategy. We then analyzed the interview data looking for evidence of alignment or misalignment. Since most of our data comes from troubled programs, we have primarily discovered evidence of misalignment—known as anti-patterns—to date. Our initial set of anti-patterns includes
undocumented business goals - the lack of well-documented business goals expressed as they apply to an acquisition program
unresolved conflicting goals - the lack of analysis and reconciliation of known goals
failure to adapt - failure of an acquisition program to modify the architecture and the acquisition strategy in response to changing goals, priorities, or technology
turbulent acquisition environment - requested changes are so frequent and contradictory that an acquisition program cannot realistically accommodate them
poor consideration of software - critical decisions made early in an acquisition program’s lifecycle have strong negative implications on the system’s software
inappropriate acquisition strategies - the acquisition strategy fails to consider important software attributes
overlooking quality attributes - a failure to define and use software quality attributes in the definition of the software architecture or acquisition strategy
Let’s explore one of these anti-patterns to show how it might be used. The first anti-pattern—undocumented business goals—reflects a lack of precise, well-defined, and well-documented business goals for a DoD acquisition program. In the programs we examined, we found that these goals were seldom explicitly expressed; when they were, they tended to be vague (e.g., "replace legacy system"), to reflect high-level program constraints (e.g., "maximize competition"), or to restate policy regulations (e.g., "implement an open architecture").
When this anti-pattern is present, we found that mission requirements dominate the definition of the software architecture, often leading to an architecture contrary to the achievement of the unstated business goal. For instance, an architect might reasonably design a monolithic architecture that was an excellent fit for the mission goals for performance, but which could be strongly at odds with an implicit—but unspecified—business goal to avoid vendor lock.
The lack of explicit business goals has a more direct impact on the acquisition strategy. In one program in our study, a key element for the program was to build a new system with significant new capabilities. The acquisition strategy specified a slow, deliberate pace to ensure that the new capability was defined correctly. A competing goal was to replace several "end-of-life" systems. Not stated in this goal was the urgent need to replace these failing systems as quickly as possible. When the operators and maintainers of the legacy systems became aware of the intended acquisition strategy, they forced a major change in program focus. The sequence of acquisition activities required alteration, which caused a significant delay in meeting either goal.
Creating a Method for Identifying Misalignments
Discovering and documenting anti-patterns is only the beginning of our work in addressing the problems of misalignment. The second phase of our project involves creating a method that helps acquisition programs avoid the anti-patterns we’ve discovered and provides options that could help programs better align their acquisition strategy and software architecture to satisfy stakeholder mission and business goals. We will then validate the utility of this method through the help of projects and programs outside the SEI.
Software-reliant systems are inherently social and technical endeavors. A key facet of our method is therefore its ability to bring disparate actors together—often for the first time—to identify and discuss issues of mutual concern and make hard, informed choices based on rational information. We plan to adapt and tailor existing methods where possible. We are currently exploring methods in the following areas:
identifying salient stakeholders - There are many requirements elicitation and analysis methods. Unfortunately, most methods assume that it is possible to know which stakeholders will most affect or be most affected by the program. We are therefore considering Controlled Requirements Expression (CORE), which is a method that assists developers in identifying stakeholders related to a given acquisition to help develop a more complete list of stakeholders.
defining business and mission goals - The Pedigreed Attribute eLicitation Method (PALM) discussed in part one of this blog series remains a central element of our new method. PALM enables organizations to systematically identify the high-priority mission and business goals from system stakeholders. The architectural implications of those goals are then captured by determining the quality attribute requirements they imply. We are extending PALM to investigate the acquisition strategy implications of business or mission goals.
analyzing quality attributes - Quality Attribute Workshops (QAW) are a widely accepted method for developing definitions of the quality attributes that form the basis for deriving the software architecture. We are using the same approach to derive attributes that should drive the acquisition strategy.
trading off architecture and acquisition strategy options - analysis methods, such as Architecture Tradeoff Analysis Method (ATAM) or Cost Benefit Analysis Method (CBAM), are used to ensure consistency of software and system quality attributes. We are analyzing these methods to explore consistency between the acquisition strategy and its driving quality attributes.
Looking Ahead
Government acquisitions are more likely to succeed if a program can align its acquisition strategy and software architecture with each other and with respect to satisfying stakeholder mission and business goals. Our research has shown evidence of misalignments in the form of anti-patterns. Discovering the patterns a program should avoid is a key step toward our objective to develop a method that systematically supports business and mission goals by aligning the acquisition strategy and software architecture.
We welcome opportunities to validate and expand the anti-patterns or pilot our emerging method. Please leave us feedback or questions about our research in the comments section below and we will follow up with you.
Additional Resources
For more information about the Pedigreed Attribute eLicitation Method (PALM), please visit www.sei.cmu.edu/architecture/tools/establish/palm.cfm
To read A Method for Controlled Requirement Specification, please visit http://ss.hnu.cn/oylb/tsp/CORE-mullery.pdf
To read the SEI technical report, Quality Attribute Workshops, please visit http://www.sei.cmu.edu/library/abstracts/reports/03tr016.cfm
For more information about the book, Evaluating Software Architectures: Methods and Case Studies, please visit www.sei.cmu.edu/library/abstracts/books/020170482X.cfm
To read the SEI technical report, Making Architecture Design Decisions: An Economic Approach, please visit www.sei.cmu.edu/library/abstracts/reports/02tr035.cfm
Final Installment in a Three-Part Series
By Bill Nichols, Senior Member of the Technical Staff
Software Engineering Process Management
This post is the third and final installment in a three-part series that explains how Nedbank, one of the largest banks in South Africa, is rolling out the SEI’s Team Software Process (TSP) throughout its IT organization. In the first post of this series, I examined how Nedbank addressed issues of quality and productivity among its software engineering teams using TSP at the individual and team level. In the second post, I discussed how the SEI worked with Nedbank to address challenges with expanding and scaling the use of TSP at an organizational level. In this post, I first explore challenges common to many organizations seeking to improve performance and become more agile and conclude by demonstrating how SEI researchers addressed these challenges in the TSP rollout at Nedbank.
In a 10-year retrospective on agile methods, Jeff Sutherland, co-creator of the Scrum agile method, listed some of the major challenges and impediments for organizations that adopt agile methods, including
demanding technical excellence
promoting individual change
leading organizational change
organizing knowledge
improving education
We’ve encountered and addressed many of these challenges in our work with Nedbank, which is one of several large companies to successfully pilot TSP and undertake an organizational rollout. For a TSP project (or any project using a new process) to succeed, management and other stakeholders must often change their behavior to support the process; they all must use the process in a way appropriate for their roles. Their behavior must support the empowerment of the TSP project team. Conversely, if the stakeholders do not behave appropriately, they can undermine the empowerment of the team and its ability to complete the project successfully.
Demanding Technical Excellence
In our experience, the key to success is to train management on how to ask for technical excellence and to know it when they see it. Technical people often know the details far better than their managers. Managers must be provided training and guidance on how to set the right goals for these empowered teams and to track the right measures. For Nedbank, the most important outcome of technical excellence is low rates of fielded severity 1 defects, that is, defects that stop the system. With TSP, this quality goal is not a matter just for quality assurance (QA), but for the development teams to manage.
In an organization with self-managed teams based on TSP principles and practices, the teams and individuals manage the technical work. The teams define their process, estimate their work, track progress, and collect data on all defects. Status reports do not include detailed data, but summaries. Managers who oversee technical excellence need to hold teams accountable, but must avoid micromanaging the details of the technical work. This trust, while sometimes hard for management to grant, is essential to team empowerment and commitment.
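As a rough illustration of summarizing rather than exposing detailed data, consider rolling task-level plan/actual records up into a one-line status. This is not TSP tooling; the task names, hours, and the earned-value-style summary are invented for the sketch.

```python
# Illustrative sketch (not TSP tooling): roll detailed task data up into the
# kind of summary a self-managed team might report to management.
tasks = [
    # (name, planned_hours, actual_hours, done) -- invented data
    ("Design review",  6.0,  7.5, True),
    ("Module A code", 20.0, 18.0, True),
    ("Module B code", 25.0, 12.0, False),
    ("Unit test",     10.0,  0.0, False),
]

planned = sum(p for _, p, _, _ in tasks)
actual  = sum(a for _, _, a, _ in tasks)
earned  = sum(p for _, p, _, done in tasks if done)  # planned value of finished tasks

print(f"Plan: {planned:.0f} h, spent: {actual:.0f} h, "
      f"earned value: {100 * earned / planned:.0f}% complete")
```

Management sees the summary line; the task detail stays with the team, which is the trust boundary the paragraph above describes.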
At Nedbank, senior management reviews a laundry list of visible outcomes for the project including adherence to committed delivery date, cost estimation accuracy, defects in QA and production, and accuracy of status reports. The teams are thus held accountable for project outcomes, managing their work, and keeping management informed. This accountability eliminates the fear of the data being misused, empowers the teams to make decisions, and aligns incentives and goals. To expand TSP use beyond early adopters, the organization must make expectations explicit and non-threatening. Project scorecards must focus on publicly visible outcomes rather than on detailed process data. TSP includes training for management to help with this transition. This training is just one aspect of managing organizational change.
Promoting Individual Change and Leading Organizational Change
Attempts to change organizational behavior often fail. At the SEI, we address this difficulty by providing change-management training to TSP coaches. This training is effective at the individual and team level, but more is needed on an organization-wide scale. To address this gap, we coach the executive team on change management and the logistics of rollout and sustainment. The organization needs to build these new ways of working into its processes for
training
project planning
project evaluation
personnel evaluations
Nedbank is fortunate to have an executive who understands the need for change and the difficulties surrounding it, and has provided resources to ease the transition.
Organizing Knowledge and Improving Education
Education must be tailored to the target audience. TSP provides specific training for coaches, instructors, developers, non-developer team members, team leads, and executive management. This training, which also addresses another challenge referenced by Sutherland, helped all involved at Nedbank understand the principles behind the change and their role in making new organizational practices a success. Through the Center of Excellence (COE) presented in my second blog posting, Nedbank ensures that training is performed and that everyone involved is prepared to work on a TSP project.
The function of organizing knowledge remains a work in progress as the COE collects project data and experiences. Perhaps the greatest barrier to collection of TSP data at the organizational level is the lack of enterprise tools. We are teaming with two large partners to develop the next generation of enterprise-level TSP data tools, but these tools are still under development. Agile development organizations must also secure sponsorship, build capability, identify and train change agents, develop organizational support for the initiative, track progress, and highlight success to senior management. They must not associate long-term success with any single project, but rather with demonstrated and sustained improvement over many projects and many people. Nedbank, which is at the leading edge of TSP implementation, started down this path with a rollout strategy that included funding and time for
training executives, project leads, coaches, and team members
setting expectations about how projects will be planned and conducted
gathering, reporting, and analyzing specific project data
The SEI worked with the COE to help Nedbank pilot TSP and train instructors and coaches. TSP coaches continue to help Nedbank plan and implement its organizational rollout.
Results: An Organization Changed
Nedbank now begins all new software projects with TSP. This effort is led by the TSP COE, which estimates needs and supports teams with coaching and training. In addition to providing operational support, the Nedbank COE collects data to track organizational progress and provides relevant planning data for its teams. Relevant data includes ranges of project schedule and cost variances, scope growth, component size, and effort estimation accuracy, normalized defect levels in QA and user-acceptance test, schedule and cost ratios of development, testing, maintenance, cost of specific activities such as peer inspection, and cost of rework. Nedbank uses this data at the organization level to
Assess the overall cost efficiency of work. The benefits in schedule predictability, quality, and overall cost containment are made explicit and real.
Plan projects. The use of historic data of comparable projects means the planning parameters are no longer just guesses. We have data from similar projects at Nedbank to suggest how results might be affected by the duration of front-end or back-end projects, the duration in testing, the calendar time that resources must be committed to support testing, and changes in the number of staff.
Improve performance. Technical excellence and quality are economic decisions. Improving cost, quality, or schedule performance requires change, (e.g., taking time to plan, training for inspections, and measuring quality early). It’s important to know what changes will occur and what those changes will cost.
Only by understanding the current process can the organization set realistic goals. By combining realistic goals and a data-driven knowledge of the process, empowered teams can make specific changes to the way they do work and evaluate the results. Simple outcome measures include high-severity defects in production, schedule overruns, and cost per change. To improve performance, Nedbank measured how much effort and time were required for testing and how much effort was devoted to defect fixes. With reliable data consistently defined in projects across the organization, Nedbank is building performance benchmarks and can now begin to make data-based cost-benefit analyses of process changes.
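The benchmark-building step described above can be sketched in a few lines. The sketch below is illustrative only (field names, sample values, and the specific measures are invented, not Nedbank data): it derives simple outcome measures such as effort estimation accuracy and schedule variance from per-project records, then collapses them into the kind of organizational ranges a planning team could use.

```python
# Illustrative sketch of organizational benchmark measures derived from
# per-project records. All field names and values are hypothetical.

def estimation_accuracy(estimated, actual):
    """Ratio of actual to estimated effort; 1.0 means a perfect estimate."""
    return actual / estimated

def schedule_variance_pct(planned_days, actual_days):
    """Schedule overrun as a percentage; positive means the project ran late."""
    return 100.0 * (actual_days - planned_days) / planned_days

projects = [
    {"name": "A", "est_effort": 400, "act_effort": 480,
     "planned_days": 90, "actual_days": 99},
    {"name": "B", "est_effort": 250, "act_effort": 240,
     "planned_days": 60, "actual_days": 57},
]

accuracies = [estimation_accuracy(p["est_effort"], p["act_effort"]) for p in projects]
variances = [schedule_variance_pct(p["planned_days"], p["actual_days"]) for p in projects]

# The range of historical values becomes a planning benchmark for new projects.
benchmark = {
    "accuracy_range": (min(accuracies), max(accuracies)),
    "variance_range": (min(variances), max(variances)),
}
print(benchmark)
```

With consistent definitions across projects, ranges like these replace guesswork in planning new work.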
When Nedbank introduced detailed project planning and tracking, the use of TSP made it possible to see how these changes altered time spent in specific activities and the associated results on schedule performance, cost containment, and quality at delivery. In the context of a defined process, a few simple measures—direct effort, defects, schedule, and size—transformed their ability to see and manage their software releases.
Looking Ahead
We are continuing to work with Nedbank (and other organizations) to share our TSP research. Nedbank’s COE currently focuses on growing capacity during the rollout. The process is arduous and will take several years to complete, which can lead to frustration. Shortcuts, such as incomplete training, lack of coaching support, or stakeholders who have not yet fully bought into the new approach, will degrade results and likely create resistance. When the rollout is complete, the emphasis will shift to sustainment. As shown in the COE’s pilot video, Nedbank is already seeing benefits from these changes.
If you’re interested in learning more about TSP, please consider attending the upcoming TSP Symposium in St. Petersburg, Florida, USA.
Additional Resources
For more information about TSP, please visit www.sei.cmu.edu/tsp
For more information about the 2012 TSP Symposium, please visit www.sei.cmu.edu/tspsymposium/2012/
To read the SEI technical report Deploying TSP on a National Scale: An Experience Report from Pilot Projects in Mexico, please visit www.sei.cmu.edu/library/abstracts/reports/09tr011.cfm
To read the Crosstalk article A Distributed Multi-Company Software Project by Bill Nichols, Anita Carleton, & Watts Humphrey, please visit www.crosstalkonline.org/storage/issue-archives/2009/200905/200905-Nichols.pdf
To read the SEI book Leadership, Teamwork, and Trust: Building a Competitive Software Capability by James Over and Watts Humphrey, please visit www.sei.cmu.edu/library/abstracts/books/0321624505.cfm
To read the SEI book Coaching Development Teams by Watts Humphrey, please visit www.sei.cmu.edu/library/abstracts/books/201731134.cfm
To read the SEI book PSP: A Self-Improvement Process for Engineers by Watts Humphrey, please visit www.sei.cmu.edu/library/abstracts/books/0321305493.cfm
By Donald FiresmithSenior Member of the Technical StaffAcquisition Support Program
Engineering the architecture for a large and complex system is a difficult and lengthy undertaking. System architects must perform many tasks and use many techniques if they are to create a sufficient set of architectural models and related documents that are complete, consistent, correct, unambiguous, verifiable, usable, and useful to the architecture’s many stakeholders. This blog posting, the second in a two-part series, takes a deeper dive into the Method Framework for Engineering System Architectures (MFESA), which is a situational process engineering framework for developing system-specific methods to engineer system architectures.
In our previous blog entry, we introduced MFESA and its four components:
the MFESA ontology defining the foundational concepts underlying system architecture engineering
the MFESA metamodel defining the base superclasses of method components
the MFESA repository of reusable method components
the MFESA metamethod for creating project-specific methods using method components from the MFESA repository
We also briefly discussed the applicability of MFESA and how it simultaneously provides the benefits of standardization and flexibility. In this blog posting, we will take a closer look at the four components comprising MFESA.
The MFESA Ontology
To create a complete and well-defined method for performing system architecture engineering, it is first necessary to understand the terminology (technical jargon) and concepts underlying this area. This understanding goes beyond a mere glossary of terms to encompass an information model that also describes how these concepts relate to each other. The MFESA ontology defines the domain of system architecture engineering and is the foundation on which the rest of MFESA is built.
Figure 1 below summarizes many of the most important contents of the MFESA ontology, an information model of the foundational concepts underlying system architecture engineering. Starting at the center of the diagram and moving to the left, we see that system architecture comprises a combination of
architectural structures that can be static or dynamic as well as logical or physical
architectural decisions that include the use of architectural styles, architectural patterns, and architectural mechanisms
Starting at the center and moving to the right, we see that system architecture can be documented in many ways including in the form of
architectural descriptions, some of which are various types of architectural documents as well as various types of architectural models that model the different kinds of architectural structures
executable representations, such as architectural prototypes, architectural simulations, and executable architectures
At the top of the diagram, we see that the architectural concerns of stakeholders are architectural drivers, including architecturally significant requirements that drive the engineering of the system architecture. Architectural concerns are also often quality focus areas (i.e., quality characteristics such as availability, performance, portability, reliability, robustness, safety, security, and usability), and architectural support for these qualities can be organized in the form of architectural quality cases (a.k.a. assurance cases) that provide arguments and evidence that the architecture adequately supports the architecturally significant requirements.
The MFESA Metamodel
The second MFESA component is a process metamodel that is restricted to system architecture engineering. Figure 2 shows the MFESA view of the concepts: process, method, and process metamodel.
The system architecture engineering processes that are performed on different projects are at the lowest level of Figure 2. Each process consists of components such as
actual work products (e.g., architectural models and architecture documents)
actual work units (e.g., instances of architecture engineering tasks and techniques) that are used to produce the work products
actual workers (e.g., specific architecture teams and architects) who perform the work units to produce the work products.
The middle layer consists of system architecture engineering process models that are the as-intended methods for engineering system architectures. These methods contain reusable components that describe their instances: the process components. MFESA thus recognizes that there is both a theoretical and practical difference between the methods documented in standards, procedures, and guidelines (middle layer) and the work people actually perform on their projects (bottom layer).
At the top level of this three-level structure is the MFESA system architecture engineering process metamodel, which models the process model. This metamodel consists of metamethod components, which are the abstract subclasses that are specialized to produce the method components. For example, the MFESA metamethod component task is subclassed to produce 10 specific tasks of system architecture engineering that, in turn, are instantiated as actual task executions on the project.
Figure 3 shows the four metamethod components within the MFESA process metamodel. They are the abstract classes of process components that are subclassed to create the MFESA method components.
The MFESA repository contains an extensive set of reusable method components derived via subclassing from the MFESA metamethod components. Figure 4 depicts the first nine method components (abstract classes of process components). MFESA thus recognizes three types of architecture workers who perform three types of architectural work units to produce three types of architecture work products. The complete class hierarchy of method components is considerably larger and includes the concrete method components that are instantiated to produce the actual process components seen on real projects.
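The three-level structure described above, abstract metamethod components subclassed into reusable method components, which are in turn instantiated as actual process components on a project, maps naturally onto a class hierarchy. The following is a minimal sketch with invented class names; it illustrates only the subclassing-then-instantiation relationship, not MFESA's actual component definitions:

```python
# Illustrative class-hierarchy sketch of MFESA's three levels.
# All names are invented for illustration.
from abc import ABC

# Top level: metamethod components (abstract superclasses).
class WorkUnit(ABC): ...
class WorkProduct(ABC): ...
class Worker(ABC): ...

# Middle level: reusable method components (subclasses in the repository).
class Task(WorkUnit):
    def __init__(self, name):
        self.name = name

class ArchitectureDocument(WorkProduct):
    def __init__(self, title):
        self.title = title

class Architect(Worker):
    def __init__(self, person):
        self.person = person

# Bottom level: actual process components instantiated on a real project.
identify_drivers = Task("Identify the architectural drivers")
sad = ArchitectureDocument("System Architecture Document")
lead = Architect("J. Smith")

print(type(identify_drivers).__name__, isinstance(identify_drivers, WorkUnit))
```

The metamodel's value is exactly this separation: the abstract level stays fixed while the middle (repository) level grows, and the bottom level varies project by project.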
Example reusable method components include subtypes of:
architectural work products, such as architectural representations (which we saw in the MFESA ontology) that include both architectural documents (e.g., system architecture document, software architecture document, and architecture vision document, various types of architectural whitepapers and reports such as how the architecture handles concurrency or fault tolerance) and architectural models (e.g., class diagrams, sequence diagrams, statecharts, and data flow diagrams)
architecture workers, such as the system architect, software architect, and architecture team, and various types of architecture modeling and documentation tools
architectural work units, such as the many types of architecture tasks (e.g., identify the architectural drivers and maintain the architecture and its representations) and techniques (e.g., brainstorming and architectural patterns)
The MFESA Metamethod
As mentioned in the previous blog entry, MFESA is not a method for engineering system architectures but rather a framework for creating methods for engineering system architectures. Figure 5 shows the MFESA metamethod for creating these methods. Each box in the figure is a step in the method.
The first metamethod step determines the project’s needs regarding system architecture engineering methods. The second step determines the number of such methods that are needed, which for most projects is one. The third step determines whether
a previously constructed method can be tailored to fit the specific needs of the project, in which case a method is selected and then tailored, or
whether a new method needs to be constructed from the reusable method components, in which case the relevant method components are selected, tailored, and integrated to form the new method
In both cases, the resulting method(s) must be documented, typically in plans, standards, procedures, guidelines, templates, and user manuals. The documented method(s) must also be verified as complete, consistent, correct, and usable. Finally, the verified method(s) must be approved and published.
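The metamethod steps above can be sketched as a decision flow. Everything in this snippet is invented for illustration (MFESA prescribes a method-engineering process, not code): it shows only the tailor-an-existing-method versus construct-from-components branch, followed by the document/verify/approve tail, which is elided to a status change here:

```python
# Illustrative decision flow for the MFESA metamethod. All names invented.

class Method:
    def __init__(self, name, domains):
        self.name, self.domains, self.status = name, set(domains), "draft"

    def fits(self, needs):
        """A previously constructed method fits if it covers the project's domain."""
        return needs["domain"] in self.domains

    def tailor(self, needs):
        self.name += f" (tailored for {needs['project']})"
        return self

def construct_from_components(needs, repository):
    """Select, tailor, and integrate relevant reusable method components."""
    parts = [c for c in repository if needs["domain"] in c["domains"]]
    return Method(f"new method from {len(parts)} component(s)", [needs["domain"]])

def create_method(needs, existing_methods, repository):
    for m in existing_methods:          # step 3a: select and tailor an existing method
        if m.fits(needs):
            method = m.tailor(needs)
            break
    else:                               # step 3b: build a new one from the repository
        method = construct_from_components(needs, repository)
    method.status = "approved"          # document, verify, approve (elided)
    return method

needs = {"project": "GroundStation", "domain": "avionics"}
repo = [{"name": "identify drivers", "domains": {"avionics", "naval"}}]
m = create_method(needs, [Method("baseline", ["enterprise IT"])], repo)
print(m.name, m.status)
```

Here the baseline method does not fit the project's domain, so the flow falls through to constructing a new method from the one relevant repository component.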
Wrapping Up
Systems, organizations, and contractual relationships between acquirers and developers are diverse and multi-dimensional. No single architecture engineering method is therefore sufficiently complete and tailorable to be appropriate for all situations. MFESA is not a method for engineering system architectures but rather a method framework that can be used by system architects and process engineers to produce appropriate system architecture engineering methods using situational method engineering. To accomplish this, MFESA consists of four interrelated components:
an ontology that defines the foundational concepts of system architecture engineering and the relationships between them
a metamodel that defines the base classes of reusable method components from which all other method components are subclassed
a repository of reusable method components (architectural work products to be produced, work units to be performed, and architectural workers) from which situation-specific system architecture engineering methods can be constructed
a metamethod for selecting appropriate method components, tailoring them, and integrating them to produce the appropriate method
Even if you are content using a single organizational system architecture engineering method, you may find it worthwhile to peruse the MFESA repository to see if your method is missing any important elements. Consultants, trainers, and educators may also find MFESA a good foundation on which to build training courses and classes on system architecture engineering. Finally, researchers in system architecture engineering methods and situational method engineering may find MFESA useful as a repository of reusable method components.
Additional Resources
MFESA is primarily documented in the book, The Method Framework for Engineering System Architectures, published in 2009 by CRC Press. The book was co-authored by a six-member team from the SEI, MITRE, and the U.S. Air Force. The book was recently added to the Intel Corporation’s Recommended Reading List.
To see a tutorial on MFESA presented at the 2011 IEEE International Systems Conference (ISC) in Montreal, Quebec, Canada please go to
http://donald.firesmith.net/home/publications/publicationsbyyear/2011/2011-MFESA_ISC.pdf
To see a tutorial on MFESA presented at the 21st System and Software Technology Conference (SSTC) in Salt Lake City, Utah, please go to
http://donald.firesmith.net/home/publications/publicationsbyyear/2009/MFESA-SSTC.pdf
By Andrew P. MooreSenior Member of the Technical StaffThe CERT Program
Since 2001, researchers at the CERT Insider Threat Center have documented malicious insider activity by examining media reports and court transcripts and conducting interviews with the United States Secret Service, victims’ organizations, and convicted felons. Among the more than 700 insider threat cases that we’ve documented, our analysis has identified more than 100 categories of weaknesses in systems, processes, people or technologies that allowed insider threats to occur. One aspect of our research has focused on identifying enterprise architecture patterns that protect organization systems from malicious insider threat. Enterprise architecture patterns are organization patterns that involve the full scope of enterprise architecture concerns, including people, processes, technology, and facilities. Our goal with this pattern work is to equip organizations with the tools necessary to institute controls that will reduce the incidence of insider compromise. This blog post is the second in a series that describes our research to create and validate an insider threat mitigation pattern language that focuses on helping organizations balance the cost of security controls with the risk of insider compromise.
Our Approach
The aim of our pattern work is to develop insider threat mitigation strategies that are scientifically and operationally valid. To create those strategies, we employ mixed-methods research, which combines both qualitative and quantitative approaches. Among the various types of insider crimes—IT sabotage, theft of intellectual property (IP), national security/espionage, and fraud—our work has initially focused on IP theft, which includes theft of an organization’s proprietary information.
We have already established a mitigation pattern of IP theft that is based on the types of crime we’ve observed in our case database. This pattern is oriented around the observation that many IP thieves steal information close to announcing their resignation. This behavior gives organizations a window of opportunity for identifying and responding to insider IP theft activity.
Since it’s costly and time-consuming for organizations to monitor departing employees 100 percent of the time, we directed our resources at the timeframe in which IP theft is most likely to occur. Our pattern focused on this question:
I am establishing a program that looks for evidence of insider theft of my organization’s IP. Review and analysis of employee activities is costly. How can I improve the efficiency of resources I direct at IP theft detection?
To help answer this question, our research team decided to focus on the distribution of durations between the following two dates across our sample of insider threat cases:
the date of the last confirmed theft of IP event prior to an insider’s termination and
the date of the insider’s termination
Past qualitative analyses of our insider threat data have suggested that the approach of a termination day accelerates the insider’s decision-making process in a nonlinear manner. Our primary hypothesis is therefore the following:
Primary Hypothesis: The distribution of the times between an insider IP thief’s last confirmed theft of IP before termination and the date of the insider’s termination follows a nonlinear distribution.
Preliminary Analysis
To determine whether our data on insider theft of IP crimes supports this hypothesis, we collaborated with Dave Zubrow, acting chief scientist with the SEI’s Software Engineering Process Management Program and lead of the Software Engineering Measurement & Analysis Initiative. To test the hypothesis, we used Crystal Ball software to evaluate the best fit distribution for our data on 30 IP theft cases from the CERT database. The geometric distribution (with p=0.02) was the best fit to our data when compared with other candidate distributions.
We also ran a Monte Carlo simulation that generated 1,000 resampled data sets from the best fit distribution. From that data set, we graphed the cumulative probability function. We found that about 70 percent of insider IP theft cases can be caught by reviewing for significant theft events by the insider during the last 60 days of employment. Perhaps more importantly, the graphed function provides a tool to help organizations adjust their review window in an informed way, based on their particular risk aversion for IP theft and the cost of insider activity review within the organization.
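The arithmetic behind the 60-day window can be checked directly. Assuming the reported geometric best fit with p = 0.02 per day of the duration between the last theft event and termination, the cumulative probability that the last theft falls within a review window of n days is 1 - (1 - p)^n. The snippet below reproduces the roughly 70 percent figure and shows how a more risk-averse organization might size its window (the 90 percent target is an invented example, not from the study):

```python
# Sanity check of the reported geometric fit (p = 0.02 per day assumed).
import math

p = 0.02                 # best-fit geometric parameter reported above
review_window_days = 60  # days of pre-termination activity reviewed

# Probability the insider's last theft event falls inside the review window.
prob_caught = 1 - (1 - p) ** review_window_days
print(f"{prob_caught:.2f}")  # -> 0.70, matching the reported ~70 percent

# Inverting the relation sizes the window for a chosen coverage target.
target = 0.90  # invented example target
window_for_target = math.ceil(math.log(1 - target) / math.log(1 - p))
print(window_for_target)  # days of review needed to cover ~90% of cases
```

This inversion is exactly the "tool to adjust the review window in an informed way" described above: pick a coverage level consistent with the organization's risk aversion and review cost, and read off the window size.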
It is important to emphasize the limitations of our data analysis to date. Our data analysis and results are preliminary in part because of the small number of cases in our data set. While the best-fit distribution was the geometric distribution (as compared to a wide variety of other distributions), the fit was statistically different from the theoretical distribution. While future research will continue to add cases to better identify the underlying distribution and refine our analysis, the resampling approach described above allowed us to use the data that we had to greatest effect. Given that the best fit for the data is the geometric distribution, we contend that this result provides at least prima facie evidence that the subject mitigation pattern will be effective in fighting insider theft of IP. Continuing research will strive to bolster this evidence.
We expect that the patterns and pattern language developed through this research will enable coherent reasoning about how to design enterprise systems to protect against malicious insider activity. Instead of working with vague security requirements and inadequate security technologies, system designers will have a coherent set of patterns that enable them to develop and implement effective strategies against malicious insider activity more quickly and with greater confidence.
Looking Ahead
In addition to collecting and analyzing new cases of insider theft of IP, our future work in this area will explore another critical question with regard to patterns of IP theft:
How can organizations distinguish between insider theft activity and legitimate employee activity?
An answer to this question will help an organization use the mitigation pattern cost effectively by reducing the chance of false positives during the review process.
Evaluating the effectiveness of individual mitigation patterns is just one aspect of our work to help organizations bolster their defenses against malicious activity by insiders. We view our pattern work as a way of helping organizations integrate what we’ve learned into their existing enterprise architecture and practices. The first post in this series described our work to protect next-generation DoD enterprise systems against insider threats by capturing, validating, and applying enterprise architectural patterns.
In our upcoming post in this series, fellow researcher David Mundie will describe a pattern language that we’ve developed to help software architects better understand how to apply patterns in sequence to design a system that provides balanced protection against malicious activity by insiders.
Additional Resources:
To read the SEI technical report, A Pattern for Increased Monitoring for Intellectual Property Theft by Departing Insiders, please visit www.sei.cmu.edu/reports/12tr008.pdf
To read the SEI technical note, Insider Threat Control: Using Centralized Logging to Detect Data Exfiltration Near Insider Termination, please visit www.cert.org/archive/pdf/11tn024.pdf.
To read the CERT Insider Threat blog, please visit www.cert.org/blogs/insider_threat/
By Suzanne Miller, Senior Member of the Technical StaffAcquisition Support Program
All software engineering and management practices are based on cultural and social assumptions. When adopting new practices, leaders often find mismatches between those assumptions and the realities within their organizations. The SEI has an analysis method called Readiness and Fit Analysis (RFA) that profiles a set of practices to surface their cultural assumptions and then uses the profile to help an organization understand its fit with those assumptions. RFA has been used for multiple technologies and sets of practices, most notably for adoption of CMMI practices. The method for using RFA and the profile that supports CMMI for Development adoption is found in Chapter 12 of CMMI Survival Guide: Just Enough Process Improvement. This blog post briefly summarizes the principles behind RFA and describes the SEI Acquisition Support Program’s work in extending RFA to support profiling and adoption risk identification for Department of Defense (DoD) and other highly regulated organizations that are considering or are in the middle of adopting agile methods.
One of the fundamental principles of technology adoption is that of mutual adaptation. This principle asserts that a successful technology adoption by an organization usually requires adaptation of both the technology and the organization. The technology may adapt, for example, by being configurable (allowing different features to be switched on or off) or by allowing localization to a different native language. The organization may adapt by changing some of its business workflows so they are more compatible with the technology or by changing the roles of the people involved in different processes that are affected by the technology.
This blog post is the latest in a series examining our work on the adoption of agile methods in U.S. DoD settings. In July of this year, we kicked off a series that highlighted key ideas and issues associated with applying agile methods to address the challenges of complexity, exacting regulations, and schedule pressures in the DoD.
When an organization adopts a new set of practices, it sees many of the same issues associated with adopting a new hardware or software technology. The SEI has observed that when adopting new practices—as when adopting new technologies—the principle of mutual adaptation applies. One of our observations has been that the closer the organization’s culture is to the implied cultural assumptions of a set of practices, the easier it is for that organization to adopt those practices.
As part of our research in the adoption of agile methods in U.S. DoD settings, we have adapted the RFA profiling technique to accommodate both the typical factors used in RFA and some factors that are more uniquely associated with the DoD acquisition environment. We found that only applying the commercial profile didn’t highlight enough of the issues that we were seeing in our interviews and observations of practice. A technical note on RFA factors for agile adoption in DoD will be published at a future date.
In this post, we want to present the categories and factors that we have identified so far, with the help of our interviewees and our SEI Agile Collaboration Group. This latter group consists of over a dozen DoD and other federal government acquisition practitioners, plus several DoD contractor organization representatives who are all actively adopting various relevant Agile methods in their organizations. We have characterized the following six categories to profile for readiness and fit:
business and acquisition - adoption factors related to business strategy, acquisition strategy, and contracting mechanisms
organizational climate - adoption factors related to sponsorship, leadership, reward systems, values, and similar "soft" issues
system attributes - adoption factors related to the actual characteristics of the system(s) being developed
project and customer environment - adoption factors related to project management norms, team dynamics and support structures, and customer relationships and expectations
technology environment - adoption factors related to the technologies that are in place or planned to support the selected agile methods
practices - a taxonomy of agile practices that is used to understand which practices an organization plans to adopt so that other factors can be calibrated around those expectations
If an organization has used RFA in other settings, the factors that were found in the original RFA are scattered among the business and acquisition, organizational climate, and project and customer environment categories.
Each category has a set of attributes that can be characterized by a statement that would represent what you would expect to see if you were observing a successful agile project or organization operating in relation to that attribute. For example, an attribute of business and acquisition is stated as:
Oversight mechanisms are aligned with agile principles.
Oversight is an aspect of acquisition that can either support or disable an agile project. Alignment of oversight with agile principles thus reduces the risk that oversight will be counterproductive.
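To make the profiling idea concrete, here is a hypothetical sketch of how an RFA-style profile might be scored. The category names come from the list above, but the attribute statements, the 0-4 scale, and the 0.6 threshold are all invented for illustration; RFA itself does not mandate a particular scoring scheme:

```python
# Hypothetical RFA-style profile scoring. Categories come from the post;
# attribute statements, scale, and threshold are invented for illustration.

profile = {
    "business and acquisition": {
        "Oversight mechanisms are aligned with agile principles.": 1,
        "Close stakeholder/developer collaboration is enabled.": 2,
        "Funding for the project has been secured.": 4,
    },
    "organizational climate": {
        "Sponsorship for the adoption is active and visible.": 3,
    },
}

def category_fit(scores, scale_max=4):
    """Average attribute score normalized to 0..1; low values flag adoption risk."""
    values = list(scores.values())
    return sum(values) / (scale_max * len(values))

fit = {category: category_fit(attrs) for category, attrs in profile.items()}

# Categories whose fit falls below the (invented) threshold need mitigation plans.
flagged = [category for category, score in fit.items() if score < 0.6]
print(flagged)
```

In this made-up example, weak oversight alignment and limited collaboration mechanisms drag the business and acquisition category below the threshold, signaling exactly the kind of contract-level adoption risk the category is designed to surface.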
The remainder of this blog posting describes key factors in the business and acquisition category. In future posts, we will explore other categories of factors that deal with issues that pose different challenges to adoption of agile methods.
The Business and Acquisition Category
This category covers issues related to an organization’s business strategy or mission and some specific factors related to acquisition and contracting. Business strategy is an important fit element because many organization values and principles are tied to the strategy. If the strategy changes, the organization’s values may change, creating either a better or worse fit environment for a particular set of practices. Similarly, in DoD settings, certain contracting approaches are more aligned with particular sets of values and practices, and changing the way a contract is formulated can have a significant impact on the values and practices that will be needed to execute that contract. The following list has both a short title that summarizes the statement and a statement that provides a condition or behavior found in an organization successfully using engineering and management methods consistent with agile principles as published in the Agile Manifesto.
Clear Program Goals. Business or program goals are clear and reflect stakeholder concerns.From an agile methods perspective, the organization’s mission or business goals are one of the touchpoints for decision making. If they are not clear—or if they do not adequately reflect the concerns of the organization’s stakeholders—then lower level decision-making runs the risk of being misaligned with the organization’s focus.
Defined Success Strategies. Success strategies (e.g., roadmaps, product portfolios, etc.) are defined and clearly communicated.From an agile methods perspective, being clear about the roadmaps, portfolios, etc. that an organization uses to define its productivity and successful completions is a key to understanding how an individual project fits into the broader organizational mission.
Project Funding Secured. Funding for the project has been secured.This factor may seem obvious and one that is a success criterion for any project, which is true. Of particular importance when applying agile methods to DoD organizations, however, is that there are multiple ways to fund and contract for information technology products and services. Some steps in the formulation of a program can be executed prior to official funding, but there are many tasks that cannot be initiated until the funding allocation process has completed.
Close Stakeholder/Developer Collaboration Enabled. Mechanisms are in place in the contract and acquisition strategy to allow close collaboration between developers and other stakeholders (e.g., certification and accreditation personnel, end users, and others). The fourth principle derived from the Agile Manifesto states, "Business people and developers must work together daily throughout the project." In a commercial environment, the term business people includes managers of the project and end users of the product being developed. In the DoD, these roles may reside in different organizations, and there are multiple business-related stakeholder roles to account for: program office personnel, information assurance, independent verification and validation agents, end users, logisticians, trainers, and others. If the acquisition strategy and associated contract vehicles create barriers to collaboration among these roles and the developer, it will be hard to achieve the performance seen in shoulder-to-shoulder agile implementations.
Interim Delivery Enabled. Mechanisms are in place in the contract and acquisition strategy that allow for interim demonstration and delivery between official releases. The first principle derived from the Agile Manifesto states, "Our highest priority is to satisfy the customer through early and continuous delivery of valuable software." DoD contracts can specify the cadence of delivery in the Statement of Work (SOW) and also in the way they apply different standards and define line items in their Contract Data Requirements List. If a contract specifies only a single delivery of the software, other mechanisms may be needed to permit productive early demonstration and re-orientation of priorities or focus.
Oversight Supports Agile Principles. Contract oversight mechanisms are aligned with agile principles. As with delivery enablement, the contract is the mechanism wherein program office technical and management oversight is specified. Contracts for large acquisition programs typically mandate document-centric capstone reviews, such as Preliminary Design Reviews (PDRs) and Critical Design Reviews (CDRs). These reviews analyze requirements, preliminary design (PDR), and detailed design (CDR) documentation; software development does not begin until all these documents have been approved following the CDR. This linear lifecycle model is not as productive an oversight strategy for contracts employing agile methods, where contracting language enables incremental, more frequent (and less formal) progress reviews. Beyond the contract language itself, the expectations of reviewers and oversight personnel must also be set appropriately.
Clear Alignment of Software Goals/Program Goals. The alignment of software-related goals with program-level goals is clear. This factor is also important in non-agile settings, but its urgency in agile settings comes from the fact that software will be available earlier to test and to integrate with the other parts of the system. For systems engineers unaccustomed to this early access, provisioning test beds consisting of hardware emulators and simulation environments may not get the attention needed to ensure the software part of the program can take advantage of incremental deliveries.
Appropriate Contract Type. Contract type accounts for use of agile or lean methods in the program. This factor may seem obvious, but it’s actually quite a challenge for DoD program offices. Almost any contract type (firm fixed price, indefinite delivery/indefinite quantity, time and materials, level of effort, cost plus incentive fee, etc.) can be used to effectively support development using agile methods. For each contract type, however, the way the agreement is framed determines how effective it will be. The contract type and the acquisition strategy must therefore be aligned to support agile methods implementation.
Appropriate Lifecycle Activities. Lifecycle activities that are planned in the acquisition strategy are compatible with agile methods. It’s not enough that the contract vehicle be written correctly. It’s also important that the lifecycle activities are specified in a way that can leverage the iterative and incremental nature of agile software development. For example, building test support equipment and test suites early in the lifecycle is essential if test-driven development is an agile method being applied.
Agile at-Scale Enabled. The acquisition strategy takes into account the use of agile methods at the scale needed for the program. The most prevalent use to date for agile methods has been on smaller projects, but even in the DoD there have been successful projects with dozens of developers. To appropriately express the agile principles at scale, stakeholders must consider communication mechanisms, architectural patterns, and layered management approaches. If these factors are not taken into account in the acquisition strategy, larger agile implementations may not be resourced effectively.
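To make the checklist nature of these factors concrete, here is a minimal sketch in Python. The factor names are taken directly from this post, but the data structure, function, and scoring approach are illustrative assumptions of ours, not part of the SEI readiness and fit analysis method.

```python
# Hypothetical sketch: the business/acquisition readiness factors from
# this post, treated as a simple checklist. The pass/fail assessment
# scheme below is illustrative only, not the SEI model itself.

BUSINESS_ACQUISITION_FACTORS = [
    "Clear Program Goals",
    "Defined Success Strategies",
    "Project Funding Secured",
    "Close Stakeholder/Developer Collaboration Enabled",
    "Interim Delivery Enabled",
    "Oversight Supports Agile Principles",
    "Clear Alignment of Software Goals/Program Goals",
    "Appropriate Contract Type",
    "Appropriate Lifecycle Activities",
    "Agile at-Scale Enabled",
]

def unmet_factors(assessment):
    """Return the factors a program has not yet satisfied.

    `assessment` maps factor names to True (condition holds) or False
    (condition does not hold); factors not listed are treated as unmet
    and therefore as candidate adoption risks to investigate.
    """
    return [f for f in BUSINESS_ACQUISITION_FACTORS
            if not assessment.get(f, False)]

# Example: a program with clear goals, secured funding, and a suitable
# contract type would still surface delivery and oversight risks.
risks = unmet_factors({
    "Clear Program Goals": True,
    "Project Funding Secured": True,
    "Appropriate Contract Type": True,
})
```

A real assessment would of course involve evidence and judgment rather than booleans, but even this reduced form shows how unmet factors become the inputs to the risk mitigation and issue management strategies described below.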
Looking Ahead
Upcoming blog entries in this series will describe factors in the organizational climate, system attributes, and technology environment categories, as well as the project and customer environment and practices categories. Together, these factors form a picture of the organization planning to adopt agile practices, helping it understand where it is likely to face challenges in adopting the desired practices. From there, risk mitigation and issue management strategies can be defined to minimize the probability and/or impact of the adoption risks that have been identified.
We welcome feedback and comments on both the concept and the content of the model so far, especially if you are a practitioner who is being asked to adapt your own work to accommodate agile methods.
Additional Resources
For more about the readiness and fit analysis method, please visit www.sei.cmu.edu/sos/consulting/sos/readinessandfit.cfm
For more information on management and acquisition considerations in using agile methods in DoD environments, please see our first two technical notes in the Agile Acquisition series:
Agile Methods: Selected DoD Management and Acquisition Concerns, www.sei.cmu.edu/library/abstracts/reports/11tn002.cfm
An Acquisition Perspective on Product Evaluation, www.sei.cmu.edu/library/abstracts/reports/11tn007.cfm