By Julien Delange, Senior Member of the Technical Staff, Software Solutions Division
When life- and safety-critical systems fail, the results can be dire, including loss of property and life. These types of systems are increasingly prevalent and can be found in the attitude and control systems of a satellite, the software-reliant systems of a car (such as its cruise control and GPS), or a medical device. When developing such systems, software and systems architects must balance the need for stability and safety with stakeholder demands and time-to-market constraints. The Architecture Analysis & Design Language (AADL) helps software and system architects address the challenges of designing life- and safety-critical systems by providing a modeling notation that employs textual and graphic representations. This blog post, part of an ongoing series on AADL, describes how AADL is being used in medical devices and highlights the experiences of a practitioner whose research aims to address problems with medical infusion pumps.
Although AADL was initially applied in the avionics and aerospace domains, it is now being used in other domains where software failures have significant impact. The design of life- and safety-critical systems in the medical domain, where an error or software fault may have catastrophic consequences including loss of human life, has been the topic of recent media reports, as well as increased scrutiny from the FDA as described in this story in The New York Times. The size and complexity of medical-related software continue to grow, and the more complex the software, the more likely it is to contain bugs or errors. A classic example was the Therac-25 linear accelerator, which malfunctioned due to software problems and gave overdoses of radiation to patients.
Medical devices must be designed and validated carefully to shield users from software defects. For that purpose, engineers must adhere to a strict development process that requires analyzing and validating their architecture prior to implementation efforts. Such a process would ensure requirements enforcement and reduce potential re-engineering efforts.
The SEI’s Work on AADL
The SEI has been one of the lead developers of AADL and has participated on the AADL standardization committee since its inception. SEI researchers also developed the Open Source AADL Tool Environment (OSATE), which is the reference implementation for supporting AADL within the Eclipse environment. Other researchers and developers have used AADL as a foundation to design and analyze life- and/or safety-critical systems. Likewise, several projects have used AADL to design software architecture and analyze, validate, or improve various quality attributes, such as reliability, latency/performance, or security.
While AADL was initially conceived for use in avionics or aerospace systems, it can be used to design systems in other domains that place a premium on life- and/or safety-critical behavior. An apt example is the domain of medical devices, which have expensive and time-consuming certification and accreditation processes to validate and verify that they are free of software bugs. Thus, the AADL design and validation methods and tools that have been developed for avionics and aerospace can be applied and reused between the various domains.
One Practitioner’s Experience
Oleg Sokolsky, a research associate professor with the department of computer and information science at the University of Pennsylvania, first became involved in AADL in 2004 after applying for and receiving a Small Business Innovative Research grant. Sokolsky described his introduction to AADL:
The sponsor was looking for modeling languages and analysis tools for complex distributed systems. We had some tools that would be applicable, but we had to make them work in a wider context. A connection to the architecture level was needed. I recently heard a workshop talk on AADL and it was still fresh in my head. I have a colleague who had a startup, so he led the tool development effort. We began by applying schedulability analysis techniques to AADL models, using AADL semantics. It was a good fit for the tools we had in house.
Next, Sokolsky’s team extended the analysis for the same semantic representation and built what he believes is the first simulator for AADL models.
Applying AADL in the Medical Domain
After their initial application, Sokolsky said his team applied AADL to modeling whatever systems they were developing at the time, including medical devices. Sokolsky’s team started using AADL in their research to address software problems as part of the U.S. Food and Drug Administration’s (FDA) Infusion Pump Improvement Initiative, which seeks to make infusion pumps safer and more reliable. Sokolsky described some of the architecture problems with the infusion pump that his team was trying to address:
In several cases, there wasn’t enough sanity checking on inputs. Software should be designed in such a way that it should reject wrong inputs. Unfortunately, several pump models weren’t designed that way. It was very easy for a nurse to enter a wrong value. Over-infusion and under-infusion are both serious hazards.
Specifically, his team used AADL to describe the infusion pump’s software architecture, the platform-independent controller (which receives inputs and reacts to outputs), and the platform-dependent software, such as drivers for sensors and actuators between them. Sokolsky described the ways in which his team used architecture and modeling techniques to address some of the problems with the infusion pump:
In the reference architecture, there should be a module that assesses quality of inputs and rejects the wrong ones. Also, timing information on the components in the architecture can help us evaluate latency of data flows to determine whether the pump will stop the motor fast enough in case of an alarm.
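The two ideas in this quote, rejecting implausible inputs and checking the latency of a data flow against a deadline, can be sketched in a few lines of Python. This is only an illustration and not AADL or the team's reference architecture: the rate limits, component names, latency figures, and deadline below are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical limits for illustration only; real limits would come from
# the device's hazard analysis and clinical requirements.
MIN_RATE_ML_PER_H = 0.1
MAX_RATE_ML_PER_H = 999.0


def validate_rate(raw: str) -> float:
    """Return the infusion rate in mL/h, or raise ValueError if it is implausible."""
    try:
        rate = float(raw)
    except ValueError:
        raise ValueError(f"not a number: {raw!r}")
    # NaN and infinity fail this range check as well.
    if not (MIN_RATE_ML_PER_H <= rate <= MAX_RATE_ML_PER_H):
        raise ValueError(f"rate {rate} mL/h is outside the allowed range")
    return rate


@dataclass
class Component:
    name: str
    worst_case_latency_ms: float


def flow_latency_ms(flow):
    """Worst-case end-to-end latency of a flow: the sum over its components."""
    return sum(component.worst_case_latency_ms for component in flow)


# Hypothetical alarm-to-motor-stop flow and deadline.
alarm_flow = [
    Component("occlusion_sensor", 10.0),
    Component("alarm_logic", 20.0),
    Component("motor_driver", 5.0),
]
DEADLINE_MS = 50.0
meets_deadline = flow_latency_ms(alarm_flow) <= DEADLINE_MS
```

Keeping the input check in a single function mirrors the idea of a dedicated module that every entered value must pass through before it reaches the controller.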
Over the years, Sokolsky’s AADL-based research has focused on developing platform-independent code generation techniques, specifically for model-based development of infusion pump software.
We had a very detailed model of the safe controller for the infusion pump, and we were looking for ways to generate code for it automatically. What we realized is that existing code generation tools are producing platform-independent code that had to then be manually connected to the platform. We were looking for ways to minimize the amount of new code that had to be written manually. That’s where AADL came in. It allowed us to describe the various platform dependencies that existed in our systems and characterize, for each kind of dependency, what code needs to be generated for a particular platform.
Sokolsky said his team continues to use AADL because they understand the semantics of the language and have long-standing experience with it. One of the challenges they face, however, is keeping up with all of the changes that are occurring within the AADL community, which operates its own wiki site:
Right now there are so many things going on in this AADL ecosphere. We have a hard time picking things that are mature enough that we can actually use.
Looking Ahead
Sokolsky described various AADL-based design artifacts that his team has developed for the infusion pump. The artifacts include hazard analysis documents, requirement specifications, models of pump controllers, verification results, generated code, and a preliminary safety case:
The goal is to provide a set of artifacts for the community and the FDA, so, on the one hand, researchers have a way to apply their formal methods to these artifacts. The FDA can look at the safety argument that we are constructing. It gives feedback to the community.
Sokolsky said that the manufacturer’s intellectual property restrictions do not impede the dissemination of documents, models, and code developed by his research team. Such artifacts can serve as case studies for the wider research community, enabling comparison of modeling and analysis tools from different research groups. These artifacts could also potentially be used by the FDA to showcase good development practices to manufacturers and develop future guidance documents for industry.
Future Work
SEI researchers are working to extend AADL so that it can describe system faults and analyze the safety of systems. These capabilities will enable AADL users to augment their software architectures with fault description and specify error source and propagation across hardware and software components. This enhanced notation would then be used to improve system validation and detect faults that are potentially not spotted during the initial development process and that may lead to errors. This work consists of two parts:
enhancing the core of the technology with a sub-language to describe architecture safety concerns
developing new tools to process this additional information within the model, validate system safety, and assist engineers in the production of documents for validating the system
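To make the idea of specifying error sources and their propagation across components more concrete, here is a minimal Python sketch. It is not the proposed AADL sub-language or an OSATE feature; the component names, the connections, and the choice of which component contains errors are hypothetical.

```python
from collections import deque

# Hypothetical connections between pump components (source -> destinations).
CONNECTIONS = {
    "flow_sensor": ["controller"],
    "keypad": ["input_checker"],
    "input_checker": ["controller"],
    "controller": ["motor_driver", "alarm"],
    "motor_driver": [],
    "alarm": [],
}

# Components assumed to detect an incoming error and stop it from spreading.
CONTAINS_ERRORS = {"input_checker"}


def affected_components(source):
    """Return every component that an error starting at `source` can reach."""
    reached = set()
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for successor in CONNECTIONS.get(current, []):
            if successor in reached:
                continue
            reached.add(successor)
            # A containing component receives the error but does not pass it on.
            if successor not in CONTAINS_ERRORS:
                queue.append(successor)
    return reached
```

Under these assumptions, an error that starts at the keypad stops at the input checker, while an error at the flow sensor reaches the controller, the motor driver, and the alarm. A safety analysis tool would ask the same kind of question of the architecture model.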
For the purpose of improving system validation and detecting faults, a new language has been proposed by the SEI and is currently under review by the AADL standardization committee. The language would be published as an official standard annex later by the Society of Automotive Engineers (SAE). In addition, the SEI is developing new functions within OSATE to analyze system safety and produce documents required by safety evaluation standards, such as SAE ARP4761. Looking ahead, Sokolsky said that his team will work with the SEI to use the error-model annex and specify an error model for the pump to reason about its reliability.
AADL is fairly versatile and, if used to its full capacity, it gives a boost to whatever other development techniques that you are using on top of it.
Our next blog post will discuss the use of AADL in the System Architecture Virtual Integration (SAVI) project. SAVI is part of the collaborative and applied research performed by an aerospace industry research cooperative. Our post will describe the use of AADL within this project to improve software safety and reliability.
Additional Resources
To view the AADL Wiki, please visit https://wiki.sei.cmu.edu/aadl/index.php/Main_Page
For more information about AADL, please visit http://www.aadl.info
For more information about the Generic Infusion Pump research at the University of Pennsylvania, please visit http://rtg.cis.upenn.edu/gip.php3
SEI Blog, Jul 27, 2015 02:24pm
By Robert Ferguson, Software Solutions Division
Software sustainment involves coordinating the processes, procedures, people, information, and databases required to support, maintain, and operate software-reliant aspects of DoD systems. The 2011 book Examination of the U.S. Air Force’s Aircraft Sustainment Needs in the Future and its Strategy to Meet Those Needs states
The Air Force is concerned that the resources needed to sustain its legacy aircraft may increase to the point where they could consume the resources needed to modernize the Air Force.
With millions of lines of code riding on aircraft and automobiles, the cost of software sustainment is increasing rapidly. Several studies show that the cost of sustainment is already as much as 70 percent of the total cost for the life of the software. All the armed services face similar challenges, including deciding how to improve the efficiency and productivity of sustainment organizations and how much should be invested in these improvements. This blog post describes an SEI research initiative aimed at developing an economic model to help anticipate costs and postpone the potential tipping point when sustaining current products is less attractive than replacing legacy systems.
Balancing Stakeholders and Resources
The software sustainment problem is particularly complex for the Department of Defense (DoD) because funding decisions involve an understanding of tensions between three different perspectives:
operational need (warfighter view)
management of the portfolio (materiel view)
capability and capacity of the sustaining organization (process, skills, tools and people)
Our research is motivated by the need to help the DoD make decisions about allocating resources between sustainment work (supporting the warfighter) and improving the performance of sustainment organizations in ways that optimize long-term value to the armed services. Performance improvements are needed in many situations, including
new test kits to support deployment of new radar or electronic warfare capabilities
software analysis tools to analyze existing code to help developers learn a legacy system and accelerate their understanding of supported products
tools supporting automated software testing to accelerate the testing process and assure test coverage of supported products
training for engineers when upgrades to an existing system employ new processors, operating systems, or programming languages
This SEI research initiative is developing an economic model to support decisions about allocating investments in various performance improvement alternatives. The model will analyze various factors, such as demand for sustainment, capacity of an organic workforce to perform sustainment activities, as well as timing of funding in terms of its impact on long-term costs and readiness of aircraft fleets.
I, along with fellow researchers Sarah Sheard, Andrew Moore, William Nichols, and Mike Phillips, have spent the last year exploring several issues related to software sustainment. Part of our work included developing a system dynamics model to study how funding decisions and the timing of implementation of changes in sustainment organizations affect the performance of both sustainers and warfighters. We theorized that small differences in funds and timing can have large impacts on performance. This type of system dynamics model uses stocks and flows to represent sustainment performance over time.
Foundations of Our Approach
Jay Forrester, a professor at Massachusetts Institute of Technology, founded system dynamics in the 1950s. System dynamics modeling studies the changes in many interrelated variables. Since its inception, system dynamics has been used as a modeling approach in the study of economics and organizations.
With many factors changing simultaneously, simpler economic models (such as return-on-investment and net present value) are insufficient. The interaction of the many inputs to sustainment work can cause emergent effects, such as a sudden and dramatic change in ability to meet demand. Through modeling and analysis research, we are looking for the minimum amount of data that can be used to forecast a sudden and dramatic change (which is also known as a "tipping point"). Forecasting the tipping point gives decision makers time to take action before a problem becomes intractable.
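A stock-and-flow model and its tipping behavior can be illustrated with a very small Python sketch. This is not the SEI model: it assumes a single stock (the backlog of sustainment work), one inflow (demand), and one outflow (completed work, limited by capacity), with made-up monthly numbers.

```python
def simulate_backlog(demand_per_month, capacity_per_month, months, initial_backlog=0.0):
    """Step a single stock (the backlog) forward one month at a time."""
    backlog = initial_backlog
    history = []
    for _ in range(months):
        inflow = demand_per_month
        # The organization cannot complete more work than exists,
        # nor more than its capacity allows.
        outflow = min(capacity_per_month, backlog + inflow)
        backlog = backlog + inflow - outflow
        history.append(backlog)
    return history


# Capacity slightly above demand: the backlog stays at zero.
stable = simulate_backlog(demand_per_month=100, capacity_per_month=110, months=12)

# Capacity slightly below demand: the backlog grows every month.
growing = simulate_backlog(demand_per_month=100, capacity_per_month=90, months=12)
```

Even in this toy version, a small change in capacity relative to demand flips the behavior from a steady state to a backlog that grows without bound, which is the kind of threshold that a richer model with feedback loops is intended to forecast.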
We seek to answer the following questions through our research:
Does the model actually show the expected tipping point behavior either in the performance of the sustainment organization or the demand for system improvement?
What data provides an indication of growth in demand for sustainment on a particular acquisition program? If we can identify the needed data, will it be possible to collect it and do the necessary analysis?
Do the models provide actionable information to decision makers in a timely manner? For example, does reallocating some funding from the sustainment work to the development of the workforce (test kits, process changes, technology training) help reduce the cost of sustainment?
What measure of warfighter readiness correlates to the predictive factors in the model?
Through this approach, we aim to help DoD acquisition programs better plan their financial investments to ensure long-term software sustainment and deliver the best value for taxpayer dollars.
While the construction of a model that exhibits the expected changes in the performance of both mission and sustainment is a necessary step, much of our research will focus on sources of data. Real data collected from real programs will be needed to calibrate the model for making real decisions, including
potential data collection points within the sustainment processes, such as demand for sustainment work, rate of hiring and attrition, rate of delivery of sustainment work, and availability of skilled staff
potential opportunities to measure warfighter readiness or system use, such as fleet availability, successful mission performance, and error rate in operation
standards for applying data collection across different kinds of products such as airplanes, satellites, and communications systems
The System Dynamics Model
The basic goal of a simulation model is first to represent the normal behavior of a system and then stimulate it with a new input to see how the responses change.
The model that we have developed represents the behavior of the different players in the sustainment process, including the warfighter, the technical capability of the sustainment organization, and the capacity of the sustainment organization to deliver the work. We are testing the system response to various scenarios, such as
Threat Change. An external change (such as a new threat to the warfighter) results in a request to update the system capability. This request means the sustaining organization will have to perform both product and process changes; the development process and testing may need to change as well. The changes often require some funding to re-equip the facility and re-train the workforce. Our system dynamics model helps decision makers analyze the effect if funding for this improvement is delayed.
Support Technology Change. The sustainment organization decides to improve its own throughput and adopts new processes to "do more with less." Typically the change is also in response to new quality goals. In this case our model helps codify the effect on sustainment capability and capacity and therefore on operational performance.
Workforce Changes. Sequestration effectively decreases the staff available to sustainment organizations by 10 percent to 20 percent. How does this decrease affect a sustaining organization’s ability to meet its sustainment demand? Does it affect aspects of the warfighter mission as well?
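The workforce scenario can be explored with a back-of-the-envelope Python sketch. It is a deliberately simplified stand-in for the system dynamics model, assuming constant demand, a fixed output per person, and a staff cut that takes effect in a given month; every number below is hypothetical.

```python
def backlog_after(months, staff, cut_fraction, cut_month,
                  demand=100.0, output_per_person=5.0):
    """Return the backlog of sustainment work after `months` months."""
    backlog = 0.0
    for month in range(months):
        # The staff cut applies from `cut_month` onward.
        current_staff = staff * (1 - cut_fraction) if month >= cut_month else staff
        capacity = current_staff * output_per_person
        completed = min(capacity, backlog + demand)
        backlog += demand - completed
    return backlog


# Compare no cut with 10 percent and 20 percent cuts over two years.
scenarios = {
    cut: backlog_after(months=24, staff=20, cut_fraction=cut, cut_month=0)
    for cut in (0.0, 0.10, 0.20)
}
```

With these assumed values, an organization that was exactly keeping up with demand accumulates a backlog that grows in proportion to the size of the cut. The full model adds the feedback effects this sketch leaves out, such as the impact of delays on mission demand.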
Our current model of sustainment consists of five basic process loops:
Mission Demand. This loop represents system deployment and use in theater. With successful application, demand for additional missions and increased capability causes an increase in sustainment support required. When missions are less successful or systems cannot be deployed in time, demand for the system decreases and funding support wanes.
Sustainment Work. The process of sustainment typically takes a system off-line as it is enhanced, repaired, and redeployed. Some of the requests for sustainment include requests for additional capability.
Limits to Growth. This loop shows how the capacity and capability of a sustainment organization limit the rate of completion of sustainment work. As these limits begin to extend the time required to redeploy, the long-term effect may be a reduction in demand or a switch to using an alternate platform.
Work Bigger. A sustainment organization may attempt to meet sustainment demand by employing overtime or extra contract employees. Either of these approaches may work for a short time or a small additional cost, but they stress the organization and quickly reach limits of effectiveness. The organization can hire staff, but it must also allow time for training and acculturation of new hires to meet performance objectives.
Work Smarter. The sustainment organization invests in new capabilities (skills, tools, and processes) and possibly additional resources (people and facilities) to improve capacity for sustaining work.
Each scenario outlined above entails several decisions and stimulates response curves from the model. The response curves help decision makers forecast how deferring decisions or reallocating resources affects both warfighters and sustainment organizations. We will consider our system dynamics model "good" if decision makers believe they are able to make faster decisions and if the data from the model makes it easier to get sponsor support for the decisions.
Initial Findings and Future Work
A recent study by the U.S. Army provided us with information confirming the importance of understanding the cycles of demand for sustainment work. From that study, we identified three patterns of release in the software sustainment lifecycle:
The basic fix pattern mainly focuses on low-level defects and addresses software breaks, which occur frequently (at least once a year).
The block release pattern mainly focuses on enhancements to address users who have discovered new opportunities to use the system but need some additional functionality to employ it efficiently. The resulting block releases can include additional features for which there are plans and budgets. These sustainment efforts are not patches or quick fixes; they typically occur every year to two years.
The modernization pattern addresses a significant change in technology or in the pattern of use, such as adding electronic warfare capabilities to an aircraft or accessing an additional external system that hasn’t been accessed previously. These sustainment efforts occur about every four to seven years. This third type of release is also often three to four times more expensive than the block release.
The modernization release is often a response to some technology change. The timing (at least four years) has an interesting correlation. Moore’s Law suggests the cost-performance of processors doubles every 18 months. At that rate, a modernization release every four years skips about two processor generations, and a release every seven years skips about four.
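The relationship between release interval and processor generations can be checked with a short calculation. The sketch below assumes exactly one doubling every 18 months, which is a rule of thumb rather than a measured rate.

```python
DOUBLING_PERIOD_MONTHS = 18  # assumed doubling period


def doublings(years):
    """Number of doubling periods that elapse in the given number of years."""
    return years * 12 / DOUBLING_PERIOD_MONTHS


def improvement_factor(years):
    """Overall cost-performance improvement after the given number of years."""
    return 2 ** doublings(years)


for interval_years in (4, 7):
    print(interval_years,
          round(doublings(interval_years), 2),
          round(improvement_factor(interval_years), 1))
```

Under this assumption, four years spans between two and three doublings and seven years spans between four and five, so the gap between modernization releases corresponds to roughly a 6-fold to 25-fold change in cost-performance.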
Modernization releases require the sustaining organization to develop new practices and tools and to retrain some existing personnel. Funding for these internal changes is often hard to justify and may be delayed. These delays, in turn, can cause the sustaining organization to fall so far behind the technology curve that a new acquisition contract is often required. If a contractor does win the development contract, the organic sustainment organizational capability may decline while the bulk of funds go to the contractor.
Other organizations have studied various aspects of the sustainment problem. The Army examined costs of software sustainment and the work breakdown structure of sustainment as it is currently performed. Jack McGarry, author of the book Practical Software Measurement: Objective Information for Decision Makers, charted for the U.S. Army the tremendous variability in the demand for sustainment work, with the cost of the largest releases up to 10 times that of the smallest releases. The frequency of the largest releases appears to track to large-scale technology change, occurring about every four to seven years. In that period of time, the speed of processors increases by nearly 10 times, and the number of transistors on a chip also increases by 10 times. This finding suggests that technology change is an important variable in our system dynamics model.
The research we’ve conducted thus far has revealed that our system dynamics model exhibits the expected and observed behavior of product sustainment. The model, however, has not yet been calibrated to apply to a real situation. We are actively seeking an opportunity to study a real sustainment program and to collaborate on calibrating the model. We believe that data collection will not be hard to implement and the data potentially can be used for other purposes by sustainers.
If you are interested in collaborating with SEI researchers on this initiative, please send an email to info@sei.cmu.edu, or leave contact information in the comments section below.
We welcome your feedback on our research and look forward to hearing if you’d like to help us achieve our project goals. Please leave feedback in the comments section below.
Additional Resources
To read An Examination of the U.S. Air Force’s Aircraft Sustainment Needs in the Future and its Strategy to Meet Those Needs by the National Academies Press, please visit http://www.nap.edu/catalog.php?record_id=13177
To read about software sustainment practices for the DoD (especially chapter 16), please visit http://www.stsc.hill.af.mil/resources/tech_docs/gsam4.html
To read The Economics of Software Maintenance in the Twenty-first Century, please visit http://www.compaid.com/caiinternet/ezine/capersjones-maintenance.pdf
To read the SEI technical report, Sustaining Software-Intensive Systems, please visit http://www.sei.cmu.edu/library/abstracts/reports/06tn007.cfm
To see a presentation on Sustaining Software Intensive Systems — A Conundrum please visit http://www.dtic.mil/ndia/2005systems/wednesday/lapham.pdf
To read an excerpt of the book Business Dynamics: Systems Thinking and Modeling for a Complex World by John Sterman, please visit http://web.boun.edu.tr/ali.saysel/Esc578/Sterman%2013.pdf
To view the proceedings of the 2012 PSMSC User Group, please visit http://www.psmsc.com/UG2012/Workshops/w4-%20files.pdf
To read the Air Force Scientific Advisory Board report, Sustaining Aging Aircraft, please visit http://oai.dtic.mil/oai/oai?verb=getRecord&metadataPrefix=html&identifier=ADA562696
SEI Blog, Jul 27, 2015 02:24pm
By Cory Cohen, Senior Member of the Technical Staff, CERT Division
In 2012, Symantec blocked more than 5.5 billion malware attacks (an 81 percent increase over 2010) and reported a 41 percent increase in new variants of malware, according to a January 2013 Computerworld article. To prevent detection and delay analysis, malware authors often obfuscate their malicious programs with anti-analysis measures. Obfuscated binary code prevents analysts from developing timely, actionable insights by increasing code complexity and reducing the effectiveness of existing tools. This blog post describes research we are conducting at the SEI to improve manual and automated analysis of common code obfuscation techniques used in malware.
Obfuscation Example and Impact on Malware Analysis
When shown the following example of obfuscated assembly code, an experienced malware analyst at CERT took 550 seconds (more than 9 minutes) to determine its basic functionality.
43E401: push 9387D2AF
43E406: mov edx, F25C92BA
43E40B: pop eax
43E40C: xor edx, F21F75B2
43E412: and eax, 37D28E56
43E418: jmp 43E501
...
43E501: push ecx
43E502: mov ecx, eax
43E504: push edx
43E505: jnz 43E601
43E506: jmp 43E603
...
43E601: sub eax, 13828202
43E607: ret
...
43E708: pop edx
43E709: mov ecx, [eax+edx]
While this example was created explicitly for this demonstration, it shows several techniques commonly encountered in malware and represents a realistic scenario. In particular, the net effect of this code is equivalent to a single instruction that is easily understood:
43E401: mov ecx, [ecx+4]
In the Department of Defense (DoD) and other large analysis environments, a several hundred-fold increase in the effort required to analyze malware raises computer and network defense costs and prevents the necessary insights from being generated within reasonable time and budget constraints.
Our Approach to Deobfuscation
Deobfuscation is a method for reverse-engineering obfuscated code. Our approach to the deobfuscation problem applies multiple semantic transformations to an obfuscated program to produce a new program that is functionally equivalent to the obfuscated program but easier to analyze. Some example obfuscation techniques and rough descriptions of the transformations include
Complex hexadecimal arithmetic. Malware authors frequently add extraneous addition, subtraction, and bitwise logical operations to hide important program addresses and constants. When possible, the instructions performing these operations should be replaced with immediate values in simpler instructions.
Stack pointer abuses. Malware authors will often have many unnecessary PUSH and POP instructions to make the code harder to reason about. When used in conjunction with CALL and RET instructions, control flow can be obscured as well. Such instructions should be removed to minimize stack changes.
Control flow obfuscation. Unnecessary jumps in intra-procedural flow control make it hard for a human analyst to follow overall program flow control. Unconditional and always-true conditional JMP instructions should be removed.
Dead stores. Instructions that write to registers and memory that are never subsequently read do not contribute meaningfully to the net effect of the function. Instructions performing dead stores can be detected by definition and usage analysis and subsequently removed.
These types of transformations may be run in multiple passes and in varying orders to make it easier to defeat obfuscations that involve the application of techniques in sequence. A modular approach to the transformation should improve code maintenance and simplify the addition of new techniques.
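The first two transformations can be illustrated on the example listing shown earlier by emulating its constant arithmetic and stack operations. The Python sketch below is not the ROSE-based prototype. It is a tiny emulator written for this illustration that follows only the one path actually taken through the listing (the conditional jump is taken because the AND leaves a nonzero value), and it uses an arbitrary placeholder for the unknown initial value of ecx.

```python
MASK = 0xFFFFFFFF  # keep results within 32 bits, as on 32-bit x86


def run(trace, regs):
    """Execute a straight-line list of instruction tuples; return the stack."""
    stack = []

    def value(operand):
        # Register names are strings; immediate values are ints.
        return regs[operand] if isinstance(operand, str) else operand

    for op, *args in trace:
        if op == "push":
            stack.append(value(args[0]))
        elif op == "pop":
            regs[args[0]] = stack.pop()
        elif op == "mov":
            regs[args[0]] = value(args[1])
        elif op == "xor":
            regs[args[0]] = (regs[args[0]] ^ value(args[1])) & MASK
        elif op == "and":
            regs[args[0]] = (regs[args[0]] & value(args[1])) & MASK
        elif op == "sub":
            regs[args[0]] = (regs[args[0]] - value(args[1])) & MASK
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return stack


ORIGINAL_ECX = 0x1000  # arbitrary placeholder for the unknown input value

# The instructions executed up to (but not including) the ret.
TRACE = [
    ("push", 0x9387D2AF),
    ("mov", "edx", 0xF25C92BA),
    ("pop", "eax"),
    ("xor", "edx", 0xF21F75B2),
    ("and", "eax", 0x37D28E56),
    ("push", "ecx"),
    ("mov", "ecx", "eax"),
    ("push", "edx"),
    ("sub", "eax", 0x13828202),
]

regs = {"eax": 0, "ecx": ORIGINAL_ECX, "edx": 0}
stack = run(TRACE, regs)
return_target = stack.pop()   # the value consumed by ret
regs["edx"] = stack.pop()     # the pop edx at the return target
load_address = (regs["eax"] + regs["edx"]) & MASK

print(hex(return_target), hex(regs["eax"]), hex(load_address))
```

Running the sketch shows that the XOR hides the address 0x43E708 used as the return target, that the AND and SUB leave the constant 4 in eax, and that the final load reads from the original ecx plus 4, which matches the single-instruction equivalent given above.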
Prototype Deobfuscation Tool
Early in this research effort, we used the ROSE compiler infrastructure to build a functioning prototype of a general deobfuscation tool that was specifically targeted at removing dead code. ROSE provides facilities for disassembly, instruction emulation, and control flow graph representation. We have applied ROSE in our prior research on Using Machine Learning to Detect Malware Similarity and Semantic Comparison of Malware Functions.
In our current effort, we expanded upon the core ROSE capabilities by adding definition and usage analysis, dead code elimination, and other deobfuscation-specific techniques. We also implemented a rudimentary method for generating new executables as an output format by building on ROSE’s assembly features. This prototype was tested against several members of the Ramnit family, using a transformation to eliminate dead code.
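Dead store elimination based on definition and usage analysis can be sketched as a backward pass over a straight-line instruction sequence. The Python below is an illustration only and does not use the ROSE API. The instruction representation (destination register, source registers, text) is invented for the example, and memory, flags, and control flow are ignored.

```python
def remove_dead_stores(instructions, live_out):
    """Drop register writes whose values are never read.

    `instructions` is a list of (dest, sources, text) tuples, where `dest` is
    the register written (or None for instructions kept for their side
    effects) and `sources` are the registers read.
    `live_out` is the set of registers still needed after the sequence.
    """
    live = set(live_out)
    kept = []
    for dest, sources, text in reversed(instructions):
        if dest is not None and dest not in live:
            continue  # the value written here is never read: a dead store
        if dest is not None:
            live.discard(dest)  # this write satisfies the later reads
        live.update(sources)    # the registers it reads become live
        kept.append((dest, sources, text))
    kept.reverse()
    return kept


code = [
    ("ebx", (), "mov ebx, 0x1234"),          # overwritten before any read
    ("eax", ("ecx",), "mov eax, ecx"),
    ("ebx", (), "mov ebx, 8"),
    ("eax", ("eax", "ebx"), "add eax, ebx"),
    ("edx", ("eax",), "mov edx, eax"),       # edx is not needed afterward
]

cleaned = [text for _, _, text in remove_dead_stores(code, live_out={"eax"})]
```

In this example the first and last instructions are removed, leaving the three that contribute to the final value of eax. A production tool would also need to reason about memory writes, condition flags, and multiple basic blocks.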
Applying Our Prototype Tool to String Deobfuscation
While working on the prototype tool, our team was presented with an operational need to combat a specific type of string obfuscation. String obfuscation is used by various malware families to prevent important strings from being easily recovered from the executable. This technique involves moving immediate bytes into a local stack variable to construct a normal C-style null terminated string.
By emulating instructions and collecting memory writes to the stack, we were able to extract the deobfuscated strings from the code. This transformation allowed us to automatically process many more files than would have been possible using a manual approach. The deobfuscated strings assisted in the development of a malware family report. We also produced a catalog of common obfuscation techniques, with specific reference to the addresses of several files demonstrating each of the techniques.
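The idea of collecting byte writes to the stack and reading them back as a string can be sketched as follows. This Python example is a simplification: it pattern-matches the text of a made-up disassembly listing rather than emulating instructions, and it assumes every byte is written with an immediate `mov byte [ebp-N], imm` instruction.

```python
import re

# Matches lines such as: mov byte [ebp-0x4], 0x63
STORE = re.compile(r"mov\s+byte\s+\[ebp-(0x[0-9a-fA-F]+)\],\s*(0x[0-9a-fA-F]+)")


def recover_stack_string(listing):
    """Rebuild a null-terminated string from immediate byte writes to the stack."""
    writes = {}
    for line in listing:
        match = STORE.search(line)
        if match:
            offset = int(match.group(1), 16)
            writes[offset] = int(match.group(2), 16)
    # A larger offset below ebp is a lower address, so the string starts there.
    chars = []
    for offset in sorted(writes, reverse=True):
        byte = writes[offset]
        if byte == 0:
            break  # C-style terminator
        chars.append(chr(byte))
    return "".join(chars)


# Hypothetical listing; the writes are deliberately out of order.
listing = [
    "mov byte [ebp-0x2], 0x64",
    "mov byte [ebp-0x4], 0x63",
    "mov byte [ebp-0x1], 0x00",
    "mov byte [ebp-0x3], 0x6d",
]
recovered = recover_stack_string(listing)
```

Sorting by address rather than by instruction order matters because obfuscated code often writes the bytes in a shuffled sequence. Emulation, as described above, handles the further cases where bytes are computed rather than stored as immediates.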
At this point in our research, we had demonstrated a working prototype and operational relevance, but needed to implement more transformations to have a complete and generally applicable tool. The primary question we still faced was "which transformations should be implemented next?" We decided to conduct an obfuscation technique prevalence study to gain some basic facts about the prevalence and distribution of the techniques in our catalog.
Analyzing the Prevalence and Distribution of Obfuscation Techniques
We chose to implement six tests for our initial study. Each test looked for the occurrence of a common obfuscated code pattern, such as a dead store, an opaque predicate, or an obfuscated control flow. We counted how often these patterns appeared in several datasets, containing a total of 150,000 malware files. The number of detected obfuscations was fairly low, with only about 12 percent of functions having a detected pattern, and we detected only a single test in most of those functions.
This distribution suggested we might be encountering an occasional single false positive in many functions, while truly obfuscated functions were routinely positive for many detections in several tests. Adjusted to account for these probable false positives, the percentage of functions in which we detected obfuscated code declined to about 1 percent. Using the adjusted criteria, only about 23 percent of files contain an obfuscated function.
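The effect of tightening the criteria can be sketched in a few lines. The post does not state the exact adjusted rule, so the example below assumes one plausible form: a function counts as obfuscated only if it is positive in at least `min_tests` distinct tests. The data is made up and is not drawn from the study.

```python
from typing import Dict, List

# Hypothetical input: for each function, the number of hits per test name.
FunctionHits = Dict[str, int]


def positive_tests(hits: FunctionHits) -> int:
    """Number of distinct tests with at least one hit in this function."""
    return sum(1 for count in hits.values() if count > 0)


def flagged_fraction(functions: List[FunctionHits], min_tests: int) -> float:
    """Fraction of functions positive in at least `min_tests` distinct tests."""
    if not functions:
        return 0.0
    flagged = sum(1 for hits in functions if positive_tests(hits) >= min_tests)
    return flagged / len(functions)


# Made-up toy data, not results from the study.
toy_functions = [
    {"dead_store": 1},  # lone hit, possibly a false positive
    {"dead_store": 3, "opaque_predicate": 2, "obfuscated_control_flow": 1},
    {},  # no detections
    {"opaque_predicate": 0},  # test ran, nothing found
]

raw_rate = flagged_fraction(toy_functions, min_tests=1)       # any detection
adjusted_rate = flagged_fraction(toy_functions, min_tests=2)  # stricter rule
```

On this toy input the stricter rule drops the lone-hit function and keeps only the one that is positive in several tests, which mirrors the direction of the adjustment described above.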
Some of the likely false positives filtered by the revised criteria appear to have been caused by incorrect disassembly. In several cases we examined manually, we found that arbitrary bytes had generated nonsense instructions that legitimately contained the obfuscated code patterns that we were searching for. It is hard to build firm program analysis conclusions on unreliable disassembly foundations, so we’ve started a new research effort to develop improved disassembly methods and objective metrics for the assessment of disassembly correctness.
Collaborations
Charles Hines from the CERT Division and Wesley Jin, a Ph.D. candidate in CMU’s Department of Electrical and Computer Engineering, have been actively involved in this research effort. Sagar Chaki and Arie Gurfinkel, both senior members of the technical staff in the Software Solutions Division, have contributed significantly as well.
Additional Resources
To read more about this research and the results of other exploratory research projects, download the SEI technical report, Results of SEI Line-Funded Exploratory New Starts Projects: FY 2012, at http://www.sei.cmu.edu/library/abstracts/reports/13tr004.cfm
Jul 27, 2015 02:23pm
By David Mundie, Senior Member of the Technical Staff, CERT Division
Researchers on the CERT Division’s insider threat team have presented several of the 26 patterns identified by analyzing our insider threat database, which is based on examinations of more than 700 insider threat cases and interviews with the United States Secret Service, victims’ organizations, and convicted felons. Through our analysis, we identified more than 100 categories of weaknesses in systems, processes, people, or technologies that allowed insider threats to occur. One aspect of our research focuses on identifying enterprise architecture patterns that organizations can use to protect their systems from malicious insider threat. Now that we’ve developed 26 patterns, our next priority is to assemble these patterns into a pattern language that organizations can use to bolster their resources and make them more resilient against insider threats. This blog post is the third installment in a series that describes our research to create and validate an insider threat mitigation pattern language to help organizations balance the cost of security controls with the risk of insider compromise.
Developing an Enterprise Architecture Pattern Language
Our aim in developing an insider threat pattern language is to equip enterprise engineers with the tools necessary to make an organization resilient against insider threat. The patterns in our language capture solutions to recurring problems in insider threat. For example, the Thirty-Day Window pattern, discussed by my colleague Andrew Moore in his blog post Effectiveness of a Pattern for Preventing Theft by Insiders, is based on the observation that a large percentage of malicious exfiltration by insiders happens within 30 days of termination. Organizations can therefore improve their detection of such violations by focusing on a very narrow window of time.
Once we identified and documented the original 26 patterns, we wanted to go beyond simply publishing them as a flat, two-dimensional collection. Instead, we wanted to show the relationships among the patterns by arranging them in an organic hierarchy (i.e., a pattern language). This approach follows the example of Christopher Alexander, the father of the patterns movement in the building architecture community, and his work on patterns and pattern languages.
Unfortunately, insider threat poses challenges to a hierarchical approach to organizing patterns because insider threats permeate the organization. Protecting an enterprise from insider threat requires an enterprise-wide approach that combines several different strategies. This diversity leads to multiple, potentially conflicting classification systems. A classification that makes sense for human resource systems might not make sense from an incident response perspective. Likewise, a third classification might be required for the information technology staff, and so forth.
Trying on Classification Systems
We explored several categorizations before ultimately deciding on a multi-dimensional approach. Each of the systems that we explored provided useful perspectives on our patterns by situating them within a specific domain, whether information security, enterprise architectures, incident management, resiliency, or organizational structure.
Information security pattern languages. We initially assumed that our insider threat patterns would fit well within existing information security pattern languages. As a guide, we relied on the book Security Patterns by Schumacher et al., which maps information security pattern languages to the categories in the Zachman Framework (see below). Our insider threat patterns were most compatible with the "Enterprise Security and Risk Management" patterns because the crux of the insider threat problem is that, for employees to do their jobs efficiently, they must be trusted by their employers and operate in an environment largely unfettered by authentication controls, access controls, and firewalls. As a result, our patterns are a bit of a mismatch for the identification and authentication patterns, the operating system access control patterns, and the firewall architecture patterns, which are largely focused on implementing those fetters. We soon realized that this mismatch meant that despite a commonality of purpose, the Schumacher landscape would not be ideal as a single taxonomy for our pattern language.
The Zachman Framework. We then mapped our patterns to the Zachman Framework, which is an enterprise architecture framework that provides a formal and highly structured view of an enterprise. The Zachman Framework provides a diagram for organizing architectural artifacts on the basis of the intended recipient and the issue being addressed. The diagram helped us quickly realize that our patterns were at the enterprise security strategy and policy levels and not at the mechanisms and implementation levels. This was an important insight, but it did mean that our patterns were clustered in one small area of the framework, making it of limited utility as the single way of classifying our patterns.
Business units. We also considered organizing our patterns by business units, such as human resources, legal, etc. Business units provide a very useful organization system for the patterns. Business units are less useful as a general-purpose classification system, however, because they obscure the fact that our patterns frequently cross business-unit boundaries.
Lifecycle phase. Another approach we explored involved using the insider threat lifecycle to break the patterns into prevention, detection, and response patterns. This approach made sense in terms of risk management and incident response, but proved less satisfactory in terms of the organization-wide enterprise focus of insider threat.
CERT Resilience Management Model. Finally, we considered the CERT Resilience Management Model (CERT-RMM), a broad-based model of the organizational process areas needed for resilience. Unsurprisingly, this was a better fit for insider threat patterns in the sense that its process areas divided our patterns into logical groups, but even so we did not want to give up the insights provided by the other schemas.
A Multi-Dimensional Organizational Structure
After exploring the categorizations listed above, we realized that it would make sense to move away from rigid, top-down, linear hierarchical systems. No one system would serve all users and all use cases equally well, so a multi-dimensional classification system was called for.
One problem with multi-dimensional constructs is that the human brain struggles to conceive of ideas beyond three dimensions. Instead, we looked to a library classification technique known as faceted classification. Recently, faceted classification has become widely used again in search engines on commerce websites. When shopping on Amazon’s site, for example, users can narrow their searches by classifications or facets, such as price, color, or manufacturer. This approach made sense in this age of near-ubiquitous computing where users have easy access to higher dimensional structures. The specific implementation of faceted classification that we used is the facet map, which we downloaded from Facetmap software. We realized two benefits to organizing the pattern language as a drill-down facet map:
Users can specify the exact aspects of the patterns that are applicable to their circumstances. For example, a user can narrow a search to only address insider threat patterns within human resources. That same search can also be narrowed to include misuse of information technology infrastructure.
The faceted classification’s formal description of the pattern language organization makes it easier for users to generate alternative representations and categories.
The facet map model allowed us to organize our patterns into a map that categorizes each of the 26 patterns in a five-dimensional space defined by the classifications described above. Figure 2 below shows the Facetmap interface to this hyperspace.
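The drill-down behavior described above can be sketched with ordinary dictionaries and sets. This is not the Facetmap software or its interface. The facet names follow two of the classifications discussed earlier, but the facet values assigned to each pattern, and the two "Hypothetical Pattern" entries, are placeholders invented for illustration.

```python
from typing import Dict, List, Set

# Each pattern maps facet name -> set of values. Values are illustrative.
Catalog = Dict[str, Dict[str, Set[str]]]

catalog: Catalog = {
    "Thirty-Day Window": {
        "business_unit": {"human resources", "information technology"},
        "lifecycle_phase": {"detection"},
    },
    "Hypothetical Pattern A": {
        "business_unit": {"legal"},
        "lifecycle_phase": {"response"},
    },
    "Hypothetical Pattern B": {
        "business_unit": {"human resources"},
        "lifecycle_phase": {"prevention"},
    },
}


def drill_down(catalog: Catalog, **selected: str) -> List[str]:
    """Return patterns matching every selected facet value (logical AND)."""
    return sorted(
        name
        for name, facets in catalog.items()
        if all(value in facets.get(facet, set()) for facet, value in selected.items())
    )


def facet_counts(catalog: Catalog, names: List[str], facet: str) -> Dict[str, int]:
    """Count the remaining patterns per value of `facet`."""
    counts: Dict[str, int] = {}
    for name in names:
        for value in catalog[name].get(facet, set()):
            counts[value] = counts.get(value, 0) + 1
    return counts


hr_patterns = drill_down(catalog, business_unit="human resources")
hr_detection = drill_down(
    catalog, business_unit="human resources", lifecycle_phase="detection"
)
```

Because a pattern may carry several values for the same facet, it can appear under more than one business unit without being duplicated, which is the property a single hierarchy could not provide.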
Current and Future Work
Our current work focuses on pattern composition because we feel that is a crucially important issue in transitioning patterns to the operational community. Instead of trying to impose a single composition method on all end users, we have created a pattern language to help users select from a number of different composition methods, depending on the situation. For example, the way a small software startup would want to integrate an insider threat pattern into its organization will be different from the way it might be integrated into a military organization or a multinational corporation.
To assist in validating the composition operations, we are exploring the idea of using simple ontologies to capture the essential components of a pattern and its relationships. For example, the Thirty-Day Window pattern is essentially a relationship among the human resources staff, the information technology staff, and the employee.
To allow users to easily guarantee the completeness of their pattern composition, we are testing the use of a formal ontology expressed in the Web Ontology Language (OWL). For more information on OWL and the more general use of ontologies in information security, please see CERT’s Security and Ontology webpage.
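A rough sense of what such a completeness check involves can be given without OWL tooling. The sketch below uses plain Python tuples as stand-ins for ontology statements. The three roles come from the description of the Thirty-Day Window pattern above, while the predicate names and the example organization are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Illustrative triples; the predicate names are invented for this sketch.
thirty_day_window: List[Triple] = [
    ("HumanResources", "notifies", "InformationTechnology"),
    ("InformationTechnology", "monitors", "Employee"),
    ("HumanResources", "records_termination_of", "Employee"),
]


def roles_in(pattern: List[Triple]) -> Set[str]:
    """Every role that appears as a subject or object in the pattern."""
    return {s for s, _, _ in pattern} | {o for _, _, o in pattern}


def unbound_roles(pattern: List[Triple], bindings: Dict[str, str]) -> Set[str]:
    """Roles the organization has not yet mapped to a concrete unit."""
    return roles_in(pattern) - set(bindings)


# A hypothetical startup with no dedicated IT staff.
startup_bindings = {
    "HumanResources": "office manager",
    "Employee": "any staff member",
}
missing = unbound_roles(thirty_day_window, startup_bindings)
```

An empty result would indicate that every role the pattern relies on has a counterpart in the organization. A formal ontology language adds reasoning that this set arithmetic does not attempt.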
The integration of our pattern language and its multi-dimensional interface, combined with the pattern composition pattern language and our ontology-driven validation methodology, will be a significant step in the evolution of insider threat mitigation techniques. We welcome your feedback on our work. Please send us an email at info@sei.cmu.edu or leave feedback in the comments section below.
Additional Resources
Cybersecurity experts from the CERT Insider Threat Center will present a free virtual event on current research aimed at establishing best practices to mitigate insider threats. Managing the Insider Threat: What Every Organization Should Know will take place Thursday, Aug. 8, from 9 a.m. to 5 p.m. EDT. To register, please visit http://www.webcaster4.com/Webcast/Page/139/1742
To learn more about OWL and the use of ontologies in security, please visit http://www.cert.org/csirts/security-and-ontology.html
To read the SEI technical report, A Pattern for Increased Monitoring for Intellectual Property Theft by Departing Insiders, please visit www.sei.cmu.edu/reports/12tr008.pdf
To read the SEI technical note, Insider Threat Control: Using Centralized Logging to Detect Data Exfiltration Near Insider Termination, please visit www.cert.org/archive/pdf/11tn024.pdf
To read the CERT Insider Threat blog, please visit www.cert.org/blogs/insider_threat/
Jul 27, 2015 02:23pm
By Charles B. Weinstock, Senior Member of the Technical Staff, Software Solutions Division
From the braking system in your automobile to the software that controls the aircraft that you fly in, safety-critical systems are ubiquitous. Showing that such systems meet their safety requirements has become a critical area of work for software and systems engineers. "We live in a world in which our safety depends on software-intensive systems," editors of IEEE Software wrote in the magazine’s May/June issue. "Organizations everywhere are struggling to find cost-effective methods to deal with the enormous increase in size and complexity of these systems, while simultaneously respecting the need to ensure their safety." The Carnegie Mellon Software Engineering Institute (SEI) is addressing this issue with a significant research program into assurance cases. Our sponsors are regularly faced with assuring that complex software-based systems meet certain kinds of requirements such as safety, security, and reliability. In this post, the first in a series on assurance cases and confidence, I will introduce the concept of assurance cases and show how they can be used to argue that a safety requirement (or other requirement such as security) has been met.
Making a Sound Evaluation
A system is deemed "safety-critical" when its failure could result in loss of life or catastrophic damage. Such systems are expensive to develop, build, and deploy. Due to the severe consequences of failure, most safety-critical systems are also subject to evaluation by some external authority (such as the U.S. Food and Drug Administration (FDA) for medical devices in the United States or the Ministry of Defence for flight control systems in the United Kingdom). To enable the evaluator to make a sound evaluation, developers must provide information about the system and its development, including requirements, documentation of design decisions, hazard analyses, development process, and verification and validation results. Developers submit this information (which constitutes an "argument") that they believe will convince evaluators to approve the system.
Industry has long relied on process-based arguments to help evaluators conduct their evaluations, but as the National Research Council of the National Academy of Sciences reported in 2007:
For a system to be regarded as dependable, concrete evidence must be present that substantiates the dependability claim. This evidence will take the form of a dependability case arguing that the required properties follow from the combination of the properties of the system itself (that is, the implementation) and the environmental assumptions. … In addition, the case will inevitably involve appeals to the process by which the software was developed.
Although the NRC report uses the term "dependability case," most researchers, evaluators, and users have since coalesced around the term "assurance case." An assurance case developed for safety is usually called a "safety case." As defined by the U.K. Ministry of Defence:
A safety case is a structured argument, supported by a body of evidence that provides a compelling, comprehensible and valid case that a system is safe for a given application in a given environment.
An assurance case is somewhat similar in form to a legal case. In a legal case, there are two basic elements:
evidence, including witnesses, fingerprints, DNA, etc.
an argument given by the attorneys as to why the jury should believe that the evidence supports (or does not support) the claim that the defendant is guilty (or not guilty)
If an attorney argues that a defendant is guilty, but provides no evidence to support that argument, the jury would certainly have reasonable doubts about the guilt of the defendant. Conversely, if an attorney presents evidence, but no supporting argument explaining its relevance, the jury would have difficulty deciding how the evidence relates to the defendant.
The goal-structured assurance case is similar. There is evidence (such as test results) that a property of interest (such as safety) holds. Without an argument as to why the test results support the claim of safety, however, an interested party could have difficulty understanding its relevance or sufficiency. With only a detailed argument of system safety but no test results, it would again be hard to establish the system’s safety.
A goal-structured assurance case therefore specifies a claim regarding a property of interest, evidence that supports that claim, and a detailed argument explaining how the evidence supports the claim. An assurance case differs from a legal case in one significant respect. In a legal case there are advocates for both sides (the defendant is guilty; the defendant is not guilty). Typically, an assurance case is used to show only one point of view: that the system is safe (and not that the system is unsafe).
Figure 1: A notional safety case
Figure 1 above shows a notional safety case in Goal Structuring Notation (GSN). The case argues that the system is safe because both hazards A and B have been eliminated. The elimination of those hazards is supported by evidence Ev1, Ev2, and Ev3. The evidence might be test results, analysis, modeling, simulation, etc.
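The shape of the notional case can also be written down as a small tree of claims and evidence. The sketch below is not a GSN tool and covers only goals and evidence, not the other GSN elements. The assignment of Ev1 to hazard A and of Ev2 and Ev3 to hazard B is an assumption, since the text does not say which evidence supports which hazard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    name: str


@dataclass
class Goal:
    claim: str
    subgoals: List["Goal"] = field(default_factory=list)
    evidence: List[Evidence] = field(default_factory=list)

    def unsupported(self) -> List[str]:
        """Claims that bottom out in neither a subgoal nor evidence."""
        if not self.subgoals and not self.evidence:
            return [self.claim]
        gaps: List[str] = []
        for sub in self.subgoals:
            gaps.extend(sub.unsupported())
        return gaps


# Assumed assignment of evidence to hazards; the text does not specify it.
safety_case = Goal(
    "The system is safe",
    subgoals=[
        Goal("Hazard A has been eliminated", evidence=[Evidence("Ev1")]),
        Goal(
            "Hazard B has been eliminated",
            evidence=[Evidence("Ev2"), Evidence("Ev3")],
        ),
    ],
)

# The same case with the evidence for hazard B removed.
incomplete_case = Goal(
    "The system is safe",
    subgoals=[
        Goal("Hazard A has been eliminated", evidence=[Evidence("Ev1")]),
        Goal("Hazard B has been eliminated"),
    ],
)
```

A check like `unsupported()` only finds structural gaps, that is, claims with nothing beneath them. Whether the evidence is actually sufficient for the claim is the harder question that the argument itself has to answer.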
Support for the use of assurance cases in the United States is nascent but increasing:
The FDA has taken steps towards requiring their use when a device manufacturer submits a medical device for approval.
NASA suggested the use of assurance cases as a part of development of the Constellation system.
The Aerospace Corporation used assurance cases to help with a GPS satellite ground station replenishment program.
In Europe and the United Kingdom, the use of assurance cases is more common. They are required in systems as diverse as flight control systems, nuclear reactor shutdown systems, and railroad signaling systems.
Whether required or not, creating an assurance case is not just a pro forma exercise. Assurance cases provide many benefits. For instance, the act of creating an assurance case helps stakeholders establish a shared understanding of issues that contribute to safety. Once complete, an assurance case provides a reviewable artifact documenting how safety is achieved in the system. The assurance case is also useful as an artifact that a reviewer or regulator can use to understand and make judgments about the system.
The SEI has an active research program in assurance cases, and in particular how to know that an assurance case adequately justifies its claim about the subject system, which is actually a classic philosophical problem:
determining the basis for belief in a hypothesis when it is impossible to examine every possible circumstance covered by the hypothesis
Our exploration has led us to examine theories from philosophy, law, rhetoric, mathematics, and artificial intelligence—as well as the vibrant assurance case community—as sources of relevant material to develop a theory of argumentation. The theory we are developing uses Baconian probability and eliminative induction to help eliminate sources of reduced confidence in a claim.
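One way to make eliminative induction concrete is to track, for a given claim, how many identified sources of doubt have been eliminated. The sketch below assumes that reading and reports confidence as a pair (eliminated, identified). This post does not define the measure, so the representation, the names, and the example doubts are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Doubt:
    description: str
    eliminated: bool


def eliminated_of_identified(doubts: List[Doubt]) -> Tuple[int, int]:
    """Return (doubts eliminated, doubts identified) for one claim."""
    return sum(1 for d in doubts if d.eliminated), len(doubts)


def remaining(doubts: List[Doubt]) -> List[str]:
    """Doubts that still reduce confidence in the claim."""
    return [d.description for d in doubts if not d.eliminated]


# Hypothetical doubts about a claim that "hazard A has been eliminated".
doubts = [
    Doubt("Tests did not cover all operating modes", eliminated=True),
    Doubt("Test environment differs from the field", eliminated=False),
    Doubt("Test oracle may be wrong", eliminated=True),
]

confidence = eliminated_of_identified(doubts)
open_doubts = remaining(doubts)
```

The result is kept as a pair rather than reduced to a fraction so that 1 of 2 and 5 of 10 stay distinguishable, and the list of remaining doubts shows where further evidence would help most.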
Subsequent posts in this series will discuss our research into what we are calling argumentation theory. The theory will make it easier to construct, modify, and evaluate assurance cases that are believable and will also make it possible to determine how much justified confidence one should have in them. In the next post we will begin to discuss these subjects. In the meantime we welcome your comments on the topic of assurance cases below.
Additional Resources
To read the SEI technical report Toward a Theory of Assurance Case Confidence, please visit http://www.sei.cmu.edu/reports/12tr002.pdf
To read the SEI technical note Towards an Assurance Case Practice for Medical Devices, please visit http://www.sei.cmu.edu/reports/09tn018.pdf
To read the report Software for Dependable Systems: Sufficient Evidence? please visit http://www.nap.edu/catalog/11923.html
To read the dissertation Arguing Safety—A Systematic Approach to Safety Case Management by Timothy Patrick Kelly, please visit http://www-users.cs.york.ac.uk/~tpk/tpkthesis.pdf
To read the report Ministry of Defence, Defence Standard—0056: Safety Management Requirements for Defence Systems—Part 2, please visit https://www.dstan.mod.uk/standards/defstans/00/056/02000400.pdf
Jul 27, 2015 02:23pm
By Mike McLendon, Associate Director, Software Solutions Division
Software is the principal enabling means for delivering system and warfighter performance across a spectrum of Department of Defense (DoD) capabilities. These capabilities range from mission-essential business systems to mission-critical command, control, communications, computers, intelligence, surveillance, and reconnaissance (C4ISR) systems to complex weapon systems. Many of these systems now operate interdependently in a complex net-centric and cyber environment. The pace of technological change continues to accelerate, along with the almost total system reliance on software. This blog posting examines the various challenges that the DoD faces in implementing software assurance and suggests strategies for an enterprise-wide approach.
Over the past decade, the DoD has been increasingly challenged to develop, implement, and continually evolve comprehensive enterprise software assurance policies, guidance, and infrastructure capabilities to anticipate the impact of technological change and the dominance of software. As a result, there is significant uncertainty about DoD’s institutional software assurance capability to achieve the level of confidence that software functions as intended and remains free of vulnerabilities across legacy systems and systems being acquired. Although the DoD has taken various initiatives to examine software assurance issues and craft comprehensive assurance strategies, this uncertainty and its impact on warfighter performance persist as a major problem.
Congressional Concerns about Software Assurance
Congress has become increasingly concerned about the DoD’s progress in implementing its software assurance strategies and capabilities. For example, the National Defense Authorization Act (NDAA) for FY 2011 (Section 932) mandated that the Secretary of Defense develop and implement a strategy for assuring the security of software and software-based applications for all covered systems by no later than October 1, 2011. The FY 2012 NDAA was the first law to provide strong policy guidance to secure both new and legacy software from attack throughout the software development lifecycle.
More recently, Section 933 of the FY 2013 NDAA directs the DoD to implement a baseline lifecycle software assurance policy. This policy requires the use of appropriate, automated vulnerability analysis tools in computer software code. These tools are intended for use during the development, operational testing, and operations and sustainment phases, through retirement, for specific types of systems.
The Scope of the DoD Software Assurance Challenge
The DoD faces several challenges in creating and implementing an institutional software assurance capability for systems in acquisition, as well as legacy systems. Such a capability must be multidimensional (including policy, guidance, process, practice, tool, and workforce concerns) to encompass reliability, security, robustness, safety, and other quality-related attributes critical to achieving the level of confidence that software functions as intended and is free of vulnerabilities. This software assurance capability must be broad in scope and rigorous in its discipline, but with enough flexibility and adaptability to address the challenges discussed below.
Software assurance policies and capabilities should span a spectrum of system and use-case challenges. These policies and capabilities need to account for the fact that the DoD maintains a diverse and complex systems portfolio including business and enterprise network information systems; modeling and simulation; automated tools for design, test, and manufacturing; complex C4ISR; and autonomous, semi-autonomous, and manned systems.
This diverse and complex systems portfolio must operate in a net-centric cyber environment where each system acts as an information node in one or more networks. Although DoD systems operate in this environment, they are overwhelmingly acquired as individual systems, rather than being managed in a systems-of-systems or portfolio context for acquisition and sustainment. More than 100 systems alone are classified as major defense acquisition programs. The software assurance use case challenge also spans legacy systems that may be in operation and sustainment for multiple decades. During their lifetime, these systems undergo continuous evolution to meet warfighter performance needs and to continue to operate in the net-centric environment.
Another issue that must be considered in formulating software assurance capabilities and policies is the limited visibility at the DoD level of the size and demographics of the total software inventory. As a result, there is no ongoing analysis of this ever changing inventory to inform software acquisition and sustainment enterprise decisions as well as assurance policy, program, and investment decisions.
Further complicating matters is the fact that the software supply chain is not well integrated vertically in terms of prime-to-subcontractor flow-down of software assurance requirements, independent verification and validation (IV&V), and test and evaluation to enable consistent visibility, traceability, and integrated testing.
The myriad assurance-type acquisition requirements create a confusing set of stove-piped policies that sustainment and acquisition program managers must navigate, understand, and make actionable. For example, government and industry acquisition program managers face a variety of policies and requirements for information assurance, software assurance, trusted systems, program protection planning, anti-tamper, and the like. The acquisition community is therefore often confused about what these and other terms mean, why they are important, what needs to be done and how, and, finally, how to evaluate achievement of the desired outcome.
Another challenge is that the DoD’s current assurance policies are the result of incremental changes over many years, rather than originating from a coherent and integrated strategy that recognizes the rapid pace of technology and software change. As a result, there is a need for an overarching, integrated assurance framework (ideally evidence-based) that can be continually adapted to address emerging needs and that communicates more effectively to officials who plan and execute acquisition and sustainment programs.
Finally, the DoD’s sustainment and acquisition communities are decentralized geographically and organizationally with limited, on-going visibility at the DoD enterprise level of the state of the software assurance infrastructure (including tools, practices, and workforce) to inform policy and resource decisions.
Towards a Strategic DoD Software Assurance Approach
The DoD must deal with a critical strategy issue in addressing the software assurance challenge. The DoD currently relies on a decentralized approach to software assurance. In this approach, each program (acquisition and legacy) addresses software assurance (and overall system assurance) within the acquisition and contract strategy for that specific program. This approach can enable the proliferation of an ever-increasing variety of approaches to software assurance for individual systems that must operate in a systems-of-systems environment. In 2010, the DoD initiated the Program Protection Plan (PPP) requirement for acquisition programs to consolidate in one document a number of existing assurance-type policy requirements. This policy is certainly a step in the right direction to elevate the criticality of assurance in all phases of the system lifecycle. However, the policy allows each program to satisfy assurance requirements in different ways and relies on idiosyncratic (and potentially conflicting) software assurance tools and practices of multiple contractors.
The persistence of concerns and their consequences drives the need to consider the merits of creating and organizing the DoD software assurance policy and infrastructure around a standard infrastructure. This infrastructure should include policies, contract requirements, practices, tools, and a workforce that is continually refreshed. This infrastructure should also be continually evaluated for its value to achieve consistency across programs, within the DoD’s software sustainment organizations, and the SoS network environment.
This shift to an enterprise software assurance approach should include creating a meta-assurance framework that synthesizes myriad legacy assurance models into a more technologically current model. This model should effectively communicate to acquisition and sustainment managers what must be done, how to implement it, and how to evaluate outcomes. An effective enterprise software assurance approach should also include the means to continually and comprehensively assess and improve the state of the art of the DoD’s software assurance infrastructure: develop a baseline of past and current capabilities, identify the gaps, and then identify the impacts of emerging technology. Moreover, the DoD needs an enterprise strategy, plan, and performance measures for the inventory and continual analysis of the software portfolio to inform corporate decisions about software assurance policies, programs, and investments.
In addition to these policy frameworks and assessment measures, the DoD also needs to enhance workforce competencies for software assurance within the context of a broad-based enterprise strategy to improve software acquisition management policy, practices, and competencies. Since these workforce issues cannot entirely be addressed by applying existing best practices, a deliberate and executable DoD research and development investment strategy is also needed to advance capabilities in software assurance and vulnerability detection to address infrastructure gaps and future needs. This investment strategy should consider the following technical areas:
Architecture and composition principles that enable separate evaluation of individual components with the possibility of combining results to achieve aggregate assurance judgments. These principles are motivated by the reality of modern software supply chains, which are rich and diverse in sourcing and geography.
Modeling and analytics to support the diversity of quality attributes significant to DoD and infrastructural systems including modeling, simulation, and design techniques that support critical security attributes.
Exploration of development approaches that incorporate creation of evidence in support of assurance claims into the process of development. This evidence-based assurance can harmonize incentives to create designs and implementations that can more readily support evaluation.
Evaluation and other techniques to support the use of more opaque components in systems, including binary components and potentially dangerous components from unknown sources.
Summary
Software assurance is a complex domain that is one element of a larger mission assurance framework needed by the DoD to encompass reliability, security, robustness, safety, and other quality-related attributes. Although software assurance nests within the DoD’s overall software strategy, its related policies, plans, and infrastructure have not kept pace with the impacts of advancing technology and reliance on software to achieve warfighter performance. To address critical software assurance challenges, the DoD must shift to an enterprise-led and enterprise-managed infrastructure.
Additional Resources
To download a PDF of the report Critical Code: Software Producibility for Defense, please go to www.nap.edu/catalog.php?record_id=12979.
To read the SEI report Making the Business Case for Software Assurance, please go to http://www.sei.cmu.edu/library/abstracts/reports/09sr001.cfm.
Acquisition and sustainment workforce competencies for software assurance are a critical element in creating a robust infrastructure. For an overview of the software assurance competency domain, please go to http://www.sei.cmu.edu/library/abstracts/reports/13tn004.cfm.
SEI Blog, Jul 27, 2015 02:22pm
By Douglas C. Schmidt, Principal Researcher
Department of Defense (DoD) program managers and associated acquisition professionals are increasingly called upon to steward the development of complex, software-reliant combat systems. In today’s environment of expanded threats and constrained resources (e.g., sequestration), their focus is on minimizing the cost and schedule of combat-system acquisition, while simultaneously ensuring interoperability and innovation. A promising approach for meeting these challenging goals is Open Systems Architecture (OSA), which combines (1) technical practices designed to reduce the cycle time needed to acquire new systems and insert new technology into legacy systems and (2) business models for creating a more competitive marketplace and a more effective strategy for managing intellectual property rights in DoD acquisition programs. This blog posting expands upon our earlier coverage of how acquisition professionals and system integrators can apply OSA practices to decompose large monolithic business and technical designs into manageable, capability-oriented frameworks that can integrate innovation more rapidly and lower total ownership costs.
Collapsing Stove-Pipes with Technical Reference Frameworks
DoD programs have struggled for decades to move away from stove-piped solutions towards common operating platform environments (COPEs). An earlier post in this series described the drawbacks of stove-piped solutions, which lock the DoD into a small number of system integrators, each devising proprietary point solutions that are expensive to develop and sustain over the lifecycle. Although stove-piped solutions have been problematic (and unsustainable) for years, the budget cuts occurring under sequestration have motivated the DoD to reinvigorate its focus on identifying alternative means to drive down costs, create more affordable acquisition choices, and improve acquisition program performance.
The recently released DoD Better Buying Power 2.0 policy initiative, which superseded Better Buying Power 1.0, is one such alternative. Critical to the success of the Better Buying Power initiatives is the focus on the OSA paradigm, which includes a continually evolving integrated business and technical program management strategy that employs modular design, publishes consensus-based standards for key interfaces, embraces transparency, invigorates enterprise innovation practices, and shares risk via systematic reuse of software/hardware capabilities and assets.
OSA technical practices—which are the focus of this blog posting—foster competition and innovation through published open interfaces, protocols, and data models; open standards; full-design disclosure; modular components; and intentionally-defined system structures and behaviors. Likewise, OSA business models—which will be the focus of later blog postings in this series—use open competition to expand the pool of defense contractors that can participate in DoD acquisition program contracts, thereby lowering costs and invigorating innovation.
The SEI is helping the DoD craft its OSA strategy and produce an implementation plan aimed at helping program managers deliver better capability to warfighters, given the fiscal constraints of sequestration. A working group has been established to help the DoD move away from stove-piped software development models to COPEs that embody OSA practices. As part of this effort, I am serving as co-lead of a task area on "published open interfaces and standards" that is seeking to avoid vendor lock-in, encourage competition, and spur innovation by defining a limited number of technical reference frameworks. These frameworks are integrated sets of competition-driven, modular components that provide reusable combat system architectures for families of related warfighting systems. Key tenets of technical reference frameworks include
improving the quality and affordability of DoD combat systems via the use of modular, loosely coupled, and explicitly-articulated architectures that provide many shared capabilities to warfighter applications
Examples of these shared capabilities include data- and net-centricity, utility computing, situational awareness, and location awareness.
fully disclosing requirements, architecture, and design specifications and development work products to program performers
The goal of full disclosure is to ensure that competitors and small businesses have sufficient information to implement drop-in replacements for key system modules.
enabling systematic reuse of software and software artifacts, such as objects, components, services, data models, and associated metadata and deployment plans
In a well-honed systematic reuse process, each new project or product leverages time-proven architectures, designs and implementations, only adding new code that's specific to a particular application or service.
mandating common components based on published open interfaces and standards supported by a range of open-source (access to source code) and closed-source (no access to source code) providers
Open interfaces and standards are essential to spur competition and avoid being locked in to a specific technology or vendor.
achieving interoperability between hardware and/or software applications and services via common protocols and data models
These common protocols and data models simplify data interchange and exchange between components from different suppliers or components implemented using different technologies.
amortizing the effort needed to create conformance and regression test suites that help to automate the continuous verification, validation, optimization, and assurance of functional requirements and non-functional requirements
These automated test suites ensure that developers need not wait until final system integration to perform critical functional and quality-of-service evaluations, nor must they expend significant effort (re)creating these tests manually.
Limiting the number of these technical reference frameworks will increase interoperability and reuse opportunities, leading to cost savings across the enterprise and across the lifecycle.
The challenge, of course, is to define guidance that helps program managers and associated acquisition professionals systematically apply technical reference framework tenets to address technical, management, and business perspectives in a balanced manner. This blog posting—together with others we have planned—describes how advances in DoD combat system architectures lay the groundwork for success with technical reference frameworks.
The Architectural Evolution of DoD Combat Systems
The technical reference framework approach described above has been applied with varying degrees of maturity and success to DoD combat systems over the past several decades. I, along with fellow SEI researcher Don Firesmith and Adam Porter from the University of Maryland’s Department of Computer Science, have been documenting the evolution of DoD combat systems with respect to their adoption of systematic reuse and the OSA paradigm described above, as shown in the following diagram.
This diagram shows eight distinct stages along the evolutionary continuum of DoD combat systems. The ad hoc architectures in the columns on the left are highly stove-piped and exhibit little or no shared capabilities among various warfighter capabilities, such as communications, radars, launchers, etc. The increasingly advanced architectures on the right are intentionally designed to share many capabilities at different levels of abstraction in combat systems, including
infrastructure capabilities, such as internetworking protocols, operating systems, and various layers of middleware services, such as identity managers, event disseminators, resource allocators, and deployment planners
common data and domain capabilities, such as trackers, interoperable data models, and mission planners involving battle management, control, and interaction with sensors and weapons in C4ISR systems
external interfaces over the global information grid (GIG) to external weapon systems, as well as sources and users of information
In practice, production combat systems vary in terms of their progression along the continuum shown in the figure above. This visualization simply provides a bird’s-eye view of the design space of DoD combat systems with respect to architectural evolution.
What’s Ahead
This blog posting is the latest in an ongoing series that describes how the SEI is helping acquisition professionals and system integrators apply OSA principles and practices to acquire and sustain innovative warfighting capabilities at lower cost and higher quality. Upcoming postings in this series will describe the other stages of DoD combat system architectural evolution shown in the diagram above. Subsequent posts will then explore a research effort to help one Navy program obtain accurate estimates of the cost savings and return on investment for both development and lifecycle of several product lines that were built using a common technical reference framework.
Additional Resources
To read the SEI technical report, A Framework for Evaluating Common Operating Environments: Piloting, Lessons Learned, and Opportunities, by Cecilia Albert and Steve Rosemergy, please visit http://www.sei.cmu.edu/library/abstracts/reports/10sr025.cfm
To read the SEI technical note, Isolating Patterns of Failure in Department of Defense Acquisition, by Lisa Brownsword, Cecilia Albert, David Carney, Patrick Place, Charles (Bud) Hammons, and John Hudak, please visit http://www.sei.cmu.edu/library/abstracts/reports/13tn014.cfm
To read the SEI technical report, The Business Case for Systems Engineering Study: Results of the Systems Engineering Effectiveness Survey, by Joseph Elm and Dennis Goldenson, please visit http://www.sei.cmu.edu/library/abstracts/reports/12sr009.cfm
To read the SEI technical report, Quantifying Uncertainty in Early Lifecycle Cost Estimation (QUELCE), by Robert Ferguson, Dennis Goldenson, James McCurley, Robert Stoddard, David Zubrow, and Debra Anderson, please visit http://www.sei.cmu.edu/library/abstracts/reports/11tr026.cfm
To read the handbook, QUASAR: A Method for the Quality Assessment of Software-Intensive System Architectures, by Donald Firesmith, please visit http://www.sei.cmu.edu/library/abstracts/reports/06hb001.cfm
To read the SEI technical report, Lessons Learned from a Large, Multi-Segment, Software-Intensive System, by John Foreman and Mary Ann Lapham, please visit http://www.sei.cmu.edu/library/abstracts/reports/09tn013.cfm
To read the SEI technical report, Resource Allocation in Dynamic Environments, by Jeffrey Hansen, Scott Hissam, B. Craig Meyers, Gabriel Moreno, Daniel Plakosh, Joe Seibel, and Lutz Wrage, please visit http://www.sei.cmu.edu/library/abstracts/reports/12tr011.cfm
To read the SEI technical report, Incremental Development in Large-Scale Systems: Finding the Programmatic IEDs, by Charles (Bud) Hammons, please visit http://www.sei.cmu.edu/library/abstracts/reports/09tn015.cfm
SEI Blog, Jul 27, 2015 02:19pm
By Eric Werner, Chief Architect, SEI Emerging Technology Center
The power and speed of computers have increased exponentially in recent years. Recently, however, modern computer architectures are moving away from single-core and multi-core (homogeneous) central processing units (CPUs) to many-core (heterogeneous) CPUs. This blog post describes research I’ve undertaken with my colleagues at the Carnegie Mellon Software Engineering Institute (SEI), including Jonathan Chu and Scott McMillan of the Emerging Technology Center (ETC) and Alex Nicoll, a researcher with the SEI’s CERT Division, to create a software library that can exploit the heterogeneous parallel computers of the future and allow developers to create systems that are more efficient in terms of computation and power consumption.
As we look at computing trends in data centers available to developers, the move towards many-core heterogeneous CPUs shows no sign of abating. The majority of computers (such as smartphones and other mobile devices) contain heterogeneous hardware with multi- and many-core chips. Many existing software libraries, frameworks, and patterns, however, were not developed for large-memory, many-core, heterogeneous computing environments. Since software developers often aren’t accustomed or trained to write software for many-core architectures, new hardware architectures aren’t being used to their potential.
Complicating matters even further is the fact that many common software libraries for these environments are not designed for ease of use, but rather for efficient and optimal computing. Unfortunately, software developers haven’t received the training necessary in parallel programming algorithms to best leverage the capabilities of these new architectures.
Foundations in Moore’s Law
Our research approach traces its foundations back to Moore’s Law. Many software and systems engineers paraphrase Moore’s Law as stating that processor speed on integrated circuits doubles every two years. Moore’s Law actually concerns transistor density: the number of transistors on a chip doubles roughly every two years (a figure often quoted as 18 months). In recent years, however, CPU manufacturers have focused less on clock speed and more on multi-core and special-purpose cores with deeper memory architectures.
"The performance of individual computer processors increased on the order of 10,000 times over the last two decades of the 20th century without substantial increases in cost and power consumption," noted Samuel Fuller and Lynette Millett in The Future of Computing Performance. Fuller and Millett also observe, "Future growth in computing performance will have to come from parallelism. Most software developers today think and program by using a sequential programming model to create software for single general-purpose microprocessors."
From Gaming to High-Performance Computing: Graph Analytics for Everyday Users
While heterogeneous, multi-core architectures were once largely seen in gaming systems, the high-performance computing (HPC) community has also migrated to heterogeneous, multi-core architectures. These architectures allow for high-level computations, as well as three-dimensional physics simulations. While still a specialty field, the architectures witnessed in HPC systems will soon be widely available to the everyday user.
In June of this year, the International Supercomputing Conference released the Top500 supercomputer list. According to an article on the Top500 list posted on AnandTech, "the number of hybrid systems is actually down—from 62 on the last list to 54 now—but on the other hand the continued improvement and increasing flexibility of GPUs and other co-processors, not to mention the fact that now even Intel is involved in this field, means that there’s more effort than ever going into developing these hybrid systems."
One phase of our research involves using HPC architectures to simulate future computer architectures and develop software libraries, best practices, and patterns that can be used by a broad community of software developers. Initially, we limited our focus to graph analytics, meaning algorithms that operate on graphs. Because graph data lacks locality of reference, operations on it are hard to parallelize.
Graph analytics are widely used in government, commerce, and science and can highlight relationships that might be obscured by data. One example of a system that can be represented as a graph is a social network, where the individual components are people and the connections they form represent social relationships.
For a reference platform, we are relying on the Graph 500, an international benchmark started in 2010 that rates how fast HPC systems test, traverse, and navigate a graph. The Graph 500 is similar to the Top500 referenced above, but it is specifically designed to test graph algorithms because they’re fundamentally different from easily parallelizable algorithms. We’re using the algorithms defined by the Graph 500 as our starting point.
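To make the benchmark more concrete: the core Graph 500 kernel is a breadth-first search (BFS) from a chosen root, and results are reported in traversed edges per second (TEPS). The following is a minimal, sequential Python sketch of a level-by-level BFS on a toy graph. The function names, the adjacency-list layout, and the sample edges are illustrative assumptions, not the benchmark's reference implementation or our library.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency list from (u, v) pairs."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def bfs_parents(adj, root):
    """Level-by-level BFS. Returns (parent map, number of edges examined)."""
    parent = {root: root}          # the root is its own parent
    frontier = [root]
    edges_examined = 0
    while frontier:
        next_frontier = []
        for u in frontier:
            # Neighbor lists can point anywhere in memory; this irregular
            # access pattern is why graph traversal has poor locality.
            for v in adj[u]:
                edges_examined += 1
                if v not in parent:
                    parent[v] = u
                    next_frontier.append(v)
        frontier = next_frontier
    return parent, edges_examined


# Toy graph (hypothetical data): five vertices, five undirected edges.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
adj = build_adjacency(edges)
parent, examined = bfs_parents(adj, 0)
print(parent, examined)
```

Dividing the number of edges examined by the elapsed time gives a TEPS-style figure. A parallel version would process each frontier concurrently, which is where the synchronization and memory-access difficulties arise.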
Identifying and Validating Design Patterns
With the understanding that the development of patterns is primarily a bottom-up endeavor, we initially focused on reviewing patterns developed for homogeneous HPC systems. These patterns will be culled from those developed by ETC researchers, as well as our collaborators in government, academia, and industry.
Validation is a critical process in this phase, and we are using two independent technical validation mechanisms:
We are beta-testing the software library with software engineers, having them use the library to generate novel graph analytical code for advanced computing architectures.
We are soliciting feedback from real-world users and plan to present the patterns that we reviewed at relevant upcoming technical conferences.
The next phase of our work will focus on culling the homogeneous HPC patterns that we audited to develop a library of templates and patterns that software developers, architects, and technology planners will use to effectively access and exploit future computing architectures. As stated previously, better utilization of hardware resources means faster computation and possibly lower power consumption.
A Collaborative Approach
In this early phase of our work, we have been collaborating with researchers at Indiana University’s Extreme Scale Computing Lab, which developed the Parallel Boost Graph Library. In particular, we are working with Andrew Lumsdaine, who serves on the Graph 500 Executive Committee and is considered a world leader in graph analytics.
Addressing the Challenges Ahead
We recognize that even if we achieve all of our milestones, our research will not yield a silver bullet. Programmers need niche skills to address some of the problems of multiple architectures. Our approach focuses on reducing the time that developers spend on the mechanics of programming for heterogeneous architectures, so that they can concentrate on the problem at hand rather than fighting the hardware.
This challenge problem aligns with the mission of the Emerging Technology Center (ETC), which is to promote government awareness and knowledge of emerging technologies and their application and to shape and leverage academic and industrial research. We hope our research will enable programmers in government and industry to use the library of templates and design patterns that we develop to produce effective software for future computing systems. Our next step involves releasing this library to our stakeholders in the DoD, and to other users, via an open-source platform to enable them to effectively access and exploit future computing architectures.
While our initial phase of research focused on graph analytics in graphics processing units (GPUs), we will also investigate other hardware platforms in the future, including field-programmable gate arrays (FPGAs). We plan to develop a library that separates the concerns of graph analysis from the details of the underlying hardware architecture. Future work will focus on graphs, but add new hardware architectures to the mix such as FPGA and potentially distributed platforms.
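One way to picture this separation of concerns is an algorithm written against a small hardware-facing interface, with one implementation of that interface per platform. The Python sketch below is only an illustration of the idea: the `GraphBackend` interface, the `CpuBackend` class, and the `bfs_levels` function are hypothetical names invented here, not the API of the library under development.

```python
from abc import ABC, abstractmethod


class GraphBackend(ABC):
    """Hypothetical hardware-facing interface (an assumption for this sketch)."""

    @abstractmethod
    def neighbors_of(self, vertices):
        """Return the set of vertices adjacent to any vertex in `vertices`."""


class CpuBackend(GraphBackend):
    """Plain sequential implementation. A GPU or FPGA backend would
    implement the same method with platform-specific code."""

    def __init__(self, adjacency):
        self._adjacency = adjacency

    def neighbors_of(self, vertices):
        result = set()
        for v in vertices:
            result.update(self._adjacency.get(v, ()))
        return result


def bfs_levels(backend, root):
    """Hardware-agnostic BFS: distance from root to each reachable vertex."""
    levels = {root: 0}
    frontier = {root}
    depth = 0
    while frontier:
        depth += 1
        # Only this call touches the platform-specific code.
        candidates = backend.neighbors_of(frontier)
        frontier = {v for v in candidates if v not in levels}
        for v in frontier:
            levels[v] = depth
    return levels


# Toy adjacency data (hypothetical).
adjacency = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
levels = bfs_levels(CpuBackend(adjacency), 0)
print(levels)
```

Because `bfs_levels` never inspects how neighbors are computed, swapping in a different backend would not require changing the analysis code, which is the property the library design aims for.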
If you are interested in collaborating with us on this research, please leave a comment below or send an email to info@sei.cmu.edu.
Additional Resources
For more information about the Emerging Technologies Center, please visit http://www.sei.cmu.edu/about/organization/etc/index.cfm
SEI Blog, Jul 27, 2015 02:19pm
By Will Casey, Senior Member of the Technical Staff, CERT Division
Exclusively technical approaches toward attaining cyber security have created pressures for malware attackers to evolve technical sophistication and harden attacks with increased precision, including socially engineered malware and distributed denial of service (DDoS) attacks. A general and simple design for achieving cybersecurity remains elusive, and addressing the problem of malware has become such a monumental task that technological, economic, and social forces must join together to address this problem. At the Carnegie Mellon University Software Engineering Institute’s CERT Division, we are working to address this problem through a joint collaboration with researchers at the Courant Institute of Mathematical Sciences at New York University led by Dr. Bud Mishra. This blog post describes this research, which aims to understand and seek complex patterns in malicious use cases within the context of security systems and develop an incentives-based measurement system that would evaluate software and ensure a level of resilience to attack.
In March of this year, an attacker issued a DDoS attack that was so massive, it slowed internet speeds around the globe. Known as Spamhaus/Cyberbunker, this attack clogged servers with dummy internet traffic at a rate of about 300 gigabits per second. By comparison, DDoS attacks against banks typically register only 50 gigabits per second, according to a recent article in Business Week. The Spamhaus attack came 13 years after the publication of best practices on preventing DDoS attacks, and it was not an isolated event.
The latest figures indicate that cyberattacks continue to rise. Research from the security firm Symantec indicates that in 2012 targeted cyber attacks increased by 42 percent. How is this possible? In part, existing technologies facilitate the role of attacker over the role of defender, since in this hide-and-seek game, the tricks to hide an attack are many, whereas the techniques to seek them are meager and resource intensive.
In the SEI CERT Division, our work aims to go beyond simply detecting an attacker’s strategic and deceptive actions by reversing the incentives at play and making the choices made in hide-and-seek dynamics more transparent. Attackers have incentives to find weaknesses in software that facilitate system compromise. We envision the possibility that these dynamics can be reversed through an altered incentive structure, credible deterrence/threats, and powerful measurement systems. For example, we may incentivize an emerging group to acquire and deploy particular expertise to evaluate software and guarantee its validity, albeit empirically, using techniques from machine learning. This combination of techniques, including expert systems, model checking, and machine learning, can ensure an increased level of resilience without loss of transparency. Moreover, game theory provides a means to evaluate the dynamics of incentives and to understand the impacts of new technologies and use cases.
Deterring Malicious Use in Systems
Existing proposals for deterring malware attacks rely on the isolation of an elite network with enhanced security protocols, which undermines the utility of networking and does little to deter incentives for maliciousness. Instead, this strategy concentrates digital assets in one place, putting all eggs in one highly vulnerable basket. Such proposals, while costly and risky, underscore the importance of introducing alternative ideas into the discussion of common information assurance goals.
For example, since computer networks gather users with a variety of different interests and intents, we may wish to incentivize computer users to take steps that will compel them to reassure other users that they have not been maliciously compromised. To obtain this assurance we may leverage the work of technical and security experts, which involves sophisticated software vulnerability probing techniques (such as fuzz testing) and trust mechanisms (such as trusted hardware modules). With these assurances we demonstrate the possibility of economic incentives for software adopters to have deeper and clearer expectations about a network’s resilience and security.
Foundations in Game Theory
Many of the ideas in our approach can be traced back to John von Neumann, a Princeton University mathematician who, with his colleague Oskar Morgenstern, created the basic foundations of modern game theory, which studies how rational agents make strategic choices as they interact. An example of one such strategic choice is the concept of mutual assured destruction (MAD), which describes a doctrine that a war in which two sides would annihilate each other would leave no incentive for either side to start a war. Once the two sides have come to such a mutually self-enforcing strategy, neither party will deviate as long as the opponent does not. Such a state of affairs is described in game theory by the concept of Nash equilibrium. We aim to cast the cyber-security problem in a game-theoretic setting so that every "player" will choose to be honest, will check that they are honest and not an unwitting host for malware, can prove to others that they are honest, and can confidently accept the proofs that others are honest and not acting deceptively.
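The notion of Nash equilibrium can be illustrated with a small two-player game. The Python sketch below enumerates the pure-strategy equilibria of a game in which each player is either honest or deceptive, and shows how a penalty on deception can move the equilibrium from mutual deception to mutual honesty. The payoff numbers, the `build_game` helper, and the penalty values are hypothetical and chosen only for illustration; they are not taken from our model.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player game.

    `payoffs` maps (row_action, col_action) -> (row_payoff, col_payoff).
    A profile is an equilibrium if neither player can gain by
    unilaterally switching to another action.
    """
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for r, c in product(rows, cols):
        row_payoff, col_payoff = payoffs[(r, c)]
        row_ok = all(payoffs[(r2, c)][0] <= row_payoff for r2 in rows)
        col_ok = all(payoffs[(r, c2)][1] <= col_payoff for c2 in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


def build_game(penalty):
    """Hypothetical payoffs: deception pays 3 against an honest player and
    1 against another deceiver, minus a penalty for being caught."""
    exploit = 3 - penalty
    mutual = 1 - penalty
    return {
        ("honest", "honest"): (2, 2),
        ("honest", "deceive"): (0, exploit),
        ("deceive", "honest"): (exploit, 0),
        ("deceive", "deceive"): (mutual, mutual),
    }


no_deterrence = pure_nash_equilibria(build_game(penalty=0))
with_deterrence = pure_nash_equilibria(build_game(penalty=4))
print(no_deterrence, with_deterrence)
```

With no penalty, the only equilibrium in this toy game is mutual deception. With a sufficiently large penalty, the only equilibrium is mutual honesty, which is the kind of incentive reversal described above.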
A Collaborative Approach
Through our collaboration with the research team at NYU (Dr. Mishra and Thomson Nguyen), we can access their extensive knowledge in game theory, pattern finding, model checking, and machine learning. We are also collaborating with Anthony Smith of the National Intelligence University. This collaboration will allow us to access Smith’s advanced knowledge in network security science and technology. Researchers from the SEI’s CERT Division involved in the project include Michael Appel, Jeff Gennari, Leigh Metcalf, Jose Morales, Jonathan Spring, Rhiannon Weaver, and Evan Wright. Building on the deep domain knowledge from CERT about the nature and origin of malicious attacks and how often those attacks occur, our collaboration with the Courant Institute will provide a better understanding of the implications of such attacks in a larger system. The theoretical framework for this approach is based on model checking and follows strategies similar to what Dr. Mishra developed for hardware verification as a graduate student at CMU.
One of our preliminary tasks was to develop the mathematical frameworks to describe vulnerabilities including attack surface, trace data, software errors and faults, and malicious traces. The ability to rigorously define these patterns allowed us to formalize and detect a large class of malicious patterns as they transfer from user to user.
As an interesting use-case, we focused on several critical patterns to identify malicious behaviors in traces. By studying the Zeus Trojan horse, which was used to steal users’ banking information, we were able to identify atomic actions that allow malicious users to persist on a system and compromise their web browsers.
The enhanced system that we are proposing will additionally provide some degree of guaranteed resilience. When fully implemented, our approach will provide three key benefits to our stakeholders in government and industry:
a well-understood theoretical model of interactions among benign and malicious users, providing a more accurate picture of forces (technological, economic, political and social) that shape the security landscape
a scalable method for malware mitigation, including an adaptive system that can identify and address new threats
a transparent mechanism to vet commercialized software, which touches upon our notions of trusted computing at multiple levels, from firmware to Android applications.
Measures for Resilience to Malicious Attacks
Our new system has many attractive features: it does not simply stop after identifying a network attack. Instead, it motivates and enables deployment of measures of weaknesses using practical techniques such as vulnerability assessments for servers, fuzz testing binaries for weaknesses, and verifying adherence to best practice. These measures provide decision makers and users alike with means to adapt best practices and keep the network operational. Consequently, the system’s designers will also better understand what security features are needed in response to current threats and attack methods. Many software vulnerabilities result from implicit assumptions made at the time of design. While it may not be possible to anticipate all the attacks against a design, we can begin to measure and minimize the time it takes for designers to respond to current attacks within our framework of measures.
In summary, we believe the proposed system will not just deter malicious attackers, but will also motivate users to ensure that their computers and mobile devices have not been purposefully or unintentionally compromised. In addition, designers will benefit from adding in security as user demands for security increase.
Challenges
There is no widely accepted definition of malicious behavior that is stated in a way average users can understand and guard against. We need to be able to help average users and those in government and industry gain a better understanding of whether behavior is malicious or benign.
A simpler and much more popular concept used in the physical context is trust. If trust is perceived to be a valuable economic incentive in the cyber context, and users can assess whether they can trust a server or a software application, then a trust-based technique can be used and can benefit a diverse group of users, ranging from individual users to personnel in industry and government.
Our approach will be especially powerful in trusted computing situations, where trust may be based on cryptographic signatures that validate the source code that operates a device. Granted, complete certainty may still be elusive. Users can nonetheless have some assurance about the health of their network, because a third party can verify and certify that all components are trustworthy and are behaving well.
Our Next Steps
Our next step is to check our design, as well as our implicit assumptions about how individuals behave in this framework. To this end, we have been developing a system that we can simulate in silico (initially aimed at understanding only the incentives to attack and to counter attacks with mitigation) so that we can better understand how individuals strategize to select an equilibrium (a strategy profile from which no single player is incentivized to deviate). In the future, we plan to invest in deep, multi-trace modeling, extending the game theory to include temporal patterns of attacks on software and systems, which will involve simulation modeling and model checking. Through simulation modeling, we can estimate the resource needs, overheads, and other requirements of the system for practical deployments.
Additional Resources
For more information about the work of researchers in the SEI’s CERT Division, please visit www.cert.org
For more information about New York University’s Courant Institute of Mathematical Sciences, please visit http://www.cims.nyu.edu/
SEI Blog, Jul 27, 2015 02:19pm
Examples From Real-World Projects
By Stephany Bellomo, Senior Member of the Technical Staff, Software Solutions Division
Agile projects with incremental development lifecycles are showing greater promise than waterfall projects in enabling organizations to rapidly field software. There is a lack of clarity, however, regarding the factors that constitute and contribute to the success of Agile projects. A team of researchers from Carnegie Mellon University’s Software Engineering Institute, including Ipek Ozkaya, Robert Nord, and me, interviewed project teams with incremental development lifecycles from five government and commercial organizations. This blog posting summarizes the findings from this study, which sought to understand the key success and failure factors for rapid fielding on their projects.
A key area in our interviews explored how Agile projects deal with the pressure to rapidly deliver high-value capability while maintaining project speed (delivering functionality to the users quickly) and product stability (providing reliable and flexible product architecture). For example, due to schedule pressure, we often see a pattern of high initial velocity (the amount of product backlog effort a team can handle in one sprint) for weeks or months, followed by a slowing of velocity due to stability issues.
Business stakeholders often find these changes in velocity disruptive since the rate of capability delivery slows while the team addresses stability problems. We found that when experienced practitioners were faced with these challenges, they did not apply Agile practices in a silo. Instead they combined practices—Agile, architecture, or other—in creative ways to respond quickly to unanticipated stability problems, as described below.
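The velocity pattern described above (a high initial velocity followed by a slowdown) can be made visible to stakeholders with a simple calculation. The sketch below compares a rolling average of recent sprint velocities against the average of the first few sprints. The function name, the window size, and the 75 percent threshold are assumptions chosen for illustration; they are not values reported by the teams we interviewed.

```python
from statistics import mean


def detect_slowdown(velocities, window=3, drop_ratio=0.75):
    """Return the index of the first sprint at which the rolling average
    velocity falls below drop_ratio times the initial baseline, or None.

    velocities: backlog effort completed per sprint, oldest first.
    window:     number of sprints in the baseline and in each rolling average.
    """
    if len(velocities) < 2 * window:
        return None  # not enough history to compare against the baseline
    baseline = mean(velocities[:window])
    # Start once the rolling window no longer overlaps the baseline sprints.
    for end in range(2 * window, len(velocities) + 1):
        recent = mean(velocities[end - window:end])
        if recent < drop_ratio * baseline:
            return end - 1
    return None


# Hypothetical sprint data: strong start, then stability work slows delivery.
sprint_velocities = [30, 32, 31, 29, 30, 22, 20, 18]
slowdown_at = detect_slowdown(sprint_velocities)
```

For the sample data, `slowdown_at` is `7`, the sprint at which the three-sprint average first drops below 75 percent of the opening average. A real team would tune both parameters to its own sprint length and normal variation.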
Balancing Speed and Stability
The ability to balance speed and stability involves achieving and preserving a software development state that enables teams to deliver releases that stakeholders value at a tempo that makes sense for their business. The desired software development state is different for each organization. This state is one in which architecture (often in the form of platforms and application frameworks), supporting tool environments, practices, processes, and team structures exist to support efficient and sustainable development of features. The entire organization—including development teams, management, and other stakeholders—must have visibility into the desired state, so that they neither over-optimize the supporting development infrastructure nor quit working on it.
In organizations that operate in highly regulated environments, such as defense, avionics, financial services, and health care, software development teams often interact with system engineering, deployment, and quality assurance teams that may be operating under different tempos. One challenge that software development teams face is that these competing forces often result in a significant slowdown in delivery following a high initial velocity. When confronted with this slowdown, the organizations that we interviewed applied a variety of tactics and practices to get back to the desired state. For example, they often applied Agile practices in combination with other practices, especially architecture practices, to rapidly field projects.
A New View of Agile
Many software developers steadfastly maintain that Agile requires small, co-located teams, downplays architectures, and provides no documentation. The reality is that organizations—especially those faced with the challenge of rapidly fielding software systems in highly-regulated environments—have been applying varied architecture practices that build on the foundations of Scrum and Extreme Programming (XP).
The approaches revealed by our analysis of the interviews we conducted fall into three categories:
When the project was going well, teams described using foundational Agile practices, such as Scrum status meetings, Scrum collaborative management style, small dedicated teams and limited scope, continuous integration, test-driven development, and so on to enable rapid development.
When teams described a problem, they would explain how they often combined Agile practices with architecture and other practices, ranging from management to engineering, to get back to the desired state. Examples of these combined practices include release planning with architectural considerations, prototyping with a quality attribute focus, release planning with external dependency management, test-driven development with quality attribute focus, and technical debt monitoring with quality attribute focus. Development projects incur technical debt when shortcuts (intended or otherwise) lead to degraded quality.
We also collected some examples of inhibiting factors that prevented developers from rapidly delivering software products. These included slow business decision-making processes, limitations in measuring architectural technical debt, over-dependency on architecture for knowledge, and stability-related efforts that are not entirely visible to the business.
Below are a few examples of Agile architecture practices that enable speed and stability that emerged from our interviews:
Release planning with architecture considerations. This practice extends the feature release planning process by adding architectural information to the feature description document prior to release prioritization.
Prototyping with a quality attribute focus. This practice extends the prototyping/demo user feedback practice commonly integrated into Scrum to include a focus on quality attributes, such as performance or security.
Roadmap/Vision with External Dependency Management. This practice incorporates external dependency analysis into the roadmap planning process to reduce the risk of being blind-sided by unanticipated external changes.
Test-Driven Development with Quality Attribute Focus. This practice merges test-driven practices, such as automated test-driven development and continuous integration, with a focus on runtime qualities, such as performance, scalability, and security.
Technical Debt Monitoring with Quality Attribute Focus. This practice merges the notion of tracking technical debt with a focus on quality attributes (such as performance, security, reliability) that go beyond defects and functional correctness.
More examples of combined practices can be found in the paper A Study of Enabling Factors for Rapid Fielding: Combined Practices to Balance Tension Between Speed and Stability that I co-authored on this research with Ipek Ozkaya and Robert Nord.
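As one illustration of test-driven development with a quality attribute focus, the sketch below pairs an ordinary functional test with a test that enforces a runtime budget, so a performance regression fails the build just as a functional defect would. The function under test, the input size, and the two-second budget are all hypothetical and deliberately generous; they do not come from the projects we studied.

```python
import time
import unittest

# Hypothetical budget; a real team would derive it from its performance requirements.
LATENCY_BUDGET_SECONDS = 2.0


def find_duplicates(items):
    """Return the set of values that appear more than once in items."""
    seen, dupes = set(), set()
    for item in items:
        if item in seen:
            dupes.add(item)
        else:
            seen.add(item)
    return dupes


class FindDuplicatesTest(unittest.TestCase):
    def test_functional_correctness(self):
        # Conventional test: is the answer right?
        self.assertEqual(find_duplicates([1, 2, 2, 3, 3, 3]), {2, 3})

    def test_performance_budget(self):
        # Quality attribute test: is the answer right and delivered in time?
        data = list(range(50_000)) + [0]
        start = time.perf_counter()
        result = find_duplicates(data)
        elapsed = time.perf_counter() - start
        self.assertEqual(result, {0})
        self.assertLess(elapsed, LATENCY_BUDGET_SECONDS)
```

Saved as a module, these tests run with `python -m unittest` and can sit in the same continuous integration job as the functional suite. Wall-clock assertions can be noisy on shared build machines, so budgets usually need headroom.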
The combined practices listed above allow experienced developers to address the problem of velocity slowdown resulting from stability issues with minimal disruption to capability delivery. Over time, however, acceptance of this approach has evolved. The initial release of Scrum advocated that the practices must be applied exactly as described in the Scrum handbook. If this was not done, the project was considered to have "Scrum But" syndrome, and the outcome would be questionable.
In a recent blog posting, Ken Schwaber—one of the originators of Scrum—amended the initial Scrum doctrine by saying he would like to change the mindset of "Scrum But" to "Scrum And." He explained that "Scrum And" characterizes an organization that is on a path of continuous improvement in software development beyond the basic use of Scrum. In highly regulated environments, such as defense, avionics, financial services, and health care, organizations must often employ "Scrum And" approaches to balance speed and stability in an effort to meet budget, quality and timeline expectations.
Inhibiting Factors
Through our interviews with organizations, we also identified several factors that prevented development teams from rapidly delivering a software product. Many of these inhibiting factors were the result of incorrect or inconsistent applications of Agile or architecture practices, including (but not limited to) the following:
a desire for features that limits requirements analysis or stability-related work
slow business decision, feedback, or review-response time
problems that resulted from challenges with external dependency management
stability-related effort that was not entirely visible
limitations in measuring architectural technical debt
inadequate analysis, design, or proof-of-concept
inconsistent testing practices and/or deficiency in quality attribute focus
poor testing consistency
Concluding Thoughts
While Agile architecture practices can help organizations assure the stability of fielded systems, it is important to understand the root causes of an inability to deliver at the expected pace and to manage the tension between speed and stability. Organizations must also make problems more visible to developers, management, and stakeholders. When considering whether to combine Agile and architecture practices, organizations should consider the following questions:
Are we delivering software to our customer at an expected pace?
Are we aware of problems that are cropping up as a result of losing focus on architecting when Agile adoption activities become the primary focus?
Does our technical roadmap address short-term and long-term issues?
Does the team of software developers have skills that would enable them to successfully implement Agile and architecture?
Do we have visibility into not only the project management of the system but also the quality expected from the system?
We hope that by codifying and sharing the practices exemplified above, other organizations can learn to apply these approaches to contend with the often conflicting demands of rapidly delivering software that is reliable, stable, and flexible in a rapidly changing environment.
Future posts in this series will explore other hybrid Agile approaches identified by the organizations that we interviewed, including prototyping with a quality attribute focus. If you have experience applying a hybrid Agile architecture approach in your organization, please share your story with us in the comments section below.
Additional Resources
This article is an excerpted version of an article that appeared in the June 2013 issue of the Cutter IT Journal. For more information, please visit http://www.cutter.com/itjournal.html
To read more about the SEI’s research in architectural technical debt, please visit http://www.sei.cmu.edu/architecture/research/arch_tech_debt/
To read more about the review of architecture-centric risk factors, please read the article "Architecture for Large Scale Agile Architecture Development: A Risk-Driven Approach," by Mike Gagliardi, Robert Nord, and Ipek Ozkaya, which was published in the May/June 2013 edition of Crosstalk: http://www.crosstalkonline.org/issues/mayjune-2013.html
To read the article "A Study of Enabling Factors for Rapid Fielding: Combined Practices to Balance Tension Between Speed and Stability," by Stephany Bellomo, Robert Nord, and Ipek Ozkaya, please visit http://www.sei.cmu.edu/library/assets/whitepapers/ICSE2013%20Study%20of%20Rapid%20Fielding%20Practices_camera%20ready.pdf



