By Mark Kasunic, Senior Member of the Technical Staff, Software Engineering Process Management Program
Organizations run on data. They use it to manage programs, select products to fund or develop, make decisions, and guide improvement. Data comes in many forms, both structured (tables of numbers and text) and unstructured (emails, images, sound, etc.). Data are generally considered high quality if they are fit for their intended uses in operations, decision making, and planning. This definition implies that data quality is both a subjective perception of the individuals who work with the data and an objective measure based on the data set in question. This post describes the work we’re doing with the Office of Acquisition, Technology and Logistics (AT&L)—a division of the Department of Defense (DoD) that oversees acquisition programs and is charged with, among other things, ensuring that the data reported to Congress is reliable.
The problem with poor data quality is that it leads to poor decisions. This problem has been well documented by many researchers, notably by Larry English in his book Information Quality Applied. According to a report released by Gartner in 2009, the average organization loses $8.2 million annually because of poor data quality. The annual cost of poor data to U.S. industry has been estimated to be $600 billion. Research indicates the Pentagon has lost more than $13 billion due to poor data quality.
Data quality is a multi-dimensional concept, and the international standard data quality model (ISO/IEC 25012) identifies 15 data quality characteristics: accuracy, completeness, consistency, credibility, currentness, accessibility, compliance, confidentiality, efficiency, precision, traceability, understandability, availability, portability, and recoverability. In our data quality research, we have been focusing on the accuracy attribute of data quality. Within the ISO model, accuracy is defined as the degree to which data has attributes that correctly represent the true value for the intended attribute of a concept.
Ensuring data quality is a multi-faceted problem. Errors in data can be introduced in multiple ways. Sometimes it’s as simple as mistyping an entry, but more complex organizational factors also lead to data problems. Common data problems include misalignment of policies and procedures with how the data is collected and entered into databases, misinterpretation of data entry instructions, incomplete data entry, faulty processing of data, and errors introduced while migrating data from one database to another.
A number of software applications have been introduced in recent years to address data quality issues. Gartner estimates that the number of software tools available for data quality grew by 26 percent since 2008. The bulk of these applications, however, focus on customer-relationship management (CRM), materials, and financial data, and the types of errors they are intended to find and correct include duplicate records, missing or incomplete data, character mismatches, and inconsistent data. As part of our research, we are going beyond these basic types of data checks, using statistical, quantitative methods to identify data anomalies that are not addressed by current off-the-shelf data quality software tools.
Examples of the data anomalies that our research is focused on exposing include cost estimates and performance values that are unusual when compared to the time series values that constitute the remainder of the data series. These unusual data values are considered outliers and tagged as anomalies.
A data anomaly is not necessarily the same as a data defect. A data anomaly might be a data defect, but it might also be accurate data caused by unusual, but actual, behavior of an attribute in a specific context. Root cause analysis is typically required to resolve the cause(s) of data anomalies. We are working with our DoD collaborators on the resolution process to determine if the anomalies detected are actual data defects.
Our research is analyzing performance data submitted by DoD contractors in monthly reports about aspects of high-profile acquisition programs, including cost, schedule, and technical accomplishments on a project or task. Some methods that we are evaluating include
Dixon’s Test
Rosner’s Test
Grubbs’ Test
Regression analysis
Autoregressive integrated moving average (ARIMA)
Various statistical control charting applications (individuals, moving range, moving average, exponentially weighted moving average, etc.)
Various non-parametric approaches (kernel function based, histogram based)
Slippage Detection
These approaches to anomaly detection are being compared and contrasted to determine which specific methods work best for each earned value management (EVM) variable we are studying.
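To make one of these techniques concrete, the following is a minimal sketch of a two-sided Grubbs’ test applied to a short series of values. The data are synthetic and the code is illustrative only, not the tooling used in our research.

```python
# A minimal, illustrative two-sided Grubbs' test for a single outlier.
# Synthetic data; requires NumPy and SciPy.
import numpy as np
from scipy import stats

def grubbs_test(values, alpha=0.05):
    """Return (is_outlier, suspect_value) for the most extreme point."""
    x = np.asarray(values, dtype=float)
    n = len(x)
    mean, sd = x.mean(), x.std(ddof=1)
    # Test statistic: largest absolute deviation from the mean, in standard-deviation units
    deviations = np.abs(x - mean)
    g = deviations.max() / sd
    # Critical value derived from the t distribution (two-sided test)
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    g_crit = ((n - 1) / np.sqrt(n)) * np.sqrt(t**2 / (n - 2 + t**2))
    return g > g_crit, x[deviations.argmax()]

# Example: monthly performance-index values with one suspicious spike (synthetic)
monthly_values = [1.02, 0.98, 1.05, 1.01, 0.97, 1.03, 1.74, 1.00, 0.99]
flagged, value = grubbs_test(monthly_values)
print(f"outlier detected: {flagged}, suspect value: {value}")
```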
Our data quality research complements our recent work on the Measurement and Analysis Infrastructure Diagnostic (MAID), which is an evaluation tool that helps organizations understand the strengths and weaknesses of their measurement systems. MAID is broader in scope than our current research, recognizing that data is part of a life cycle that spans sound definition, specification, collection, storage, analysis, packaging (for information purposes), and reporting (for decision-making). The integrity of data can be compromised at any of these stages unless policies, procedures, and other safeguards are in place.
Our research thus far has found a number of methods that have been effective for identifying anomalies in the EVM data. Our work will culminate in a report that we plan to publish by the end of 2011. With support from AT&L, we’re hoping these methods will identify problems in the data they receive and report, ultimately leading to better decisions made by government officials and lawmakers.
Additional Resources
For more information about the SEI’s work in measurement and analysis, please visit www.sei.cmu.edu/measurement/
To read the SEI technical report, Issues and Opportunities for Improving the Quality and Use of Data in the Department of Defense, please visit www.sei.cmu.edu/library/abstracts/reports/11sr004.cfm
To read the SEI technical report, Can You Trust Your Data? Establishing the Need for a Measurement and Analysis Infrastructure Diagnostic, please visit www.sei.cmu.edu/library/abstracts/reports/08tn028.cfm
To read the SEI technical report, Measurement and Analysis Infrastructure Diagnostic, Version 1.0: Method Definition Document, please visit www.sei.cmu.edu/library/abstracts/reports/10tr035.cfm
To read the SEI technical report, Measurement and Analysis Infrastructure Diagnostic (MAID) Evaluation Criteria, Version 1.0, please visit www.sei.cmu.edu/library/abstracts/reports/09tr022.cfm
By Andrew P. Moore, Insider Threat Researcher, CERT
The 2011 CyberSecurity Watch survey revealed that 27 percent of cybersecurity attacks against organizations were caused by disgruntled, greedy, or subversive insiders (employees or contractors with access to that organization’s network, systems, or data). Of the 607 survey respondents, 43 percent view insider attacks as more costly than external attacks, citing not only financial loss but also damage to reputation, critical system disruption, and loss of confidential or proprietary information. For the Department of Defense (DoD) and industry, combating insider threat attacks is hard due to insiders’ authorized physical and logical access to organizational systems and their intimate knowledge of the organizations themselves. Unfortunately, current countermeasures to insider threat are largely reactive, resulting in information systems storing sensitive information with inadequate protection against the range of procedural and technical vulnerabilities commonly exploited by insiders. This posting describes the work of researchers at the CERT® Insider Threat Center to help protect next-generation DoD enterprise systems against insider threats by capturing, validating, and applying enterprise architectural patterns.
Enterprise architectural patterns are organizational patterns that involve the full scope of enterprise architecture concerns, including people, processes, technology, and facilities. This broad scope is necessary because insiders have authorized access to systems—not only online access but physical access as well. Our understanding of insider threat stems from a decade of experience cataloging more than 700 cases of malicious insider crime against information systems and assets, including over 120 cases of espionage involving classified national security information.
Our experience reveals that malicious insiders exploit vulnerabilities in business processes of victim organizations as often as they do detailed technical vulnerabilities. Likewise, our data analysis has identified well over 100 categories of weaknesses in enterprise architectures that allowed the insider attacks to occur. We have used this analysis to develop an insider threat vulnerability assessment method, based on qualitative models for insider IT sabotage and insider theft of intellectual property (IP) that characterize patterns of problematic behaviors seen in insider threat cases. We have also applied these models to identify insider threat best practices and technical insider threat controls.
For example, an organization must deal with the risk that departing insiders might take valuable IP with them. One set of practices and controls that helps reduce the risk of insider theft of IP is based on case data showing that most insiders who stole IP did so within 30 days prior to their forced or voluntary termination. The pattern describing this set of practices and controls helps balance the costs of monitoring employee behavior for suspicious actions with the risk of losing the organization’s intellectual property.
Organizations aware of this pattern can ensure that the necessary agreements are in place (IP ownership and consent to monitoring), critical IP is identified, key departing insiders are monitored, and the necessary communication among departments takes place. At the point at which an insider resigns or is fired, technical monitoring and scrutiny of that employee’s activities within a 30-day window of their termination date are increased. Actions taken upon and before employee termination are vital to ensuring IP is not compromised and the organization preserves its legal options.
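As a rough illustration of this practice, the sketch below flags file-transfer events by departing insiders that fall within the 30-day window before termination. The field names, threshold, and data are hypothetical; an operational implementation would draw on HR and log-aggregation systems.

```python
# Illustrative sketch of the 30-day termination-window rule described above.
# Field names, threshold, and data are hypothetical.
from datetime import date, timedelta

WINDOW = timedelta(days=30)
LARGE = 100_000_000  # bytes; hypothetical review threshold

def events_to_review(departing_employees, file_transfer_events):
    """Flag large transfers by departing insiders within 30 days of their termination date."""
    flagged = []
    for event in file_transfer_events:
        term_date = departing_employees.get(event["user"])
        if term_date is None or event["bytes"] < LARGE:
            continue  # not a departing employee, or below the review threshold
        if term_date - WINDOW <= event["date"] <= term_date:
            flagged.append(event)
    return flagged

departing = {"jdoe": date(2011, 9, 30)}  # user -> termination date
events = [
    {"user": "jdoe", "date": date(2011, 9, 12), "bytes": 2_500_000_000, "dest": "usb"},
    {"user": "asmith", "date": date(2011, 9, 12), "bytes": 1_000, "dest": "email"},
]
for e in events_to_review(departing, events):
    print("review:", e)
```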
Capturing our understanding of insider threat mitigations as architectural patterns allows us to translate effective solutions into forms useful to engineers who design DoD systems. As part of our research, we are analyzing the subset of insider IT sabotage cases from the CERT insider threat database. We are updating and refining our existing qualitative insider IT sabotage model to include a quantitative simulation capability intended to exhibit the predominant patterns of insider IT sabotage behavior.
We are using a system dynamics approach to model and analyze the holistic behavior of complex problems as they evolve over time. System dynamics modeling and simulation makes it easier for us to understand and communicate the nature of problematic insider threat behavior as an enterprise architectural concern. After validating that simulating the problem model accurately represents the historical behavior of the problem—and does so for the right reasons—the next step is to examine the enterprise-level architectural insider threat controls proposed to help mitigate it. Our research will focus on two aspects:
Are those controls effective against insider threats? For example, do the controls mitigate the problematic behavior exhibited in the simulation model?
Do those controls introduce negative unintended consequences? For example, even if the controls are effective against the threat, do they unintentionally undermine organizational trust and reduce team performance?
A key challenge in our research is the difficulty associated with testing these controls in an operational environment. One manifestation of this problem is in the form of unknown false positive rates associated with insider threat controls. From the perspective of technical observations and resource usage, most malicious insiders behave as their non-malicious counterparts do. We therefore expect that poorly-designed controls will overwhelm operators with false positives. Controls are also hard to test operationally because insider attacks occur relatively infrequently, but nevertheless result in huge damages for victim organizations.
To meet these challenges, we are using system dynamics modeling and simulation to identify and test enterprise architectural patterns to protect against insider threat to current DoD systems. We are interviewing members of the DoD who have expressed interest in information security controls to mitigate the insider threat. These steps are enabling us to characterize the baseline enterprise architecture, which represents their operational architecture as a starting point for our analysis.
Identified architectural patterns will be applied to modify the baseline architecture to better protect against insider threat. The basis for establishing the efficacy of the architectural patterns is system dynamics simulation-based testing. The experiments conducted in the simulation environment provide a body of evidence that supports strong hypotheses going into pilot testing within organizations.
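To give a flavor of this kind of simulation-based testing, the following is a toy stock-and-flow sketch comparing a hypothetical control (raising targeted monitoring after an organizational sanction) against a baseline without it. All variables, rates, and thresholds are invented for illustration; the CERT system dynamics models are substantially richer.

```python
# A toy stock-and-flow simulation in the spirit of system dynamics modeling.
# All quantities are hypothetical; only the feedback structure is illustrative.
def simulate(months=24, control_enabled=True):
    disgruntlement = 0.1   # stock: insider disgruntlement (0..1)
    monitoring = 0.1       # stock: level of targeted technical monitoring (0..1)
    risks = []
    for month in range(months):
        sanction = 0.4 if month == 6 else 0.0     # one-time precipitating event
        # Control under test: raise targeted monitoring when a sanction occurs
        if control_enabled and sanction:
            monitoring = min(1.0, monitoring + 0.6)
        # Flows: disgruntlement rises with sanctions and unmet expectations,
        # cools over time, and is damped when monitoring enables early intervention
        cooling = 0.15 * disgruntlement
        intervention = 0.2 * monitoring * disgruntlement
        disgruntlement = min(1.0, max(0.0, disgruntlement + 0.03 + sanction
                                      - cooling - intervention))
        monitoring = max(0.1, monitoring - 0.05)  # monitoring attention decays
        risks.append(round(disgruntlement * (1.0 - monitoring), 3))
    return max(risks)

print("peak sabotage risk with control:   ", simulate(control_enabled=True))
print("peak sabotage risk without control:", simulate(control_enabled=False))
```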
Enterprise architectural patterns developed through our research will enable coherent reasoning about how to design—and to a lesser extent implement—DoD enterprise systems to protect against insider threat. Instead of being faced with vague security requirements and inadequate security technologies, DoD system designers will have a coherent set of architectural patterns they can apply to develop effective strategies against insider threat in a more timely and confident manner. Confidence in these patterns will be enhanced through our use of established theories in related areas and the scientific approach of using system dynamics simulation models to test key hypotheses prior to pilot testing. We expect our research results will improve DoD enterprise, system, and software architectures to reduce the number and impact of insider attacks on DoD information assets.
We will be periodically blogging about the progress of this work. Please feel free to leave your comments below and we will reply.
Additional Resources:
For more information about the work of the CERT Insider Threat Center, please visit www.cert.org/insider_threat/
To read a report about preliminary technical controls derived from insider threat data, Deriving Candidate Technical Controls and Indicators of Insider Attack from Socio-Technical Models and Data, please visit www.cert.org/archive/pdf/11tn003.pdf
To read a report about our insider threat modeling, A Preliminary Model of Insider Theft of Intellectual Property, please visit www.cert.org/archive/pdf/11tn013.pdf
To read the CERT Insider Threat blog, please visit www.cert.org/blogs/insider_threat/
Part 2: SEI R&D Activities Related to Sustaining Software for the DoD
By Douglas C. Schmidt, Deputy Director, Research, and Chief Technology Officer
Software sustainment is growing in importance as the inventory of DoD systems continues to age and greater emphasis is placed on efficiency and productivity in defense spending. In part 1 of this series, I summarized key software sustainment challenges facing the DoD. In this blog posting, I describe some of the R&D activities conducted by the SEI to address these challenges.
Primary Sustainment Activities
The term software sustainment is often used synonymously with software maintenance. Sustaining software for the DoD, however, requires attention to certain issues (such as operations and training) that are less essential in commercial software maintenance. There are four primary categories of software sustainment activities:
Corrective sustainment diagnoses and corrects software errors after release.
Perfective sustainment upgrades existing software to support new capabilities and functionality.
Adaptive sustainment modifies software to interface with changing environments.
Preventive sustainment modifies software to improve future maintainability or reliability.
SEI Sustainment R&D
The software engineering research community has devised various approaches to improve software sustainment. For example, tools for detecting software modularity violations help identify eroding design structure (referred to whimsically as "bad code smells") so the code can be refactored to enhance its sustainability. Likewise, intelligent automated regression testing frameworks help ensure that changes to legacy software work as required and that unchanged parts have not become less dependable.
SEI sustainment strategies. Over the past several decades, the SEI has created methods and guidelines for sustaining, migrating, and evolving legacy systems. For example, the SEI has devised strategies for modernizing legacy systems and reusing legacy components in service-oriented architecture (SOA)-based systems. These strategies employ risk-managed, incremental approaches that encompass changes in software technologies, engineering processes, and business practices. In addition, the SEI has created techniques for measuring the effectiveness of software sustainment practices. These techniques can be used to help decision-makers choose between continued sustainment and replacement, or decide which redundant legacy systems to keep and which to retire.
Software product lines. Legacy DoD systems comprise a wide range of software variations, such as network, hardware, and software configurations; different algorithms; and different security profiles. This variation is a key driver of total ownership costs because it impacts the time and effort required to assure, optimize, and manage system deployments and configurations throughout the lifecycle. To manage this variation effectively, the SEI helped pioneer software product lines (SPLs), which have been applied in DoD systems to manage software variation while reusing large amounts of code that implement common features needed within a particular domain. Software sustainment costs (particularly SPL testing) for an SPL-based family of systems can be reduced because reusable components in the SPL are maintained and validated in one place, instead of separately within each application.
Team Software Process. The Team Software Process (TSP) is another approach pioneered by the SEI that managers and engineers can use to sustain legacy software projects. TSP is a team-centric, time-boxed approach to developing software. By using TSP, organizations can better plan, measure, and improve software development productivity so they have more confidence in sustainment quality and cost estimates. TSP has been applied successfully to manage software sustainment in large-scale weapons systems for the U.S. Air Force, as well as in other DoD and industry organizations.
Software architecture. The SEI has also focused extensively on software architecture, which comprises the structure of the software elements in a system, the externally visible properties of those elements, and the relationships among them. SEI research has shown that a solid understanding of software architecture—and the associated methods, infrastructure, and tools—is essential to modify and improve software-reliant systems correctly, dependably, rapidly, and cost effectively throughout the lifecycle. Likewise, successful sustainment of software-reliant DoD systems requires techniques and tools for evaluating and improving software engineer and manager competence with respect to software architecture, including the following:
Understanding, analyzing, and engineering tradeoffs among system properties (such as performance, dependability, and security) that are critical to achieving desired levels of quality in software-reliant systems as they evolve. These properties are quality attributes that determine system viability throughout the sustainment phase.
Using architecture-centric practices to elicit quality attribute requirements and to design and analyze changes that are needed throughout sustainment of systems at all scales. Architecture-centric practices can be used to plan system releases and address sustainment challenges pertaining to integration and operational problems due to inconsistencies between system and software architectures.
Applying architecture principles for systems-of-systems and ultra-large-scale systems to develop architecture design and analysis principles that help document and account for socio-technical interactions, decentralized control, and continuous evolution and sustainment environments where failures/changes are the norm. For example, some soldiers or support staff on the battlefield are capable of creating or modifying existing systems in response to needs that were not anticipated by the designers of the original systems.
SEI assessments, workshops, and red teaming. The SEI regularly works with DoD programs to conduct independent technology assessments, reviews, and "red teams" that apply many of the methods and approaches described above to review the planning for—and conducting of—sustainment of DoD systems. For example, architecture practices such as the Architecture Tradeoff Analysis Method (ATAM) can help DoD programs elicit stakeholder input to identify likely long-term sources of change throughout the sustainment phase.
The SEI’s experience helping DoD programs transition from the production phase of acquisition to the sustainment phase of acquisition indicates that the DoD often focuses on how its contracts and contractors will change rather than on how its program offices will need to change. The SEI helps acquisition programs plan for these transitions to sustainment and has collected lessons learned from these activities into software acquisition planning guidelines (including Guideline #4: Software Sustainment). An interesting trend is that DoD programs are increasingly interdependent and interoperable, leading to sustainment interdependencies that require new coordination. To address this need, the SEI developed interoperable acquisition workshops to bring program offices together and draft plans that address sustainment.
Information assurance and software security. Increasing requirements for interdependence and interoperability also yield new challenges for information assurance and software security in legacy systems. In particular, many legacy systems were developed as isolated enclaves. With the advent of net-centric systems-of-systems, however, these legacy enclaves are being interconnected in ways that subject them to vulnerabilities not anticipated by their original designers.
For example, legacy systems programmed in languages like C may contain buffer overflow vulnerabilities that are not exposed until the systems are connected to a network. Moreover, maintainers may not resolve these types of vulnerabilities correctly. They might, for instance, simply add input validation to eliminate a particular path to a buffer overflow vulnerability rather than remove the out-of-bounds write.
The CERT Secure Coding Team works with developers and maintainers to eliminate these and other types of vulnerabilities by establishing secure coding standards and processes for conformance testing against these standards. Likewise, the CERT Vulnerability Analysis Team can use an analysis of vulnerabilities based on secure coding rule violations to help handle the response. Legacy software systems can also undergo conformance testing against a secure coding standard in the CERT Source Code Analysis Laboratory (SCALe) to detect and eliminate vulnerabilities before the software is deployed. SCALe has also been used by DoD program offices to assess the quality of legacy code to inform modernization versus replacement decisions.
Related SEI Blog Posts
SEI researchers have written several blog postings that are relevant to the sustainment of software-reliant DoD systems. For example, Rick Kazman’s posting on Measuring the Impact of Explicit Architecture Documentation focused on understanding the value of documenting software architectures for complex, software-reliant systems. Thorough software architecture documentation helps engineers who sustain DoD software understand how they can refactor, maintain, and update the software without introducing new defects or degrading existing capabilities.
Ipek Ozkaya’s posting on Enabling Agility by Strategically Managing Architectural Technical Debt examined how metrics extracted from the code and module structures of software can help repay technical debt, which is a conceptual framework for understanding how and when to defer design choices during the planning or execution of a software project. Repaying technical debt via refactoring and re-architecting is an effective strategy to alleviate architectural dependencies that impact system-wide architectural rework and minimize software decay during sustainment.
Steve Rosemergy’s posting on A Framework for Evaluating Common Operating Environments described a framework for exploring the interdependencies among common language, business goals, and software architecture when evaluating the sustainability of proposed software solutions.
We Want to Hear Your Thoughts
This post has just scratched the surface of the solutions that meet the challenges of sustaining software-reliant DoD systems. While the SEI has expertise in methods and tools related to software sustainment, the DoD faces deeper and broader challenges than any one organization (or blog post) can address. We welcome your feedback in the comments section below on ways to improve the technologies and ecosystems needed to sustain DoD software effectively.
Additional Resources:
More information about sustaining software-reliant DoD systems is available below.
To read about software sustainment practices for the DoD, please visit www.stsc.hill.af.mil/resources/tech_docs/gsam4.html, especially chapter 16.
To read about the SEI’s work in software architecture, please visit www.sei.cmu.edu/architecture
To read about the SEI’s work with the Team Software Process (TSP), please visit www.sei.cmu.edu/tsp
To read about the SEI’s work in Software Product Lines, please visit www.sei.cmu.edu/productlines
To read about the SEI’s work in system of systems and SOA, please visit www.sei.cmu.edu/sos
To read about the SEI’s work on Ultra-Large-Scale Systems, please visit www.sei.cmu.edu/uls
To read about the SEI CERT’s work in secure coding, please visit www.cert.org/secure-coding/
By Bill Novak, Senior Member of the Technical Staff, SEI Acquisition Support Program, Air Force Team
This is the fourth in an ongoing series examining themes across acquisition programs.
Background: Over the past decade, the U.S. Air Force has asked the SEI’s Acquisition Support Program (ASP) to conduct a number of Independent Technical Assessments (ITAs) on acquisition programs related to the development of IT systems; communications, command and control; avionics; and electronic warfare systems. This blog posting is the latest installment in a series that explores common themes across acquisition programs that we identified as a result of our ITA work. Previous themes explored in this series include Misaligned Incentives, The Need to Sell the Program, and The Evolution of "Science Projects." This post explores the fourth theme: common infrastructure and joint programs, which describes a key issue that arises when multiple organizations attempt to cooperate in the development of a single system, infrastructure, or capability that will be used and shared by all parties.
The Fourth Theme: Common Infrastructure and Joint Programs
This theme focuses on joint programs, which are popular for the potential they offer to reduce costs and improve interoperability. Joint programs are also recognized, however, as being hard to manage successfully for many reasons, including the number of stakeholders, organizational size and complexity, differing organizational goals, interoperability challenges, geographical separation, coordination overhead, and communication issues.
There are other types of programs that may not technically be joint programs, but which have similar characteristics. For example, a common infrastructure system, such as an enterprise-wide IT system, is similar to a joint program. Both often try to replace a set of isolated, yet related, existing capabilities with a single new system that offers an integrated capability that is the union of the existing capabilities—in the process both modernizing the capability and making it more efficient to develop and maintain.
To explore the issues of common infrastructure and joint programs more closely, consider a scenario that aggregates the experiences of some joint programs the SEI has worked with:
A joint program office has several stakeholder programs that are planning to use the joint infrastructure software being developed, but each program demands that at least one major feature be added to the software just for them. The joint program manager agrees to the additional requirements, for fear of losing stakeholders (who could always build their own custom software). The additional design and coding changes that are needed significantly increase the total program cost, schedule, complexity, and risk. As the schedule now begins to slip, one program decides to leave the joint program and develop its own custom software instead. With one stakeholder gone, the amortized costs for the other programs increase further—and so another program leaves. As cost escalates, participation in the joint program begins to unravel and may ultimately collapse.
Many problems we’ve seen in acquisition programs belong to a category known as "social dilemmas," in which planned cooperation can turn into opposition. Garrett Hardin’s 1968 article "The Tragedy of the Commons" describes one of the most famous types of social dilemmas (the scenario above is such an example). The "Tragedy of the Commons" can be summed up simply: an individual desires an immediate benefit that will cost everyone else—and if all succumb to the same temptation, everyone is worse off. In the case of the joint program, the stakeholders each want custom features—but if they all demand them, it drives up cost, schedule, and risk, and everyone is worse off.
Social dilemmas are inherently hard to fix, which is why they persist not only in acquisition, but also in aspects of public policy, economics, sociology, and many other areas. Nonetheless, researchers have identified a range of solutions and mitigations that can be applied. For example, one approach for resolving many instances of the "Tragedy of the Commons" dilemma is privatization, which removes the social aspect of a social dilemma by converting shared ownership (with diffused responsibility) into private ownership (with sole responsibility), so that each owner now has a strong incentive to properly care for what they own. Privatization, however, may defeat the intent of achieving the original objectives (in this case cost savings and interoperability) through cooperation. In the joint program scenario, for example, it would mean that each of the stakeholder programs would build their own custom system, which can be prohibitively costly and time consuming.
An alternative solution might be "altruistic punishment," where cooperating participants can penalize uncooperative participants in some way, to encourage them to cooperate—even if the penalty costs the cooperators, and may produce no immediate direct gain for them. The cost of imposing the penalty prevents its overuse, making it self-correcting. Research by Fehr and Gachter has found that cooperation flourishes when altruistic punishment is present, and can break down if it is not.
Altruistic punishment might incentivize stakeholder programs to stay with the joint program, despite the difficulties. If it were unsuitable in a given situation, such as a joint program, other solutions to the "Tragedy of the Commons" dilemma still exist, including assurance contracts, rewards and penalties, building trust, and exclusion mechanisms. Elinor Ostrom’s Nobel prize in Economics in 2009 acknowledged her extensive work on how people create successful institutions to manage common resources. The choice of the best solution will depend on the specific circumstances of the program.
The SEI is exploring ways to model acquisition program behavior, such as the joint program scenario discussed above, to help analyze, predict, and ultimately manage the effects of various specific solution approaches on program outcome. As this work progresses, a key aspect will be how to best leverage this work in a form that's most helpful to the acquisition community. We know that acquisition leaders may be inexperienced with certain types of decision-making and may also be unfamiliar with some unique complexities of software-reliant acquisition programs—especially joint programs. Moreover, we know that conventional training may not be fully effective in preparing decision-makers for dealing with dynamically complex domains.
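As a rough illustration of the dynamic being modeled, the toy calculation below shows how per-program cost shares can push stakeholders out of a joint program once custom features inflate the total cost. The cost figures are hypothetical; only the feedback structure matters.

```python
# A toy model of the joint-program unraveling dynamic in the scenario above.
# All cost figures are hypothetical.
def simulate(num_programs, core_cost, feature_cost, go_it_alone_cost, features_each):
    """Each program leaves when its share of the joint cost exceeds building its own system."""
    participants = num_programs
    while participants > 0:
        total = core_cost + feature_cost * features_each * participants
        share = total / participants
        if share <= go_it_alone_cost:
            return participants          # stable: staying is cheaper than leaving
        participants -= 1                # a stakeholder departs (and its features are dropped)
    return 0

# Hypothetical numbers: the shared core costs 150, each custom feature adds 35,
# and any single program could build its own system for 60.
print("no custom features:", simulate(5, 150, 35, 60, features_each=0))  # remains 5
print("one feature each:  ", simulate(5, 150, 35, 60, features_each=1))  # unravels to 0
```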
What acquisition leaders need is experience in complex decision-making, such as they might develop over decades of experience with actual acquisition programs. To accelerate this learning process, we plan to create interactive experiential learning tools, which are essentially "flight simulators" for acquisition professionals that address these types of situations. These learning tools are key since actively learning through experience produces better understanding and superior retention of the knowledge. With such an approach, we believe it will be possible to improve the decision-making abilities of acquisition program staff, thereby achieving more successful program outcomes.
Additional Resources:
For more information about the SEI's Acquisition Support Program, please visit www.sei.cmu.edu/acquisition.
By Sagar Chaki, Senior Member of the Technical Staff, Research, Technology, and System Solutions
Malware, which is short for "malicious software," consists of programming aimed at disrupting or denying operation, gathering private information without consent, gaining unauthorized access to system resources, and other inappropriate behavior. Malware infestation is of increasing concern to government and commercial organizations. For example, according to the Global Threat Report from Cisco Security Intelligence Operations, there were 287,298 "unique malware encounters" in June 2011, double the number of incidents that occurred in March. To help mitigate the threat of malware, researchers at the SEI are investigating the origin of executable software binaries that often take the form of malware. This posting augments a previous posting describing our research on using classification (a form of machine learning) to detect "provenance similarities" in binaries, which means that they have been compiled from similar source code (e.g., differing by only minor revisions) and with similar compilers (e.g., different versions of Microsoft Visual C++ or different levels of optimization).
Evidence shows that the majority of malware instances derive from a relatively small number of common origins. For example, a 2006 Microsoft Security Intelligence report revealed that the 25 most common families of malware account for more than 75 percent of the detected malware instances. Compounding this problem is the fact that current malware analysis tools are either manual (requiring extensive time and effort on the part of malware analysts) or automated but less accurate (producing high false-positive or false-negative rates) or inefficient. In contrast, our approach involves
creating a training set using a sample of binaries
using the training set to learn (or train) a classifier
using the classifier to predict similarity of other binaries
I, along with my colleagues—Arie Gurfinkel, who works with me in the SEI’s Research, Technology, and System Solutions Program, and Cory Cohen, a malware analyst with CERT—felt that classification was appropriate for a binary similarity checker because this form of machine learning is particularly well suited to instances where closed-form solutions are hard to develop, and a solver can be "trained" using a training set composed of positive and negative examples.
While malware classification is a major aim of provenance-similarity research, there are two main hurdles to applying classification directly to malware binary similarity checking:
Classification must be applied to parts of the malware where similarity is expected to manifest most directly. For this research, we decided to apply classification to functions. Intuitively, a function is a fragment of a binary obtained by compiling a source-level procedure or method. Functions are the smallest externally identifiable units of behavior generated by compilers. Similarity at the function level is an indicator of overall similarity between two binaries. For example, malware binaries that originated from the same family will rarely be identical everywhere; instead, they will share important functions.
It is hard to develop training sets from malware due to the lack of information on source code and generative compilers. Our research therefore focuses on evaluating open-source software. We believe that a classifier that effectively detects provenance-similarity in open-source functions will also be effective on malware functions because the variation we are targeting (due to changes in source code and compilers) is largely independent of the software itself. For example, the variation introduced by a different compiler version (e.g., introducing stack canaries to detect buffer overflows at runtime) is the same, regardless of whether the source code being compiled is malware or open-source.
More specifically, we selected approximately a dozen C/C++ open-source projects from SourceForge.net and compiled them to binaries using Microsoft Visual C++ 2005, 2008, and 2010. We then extracted functions from the binaries using IDA Pro, which is a state-of-the-art disassembler, and constructed a training set and a testing set from the functions using a tool that we developed atop the Rose compiler infrastructure. Next, we learned a classifier from the training set using the Weka framework. When it comes to classification, the following two main decisions must be considered:
What classifier are you going to use?
What kind of attribute are you going to use?
We measured the effectiveness of a classifier in terms of two quantities: (1) its F-measure, which is a real number between 0 and 1 that indicates the overall classifier accuracy, and (2) the time required to train the classifier. There is a tradeoff between the two quantities: an F-measure can be increased by using a larger training set, but the training time also increases. We empirically found that the RandomForest classifier was the most effective Weka classifier for our purposes since it has the best F-measure for the same training time.
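The following is a minimal sketch of this train-and-evaluate loop, using scikit-learn's RandomForestClassifier as a stand-in for the Weka RandomForest classifier used in our experiments. The attribute vectors and similarity labels are randomly generated placeholders rather than real function data, and the F-measure here is F1, i.e., 2 × precision × recall / (precision + recall).

```python
# Minimal sketch of training a RandomForest classifier and computing its F-measure.
# scikit-learn stands in for Weka; the data below are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Each row: attributes computed from a pair of functions; label 1 = provenance-similar
X = rng.random((2000, 40))
y = (X[:, :5].mean(axis=1) > 0.5).astype(int)   # synthetic labeling rule

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = RandomForestClassifier(
    n_estimators=100,     # number of trees: more trees can raise the F-measure but lengthen training
    max_features="sqrt",  # number of attributes considered at each split
    random_state=0,
)
clf.fit(X_train, y_train)

print("F-measure:", round(f1_score(y_test, clf.predict(X_test)), 3))
```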
We repeated the experiment several times with different randomly constructed training and testing sets. To determine the robustness of our results, we repeated our experiments using a different set of open-source software and different versions of Microsoft Visual C++. The results were consistent in all cases, with the F-measures being around 0.95 for RandomForest. This finding is encouraging since it indicates that a provenance-similarity detector based on RandomForest will produce the correct result in more than 95 percent of the cases. We believe that this accuracy is sufficient for use in practical malware analysis situations.
Next, we experimented with various parameters of RandomForest to observe how these parameters affect the tradeoff between its F-measure and its training time. In particular, we focused on two important parameters: the number of trees and the number of attributes. For each parameter, we experimented with different values and measured how the F-measure vs. training time tradeoff changed.
To further improve and evaluate our approach, we developed a suite of the following types of attributes:
Semantic attributes, which capture the effect of a binary’s execution on specific components of the hardware state, such as registers and memory locations.
Syntactic attributes, which are derived from n-grams and n-perms and represent groups of instruction opcodes that occur contiguously in the binary.
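As an illustration of the syntactic attributes, the sketch below counts contiguous opcode n-grams for a single function; the opcode sequence shown is hypothetical rather than taken from a real disassembly.

```python
# Illustrative extraction of syntactic n-gram attributes from a function's opcode sequence.
from collections import Counter

def opcode_ngrams(opcodes, n=3):
    """Count contiguous opcode n-grams; each distinct n-gram can serve as a classifier attribute."""
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))

# Hypothetical opcode sequence for one function
function_opcodes = ["push", "mov", "sub", "mov", "call", "test", "jz", "mov", "call", "ret"]
for gram, count in opcode_ngrams(function_opcodes).most_common(3):
    print(" ".join(gram), "->", count)
```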
We re-evaluated the effectiveness of the classifier using these two types of attributes and concluded that semantic attributes yield better F-measures, but are more expensive to compute than syntactic attributes. Attribute extraction is inherently parallelizable, however, since it is done independently for each function. A rough estimate is that a modern CPU can extract semantic attributes from about 10,000 functions in the CERT catalog every day. Based on this estimate, extracting attributes from malware samples as they are discovered each day is feasible with a modestly sized CPU farm.
We had several false starts along the way. For example, we originally used text files for all of our input and output, which was slow and unwieldy. We therefore decided to store inputs and outputs in a database, which simplified our tools and accelerated our experiments. Another lesson learned was to handle statistical issues and randomness carefully. Since the set of all possible training and testing samples is large, we had to pick random subsets for our experiments. In some cases, we also had to label the samples in a random—yet deterministic—manner so that each sample had a randomly assigned label that stayed the same across all experiments. Constructing a labeling scheme that was both random and deterministic required extra care.
While determining the similarities between binary functions remains a challenge, the preliminary results from our research were presented in a well-received paper at the 2011 Knowledge Discovery and Data Mining (KDD) conference. Our malware research has also studied fuzzy hashing and sparse representation. Our future research will explore other ways of detecting similarities between functions, including the use of static analysis.
Additional Resources:
For additional details, or to download benchmarks and tools that we have developed and are using as part of our project, please visit www.contrib.andrew.cmu.edu/~schaki/binsim/.
To listen to the CERT podcast, Building a Malware Analysis Capability, please visit www.cert.org/podcast/show/20110712gennari.html
To read other SEI blog posts relating to our malware research, please visit http://blog.sei.cmu.edu/archives.cfm/category/malware
By Donald Firesmith, Senior Member of the Technical Staff, Acquisition Support Program
This blog post is the third and final installment in a series exploring the engineering of safety- and security-related requirements.
Background: In our research and acquisition work on commercial and Department of Defense (DoD) programs, we see many systems with critical safety and security ramifications. With such systems, safety and security engineering are used to manage the risks of accidents and attacks. Safety and security requirements should therefore be engineered to ensure that residual safety and security risks will be acceptable to system stakeholders. The first post in this series explored problems with quality requirements in general and safety and security requirements in particular. The second post took a deeper dive into key obstacles that acquisition and development organizations encounter concerning safety- and security-related requirements. This post introduces a collaborative method for engineering these requirements that overcomes the obstacles identified in earlier posts.
Anyone involved in building safety- and security-critical systems needs to consider the following:
Are you building a safety-critical system or one that must be secure from attack?
Do your safety and security engineers begin their work only after the architecture is engineered, rather than building it in from the start via safety- and security-related requirements?
Do your safety and security engineers develop their work products (documents and models) independently of each other and requirements engineers?
Do your requirements specifications largely ignore safety, security, or both?
Are many of your safety and security requirements so general that they are meaningless, such as "The system shall be safe and secure from attack"?
Are most of your safety- and security-related requirements merely architecture and design constraints that prevent safety- and security-engineers from collaborating with architects to create innovative solutions?
Is use-case modeling or structured analysis your primary or only requirements-analysis method, even when engineering safety- and security-related requirements?
If you answer yes to any of these questions, then your safety, security, and requirements engineers can benefit from a better way of engineering their requirements. To achieve this goal, an appropriate safety- and security-requirements analysis method is needed.
We propose using the Engineering Safety- and Security-related Requirements (ESSR) method, which consists of the following analysis-based tasks.
Stakeholder analysis determines the stakeholders who have a vested interest in the safety and security of the system and the appropriate sources for eliciting safety and security goals and requirements. Safety- and security-engineers collaborate to identify the safety- and security-related stakeholders in the system and the assets that the system must defend from accidental and malicious harm. These stakeholders are modeled by producing stakeholder profiles and creating an initial partial list of the stakeholder’s safety- and security-goals.
Asset analysis determines the assets that must be protected from unauthorized harm and the harm that these assets must be protected from. Safety- and security-engineers collaborate to identify the assets that the system must protect from harm. They model each defended asset by categorizing it, determining its value, identifying the types and severities of harm that it may suffer, and determining its stakeholders.
Abuse analysis examines the ways that the system and the assets for which the system is responsible can be abused. Specifically, this task identifies the different types of abuses, including safety mishaps (accidents and safety incidents) and security misuses (attacks and security incidents) that can occur. Abuse analysis also identifies which assets the abuses can harm, in what manner, and to what degree. Safety- and security-engineers model these abuses using appropriate techniques (e.g., abuse case modeling, attack trees) and create abuse profiles.
Vulnerability analysis determines the existence of the system-internal weaknesses or defects that can enable abuses (mishaps and misuses) to occur. Safety- and security engineers identify the credible potential system-internal vulnerabilities (e.g., defects and weaknesses) that could enable the abuses that may harm the defended assets. They also model these vulnerabilities using appropriate techniques such as STAMP-Based Process Analysis (STPA), Event Tree Analysis (ETA), Fault Tree Analysis (FTA), or Failure Modes and Effects Analysis (FMEA).
Abuser analysis determines the system-external people and things that can accidentally or maliciously abuse the system and the assets that it must defend from unauthorized harm. Safety- and security engineers identify the credible potential abusers that could exploit the vulnerabilities and thereby cause the abuses that may harm the defended assets. They model these abusers using appropriate techniques (e.g., STPA, abuse case modeling, task analysis, or user profiling).
Danger analysis determines the dangers (i.e., safety hazards and security threats), which are cohesive sets of conditions involving the existence of abusers, vulnerabilities, and assets that could increase the probability of abuses occurring. When restricted to safety, danger analysis is often called hazard analysis; when restricted to security, it is often called threat analysis. Safety- and security engineers model these safety hazards and security threats using appropriate techniques (e.g., operator task analysis, ETA, FTA, and FMEA).
Risk analysis determines the maximum acceptable residual safety and security risks as well as the specific types of assets, harm, vulnerabilities, abusers, and dangers that are associated with these risks. Safety- and security engineers model these risks using appropriate techniques (e.g., calculating risk level as the product of probability and harm severity, using degrees of software control instead of probabilities, and risk matrices); a brief illustrative sketch of this calculation appears after the method description below.
Safety- and security-significance analysis identifies the goals and requirements that have safety and security ramifications so the corresponding parts of the system can be implemented using a process having the appropriate level of rigor and completeness, e.g., to justify the use of a more powerful (and therefore more expensive) development process. Safety- and security engineers categorize requirements into safety/security assurance levels (SALs), such as safety-critical and security-critical, based on the degree to which the requirements have safety and security ramifications. They collaborate with requirements engineers to update the requirements repository by annotating requirements with their SALs. Based on how these categorized requirements are allocated to architectural components, they assign the components safety/security evidence assurance levels (SEALs) that determine the degree of completeness and rigor to be used when architecting, designing, implementing, integrating, and testing these components. In other words, components with high SEALs should be as small as practical to minimize the increased effort, cost, and schedule needed to develop them. Finally, they update the certification repository with the results of safety- and security-significance analysis.
Defense determination determines the appropriate defenses (i.e., controls including safeguards and security countermeasures) that are needed to defend the system and its associated defended assets from unauthorized harm. Safety- and security engineers perform a gap analysis to identify potential new defenses. They then evaluate these potential defenses using appropriate techniques (e.g., engineering analyses, product and vendor trade studies).
Where appropriate (except for the safety- and security-significance analysis task), safety- and security engineers create safety and security goals for each type of analysis and then collaborate with the requirements engineers to transform these goals into requirements to prevent, detect, and react to the associated abuses. They then update the certification repository with the results of the analysis. Also, where appropriate, they collaborate with requirements engineers to transform informal constraints into official requirements. Finally, where appropriate, this information is stored in the certification repository to eventually support the system’s safety and security accreditation and certification.
The above tasks result in the engineering of multiple types of associated safety and security requirements (e.g., prevention, detection, and reaction requirements as well as safety and security constraints). All such possible requirements, however, are rarely appropriate for most systems. The harm severity and likelihood of the associated mishaps and misuses may not justify the cost of producing and using the resulting safety- and security-defenses. Some requirements make others unnecessary, e.g., a requirement preventing the existence of a vulnerability may eliminate the need for a requirement to prevent an abuse enabled by that vulnerability. On the other hand, high-level requirements associated with the early analysis steps (e.g., prevent harm to a defended asset) may be used to derive lower-level requirements associated with later analyses steps (prevent vulnerability that enables abuse to harm the defended asset).
The tasks of ESSR described above are best performed in an evolutionary (i.e., incremental, iterative, and concurrent) manner. Due to the evolutionary nature of ESSR, the temporal ordering of the preceding sequence of analyses is merely a logical simplification to improve understandability; a waterfall approach to safety and security is neither intended nor recommended. Safety engineers, security engineers, and requirements engineers should also perform these tailorable tasks in a collaborative manner. At the end of this process, comprehensive safety and security analyses will have been performed and documented, safety and security goals will have been turned into their corresponding requirements, and the certification repository will contain the analysis- and requirements-related safety and security evidence needed for accreditation and certification.
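The sketch below illustrates, in simplified form, how the risk analysis and safety- and security-significance analysis tasks might combine likelihood and harm severity into a risk level and map it to an assurance level. The scales, thresholds, and level names are hypothetical and not part of the ESSR method itself.

```python
# Illustrative only: risk level as likelihood x harm severity, mapped to a hypothetical SAL scale.
HARM_SEVERITY = {"negligible": 1, "marginal": 2, "critical": 3, "catastrophic": 4}
LIKELIHOOD = {"improbable": 1, "remote": 2, "occasional": 3, "frequent": 4}

def risk_level(likelihood, severity):
    """Risk score on a 1-16 scale (likelihood rank times severity rank)."""
    return LIKELIHOOD[likelihood] * HARM_SEVERITY[severity]

def assurance_level(score):
    """Map a risk score to a hypothetical safety/security assurance level (SAL)."""
    if score >= 12:
        return "SAL 4 (safety/security-critical)"
    if score >= 6:
        return "SAL 3 (safety/security-significant)"
    if score >= 3:
        return "SAL 2"
    return "SAL 1"

score = risk_level("occasional", "catastrophic")
print(score, "->", assurance_level(score))   # 12 -> SAL 4 (safety/security-critical)
```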
The preceding ESSR method for collaboratively engineering safety- and security-related requirements is described in considerably more detail in tutorials, a class, and a book to be published early in 2012.
Additional Resources:
For more information, please visit www.sei.cmu.edu/library/abstracts/presentations/icse-2010-tutorial-firesmith.cfm
By Felix Bachmann, Senior Member of the Technical Staff, Research, Technology, and System Solutions
Bursatec, the technology arm of Grupo Bolsa Mexicana de Valores (BMV, the Mexican Stock Exchange), recently embarked on a project to replace three existing trading engines with one system developed in house. Given the competitiveness of global financial markets and recent interest in Latin American economies, Bursatec needed a reliable and fast new system that could work ceaselessly throughout the day and handle sharp fluctuations in trading volume. To meet these demands, the SEI suggested combining elements of its Architecture-Centric Engineering (ACE) method, which requires effective use of software architecture to guide system development, with its Team Software Process (TSP), which teaches software developers the skills they need to make and track plans and produce high-quality products. This posting—the first in a two-part series—describes the challenges Bursatec faced and outlines how working with the SEI and combining ACE with TSP helped them address those challenges.
Challenges
The team of Bursatec software architects faced a significant challenge in designing their new trading system: only one team member had significant experience in designing a financial software system. We felt the ACE methods would help the team better understand what software architecture means, particularly when thinking about abstractions and solving quality attribute problems. Another complicating factor was that Bursatec wanted to combine stock market trading with derivative market trading on the same platform to reduce operating costs and provide a single, high-throughput, low-latency, high-confidence interface to external financial markets.
Getting Started
One of our first steps was to conduct a Quality Attribute Workshop in which the Bursatec stakeholders defined the five most important quality attribute requirements (also known as quality attribute scenarios) their new trading system had to fulfill. To guide the system design, the Bursatec architecture team used the Attribute-Driven Design (ADD) method. ADD is a decomposition method based on transforming quality attribute scenarios into an appropriate design.
Not surprisingly, given the importance of speed for the new system, the stakeholders identified runtime performance as one of the most important quality attribute scenarios. The performance quality attribute scenario, coupled with high availability requirements, led the team to realize that conventional approaches, such as a three-tier architecture, were not the best solution for their new system. Consequently, the architecture team spent the next two weeks exploring various solutions, as well as the potential negative outcomes of each proposed solution.
At the end of the two weeks, using rigorous Architecture Tradeoff Analysis Method (ATAM) techniques, the team of Bursatec architects had to present its findings—as well as evidence (including measures) that its chosen approach was correct—to an SEI software architecture coaching team that challenged each scenario. Every two weeks thereafter, the Bursatec software architects had to present solutions, with appropriate evidence, for the scenarios they had created. For example, with respect to the performance requirement, the team demonstrated how a stock order would traverse the system, estimating and measuring the timing required for every step. With each review, SEI coaches identified risks associated with a particular approach. For example, the team identified one risk with respect to performance: synchronizing with backup systems would throw off the timing. In all, there were three iterations of the architecture, each lasting six weeks.
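To make this kind of evidence concrete, the following minimal sketch (in Python, with hypothetical step names and microsecond figures rather than Bursatec's actual data) sums per-step latencies for an order and compares the total against an assumed response-time budget:

# Hypothetical latency budget for a stock order traversing the system.
# Step names and microsecond figures are illustrative only.
steps_us = {
    "gateway_receive": 40,
    "validate_order": 25,
    "match_engine": 120,
    "persist_to_log": 60,
    "publish_confirmation": 35,
}
budget_us = 500  # assumed response-time goal from the quality attribute scenario

total_us = sum(steps_us.values())
print(f"estimated end-to-end latency: {total_us} us (budget {budget_us} us)")
for step, cost in sorted(steps_us.items(), key=lambda kv: -kv[1]):
    print(f"  {step}: {cost} us ({100 * cost / total_us:.0f}% of total)")
if total_us > budget_us:
    print("budget exceeded -- revisit the riskiest steps")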
At the start of the second iteration, the SEI software architecture coaches brought in the team of Bursatec developers to begin working on prototypes, specifically focusing on risks (such as the timing of querying complex data structures) that could not be addressed solely via software architecture. This important step allowed developers to deepen their understanding of the architecture and familiarize themselves with the problem, which was itself a lengthy process. The developers had six weeks to implement the prototypes; at the beginning of the third iteration the developers returned and presented their results to the architecture team. This process enabled the architects to finalize their architecture design using the results from the prototypes.
An interesting benefit of this style of architectural coaching was that the Bursatec architects used Enterprise Architect (a Unified Modeling Language-based tool) from the outset to document, evaluate, and justify their solutions. Although architecture documentation is often an afterthought, it became second nature to the Bursatec architects. The architects focused only on the documentation that was either needed to provide sufficient evidence that the system would support the quality attribute requirements or required by the developers to effectively implement prototypes and the subsequent system.
Improving Delivery with TSP
The Team Software Process (TSP) is a team-centric approach to developing software that enables organizations to better plan and measure their work and to improve software development productivity, giving them greater confidence in quality and cost estimates. Our coaches emphasized the incorporation of TSP principles throughout the architecture design process with Bursatec. The use of TSP enabled the Bursatec architects to prepare, estimate, and track their work. In this case, the Bursatec architects were also able to time-box their iterations, an approach that the SEI finds effective. These activities initially proved challenging because TSP is oriented more toward programming, so the measures employed by developers typically apply to lines of code, classes, requirements pages, or other tangible, implementation-oriented measures. To create a measurable work unit, the Bursatec architects used the quality attribute scenarios as a size measure.
The SEI architecture team recognized that each quality attribute scenario would be refined into about five more detailed quality attribute scenarios that address special cases. The SEI team also recognized that the Bursatec architects would have to create at least three to five diagrams and descriptions to fulfill each scenario. The Bursatec team then estimated how long it would take to create each diagram with a description. This measure proved a good tool for determining how long it would take to complete the architecture, something that has been hard to estimate in our prior work with organizations. This approach to measuring and estimating work allowed the Bursatec architects to provide accurate estimates of deadlines to their management team.
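As a rough illustration of this estimation logic (the counts and hours below are invented, not the Bursatec team's actual figures), the calculation reduces to multiplying scenarios, refinements, and diagrams:

# Hypothetical architecture-size estimate in the style described above.
top_level_scenarios = 5        # quality attribute scenarios from the workshop
refinements_per_scenario = 5   # more detailed scenarios covering special cases
diagrams_per_refinement = 4    # assumed midpoint of the 3-to-5 range
hours_per_diagram = 6          # invented estimate for one diagram plus description

detailed_scenarios = top_level_scenarios * refinements_per_scenario
diagrams = detailed_scenarios * diagrams_per_refinement
effort_hours = diagrams * hours_per_diagram

print(f"{detailed_scenarios} detailed scenarios -> {diagrams} diagrams "
      f"-> about {effort_hours} hours of architecture work")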
Integrating the ACE architecture within the TSP management process gave the Bursatec architects an effective framework in which to work. While it did restrict some of their freedom, it also proved helpful. For example, the architects’ work was structured into iterations, each with different goals. The first iteration focused solely on discovering problematic areas of the system based on achievement of the necessary quality attribute scenarios. In subsequent iterations the architects gradually added details to the system design to include support of all quality attribute scenarios. This iterative method enabled them to create a software architecture organically, for the whole system, that was well understood, justified, and accepted by the team.
Building and Evaluating the System
Once the architecture was complete, the SEI architecture coaching team conducted an active design review in which the Bursatec architects communicated the entire architecture to the developers in a structured way. Next, conformance reviews were conducted during which the developers needed to provide evidence to the architects that the systems they were building conformed to the architecture. These reviews reinforced that the whole system would meet the needs of the stakeholders.
To date, the development of the new trading system for Bursatec has progressed on schedule and within budget. Moreover, early tests confirmed that the trading system performance far exceeded expectations. The combination of TSP and ACE proved an ideal approach for the development of the trading system. TSP brought discipline and measurement, while ACE provided a set of robust architectural techniques that focus on business goals and quality requirements. Both approaches together support the whole development lifecycle, emphasizing business and quality goals, engineering excellence, defined processes, process discipline and teamwork.
This post is the first in a two-part series describing our recent engagement with BMV. The next post focuses on the TSP framework that provided planning, scheduling, estimation, and tracking in the project.
Additional Resources:
For more information about the SEI’s work in Architecture Centric Engineering (ACE), please visit www.sei.cmu.edu/about/organization/rtss/ace.cfm
For more information about the SEI’s work in the Team Software Process (TSP), please visit www.sei.cmu.edu/tsp/
To read the SEI technical report, Combining Architecture-Centric Engineering with the Team Software Process, please visit www.sei.cmu.edu/library/abstracts/reports/10tr031.cfm
By James McHale, Senior Member of the Technical Staff, Software Engineering Process Management
This post is the second installment in a two-part series describing our recent engagement with Bursatec to create a reliable and fast new trading system for Grupo Bolsa Mexicana de Valores (BMV, the Mexican Stock Exchange). This project combined elements of the SEI’s Architecture Centric Engineering (ACE) method, which requires effective use of software architecture to guide system development, with its Team Software Process (TSP), which is a team-centric approach to developing software that enables organizations to better plan and measure their work and improve software development productivity to gain greater confidence in quality and cost estimates. The first post examined how ACE was applied within the context of TSP. This posting focuses on the development of the system architecture for Bursatec within the TSP framework.
Challenges
From a TSP perspective, the project faced several challenges. First, the few developers who had worked on the existing system had either moved into management or possessed technical skills that were out-of-date with modern development technologies. Second, the remaining developers, while competent, did not have experience in building the type of system that Bursatec needed. Another challenge was that several executives within the organization were in favor of outsourcing the work.
Our Approach
In the Bursatec project, we followed the standard TSP implementation approach, which emphasizes securing senior management commitment first. This commitment is typically established via a TSP Executive Strategy Seminar, which covers the key practices and principles of TSP from a senior management perspective. Although Bursatec is a large organization, with several layers of checks and balances befitting a national stock market, the organization itself was very open, which allowed for streamlined communication between senior managers and the engineering team.
In this open environment, the director at Bursatec, as well as his boss—who was president of the Mexican Stock Exchange—participated in the executive training. The executive training included an overview of rational management, the idea that management decisions should be made based on objective facts and data, and why this type of management is required to maintain successful TSP teams. We then trained the team leader of the project, as well as several other peers and senior developers at Bursatec in the basics of day-to-day management of TSP teams.
We next trained the entire Bursatec development team—including the architects and team leader—in the fundamentals of the Personal Software Process (PSP), which teaches individual software engineers how to plan and manage high-quality software development work. The team would go on to apply the PSP concepts in a project, team-based environment. In an unusual development, the Bursatec director attended this class and authored several programs using PSP methods, performing as well as any of the developers. Having such a senior manager there—not just in the class but using the methods—sent the strong message that this was how "we" would be working going forward.
After completing the PSP training, we conducted a Quality Attribute Workshop, an architecture activity in which Bursatec stakeholders defined the five most important quality attribute requirements (also known as quality attribute scenarios) that their new trading system had to fulfill. Not surprisingly, given the importance of speed for the new system, the stakeholders identified runtime performance as the most important of these quality attribute scenarios.
For the Bursatec developers, one benefit of defining quality attributes is that the practice placed significant emphasis on ensuring that the attributes be measurable. For example, the performance attribute was measured in two ways: the time for individual transactions (how fast each one was processed) and the throughput (how many transactions per second on an ongoing basis). In this context there is perfect harmony between what the ACE approach asks architects to do and what the TSP approach demands of developers. TSP teams receive fairly general direction for eliciting and capturing such quality attributes, the understanding of which often drives a project’s structure in addition to the structure of the developed product. With the Bursatec project, the ACE methods provided clear, specific direction on the early lifecycle issues that TSP normally leaves to local practice. Later in the project, TSP drove a disciplined implementation of the architecture that might otherwise have eluded developers.
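To illustrate how the two performance measures differ, here is a minimal sketch (with invented timestamps, not the Bursatec test harness) that derives average per-transaction latency and sustained throughput from recorded start and end times:

# Hypothetical per-transaction latency and overall throughput calculation.
# Each tuple is (start_seconds, end_seconds) for one processed transaction.
transactions = [(0.000, 0.004), (0.001, 0.003), (0.002, 0.007), (0.006, 0.009)]

latencies = [end - start for start, end in transactions]
avg_latency_ms = 1000 * sum(latencies) / len(latencies)
window = max(end for _, end in transactions) - min(start for start, _ in transactions)
throughput_tps = len(transactions) / window

print(f"average latency: {avg_latency_ms:.1f} ms")
print(f"throughput: {throughput_tps:.0f} transactions/second")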
The TSP Launch
Immediately after the conclusion of the Quality Attribute Workshop, we conducted the TSP launch, a series of nine meetings held over the course of four days in which the team reaches a common understanding of the work and the approach it will take and produces a detailed plan to guide its work. The launch, which produced the necessary planning artifacts (such as goals, roles, estimates, a task plan, milestones, a quality plan, and a risk mitigation plan), brought together a team of 14 members, including the team leader. Our goal was to plan the architecture activities in the context of supporting Bursatec and its existing time and budget constraints.
During the launch, about half of the team focused on the architecture, including several people who were brought in as domain experts. These individuals were experts at interpreting the functional requirements and ensuring that the developers met them. For example, one individual had expertise in the Mexican Stock Exchange while another domain expert had extensive experience in the options and futures markets, specifically how those instruments are traded in Mexican markets.
The other half of the team, seven developers, focused on two important needs for the system: high-speed communication and a testing framework. To successfully develop the system in the timeframe Bursatec needed, it was critical that the system be tested automatically rather than manually. Testing new functionality on the current system (including regression testing to ensure that no other aspects of the system are compromised) takes as much as a month and is performed manually. This testing motivates a quality attribute scenario for rapid testability of most new functions within a day, which leads naturally to an architecture that supports automated testing.
The Bursatec developers then implemented the system’s underlying infrastructure based on an early version of the system architecture, while the architects elaborated their work based in part on the early developer work that supported a decision to purchase a particular commercial package for high-speed communication. This version of the architecture was subject to an Architecture Tradeoff Analysis Method (ATAM) review that ensured the quality attribute scenarios captured in the QAW were still the right ones, and that the proposed architecture addressed those scenarios.
After the initial architecture iterations and the ATAM, the architects and other developers worked as a single, integrated team, removing the potential issues that sometimes arise when software architects throw their artifacts "over the wall" to developers. The architects dealt with issues and revised the architecture as necessary while shouldering a normal development workload. The team named role managers—a TSP concept—to focus on issues surrounding performance and garbage collection, two implementation issues critical to the success of the new trading system.
Measurable Results
While TSP can be used to manage all aspects of the software development phase, from requirements elicitation to implementation and testing, this is the first time that the approach has been applied to ACE technologies. The combination of these approaches offered Bursatec architects and developers a disciplined method for developing the software for their new trading engine. Through 6 major development cycles including 14 or so iterations over 21 months, the overall team developed over 200,000 lines of code, spending about 12 percent of their effort after the Quality Attribute Workshop on architecture and approximately 14.5 percent of effort in unit testing, performance testing, and integration testing.
In contrast, the SEI would normally expect almost twice as much testing effort at this point in development, with potentially much more in system testing to push the overall total close to or beyond the 50-percent mark—an unfortunately realistic expectation in our industry. As of October 2011, system testing at Bursatec proceeds on schedule with a very low defect count (unusual in our experience), and the system is on target for deployment beginning in early 2012. Due to the early investment in architecture and a detailed, data-driven approach to managing both their schedule and their quality, less testing was required throughout system development.
Another benefit of combining TSP with ACE is that the team of Bursatec developers was prepared for inevitable changes in the architecture requirements, indeed in changes of any sort over the 21 months of development. When the team received new requirements, it could evaluate them quickly for technical impact and implementation cost in terms of time and effort. With the quality attributes formally captured, the architecture in place, and detailed development plans at every step, a project with enormous risk potential in both technical and business terms ran on-time, within budget, and generally without the drama that large development efforts often exhibit.
Additional Resources:
To read the SEI technical report, Team Software Process (TSP) Body of Knowledge (BOK), please visit www.sei.cmu.edu/library/abstracts/reports/10tr020.cfm
For more information about the SEI’s work in Architecture Centric Engineering (ACE), please visit www.sei.cmu.edu/about/organization/rtss/ace.cfm
For more information about the SEI’s work in the Team Software Process (TSP), please visit www.sei.cmu.edu/tsp/
To read the SEI technical report, Combining Architecture-Centric Engineering with the Team Software Process, please visit www.sei.cmu.edu/library/abstracts/reports/10tr031.cfm
By Julia Allen, Principal Researcher, CERT Program
The SEI has devoted extensive time and effort to defining meaningful metrics and measures for software quality, software security, information security, and continuity of operations. The ability of organizations to measure and track the impact of changes—as well as changes in trends over time—is an important tool for effectively managing operational resilience, which is the measure of an organization’s ability to perform its mission in the presence of operational stress and disruption. For any organization—whether Department of Defense (DoD), federal civilian agencies, or industry—the ability to protect and sustain essential assets and services is critical and can help ensure a return to normalcy when the disruption or stress is eliminated. This blog posting describes our research to help organizational leaders manage critical services in the presence of disruption by presenting objectives and strategic measures for operational resilience, as well as tools to help them select and define those measures.
In April 2011, the DoD identified the engineering of resilient systems as a top strategic priority in helping to protect against the malicious compromise of weapons systems and to develop agile manufacturing for trusted and assured defense systems. SEI CERT has been exploring the topic of managing operational resilience at the organizational level for the past seven years through development and use of the CERT Resilience Management Model (CERT-RMM), a capability model designed to establish the convergence of operational risk and resilience management activities and apply a capability level scale that expresses increasing levels of process performance. CERT-RMM measures the ability of an organization to protect and sustain high-value services (which are organizational activities carried out in the performance of a duty or production of a product) and high-value assets (which are items of value to the organization, such as people, information, technology, and facilities that high-value services rely on). Resilient systems, as identified by the DoD, are one category of technology asset.
Our research on resilience measurement and analysis focuses on addressing the following questions, which are often asked by organizational leaders:
How resilient is my organization?
Have our processes made us more resilient?
What should be measured to determine if performance objectives for operational resilience are being achieved?
To establish a basis for measuring operational resilience, we relied on the CERT-RMM as the process-based framework against which to measure. CERT-RMM comprises 26 process areas (such as Incident Management and Control (IMC) and Asset Definition and Management (ADM)) that provide a framework of goals and practices at four increasing levels of capability (Incomplete, Performed, Managed, and Defined).
Our initial work provided organizational leaders with tools to determine and express their desired level of operational resilience. Specifically, we defined high-level objectives for an operational resilience management program, for example, "in the face of realized risk, the program ensures the continuity of essential operations of high-value services and their associated assets." We then demonstrated how to derive meaningful measures from those objectives using a condensed Goal Question (Indicator) Metric method, for example, determining the probability of delivering service through a disruptive event. We also created a template for defining resilience measures and presented example measures using it.
Too often, organizations collect "type count" measurements (such as numbers of incidents, systems with patches installed, or people trained) with little meaningful context on how these measures can help inform decisions and affect behavior. Based on the Goal Question (Indicator) Metric method outlined above, we identified strategic measures that help organizational leaders determine which process-level measures best address their needs. What follows is a description of five organizational objectives for managing operational resilience and 10 strategic measures for an operational resilience management (ORM) program. The ORM program defines an organization’s strategic resilience objectives (such as ensuring continuity of critical services in the presence of a disruptive event) and resilience activities (such as the development and testing of service continuity plans). We use an example of acquiring managed security services from an external provider to show how each measure could be used. Managed security services may include network boundary protection (such as firewalls and intrusion detection systems), security monitoring, incident management (such as forensic analysis and response), vulnerability assessment, penetration testing, and content monitoring and filtering.
Organizational objective 1: The ORM program derives its authority from—and directly traces it to—organizational drivers (which are strategic business objectives and critical success factors), as indicated by the following measures:
Measure 1: Percentage of resilience activities that do not directly (or indirectly) support one or more organizational drivers.
Example use: External security services replace comparable in-house services with a lower cost (less effort) and more effective (less impact from incidents) solution. After external security services are operational, 75 percent of in-house efforts no longer support organizational drivers. This measure can be used to ensure an effective transition of designated in-house services to externally-provided services and to retrain/reassign staff currently performing such services.
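As a minimal sketch of how Measure 1 might be computed (the activity and driver names below are invented for illustration), one only needs a mapping from each resilience activity to the drivers it supports:

# Hypothetical mapping of resilience activities to the drivers they support.
activity_to_drivers = {
    "in-house firewall management": [],   # superseded by the external provider
    "in-house incident monitoring": [],   # superseded by the external provider
    "provider SLA oversight": ["continuity of critical services"],
    "service continuity planning": ["continuity of critical services",
                                    "regulatory compliance"],
}

unsupported = [a for a, drivers in activity_to_drivers.items() if not drivers]
measure_1 = 100 * len(unsupported) / len(activity_to_drivers)

print(f"Measure 1: {measure_1:.0f}% of resilience activities support no driver")
print("candidates for retirement or staff reassignment:", unsupported)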
Measure 2: For each resilience activity, the number of organizational drivers that require it to be satisfied (the goal is equal to or greater than 1).
Example use: An example of a resilience activity is formalizing a relationship with a security services provider using a contract or service level agreement (SLA) that includes all resilience specifications. There is at least one organizational driver that calls for having security services in place to achieve the driver. This driver likely maps to a personal objective of the chief information officer or chief security officer. If there is no such traceability, one or more drivers may require updating.
Organizational objective 2: The ORM program satisfies resilience requirements that are assigned to high-value services and their associated assets, as indicated by the following measures:
Measure 3: Percentage of high-value services that do not satisfy their assigned resilience requirements.
Example use: Resilience requirements for security services are specified in the SLA. Provider performance is periodically reviewed to ensure that all services are meeting the SLA requirements (for example, high priority alerts from incident detection systems are resolved within xx minutes). Optimally, this percentage should be zero. If it is greater than an SLA-stated threshold (for example, 20 percent for service A), corrective action is taken and confirmed.
Measure 4: Percentage of high-value assets that do not satisfy their assigned resilience requirements.
Example use: This example is similar to the one above. The incident database is a high-value asset that is required to provide incident response services. The SLA specifies resilience requirements for this database, including daily automated backups and quarterly and event-driven (backup server upgrade and high-impact security incident) testing to ensure the provider’s ability to successfully restore from backups. Optimally, this percentage should be zero. If it is greater than an SLA-stated threshold (for example, 20 percent for asset B), corrective action is taken and confirmed.
Organizational objective 3: The ORM program—via the internal control system—ensures that controls for protecting and maintaining high-value services and their associated assets operate as intended, as indicated by the following measures:
Measure 5: Percentage of high-value services with controls that are ineffective or inadequate.
Example use: The SLA identifies the controls (policies, procedures, standards, guidelines, tools, etc.) that are required by a service. These controls can be tailored versions of the controls that the organization uses or can be negotiated based on the provider’s standard suite of controls. Provider implementation of these controls is periodically reviewed (audited, assessed, scans performed, etc.). Optimally, this percentage should be zero. If it is greater than an SLA-stated threshold (for example, 20 percent for service A), corrective action is taken and confirmed.
Measure 6: Percentage of high-value assets with controls that are ineffective or inadequate.
Example use: This measure is as described above, with asset controls stated in the provider SLA.
Organizational objective 4: The ORM program manages operational risks to high-value assets that could adversely affect the operation and delivery of high-value services, as indicated by the following measures:
Measure 7: Confidence factor that risks from all sources that require identification have been identified.
Example use: Major sources of risk are initially identified in the provider SLA and as part of an ongoing review based on changes in the operational environment within which services are provided. The elements that contribute to "confidence factor" (such as risk thresholds by service) are also identified. Confidence factor is represented as a Kiviat diagram showing plan versus actual for all sources. Analysis of provider gaps is reviewed on a periodic basis and corrective action is taken and confirmed to reduce unacceptable gaps.
Measure 8: Percentage of risks with impact above threshold.
Example use: Assessment of provider risk is performed on a periodic basis as specified in the SLA. Optimally, this percentage should be zero. If it is greater than an SLA-stated threshold (for example, 20 percent for risk type A), corrective action is taken and confirmed.
Organizational objective 5: The ORM program ensures the continuity of essential operations of high-value services and their associated assets in the face of realized risk, as indicated by the following measures:
Measure 9: Probability of delivered service through a disruptive event.
Example use: The SLA states service-specific availability and service levels to meet, both steady state and in degraded mode. Provider performance is periodically reviewed, including during and after a disruptive event (power outage, cyber attack, etc.). Probability of delivered service is determined and evaluated as a trend over time. Corrective action is taken and confirmed as required.
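A minimal sketch of how Measure 9 might be estimated from a log of disruptive events (the events below are invented for illustration):

# Hypothetical log of disruptive events: (event, service_delivered_as_required).
events = [
    ("power outage Q1", True),
    ("cyber attack Q2", True),
    ("power outage Q3", False),
    ("network failure Q4", True),
]

delivered = sum(1 for _, ok in events if ok)
probability = delivered / len(events)
print(f"probability of delivered service through a disruptive event: {probability:.2f}")
# Tracking this value per SLA review period yields the trend that drives
# corrective action when it falls below the agreed threshold.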
Measure 10: For disrupted, high-value services with a service continuity plan, percentage of services that did not deliver service as intended throughout the disruptive event.
Example use: The SLA includes requirements for service-specific continuity (SC) plans. For provider services with SC plans that do not maintain required service availability and service levels, corrective actions are taken and confirmed, including updates to SC plans. In addition, the customer uses this as an opportunity to review and update its own SC plans that depend on provider services, where service was not delivered as intended.
All these strategic measures derive from lower-level measures at the CERT-RMM process area level, including average incident cost by root cause type and number of breaches of confidentiality and privacy of customer information assets resulting from violations of provider access control policies.
To help organizational leaders determine what measures work best for their organization, we are collaborating with members of the CERT-RMM Users Group, which includes the United States Postal Inspection Service, Discover Financial Services, Lockheed Martin, and Carnegie Mellon University. Through a series of two-day workshops, members define an improvement objective, assess their current level of operational resilience against that objective, identify areas of improvement, and implement improvement plans using the CERT-RMM processes and candidate measures as the guide. Please contact us if you are interested in joining a CERT-RMM Users Group.
Additional Resources:
To read the SEI technical note, Measuring Operational Resilience Using the CERT Resilience Management Model, please visit www.sei.cmu.edu/reports/10tn030.pdf
To read the SEI technical note, Measures for Managing Operational Resilience, please visit www.sei.cmu.edu/library/abstracts/reports/11tr019.cfm
For more information about the CERT Resilience Management Model (CERT-RMM), please visit www.cert.org/resilience/rmm.html
To read an article about how the CERT Resilience Management Model helps companies predict performance under stress, please visit page 8 of the SEI 25th Anniversary Year in Review, www.sei.cmu.edu/library/assets/annualreports/2010_Year_in_Review.pdf
To read an article about CERT work in Resilience Measurement, please visit page 4 of the SEI 25th Anniversary Year in Review, www.sei.cmu.edu/library/assets/annualreports/2010_Year_in_Review.pdf
By David French, CERT Senior Researcher
Malware, which is short for "malicious software," is a growing problem for government and commercial organizations because it disrupts or denies important operations, gathers private information without consent, gains unauthorized access to system resources, and exhibits other inappropriate behaviors. A previous blog post described the use of "fuzzy hashing" to determine whether two files suspected of being malware are similar, which helps analysts potentially save time by identifying opportunities to leverage previous analysis of malware when confronted with a new attack. This posting continues our coverage of fuzzy hashing by discussing types of malware against which similarity measures of any kind (including fuzzy hashing) may be applied.
Fuzzy hashes provide a continuous stream of hash values for a rolling window over the malware binary, thereby allowing analysts to assign a percentage score that indicates the degree of similarity between two malware programs. When considering how fuzzy hashing works against malware, it is useful first to consider why malware programs would ever be similar to each other. For the purposes of this discussion we focus on prevalent Microsoft Portable Executable (PE) formatted files, although this description can be generalized to any executable code stored in any format. We further consider similarity as a measure of file structure—rather than program behavior—since fuzzy hashing generally applies to the bytes comprising a file, rather than an observation of the semantics of a program in some other space.
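For readers who want to experiment with this kind of comparison, a minimal sketch using the python-ssdeep binding (assuming that package is installed; the file names are placeholders) looks like this:

# Minimal fuzzy-hash comparison sketch using the python-ssdeep binding.
import ssdeep

hash_a = ssdeep.hash_from_file("sample_a.ex_")   # placeholder file names
hash_b = ssdeep.hash_from_file("sample_b.ex_")

score = ssdeep.compare(hash_a, hash_b)           # 0 (no match) .. 100 (identical)
print(hash_a)
print(hash_b)
print(f"similarity score: {score}")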
Malware is software combining three elements: (1) code, whether compiled source code written in a high-level language or hand-crafted assembly, (2) data, which is some set of numerical, textual, or other types of discrete values intended to drive the logic of the code in specific ways, and (3) process, which is loosely a set of operations (for example, compiling and linking) applied to the code and data that ultimately produce an executable sequence of bytes in a particular format, subject to specific operating constraints. Given a distinct set of code, data, and consistent processes applied thereto, it is reasonable to conclude that—barring changes to any of these—we will produce an identical executable file every time we apply the process to the code and data (where identity is measured using a cryptographic hash, such as MD5). We now consider how the permutation of any of these components will affect the resulting executable file.
First, let us consider the effect of modifying the data used to drive a particular executable. With respect to malicious software, such data may include remote access information (such as IP addresses, hostnames, usernames and passwords, commands, etc.), installation and configuration information (such as registry keys, temporary filenames, mutexes, etc.), or any other values which cause the malware to execute in specific ways. Generally speaking, changing the values of these data may cause different behavior in the malware at runtime but should have little impact on the structure of the malware.
Malware authors may modify their source code to use different data values for each new program instance or may construct their program to access these data values outside the context of the compiled program (for example, by embedding the data within or at the end of the PE file). In the case of malicious code, data may also include bytes whose presence does not alter the behavior of the code in any way and whose purpose is to confuse analysis. Regardless, the expected changes to the resulting executable file are directly proportional to the amount of data changed. Since we changed only the values of the data—not the way in which they are referenced (in particular, we have not changed the code)—we can expect that the structure of the output file is modified only to support any different storage requirements for the new data.
Similarly, let us consider the effect of modifying the code found in a particular executable. The code defines the essential logic of the malware and describes the behavior of the program under specified conditions. To modify program behavior, the code must generally be modified. The expected changes to the resulting executable file are proportional to the amount of code changed, much as we expect when changing data. However, code—especially compiled code—differs from data in that the representation of the code in its final form is often drastically different from its original form. Compiling and linking source code represents a semantic transformation, with the resulting product intended for consumption by a processor, not a human reader.
To accomplish semantic transformation most effectively, the compiler and linker may perform all manner of permutations, such as rewriting blocks of code to execute more efficiently, reordering code in memory to take up less space, and even removing code that is not referenced within the original source. If we assume that the process to create the executable remains constant (for example, that optimization settings are not changed between compilations), we must still allow that minor changes in the original source code may have unpredictably large changes in the resulting executable. As a consequence, code changes are more likely to produce executables with larger structural differences between revisions than executables where only data changes.
Thus, we have described two general cases in which structurally different files (measured by cryptographic hashing, such as MD5) may be produced from a common source. We refer to malware families whose primary or sole permutation is in their data as generative malware, and use the analogy of a malware factory cranking out different MD5s by modifying data bytes in some way. We refer to malware families whose primary permutation is in their code as evolutionary malware, in that the behavior of the program evolves over time. When considering the effects of similarity measurements such as fuzzy hashing, we may expect that fuzzy hashing will perform differently against these different general types of malware.
As an example of using fuzzy hashing against generative malware, consider the malware family BackDoor-DUG.a (also referenced here), also known as Trojan.Scraze by ClamAV and W32/ScreenBlaze.A2 by F-Prot (ClamAV and F-Prot are antivirus vendors; it’s important to note that the same family is known by several different names). The two files referenced from the McAfee site are Delphi programs, comprising 4,185 functions at distinct addresses as observed by disassembling each program using IDA-Pro v6.1. If we treat each function as a sequence of bytes and compute the cryptographic hashes of each function’s bytes using a technique called function hashing, we observe that these programs have approximately 3,321 unique functions each, per their position independent code (PIC) function hashes. Of these 3,321 functions distinct to each program, we observe that 3,292 are shared (meaning their bytes are exactly the same) between the programs, and that each program has 29 functions not shared with the other program.
Inspecting each of the 29 functions in each of the two files (for a total of 58 functions) in IDA-Pro, we discover that for all 29 pairs of functions found at the same address across the two files, the functions differ only by large blocks of seemingly non-executed data, which the code bytes jump around. Otherwise, the code bytes for each of the 29 function pairs at corresponding addresses are identical. In this way, we can observe that the two programs are materially identical except for seemingly non-executed bytes, which we generically call data. Performing an ssdeep comparison of these two files produces the following fuzzy hashes and their associated comparison score:
12288:gp/iN/mlVdtvrYeyZJf7kPK+iqBZn+D73iKHeGspOdqcXigCcCmua1xIam:gpQ/6trYlvYPK+lqD73TeGspOQKUmxpm,"70212f8f88865f4f9bb919383aabc029.ex_"
12288:gp/iN/mlVdtvrYeyZJf7kPK+iqBZn+D73iKHeGsptx6KrPSTKQGLG4a4:gpQ/6trYlvYPK+lqD73TeGspqnKx64,"6f83ac65223e2ac7837bfe3068da411c.ex_"
70212f8f88865f4f9bb919383aabc029.ex_ matches 6f83ac65223e2ac7837bfe3068da411c.ex_ (85)
Matching these files using ssdeep corroborates our findings using analysis of these files by function data, in that they are highly similar. These two files thus provide a good example of generative malware.
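The function-level comparison described above can be approximated with a short sketch, assuming the function byte sequences have already been extracted (for example, exported from a disassembler) and normalized into position-independent form:

# Sketch of function hashing: compare two programs by the set of hashes of
# their (already extracted and position-independence-normalized) function bytes.
import hashlib

def function_hashes(functions):
    # One hash per function body; the set ignores duplicates within a program.
    return {hashlib.md5(body).hexdigest() for body in functions}

# Placeholder byte strings standing in for extracted function bodies.
program_a = [b"\x55\x8b\xec\x5d\xc3", b"\x53\x56\x57\x5f\x5e\x5b\xc3"]
program_b = [b"\x55\x8b\xec\x5d\xc3", b"\x53\x56\x58\x5f\x5e\x5b\xc3"]

hashes_a, hashes_b = function_hashes(program_a), function_hashes(program_b)
shared = hashes_a & hashes_b
print(f"shared functions: {len(shared)}")
print(f"unique to A: {len(hashes_a - shared)}, unique to B: {len(hashes_b - shared)}")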
When considering how code changes can affect fuzzy hashing, we briefly consider non-malicious software for which we have full source code. The Nullsoft Scriptable Installation System (NSIS) is an open-source installation system used to create Windows-based installation programs. Although NSIS is not malicious software, it can be used to install many different types of programs on Windows computers, including malicious and non-malicious programs alike.
The project page for NSIS provides several revisions; we examined the two most recent versions, 2.45 (MD5 sum af193ccc547ca83a63eedf6a2d9d644d) and 2.46 (MD5 sum 0e5d08a1daa8d7a49c12ca7d14178830), for which Windows binaries are available. The two files comprise 6,038 and 6,040 functions at distinct addresses, respectively, with 2,564 unique functions each (as measured by their PIC function hashes). These two programs have 2,544 identical functions, with 20 differing functions each. The differing functions have changes that range from identical functions using different constants to entirely new functions with no overlapping behavior. Regardless, the vast majority of the behavior of these two programs is identical. Performing an ssdeep comparison of these two files produces the following fuzzy hashes and their associated comparison score:
12288:p24n/P3WRlauwYyPd7K67jBOs/skXMujtiEs6vHG9Uu94yGjbgWsvvs0V:k4n3GRMuwYyV26XDRiE6qu+yJWsXsa,"nsis-2.45/NSIS.exe"
12288:lWe4uCFAtIma4w3PE6EPYL/t+32gNjw6ps6cg1eHgfKkx71DS0V:Ie4ugwIma4O86YnE6pxKgCg71Sa,"nsis-2.46/NSIS.exe"
nsis-2.45/NSIS.exe matches nsis-2.46/NSIS.exe (0)
As seen from the score of zero from ssdeep, fuzzy hashing does not detect any relationship between these two files even though function analysis revealed that the majority of the behavior of these two files is the same. This result is borne out by reading the release notes for V2.46 from the NSIS website, which document relatively minor changes. Although the evolution of these two programs is relatively minor in terms of the absolute number of changes to functionality, their structure is different enough that fuzzy hashing with ssdeep was completely unable to detect the similarity. This highlights the challenging problem of similarity measurement in malicious code and underscores the need to understand the underlying reasons that similarity would ever be apparent to any particular technique.
Future blog entries will consider alternate fuzzy hashing approaches and tools, and discuss some of the challenges of performing fuzzy hashing at scale.
This post is the second in a series exploring David's research in fuzzy hashing. To read the first post in the series, Fuzzy Hashing Techniques in Applied Malware Analysis, please click here.
Additional Resources:
More information about CERT research in malicious code and development is available in the 2010 CERT Research Report, which may be viewed online at www.cert.org/research/2010research-report.pdf
By Dionisio de Niz, Senior Member of the Technical Staff, Research, Technology, and System Solutions
Cyber-physical systems (CPS) are characterized by close interactions between software components and physical processes. These interactions can have life-threatening consequences when they include safety-critical functions that are not performed according to their time-sensitive requirements. For example, an airbag must fully inflate within 20 milliseconds (its deadline) of an accident to prevent the driver from hitting the steering wheel, with potentially fatal consequences. Unfortunately, the competition between safety-critical requirements and other demands to reduce cost, power consumption, and device size also creates problems, such as automotive recalls, new aircraft delivery delays, and plane accidents. Our research leverages the fact that failing to meet deadlines doesn’t always have the same level of criticality for all functions. For instance, if a music player fails to meet its deadlines the sound quality may be compromised, but lives are not threatened. Systems whose functions have different criticalities are known as mixed-criticality systems. This blog posting updates our earlier post to describe the latest results of our research on supporting mixed-criticality operations by giving more central processing unit (CPU) time to functions with higher value while ensuring critical timing guarantees.
During our research, we observed that different functions provide different amounts of utility or satisfaction to the user. For instance, a GPS navigation function may provide higher utility than a music player. Moreover, if we give more resources to these functions (for example, more CPU time) the utility obtained from them increases.
In general, however, the amount of utility obtained from additional resources does not grow forever, nor does it grow at a constant rate. The additional increment in utility for each additional unit of resource instead decreases to a point where the next increment in utility is insignificant. In such cases, it is often more important to dedicate additional computational resources to another function that is currently delivering lower utility and will deliver a larger increment in utility for the same amount of CPU time.
For example, assuming that we get a faster route to our destination if more CPU time is dedicated to the GPS functionality, it seems obvious that the first route we get from the GPS will give us the biggest increment in utility. If we lack enough CPU time (due to the execution of other critical functions) to run both the GPS and the music player, we will choose the GPS. We may even prefer to give more CPU time (if we discover that more time is available) to the GPS to help avoid traffic jams before we decide to run the music player. Letting the GPS run even longer to select a less traffic-clogged route, however, may give us less utility than running the music player.
At this point, we may prefer to start running the music player if we have more CPU time available. We thus change our allocation preference because the additional utility obtained by giving the GPS more CPU time is less than the utility obtained by giving the music player this time. This progressive decrease in the utility obtained as we give more resources to a function is known as diminishing returns, which can be used to allocate resources to ensure we obtain the maximum total utility possible considering all functions in the system.
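A minimal sketch of allocation under diminishing returns (the utility curves and the number of spare time units are invented for illustration) repeatedly gives the next unit of spare CPU time to whichever function would gain the most utility from it:

# Greedy allocation of spare CPU time under diminishing returns.
# utility[name][i] is the (hypothetical) total utility delivered when the
# function receives i units of spare CPU time.
utility = {
    "gps":   [0, 50, 70, 80, 85, 88],
    "music": [0, 30, 45, 55, 62, 66],
}

def marginal_gain(name, allocated):
    curve, i = utility[name], allocated[name]
    return curve[i + 1] - curve[i] if i + 1 < len(curve) else 0

allocated = {name: 0 for name in utility}
for _ in range(5):  # five spare units of CPU time to hand out
    best = max(utility, key=lambda name: marginal_gain(name, allocated))
    allocated[best] += 1

print(allocated)  # the GPS gets the first units; the music player gets later ones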
Our research uses both the diminishing returns characteristics of low-criticality functions and criticality levels to implement a double-booking computation time reservation scheme. Traditional real-time scheduling techniques consider the worst-case execution time (WCET) of the functions to ensure they always complete before their deadlines, reserving CPU time that is used only on the rare occasions when the WCET occurs. We take advantage of this fact and allocate the same CPU time to functions of lower criticality. When both functions request the CPU time reserved for both at the same time, we favor the higher-criticality function and let the lower-criticality function miss its deadline.
Our double-booking scheme is analogous to the strategies airlines use to assign the same seat to more than one person. In this case, the seat is given to the person with preferred status (e.g., "gold members"). Our project uses utility—in addition to criticality—to ensure the CPU time that is double booked is given to functions providing the largest utility in case of a conflict (both functions requesting the double-booked CPU time). Our double-booking scheme provides the following two benefits:
It protects critical functions ensuring that their deadlines are always met and
It uses the unused time from the critical functions to run the non-critical functions that produce the highest utility.
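A minimal sketch of the conflict-resolution rule just described (criticality flags and utility values are invented for illustration): when two functions request the same double-booked reservation, a critical function always wins; otherwise the function offering the higher utility wins.

# Resolve a conflict over a double-booked CPU reservation.
# Each requester is (name, is_critical, utility); the values are hypothetical.
def resolve_conflict(requesters):
    critical = [r for r in requesters if r[1]]
    if critical:
        # A critical function always keeps its timing guarantee.
        return max(critical, key=lambda r: r[2])[0]
    # Among non-critical functions, keep the one that preserves the most utility.
    return max(requesters, key=lambda r: r[2])[0]

print(resolve_conflict([("flight_control", True, 100), ("video_stream", False, 40)]))
print(resolve_conflict([("object_detection", False, 60), ("video_stream", False, 40)]))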
Our research is aimed at providing real-time system developers with an analysis algorithm that accurately predicts system behavior when it is running (runtime). Developers use these algorithms during the design phase (design time) to test whether critical tasks will meet their deadlines (providing assurance) and to determine how much overbooking is possible.
To evaluate the effectiveness of our scheme, we developed a utility degradation resilience (UDR) metric that quantifies the capacity of a CPS to preserve the utility derived from double-booking. This metric evaluates all possible conflicts that can happen due to double booking and how much total utility is preserved after the conflict is resolved by deciding what function gets the double-booked CPU time and what functions are left without CPU time. The utility derived from the preserved functions is then summed to compute the total utility that a specific conflict resolution scheme can preserve.
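Under these assumptions, a sketch of the UDR computation enumerates possible conflicts, sums the utility a scheme preserves in each, and normalizes by what an ideal scheme with perfect knowledge would preserve:

# Hypothetical UDR computation: utility preserved by a resolution scheme,
# normalized by the utility an ideal (perfect-knowledge) scheme would preserve.
# Each conflict is (utility_preserved_by_scheme, utility_preserved_by_ideal).
conflicts = [(140, 160), (90, 120), (200, 200)]

preserved = sum(s for s, _ in conflicts)
ideal = sum(i for _, i in conflicts)
print(f"UDR = {preserved / ideal:.2f}")  # about 0.90 for these invented numbers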
In theory, a perfect conflict resolution scheme should preserve the maximum possible utility. In reality, however, decisions must be made ahead of time assuming that some critical functions will run for their worst-case execution time (even though they may not) to ensure that they finish before their deadlines. Unfortunately, if they execute for less time, it may already be too late to execute other functions.
Using the UDR metric, we compared our scheme against the Rate-Monotonic Scheduler (RMS) and a scheme called Criticality-As Priority Assignment (CAPA) that uses criticality as the priority. Our experiments showed that we can recover up to 88 percent of the ideal utility we could obtain if we could fully reclaim the unused time left by the critical functions, that is, if we had perfect knowledge of exactly how much time each function needed to finish executing. In addition, we observed that our double-booking scheme can achieve up to three times the UDR that RMS provides.
We implemented a design-time algorithm to evaluate the UDR of a system and generate the scheduling parameters for our runtime scheduler, which performs the conflict resolutions of our overbooking scheme (deciding which function gets the overbooked CPU time). This scheduler was implemented in the Linux operating system as a proof of concept to evaluate the practicality of our mechanisms. To evaluate our scheme in a real-world setting, we used our scheduler in a surveillance UAV application on the Parrot A.R. Drone quadricopter with a safety-critical function (flight control) and two non-critical functions (video streaming and vision-based object detection).
Our results confirmed that we can recover more CPU cycles for non-critical tasks with our scheduler than with the fixed-priority scheduler (using rate-monotonic priorities) without causing problems for the critical tasks. For example, we avoided instability in the flight controller that can lead to the quadricopter turning upside down. In addition, the overbooking between the non-critical tasks performed by our algorithm allowed us to adapt automatically to peaks in the number of objects to detect (and hence in the execution time of the object detection function) by reducing the frames per second processed by the video streaming function during these peaks.
In future work we are extending our investigation to multi-core scheduling where we plan to apply our scheme to hardware resources (such as caches) shared across cores.
This research is conducted in collaboration with Jeffrey Hansen of CMU; John Lehoczky of CMU’s Statistics Department; and Ragunathan (Raj) Rajkumar and Anthony Rowe of the Electrical and Computer Engineering Department at CMU.
Additional Resources: www.contrib.andrew.cmu.edu/~dionisio/
By Douglas C. Schmidt, Chief Technology Officer, SEI
As noted in the National Research Council’s report Critical Code: Software Producibility for Defense, mission-critical Department of Defense (DoD) systems increasingly rely on software for their key capabilities. Ironically, it is increasingly hard to motivate investment in long-term software research for the DoD. This lack of investment stems, in part, from the difficulty that acquisitions programs have making a compelling case for the return on these investments in software research. This post explores how the SEI is using the Systems and Software Producibility Collaboration and Experimentation Environment (SPRUCE) to help address this problem.
Decades of public and private research investments—coupled with the inexorable growth of globalization and connectivity—have commoditized many information technology (IT) products and services. For example, commercial off-the-shelf (COTS) hardware and software is now produced faster, cheaper, and generally at a predictable pace. During the past two decades, users and developers of IT systems have benefitted from the commoditization of hardware and networking elements. More recently, the maturation and widespread adoption of object-oriented programming languages, operating environments, and middleware is helping commoditize many software components and end-system layers.
Due to this IT commoditization trend, acquisition professionals, senior leaders, politicians, and funding agencies often assume new software innovations will continue to appear at a predictable pace and that the DoD can benefit from these innovations without significant investment in software research. While mainstream IT systems may be able to get by without this investment, mission-critical DoD systems—particularly those at the tactical edge—cannot. Without sustained investment in software research, therefore, the DoD is in danger of "eating the seed corn" and reaching a complexity cap that will make it harder to succeed in an era of budget cuts and other austerity measures.
Challenges to Effective Software Research Impact
One challenge to motivating investment in software research is presenting a convincing pathway for how sponsored research finds its way into practice. The underlying problem for the DoD is the ad hoc and often serendipitous nature by which members of the software community (including academic researchers, defense contractor software architects and developers, DoD acquisition program and research sponsors, as well as commercial tool vendors) collaborate to identify, develop, test, and transition promising software technologies. This lack of systematic collaboration has created a dysfunctional—yet all-too-common—situation whereby DoD programs cannot find software technologies that meet their needs, regardless of their inherent promise. As a result, acquisition programs across the DoD repeatedly encounter problems developing, validating, and sustaining software. To exacerbate the problem, the "landing path" for software technologies is typically not DoD program engineers, but organizations (such as commercial vendors or standards bodies) responsible for maintaining the technology. These organizations are often not structured or motivated to leverage the results of advanced research projects effectively.
For example, DoD software researchers have historically received funding for research programs of approximately three years in duration. These programs involve creating a project plan, building teams, working on technologies, generating and evaluating prototypes, and writing papers to publicize the work. Throughout this period, there is typically great enthusiasm for the project from the technical community. Once the program ends, however, the community often disbands and the project descends into the "valley of disappointment," a phenomenon in which researchers struggle to transition their prototypes to the DoD acquisition community, while the practitioners are equally frustrated with not being able to apply research results to practical problems.
Getting stuck in the "valley of disappointment" is a common problem in technology research and development projects, as evidenced by Geoffrey Moore’s book Crossing the Chasm. Moore presents this problem from a venture capital perspective: a group of researchers develops a technology and identifies some early adopters, but struggles to transition from the early-adoption to the majority-adoption phase. One reason for this valley is that researchers are often required to work on abstracted problems because that is all they can access; another is that acquisition professionals don’t have the luxury of transitioning "science projects."
Crossing the "Valley of Disappointment" with SPRUCE
To address the challenges described above, the Assistant Secretary of Defense Research & Engineering Enterprise (ASDR&E), through the Air Force Research Lab (AFRL), funded researchers at Lockheed Martin Advanced Technology Laboratories, in partnership with Booz Allen Hamilton, Vanderbilt University (where I worked on SPRUCE before joining the SEI), Drexel University, Virginia Tech University, Lockheed Martin Aeronautics, and Raytheon, to create the Systems and Software Producibility Collaboration and Experimentation Environment (SPRUCE). SPRUCE is a collaborative set of web-based services that matches DoD challenge problems with the methods, algorithms, tools, and techniques developed by researchers. One way to think about SPRUCE is as an "eHarmony" portal for researchers that unites domain experts from the DoD acquisition community who face concrete technical challenges with software researchers who can solve them.
For example, acquisition professionals might be searching for an approach that will allow them to run legacy code on a multi-core platform or an algorithm that minimizes the number of processors and the amount of network bandwidth needed in an avionics system. SPRUCE refers to these people as the "problem providers," who post challenge problems into the SPRUCE portal. Conversely, researchers are "solution providers" who use SPRUCE to post candidate solutions to available challenge problems.
SPRUCE allows problem providers to explain their needs in a structured way—along with representative data sets and reproducible experiments—so that solution providers from the software research community can decide if they have methods or technologies that would make an impact on the posted problem. If researchers operate in an open environment (which is typical at universities), they can post their solution on the SPRUCE portal. If researchers operate in a closed environment (which is typical at companies), they can contact the problem providers directly and discuss options for collaboration.
SPRUCE addresses many problems facing DoD software researchers:
It allows researchers access to real-world problems and realistic data sets. Even if the problem providers have anonymized their problem (for example, by removing proprietary information), it still represents an actual challenge faced by the DoD. Once researchers demonstrate that their solutions work on abstracted problems that are relevant to a particular domain and derived from real-world scenarios, it is easier to convince the original problem providers that the results are ready to be applied in practice.
SPRUCE facilitates healthy competition among research groups. For example, SEI researchers may believe they have the most effective techniques and tools for detecting similarity in malware, but SPRUCE allows them to compare their results against techniques and tools devised by other researchers for a common data set. In addition to the obvious competitive benefits, this approach also allows a better collaborative evolution of the solution by incorporating the best parts of each approach into a refined whole.
SPRUCE helps researchers locate sources of funding because it provides an immediate way for them to showcase their results in a forum that has an audience (the challenge problem providers) interested in solutions to real problems. Over time, researchers will populate the SPRUCE repository with their solutions, providing a way for them to find audiences for new funding and additional collaborations on real problems.
In its four years of funding, SPRUCE has focused primarily on capturing DoD challenge problems and helping DoD software researchers collaborate more effectively with other members of the DoD software community. SPRUCE has also influenced the National Science Foundation (NSF), which recently created the Cyber Physical Systems Virtual Organization (CPS-VO) community as a web portal for problem providers in cyber-physical systems. NSF-funded researchers use CPS-VO to post challenges and to work collaboratively to solve challenge problems with their colleagues around the world.
SPRUCE represents part of the trend towards more collaborative research and development among scientists and engineers. For example, UAVForge.net is attempting to use crowdsourcing to go from concept to fly-off of air vehicle designs in under six months. Likewise, a recent Wall Street Journal article titled The New Einsteins Will Be Scientists Who Share proclaims that "publicly funded science should be open science." Portals like SPRUCE help move researchers and practitioners from isolated pockets of collaboration to mainstream adoption.
Using SPRUCE to Guide SEI Research
At the SEI, we are using SPRUCE to showcase our solutions and, more importantly, to capture real-world challenge problems from our stakeholders. For example, the SEI hosted a workshop in August 2011 that brought together researchers and problem providers from Lockheed Martin, Boeing, AFRL, Carnegie Mellon University, and Virginia Tech to elicit guidance for our work in real-time scheduling and concurrency analysis for cyber-physical systems. The workshop participants provided the SEI with challenge problems from avionics domain experts to ensure the research we are doing addresses real DoD problems. As a result of this workshop, the problem providers populated the SPRUCE database with problems that SEI technologists will use to guide our future work. We are in the process of conducting challenge problem workshops for other software research projects at the SEI to ensure we continue to work on relevant problems that have high impact on DoD operational needs.
SPRUCE also allows us to continually improve our metrics and measures of success. As a federally funded research and development center, the SEI is often requested to substantiate data and success criteria. Problem providers—who by their nature have a close connection to real-world problems—help define the success criteria. This approach allows an external party, like a DoD contractor, to define the success criteria for SEI researchers who then work to achieve those criteria. At the same time, it showcases the solutions that SEI technologists have developed for various technologies, such as multi-core platforms.
In a commoditized IT environment, human resources are an increasingly strategic asset. In the future, therefore, premium value and competitive advantage will accrue to individuals, universities, companies, and agencies that continue to invest in software research and who master the principles, patterns, and protocols necessary to collaboratively integrate commoditized hardware and software to develop complex systems that cannot yet be bought off-the-shelf. Success in this endeavor requires close collaboration between academia, industry, and government. The SPRUCE portal described above helps to facilitate this collaboration by bringing key stakeholders to the table and ensuring that government investments in software research have greater impact on DoD acquisition programs.
Additional Resources:
For more information about the SPRUCE portal, please visit www.sprucecommunity.org/default.aspx.
To read about the need to motivate greater DoD investment in software research, please see the National Research Council’s Critical Code: Software Producibility for Defense report available at www.nap.edu/openbook.php?record_id=12979&page=R1.
By Grace Lewis, Senior Member of the Technical Staff, Research, Technology & System Solutions
Cloudlets, which are lightweight servers running one or more virtual machines (VMs), allow soldiers in the field to offload resource-consumptive and battery-draining computations from their handheld devices to nearby cloudlets. This architecture decreases latency by using a single-hop network and potentially lowers battery consumption by using WiFi instead of broadband wireless. This posting extends our original post by describing how we are using cloudlets to help soldiers perform various mission capabilities more effectively, including facial, speech, and imaging recognition, as well as decision making and mission planning.
An initial goal of our research was to create a prototype application that located cloudlets within close proximity of handheld devices using them. We initially focused on offloading computations to cloudlets to extend device battery life. In addition to this benefit, we also found cloudlets significantly reduce the amount of time needed to deploy applications to handheld devices because clients are not tied to a specific server that can take a long time to provision in tactical environments.
Our work together with Mahadev "Satya" Satyanarayanan (the creator of the cloudlet concept and a faculty member at Carnegie Mellon's School of Computer Science) originally focused on face recognition applications as an example of a computation-intensive mission capability. Thus far we have created an Android-based facial recognition application that
locates a cloudlet via a discovery protocol,
sends the application overlay to the cloudlet, where dynamic VM synthesis is performed,
captures the images and sends them to the facial recognition server code that now resides in the cloudlet.
In the context of cloudlets, the application overlay corresponds to the computation-intensive code invoked by the client, which in this case is the face recognition server; this server is written in C++ and processes images from the handheld device client for training or recognition purposes. On execution, the overlay is sent to the cloudlet and applied to one of the VMs running in the cloudlet, a process called dynamic VM synthesis. The application overlay is pre-generated by calculating the difference between a base VM and the base VM with the computation-intensive code installed.
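The diff-and-apply idea behind the overlay can be illustrated with a small sketch. The block size and function names below are hypothetical, and a real implementation operates on complete (compressed and encrypted) VM images rather than raw byte strings, but the essential mechanics are the same.

```python
# Illustrative sketch of overlay creation and dynamic VM synthesis.
# Names and the block size are hypothetical; real overlays are computed over
# complete VM images and are compressed and encrypted before transmission.

BLOCK_SIZE = 4096

def compute_overlay(base_image: bytes, provisioned_image: bytes) -> dict:
    """Record only the blocks that differ between the base VM and the base VM
    with the computation-intensive server code installed."""
    overlay = {}
    for offset in range(0, len(provisioned_image), BLOCK_SIZE):
        base_block = base_image[offset:offset + BLOCK_SIZE]
        new_block = provisioned_image[offset:offset + BLOCK_SIZE]
        if base_block != new_block:
            overlay[offset] = new_block
    return overlay

def synthesize_vm(base_image: bytes, overlay: dict) -> bytes:
    """Apply the overlay to a base VM image to recover the provisioned VM."""
    image = bytearray(base_image)
    for offset, block in overlay.items():
        image[offset:offset + len(block)] = block
    return bytes(image)
```

Because only the differing blocks travel over the network, the overlay is typically much smaller than the provisioned VM itself.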
The first version of the cloudlet we created is a simple HTTP server. When this server receives the application overlay from the client, it decrypts and decompresses the overlay and performs VM synthesis to configure the cloudlet dynamically. It subsequently returns coordinates for the faces it recognizes, along with a measure of confidence, to the client device.
Constructing the Cloudlet Prototype
The original cloudlet prototype built by Satya’s team used a simple Virtual Network Computing (VNC) client to see what was executing inside the VM. Our cloudlet prototype extended Satya’s work to use a thick mobile client that provides a better user experience for users at the edge and allows incorporation of sensor information that would not be possible with the original VNC cloudlet approach. We constructed this prototype in the RTSS Concept lab.
Our design was tricky because the face recognition client needs to know the IP address and the port on which the face recognition server is listening so that it can connect to it. The client uses an HTTP request to start the cloudlet setup and expects an HTTP response from the cloudlet server that includes the face recognition server's IP address and port. Because the VM executes in bridged mode, however, its IP address is assigned by the DHCP server, and the host server has no visibility into that assignment, so there was no simple way to obtain the IP address and port.
To solve this problem, we included a Windows service in the VM that runs on startup. The Windows service invokes a Python script that performs the following three tasks (a sketch of such a script appears after this list):
start the face recognition server executable in a separate thread inside a Python script,
read the face recognition server configuration file that contains the IP address and port that the face recognition server is listening on, and
write this information to a file that is accessible by the cloudlet.
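A minimal sketch of such a startup script is shown below. The executable path, configuration file, keys, and output file name are hypothetical stand-ins rather than the ones used in our prototype.

```python
# Hypothetical sketch of the VM startup script invoked by the Windows service.
# All paths, file names, and configuration keys are illustrative only.
import configparser
import subprocess
import threading

def start_face_recognition_server():
    # Task 1: launch the face recognition server executable in its own thread.
    subprocess.call([r"C:\cloudlet\face_server.exe"])

threading.Thread(target=start_face_recognition_server, daemon=True).start()

# Task 2: read the server configuration to learn the IP address and port
# on which the face recognition server is listening.
config = configparser.ConfigParser()
config.read(r"C:\cloudlet\face_server.ini")
ip_address = config["server"]["ip_address"]
port = config["server"]["port"]

# Task 3: write that endpoint to a file the cloudlet host can read, so the
# HTTP response returned to the client can include the address and port.
with open(r"C:\cloudlet\server_endpoint.txt", "w") as endpoint_file:
    endpoint_file.write(f"{ip_address}:{port}\n")
```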
Although the Windows service creates additional complexity on the cloudlet server, it reduces the complexity of cloudlet setup in the field. During field operation, servers residing within Tactical Operations Centers (TOCs) and Humvees are provisioned with a set of pre-packaged cloudlets to support a range of applications and versions, avoiding the need to provision servers for each supported application platform and version. The handheld devices of soldiers participating in the mission are then loaded with the application overlays that are necessary for a particular mission. A soldier running a computation-expensive application can discover a compatible cloudlet within minutes and offload the expensive computation to the cloudlet running on a server.
What We’ve Learned
Our research has identified the following two types of applications that can be deployed in a cloudlet setting:
Data-source-reliant applications that rely on a particular data source to work. For example, if soldiers need to launch the facial recognition application, they need a database of faces to match images against. Another example would be if a soldier wanted to compare fingerprints and needed a database of fingerprints to match against. In this setting, the cloudlet must be configured to connect to a particular data source.
Non-data-source-reliant applications that are computationally intensive but don’t require a large data source to work. For example, imagine soldiers encountering a sign with characters they don’t understand. They can take a picture of the sign and submit it to a cloudlet to determine the language in which the sign is written. In this case the computationally-intensive code residing on the cloudlet relies on complex character recognition algorithms instead of a large database.
As expected, our experiments demonstrated that a larger overlay increases transmission time (which in turn consumes more battery) as well as VM synthesis time. Including the data source inside the overlay would therefore create a large overlay, which suggests that the cloudlet concept is a better fit for non-data-source-reliant applications. We overcame this problem by specifying the location of the data source in a configuration file. The location could be the local server or a server accessible over a network or the internet. Although this approach requires additional configuration, it is done only once (when the cloudlet is packaged by IT experts), rather than each time a server is configured in the field (potentially by non-IT experts).
Future Work
When testing the cloudlet prototype in the RTSS Concept Lab, we discovered that reduced deployment time makes it easier to deploy an application in a tactical environment. We are working to capture those measurements and are developing the following applications to support our findings:
fingerprint recognition — fingerprints are captured using a fingerprint scanner connected to a handheld device and sent to the cloudlet for processing,
character recognition — pictures of a written sign are taken with a camera on the handheld device and sent to the cloudlet for character identification and translation,
speech recognition — voice of a person speaking a foreign language is captured using the voice recorder on the handheld device and sent to the cloudlet for translation; the same application can be used to translate a response back to the identified foreign language, and
model checking — an app is generated on the handheld on the fly using end-user programming capabilities and sent to a model checker in a cloudlet to ensure it does not violate any security (or other) policies and constraints.
We will use these new applications to gather measurements related to the bandwidth consumed by overlay transfer and VM synthesis, so that we can focus on optimizing cloudlet setup time.
Our future research and collaboration will position cloudlets to both reduce battery consumption and simplify application deployment in the field. For example, our goal is to use dynamic VM synthesis to slash the time needed to deploy applications, thereby shielding operators from unnecessary technical details, while also communicating and responding to mission-critical information at an accelerated operational tempo.
Additional Resources:
This is the second post in a series exploring the SEI’s research in Cloud Computing in partnership with Satya. To read the initial post, Cloud Computing for the Battlefield, please visit http://blog.sei.cmu.edu/post.cfm/cloud-computing-for-the-battlefield.
By Edwin Morris, Advanced Mobile Systems Initiative Lead, Research, Technology & System Solutions
Whether soldiers are on the battlefield or providing humanitarian relief, they need to capture and process a wide range of text, image, and map-based information. To support soldiers in this effort, the Department of Defense (DoD) is beginning to equip soldiers with smartphones to allow them to manage the vast array and amount of information they encounter while in the field. Whether the information gets correctly conveyed up the chain of command depends, in part, on the soldier’s ability to capture accurate data while in the field. This blog posting, a follow-up to our initial post, describes our work on creating a software application for smartphones that allows soldier end-users to program their smartphones to provide an interface tailored to the information they need for a specific mission.
The software we developed is constructed primarily in Java and operates on an Android platform. We used an object database (DB 4.0) as the underlying data store because it provides flexible and powerful application programming interfaces (APIs) that simplified our implementation. For performance reasons, our application is a native Android app; it does not run in a browser on an Android smartphone.
Our app—called eMONTAGE (Edge Mission Oriented Tactical App Generator)—allows a soldier to build customized interfaces that support the two basic paradigms that are common to smartphones: maps and lists. For example, a soldier could build an interface that allows them to construct a list of friendly community members including names, affiliations with specific groups, information about whether the person speaks English, and the names of the person’s children. If the soldier also specifies a GPS location in the customized interface s/he constructs, the location of the friendly community members could be plotted on a map. Likewise, the same soldier could build other customized interfaces that capture specific aspects of a threatening incident, or the names and capabilities of non-governmental organizations responding to a humanitarian crisis.
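To make the idea concrete, the sketch below shows the kind of record type a soldier might define through such a customized interface. It is illustrative only: the field names are hypothetical, and the actual app is written in Java on Android and stores records in an object database rather than in Python dataclasses.

```python
# Illustrative only: the shape of a record a soldier might define for a
# "friendly community member" list. Field names are hypothetical.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class FriendlyContact:
    name: str
    group_affiliation: str
    speaks_english: bool
    children_names: List[str] = field(default_factory=list)
    # If a GPS location is captured, the record can also be plotted on a map.
    latitude: Optional[float] = None
    longitude: Optional[float] = None

contact = FriendlyContact(
    name="Example Resident",
    group_affiliation="Village council",
    speaks_english=True,
    children_names=["Child A", "Child B"],
    latitude=34.5, longitude=69.2,
)
```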
Challenges We Encountered
The software we built is intended for soldiers who are well-versed in their craft, but are not programmers. While we are still conducting user testing, after we developed a prototype, we asked several soldiers to provide feedback. Not surprisingly, we found that soldiers who are Android users and relatively young (i.e., digital natives) quickly learned the software programming application and could use it to build a new application on-site. Conversely, non-digital natives had a harder time. Since our goal is to make our software accessible to every soldier, we are simplifying, revising, and improving the user interface.
As with any device used by our military, security is a key concern. Through our work with DARPA’s Transformative Apps program in the Information Innovation office, we can take advantage of the security strategies they conceive and implement. We are also working to address challenges associated with limited bandwidth and battery consumption in this work and other work within the Research, Technology, and Systems Solutions program at the SEI.
Another area of our work involves enabling our software to connect to back-end data sources that the DoD uses. For example, a soldier on patrol may need to connect to TiGR and other information systems to access current information about people, places, and activities in an area. Our software will enable these soldiers to build customized interfaces to such data sources by selecting fields for display on the phone and by extending the information provided by these sources with additional, mission-specific information. This capability will provide mashups that support soldiers by capturing multiple sources of information for display and manipulation. Once our full capability is available in spring 2012, it will become much easier to build phone interfaces to new data sources and extend these interfaces with additional information.
Looking to the Future
Currently, eMONTAGE can handle the basic information types that are available on an Android phone, including images, audio, and data. Technologies like fingerprint readers and chemical sensors are being miniaturized and will likely be incorporated into future handheld devices. With each new technology, we’ll need to add that basic type to our capability. Fortunately, this is a relatively straightforward programming operation, but it does require engineering expertise. As a new type becomes available, professional engineers will add it to eMONTAGE, thereby making the type available to soldiers who may have little or no programming expertise.
Our current focus is on ensuring that the software is reliable and does not fail, but we are also looking to extend it to provide features that we believe are essential, such as better support for collections of objects. For example, soldiers may need to classify a single individual into different groups: a family member, translator, or member of an organization. Each of these groups is a collection. Soldiers will have the ability to list and search through collections (e.g., list all members of an NGO who work for Doctors Without Borders) and plot the members of a collection on a map (e.g., display all members of Doctors Without Borders who are within 10 miles of my current position.)
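The map query in the last example reduces to a distance filter over a collection. A minimal sketch of that filter is shown below, using the standard haversine great-circle distance; the record fields and sample data are hypothetical.

```python
# Illustrative sketch of "list the members of a collection within N miles of
# my current position." Record fields and sample data are hypothetical.
import math

EARTH_RADIUS_MILES = 3958.8

def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def members_within(collection, my_lat, my_lon, radius_miles=10.0):
    """Return the records in the collection within radius_miles of the user."""
    return [m for m in collection
            if distance_miles(my_lat, my_lon, m["lat"], m["lon"]) <= radius_miles]

ngo_members = [{"name": "Member A", "lat": 40.44, "lon": -79.94},
               {"name": "Member B", "lat": 41.00, "lon": -80.50}]
nearby = members_within(ngo_members, 40.45, -79.95)   # only "Member A" qualifies
```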
While we can provide access to military iconology, eMONTAGE is not DoD-specific by design. This application can be used by other government organizations—or even non-government organizations— that want a user-customizable way to capture information about any variety of people, places, and things, and share this information effectively in the enterprise.
Part of our ongoing research involves testing our applications with soldiers through the Naval Postgraduate School’s Center for Network Innovation and Experimentation (CENETIX). In our initial tests with the soldiers, they told us which capabilities they need and what did not work. These collaborations tie our work firmly into both the research and military communities and keep us focused on providing a useful and cutting-edge capability. In addition to continuing our collaboration with CENETIX, we are working with Dr. Brad Myers of the Carnegie Mellon University Human Computer Interaction Institute. Dr. Myers is helping us define an appropriate interface for soldiers to use the handheld software in the challenging situations they face.
Additional Resources:
This posting is the second in a series exploring our research in developing software for soldiers who use handheld devices in tactical networks. To read our first post in the series, please visit http://blog.sei.cmu.edu/post.cfm/a-new-approach-for-handheld-devices-in-the-military.
By Arie Gurfinkel, Senior Member of the Technical Staff, Research, Technology, & System Solutions
The DoD relies heavily on mission- and safety-critical real-time embedded software systems (RTESs), which play a crucial role in controlling systems ranging from airplanes and cars to infusion pumps and microwaves. Since RTESs are often safety-critical, they must undergo an extensive (and often expensive) certification process before deployment. This costly certification process must be repeated after any significant change to the RTES, such as migrating a single-core RTES to a multi-core platform, significant code refactoring, or performance optimizations, to name a few. Our initial approach to reducing re-certification effort—described in a previous blog post—focused on the parts of a system whose behavior was affected by changes using a technique called regression verification, which involves deciding the behavioral equivalence of two, closely related programs. This blog posting describes our latest research in this area, specifically our approach to building regression verification tools and techniques for static analysis of RTESs.
Although there are many types of RTESs, we concentrate on a class of periodic programs, which are concurrent programs that consist of tasks that execute periodically. The tasks are assigned priorities based on their frequency (higher frequency = higher priority). The RTES executes the tasks using a priority-based preemptive scheduler. Each execution of a task is called a job. Thus, from the perspective of the scheduler, a system’s execution is a constant periodic stream of jobs of different priorities. In the rest of this post, we use RTES to mean periodic programs.
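A small sketch helps make the task model concrete. The task names and periods below are hypothetical; the point is only that priorities follow rates and that each periodic release of a task produces a job for the scheduler.

```python
# Illustrative periodic task set (names and periods in ms are hypothetical).
# Rate-monotonic priority assignment: the shorter the period (the higher the
# frequency), the higher the priority.
tasks = {"sensor": 10, "control": 20, "logger": 100}

by_priority = sorted(tasks, key=lambda name: tasks[name])
print(by_priority)   # ['sensor', 'control', 'logger']; 'sensor' has the highest priority
```

Every release of "sensor" in this sketch is a job, so over any time window the scheduler sees a stream of jobs at these three priority levels.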
In the beginning of the project, we assumed that automated verification techniques (such as static analysis and model checking) for single-core RTESs could be adapted for regression verification since these techniques have been used for sequential single-core programs. After conducting an initial survey, however, we found that existing automated verification techniques that apply directly to program source (rather than to a manual abstract model) are not applicable to periodic programs. Our original approach to extend static analysis to regression verification in the setting of multi-core RTESs was therefore changed in two ways. First, in phase 1 of our project we developed a new static analysis technique for reasoning about bounded executions of periodic programs. Second, in phase 2 we extended regression verification to multi-threaded programs, of which periodic programs are a restricted subset. The remainder of this blog posting describes these two phases.
Phase 1: Time-Bounded Verification of Periodic Programs
In the first part of our work, we developed an approach for time-bounded verification of safety properties (user-specified assertions) of periodic programs written in the C programming language. Time-bounded verification is the problem of deciding whether a given program does not violate any user-specified assertions in a given time interval. Time-bounded verification makes sense for RTESs because of their intimate dependence on real-time behavior. The inputs to our approach are (1) a periodic program C; (2) a safety property expressed via an assertion A embedded in C; (3) an initial condition Init of C; and (4) a time bound W. The output is either a counter-example trace showing how C violates the assertion A, or a message saying that the program is safe, in the sense that no execution within the time bound violates any user-specified assertion.
Our solution to time-bounded verification is based on sequentialization, which involves reducing verification of a concurrent program P to verification of a (non-deterministic) sequential program P’. A key feature of our approach is that P’ is linear in the size of P, which means the translation step is not computationally intensive and adds little overhead to the verification effort. The scalability of our approach is therefore mostly driven by the scalability of the underlying analysis engine, and our approach automatically benefits from constant improvements in the verification area.
Our work builds upon previous sequentialization work for context-bounded analysis (CBA) and bounded model checking (BMC). Our approach differs from prior work, however, since it bounds the actual execution time of the program, which is more natural to the designer of an RTES than a bound on the number of context switches (as done in CBA) or a bound on the number of instructions executed (as in BMC). We bound the execution time by translating the input time bound W in our model to a bound on the number of jobs. This translation is a natural consequence of the fact that the tasks are periodic and are therefore activated a finite number of times within W.
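The translation from a time bound to a job bound is simple arithmetic: a task with period T releases at most ceil(W / T) jobs in a window of length W that starts at a release. A small sketch, with hypothetical periods, follows.

```python
# Sketch of translating a time bound W into a bound on the number of jobs.
# Periods are hypothetical and given in milliseconds.
import math

periods_ms = [10, 20, 100]
W = 300   # time bound in ms

job_bound = sum(math.ceil(W / T) for T in periods_ms)
print(job_bound)   # 30 + 15 + 3 = 48 jobs to consider within the bound
```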
We implemented our approach in a tool called REK. REK supports C programs with tasks, priorities, priority ceiling locks, and shared variables. It takes a concurrent periodic program that cannot be analyzed with standard tools for sequential verification and converts it into a program that can be analyzed with such tools. Although in principle REK is compatible with any analyzer for bounded (loop- and recursion-free) C programs, in practice we rely on the CBMC tool by Daniel Kroening, which is one of the first and most mature bounded model checkers for C. CBMC can automatically analyze substantial C programs by encoding assertion violations as Boolean satisfiability queries, and it is a mature and robust tool that has been extensively applied to many industrial problems.
How REK Works
The analysis problem that REK is designed to solve is to check that a given periodic program is safe under all legal scheduling of tasks. REK solves a time-bounded version of this problem, e.g., whether the program is safe in the first 100ms, 200ms, 300ms, etc., starting from some user-specified initial condition. A time-bounded verification makes sense in the context of periodic programs since their execution can be naturally partitioned by time-intervals. Of course, in practice, unbounded verification would be preferred, so we are working on extending REK in this direction.
We briefly summarize the sequentialization step done by REK. First, we divide a time-bounded execution into execution rounds (or rounds, for short). The execution starts in round 0; a new round starts (and the old one ends) whenever a job of some task finishes. An execution with X jobs therefore requires X execution rounds. The sequentialization step simulates execution of each round independently and then combines the rounds (using non-deterministic choice) into a single legal execution. Further details of the construction are available in our FMCAD 2011 paper referenced below.
In addition to the basic sequentialization described above, REK is extended with the following features to achieve scalability to realistic programs:
Partial order reduction is a set of techniques used in model checking to reduce the number of interleavings that must be explored in a concurrent system. For example, if there are two independent actions a and b, then only one of the two executions ‘a followed by b’ or ‘b followed by a’ must be explored, since they both lead to the same destination state. Although there are many approaches for partial order reduction in explicit state model checking (as opposed to the symbolic model checking used in this work), extending them to symbolic verification is an area of active research. In REK, we developed a new partial order reduction technique that restricts explored executions only to those in which a read statement is preempted by a write statement to the same variable, or a write is preempted by a read or a write (a small sketch of this condition appears after this list). This reduction eliminates many unnecessary interleavings and cuts the search space significantly. Our experiments show that the reduction is quite effective in practice.
A limitation of our approach is that it does not keep track of the actual execution time of each instruction, each job, and each task. As such, it is an over-approximation: it explores more executions than are actually possible and can produce a "false positive" by reporting a counter-example trace that is not possible on a given hardware architecture due to timing restrictions. To reduce the number of false positives, we further constrain our sequentialization with information that can be inferred from schedulability analysis. If a periodic program is schedulable, it satisfies the rate monotonic analysis (RMA) equations. Those equations can be used to compute an upper bound on the number of times any given low-priority job can be preempted by any given high-priority job. We call this the preemption bound (a sketch of this calculation also appears after the list), which REK uses to further reduce the number of interleavings by keeping track of how many times one task preempts another and ensuring that this value never exceeds the preemption bound for the jobs of that task.
To deal with practical periodic programs, REK provides support for two types of commonly used lock primitives. In particular, it supports preemption locks (preemptions are disabled when the lock is held) and priority ceiling locks (preemption by any task with lower priority than the lock is disabled when the lock is held). We are extending REK to support the third common type of locks, priority-inheritance locks (regular blocking locks, but the priority of a low-priority task that holds a lock l is increased if a high-priority task is waiting for l).
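The read/write condition used by our partial order reduction (the first feature above) can be stated as a small predicate over pairs of accesses. The sketch below illustrates the condition itself, not REK’s implementation of it.

```python
# Illustrative statement of the reduction condition: a preemption point between
# two accesses must be explored only when both touch the same variable and at
# least one of them is a write. Read/read pairs, and accesses to different
# variables, are independent and can be skipped.
def must_explore_preemption(access1, access2):
    same_variable = access1["var"] == access2["var"]
    involves_write = access1["is_write"] or access2["is_write"]
    return same_variable and involves_write

print(must_explore_preemption({"var": "x", "is_write": False},
                              {"var": "x", "is_write": True}))   # True (read preempted by write)
print(must_explore_preemption({"var": "x", "is_write": False},
                              {"var": "y", "is_write": True}))   # False (different variables)
```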
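The preemption bound (the second feature above) can be derived from standard rate monotonic response-time analysis. The sketch below uses the classic recurrence R = C_i + sum over higher-priority tasks of ceil(R / T_j) * C_j; the task parameters are hypothetical, and this is an illustration of the idea rather than REK’s exact calculation.

```python
# Illustrative derivation of a preemption bound from rate monotonic analysis.
# Each task is (worst-case execution time C, period T) in ms, listed from
# highest to lowest priority; parameters are hypothetical.
import math

tasks = [(2, 10), (4, 20), (10, 100)]

def response_time(index):
    """Classic RMA recurrence: R = C_i + sum over higher-priority tasks of
    ceil(R / T_j) * C_j, iterated to a fixed point."""
    C_i, _ = tasks[index]
    R = C_i
    while True:
        interference = sum(math.ceil(R / T_j) * C_j for C_j, T_j in tasks[:index])
        R_next = C_i + interference
        if R_next == R:
            return R
        R = R_next

def preemption_bound(low, high):
    """Upper bound on how often one job of the lower-priority task can be
    preempted by jobs of the higher-priority task."""
    _, T_high = tasks[high]
    return math.ceil(response_time(low) / T_high)

print(response_time(2))        # worst-case response time of the lowest-priority task
print(preemption_bound(2, 0))  # bound on preemptions by the highest-priority task
```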
As part of our research, we created a model problem using the NXTway-GS, which is a two-wheeled, self-balancing robot that responds to Bluetooth commands. The robot uses a gyroscope to balance itself upright by applying power to left and right wheels. It also uses a sonar sensor so that when it comes to an obstacle, like a wall or ditch, it can back up. We have used REK to verify and fix several communication consistency properties between the tasks of the robot. More information on the use of REK for the NXTway-GS is available at http://www.andrew.cmu.edu/~arieg/Rek.
Phase 2: Regression Verification for Multi-threaded Programs
In the second phase of our work, we examined regression verification for multi-threaded programs. We believe that once we have regression verification for multi-threaded programs, we can adapt it to periodic programs as well.
Every instance of regression verification is based on some underlying notion of equivalence. The equivalence notion for single-threaded software is called partial equivalence: two functions are partially equivalent if they produce the same output for the same input. A multi-threaded program, conversely, is not partially equivalent to itself by the above definition since the same input can lead to different outputs due to scheduling choices. Our first challenge therefore involved creating a notion of equivalence for multi-threaded software.
Our second challenge was to come up with the right notion of decomposition to establish equivalence of programs from equivalence of their functions. Equivalence of sequential programs is established using Input/Output equivalence: two sequential programs are equivalent if it is possible to show that their corresponding functions have the same Input/Output behavior (produce the same output given the same input). In the case of multi-threaded programs, however, functions from different threads of a single program affect one another, making simple decomposition at the level of functions much harder because it must take interference from other threads into account.
To check whether two multi-threaded programs are partially equivalent (P = P’) we use a proof rule consisting of a set of premises and a conclusion. Each premise establishes the partial equivalence of a pair of functions f and f’ from P and P’, respectively. A premise is established by verifying a single-threaded program.
As part of this work, we developed two separate proof rules:
The first rule attempts to show equivalence of two programs by showing that their corresponding functions are Input/Output equivalent (produce the same output for a given input) under arbitrary interference, where "interference" means that the value of shared variables can change between execution of instructions of a thread. This rule is "strong" (not widely applicable to equivalent programs) because in practice the functions need only be equivalent in the context of the given program, not under arbitrary interference.
The second rule improves on the first rule by attempting to show that two programs are equivalent by restricting interference to what is consistent with the other functions in the program. For example, if there is no other function in a program that can affect a global variable ‘x’, then no interference that modifies ‘x’ is considered. This rule is "weaker" (more widely applicable) than the first one, but is computationally harder to automate.
In Conclusion
The ability to statically reason about correctness of periodic programs and the ability to perform regression verification adds the following key capabilities to an RTES developer’s toolbox:
ability to check prior to deployment that the program does not violate its assertions,
ability to check that top-level application programming interfaces (APIs) are not affected by low-level refactoring and/or performance optimizations,
ability to check that new APIs are backward compatible with old APIs, and
ability to perform impact analysis to determine which functions may be affected by a given source code change and which unit tests must be repeated.
We believe these capabilities can lower the cost of developing RTESs, while increasing their reliability and trustworthiness.
Additional Resources
For more information about our tool REK and our experiments, please visit http://www.andrew.cmu.edu/user/arieg/Rek
For more information about the bounded model checker CBMC, please visit http://www.cprover.org/cbmc
B. Goldin and O. Strichman. "Regression Verification," in Proceedings of DAC 2009, pp. 466-471.
S. Chaki, A. Gurfinkel, and O. Strichman. "Time-Bounded Analysis of Real-Time Systems," in Proceedings of FMCAD 2011, pp. 72-80.
S. Chaki, A. Gurfinkel, and O. Strichman. "Regression Verification for Multi-Threaded Programs," to appear in Proceedings of VMCAI 2012.
By Dennis R. Goldenson, Senior Member of the Technical Staff, Software Engineering Measurement and Analysis
As with any new initiative or tool requiring significant investment, the business value of statistically based predictive models must be demonstrated before they will see widespread adoption. The SEI Software Engineering Measurement and Analysis (SEMA) initiative has been leading research to better understand how existing analytical and statistical methods can be used successfully and how to determine the value of these methods once they have been applied to the engineering of large-scale software-reliant systems. As part of this effort, the SEI hosted a series of workshops that brought together leaders in the application of measurement and analytical methods in many areas of software and systems engineering. The workshops helped identify the technical barriers organizations face when they use advanced measurement and analytical techniques, such as computer modeling and simulation. This post focuses on the technical characteristics and quantified results of models used by organizations at the workshops.
Participants were invited and asked to present at the workshops only if they had empirical evidence about the results of their modeling efforts. A key component of this work is assembling leaders within the organizations who know how to conduct measurement and analysis and can demonstrate how it is successfully integrated into the software product development and service delivery processes. Understandably, attendees don’t share proprietary information, but rather talk about the methods that they used, and, most importantly, they learn from each other.
At a recent workshop, the various models discussed were statistical, probabilistic, and simulation-based. For example, organizational participants
demonstrated the use of Bayesian belief networks and process flow simulation models to define end-to-end software system lifecycle processes requiring coordination among disparate stakeholder groups to meet product quality objectives and efficiency of resource usage,
described the use of Rayleigh curve fitting to predict defect discovery (depicted as defect densities by phase) across the software system lifecycle and to predict latent or escaping defects (a minimal sketch of this approach appears after this list), and
described the use of multivariable linear regression and Monte Carlo simulation to predict software system cost and schedule performance based on requirements volatility and the degree of overlap of the requirements and design phases (e.g., a surrogate for the risk of proceeding with development prematurely).
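As an illustration of the second item in the list, the sketch below computes Rayleigh-model defect-discovery predictions by phase and the implied latent (escaping) defects. The parameters and phase names are hypothetical and are not drawn from any workshop presentation.

```python
# Illustrative Rayleigh-curve defect prediction (hypothetical parameters).
# K is the total number of defects expected over the lifecycle; t_m is the
# phase index at which defect discovery peaks.
import math

K = 500.0     # total expected defects (hypothetical)
t_m = 3.0     # peak-discovery phase (hypothetical)

def cumulative_defects(t):
    """Cumulative defects discovered by time t under a Rayleigh model."""
    return K * (1.0 - math.exp(-t * t / (2.0 * t_m * t_m)))

phases = ["reqs", "design", "code", "unit test", "integration", "system test"]
previous = 0.0
for i, phase in enumerate(phases, start=1):
    found = cumulative_defects(i) - previous
    previous = cumulative_defects(i)
    print(f"{phase:12s} predicted defects: {found:6.1f}")

print(f"latent (escaping) defects: {K - previous:6.1f}")
```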
Quantifying the Results
The presentations covered many different approaches applied across a large variety of organizations. Some had access to large data repositories, while others used small datasets. Still others addressed issues of coping with missing and imperfect data, as well as the use of expert judgment to calibrate the models. The interim and final performance outcomes predicted by the models also differed considerably, and included defect prevention, customer satisfaction, other quality attributes, aspects of requirements management, return on investment, cost, schedule, efficiency of resource usage, and staff skills as a function of training practices.
One case study, presented by David Raffo, professor of business, engineering, and computer science at Portland State University, described an organization releasing defective products with high schedule variance. The organization’s defect-removal activities were based on unit test, where they faced considerable reliability problems. They knew they needed to reduce schedule variance and improve quality, but they had a dozen ideas to consider for how to actually accomplish that. They wanted to base their decision on a quantitative evaluation of the likelihood of success of each particular effort. A state-based discrete event model of large-scale commercial development processes was built to address that and other problems. The simulation was parameterized using actual project data. Some outcomes predicted by the model included the following:
cost in staff-months of effort or full-time-equivalent staff used for development, inspections, testing, and rework,
numbers of defects by type across the life cycle,
delivered defects to the customer, and
calendar months of project cycle time.
Raffo’s simulation model was used as part of a full business case analysis. The model ultimately determined likely return on investment (ROI) and related financial performance under different proposed process change scenarios.
Another example presented by Neal Mackertich and Michael Campo of Raytheon Integrated Defense Systems demonstrated the use of a Monte Carlo simulation model they developed. The model was created to support Raytheon’s goal of developing increasingly complex systems with smaller performance margins. One of their most daunting challenges was schedule pressure. Schedules are often managed deterministically by the task manager, limiting the ability of the organization to assess the risk and opportunity involved, perform sensitivity analysis, and implement strategies for risk mitigation and opportunity capture. The model developed at Raytheon allowed them to
statistically predict their likelihood of meeting schedule milestones,
identify task drivers based on their contribution to overall cycle time and percentage of time spent on the critical path, and
develop strategies for mitigating the identified risk.
The primary output of the model was the prediction interval estimate of schedule performance (generated from Monte Carlo simulation) using individual task duration probability estimation and an understanding of the individual task sequence relationships. Engineering process funding was invested in the development and deployment of the model and critical chain project management, resulting in a 15 to 40 percent reduction in cycle time duration against the baseline.
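The core of such a schedule-risk model can be illustrated with a small Monte Carlo sketch. The task estimates and milestone below are hypothetical, and this simple serial chain omits the task sequencing, critical-path, and sensitivity analyses that the Raytheon model supported.

```python
# Illustrative Monte Carlo estimate of the probability of meeting a schedule
# milestone for a simple serial chain of tasks. Durations (in days) are
# hypothetical triangular estimates: (optimistic, most likely, pessimistic).
import random

task_estimates = [(10, 15, 25), (20, 30, 45), (5, 8, 14)]
milestone_days = 60
trials = 100_000

hits = 0
for _ in range(trials):
    total = sum(random.triangular(low, high, mode)
                for low, mode, high in task_estimates)
    if total <= milestone_days:
        hits += 1

print(f"Estimated probability of meeting the milestone: {hits / trials:.2%}")
```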
Encouraging Adoption
While these types of models are used frequently in other fields, they are not as often applied in software engineering, where the focus has often been on the challenges of the system being developed. As the field matures, more analysis should be done to determine quantitatively how products can be built most efficiently and affordably, and how we can best organize ourselves to accomplish that.
The initial cost of model development can range from a month or two of staff effort to a year, depending on the scope of the modeling effort. Tools can range from $5,000 to $50,000, depending on the level of capability provided. As a result of these kinds of investments, models can and have saved organizations millions of dollars through the resulting improvements. Our challenge is to help change the practice of software engineering, where the tendency is to "just go out and do it," so that it includes this type of product and process analysis. To do so, we know we have to conclusively demonstrate that the information gained is worth the expense and bring these results to a wider audience.
Additional Resource:
To read the SEI technical report, Approaches to Process Performance Modeling: A Summary from the SEI Series of Workshops on CMMI High Maturity Measurement and Analysis, please visit www.sei.cmu.edu/library/abstracts/reports/09tr021.cfm
By Douglas C. Schmidt, Chief Technology Officer
A key mission of the SEI is to advance the practice of software engineering and cyber security through research and technology transition to ensure the development and operation of software-reliant Department of Defense (DoD) systems with predictable and improved quality, schedule, and cost. To achieve this mission, the SEI conducts research and development (R&D) activities involving the DoD, federal agencies, industry, and academia. One of my initial blog postings summarized the new and upcoming R&D activities we had planned for 2011. Now that the year is nearly over, this blog posting presents some of the many R&D accomplishments we completed in 2011.
Our R&D benefits the DoD and other sponsors by identifying and solving key technical challenges facing developers and managers of current and future software-reliant systems. Our R&D work focuses on the following four major areas of software engineering and cyber security:
Innovating software for competitive advantage. This area focuses on producing innovations that revolutionize development of assured software-reliant systems to maintain the U.S. competitive edge in software technologies vital to national security.
Securing the cyber infrastructure. This area focuses on enabling informed trust and confidence in using information and communication technology to ensure a securely connected world to protect and sustain vital U.S. cyber assets and services in the face of full-spectrum attacks from sophisticated adversaries.
Advancing disciplined methods for engineering software. This area focuses on improving the availability, affordability, and sustainability of software-reliant systems through data-driven models, measurement, and management methods to reduce the cost, acquisition time, and risk of our major defense acquisition programs.
Accelerating assured software delivery and sustainment for the mission. This area focuses on ensuring predictable mission performance in the acquisition, operation, and sustainment of software-reliant systems to expedite delivery of technical capabilities to win the current fight.
Following is a sampling of the SEI’s R&D accomplishments in each of these areas during 2011 with links to additional information about these projects.
Innovating Software for Competitive Advantage
Although the SEI advocates software architecture documentation as a software engineering best practice, the specific value of software architecture documentation has not been established empirically. The blog posting Measuring the Impact of Explicit Architecture Documentation describes a research project we conducted to measure and understand the value of software architecture documentation on complex software-reliant systems, focusing on creating architectural documentation for a major subsystem of Apache Hadoop, the Hadoop Distributed File System (HDFS).
The SEI has developed algorithms and tools to optimize the performance of cyber-physical systems without compromising their safety. The blog posting Ensuring Safety in Cyber-Physical Systems describes a safe double-booking algorithm that reduces the over-allocation of processing resources needed to ensure the timing behavior of safety-critical tasks in cyber-physical systems. A subsequent posting describes an algorithm for supporting mixed-criticality operations by giving more central processing unit (CPU) time to functions with higher value while ensuring critical timing guarantees.
Together with researchers at CMU, the SEI has worked to develop cloudlets, which are localized, lightweight servers running one or more virtual machines on which soldiers can offload expensive computations from their handheld mobile devices, thereby providing greater processing capacity and helping conserve battery power. The blog posting Cloud Computing for the Battlefield describes a cloudlet prototype the SEI developed to recognize faces on an Android smartphone. A subsequent posting describes how the SEI is using cloudlets to help soldiers perform other mission capabilities more effectively, including speech and imaging recognition, as well as decision making and mission planning.
SEI-developed methods and tools allow soldier end-users to program their smartphones to provide an interface tailored to the information they need for a specific mission. The blog posting A New Approach for Handheld Devices in the Military motivates the need for soldiers to access information on a handheld device and describes software we are developing to enable soldiers to tailor the information for a given mission or situation. A subsequent blog posting describes the challenges the SEI encountered when equipping soldiers with end-user programming tools.
Other SEI-developed methods and tools help reduce the time and effort needed to re-certify mission- and safety-critical real-time embedded software systems (RTESs) after significant changes have been made, such as migrating a single-core RTES to a multi-core platform, significant code refactoring, or performance optimizations. The blog posting on Regression Verification of Real-time Embedded Software focuses on research in applying regression verification (which involves deciding the behavioral equivalence of two closely related programs) to help the migration of RTESs from single-core to multi-core platforms. A subsequent posting describes regression verification tools and techniques that the SEI is building to conduct static analysis of RTESs.
Securing the Cyber Infrastructure
A large percentage of cybersecurity attacks against DoD and other government organizations are caused by disgruntled, greedy, or subversive insiders, employees, or contractors with access to that organization’s network systems or data. The blog posting Protecting Against Insider Threats with Enterprise Architecture Patterns describes work that researchers at the CERT® Insider Threat Center have been conducting to help protect next-generation DoD enterprise systems against insider threats by capturing, validating, and applying enterprise architectural patterns. These patterns can be used to ensure that the necessary agreements are in place (IP ownership and consent to monitoring), critical IP is identified, key departing insiders are monitored, and the necessary communication among departments takes place to mitigate the impact of insider threats.
The SEI has been conducting research to help organizational leaders manage critical services in the presence of disruption by presenting objectives and strategic measures for operational resilience, as well as tools to help them select and define those measures. The blog posting Measures for Managing Operational Resilience describes how the SEI has been exploring the topic of managing operational resilience at the organizational level for the past seven years through development and use of the CERT Resilience Management Model (CERT-RMM). The CERT-RMM is a capability model designed to establish the convergence of operational risk and resilience management activities and apply a capability level scale that expresses increasing levels of process performance.
New malicious code analysis techniques and tools being developed at the SEI will better counter and exploit adversarial use of information and communication technologies. The blog posting Fuzzy Hashing Techniques in Applied Malware Analysis describes a technique the SEI has developed to help analysts determine whether two pieces of suspected malware are similar. A subsequent posting discusses types of malware against which similarity measures of any kind (including fuzzy hashing) may be applied. Other blog postings on Learning a Portfolio-Based Checker for Provenance-Similarity of Binaries and Using Machine Learning to Detect Malware Similarity describe our research on using classification (a form of machine learning) to detect "provenance similarities" in binaries, which means that they have been compiled from similar source code (e.g., differing by only minor revisions) and with similar compilers (e.g., different versions of Microsoft Visual C++ or different levels of optimization). Yet another blog posting A New Approach to Modeling Malware using Sparse Representation describes our use of suffix trees, zero-suppressed binary decision diagrams, and sparse representation modeling to create a rapid search capability that allows analysts to quickly analyze a new piece of malware.
Advancing Disciplined Methods for Engineering Software
Recent SEI research aims to improve the accuracy of early estimates (whether for a DoD acquisition program or commercial product development) and ease the burden of additional re-estimations during a program’s lifecycle. The blog posting Improving the Accuracy of Early Cost Estimates for Software-Reliant Systems describes challenges we have observed trying to accurately estimate software effort and cost in DoD acquisition programs, as well as other product development organizations. A subsequent post explores a method and tools the SEI is developing to help cost estimation experts get the right information into a familiar and usable form for producing high quality cost estimates early in the lifecycle.
A notable new approach at the SEI combines elements of the SEI’s Architecture Centric Engineering (ACE) method, which requires effective use of software architecture to guide system development, with its Team Software Process (TSP), which is a team-centric approach to developing software that enables organizations to better plan and measure their work and improve software development productivity to gain greater confidence in quality and cost estimates. The blog postings Combining Architecture-Centric Engineering Within TSP and Using TSP to Architect a New Trading System describe how ACE was applied within the context of TSP to develop system architecture to create a reliable and fast new trading system for Grupo Bolsa Mexicana de Valores (BMV, the Mexican Stock Exchange).
Over the last several years, the SEI hosted a series of workshops that brought together leaders in the application of measurement and analytical methods in many areas of software and systems engineering. The workshops helped identify the technical barriers organizations face when they use advanced measurement and analytical techniques, such as computer modeling and simulation. The blog posting on Using Predictive Modeling in Software Development: Results from the Field describes the technical characteristics and quantified results of models used by organizations at the workshops.
Accelerating Assured Software Delivery and Sustainment for the Mission
The SEI has been assisting large-scale DoD acquisition programs in developing systematically reusable software platforms that provide applications and end-users with many net-centric capabilities, such as cloud computing or Web 2.0 applications. The blog posting A Framework for Evaluating Common Operating Environments explains how the SEI developed a Software Evaluation Framework and applied it to help assess the suitability of common operating environments for the U.S. Army.
Methods and processes that enable large-scale software-reliant DoD systems to innovate rapidly and adapt products and systems to emerging needs within compressed time frames were another area of exploration for the SEI. A series of blog postings details our research on improving the overall value delivered to users by strategically managing technical debt, which involves decisions made to defer necessary work during the planning or execution of a software project, as well as describing the level of skill needed to develop software using Agile for DoD acquisition programs and the importance of maintaining strong competency in a core set of software engineering processes.
Teams at the SEI also have been researching common problems faced by acquisition programs related to the development of IT systems, including communications, command, and control; avionics; and electronic warfare systems. A series of blog postings covers acquisition problems, such as
misaligned incentives, which occur when different individuals, groups, or divisions are rewarded for behaviors that conflict with a common organizational goal
the need to sell the program, which describes a situation in which people involved with acquisition programs have strong incentives to "sell" those programs to their management, sponsors, and other stakeholders so that they can obtain funding, get them off the ground, and keep them sold
the evolution of "science projects," which describes how prototype projects that unexpectedly grow in size and scope during development often have difficulty transitioning into a formal acquisition program, and
the tragedy of common infrastructure and joint programs, which arises when multiple organizations attempt to cooperate in the development of a single system, infrastructure, or capability that will be used and shared by all parties.
The SEI also developed a collaborative method for engineering systems with critical safety and security ramifications. A series of blog postings on this topic explores problems with safety and security requirements, examines key obstacles that acquisition and development organizations encounter concerning safety- and security-related requirements, and explains how the Engineering Safety- and Security-related Requirements (ESSR) method overcomes these obstacles.
Concluding Remarks
As you can see from the summary of accomplishments above, 2011 has been a highly productive and exciting year for the SEI R&D staff. Naturally, this blog posting just scratches the surface of SEI R&D activities. Please come back regularly to the SEI blog for coverage of these and many other topics we’ll be pursuing in 2012. As always, we’re interested in new insights and new opportunities to partner on emerging technologies, and we welcome your feedback in the comments below.
By Douglas C. Schmidt, Chief Technology Officer
After 47 weeks and 50 blog postings, the sands of time are quickly running out in 2011. Last week’s blog posting summarized key 2011 SEI R&D accomplishments in our four major areas of software engineering and cyber security: innovating software for competitive advantage, securing the cyber infrastructure, accelerating assured software delivery and sustainment for the mission, and advancing disciplined methods for engineering software. This week’s blog posting presents a preview of some upcoming blog postings you’ll read about in these areas during 2012.
Innovating Software for Competitive Advantage
The Value-Driven Incremental Development team is creating quantitative engineering techniques to support rapid delivery of high-value, high-quality software capabilities to the DoD. Their approach is based on quality attribute analysis models that guide incremental development so that DoD acquisition program offices will be able to get warfighters the features they need most, when they need them, while balancing speed-of-delivery, quality, value, and cost tradeoffs.
The Cyber-Physical Systems team is developing algorithms and verification techniques that enable the DoD to deliver reliable mission-critical capability cost-effectively by automating more of the development and assurance of cyber-physical embedded control systems. Their approach is based on new algorithms for precise and scalable functional analysis of real-time systems by exploiting scheduling constraints, as well as new resource reclamation algorithms for multi-threaded tasks in multi-core processors.
The Socio-Adaptive Systems team is establishing a new class of adaptive socio-technical systems wherein people, networks, and computer applications can locally decide how to respond when the demand for resources (network resources in this case) outstrips supply, while ensuring the best global use of whatever capacity is available. Their research combines the adaptability of human social institutions—in particular those based in market institutions—with automated network-resource optimization so that scarce tactical network capacity will automatically, continuously, and effectively be allocated to warfighters based on their needs.
The Edge-Enabled Tactical System team is improving the quality and relevance of information available to dismounted (edge) warfighters so the information they receive will be more consistent with and useful for their current missions. They are developing model-driven techniques and tools that will enable tactical units (e.g., squads of soldiers) to consume less battery power, computation, and bandwidth resources when performing their missions.
Securing the Cyber Infrastructure
The CERT Secure Coding Initiative is conducting research to reduce the number of software vulnerabilities to a level that can be mitigated in DoD operational environments. This work focuses on static and dynamic analysis tools, secure coding patterns, and scalable conformance testing techniques that help prevent coding errors or discover and eliminate security flaws during implementation and testing.
The CERT Insider Threat team is evaluating techniques for detecting known insider threats prior to attack, to assist the DoD in preventing future high-impact data loss. This work is leveraging the hundreds of cases in the CERT Insider Threat Database, simulation capacity in CERT’s Insider Threat Laboratory, and system dynamics models of insider crime to create the socio-technical architectural foundations to prevent this kind of damage now and into the future.
The CERT Coordination Center is developing methods and tools to reduce the cost to DoD suppliers and acquirers of improving software assurance and reliability during development and testing. Their aim is to enable these groups to identify software defects via dynamic blackbox "fuzz testing" in a manner identical to what an attacker would be able to perform, to remediate these vulnerabilities before the software is deployed operationally to the DoD.
The CERT Malicious Code team is developing tools to analyze obfuscated malware code to enable analysts to more quickly derive the insights required to protect and respond to intrusions of DoD and other government systems. Their approach uses semantic code analysis to de-obfuscate binary malware to a simple intermediate representation and then convert the intermediate representation back to readable binary that can be inspected by existing malware tools.
Accelerating Assured Software Delivery and Sustainment for the Mission
The Alternative Methods group is researching methods for increasing adoption of incremental development methods to accelerate delivery of software-related technical capabilities while reducing the cost, acquisition time and risk of major defense acquisition programs. Their approach focuses on developing a contingency model that identifies conditions and thresholds for when and how to use incremental development approaches in a DoD acquisition context. They are also documenting incremental development patterns and guidelines that chart the course for removing barriers to effective adoption of incremental and iterative approaches in the DoD.
The Acquisition Dynamics team is evaluating methods that mitigate the effects of misaligned acquisition program organizational incentives and adverse software-reliant acquisition structural dynamics by improving program decision-making. Their objective is to help DoD acquisition programs overcome some of the most severe counter-productive behaviors that stem from inherent social dilemmas by using known solutions drawn from fields such as behavioral economics, and thus deploy higher-quality systems to the field in a more timely and cost-effective manner.
Advancing Disciplined Methods for Engineering Software
The Software Engineering Measurement and Analysis group is developing methods and tools for modeling uncertainties for pre-milestone A cost estimates to minimize the occurrence of severe acquisition program cost overruns due to poor estimates. Their approach involves synthesizing Bayesian belief network modeling and Monte Carlo simulation to model uncertainties among program change drivers, allow subjective inputs, visually depict influential relationships and outputs to aid team-based model development, and assist with the explicit description and documentation underlying an estimate.
Concluding Remarks
This concludes our blog postings for 2011. It’s been my great pleasure and privilege to work with the technical staff at the SEI this year to better acquaint you with the SEI body of work. We’ve enjoyed reading your comments and hope that you’ve learned more about the R&D activities that we’re pursuing. We wish all of you a happy holiday season and look forward to hearing from you in 2012.
By Will Casey, Senior Researcher, CERT
Through our work in cyber security, we have amassed millions of pieces of malicious software in a large malware database called the CERT Artifact Catalog. Analyzing this code manually to find potential similarities and identify malware provenance is a painstaking process. This blog post follows up our earlier post by exploring how to create effective and efficient tools that analysts can use to identify malware.
At the heart of our approach are longest common substring (LCS) measures, which describe the amount of shared code in malware. In this post we explain how to create measures for similarity studies on malware via a suffix tree, which is a data structure that encodes an entire map of shared substrings in a malware corpus, such as the CERT Artifact Catalog. We characterize the performance of suffix trees and quantify their dependence on memory and input size. We also demonstrate the efficient construction of suffix trees for large malware data sets involving thousands of files. In addition, we compare LCS measures to the laborious and time-intensive process of manually creating signatures (regular expressions applied to the binary that are thought to be both specific to and indicative of malware).
Building the Suffix Tree
By building a suffix tree data structure for the CERT Artifact Catalog, we can form a better representation of the malware corpus for studies involving string query, shared string usage, and string similarity. Having uncharacterized data is like being in unexplored, unmapped territory. A suffix tree allows analysts to explore and map the malware landscape: shared code becomes the topographical features of the mapped landscape. Just as travelers use a map and landscape features to reason about where they are and where they want to go, malware analysts study the large shared substrings of the suffix tree to reason about which areas to focus on. For example, multiple pieces of malware from the Zeus family have code in common, which provides a means to explore and analyze the entire family.
A suffix tree can be built in time linear in the size of the input, allowing us to identify any long common substrings in linear time. We augmented the conventional suffix tree data structure and algorithm to include queries based on subsets of files and measures of information (such as Shannon entropy) on shared strings. To scale our suffix tree data structure to large data sets, we also developed external algorithms that operate efficiently beyond the capacity of main memory in a single computer.
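To make the LCS idea concrete, the following minimal sketch computes the longest common substring of two binaries with a simple dynamic program. It is illustrative only and assumes nothing about CERT's implementation: a real suffix tree achieves this in linear time and operates over an entire corpus rather than a single pair of files, and the toy byte strings below are invented.

```python
# Minimal sketch: longest common substring (LCS) of two binaries via
# dynamic programming. A suffix tree computes the same answer in linear
# time and scales to a whole corpus; this quadratic version only
# illustrates what the LCS measure captures.

def longest_common_substring(a: bytes, b: bytes) -> bytes:
    """Return the longest byte string that appears in both a and b."""
    best_len, best_end = 0, 0
    # prev[j] = length of the common suffix ending at a[i-1] and b[j-1]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]

# Example usage with two invented "binaries":
if __name__ == "__main__":
    f1 = b"\x90\x90PAYLOAD_DECODER_v2\xcc\xcc"
    f2 = b"\x31\xc0PAYLOAD_DECODER_v2\x90"
    shared = longest_common_substring(f1, f2)
    print(len(shared), shared)
```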
Using the Suffix Tree to Create an LCS Measure for Similarity Studies on Malware
After constructing the suffix tree, we used it to analyze different families of malware, including the Poison Ivy malware family that installs a remote access tool onto an exploited machine. Poison Ivy files were collected by CERT from 2005 to 2008. Although this family of malware is no longer thought to be in active development, analysts have examined it extensively. We used Poison Ivy files as a test set to validate findings from our data structures. For example, we applied clustering based on LCS and compared it to a "ground truth" of known subgroups within the Poison Ivy family.
Our suffix tree data structure enabled us to identify several LCSs that were common to many files in the Poison Ivy family. By quickly filtering out strings of low entropy, we were left with meaningful coding sequences from which we could determine sequences that are characteristic of the malicious software family.
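As a rough illustration of that entropy filter, the sketch below computes the Shannon entropy of a byte string and discards short or low-entropy candidate substrings. The thresholds are arbitrary values chosen for the example, not the ones used in our study.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Bits per byte: 0.0 for a constant string, up to 8.0 for uniform random bytes."""
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def filter_low_entropy(substrings, min_entropy=2.5, min_length=16):
    """Keep shared substrings long enough and information-rich enough to be meaningful.

    The cutoffs here are illustrative assumptions only.
    """
    return [s for s in substrings
            if len(s) >= min_length and shannon_entropy(s) >= min_entropy]
```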
Validating the Measure
After analyzing the code using the suffix trees, we compared our results against signatures that were developed over the course of several years of extensive examination by analysts. We used suffix trees to identify several critical substrings that matched identically across multiple files, exceeded a certain length, and had satisfactory information content. These landmark substrings were then used to create a feature vector for each file, and the feature vectors were used to cluster the files into subgroups. We then created dendrograms that suggested relationships among the files based on co-location of long common substrings.
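The following sketch shows one way landmark substrings could be turned into presence/absence feature vectors and clustered hierarchically. It assumes NumPy and SciPy are available; the helper names, linkage method, and distance metric are choices made for this example rather than details of the CERT tooling.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def feature_vectors(files, landmarks):
    """One row per file: 1 if the file contains the landmark substring, else 0."""
    return np.array([[1 if lm in f else 0 for lm in landmarks] for f in files])

def cluster_files(files, landmarks, max_clusters=4):
    X = feature_vectors(files, landmarks)
    # Average-linkage clustering on Jaccard distance between presence/absence vectors.
    Z = linkage(X, method="average", metric="jaccard")
    labels = fcluster(Z, t=max_clusters, criterion="maxclust")
    # Z can be plotted with scipy.cluster.hierarchy.dendrogram to produce
    # the kind of dendrogram described above.
    return Z, labels
```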
To validate the clusters, we revisited the Poison Ivy files and used the signatures that had been developed by analysts to identify versions in the software. Our evaluation showed that the LCS clustering produced groupings consistent with signatures that were developed by analysts, in many cases exposing additional sub-groups that we were unaware of. Moreover, the LCS clustering can group corrupted files and identify potential incorrect attributions.
Results of our Research
We used suffix trees to analyze approximately 200 to 1,000 files in about four hours and to identify additional details about the structure of the family that analysts could not access via manual inspection alone. Unfortunately, people often view automated methods as a means to replace human analysis. The goal of our research, however, is to use suffix trees to bring computing to bear more effectively on the problem of distinguishing malware from clean-ware. For example, malware may have components that resemble more than one family. Our new tool may allow us to identify those components, such as an element that sets up a command-and-control interface or one that installs a remote access tool.
Future Work
In the past year, our research has focused on creating the suffix tree data structures and ensuring that they can provide us with useful information about malware families. Our next steps are to scale the data structures to larger data sets and optimize them to allow for even larger input sizes. We are currently able to incorporate approximately 8,000 files into a data structure. Ideally, we would like to optimize the data structures and algorithms (exploiting parallelism) to include between 80,000 and 100,000 files, whose combined size can exceed the main memory of a single computer.
Additional Resources
For further reading about CERT Program work in malware or malicious code research, click on the SEI Blog links below:
A New Approach to Modeling Malware using Sparse Representation
Using Machine Learning to Detect Malware Similarity
Fuzzy Hashing Techniques in Applied Malware Analysis
Learning a Portfolio-Based Checker for Provenance-Similarity of Binaries
More information about CERT research and development is available in the 2010 CERT Research Report, which may be viewed online at www.cert.org/research/2010research-report.pdf
By Donald Firesmith, Senior Member of the Technical Staff, Acquisition Support Program
In our work with acquisition programs, we’ve often observed a major problem: requirements specifications that are incomplete, with many functional requirements missing. Whereas requirements specifications typically specify normal system behavior, they are often woefully incomplete when it comes to off-nominal behavior, which deals with abnormal events and situations the system must detect and how the system must react when it detects that these events have occurred or situations exist. Thus, although requirements typically specify how the system must behave under normal conditions, they often do not adequately specify how the system must behave if it cannot or should not behave as normally expected. This blog post examines requirements engineering for off-nominal behavior.
Examples of off-nominal behavior that are inadequately addressed by requirements specifications include how robust (i.e., error, fault, and failure tolerant) the system must be, how the system must behave when hardware fails or software defects are executed, how the system must react when incorrect data (e.g., out-of-range values or incorrect data types) is input, and what should happen if the system detects that it is in an improper mode or inconsistent state. This lack of specification leads to the following omissions and the questions that must be asked as a result.
All credible conditions and events. How must the system behave under off-nominal sets of preconditions and trigger events that are unlikely and/or infrequent? When these conditions occur—as they invariably will—there is a risk that the system either does not handle them or the developers have been forced to guess (often incorrectly) how the system must behave. The requirements therefore need to specify how the system shall behave under all credible combinations of conditions and trigger events. Moreover, how are combinations of rare conditions and events determined to be not credible? Users and requirements engineers often underestimate the probability of rare occurrences, so they are surprised when they occur and the system reacts improperly. If these off-nominal conditions and the desired behavior of the system to them are not identified and documented early in the lifecycle, the decisions about what error/fault conditions should be handled by the system are left to individuals who may not have the proper expertise to identify such conditions, but who nevertheless feel compelled to make such decisions.
Detecting off-nominal situations. How will the system recognize off-nominal combinations of conditions and events? Does the system need sensors to determine the existence of these states or occurrence of these events? How available, reliable, accurate, and precise must these sensors and inputs be?
Reacting to off-nominal situations. How must the system react when it recognizes an off-nominal combination of conditions (possibly when a specific, associated event occurs)? Must it notify users or operators by providing warnings, cautions, or advisories? Must it do something to ensure that the system remains in a safe or secure state? Must the system be able to shut down in a safe and secure state or must it automatically restart? Must it record abnormal situations and the responses of the users/operators?
Incomplete use case models. Use case modeling is the most common requirements identification and analysis method for functional requirements. Each use case has one or more normal (so-called "sunny day") paths (a.k.a., courses and flows) as well as several exceptional ("rainy day") paths. Unfortunately, requirements engineers often concentrate so heavily on normal paths that there is inadequate time and staffing to properly address the credible exceptional paths. This omission leads to incomplete requirements specifications that do not adequately address necessary robustness (e.g., error, fault, and failure tolerance), reliability, safety, and security.
Coding standards. Programming languages typically include features and reusable code (e.g., base classes that come with the language) that are inherently unreliable, unsafe, and insecure. Because language features may not be well defined in the language specification, their behavior may be inconsistent. For example, the use of concurrency and automated garbage collection can lead to common defects, such as race conditions, starvation, deadlock, livelock, and priority inversion. Likewise, certain language features may be used in an incomplete manner. For example, an if/then/else construct may lack an else clause stating what to do if the if-clause precondition is not true, and a "do X followed by do Y" sequence may not say what to do if X fails to complete. There are other cases, such as divide-by-zero situations, taking the square root of negative numbers, a lack of strong typing, and no verification of inputs, preconditions, invariants, postconditions, and outputs; a brief sketch following this list illustrates the point. These implementation coding defects typically start as requirements defects: incomplete requirements that do not mandate the use of reliable, safe, and secure subsets of the language, safe base classes, and automatically verified coding standards.
Lack of subject matter expertise. Exception handling is often left to the programmers, who must ensure their software is error, fault, and failure tolerant and meets its requirements. Programmers will be blamed if defects prevent the system from being available, reliable, robust, safe, and secure, even if there are no relevant requirements. Unfortunately, programmers often make assumptions as to what the software’s off-nominal behavior should be. Without adequate domain expertise and sufficient contact with subject matter experts, programmers will incorporate defects and safety/security vulnerabilities. Likewise, poor quality requirements specifications show how requirement engineers struggle to address mandatory off-nominal requirements since they lack sufficient domain expertise and training to determine, analyze, and specify adequate availability, reliability, robustness, safety, and security requirements. Ultimately, the engineering of these quality requirements requires subject matter expertise that is rarely combined in any one developer.
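As a hypothetical illustration of the coding-standards point above, the sketch below contrasts a nominal-path-only function with one whose off-nominal behavior (wrong type, out-of-range input, divide by zero) is made explicit. The names, ranges, and error responses are invented for the example; in practice the appropriate reactions would come from the system's requirements rather than from the programmer's guesses.

```python
# Hypothetical illustration only: the same computation with and without
# off-nominal handling. Nothing here is drawn from a real program.

def scale_reading_incomplete(raw, divisor):
    # Nominal path only: no else branch, no range check, no divide-by-zero guard.
    # Off-nominal inputs silently return None or raise an unplanned exception.
    if 0 <= raw <= 1023:
        return raw / divisor

def scale_reading_robust(raw, divisor):
    """Return the scaled reading, or raise a descriptive error for off-nominal input."""
    if not isinstance(raw, int) or not 0 <= raw <= 1023:
        raise ValueError(f"sensor reading out of range or wrong type: {raw!r}")
    if divisor == 0:
        raise ValueError("divisor must be nonzero")
    return raw / divisor
```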
There are often many more off-nominal (and rare) combinations of conditions and events than the common nominal ones. There are also often many more ways that a system can fail. Requirements specifications are typically incomplete with regard to the previous problems, often only including 10 to 30 percent of the necessary requirements. This level of incompleteness can result in systems that fail to meet their true availability, reliability, robustness, safety, and security requirements.
It is insufficient for requirements specifications to state that the system shall be highly available, reliable, robust, safe, and secure, or that it has no single points of failure. The requirements must specify all credible off-nominal combinations of conditions and events. Otherwise, software developers will make incorrect guesses, rely on incorrect assumptions, and ignore important off-nominal situations. Without complete requirements, verification will not catch these defects, and the resulting defective system will be fielded with highly unfortunate, if predictable, results. Program offices cannot safely assume that contractors will address these issues on their own; off-nominal situations must be properly addressed in the requirements.
Studies (Knight, Weiss, Leveson) have shown that the vast majority of accidents (safety) and many common software vulnerabilities (security) result at least partially from incomplete requirements. Many availability and reliability defects due to software also result, at least partially, from incomplete requirements.
Recommended Solutions
To address the problems described above, acquisition program offices should consider the following steps:
Address off-nominal requirements in the contract. The program office should contractually mandate all significant off-nominal behavior. The contract should also mandate that contractors address all credible off-nominal conditions and events affecting mission, safety, and security functionality.
To address all credible conditions and events, the program office should ensure that the contractor’s requirements engineering plan explicitly states that all credible combinations of conditions and events are to be addressed, even very rare ones, if the corresponding function is mission, safety, and/or security critical. The program office should verify that the requirements engineers collaborate with reliability, safety, and security engineers to ensure that no significant combinations of states and events are overlooked. The program office should also ensure that the proper testing of the associated software in terms of test completion criteria and test case generation criteria is explicitly addressed in the software test plans.
To detect off-nominal situations, the program office should ensure that the contractors have properly addressed the detection of off-nominal situations. Specifically, this includes verifying that the system engineering management plan (SEMP) as well as the system requirements, architecture, and design address situational awareness in terms of both sensors and the input of necessary data concerning off-nominal situations.
When reacting to off-nominal situations, the program office should ensure that the requirements address how the system must behave if it cannot behave in the nominal manner. This process includes ensuring that the system either remains in a safe and secure state or shuts down safely and securely (i.e., is fail-safe). It also includes notifications (warnings, cautions, and advisories) as well as logging any associated error, fault, or failure information.
To ensure against incomplete use case models, the program office should ensure that the models include all credible normal and exceptional use case paths and that adequate project schedule, budget, and staffing are allocated to complete them. Verification of the requirements and their associated models should explicitly address exceptional as well as normal use case paths.
With respect to contractor coding standards, the program office should ensure they explicitly address eliminating common design and coding defects that make the software less available, reliable, robust, safe, and secure. The program office should also ensure these coding standards are properly followed including, where practical, automatic verification via static and dynamic code checking.
With respect to subject matter expertise, the program office should ensure that associated quality requirements mandating adequate availability, reliability, robustness, safety, and security are engineered by cross-functional teams of closely collaborating requirements engineers, subject matter experts, stakeholders, and engineers specializing in reliability, safety, and security. This team must identify the appropriate credible off-nominal situations and decide which of these situations should be analyzed and turned into associated requirements given programmatic constraints such as cost, schedule, available development staffing, and critical functionality. The program office should also ensure that the contractors and subcontractors use appropriate coding standards and associated foundational software (e.g., a safe and secure subset of C++, including safe and secure base classes).
Most acquisition programs suffer from incomplete requirements, especially with regard to dealing with rare combinations of states and events, detecting and reacting to off-nominal situations, use-case models that are incomplete due to missing exceptional use case paths, and either inadequate coding standards or coding standards not being followed. The engineers who actually develop the software often lack adequate expertise in availability, reliability, robustness, safety, and security requirements, which yields systems that do not meet their associated requirements. While the problems are well known, so are their answers. Program offices, therefore, must ensure that these answers are implemented, enforced, and verified to be effective and efficient.
Additional Resources:
To read Don Firesmith’s series on The Importance of Safety- & Security-Related Requirements, please visit http://blog.sei.cmu.edu/archives.cfm/category/safety-related-requirements
By Ipek Ozkaya, Senior Member of the Technical Staff, Research, Technology, and System Solutions
Managing technical debt, which refers to the rework and degraded quality resulting from overly hasty delivery of software capabilities to users, is an increasingly critical aspect of producing cost-effective, timely, and high-quality software products. A delicate balance is needed between the desire to release new software capabilities rapidly to satisfy users and the desire to practice sound software engineering that reduces rework. A previous post described the practice of strategically managing technical debt related to software architecture, which involves deliberately postponing implementation of some architectural design choices to accelerate delivery of the system today and then rearchitecting at a later time. This blog post extends our prior post by discussing how an architecture-focused analysis approach helps manage technical debt by enabling software engineers to decide the best time to rearchitect—in other words, to pay down the technical debt.
Our architecture-focused approach for managing technical debt is part of the SEI’s ongoing research agenda on Agile architecting, which aims to improve the integration of architecture practices within Agile software development methods. This project is investigating which measures a software development team can apply to effectively monitor changing qualities of software, such as degrading modifiability and extensibility of the system at each iteration in an iterative and incremental lifecycle such as Agile. We initially investigated a particular metric—propagation cost—that measures the percentage of system elements affected when a change is made to a randomly chosen element.
A high propagation cost is an indication of tight coupling, such that when a change is made in the system, many parts of the system will be affected. We focused on propagation cost due to the rich set of existing static analysis techniques that evaluate code and design quality by measuring software coupling and cohesion, such as whether there are cycles within parts of the software system, whether there is code duplication, and so on. Most existing static analysis techniques focus on code quality and code-level technical debt. For example, a high percentage of duplicate code and cycles in the code indicates a high level of technical debt. In contrast, we applied propagation cost at the architecture level to calculate the impact of the dependencies looking at architectural elements, rather than calculating every dependency between different classes. The goal of this approach is to reduce complexity and provide insights, even when an implementation is not complete.
Our work explores the relationship between propagation cost and technical debt. In particular, we use propagation cost as one indication of increasing technical debt. We assess the potentially increasing rework—which is effectively the impact of paying back technical debt—based on monitoring increasing propagation cost of the system.
Reasoning about quality by modeling rework as a proxy for technical debt requires objective—and repeatable—representation of architectural properties (such as module dependencies and changing interfaces) for the model to work. We therefore modeled the dependencies of the architectural elements by means of a technique called design structure matrices (DSMs). DSMs can be used to visualize which elements use or depend on others at each iteration and to calculate propagation cost.
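As a concrete example of the metric, the sketch below computes propagation cost from a small design structure matrix by taking the transitive closure of the direct dependencies and measuring the density of the resulting visibility matrix. It assumes NumPy, follows the commonly published formulation of propagation cost, and uses an invented four-element DSM; it is not the SEI's analysis tooling.

```python
import numpy as np

def propagation_cost(dsm: np.ndarray) -> float:
    """Fraction of element pairs (i, j) where a change to j can propagate to i."""
    n = dsm.shape[0]
    # Start from direct dependencies plus self-reachability.
    reach = ((dsm != 0) | np.eye(n, dtype=bool)).astype(int)
    # Repeated squaring yields the transitive closure (the visibility matrix).
    for _ in range(max(1, int(np.ceil(np.log2(n))))):
        reach = ((reach @ reach) > 0).astype(int)
    return reach.sum() / (n * n)

# Invented four-element DSM: dsm[i][j] == 1 means element i depends on element j.
dsm = np.array([
    [0, 1, 0, 0],   # UI depends on Mediator
    [0, 0, 1, 0],   # Mediator depends on DataModel
    [0, 0, 0, 0],   # DataModel has no outgoing dependencies
    [0, 0, 0, 0],   # Logger has no outgoing dependencies
])
print(f"propagation cost = {propagation_cost(dsm):.2f}")  # ~0.44 for this toy DSM
```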
Our research on the propagation cost metric examined a real-world case study of a building automation control system. The research team had at its disposal the software engineering artifacts of the project, including the software architecture, code, functional and quality attribute requirements, and project management plan. Using these artifacts, we generated the design structure matrix of the system at each project iteration and completed "what-if" studies. This what-if analysis calculated accumulating rework based on allocating functionality and architectural tasks to iterations in different orders. We used the propagation cost measurement to assess what the team could have done differently in terms of delivering functionality at different times and calculated the overall impact on rework and lifecycle costs. Our goal was to demonstrate how different allocations of functionality and critical architectural tasks to iterations can enable developers to respond to changes more quickly and use technical debt to their advantage by monitoring the accruing rework.
The results of our studies showed that focusing on architectural dependencies, as well as using propagation cost as a proxy to indicate the level of changing complexity and rework, provided good insight into quantifying technical debt at the architecture level. This insight helps software architects, developers, and managers decide the best time to pay back technical debt or determine if technical debt is accumulating in the first place. To make these measurement and analysis techniques practical, however, they should be integrated seamlessly into engineers’ integrated development environments. For example, tools should have the ability to group classes into module view architectural elements and specify design rules, such as one element can or cannot access another element. New generation tools, such as Lattix, Sonargraph, and Structure101, are starting to explore such issues, though there is still room for improvement.
In some instances, architectural dependencies should also be integrated with architectural design decisions. For example, when using a mediator to decouple interfaces from the data model, the mediator communicates with all the interface and data model elements. When applying the propagation cost metric consistently across the system—including all dependencies to and from the controller element—a high propagation cost emerges. This high cost indicates a greater risk of technical debt and change propagation, potentially requiring rework when new features must be added.
Although higher propagation costs are generally associated with higher risk, in this case introducing a mediator to decouple the data model and the interface may be a good architectural decision since it localizes the changes. From a reliability perspective, however, the controller is a single point of failure. So in this case high propagation cost may not necessarily be negative from a modifiability standpoint, but it is still a reliability risk. Our studies revealed that enhancing propagation cost measurements with architectural information provides more insightful analysis of the actual implications of technical debt and rework.
Studying rework using propagation cost helps improve the integration of architecture practices within Agile software development methods. For example, when teams are developing software in an Agile context, they typically embrace Scrum as a project-management technique. It is often hard for teams to determine how to subdivide large architectural tasks and allocate them to small two- to four- week sprints. Our research demonstrated that by focusing on iteration-to-iteration analysis— rather than trying to time box distribution of functionality to sprints where each time box/sprint has the same duration—it is possible to show customers how the quality of the software changes with each release, such as how increasing propagation cost could impact rework.
Our next steps are to examine the scalability of assessing rework by focusing on architecture metrics. We developed two real-life case studies using dependency analysis to operationalize the measurement of propagation cost. While our approach works fine with 100 to 200 software architecture elements, we are now evaluating how well it scales to a larger number of elements. When our analysis focused on the software architecture rather than the code to quantify technical debt, we observed roughly an order-of-magnitude reduction in the number of dependencies analyzed, from about 200 to two dozen, and we were still able to pinpoint the potential emerging rework; this is a significant reduction in complexity. Architecture-level analysis of technical debt enables a team to gauge the status of the system quickly and decide whether to rework the system. Code-level analysis then enables the team to define specific tasks for developers.
Our research on an architecture-focused measurement framework for managing technical debt is informed by real-world examples gathered from Technical Debt Workshops. These workshops engage practitioners and researchers in an ongoing dialogue to improve the state of techniques for managing technical debt. The 2011 Managing Technical Debt Workshop co-located with the International Conference on Software Engineering (ICSE) revealed an increasing interest in managing technical debt proactively. As a result, we will conduct a third workshop—again collocated with ICSE on June 4, 2012. Our research team will also guest-edit the November/December 2012 issue of IEEE Software on the same theme and is accepting papers until April 1, 2012. We welcome any individuals who have experiences in this area to submit a paper for consideration in IEEE Software or the 3rd International Workshop on Managing Technical Debt.
This research was conducted in collaboration with Dr. Philippe Kruchten, professor of software engineering at the University of British Columbia, and Raghu Sangwan, associate professor of software engineering at Penn State University, and with support from Lattix, a leading provider of software architecture management solutions.
Additional Resources
N. Brown, P. Kruchten, R. Nord, and I. Ozkaya. Managing technical debt in software development: report on the 2nd international workshop on managing technical debt, held at ICSE 2011. ACM SIGSOFT Software Engineering Notes 36 (5): 33-35 (2011).
N. Brown, Philippe Kruchten, R. Nord, and I. Ozkaya. Quantifying the Value of Architecting Within Agile Software Development via Technical Debt Analysis. 2011.
N. Brown, R. Nord, I. Ozkaya, and M. Pais. Analysis and Management of Architectural Dependencies in Iterative Release Planning. In Proceedings of the 2011 Ninth Working IEEE/IFIP Conference on Software Architecture (WICSA '11). IEEE Computer Society, 103-112.
By Mary Ann Lapham, Senior Member of the Technical Staff, Acquisition Support Program
Over the past several years, the SEI has explored the use of Agile methods in DoD environments, focusing on both if and when they are suitable and how to use them most effectively when they are suitable. Our research has approached the topic of Agile methods both from an acquisition and a technical perspective. Stephany Bellomo described some of our experiences in previous blog posts What is Agile? and Building a Foundation for Agile. This post summarizes a project the SEI has undertaken to review and study Agile approaches, with the goal of developing guidance for their effective application in DoD environments.
The SEI’s Agile project began in 2009 in response to our recognition of the growing awareness that Agile methods help alleviate key challenges facing the DoD, such as providing competitive capabilities to warfighters in a timely manner that minimizes collateral damage and loss of lives and property. We also observed an emerging consensus that Agile methods can be applied to create systems whose functionality and quality attributes can be adapted more readily over time, which may help reduce total ownership costs over long acquisition program lifecycles. Within this context, the primary activities of our project included reviewing relevant literature, interviewing programs that are using or have used Agile methods, identifying criteria to determine whether a program is a candidate for Agile methods and what risks exist in implementing them, and creating guidelines for implementing Agile methods.
We initially focused on the question of whether Agile could be used in DoD acquisition programs, which historically follow the DoD 5000 series of guidelines that have been associated with so-called waterfall methods. An early finding of our project was that no prohibitions preclude the use of Agile in the DoD 5000 series. This result is important since some skeptics have asserted that Agile is not suited for the DoD due to inherent conflicts between Agile methods and DoD policies and regulations.
Given that there is no one size fits all Agile method, however, we also found that implementations of Agile methods must be tailored to fit the situation and context. In other words, Agile is not a silver bullet. Transitioning DoD systems—and their associated socio-technical ecosystems—to Agile therefore requires considerable work from DoD agencies and the defense industry base, and is not without hurdles.
For example, we found that adapting Agile to the DoD acquisition lifecycle presents unique challenges and opportunities. The challenges lie in meeting existing DoD milestone and regulatory criteria when there is little, if any, guidance available on how to do so when using Agile methods. The opportunities include the ability to deliver working results frequently. Other hurdles identified by DoD programs we interviewed include
providing the right team environment and allowing access to end users
determining how to train and coach the government staff
instituting suitable oversight methods
adapting rewards and incentives to the Agile environment
adjusting to a different team structure
These hurdles stem from taking an approach different from business as usual and from a general lack of specific training and guidance on Agile concepts and approaches within the DoD. Overcoming these hurdles requires changes to the waterfall-centric organizational culture that is common within DoD acquisition programs. These culture changes also require mindset changes because the underlying paradigm for implementing Agile methods differs from that used for the waterfall method.
After studying the topics described above to gain a preliminary understanding of the use of Agile within the DoD, we then studied other management, acquisition, and technical topics, as described below.
Agile Management and Acquisition Topics
Our project also is studying the following management and acquisition topics relevant to the effective adoption of Agile methods in the DoD:
Being Agile in the DoD. Agile methods provide promising techniques for streamlining the acquisition process for systems within the DoD. To meet the challenges of adopting Agile methods, however, DoD program management offices must take specific actions to assist in Agile adoption and even enable it. For example, to ensure successful Agile adoption, DoD organizations must plan for it, train for it, and anticipate changes in their environments and business models to ensure the benefits of Agile become a reality.
Managing and contracting for Agile programs. Managing large-scale, complex software-reliant systems is always hard, but management in Agile programs takes on some added dimensions. For example, program managers not only must be leaders, they must also be coaches, expeditors, and champions. If they do not personally perform these roles, someone in their organizations must be responsible for them. These additional roles are needed due to the paradigm associated with using Agile methods and the lack of any significant experience by current DoD personnel in that arena. A particular management concern is the selection and implementation of appropriate contracting vehicles to support the types of practices that successful Agile projects exhibit.
Technical milestone reviews. One sticking point in employing Agile methods is how to accommodate large capstone events, such as the preliminary design review (PDR), critical design review (CDR), and others. While many concerns exist in this area, it’s important to focus on the purpose of holding these reviews in the first place: to evaluate progress on and/or review specific aspects of the proposed software solution. Expectations and criteria must therefore reflect the level and type of documentation that is acceptable for the milestone, which is no different from business as usual. The key, however, is to define the level and type of documentation required for the specific program while working within an Agile environment.
Estimating in DoD Agile acquisition. Estimation done on Agile projects is typically not the same as the traditional methods used on legacy systems within DoD. Traditional methods tend to focus on estimating details up front; these details are then modified as more information is obtained. In contrast, Agile estimates are often "just-in-time," with high-level estimates that are refined to create detailed estimates as knowledge of the requirements matures. Some tools within the traditional estimation community are now adding modules to address Agile estimation.
Moving toward adopting Agile practices. Change is hard—especially for large DoD ecosystems—and understanding the scope of changes is essential. Organizational change methods must be employed to help DoD organizations successfully adapt to applying Agile. There are multiple adoption factors (such as business strategy, reward system, sponsorship, values, skills, structure, history, and work practices) that must all be addressed. Change-management best practices include understanding the adopter population, understanding the cycle of change, understanding the adoption risks, and building transition mechanisms to mitigate adoption risks. Organizations we studied that had successfully adopted Agile methods typically achieved the following goals:
Found and nurtured good sponsors for Agile adoption
Understood the adoption population they were dealing with
Conducted a readiness assessment that addressed organizational and cultural issues
Analyzed what adoption support mechanisms were needed for a particular context and built or acquired them before proceeding too far into an Agile adoption
SEI Agile work continues and the following additional documents are—or will soon be—available:
Case Study of Successful Use of Agile Methods in DoD: Patriot Excalibur 2011-TN-019
Agile Methods: Changing the Viewpoint of Government Technical Evaluation 2011-TN-026
A Closer Look at 804: A Summary of Considerations for DoD Program Managers 2011 SR-015
"DoD Agile Adoption—Necessary Considerations, Concerns, and Changes" in the Jan/Feb 2012 issue of CrossTalk
In addition, the SEI has created an Agile Collaboration Group to advise, review, enhance, and validate SEI acquisition work. The SEI is working with this group to create, calibrate, and validate a contingency model that will help acquisition professionals determine when to use Agile techniques, as well as how to identify potential risks if Agile methods are adopted. We are also creating guidelines that summarize best practices and instruct users of Agile methods on how to apply these methods effectively in DoD environments.
Agile Technical Topics
Agile software development has historically succeeded in small-scale (largely IT-based) commercial environments due largely to its easy-to-apply practices for tracking project status and allocating the development resources to those activities that deliver the most potential customer value. A key technical challenge for DoD projects, however, involves balancing the short-term and long-term needs. In particular, the cliché "You aren’t gonna need it (YAGNI)" is a principle in eXtreme Programming (XP) that implies developers should not add functionality until it is necessary, thereby eliminating a considerable amount of unused code in a system. The YAGNI principle rarely seems to apply, however, in large-scale DoD environments, where systems must operate for decades with continual flux with respect to evolving requirements, technology upgrades, new partners, and different contractors.
The SEI is conducting the following technical work on successfully creating and applying Agile methods for the DoD:
Agile at scale. This work focuses on providing methods and techniques for applying Agile software development practices to large-scale DoD programs, with improved visibility into the release plan and the quality of the system. One of our activities to address the use of Agile development at scale was a field study with organizations that deal with the challenges of Agile and architecture practices at scale. Based on our observations, we developed a readiness, best practices, and risk analysis technique. Striking the proper balance between developing the system and its architecture in an agile manner, while preserving the agility needed to enhance and maintain the system, is key to success. We observed that organizations succeed within an Agile environment when they pay close attention to architecture-centric practices and achieve a balance between feature development and architecture development.
Technical debt. Our work in technical debt analysis again focuses on architecture and looks at strategically incurring technical debt (such as applying architectural short-cuts) to improve agility in the short-term. This work focuses on developing techniques to monitor and respond to emerging rework, as well as the need to refactor or rearchitect the system to pay back the debt. The need to refactor or rearchitect DoD systems arises in several ways. For instance, system quality degradation, such as unacceptable end-to-end performance, might require refactoring. Such quality degradation-related rework can appear if the development teams focused solely on feature-oriented decomposition of the system to deliver features at early iterations, but didn’t provide the necessary architecture for the infrastructure in a timely manner. Refactoring would require the restructuring of the existing body of code to alter its internal structure (architecture) but not change its external behavior to address the decrease in quality.
Modeling decision impact on agile development. Our work also provides guidance and techniques that enhance the applicability of mainstream Agile and lean software development methods to DoD stakeholders by balancing their acquisition and technical needs. In a recently started project, for example, we are investigating acquisition and architecture activities during the pre-Engineering and Manufacturing Development phase of the acquisition lifecycle. This work closely examines modeling decision dependencies and analyzes their impact on the ability to conduct effective Agile system development. This work targets the perspective of reducing integration risks in large-scale DoD systems.
In summary, our projects have found that Agile methods can indeed provide both tactical and strategic benefits in the DoD. The tactical benefits of lower cost, on-time delivery, and increasing quality are clearly important as the DoD places a growing emphasis on greater efficiency in its acquisition processes. The strategic benefits of responsiveness and more rapid adaptability to the current situation, however, may be of even greater value in today’s world, where the DoD must get results faster and be better aligned with changing needs to prepare for an uncertain future. As our work progresses, we will periodically post our progress, ask questions, and request feedback. If you have any questions or feedback on the current work, please post in the comments below.
Additional Resources
Please see the following SEI technical reports and notes for more on Agile development:
Considerations for Using Agile in DoD Acquisition
Agile Methods: Selected DoD Management and Acquisition Concerns
Documenting Software Architectures in an Agile World
CMMI or Agile: Why not Embrace Both!
Incorporating Security Quality Requirements Engineering (SQUARE) into Standard Lifecycle Models
Secure Software Development Life Cycle Processes—A Technology Scouting Report
Integrating Software-Architecture-Centric Methods into Extreme Programming (XP)
By Douglas C. Schmidt, Visiting Scientist
We use the SEI Blog to inform you about the latest work at the SEI, so this week I'm summarizing some video presentations recently posted to the SEI website from the SEI Technologies Forum. This virtual event, held in late 2011, brought together participants from more than 50 countries to engage with SEI researchers on a sample of our latest work, including cloud computing, insider threat, Agile development, software architecture, security, measurement, process improvement, and acquisition dynamics. This post includes a description of all the video presentations from the first event, along with links where you can view the full presentations on the SEI website.
Paul Nielsen, director of the SEI, gave the opening remarks, which summarized the presentations in the SEI Technologies Forum, focusing on the SEI’s leadership role in software, security, and resiliency technologies and methods that help address the complexities of software-reliant systems. You can watch the opening presentation here.
My presentation described the SEI’s strategic plan to advance the practice of software engineering for the DoD, federal agencies, industry, and academia through research and technology transition. I motivated and summarized the following four major areas of software engineering and cyber security work at the SEI:
Innovating software for competitive advantage. This area focuses on producing innovations that revolutionize development of assured software-reliant systems to maintain the U.S. competitive edge in software technologies vital to national security.
Securing the cyber infrastructure. This area focuses on enabling informed trust and confidence in using information and communication technology to ensure a securely connected world to protect and sustain vital U.S. cyber assets and services in the face of full-spectrum attacks from sophisticated adversaries.
Advancing disciplined methods for engineering software. This area focuses on improving the availability, affordability, and sustainability of software-reliant systems through data-driven models, measurement, and management methods to reduce the cost, acquisition time, and risk of our major defense acquisition programs.
Accelerating assured software delivery and sustainment for the mission. This area focuses on ensuring predictable mission performance in the acquisition, operation, and sustainment of software-reliant systems to expedite delivery of technical capabilities to win the current fight.
You can watch a video of my presentation here. The remainder of this blog posting summarizes the forum presentations, which are grouped under the four major research areas outlined above.
Innovating Software for Competitive Advantage
The presentation on Architectural Implications of Cloud Computing by Grace Lewis defined cloud computing, explored different types of cloud computing environments, and described the drivers and barriers for cloud computing adoption. It also focused on examples of key cloud architecture and design decisions, such as data location and synchronization, user authentication models, and multi-tenancy support. This topic is important since cloud computing is being adopted by commercial, government, and Department of Defense (DoD) organizations, driven by a need to reduce the operational cost of their information technology resources.
From an engineering perspective, cloud computing is a distributed computing paradigm that focuses on providing a wide range of users with distributed access to virtualized hardware and/or software infrastructure over the internet. From a business perspective, it is the availability of computing resources that are scalable and billed on a usage basis (as opposed to acquired resources) that leads to potential cost savings in IT infrastructure. From a software architecture perspective, having resources in the cloud means that some elements of the software system will be outside the organization, and the control over these elements depends on technical aspects, such as the provided resource interface, and on business aspects, such as the service-level agreement (SLA) with the resource provider. Systems must therefore be designed and architected to account for lack of full control over important quality attributes. You can watch a video of Grace’s presentation here.
Ipek Ozkaya gave a presentation on Agile Development and Architecture: Understanding Scale and Risk. This presentation examined tactics that can help identify and mitigate key risks of large-scale, complex software development when there is a need to use Agile development and architecture-centric practices in concert. This topic is important because Agile software development and software architecture practices have received increasing attention from both industry and government over the past decade. The complementary nature of Agile development and software architecture practices is also increasingly recognized and appreciated. Applying Agile development with a concurrent focus on architecture, however, is still experimental and experiential rather than a proven practice based on sound engineering techniques. This presentation described how SEI researchers are helping organizations using Agile techniques deal with increased system software size and increased complexity in orchestrating larger engineering and development teams, to ensure that the systems they develop will be viable in the market for decades. You can watch a video of Ipek’s presentation here.
Securing the Cyber Infrastructure
A presentation by Randy Trzeciak on The Insider Threat: Lessons Learned from Actual Insider Attacks described the technical and behavioral aspects of insider threats, focusing on the types of insiders who committed the crimes, their motivations, organizational issues surrounding the incidents, methods of carrying out the attacks, impacts, and precursors that could have served as indicators to organizations for preventing incidents or detecting them earlier. It also conveyed the complex interactions, relative degrees of risk, and unintended consequences of policies, practices, technology, insider psychological issues, and organizational culture over time. This presentation stemmed from a decade of work by the Insider Threat Center at CERT, which has been researching insider threats since 2001 and has built an extensive library and comprehensive database containing more than 700 actual cases of insider cybercrimes. The presentation described findings from our analysis of three primary types of insider cybercrimes: IT sabotage, theft of information, and fraud. You can watch a video of Randy’s presentation here.
The Smart Grid Maturity Model: A Vision for the Future of Smart Grid presentation by David White offered insight into the past year’s use of the Smart Grid Maturity Model (SGMM), which is a management tool for the utility industry to plan a reliable, secure energy supply that is vital to our economy, our security, and our well-being. The smart grid represents a new framework for improved management of electricity generation, transmission, and distribution. With the support of the U.S. Department of Energy, the SEI is the steward of the SGMM. This presentation described the release of the SGMM V1.2 Product Suite and showed how utilities are working with the model. As more utilities around the globe participate and the SGMM experience base grows, the SGMM has become an increasingly valuable resource for helping inform the industry’s smart grid transformation. You can watch a video of David’s presentation here.
Julia Allen’s presentation was on Measuring Operational Resilience. This presentation suggested strategic measures for an organization’s operational resilience management (ORM) program, which defines an organization’s strategic resilience objectives (such as ensuring continuity of critical services in the presence of a disruptive event) and resilience activities (such as the development and testing of service continuity plans). Traditional operational security metrics, such as the number of machines patched, vulnerability scan results, the number of incidents, and the number of staff trained, are easy to collect and can be useful. If an organization’s objectives are to inform decisions, affect behavior, and determine control effectiveness in support of business objectives, however, it must consider a set of more strategic resilience measures. These ten strategic measures derive from lower-level measures at the CERT Resilience Management Model (RMM) process area level, including average incident cost by root cause type and the number of breaches of confidentiality and privacy of customer information assets resulting from violations of provider access control policies. You can see a video of Julia’s presentation here.
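As a rough illustration of one lower-level measure mentioned above (and not an excerpt from the CERT-RMM materials or Julia’s presentation), the short sketch below computes average incident cost by root cause type from a hypothetical incident log. The root cause categories and cost figures are invented for the example.

from collections import defaultdict

# Hypothetical incident records: (root cause type, cost in dollars).
incidents = [
    ("phishing", 12_000),
    ("phishing", 8_500),
    ("misconfiguration", 30_000),
    ("insider", 95_000),
    ("misconfiguration", 22_000),
]

totals = defaultdict(lambda: [0.0, 0])  # root cause -> [total cost, incident count]
for cause, cost in incidents:
    totals[cause][0] += cost
    totals[cause][1] += 1

for cause, (total, count) in sorted(totals.items()):
    print(f"{cause}: {count} incidents, average cost ${total / count:,.0f}")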
Advancing Disciplined Methods for Engineering Software
CMMI-SVC: The Strategic Landscape for Service, by Eileen Forrester, described the current state of CMMI for Services (CMMI-SVC). It also explored the larger strategic choices available to organizations in markets where superior service can improve work and business results. CMMI-SVC is important because the global economy is increasingly based on services, rather than manufacturing or trading of tangible goods. Even the development of goods and systems increasingly takes on the character of services. Innovative CMMI-SVC approaches that are already working in the United States, Latin America, Asia, and Europe can be tailored to meet the needs of other organizations and markets. You can watch a video of Eileen’s presentation here.
James McHale gave A Brief Survey of the Team Software Process (TSP). This presentation briefly described training and introduction of TSP practices, including the Personal Software Process (PSP); the results and potential benefits inherent in the methods; and the common use of TSP methods in combination with other popular practices, including Agile (Scrum, TDD, XP), architecture, secure coding, RUP, Six Sigma, and CMMI. TSP has been identified as one of the most effective practices for software developers by Capers Jones in his 2010 book Software Engineering Best Practices. You can see a video of James’s presentation here.
Accelerating Assured Software Delivery and Sustainment for the Mission
The presentation on Software Acquisition Program Dynamics by William ("Bill") Novak described the analysis the SEI is doing on data collected from more than 100 independent technical assessments of software-reliant acquisition programs. This analysis has produced insights into the most common ways that acquisition programs encounter difficulties. Programs regularly experience recurring cost, schedule, and quality failures, and progress and outcomes often appear to be unpredictable and unmanageable. Moreover, many acquisition leaders and staffers neither recognize these recurring issues nor realize that known solutions exist for many of these problems. This presentation explained how the SEI is working to mitigate the effects of misaligned acquisition program organizational incentives and adverse software-reliant acquisition structural dynamics by improving program staff decision making. To do this, SEI researchers are modeling and analyzing both the adverse acquisition dynamics encountered in actual programs and candidate solutions to resolve those dynamics. You can watch a video of Bill’s presentation here.
Next Event February 28
A second virtual event, Architecting Software the SEI Way, is planned for February 28. This event focuses on using architecture practices more effectively to build better systems more efficiently and productively by understanding the fundamentals of software architecture, improving practice through architecture evaluation guidelines, and bridging technical and business goals by applying architecture methods to analyze and evaluate enterprise software architectures. We look forward to "seeing" you there. If you have any questions or thoughts on any of the presentations, please feel free to leave your comments below.
Additional Resources
SEI Technologies Forum
Architecting Software the SEI Way
By Dave Zubrow, Manager, Software Engineering Measurement and Analysis Initiative
The SEI has been actively engaged in defining and studying high maturity software engineering practices for several years. Levels 4 and 5 of the CMMI (Capability Maturity Model Integration) are considered high maturity and are predominantly characterized by quantitative improvement. This blog posting briefly discusses high maturity and highlights several recent works in the area of high maturity measurement and analysis, motivated in part by a recent comment on a Jan. 30 post asking about the latest research in this area. I’ve also included links where the published research can be accessed on the SEI website.
At CMMI level 3, work is proactively managed and standard processes are used. Beyond level 3, process performance needs to be understood quantitatively. High maturity means you have the data to understand how the process is performing, how variation in the implementation and execution of the process affects performance, and what the likely costs and benefits of any change will be. A high-level description of the benefits, by process area, is shown below.
In past years, some CMMI users said they felt high maturity was not well defined, an issue addressed by its clarification in CMMI v1.3. The CMMI community has also debated the benefits of moving up to high maturity and asked for more examples of high maturity process implementations. Some challenges organizations face when striving for high maturity include developing an insightful set of measures; creating predictive models for process performance, project management, and product quality; and knowing which tools and methods to use for modeling and analysis. The SEI has worked to address these concerns and provide needed resources through courses, case studies, and other publications about the implementation of high maturity practices.
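To give a feel for what understanding process performance quantitatively can look like in practice, here is a minimal sketch of an individuals (XmR) control chart calculation, the kind of statistical process control technique covered in the Florac and Carleton book listed under Additional Resources below. The inspection-rate data are invented for illustration and are not drawn from any CMMI publication.

# Hypothetical peer-review inspection rates (LOC per hour) for ten reviews.
review_rates = [182, 205, 197, 230, 188, 214, 201, 176, 223, 195]

mean = sum(review_rates) / len(review_rates)
moving_ranges = [abs(b - a) for a, b in zip(review_rates, review_rates[1:])]
mr_bar = sum(moving_ranges) / len(moving_ranges)

# 2.66 is the standard XmR-chart constant that converts the average moving
# range into approximate three-sigma limits for individual observations.
ucl = mean + 2.66 * mr_bar
lcl = mean - 2.66 * mr_bar

print(f"process mean: {mean:.1f} LOC/hour")
print(f"control limits: [{lcl:.1f}, {ucl:.1f}]")
print("points signaling unusual variation:",
      [x for x in review_rates if not lcl <= x <= ucl])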
Publications defining and describing high maturity measurement and analysis practices include:
CMMI and TSP/PSP: Using TSP Data to Create Process Performance Models
By Shurei Tamura
This report describes the fundamental concepts of process performance models (PPMs) and explains how they can be created using data generated by projects following the Team Software Process (TSP). PPMs provide accurate predictions and identify factors that projects and organizations can control to better ensure successful outcomes, helping organizations move from a reactive mode to a proactive, anticipatory mode. PPMs are fundamental to the implementation of the high maturity process areas of CMMI and are specifically required in the Quantitative Project Management and Organizational Process Performance process areas. The three examples in this report demonstrate how data generated from projects using TSP can be combined with data from other sources to produce effective PPMs. (A minimal illustrative sketch of this kind of model appears after this list.)
www.sei.cmu.edu/library/abstracts/reports/09tn033.cfm
Approaches to Process Performance Modeling: A Summary from the SEI Series of Workshops on CMMI High Maturity Measurement and Analysis
By Robert W. Stoddard & Dennis R. Goldenson
Organizations are increasingly striving for and achieving high maturity status, yet there is still an insufficient shared understanding of how best to implement measurement and analysis practices appropriate for high maturity organizations. A series of twice-yearly workshops organized by the SEI allows organizations to share lessons learned to accelerate the adoption of best measurement and analysis practices in high maturity organizations.
This report summarizes the results from the second and third high maturity measurement and analysis workshops. The participants' presentations described their experiences with process performance models; the goals and outcomes of the modeling; the x factors used; the data collection methods; and the statistical, simulation, or probabilistic modeling techniques used. Overall summaries of the experience and future plans for modeling also were provided by participants.
www.sei.cmu.edu/library/abstracts/reports/09tr021.cfm
CMMI High Maturity Measurement and Analysis Workshop Report: March 2008
By Robert W. Stoddard II, Dennis R. Goldenson, Dave Zubrow, & Erin Harper
Organizations are increasingly looking for guidance on how to implement CMMI high maturity practices effectively and how to sustain their momentum for improvement. As high maturity organizations work to improve their use of measurement and analysis, they often look to examples of successful implementations for guidance. In response to the need for clarification and guidance on implementing measurement and analysis in the context of high maturity processes, members of the SEI’s Software Engineering Measurement and Analysis (SEMA) initiative organized a workshop at the 2008 SEPG North America conference to bring leaders in the field together at a forum on the topic. Other workshops will be held as part of an ongoing series to allow high maturity organizations to share best practices and case studies.
www.sei.cmu.edu/library/abstracts/reports/08tn027.cfm
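The sketch below, referenced in the first entry above, shows in miniature what a process performance model can look like: a simple one-factor linear regression that predicts an outcome a project cares about (escaped defect density) from a factor it can control (review coverage). The single-factor form of the model and all of the data are illustrative assumptions, not taken from the reports.

# Hypothetical history: (code review coverage %, escaped defects per KLOC).
history = [(40, 6.1), (55, 4.8), (60, 4.2), (70, 3.5), (80, 2.9), (90, 2.2)]

n = len(history)
mean_x = sum(x for x, _ in history) / n
mean_y = sum(y for _, y in history) / n

# Ordinary least-squares fit of defects = intercept + slope * coverage.
slope_num = sum((x - mean_x) * (y - mean_y) for x, y in history)
slope_den = sum((x - mean_x) ** 2 for x, _ in history)
slope = slope_num / slope_den
intercept = mean_y - slope * mean_x

def predict_escaped_defects(review_coverage_pct):
    # Predict escaped defect density for a planned level of review coverage.
    return intercept + slope * review_coverage_pct

print(f"model: defects/KLOC = {intercept:.2f} + ({slope:.3f}) * coverage")
print(f"predicted at 75% coverage: {predict_escaped_defects(75):.2f} defects/KLOC")

In a real PPM, the controllable factors, the statistical form of the model, and estimates of prediction uncertainty would all be derived from the organization’s own calibrated data, as the reports above describe.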
The following reports describe results from surveys conducted related to the implementation and impacts of high maturity practices.
Performance Effects of Measurement and Analysis: Perspectives from CMMI High Maturity Organizations and Appraisers
By James McCurley & Dennis R. Goldenson
This report describes results from two recent surveys conducted by the SEI to collect information about the measurement and analysis activities of software systems development organizations. Representatives of organizations appraised at maturity levels 4 and 5 completed the survey in 2008. Using a variant of the same questionnaire in 2009, certified high maturity lead appraisers described the organizations that they had most recently coached or appraised for the achievement of similar high maturity levels. The replies to both surveys were generally consistent, even though the two groups are often thought to be quite different. The survey results suggest that the organizations had a solid understanding of CMMI-based process performance modeling and related aspects of measurement and analysis and used them extensively. Both the organizational respondents in 2008 and the appraisers in 2009 reported that process performance models were useful for the organizations.
The respondents in both surveys also judged that process performance modeling is more valuable in organizations that understood and used measurement and analysis activities more frequently and provided organizational resources and management support. In addition, results from the 2009 survey of lead appraisers indicate that organizations that achieved their appraised high maturity level goals also found measurement and analysis activities more useful than those organizations that did not achieve their targets.
www.sei.cmu.edu/library/abstracts/reports/10tr022.cfm
Use and Organizational Effects of Measurement and Analysis in High Maturity Organizations: Results from the 2008 SEI State of Measurement and Analysis Practice Surveys
By Dennis R. Goldenson, James McCurley, and Robert W. Stoddard II
There has been a great deal of discussion about what organizations need to attain high maturity status and what they can reasonably expect to gain by doing so. Clarification is needed, along with good examples of what has worked well and what has not, particularly with respect to measurement and analysis. This report contains results from a survey of high maturity organizations conducted by the SEI in 2008. The questions center on the use of process performance modeling in those organizations and the value added by that use. The results show considerable understanding and use of process performance models among the organizations surveyed; however, there is also wide variation in the respondents’ answers. The same is true for the survey respondents’ judgments about how useful process performance models have been for their organizations. As is true for less mature organizations, there is room for continuous improvement among high maturity organizations. Nevertheless, the respondents’ judgments about the value added by process performance modeling also vary predictably as a function of the understanding and use of the models in their respective organizations. More widespread adoption and improved understanding of what constitutes a suitable process performance model holds promise to improve CMMI-based performance outcomes considerably.
www.sei.cmu.edu/library/abstracts/reports/08tr024.cfm
We hope you find this information useful. If there are any other areas of SEI research that you would like us to highlight, please leave a comment below.
Additional Resources:
An article in the January/February 2012 issue of CrossTalk, The Journal of Defense Software Engineering, titled "High Maturity - The Payoff" discusses the value and benefits realized by organizations that adopt high maturity CMMI Level 4 and 5 software processes.
Measuring the Software Process: Statistical Process Control for Software Process Improvement
By William A. Florac & Anita D. Carleton
This book was one of the first published works that explicitly addressed how to use statistical process control methods to manage and improve software processes within an organization. It remains a key reference for how to implement high maturity measurement and analysis. It explains how quality characteristics of software products and processes can be quantified, plotted, and analyzed, so that the performance of software development activities can be predicted, controlled, and guided to achieve both business and technical goals.
www.sei.cmu.edu/library/abstracts/books/0201604442.cfm
To learn more about the SEI’s work in Measurement and Analysis, please visit www.sei.cmu.edu/measurement/index.cfm