Training Magazine Network


Improving Data Quality Through Anomaly Detection

By Mark Kasunic, Senior Member of the Technical Staff, Software Engineering Process Management Program

Organizations run on data. They use it to manage programs, select products to fund or develop, make decisions, and guide improvement. Data comes in many forms, both structured (tables of numbers and text) and unstructured (emails, images, sound, etc.). Data are generally considered high quality if they are fit for their intended uses in operations, decision making, and planning. This definition implies that data quality is both a subjective perception of the individuals involved with the data and the quality associated with objective measurements based on the data set in question. This post describes the work we’re doing with the Office of Acquisition, Technology and Logistics (AT&L), a division of the Department of Defense (DoD) that oversees acquisition programs and is charged with, among other things, ensuring that the data reported to Congress is reliable.

The problem with poor data quality is that it leads to poor decisions. This problem has been well documented by many researchers, notably by Larry English in his book Information Quality Applied. According to a report released by Gartner in 2009, the average organization loses $8.2 million annually because of poor data quality. The annual cost of poor data to U.S. industry has been estimated to be $600 billion. Research indicates the Pentagon has lost more than $13 billion due to poor data quality.

Data quality is a multi-dimensional concept, and the international standard data quality model (ISO/IEC 25012) identifies 15 data quality characteristics: accuracy, completeness, consistency, credibility, currentness, accessibility, compliance, confidentiality, efficiency, precision, traceability, understandability, availability, portability, and recoverability. In our data quality research, we have been focusing on the accuracy attribute of data quality.
Within the ISO model, accuracy is defined as the degree to which data has attributes that correctly represent the true value for the intended attribute of a concept. Ensuring data quality is a multi-faceted problem. Errors in data can be introduced in multiple ways. Sometimes it’s as simple as mistyping an entry, but more complex organizational factors also lead to data problems. Common data problems include misalignment of policies and procedures with how the data is collected and entered into databases, misinterpretation of data entry instructions, incomplete data entry, faulty processing of data, and errors introduced while migrating data from one database to another. A number of software applications have been introduced in recent years to address data quality issues. Gartner estimates that the number of software tools available for data quality grew by 26 percent since 2008. The bulk of these applications, however, focus on problems with customer-relationship management (CRM) data, materials data, and financial data (for example reconciling duplicate records and missing and inconsistent data). As part of our research, we are going beyond these basic types of data checks using statistical, quantitative methods to identify data anomalies that are not addressed by current off-the-shelf data quality software tools. While available data quality automated platforms address erroneous data, these applications are intended for customer relationship management, materials processing, and financial accounting. The types of data errors that they are intended to find and correct include missing data, incomplete data, character mismatches, and duplicate records. Examples of the data anomalies that our research is focused on exposing include cost estimates and performance values that are unusual when compared to the time series values that constitute the remainder of the data series. These unusual data values are considered outliers and tagged as anomalies. 
A data anomaly is not necessarily the same as a data defect. A data anomaly might be a data defect, but it might also be accurate data caused by unusual, but actual, behavior of an attribute in a specific context. Root cause analysis is typically required to determine the cause(s) of data anomalies. We are working with our DoD collaborators on the resolution process to determine if the anomalies detected are actual data defects. Our research is analyzing performance data submitted by DoD contractors in monthly reports about aspects of high-profile acquisition programs, including cost, schedule, and technical accomplishments on a project or task. Some methods that we are evaluating include
Dixon’s Test
Rosner’s Test
Grubbs’ Test
Regression analysis
Autoregressive integrated moving average (ARIMA)
Various statistical control charting applications (individuals, moving range, moving average, exponentially weighted moving average, etc.)
Various non-parametric approaches (kernel function based, histogram based)
Slippage Detection
These approaches to anomaly detection are being compared and contrasted to determine what specific methods work best for each earned value management (EVM) variable we are studying. Our data quality research complements our recent work on the Measurement and Analysis Infrastructure Diagnostic (MAID), which is an evaluation tool that helps organizations understand the weaknesses and strengths of their measurement systems. MAID is broader in scope than what is being addressed with our current research, recognizing that data is part of a life cycle that runs from sound definition and specification through collection, storage, analysis, packaging (for information purposes), and reporting (for decision-making). The integrity of data can be compromised at any of these stages unless policies, procedures, and other safeguards are in place. Our research thus far has found a number of methods that have been effective for identifying anomalies in the EVM data.
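As one illustration of the control-chart family listed above, here is a minimal Python sketch of an individuals and moving range (XmR) chart applied to a monthly series. It is a sketch under stated assumptions, not the method this research settled on: the function names and the monthly cost figures are hypothetical, and the constant 2.66 is the usual XmR scaling factor (3 divided by d2 = 1.128 for ranges of two consecutive points).

```python
from statistics import mean


def xmr_limits(values):
    """Return (center, lower, upper) limits for an individuals (X) chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    # 2.66 = 3 / d2, where d2 = 1.128 for ranges of size two.
    spread = 2.66 * mean(moving_ranges)
    return center, center - spread, center + spread


def flag_anomalies(values):
    """Return the indices of observations outside the chart limits."""
    _, lower, upper = xmr_limits(values)
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical monthly cost values reported for one program.
monthly_cost = [100, 102, 98, 101, 99, 103, 100, 160, 101, 99, 102, 100]
print(flag_anomalies(monthly_cost))  # [7]
```

A single extreme value inflates the average moving range and so widens the limits, which is one reason some practitioners substitute the median moving range. A flagged point is only an anomaly to investigate; whether it is a data defect still has to be settled by root cause analysis.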
Our work will culminate in a report that we plan to publish by the end of 2011. With support from AT&L, we’re hoping these methods will identify problems in the data they receive and report, ultimately leading to better decisions made by government officials and lawmakers.

Additional Resources

For more information about the SEI’s work in measurement and analysis, please visit www.sei.cmu.edu/measurement/
To read the SEI technical report, Issues and Opportunities for Improving the Quality and Use of Data in the Department of Defense, please visit www.sei.cmu.edu/library/abstracts/reports/11sr004.cfm
To read the SEI technical report, Can You Trust Your Data? Establishing the Need for a Measurement and Analysis Infrastructure Diagnostic, please visit www.sei.cmu.edu/library/abstracts/reports/08tn028.cfm
To read the SEI technical report, Measurement and Analysis Infrastructure Diagnostic, Version 1.0: Method Definition Document, please visit www.sei.cmu.edu/library/abstracts/reports/10tr035.cfm
To read the SEI technical report, Measurement and Analysis Infrastructure Diagnostic (MAID) Evaluation Criteria, Version 1.0, please visit www.sei.cmu.edu/library/abstracts/reports/09tr022.cfm

SEI . Blog .  Jul 27, 2015 02:59pm

Protecting Against Insider Threats with Enterprise Architecture Patterns

By Andrew P. Moore, Insider Threat Researcher, CERT

The 2011 CyberSecurity Watch survey revealed that 27 percent of cybersecurity attacks against organizations were caused by disgruntled, greedy, or subversive insiders (employees or contractors with access to that organization’s network systems or data). Of the 607 survey respondents, 43 percent view insider threat attacks as more costly and cited not only a financial loss but also damage to reputation, critical system disruption, and loss of confidential or proprietary information. For the Department of Defense (DoD) and industry, combating insider threat attacks is hard because insiders have authorized physical and logical access to organization systems and intimate knowledge of the organizations themselves. Unfortunately, current countermeasures to insider threat are largely reactive, resulting in information systems storing sensitive information with inadequate protection against the range of procedural and technical vulnerabilities commonly exploited by insiders. This posting describes the work of researchers at the CERT® Insider Threat Center to help protect next-generation DoD enterprise systems against insider threats by capturing, validating, and applying enterprise architectural patterns. Enterprise architectural patterns are organizational patterns that involve the full scope of enterprise architecture concerns, including people, processes, technology, and facilities. This broad scope is necessary because insiders have authorized access to systems, not only online access but physical access too. Our understanding of insider threat stems from a decade of experience cataloging more than 700 cases of malicious insider crime against information systems and assets, including over 120 cases of espionage involving classified national security information.
Our experience reveals that malicious insiders exploit vulnerabilities in business processes of victim organizations as often as they do detailed technical vulnerabilities. Likewise, our data analysis has identified well over 100 categories of weaknesses in enterprise architectures that allowed the insider attacks to occur. We have used this analysis to develop an insider threat vulnerability assessment method, based on qualitative models for insider IT sabotage and insider theft of intellectual property (IP) that characterize patterns of problematic behaviors seen in insider threat cases. We have also applied these models to identify insider threat best practices and technical insider threat controls. For example, an organization must deal with the risk that departing insiders might take valuable IP with them. One set of practices and controls that helps reduce the risk of insider theft of IP is based on case data showing that most insiders who stole IP did so within 30 days prior to their forced or voluntary termination. The pattern describing this set of practices and controls helps balance the costs of monitoring employee behavior for suspicious actions with the risk of losing the organization’s intellectual property. Organizations aware of this pattern can ensure that the necessary agreements are in place (IP ownership and consent to monitoring), critical IP is identified, key departing insiders are monitored, and the necessary communication among departments takes place. At the point at which an insider resigns or is fired, technical monitoring and scrutiny of that employee’s activities within a 30-day window of their termination date are increased. Actions taken upon and before employee termination are vital to ensuring IP is not compromised and the organization preserves its legal options. 
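The heightened-scrutiny window described above can be sketched in a few lines of Python. This is an illustration only: the function names, the activity log, and its entries are hypothetical, and the sketch assumes the 30-day window applies on both sides of the termination date, which the text does not specify.

```python
from datetime import date

WINDOW_DAYS = 30  # window length taken from the case data described above


def in_review_window(event_day, termination_day, window_days=WINDOW_DAYS):
    """Return True if event_day is within window_days of termination_day.

    Assumption: the window is symmetric (before and after termination).
    """
    return abs((event_day - termination_day).days) <= window_days


def events_to_review(events, termination_day):
    """Keep only the (day, description) events that fall inside the window."""
    return [e for e in events if in_review_window(e[0], termination_day)]


# Hypothetical activity log for one departing employee.
termination = date(2011, 6, 15)
log = [
    (date(2011, 4, 1), "routine file access"),
    (date(2011, 5, 20), "large download from design repository"),
    (date(2011, 6, 14), "email with attachments to personal account"),
]
for day, description in events_to_review(log, termination):
    print(day.isoformat(), description)
```

Restricting closer review to this window is what lets an organization balance monitoring cost against the risk of IP loss: only the two later entries in the hypothetical log are selected, and the earlier routine access is left alone.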
Capturing our understanding of insider threat mitigations as architectural patterns allows us to translate effective solutions into forms useful to engineers who design DoD systems. As part of our research, we are analyzing the subset of insider IT sabotage cases from the CERT insider threat database. We are updating and refining our existing qualitative insider IT sabotage model to include a quantitative simulation capability intended to exhibit the predominant patterns of insider IT sabotage behavior. We are using a system dynamics approach to model and analyze the holistic behavior of complex problems as they evolve over time. System dynamics modeling and simulation makes it easier for us to understand and communicate the nature of problematic insider threat behavior as an enterprise architectural concern. After validating that the simulated problem model accurately represents the historical behavior of the problem (and does so for the right reasons), the next step is to examine the enterprise-level architectural insider threat controls proposed to help mitigate it. Our research will focus on two aspects:
Are those controls effective against insider threats? For example, do the controls mitigate the problematic behavior exhibited in the simulation model?
Do those controls introduce negative unintended consequences? For example, even if the controls are effective against the threat, do they unintentionally undermine organizational trust and reduce team performance?
A key challenge in our research is the difficulty associated with testing these controls in an operational environment. One manifestation of this problem is in the form of unknown false positive rates associated with insider threat controls. From the perspective of technical observations and resource usage, most malicious insiders behave as their non-malicious counterparts do. We therefore expect that poorly designed controls will overwhelm operators with false positives.
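The false-positive concern is a base-rate effect, and a short worked example in Python makes the arithmetic concrete. All three input figures below are hypothetical, chosen only to show the shape of the problem; they are not drawn from the CERT case data, and the function name is an illustrative assumption.

```python
def alert_precision(base_rate, detection_rate, false_positive_rate):
    """Fraction of alerts that point at a truly malicious insider (Bayes' rule)."""
    true_alerts = base_rate * detection_rate
    false_alerts = (1.0 - base_rate) * false_positive_rate
    total = true_alerts + false_alerts
    if total == 0:
        raise ValueError("the control never raises an alert")
    return true_alerts / total


# Hypothetical figures: 1 in 1,000 users is malicious, the control catches
# 90 percent of them, and it wrongly flags 5 percent of everyone else.
precision = alert_precision(base_rate=0.001, detection_rate=0.90,
                            false_positive_rate=0.05)
print(f"{precision:.1%} of alerts are true positives")
```

Under these assumed numbers fewer than 2 in 100 alerts concern a real malicious insider, even though the control itself looks accurate. Because malicious insiders are rare, a control's false positive rate has to be known before its operational cost can be judged.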
Controls are also hard to test operationally because insider attacks occur relatively infrequently, but nevertheless result in huge damages for victim organizations. To meet these challenges, we are using system dynamics modeling and simulation to identify and test enterprise architectural patterns to protect against insider threat to current DoD systems. We are interviewing members of the DoD who have expressed interest in information security controls to mitigate the insider threat. These steps are enabling us to characterize the baseline enterprise architecture, which represents their operational architecture as a starting point for our analysis. Identified architectural patterns will be applied to modify the baseline architecture to better protect against insider threat. The basis for establishing the efficacy of the architectural patterns is system dynamics simulation-based testing. The experiments conducted in the simulation environment provide a body of evidence that supports strong hypotheses going into pilot testing within organizations. Enterprise architectural patterns developed through our research will enable coherent reasoning about how to design—and to a lesser extent implement—DoD enterprise systems to protect against insider threat. Instead of being faced with vague security requirements and inadequate security technologies, DoD system designers will have a coherent set of architectural patterns they can apply to develop effective strategies against insider threat in a more timely and confident manner. Confidence in these patterns will be enhanced through our use of established theories in related areas and the scientific approach of using system dynamics simulation models to test key hypotheses prior to pilot testing. We expect our research results will improve DoD enterprise, system, and software architectures to reduce the number and impact of insider attacks on DoD information assets. 
We will be periodically blogging about the progress of this work. Please feel free to leave your comments below and we will reply.

Additional Resources:

For more information about the work of the CERT Insider Threat Center, please visit www.cert.org/insider_threat/
To read a report about preliminary technical controls derived from insider threat data, Deriving Candidate Technical Controls and Indicators of Insider Attack from Socio-Technical Models and Data, please visit www.cert.org/archive/pdf/11tn003.pdf
To read a report about our insider threat modeling, A Preliminary Model of Insider Theft of Intellectual Property, please visit www.cert.org/archive/pdf/11tn013.pdf
To read the CERT Insider Threat blog, please visit www.cert.org/blogs/insider_threat/

SEI . Blog .  Jul 27, 2015 02:59pm

The Growing Importance of Sustaining Software for the DoD

Part 2: SEI R&D Activities Related to Sustaining Software for the DoD

By Douglas C. Schmidt, Deputy Director, Research, and Chief Technology Officer

Software sustainment is growing in importance as the inventory of DoD systems continues to age and greater emphasis is placed on efficiency and productivity in defense spending. In part 1 of this series, I summarized key software sustainment challenges facing the DoD. In this blog posting, I describe some of the R&D activities conducted by the SEI to address these challenges.

Primary Sustainment Activities

The term software sustainment is often used synonymously with software maintenance. Sustaining software for the DoD, however, requires attention to certain issues (such as operations and training) that are less essential in commercial software maintenance. There are four primary categories of software sustainment activities:
Corrective sustainment diagnoses and corrects software errors after release.
Perfective sustainment upgrades existing software to support new capabilities and functionality.
Adaptive sustainment modifies software to interface with changing environments.
Preventive sustainment modifies software to improve future maintainability or reliability.

SEI Sustainment R&D

The software engineering research community has devised various approaches to improve software sustainment. For example, tools for detecting software modularity violations help identify eroding design structure (referred to whimsically as "bad code smells") so the code can be refactored to enhance its sustainability. Likewise, intelligent automated regression testing frameworks help ensure that changes to legacy software work as required and that unchanged parts have not become less dependable. SEI sustainment strategies. Over the past several decades, the SEI has created methods and guidelines for sustaining, migrating, and evolving legacy systems.
For example, the SEI has devised strategies for modernizing legacy systems and reusing legacy components in service-oriented architecture (SOA)-based systems. These strategies employ risk-managed, incremental approaches that encompass changes in software technologies, engineering processes, and business practices. In addition, the SEI has created techniques for measuring the effectiveness of software sustainment practices. These techniques can be used to help decision-makers choose a course of continued sustainment, replacement, or selecting which redundant legacy systems to keep and which to retire. Software product lines. Legacy DoD systems comprise a wide range of software variations, such as network, hardware, and software configurations; different algorithms; and different security profiles. This variation is a key driver of total ownership costs because it impacts the time and effort required to assure, optimize, and manage system deployments and configurations throughout the lifecycle. To manage this variation effectively, the SEI helped pioneer software product lines (SPLs), which have been applied in DoD systems to manage software variation while reusing large amounts of code that implement common features needed within a particular domain. Software sustainment costs (particularly SPL testing) for an SPL-based family of systems can be reduced because reusable components in the SPL are maintained and validated in one place, instead of separately within each application. Team Software Process. The Team Software Process (TSP) is another approach pioneered by the SEI that managers and engineers can use to sustain legacy software projects. TSP is a team-centric, time-boxed approach to developing software. By using TSP, organizations can better plan, measure, and improve software development productivity so they have more confidence in sustainment quality and cost estimates. The U.S. 
Air Force and other DoD and industry organizations have applied TSP successfully to manage software sustainment in large-scale weapons systems. Software architecture. The SEI has also focused extensively on software architecture, which comprises the structure of the software elements in a system, the externally visible properties of those elements, and the relationships among them. SEI research has shown that a solid understanding of software architecture, along with the associated methods, infrastructure, and tools, is essential to modify and improve software-reliant systems correctly, dependably, rapidly, and cost effectively throughout the lifecycle. Likewise, successful sustainment of software-reliant DoD systems requires techniques and tools for evaluating and improving software engineer and manager competence with respect to software architecture, including the following:
Understanding, analyzing, and engineering tradeoffs among system properties (such as performance, dependability, and security) that are critical to achieving desired levels of quality in software-reliant systems as they evolve. These properties are quality attributes that determine system viability throughout the sustainment phase.
Using architecture-centric practices to elicit quality attribute requirements and to design and analyze changes that are needed throughout sustainment of systems at all scales. Architecture-centric practices can be used to plan system releases and address sustainment challenges pertaining to integration and operational problems due to inconsistencies between system and software architectures.
Applying architecture principles for systems-of-systems and ultra-large-scale systems to develop architecture design and analysis principles that help document and account for socio-technical interactions, decentralized control, and continuous evolution and sustainment environments where failures/changes are the norm. For example, some soldiers or support staff on the battlefield are capable of creating or modifying existing systems in response to needs that were not anticipated by the designers of the original systems. SEI assessments, workshops, and red teaming. The SEI regularly works with DoD programs to conduct independent technology assessments, reviews, and "red teams" that apply many of the methods and approaches described above to review the planning for—and conducting of—sustainment of DoD systems. For example, architecture practices such as the Architecture Tradeoff Analysis Method (ATAM) can help DoD programs elicit stakeholder input to identify likely long-term sources of change throughout the sustainment phase. The SEI’s experience helping DoD programs transition from the production phase of acquisition to the sustainment phase of acquisition indicates that the DoD often focuses on how its contracts and contractors will change rather than on how its program offices will need to change. The SEI helps acquisition programs plan for these transitions to sustainment and has collected lessons learned from these activities into software acquisition planning guidelines (including Guideline #4: Software Sustainment). An interesting trend is that DoD programs are increasingly interdependent and interoperable, leading to sustainment interdependencies that require new coordination. To address this need, the SEI developed interoperable acquisition workshops to bring program offices together and draft plans that address sustainment. Information assurance and software security. 
Increasing requirements for interdependence and interoperability also yield new challenges for information assurance and software security in legacy systems. In particular, many legacy systems were developed as isolated enclaves. With the advent of net-centric systems-of-systems, however, these legacy enclaves are being interconnected in ways that subject them to vulnerabilities not anticipated by their original designers. For example, legacy systems programmed in languages like C may be susceptible to buffer overflows that will not occur until they are connected to a network. Moreover, maintainers may not resolve these types of vulnerabilities correctly. They might, for instance, simply add input validation to eliminate a particular path to a buffer overflow vulnerability rather than remove the out-of-bounds write. The CERT Secure Coding Team works with developers and maintainers to eliminate these and other types of vulnerabilities by establishing secure coding standards and processes for conformance testing against these standards. Likewise, the CERT Vulnerability Analysis Team can use an analysis of vulnerabilities based on secure coding rule violations to help handle the response. Legacy software systems can also undergo conformance testing against a secure coding standard in the CERT Source Code Analysis Laboratory (SCALe) to detect and eliminate vulnerabilities before the software is deployed. SCALe has also been used by DoD program offices to assess the quality of legacy code to inform modernization versus replacement decisions.

Related SEI Blog Posts

SEI researchers have written several blog postings that are relevant to the sustainment of software-reliant DoD systems. For example, Rick Kazman’s posting on Measuring the Impact of Explicit Architecture Documentation focused on understanding the value of documenting software architectures for complex, software-reliant systems.
Thorough software architecture documentation helps engineers who sustain DoD software understand how they can refactor, maintain, and update the software without introducing new defects or degrading existing capabilities. Ipek Ozkaya’s posting on Enabling Agility by Strategically Managing Architectural Technical Debt examined how metrics extracted from the code and module structures of software can help repay technical debt, which is a conceptual framework for understanding how and when to defer design choices during the planning or execution of a software project. Repaying technical debt via refactoring and re-architecting is an effective strategy to alleviate architectural dependencies that impact system-wide architectural rework and minimize software decay during sustainment. Steve Rosemergy’s posting on A Framework for Evaluating Common Operating Environments described a framework for exploring the interdependencies among common language, business goals, and software architecture when evaluating the sustainability of proposed software solutions.

We Want to Hear Your Thoughts

This post has just scratched the surface of the solutions that meet the challenges of sustaining software-reliant DoD systems. While the SEI has expertise in methods and tools related to software sustainment, the DoD faces deeper and broader challenges than any one organization (or blog post) can address. We welcome your feedback in the comments section below on ways to improve the technologies and ecosystems needed to sustain DoD software effectively.

Additional Resources:

More information about sustaining software-reliant DoD systems is available below.
To read about software sustainment practices for the DoD, please visit www.stsc.hill.af.mil/resources/tech_docs/gsam4.html, especially chapter 16.
To read about the SEI’s work in software architecture, please visit www.sei.cmu.edu/architecture
To read about the SEI’s work with the Team Software Process (TSP), please visit www.sei.cmu.edu/tsp
To read about the SEI’s work in Software Product Lines, please visit www.sei.cmu.edu/productlines
To read about the SEI’s work in system of systems and SOA, please visit www.sei.cmu.edu/sos
To read about the SEI’s work on Ultra-Large-Scale Systems, please visit www.sei.cmu.edu/uls
To read about the SEI CERT’s work in secure coding, please visit www.cert.org/secure-coding/

SEI . Blog .  Jul 27, 2015 02:59pm

Common Infrastructure and Joint Programs, Fourth in a Series

By Bill Novak, Senior Member of the Technical Staff, SEI Acquisition Support Program, Air Force Team

This is the fourth in an ongoing series examining themes across acquisition programs. Background: Over the past decade, the U.S. Air Force has asked the SEI’s Acquisition Support Program (ASP) to conduct a number of Independent Technical Assessments (ITAs) on acquisition programs related to the development of IT systems; communications, command and control; avionics; and electronic warfare systems. This blog posting is the latest installment in a series that explores common themes across acquisition programs that we identified as a result of our ITA work. Previous themes explored in this series include Misaligned Incentives, The Need to Sell the Program, and The Evolution of "Science Projects." This post explores the fourth theme: common infrastructure and joint programs, which describes a key issue that arises when multiple organizations attempt to cooperate in the development of a single system, infrastructure, or capability that will be used and shared by all parties.

The Fourth Theme: Common Infrastructure and Joint Programs

This theme focuses on joint programs, which are popular for the potential they offer to reduce costs and improve interoperability. Joint programs are also recognized, however, as being hard to manage successfully for many reasons, including the number of stakeholders, the organizational size and complexity, differing organizational goals, interoperability challenges, geographical separation, coordination overhead, communication issues, and other factors. There are other types of programs that may not technically be joint programs, but which have similar characteristics. For example, a common infrastructure system, such as an enterprise-wide IT system, is similar to a joint program.
Both are often trying to replace a set of isolated, yet related, existing capabilities with a single new system that offers an integrated capability that is the union of the existing capabilities, and in the process both modernize the capability and make it more efficient to develop and maintain. To explore the issues of common infrastructure and joint programs more closely, consider a scenario that aggregates the experiences of some joint programs the SEI has worked with: A joint program office has several stakeholder programs that are planning to use the joint infrastructure software being developed, but each program demands that at least one major feature be added to the software just for them. The joint program manager agrees to the additional requirements, for fear of losing stakeholders (who could always build their own custom software). The additional design and coding changes that are needed significantly increase the total program cost, schedule, complexity, and risk. As the schedule now begins to slip, one program decides to leave the joint program and develop its own custom software instead. With one stakeholder gone, the amortized costs for the other programs increase further, and so another program leaves. As cost escalates, participation in the joint program begins to unravel and may ultimately collapse. Many problems we’ve seen in acquisition programs belong to a category known as "social dilemmas" where planned cooperation can turn into opposition. Garrett Hardin’s article "The Tragedy of the Commons" (1968) describes one of the most famous social dilemmas, and the scenario above is an example of it. The "Tragedy of the Commons" can be summed up simply: an individual desires an immediate benefit that will cost everyone else, and if all succumb to the same temptation, everyone is worse off.
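The unraveling in the scenario above can be illustrated with a toy model in Python. This is a deliberately simplified sketch, not the SEI's model of acquisition program behavior: it assumes a fixed total program cost split evenly among the remaining stakeholders, and it assumes each stakeholder leaves as soon as its share exceeds what it would pay to build custom software. The function name and all figures are hypothetical.

```python
def simulate_unraveling(total_cost, walk_away_costs):
    """Trace the per-stakeholder share as programs leave a joint effort.

    total_cost: cost of the joint program, split evenly among members.
    walk_away_costs: for each stakeholder, the cost of building its own
        custom software; it leaves once its share exceeds this amount.
    Returns (shares, remaining): the share computed in each round and the
    walk-away costs of the stakeholders still participating at the end.
    """
    remaining = sorted(walk_away_costs)
    shares = []
    while remaining:
        share = total_cost / len(remaining)
        shares.append(share)
        stay = [cost for cost in remaining if cost >= share]
        if len(stay) == len(remaining):
            break  # nobody leaves, so participation is stable
        remaining = stay
    return shares, remaining


# Hypothetical joint program whose cost has grown to 120 units.
shares, remaining = simulate_unraveling(120, [25, 35, 50, 70])
print(shares)     # [30.0, 40.0, 60.0, 120.0]
print(remaining)  # []
```

With these assumed numbers, each departure raises the share for those who remain until no one is left. Had the total cost stayed at 60 units, the first-round share of 15 would have been below every stakeholder's walk-away cost and the program would have held together.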
In the case of the joint program, the stakeholders each want custom features—but if they all demand them, it drives up cost, schedule, and risk, and everyone is worse off. Social dilemmas are inherently hard to fix, which is why they persist not only in acquisition, but also in aspects of public policy, economics, sociology, and many other areas. Nonetheless, researchers have identified a range of solutions and mitigations that can be applied. For example, one approach for resolving many instances of the "Tragedy of the Commons" dilemma is privatization, which removes the social aspect of a social dilemma by converting shared ownership (with diffused responsibility) into private ownership (with sole responsibility), so that each owner now has a strong incentive to properly care for what they own. Privatization, however, may defeat the intent of achieving the original objectives (in this case cost savings and interoperability) through cooperation. In the joint program scenario, for example, it would mean that each of the stakeholder programs would build their own custom system, which can be prohibitively costly and time consuming. An alternative solution might be "altruistic punishment," where cooperating participants can penalize uncooperative participants in some way, to encourage them to cooperate—even if the penalty costs the cooperators, and may produce no immediate direct gain for them. The cost of imposing the penalty prevents its overuse, making it self-correcting. Research by Fehr and Gachter has found that cooperation flourishes when altruistic punishment is present, and can break down if it is not. Altruistic punishment might incentivize stakeholder programs to stay with the joint program, despite the difficulties. If it were unsuitable in a given situation, such as a joint program, other solutions to the "Tragedy of the Commons" dilemma still exist, including assurance contracts, rewards and penalties, building trust, and exclusion mechanisms. 
Elinor Ostrom’s Nobel prize in Economics in 2009 acknowledged her extensive work on how people create successful institutions to manage common resources. The choice of the best solution will depend on the specific circumstances of the program. The SEI is exploring ways to model acquisition program behavior, such as the joint program scenario discussed above, to help analyze, predict, and ultimately manage the effects of various specific solution approaches on program outcome. As this work progresses, a key aspect will be how to best leverage this work in a form that's most helpful to the acquisition community. We know that acquisition leaders may be inexperienced with certain types of decision-making and may also be unfamiliar with some unique complexities of software-reliant acquisition programs—especially joint programs. Moreover, we know that conventional training may not be fully effective in preparing decision-makers for dealing with dynamically complex domains. What acquisition leaders need is experience in complex decision-making, such as they might develop over decades of experience with actual acquisition programs. To accelerate this learning process, we plan to create interactive experiential learning tools, which are essentially "flight simulators" for acquisition professionals that address these types of situations. These learning tools are key since actively learning through experience produces better understanding and superior retention of the knowledge. With such an approach, we believe it will be possible to improve the decision-making abilities of acquisition program staff, thereby achieving more successful program outcomes. Additional Resources: For more information, about the SEI's Acquisition Support Program, please visit www.sei.cmu.edu/acquisition.

SEI . Blog .  Jul 27, 2015 02:58pm

Using Machine Learning to Detect Malware Similarity

By Sagar Chaki, Senior Member of the Technical Staff, Research, Technology, and System Solutions

Malware, which is short for "malicious software," consists of programming aimed at disrupting or denying operation, gathering private information without consent, gaining unauthorized access to system resources, and other inappropriate behavior. Malware infestation is of increasing concern to government and commercial organizations. For example, according to the Global Threat Report from Cisco Security Intelligence Operations, there were 287,298 "unique malware encounters" in June 2011, double the number of incidents that occurred in March. To help mitigate the threat of malware, researchers at the SEI are investigating the origin of executable software binaries that often take the form of malware. This posting augments a previous posting describing our research on using classification (a form of machine learning) to detect "provenance similarities" in binaries, which means that they have been compiled from similar source code (e.g., differing by only minor revisions) and with similar compilers (e.g., different versions of Microsoft Visual C++ or different levels of optimization).

Evidence shows that a majority of malware instances originate from a small number of families. For example, a 2006 Microsoft Security Intelligence report revealed that the 25 most common families of malware account for more than 75 percent of the detected malware instances. Compounding this problem is the fact that the current cadre of malware analysis tools consists of either manual techniques (requiring extensive time and effort on the part of malware analysts) or automated techniques that are inaccurate (they produce high false-positive or false-negative rates) or inefficient.
In contrast, our approach involves (1) creating a training set using a sample of binaries, (2) using the training set to learn (or train) a classifier, and (3) using the classifier to predict the similarity of other binaries.

I, along with my colleagues Arie Gurfinkel, who works with me in the SEI’s Research, Technology, and System Solutions Program, and Cory Cohen, a malware analyst with CERT, felt that classification was appropriate for evaluating a binary similarity checker because this form of machine learning is particularly appropriate in instances where closed-form solutions are hard to develop, and a solver can be "trained" using a training set composed of positive and negative examples.

While malware classification is a major aim of provenance-similarity research, there are two main hurdles to applying classification directly to malware binary similarity checking:

(1) Classification must be applied to parts of the malware where similarity is expected to manifest most directly. For this research, we decided to apply classification to functions. Intuitively, a function is a fragment of a binary obtained by compiling a source-level procedure or method. Functions are the smallest externally identifiable units of behavior generated by compilers. Similarity at the function level is an indicator of overall similarity between two binaries. For example, malware samples that originated from the same family will rarely be identical everywhere. Instead, they will share important functions.

(2) It is hard to develop training sets from malware due to the lack of information on source code and generative compilers. Our research therefore focuses on evaluating open-source software. We believe that a classifier that effectively detects provenance-similarity in open-source functions will also be effective on malware functions because the variation we are targeting (due to changes in source code and compilers) is largely independent of the software itself.
For example, the variation introduced by a different compiler version (e.g., introducing stack canaries to detect buffer overflows at runtime) is the same, regardless of whether the source code being compiled is malware or open-source.

More specifically, we selected approximately a dozen C/C++ open-source projects from SourceForge.net and compiled them to binaries using Microsoft Visual C++ 2005, 2008, and 2010. We then extracted functions from the binaries using IDA Pro, which is a state-of-the-art disassembler, and constructed a training set and a testing set from the functions using a tool that we developed atop the ROSE compiler infrastructure. Next, we learned a classifier from the training set using the Weka framework.

When it comes to classification, two main decisions must be considered: (1) What classifier are you going to use? (2) What kind of attribute are you going to use?

We measured the effectiveness of a classifier in terms of two quantities: (1) its F-measure, which is a real number between 0 and 1 that indicates the overall classifier accuracy, and (2) the time required to train the classifier. There is a tradeoff between the two quantities: the F-measure can be increased by using a larger training set, but the training time also increases. We empirically found that the RandomForest classifier was the most effective Weka classifier for our purposes since it had the best F-measure for the same training time. We repeated the experiment several times with different randomly constructed training and testing sets. To determine the robustness of our results, we also repeated our experiments using a different set of open-source software and different versions of Microsoft Visual C++. The results were consistent in all cases, with the F-measures being around 0.95 for RandomForest. This finding is encouraging since it indicates that a provenance-similarity detector based on RandomForest will produce the correct result in roughly 95 percent of the cases.
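For readers unfamiliar with the F-measure, the sketch below shows how it is commonly computed from a classifier's predictions. It assumes the balanced (F1) form, the harmonic mean of precision and recall; the post does not say which variant was reported, and the function name `f_measure` and the sample labels are illustrative only.

```python
def f_measure(true_labels, predicted_labels, positive=True):
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: 'positive' marks a pair of functions judged similar.
    Returns 0.0 when there are no true positives.
    """
    tp = fp = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if guess == positive and truth == positive:
            tp += 1  # correctly flagged as similar
        elif guess == positive:
            fp += 1  # flagged as similar, but actually dissimilar
        elif truth == positive:
            fn += 1  # similar pair that the classifier missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical predictions for five function pairs.
truth = [True, True, True, False, False]
guess = [True, True, False, True, False]
print(f_measure(truth, guess))
```

Because F1 summarizes precision and recall rather than the raw fraction of correct predictions, it can differ somewhat from simple accuracy on the same test set.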
We believe that this accuracy is sufficient for use in practical malware analysis situations.

Next, we experimented with various parameters of RandomForest to observe how these parameters affect the tradeoff between its F-measure and its training time. In particular, we focused on two important parameters: the number of trees and the number of attributes. For each parameter, we experimented with different values and measured how the F-measure vs. training time tradeoff changed.

To further improve and evaluate our approach, we developed a suite of the following types of attributes: (1) semantic attributes, which capture the effect of a binary’s execution on specific components of the hardware state, such as registers and memory locations, and (2) syntactic attributes, which are derived from n-grams and n-perms and represent groups of instruction opcodes that occur contiguously in the function. We re-evaluated the effectiveness of the classifier using these two types of attributes and concluded that semantic attributes yield better F-measures, but are more expensive to compute than syntactic attributes. Attribute extraction is inherently parallelizable, however, since it is done independently for each function. A rough estimate is that a modern CPU can extract semantic attributes from about 10,000 functions in the CERT catalog every day. Based on this estimate, extracting attributes from malware samples as they are discovered each day is feasible with a modestly sized CPU farm.

We had several false steps along the way. For example, we originally used text files for all of our input and output, which was slow and unwieldy. We therefore decided to store inputs and outputs in a database, which simplified our tools and accelerated our experiments. Another lesson learned was to handle statistical issues and randomness carefully. Since the set of all possible training and testing samples is large, we had to pick random subsets for our experiments.
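To make the syntactic attributes described above more concrete, the following sketch counts n-grams and n-perms over a function's opcode sequence. It assumes the usual definitions: an n-gram is a run of n contiguous opcodes in order, and an n-perm is the same run with ordering ignored. The helper names and the sample opcodes are hypothetical and are not taken from our tools.

```python
from collections import Counter


def ngrams(opcodes, n):
    """Count each ordered run of n contiguous opcodes."""
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


def nperms(opcodes, n):
    """Count each run of n contiguous opcodes, ignoring their order.

    Sorting every window maps all orderings of the same opcodes to one key.
    """
    return Counter(
        tuple(sorted(opcodes[i:i + n])) for i in range(len(opcodes) - n + 1)
    )


# Hypothetical opcode sequence for one function.
ops = ["push", "mov", "sub", "mov", "push"]
print(ngrams(ops, 2))  # four distinct ordered pairs
print(nperms(ops, 2))  # two distinct unordered pairs, each seen twice
```

Ignoring order makes n-perms less sensitive to instruction reordering, for example by a different compiler version, at the cost of discarding some information.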
In some cases, we also had to label the samples in a random, yet deterministic, manner so that each sample had a randomly assigned label that stayed the same across all experiments. Constructing a labeling scheme that was both random and deterministic required extra care.

While determining the similarities between binary functions remains a challenge, the preliminary results from our research were presented in a well-received paper at the 2011 Knowledge Discovery and Data Mining (KDD) Conference. Our malware research has also studied fuzzy hashing and sparse representation. Our future research will explore other ways of detecting similarities between functions, including the use of static analysis.

Additional Resources:
For additional details, or to download benchmarks and tools that we have developed and are using as part of our project, please visit www.contrib.andrew.cmu.edu/~schaki/binsim/.
To listen to the CERT podcast, Building a Malware Analysis Capability, please visit www.cert.org/podcast/show/20110712gennari.html
To read other SEI blog posts relating to our malware research, please visit http://blog.sei.cmu.edu/archives.cfm/category/malware
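One common way to obtain the random, yet deterministic, labels mentioned in this post is to derive each label from a hash of a stable sample identifier. The sketch below illustrates that idea only; it is not necessarily the scheme we used, and the function name `stable_label`, the label values, and the salt are hypothetical.

```python
import hashlib


def stable_label(sample_id, labels=("A", "B"), salt="experiment-1"):
    """Assign a pseudo-random label that never changes for a given sample.

    The label depends only on the salt and the sample identifier, so it is
    the same in every run and on every machine. Changing the salt yields a
    different, equally repeatable, assignment.
    """
    digest = hashlib.sha256(f"{salt}:{sample_id}".encode("utf-8")).digest()
    return labels[int.from_bytes(digest[:8], "big") % len(labels)]


# The same (hypothetical) function identifier always receives the same label.
print(stable_label("libfoo.dll:sub_401000"))
print(stable_label("libfoo.dll:sub_401000"))
```

Unlike a seeded random number generator, this approach does not depend on the order in which samples are processed.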

SEI . Blog .  Jul 27, 2015 02:58pm

A Collaborative Method for Engineering Safety- and Security-Related Requirements

By Donald Firesmith, Senior Member of the Technical Staff, Acquisition Support Program

This blog post is the third and final installment in a series exploring the engineering of safety- and security-related requirements.

Background: In our research and acquisition work on commercial and Department of Defense (DoD) programs, we see many systems with critical safety and security ramifications. With such systems, safety and security engineering are used to manage the risks of accidents and attacks. Safety and security requirements should therefore be engineered to ensure that residual safety and security risks will be acceptable to system stakeholders. The first post in this series explored problems with quality requirements in general and safety and security requirements in particular. The second post took a deeper dive into key obstacles that acquisition and development organizations encounter concerning safety- and security-related requirements. This post introduces a collaborative method for engineering these requirements that overcomes the obstacles identified in earlier posts.

Anyone involved in building safety- and security-critical systems needs to consider the following questions:
Are you building a safety-critical system or one that must be secure from attack?
Do your safety and security engineers begin their work only after the architecture is engineered, rather than building safety and security in from the start via safety- and security-related requirements?
Do your safety and security engineers develop their work products (documents and models) independently of each other and of requirements engineers?
Do your requirements specifications largely ignore safety, security, or both?
Are many of your safety and security requirements so general that they are meaningless, such as "The system shall be safe and secure from attack"?
Are most of your safety- and security-related requirements merely architecture and design constraints that prevent safety and security engineers from collaborating with architects to create innovative solutions?
Is use-case modeling or structured analysis your primary or only requirements-analysis method, even when engineering safety- and security-related requirements?

If you answer yes to any of these questions, then your safety, security, and requirements engineers can benefit from a better way of engineering their requirements. To achieve this goal, an appropriate safety- and security-requirements analysis method is needed. We propose using the Engineering Safety- and Security-related Requirements (ESSR) method, which consists of the following analysis-based tasks.

Stakeholder analysis determines the stakeholders who have a vested interest in the safety and security of the system and the appropriate sources for eliciting safety and security goals and requirements. Safety and security engineers collaborate to identify the safety- and security-related stakeholders in the system and the assets that the system must defend from accidental and malicious harm. These stakeholders are modeled by producing stakeholder profiles and creating an initial partial list of the stakeholders’ safety and security goals.

Asset analysis determines the assets that must be protected from unauthorized harm and the harm that these assets must be protected from. Safety and security engineers collaborate to identify the assets that the system must protect from harm. They model each defended asset by categorizing it, determining its value, identifying the types and severities of harm that it may suffer, and determining its stakeholders.

Abuse analysis examines the ways that the system and the assets for which the system is responsible can be abused.
Specifically, this task identifies the different types of abuses, including safety mishaps (accidents and safety incidents) and security misuses (attacks and security incidents), that can occur. Abuse analysis also identifies which assets the abuses can harm, in what manner, and to what degree. Safety and security engineers model these abuses using appropriate techniques (e.g., abuse case modeling, attack trees) and create abuse profiles.

Vulnerability analysis determines the existence of the system-internal weaknesses or defects that can enable abuses (mishaps and misuses) to occur. Safety and security engineers identify the credible potential system-internal vulnerabilities (e.g., defects and weaknesses) that could enable the abuses that may harm the defended assets. They also model these vulnerabilities using appropriate techniques such as STAMP-Based Process Analysis (STPA), Event Tree Analysis (ETA), Fault Tree Analysis (FTA), or Failure Modes and Effects Analysis (FMEA).

Abuser analysis determines the system-external people and things that can accidentally or maliciously abuse the system and the assets that it must defend from unauthorized harm. Safety and security engineers identify the credible potential abusers that could exploit the vulnerabilities and thereby cause the abuses that may harm the defended assets. They model these abusers using appropriate techniques (e.g., STPA, abuse case modeling, task analysis, or user profiling).

Danger analysis determines the dangers (i.e., safety hazards and security threats), which are cohesive sets of conditions involving the existence of abusers, vulnerabilities, and assets that could increase the probability of abuses occurring. When restricted to safety or to security, danger analysis is often called hazard analysis or threat analysis, respectively, even though these analyses typically include all of the preceding types of analysis.
Safety and security engineers model these safety hazards and security threats using appropriate techniques (e.g., operator task analysis, ETA, FTA, and FMEA).

Risk analysis determines the maximum acceptable residual safety and security risks as well as the specific types of assets, harm, vulnerabilities, abusers, and dangers that are associated with these risks. Safety and security engineers model these risks using appropriate techniques (e.g., calculating risk level as the product of probability and harm severity, using degrees of software control instead of probabilities, and risk matrices).

Safety- and security-significance analysis identifies the goals and requirements that have safety and security ramifications so the corresponding parts of the system can be implemented using a process having the appropriate level of rigor and completeness, e.g., to justify the use of a more powerful (and therefore more expensive) development process. Safety and security engineers categorize requirements into safety/security assurance levels (SALs), such as safety-critical and security-critical, based on the degree to which the requirements have safety and security ramifications. They collaborate with requirements engineers to update the requirements repository by annotating requirements with their SALs. Based on how these categorized requirements are allocated to architectural components, they assign the components safety/security evidence assurance levels (SEALs) that determine the degree of completeness and rigor to be used when architecting, designing, implementing, integrating, and testing these components. Consequently, components with high SEALs should be as small as practical to minimize the increased effort, cost, and schedule needed to develop them. Finally, they update the certification repository with the results of safety- and security-significance analysis.
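Two of the risk-modeling techniques mentioned under risk analysis above, the probability-times-severity product and the risk matrix, can be sketched in a few lines. This is an illustration only: the category names, the ordinal scales, and the thresholds are hypothetical and are not prescribed by ESSR.

```python
# Hypothetical ordinal scales; real programs define their own categories.
SEVERITY = {"negligible": 1, "marginal": 2, "critical": 3, "catastrophic": 4}
LIKELIHOOD = {"improbable": 1, "remote": 2, "occasional": 3,
              "probable": 4, "frequent": 5}


def risk_score(probability, harm_severity):
    """Quantitative form: risk as the product of probability and severity."""
    return probability * harm_severity


def risk_level(likelihood, severity):
    """Qualitative form: look up a cell of a simple risk matrix.

    The thresholds (12 and 6) are assumptions chosen for illustration.
    """
    score = LIKELIHOOD[likelihood] * SEVERITY[severity]
    if score >= 12:
        return "high"
    if score >= 6:
        return "medium"
    return "low"


print(risk_score(0.25, 8))                   # numeric risk for one abuse
print(risk_level("frequent", "catastrophic"))
print(risk_level("remote", "critical"))
print(risk_level("improbable", "marginal"))
```

Comparing each computed level with the maximum acceptable residual risk indicates where additional requirements or defenses are needed.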
Defense determination identifies the appropriate defenses (i.e., controls including safeguards and security countermeasures) that are needed to defend the system and its associated defended assets from unauthorized harm. Safety and security engineers perform a gap analysis to identify potential new defenses. They then evaluate these potential defenses using appropriate techniques (e.g., engineering analyses, product and vendor trade studies).

Where appropriate (except for the safety- and security-significance analysis task), safety and security engineers create safety and security goals for each type of analysis and then collaborate with the requirements engineers to transform these goals into requirements to prevent, detect, and react to whatever that analysis identified (e.g., an abuse or a vulnerability). They then update the certification repository with the results of the analysis. Also, where appropriate, they collaborate with requirements engineers to transform these informal constraints into official requirements. Finally, where appropriate, this information is stored in the certification repository to eventually support the system’s safety and security accreditation and certification.

The above tasks result in the engineering of multiple types of associated safety and security requirements (e.g., prevention, detection, and reaction requirements as well as safety and security constraints). Rarely, however, are all such possible requirements appropriate for a given system. The harm severity and likelihood of the associated mishaps and misuses may not justify the cost of producing and using the resulting safety and security defenses. Some requirements make others unnecessary, e.g., a requirement preventing the existence of a vulnerability may eliminate the need for a requirement to prevent an abuse enabled by that vulnerability.
On the other hand, high-level requirements associated with the early analysis steps (e.g., prevent harm to a defended asset) may be used to derive lower-level requirements associated with later analysis steps (e.g., prevent a vulnerability that enables an abuse to harm the defended asset).

The tasks of ESSR described above are best performed in an evolutionary (i.e., incremental, iterative, and concurrent) manner. Due to the evolutionary nature of ESSR, the temporal ordering of the preceding sequence of analyses is merely a logical simplification to improve understandability; a waterfall approach to safety and security is neither intended nor recommended. Safety engineers, security engineers, and requirements engineers should also perform these tailorable tasks in a collaborative manner. At the end of this process, comprehensive safety and security analyses will have been performed and documented, safety and security goals will have been turned into their corresponding requirements, and the certification repository will contain the analysis- and requirements-related safety and security evidence needed for accreditation and certification.

The preceding ESSR method for collaboratively engineering safety- and security-related requirements is described in considerably more detail in tutorials, a class, and a book to be published early in 2012.

Additional Resources: For more information, please visit www.sei.cmu.edu/library/abstracts/presentations/icse-2010-tutorial-firesmith.cfm

SEI . Blog .  Jul 27, 2015 02:57pm

Developing Architecture-Centric Engineering Within TSP

By Felix Bachmann, Senior Member of the Technical Staff, Research, Technology, and System Solutions

Bursatec, the technology arm of Grupo Bolsa Mexicana de Valores (BMV, the Mexican Stock Exchange), recently embarked on a project to replace three existing trading engines with one system developed in house. Given the competitiveness of global financial markets and recent interest in Latin American economies, Bursatec needed a reliable and fast new system that could work ceaselessly throughout the day and handle sharp fluctuations in trading volume. To meet these demands, the SEI suggested combining elements of its Architecture Centric Engineering (ACE) method, which requires effective use of software architecture to guide system development, with its Team Software Process (TSP), which teaches software developers the skills they need to make and track plans and produce high-quality products. This posting, the first in a two-part series, describes the challenges Bursatec faced and outlines how working with the SEI and combining ACE with TSP helped them address those challenges.

Challenges

The team of Bursatec software architects faced a significant challenge in designing their new trading system: only one team member had significant experience in designing a financial software system. We felt the ACE methods would help the team better understand what software architecture means, particularly when thinking about abstractions and solving quality attribute problems. Another complicating factor was that Bursatec wanted to combine stock market trading with derivative market trading on the same platform to reduce operating costs and provide a single, high-throughput, low-latency, high-confidence interface to external financial markets.
Getting Started

One of our first steps was to conduct a Quality Attribute Workshop in which the Bursatec stakeholders defined the five most important quality attribute requirements (also known as quality attribute scenarios) their new trading system had to fulfill. To guide the system design, the Bursatec architecture team used the Attribute-Driven Design (ADD) method. ADD is a decomposition method based on transforming quality attribute scenarios into an appropriate design. Not surprisingly, given the importance of speed for the new system, the stakeholders identified runtime performance as one of the most important quality attribute scenarios. The performance quality attribute scenario, coupled with high availability requirements, led the team to realize that conventional approaches, such as a three-tier architecture, were not the best solution for their new system. Consequently, the architecture team spent the next two weeks exploring various solutions, as well as the potential negative outcomes of each proposed solution.

At the end of the two weeks, using rigorous Architecture Tradeoff Analysis Method (ATAM) techniques, the team of Bursatec architects had to present its findings, as well as evidence (including measures) that its chosen approach was correct, to an SEI software architecture coaching team that challenged each scenario. Every subsequent two weeks, the Bursatec software architects had to present solutions with appropriate evidence for the scenarios they had created. For example, with respect to the performance requirement, the team demonstrated how a stock order would traverse the system, estimating and measuring the timing required for every step. With each review, SEI coaches identified risks associated with a particular approach. For example, the team identified one risk with respect to performance: synchronizing with backup systems would throw off the timing. In all, there were three iterations of the architecture, each lasting six weeks.
At the start of the second iteration, the SEI software architecture coaches brought in the team of Bursatec developers to begin working on prototypes, specifically focusing on risks (such as the timing of querying complex data structures) that could not be addressed solely via software architecture. This important step allowed developers to deepen their understanding of the architecture and familiarize themselves with the problem, which was a lengthy process. The developers had six weeks to implement the prototypes; at the beginning of the third iteration the developers returned and presented their results to the architecture team. This process enabled the architects to finalize their architecture design using the results from the prototypes.

An interesting benefit of this style of architectural coaching was that the Bursatec architects used Enterprise Architect (a Unified Modeling Language-based tool) from the outset to document, evaluate, and justify their solutions. Although architecture documentation is often an afterthought, it became second nature to the Bursatec architects. The architects focused only on the documentation that was either needed to provide sufficient evidence that the system would support the quality attribute requirements or required by the developers to effectively implement prototypes and the subsequent system.

Improving Delivery with TSP

The Team Software Process (TSP) is a team-centric approach to developing software that enables organizations to better plan and measure their work, and improve software development productivity to gain greater confidence in quality and cost estimates. Our coaches emphasized the incorporation of TSP principles throughout the architecture design process with Bursatec. The use of TSP enabled the Bursatec architects to prepare, estimate, and track their work. In this case, the Bursatec architects were also able to time-box their iterations, an approach that the SEI finds effective.
These activities initially proved challenging because TSP is oriented more toward programming, so the measures employed by developers typically apply to lines of code, classes, requirements pages, or other tangible, implementation-oriented measures. To create a measurable work unit, the Bursatec architects used the quality attribute scenarios as a size measure. The SEI architecture team recognized that each quality attribute scenario would be refined into about five more detailed quality attribute scenarios that address special cases. The SEI team also recognized that the Bursatec architects would have to create at least three to five diagrams and descriptions to fulfill each scenario. The Bursatec team then estimated how long it would take to create each diagram with a description. We found that this measure proved a good tool for determining how long it would take to complete the architecture, which has been hard to estimate in our prior work with organizations. This approach to measuring and estimating work allowed the Bursatec architects to provide accurate estimates of deadlines to their management team.

Integrating the ACE architecture within the TSP management process gave the Bursatec architects an effective framework in which to work. While it did restrict some of their freedom, it also proved helpful. For example, the architects’ work was structured into iterations, each with different goals. The first iteration focused solely on discovering problematic areas of the system based on achievement of the necessary quality attribute scenarios. In subsequent iterations the architects gradually added details to the system design to include support of all quality attribute scenarios. This iterative method enabled them to create a software architecture organically, for the whole system, that was well understood, justified, and accepted by the team.
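The size-based estimate described above reduces to simple arithmetic, sketched below. The scenario counts come from this post (five top-level scenarios, about five refinements each, three to five diagrams per scenario); the reading that the diagram count applies to each detailed scenario, the function name, and the hours-per-diagram figure are assumptions made only for illustration.

```python
def estimate_architecture_effort(top_level_scenarios,
                                 refinements_per_scenario,
                                 diagrams_per_scenario,
                                 hours_per_diagram):
    """Return (low, high) effort in hours for documenting an architecture.

    diagrams_per_scenario is a (min, max) pair. The estimate assumes every
    detailed scenario needs its own diagrams with descriptions.
    """
    detailed_scenarios = top_level_scenarios * refinements_per_scenario
    fewest, most = diagrams_per_scenario
    low = detailed_scenarios * fewest * hours_per_diagram
    high = detailed_scenarios * most * hours_per_diagram
    return low, high


# 5 scenarios x 5 refinements = 25 detailed scenarios, 75 to 125 diagrams.
# Four hours per diagram is a hypothetical figure, not a Bursatec number.
print(estimate_architecture_effort(5, 5, (3, 5), 4))
```

Tracking actual hours per diagram against such an estimate is what lets a team recalibrate its plan from one iteration to the next.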
Building and Evaluating the System

Once the architecture was complete, the SEI architecture coaching team conducted an active design review in which the Bursatec architects communicated the entire architecture to the developers in a structured way. Next, conformance reviews were conducted during which the developers needed to provide evidence to the architects that the systems they were building conformed to the architecture. These reviews reinforced that the whole system would meet the needs of the stakeholders.

To date, the development of the new trading system for Bursatec has progressed on schedule and within budget. Moreover, early tests confirmed that the trading system performance far exceeded expectations. The combination of TSP and ACE proved an ideal approach for the development of the trading system. TSP brought discipline and measurement, while ACE provided a set of robust architectural techniques that focus on business goals and quality requirements. Both approaches together support the whole development lifecycle, emphasizing business and quality goals, engineering excellence, defined processes, process discipline, and teamwork.

This post is the first in a two-part series describing our recent engagement with BMV. The next post focuses on the TSP framework that provided planning, scheduling, estimation, and tracking in the project.

Additional Resources:
For more information about the SEI’s work in Architecture Centric Engineering (ACE), please visit www.sei.cmu.edu/about/organization/rtss/ace.cfm
For more information about the SEI’s work in the Team Software Process (TSP), please visit www.sei.cmu.edu/tsp/
To read the SEI technical report, Combining Architecture-Centric Engineering with the Team Software Process, please visit www.sei.cmu.edu/library/abstracts/reports/10tr031.cfm

SEI . Blog .  Jul 27, 2015 02:57pm

Using TSP to Architect a New Trading System

By James McHale, Senior Member of the Technical Staff, Software Engineering Process Management

This post is the second installment in a two-part series describing our recent engagement with Bursatec to create a reliable and fast new trading system for Grupo Bolsa Mexicana de Valores (BMV, the Mexican Stock Exchange). This project combined elements of the SEI’s Architecture Centric Engineering (ACE) method, which requires effective use of software architecture to guide system development, with its Team Software Process (TSP), which is a team-centric approach to developing software that enables organizations to better plan and measure their work and improve software development productivity to gain greater confidence in quality and cost estimates. The first post examined how ACE was applied within the context of TSP. This posting focuses on the development of the system architecture for Bursatec within the TSP framework.

Challenges

From a TSP perspective, the project faced several challenges. First, the few developers who had worked on the existing system had either moved into management or possessed technical skills that were out of date with modern development technologies. Second, the remaining developers, while competent, did not have experience in building the type of system that Bursatec needed. Another challenge was that several executives within the organization were in favor of outsourcing the work.

Our Approach

In the Bursatec project, we initially followed the standard TSP implementation approach, which emphasizes the importance of securing senior management commitment first. This commitment is typically established via a TSP Executive Strategy Seminar, which covers the key practices and principles of TSP from a senior management perspective.
Although Bursatec is a large organization, with several layers of checks and balances befitting a national stock market, the organization itself was very open, which allowed for streamlined communication between senior managers and the engineering team. In this open environment, the director at Bursatec, as well as his boss—who was president of the Mexican Stock Exchange—participated in the executive training. The executive training included an overview of rational management, the idea that management decisions should be made based on objective facts and data, and why this type of management is required to maintain successful TSP teams. We then trained the team leader of the project, as well as several other peers and senior developers at Bursatec in the basics of day-to-day management of TSP teams. We next trained the entire Bursatec development team—including the architects and team leader—in the fundamentals of the Personal Software Process (PSP), which teaches individual software engineers how to plan and manage high-quality software development work. The team would go on to apply the PSP concepts in a project, team-based environment. In an unusual development, the Bursatec director attended this class and also authored several programs using PSP methods, which he did as well as any of the developers. Having such a senior manager there—not just in the class but using the methods—sent the strong message that this was how "we" would be working going forward. After completing the PSP training, we conducted a Quality Attribute Workshop. This workshop is an architecture activity where Bursatec stakeholders defined the five most important quality attribute requirements (also known as quality attribute scenarios) that their new trading system had to fulfill. Not surprisingly, given the importance of speed for the new system, the stakeholders identified runtime performance as chief among the most important quality attribute scenarios. 
For the Bursatec developers, one benefit of defining quality attributes is that the practice placed significant emphasis on ensuring that the attributes be measurable. For example, the performance attribute was measured in two ways: the time for individual transactions (how fast each one was processed) and the throughput (how many transactions per second on an ongoing basis). In this context there is perfect harmony between what the ACE approach asks architects to do and what the TSP approach demands of developers. TSP teams receive fairly general direction for eliciting and capturing such quality attributes, the understanding of which often drives a project’s structure in addition to the structure of the developed product. With the Bursatec project, the ACE methods provided clear, specific direction on the early lifecycle issues that TSP normally leaves to local practice. Later in the project, TSP drove a disciplined implementation of the architecture that might otherwise have eluded developers.
The TSP Launch
Immediately after the conclusion of the Quality Attribute Workshop, we conducted the TSP launch, which is a series of nine meetings held during the course of four days in which the team reaches a common understanding of the work and the approach that it will take and produces a detailed plan to guide its work. The launch produced the necessary planning artifacts (such as goals, roles, estimates, task plan, milestones, quality plan, and risk mitigation plan) and brought together a team of 14 members, including the team leader. Our goal was to plan the architecture activities in the context of supporting Bursatec and their existing time and budget constraints.
During the launch, about half of the team focused on the architecture, including several people who were brought in as domain experts. These individuals were experts at interpreting the functional requirements and ensuring that the developers met them. For example, one individual had expertise in the Mexican Stock Exchange while another domain expert had extensive experience in the options and futures markets, specifically how those instruments are traded in Mexican markets. The other half of the team, seven developers, focused on two important needs for the system: high-speed communication and a testing framework. To successfully develop the system in the timeframe needed for Bursatec, it was critical that the system be tested automatically rather than manually. Testing of new functionality on the current system (including regression testing to ensure that no other aspects of the system are compromised) takes as much as a month and is performed manually. This burden motivated a quality attribute scenario for rapid testability of most new functions within a day, which leads naturally to an architecture that supports automated testing.
The Bursatec developers then implemented the system’s underlying infrastructure based on an early version of the system architecture, while the architects elaborated their work based in part on the early developer work that supported a decision to purchase a particular commercial package for high-speed communication. This version of the architecture was subject to an Architecture Tradeoff Analysis Method (ATAM) review that ensured the quality attribute scenarios captured in the Quality Attribute Workshop (QAW) were still the right ones, and that the proposed architecture addressed those scenarios. After the initial architecture iterations and the ATAM, the architects and other developers worked as a single, integrated team, removing the potential issues that sometimes arise when software architects throw their artifacts "over the wall" to developers. The architects dealt with issues and revised the architecture as necessary while shouldering a normal development workload.
The team named role managers (a TSP concept) to focus on issues surrounding performance and garbage collection, two implementation issues critical to the success of the new trading system.
Measurable Results
While TSP can be used to manage all aspects of the software development phase, from requirements elicitation to implementation and testing, this is the first time that the approach has been applied to ACE technologies. The combination of these approaches offered Bursatec architects and developers a disciplined method for developing the software for their new trading engine. Through 6 major development cycles including 14 or so iterations over 21 months, the overall team developed over 200,000 lines of code, spending about 12 percent of their effort after the Quality Attribute Workshop on architecture and approximately 14.5 percent of effort in unit testing, performance testing, and integration testing. In contrast, the SEI would normally expect almost twice as much testing effort at this point in development, with potentially much more in system testing to push the overall total close to or beyond the 50-percent mark, an unfortunately realistic expectation in our industry. As of October 2011, system testing at Bursatec proceeds on schedule with a very low defect count (unusual in our experience), and the system is on target for deployment beginning in early 2012. Due to the early investment in architecture and a detailed, data-driven approach to managing both their schedule and their quality, less testing was required throughout system development.
Another benefit of combining TSP with ACE is that the team of Bursatec developers was prepared for inevitable changes in the architecture requirements, and indeed for changes of any sort over the 21 months of development. When the team received new requirements, it could evaluate them quickly for technical impact and implementation cost in terms of time and effort.
With the quality attributes formally captured, the architecture in place, and detailed development plans at every step, a project with enormous risk potential in both technical and business terms ran on time, within budget, and generally without the drama that large development efforts often exhibit.
Additional Resources:
To read the SEI technical report, Team Software Process (TSP) Body of Knowledge (BOK), please visit www.sei.cmu.edu/library/abstracts/reports/10tr020.cfm
For more information about the SEI’s work in Architecture Centric Engineering (ACE), please visit www.sei.cmu.edu/about/organization/rtss/ace.cfm
For more information about the SEI’s work in the Team Software Process (TSP), please visit www.sei.cmu.edu/tsp/
To read the SEI technical report, Combining Architecture-Centric Engineering with the Team Software Process, please visit www.sei.cmu.edu/library/abstracts/reports/10tr031.cfm

SEI . Blog .  Jul 27, 2015 02:57pm

Measures for Managing Operational Resilience

By Julia Allen, Principal Researcher, CERT Program
The SEI has devoted extensive time and effort to defining meaningful metrics and measures for software quality, software security, information security, and continuity of operations. The ability of organizations to measure and track the impact of changes, as well as changes in trends over time, is an important tool for effectively managing operational resilience, which is the measure of an organization’s ability to perform its mission in the presence of operational stress and disruption. For any organization, whether Department of Defense (DoD), federal civilian agencies, or industry, the ability to protect and sustain essential assets and services is critical and can help ensure a return to normalcy when the disruption or stress is eliminated. This blog posting describes our research to help organizational leaders manage critical services in the presence of disruption by presenting objectives and strategic measures for operational resilience, as well as tools to help them select and define those measures.
In April 2011, the DoD identified the engineering of resilient systems as a top strategic priority in helping to protect against the malicious compromise of weapons systems and to develop agile manufacturing for trusted and assured defense systems. SEI CERT has been exploring the topic of managing operational resilience at the organizational level for the past seven years through development and use of the CERT Resilience Management Model (CERT-RMM), a capability model designed to establish the convergence of operational risk and resilience management activities and apply a capability level scale that expresses increasing levels of process performance.
CERT-RMM measures the ability of an organization to protect and sustain high-value services (which are organizational activities carried out in the performance of a duty or production of a product) and high-value assets (which are items of value to the organization, such as people, information, technology, and facilities that high-value services rely on). Resilient systems, as identified by the DoD, are one category of technology asset.
Our research on resilience measurement and analysis focuses on addressing the following questions, which are often asked by organizational leaders: How resilient is my organization? Have our processes made us more resilient? What should be measured to determine if performance objectives for operational resilience are being achieved? To establish a basis for measuring operational resilience, we relied on the CERT-RMM as the process-based framework against which to measure. CERT-RMM comprises 26 process areas (such as Incident Management and Control (IMC) and Asset Definition and Management (ADM)) that provide a framework of goals and practices at four increasing levels of capability (Incomplete, Performed, Managed, and Defined).
Our initial work provided organizational leaders with tools to determine and express their desired level of operational resilience. Specifically, we defined high-level objectives for an operational resilience management program, for example, "in the face of realized risk, the program ensures the continuity of essential operations of high-value services and their associated assets." We then demonstrated how to derive meaningful measures from those objectives using a condensed Goal Question (Indicator) Metric method, for example, determining the probability of delivering service through a disruptive event. We also defined a template for defining resilience measures and presented example measures using the template.
Too often, organizations collect "type count" measurements (such as numbers of incidents, systems with patches installed, or people trained) with little meaningful context on how these measures can help inform decisions and affect behavior. Based on the Goal Question (Indicator) Metric method outlined above, we identified strategic measures that help organizational leaders determine which process-level measures best address their needs. What follows is a description of five organizational objectives for managing operational resilience and 10 strategic measures for an operational resilience management (ORM) program. The ORM program defines an organization’s strategic resilience objectives (such as ensuring continuity of critical services in the presence of a disruptive event) and resilience activities (such as the development and testing of service continuity plans). We use an example of acquiring managed security services from an external provider to show how each measure could be used. Managed security services may include network boundary protection (such as firewalls and intrusion detection systems), security monitoring, incident management (such as forensic analysis and response), vulnerability assessment, penetration testing, and content monitoring and filtering.
Organizational objective 1: The ORM program derives its authority from, and directly traces it to, organizational drivers (which are strategic business objectives and critical success factors), as indicated by the following measures:
Measure 1: Percentage of resilience activities that do not directly (or indirectly) support one or more organizational drivers.
Example use: External security services replace comparable in-house services with a lower cost (less effort) and more effective (less impact from incidents) solution. After external security services are operational, 75 percent of in-house efforts no longer support organizational drivers. This measure can be used to ensure an effective transition of designated in-house services to externally-provided services and to retrain/reassign staff currently performing such services.
Measure 2: For each resilience activity, the number of organizational drivers that require it to be satisfied (the goal is equal to or greater than 1).
Example use: An example of a resilience activity is formalizing a relationship with a security services provider using a contract or service level agreement (SLA) that includes all resilience specifications. There is at least one organizational driver that calls for having security services in place to achieve the driver. This driver likely maps to a personal objective of the chief information officer or chief security officer. If there is no such traceability, one or more drivers may require updating.
Organizational objective 2: The ORM program satisfies resilience requirements that are assigned to high-value services and their associated assets, as indicated by the following measures:
Measure 3: Percentage of high-value services that do not satisfy their assigned resilience requirements.
Example use: Resilience requirements for security services are specified in the SLA. Provider performance is periodically reviewed to ensure that all services are meeting the SLA requirements (for example, high priority alerts from incident detection systems are resolved within xx minutes). Optimally, this percentage should be zero. If it is greater than an SLA-stated threshold (for example, 20 percent for service A), corrective action is taken and confirmed.
Measure 4: Percentage of high-value assets that do not satisfy their assigned resilience requirements.
Example use: This example is similar to the one above. The incident database is a high-value asset that is required to provide incident response services. The SLA specifies resilience requirements for this database, including daily automated backups and quarterly and event-driven (backup server upgrade and high-impact security incident) testing to ensure the provider’s ability to successfully restore from backups. Optimally, this percentage should be zero. If it is greater than an SLA-stated threshold (for example, 20 percent for asset B), corrective action is taken and confirmed.
Organizational objective 3: The ORM program, via the internal control system, ensures that controls for protecting and maintaining high-value services and their associated assets operate as intended, as indicated by the following measures:
Measure 5: Percentage of high-value services with controls that are ineffective or inadequate.
Example use: The SLA identifies the controls (policies, procedures, standards, guidelines, tools, etc.) that are required by a service. These controls can be tailored versions of the controls that the organization uses or can be negotiated based on the provider’s standard suite of controls. Provider implementation of these controls is periodically reviewed (audited, assessed, scans performed, etc.). Optimally, this percentage should be zero. If it is greater than an SLA-stated threshold (for example, 20 percent for service A), corrective action is taken and confirmed.
Measure 6: Percentage of high-value assets with controls that are ineffective or inadequate.
Example use: This measure is as described above, with asset controls stated in the provider SLA.
Organizational objective 4: The ORM program manages operational risks to high-value assets that could adversely affect the operation and delivery of high-value services, as indicated by the following measures:
Measure 7: Confidence factor that risks from all sources that require identification have been identified.
Example use: Major sources of risk are initially identified in the provider SLA and as part of an ongoing review based on changes in the operational environment within which services are provided. The elements that contribute to "confidence factor" (such as risk thresholds by service) are also identified. Confidence factor is represented as a Kiviat diagram showing plan versus actual for all sources. Analysis of provider gaps is reviewed on a periodic basis and corrective action is taken and confirmed to reduce unacceptable gaps.
Measure 8: Percentage of risks with impact above threshold.
Example use: Assessment of provider risk is performed on a periodic basis as specified in the SLA. Optimally, this percentage should be zero. If it is greater than an SLA-stated threshold (for example, 20 percent for risk type A), corrective action is taken and confirmed.
Organizational objective 5: The ORM program ensures the continuity of essential operations of high-value services and their associated assets in the face of realized risk, as indicated by the following measures:
Measure 9: Probability of delivered service through a disruptive event.
Example use: The SLA states service-specific availability and service levels to meet, both steady state and in degraded mode. Provider performance is periodically reviewed, including during and after a disruptive event (power outage, cyber attack, etc.). Probability of delivered service is determined and evaluated as a trend over time. Corrective action is taken and confirmed as required.
Measure 10: For disrupted, high-value services with a service continuity plan, percentage of services that did not deliver service as intended throughout the disruptive event.
Example use: The SLA includes requirements for service-specific continuity (SC) plans. For provider services with SC plans that do not maintain required service availability and service levels, corrective actions are taken and confirmed, including updates to SC plans. In addition, the customer uses this as an opportunity to review and update its own SC plans that depend on provider services, where service was not delivered as intended.
All these strategic measures derive from lower-level measures at the CERT-RMM process area level, including average incident cost by root cause type and number of breaches of confidentiality and privacy of customer information assets resulting from violations of provider access control policies.
To help organizational leaders determine what measures work best for their organization, we are collaborating with members of the CERT-RMM Users Group, which includes the United States Postal Inspection Service, Discover Financial Services, Lockheed Martin, and Carnegie Mellon University. Through a series of two-day workshops, members define an improvement objective, assess their current level of operational resilience against that objective, identify areas of improvement, and implement improvement plans using the CERT-RMM processes and candidate measures as the guide. Please contact us if you are interested in joining a CERT-RMM Users Group.
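Several of the measures above share one shape: count the items that fall short, express that count as a percentage, and compare it with an SLA-stated threshold. The following Python sketch illustrates that arithmetic for Measure 1 and for Measures 3 and 4. It is only an illustration under stated assumptions: the record layout, the function names, and all sample data are hypothetical and do not come from CERT-RMM or from any real SLA.

```python
from dataclasses import dataclass


@dataclass
class ReviewedItem:
    """One high-value service or asset reviewed against its SLA (hypothetical record)."""
    name: str
    requirements_assigned: int
    requirements_met: int

    def satisfies_requirements(self) -> bool:
        # An item passes only if every assigned resilience requirement is met.
        return self.requirements_met >= self.requirements_assigned


def percent_not_satisfying(items):
    """Measures 3 and 4: percentage of items that miss their assigned requirements."""
    if not items:
        return 0.0
    failing = sum(1 for item in items if not item.satisfies_requirements())
    return 100.0 * failing / len(items)


def needs_corrective_action(percent, sla_threshold_percent):
    """Corrective action is called for when the percentage exceeds the threshold."""
    return percent > sla_threshold_percent


def percent_without_driver(activity_to_drivers):
    """Measure 1: percentage of activities that support no organizational driver."""
    if not activity_to_drivers:
        return 0.0
    orphaned = sum(1 for drivers in activity_to_drivers.values() if not drivers)
    return 100.0 * orphaned / len(activity_to_drivers)


# Invented sample data, for illustration only.
services = [
    ReviewedItem("security monitoring", 4, 4),
    ReviewedItem("incident management", 5, 3),
    ReviewedItem("vulnerability assessment", 3, 3),
    ReviewedItem("penetration testing", 2, 1),
    ReviewedItem("content filtering", 2, 2),
]
activities = {
    "provider SLA in place": ["protect customer data"],
    "in-house log review": [],
    "in-house firewall upkeep": [],
    "in-house IDS tuning": [],
}

service_percent = percent_not_satisfying(services)
print(service_percent, needs_corrective_action(service_percent, 20.0))
print(percent_without_driver(activities))
```

The optimal value in each case is zero; the threshold only decides when corrective action is triggered, so it is passed in as a parameter rather than fixed in the code.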
Additional Resources:
To read the SEI technical note, Measuring Operational Resilience Using the CERT Resilience Management Model, please visit www.sei.cmu.edu/reports/10tn030.pdf
To read the SEI technical report, Measures for Managing Operational Resilience, please visit www.sei.cmu.edu/library/abstracts/reports/11tr019.cfm
For more information about the CERT Resilience Management Model (CERT-RMM), please visit www.cert.org/resilience/rmm.html
To read an article about how the CERT Resilience Management Model helps companies predict performance under stress, please visit page 8 of the SEI 25th Anniversary Year in Review, www.sei.cmu.edu/library/assets/annualreports/2010_Year_in_Review.pdf
To read an article about CERT work in Resilience Measurement, please visit page 4 of the SEI 25th Anniversary Year in Review, www.sei.cmu.edu/library/assets/annualreports/2010_Year_in_Review.pdf

SEI . Blog .  Jul 27, 2015 02:57pm

Fuzzy Hashing Against Different Types of Malware

By David French, CERT Senior Researcher
Malware, which is short for "malicious software," is a growing problem for government and commercial organizations since it disrupts or denies important operations, gathers private information without consent, gains unauthorized access to system resources, and exhibits other inappropriate behaviors. A previous blog post described the use of "fuzzy hashing" to determine whether two files suspected of being malware are similar, which helps analysts potentially save time by identifying opportunities to leverage previous analysis of malware when confronted with a new attack. This posting continues our coverage of fuzzy hashing by discussing types of malware against which similarity measures of any kind (including fuzzy hashing) may be applied.
Fuzzy hashes provide a continuous stream of hash values for a rolling window over the malware binary, thereby allowing analysts to assign a percentage score that indicates the degree of similarity between two malware programs. When considering how fuzzy hashing works against malware, it is useful first to consider why malware programs would ever be similar to each other. For the purposes of this discussion we focus on prevalent Microsoft Portable Executable (PE) formatted files, although this description can be generalized to any executable code stored in any format. We further consider similarity as a measure of file structure, rather than program behavior, since fuzzy hashing generally applies to the bytes comprising a file, rather than an observation of the semantics of a program in some other space.
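The rolling-window idea described above can be illustrated with a deliberately simplified Python sketch. It is not the ssdeep algorithm: the window size, the trigger rule, the use of SHA-256 for each piece, and the use of `difflib.SequenceMatcher` as the comparison step are all assumptions chosen for brevity. The point is only that piece boundaries depend on a small window of nearby bytes, so a localized change disturbs a few signature characters rather than the whole signature.

```python
import hashlib
import random
from difflib import SequenceMatcher

# 64 printable characters, one per possible piece value (assumed alphabet).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def piecewise_signature(data: bytes, window: int = 7, modulus: int = 64) -> str:
    """Emit one character per piece; pieces end where the rolling value triggers."""
    signature = []
    piece_start = 0
    for i in range(len(data)):
        # Rolling value: sum of the last `window` bytes ending at position i.
        rolling = sum(data[max(0, i - window + 1): i + 1])
        if rolling % modulus == modulus - 1:
            digest = hashlib.sha256(data[piece_start: i + 1]).digest()
            signature.append(ALPHABET[digest[0] % 64])
            piece_start = i + 1
    if piece_start < len(data):
        # Hash whatever is left after the final trigger point.
        digest = hashlib.sha256(data[piece_start:]).digest()
        signature.append(ALPHABET[digest[0] % 64])
    return "".join(signature)


def similarity(a: bytes, b: bytes) -> int:
    """Score from 0 to 100 based on how much of the two signatures lines up."""
    matcher = SequenceMatcher(None, piecewise_signature(a), piecewise_signature(b))
    return round(matcher.ratio() * 100)


# Synthetic stand-in for a binary: 4 KB of pseudo-random bytes (not real malware).
rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(4096))

# Flip 16 bytes in the middle to mimic a small, localized change in the file.
patched = bytearray(original)
for offset in range(2000, 2016):
    patched[offset] ^= 0xFF
patched = bytes(patched)

print(similarity(original, original))
print(similarity(original, patched))
```

Because the trigger looks only at the most recent bytes, the piece boundaries resynchronize shortly after the modified region, which is why byte-level similarity survives small in-place edits but can collapse when a rebuild shifts or rewrites large parts of the file.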
Malware is software combining three elements: (1) code, whether compiled source code written in a high-level language or hand-crafted assembly, (2) data, which is some set of numerical, textual, or other types of discrete values intended to drive the logic of the code in specific ways, and (3) process, which is loosely a set of operations (for example, compiling and linking) applied to the code and data that ultimately produce an executable sequence of bytes in a particular format, subject to specific operating constraints. Given a distinct set of code, data, and consistent processes applied thereto, it is reasonable to conclude that—barring changes to any of these—we will produce an identical executable file every time we apply the process to the code and data (where identity is measured using a cryptographic hash, such as MD5). We now consider how the permutation of any of these components will affect the resulting executable file. First, let us consider the effect of modifying the data used to drive a particular executable. With respect to malicious software, such data may include remote access information (such as IP addresses, hostnames, usernames and passwords, commands, etc.), installation and configuration information (such as registry keys, temporary filenames, mutexes, etc.), or any other values which cause the malware to execute in specific ways. Generally speaking, changing the values of these data may cause different behavior in the malware at runtime but should have little impact on the structure of the malware. Malware authors may modify their source code to use different data values for each new program instance or may construct their program to access these data values outside the context of the compiled program (for example, by embedding the data within or at the end of the PE file). 
In the case of malicious code, data may also include bytes whose presence does not alter the behavior of the code in any way, and whose purpose is to confuse analysis. Regardless, the expected changes to the resulting executable file are directly proportional to the amount of data changed. Since we only changed the values of data, not the way in which they are referenced (in particular, we have not changed the code), we can expect that the structure of the output file is modified only to support any different storage requirements for the new data.
Similarly, let us consider the effect of modifying the code found in a particular executable. The code defines the essential logic of the malware and describes the behavior of the program under specified conditions. To modify program behavior, the code must generally be modified. The expected changes to the resulting executable file are proportional to the amount of code changed, much as we expect when changing data. However, code (especially compiled code) differs from data in that the representation of the code in its final form is often drastically different from its original form. Compiling and linking source code represents a semantic transformation, with the resulting product intended for consumption by a processor, not a human reader. To accomplish semantic transformation most effectively, the compiler and linker may perform all manner of permutations, such as rewriting blocks of code to execute more efficiently, reordering code in memory to take up less space, and even removing code that is not referenced within the original source. If we assume that the process to create the executable remains constant (for example, that optimization settings are not changed between compilations), we must still allow that minor changes in the original source code may cause unpredictably large changes in the resulting executable.
As a consequence, code changes are more likely to produce executables with larger structural differences between revisions than executables where only data changes. Thus, we have described two general cases in which structurally different files (measured by cryptographic hashing, such as MD5) may be produced from a common source. We refer to malware families whose primary or sole permutation is in their data as generative malware, and use the analogy of a malware factory cranking out different MD5s by modifying data bytes in some way. We refer to malware families whose primary permutation is in their code as evolutionary malware, in that the behavior of the program evolves over time. When considering the effects of similarity measurements such as fuzzy hashing, we may expect that fuzzy hashing will perform differently against these different general types of malware.
As an example of using fuzzy hashing against generative malware, consider the malware family BackDoor-DUG.a, also known as Trojan.Scraze by ClamAV and W32/ScreenBlaze.A2 by F-Prot (ClamAV and F-Prot are antivirus vendors and it’s important to note that the same family is known by several different names). The two files referenced from the McAfee site are Delphi programs, comprising 4,185 functions at distinct addresses as observed by disassembling each program using IDA-Pro v6.1. If we consider each function as a sequence of bytes and consider the cryptographic hashes of each function’s bytes using a technique called function hashing, then we observe that these programs have approximately 3,321 unique functions each, per their position independent code (PIC) function hashes. Of these 3,321 functions distinct to each program, we observe that 3,292 of these functions are shared (meaning their bytes are exactly the same) between the programs, and that each program has 29 functions not shared with the other program.
Inspecting each of the 29 functions in each of the two files (for a total of 58 functions) in IDA-Pro, we discover that for all 29 pairs of functions found at the same address across the two files, the functions differ only by large blocks of seemingly non-executed data, which the code bytes jump around. Otherwise, code bytes for each of the 29 function pairs at corresponding addresses are identical. In this way, we can observe that the two programs are materially identical, except for seemingly non-executed bytes, which we generically call data. By performing ssdeep comparison of these two files we produce the following fuzzy hashes and their associated comparison score:
12288:gp/iN/mlVdtvrYeyZJf7kPK+iqBZn+D73iKHeGspOdqcXigCcCmua1xIam:gpQ/6trYlvYPK+lqD73TeGspOQKUmxpm,"70212f8f88865f4f9bb919383aabc029.ex_"
12288:gp/iN/mlVdtvrYeyZJf7kPK+iqBZn+D73iKHeGsptx6KrPSTKQGLG4a4:gpQ/6trYlvYPK+lqD73TeGspqnKx64,"6f83ac65223e2ac7837bfe3068da411c.ex_"
70212f8f88865f4f9bb919383aabc029.ex_ matches 6f83ac65223e2ac7837bfe3068da411c.ex_ (85)
Matching these files using ssdeep corroborates our function-level analysis, in that they are highly similar. These two files thus provide a good example of generative malware.
When considering how code changes can affect fuzzy hashing, we consider briefly non-malicious software for which we have full source code. The Nullsoft Scriptable Installation System (NSIS) is an open-source installation system used to create Windows-based installation programs. Although NSIS is not malicious software, it can be used to install many different types of programs on Windows computers, including malicious and non-malicious programs alike. The project page for NSIS provides several revisions; we examined the two most recent versions, 2.45 (MD5 sum af193ccc547ca83a63eedf6a2d9d644d) and 2.46 (MD5 sum 0e5d08a1daa8d7a49c12ca7d14178830), for which Windows binaries are available.
The two files comprise 6,038 and 6,040 functions at distinct addresses, respectively, with 2,564 unique functions each (as measured by their PIC function hashes). These two programs have 2,544 identical functions, with 20 different functions each. The differing functions have changes that range from identical functions using different constants to entirely new functions with no overlapping behavior. Regardless, the vast majority of the behavior of these two programs is identical. We perform ssdeep comparison of these two files, and produce the following fuzzy hashes and their associated comparison score:
12288:p24n/P3WRlauwYyPd7K67jBOs/skXMujtiEs6vHG9Uu94yGjbgWsvvs0V:k4n3GRMuwYyV26XDRiE6qu+yJWsXsa,"nsis-2.45/NSIS.exe"
12288:lWe4uCFAtIma4w3PE6EPYL/t+32gNjw6ps6cg1eHgfKkx71DS0V:Ie4ugwIma4O86YnE6pxKgCg71Sa,"nsis-2.46/NSIS.exe"
nsis-2.45/NSIS.exe matches nsis-2.46/NSIS.exe (0)
As seen from the score of zero from ssdeep, fuzzy hashing does not detect any relationship between these two files even though function analysis revealed that the majority of the behavior of these two files is the same. The function-level finding is borne out by the release notes for V2.46 on the NSIS website, which document relatively minor changes. Although the evolution of these two programs is relatively minor in terms of the absolute number of changes to functionality, their structure is clearly different enough that fuzzy hashing such as ssdeep was completely unable to detect similarity.
This highlights the challenging problem of similarity measurements in malicious code, and underscores the need to understand the underlying reasons that similarity would ever be apparent to any particular technique. Future blog entries will consider alternate fuzzy hashing approaches and tools, and discuss some of the challenges of performing fuzzy hashing at scale.
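The function-level comparison used in both examples above amounts to set arithmetic over per-function hashes. The Python sketch below shows that bookkeeping under simplifying assumptions: each function is given as a plain byte string, SHA-256 stands in for whatever cryptographic hash is used, and the position independent code (PIC) normalization step is omitted entirely. The helper names and the toy byte strings are hypothetical.

```python
import hashlib


def function_hashes(function_bodies):
    """Hash each function's bytes; duplicate bodies collapse into one set entry."""
    return {hashlib.sha256(body).hexdigest() for body in function_bodies}


def compare_programs(functions_a, functions_b):
    """Count unique, shared, and unshared function hashes for two programs."""
    hashes_a = function_hashes(functions_a)
    hashes_b = function_hashes(functions_b)
    return {
        "unique_a": len(hashes_a),
        "unique_b": len(hashes_b),
        "shared": len(hashes_a & hashes_b),
        "only_a": len(hashes_a - hashes_b),
        "only_b": len(hashes_b - hashes_a),
    }


def shared_percent(shared, unique):
    """Share of a program's unique functions that also appear in the other program."""
    return 100.0 * shared / unique


# Toy byte strings standing in for disassembled functions (illustrative only).
program_a = [b"\x55\x8b\xec", b"\x90\x90", b"\xc3", b"\xc3"]
program_b = [b"\x55\x8b\xec", b"\xc3", b"\xcc"]
print(compare_programs(program_a, program_b))

# The counts reported above: 3,292 of 3,321 shared for the BackDoor-DUG.a pair,
# and 2,544 of 2,564 shared for the two NSIS versions.
print(round(shared_percent(3292, 3321), 1))
print(round(shared_percent(2544, 2564), 1))
```

By this measure both pairs share roughly 99 percent of their unique functions, yet ssdeep scored them 85 and 0. The gap reflects the distinction drawn in this post: the generative pair differs only in data bytes, while the evolutionary pair was rebuilt from changed code.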
This post is the second in a series exploring David's research in fuzzy hashing. The first post in the series is Fuzzy Hashing Techniques in Applied Malware Analysis.
Additional Resources:
More information about CERT research in malicious code and development is available in the 2010 CERT Research Report, which may be viewed online at www.cert.org/research/2010research-report.pdf

SEI . Blog .  Jul 27, 2015 02:56pm
