By Anne Connell, Design Team Lead, CERT Cyber Security Solutions Directorate
This blog post was co-authored by Barbora Batokova and Todd Waits.
The source of a recent Target security breach that allowed intruders to gain access to more than 40 million credit and debit cards of customers between Nov. 27 and Dec. 14, 2013, has been traced to a heating, ventilation, and air conditioning (HVAC) service sub-contractor in Sharpsburg, Pa., just outside of Pittsburgh, according to a Feb. 5 post on a Wall Street Journal blog. The post stated that the intruders were able to gain access to Target’s system after stealing login credentials from one of Target’s HVAC subcontractors, who had been given remote access. This breach demonstrates how any vulnerability in a critical information system can be exploited to disrupt or harm the normal operation of any commercial or industrial sector. In this blog post, we will present a tool we have developed that increases a security incident responder’s ability to assess risk and identify the appropriate incident response plan for critical information systems.
We define a critical information system as a computer-controlled information system that manages the operation and essential assets of any commercial or industrial sector, including
energy delivery
backup generators
water
sewer systems
airports
railway
public transportation
oil and natural gas
emergency medical services
information technology (IT) systems
business management systems
A compromise of the confidentiality, integrity, or availability (the CIA triad of information assurance) of any of these critical systems and their assets typically leads to the loss of:
financial stability
revenue
stakeholder trust
customer confidence
competitive advantage
key technologies
property
life
As the Target breach illustrates, organizations face large-scale distributed attacks on a regular basis. Another example of a similar breach is the compromise of networks of the three top medical device makers, according to an article published Feb. 10 in the San Francisco Chronicle.
A report prepared by the Minority Staff of the Homeland Security and Governmental Affairs Committee presented in February 2014 stated "in the past few years, we have seen significant breaches in cybersecurity which could affect critical U.S. infrastructure… Nuclear plants’ confidential cybersecurity plans have been left unprotected. Blueprints for the technology undergirding the New York Stock Exchange were exposed to hackers."
As a result, organizations and law enforcement officials are increasingly interested in investing in tools that will allow them to assess the security of their information systems, and identify and address any vulnerabilities. Often, the approach to investigating and assessing critical information systems involves interviewing the individuals who own and maintain such systems and recording the details in handwritten notes that are later transcribed into a word processing program.
Due to the lack of a systematic process and standardized terminology, however, this approach is neither effective nor sustainable. It prevents people from accessing historical data from past assessments and leveraging the experience of previous assessors. Moreover, it does not afford real-time assessment and hinders people’s ability to compare data and identify patterns. Finally, effective collaboration on assessments is nearly impossible, as it is cumbersome to share and expand upon the collected information.
As we outlined in a paper presented at IEEE’s Technologies for Homeland Security Conference in November 2013, The CERT Assessment Tool: Increasing a Security Incident Responder’s Ability to Assess Risk, organizations and law enforcement officials tasked with assessing and/or responding to large-scale attacks on critical information systems typically face three challenges:
It is hard for organizations and security incident responders to establish adequate levels of trust. Organizations are reluctant to communicate and share information, even though an effective response to such attacks requires them to share information including networks, diagrams, and logs.
Even after establishing adequate levels of trust, it is extremely hard to manage the sheer number of tasks and processes in the assessment and response plan. This plan includes identifying the vulnerability or a threat, determining if adequate controls are in place, mitigating the vulnerability if necessary, sharing the response plan, participating in collaborative decision making, and sharing information and analysis across the team of incident responders.
Site administrators who own the critical information systems are at the core of the assessment and response plan and thus must be involved for the entire duration of the assessment and/or the event. This proves challenging because site administrators may need to modify a system that they depend on, so any solution must involve them in the process.
Foundations of Our Approach
In 2010, at the request of a federal government agency, we began work on development of a customer-specific version of a tool and accompanying training component that would enable that agency to conduct critical information systems assessments.
After we deployed the tool to the agency, we received positive feedback that it was an operational solution with a built-in training path for critical information systems assessment. We received additional feature requests for the tool, including scaling the methodology to other domains within the organization. This feedback led to the development of the CERT Assessment Tool prototype, which provides a framework for assessing and addressing risk in critical information systems in various domains. The prototype, which is described in our paper, demonstrates that multiple people can actively collaborate, investigate, plan, and respond to ensure the confidentiality, integrity, and availability of their critical information systems and the assets they manage.
Our approach, which can be tailored to any commercial or industrial domain, is based on established methodologies and principles for risk management and information assurance, including NIST’s Recommended Security Controls for Federal Information Systems and Organizations (800-53) and Risk Management Guide for Information Technology Systems (800-30). The CERT Assessment Tool’s hierarchical data model addresses the identified need to standardize the terminology and taxonomy used during assessments. Having this data hierarchy and structure enables the application to create a systematic approach to assessment.
Data Model
Below we have included an image of a data model, which shows the hierarchy and relational qualities that each assessment has to present and future objects in the system. Each assessment can have multiple sites. For each site there are multiple systems, and for each system there are at least six system attributes collected to describe the platform, configuration, vulnerabilities, controls, mitigations, and responsible contact.
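The hierarchy described above can be sketched as a few nested record types. This is a minimal illustration in Python, not the tool's actual schema: the class and field names (`Assessment`, `Site`, `SystemRecord`) and the sample values are assumptions chosen to mirror the six attributes named in the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SystemRecord:
    """One assessed system and the six attribute groups named in the text."""
    name: str
    platform: str = ""
    configuration: List[str] = field(default_factory=list)
    vulnerabilities: List[str] = field(default_factory=list)
    controls: List[str] = field(default_factory=list)
    mitigations: List[str] = field(default_factory=list)
    responsible_contact: str = ""


@dataclass
class Site:
    """A location that hosts one or more systems."""
    name: str
    systems: List[SystemRecord] = field(default_factory=list)


@dataclass
class Assessment:
    """Top of the hierarchy: an assessment spans one or more sites."""
    title: str
    sites: List[Site] = field(default_factory=list)

    def all_vulnerabilities(self) -> List[str]:
        """Flatten vulnerabilities across every system at every site."""
        return [v for site in self.sites
                for system in site.systems
                for v in system.vulnerabilities]


# Hypothetical sample data for illustration only.
hvac = SystemRecord(name="HVAC", platform="Windows 7",
                    vulnerabilities=["Remote Access"])
assessment = Assessment("Example assessment",
                        sites=[Site("Site 1", systems=[hvac])])
print(assessment.all_vulnerabilities())
```

Keeping the hierarchy explicit in the types is what makes it possible to query across past assessments, which handwritten notes do not allow.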
The CERT Assessment Tool incorporates machine learning and decision support systems to provide actionable information and guidance that is repeatable. The tool and its systems reduce the reliance on experts, making knowledge more accessible and available when it is needed. The application utilizes Drools, an open-source business rule management system, to enable users to create domain-specific rules.
Our approach defines a concrete process of information collection and assessment, guiding the user through the various stages of assessment up to and including the implementation of the security plan. The approach also incorporates a role-based model that defines the entities involved in the assessment and their responsibilities.
How the CERT Assessment Tool Works
The prototype of the CERT Assessment Tool currently runs on any Windows computer; however, the prototype can be architected for mobile and tablet deployment as well, which will provide flexibility to users and afford a nearly real-time assessment on site.
The application enables users to record the interview with a site or system administrator and tag the information about the system to create object metadata and establish a standardized assessment taxonomy.
The image above illustrates areas where the tool segments the functionality for the assessment. The left-hand side of the image lists the systems and all of the tagged information. The center section is the free-form text area to support unrestricted information collection and tagging. The right panel provides real-time guidance questions based on the user’s inputs and expert system push notifications should a system be tagged with a known vulnerability. For a proper critical information system (e.g., HVAC, energy delivery, elevators) assessment, the user is required to enter and tag the following information:
System Name
Characteristics - e.g. type of connection, updating/patching process, model number, etc.
Contacts - e.g. any person associated with the system
Vulnerabilities - any flaws or weaknesses in the system security
Controls - any safeguards or measures implemented to minimize or eliminate the identified vulnerabilities
Mitigations - any risk-reducing controls recommended from the risk assessment
Once the information is tagged, the system creates push notifications that help the user ask additional interview questions, identify vulnerabilities, or recommend possible mitigations. One of the sources the application taps for these recommendations is the National Vulnerability Database (NVD). For example, if one of the tagged system characteristics is Windows 7, the application will search the NVD and display the common vulnerabilities for Windows 7 to provide the user with additional context for the assessment.
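The lookup step can be sketched as follows. This sketch does not query the National Vulnerability Database: `LOCAL_CATALOG` is a hypothetical stand-in dictionary with placeholder entries, and `notifications_for` is an illustrative name, not part of the tool.

```python
from typing import Dict, List

# Hypothetical stand-in for a vulnerability source.
# The entries are placeholders, not real advisories.
LOCAL_CATALOG: Dict[str, List[str]] = {
    "windows 7": ["placeholder advisory A", "placeholder advisory B"],
    "internet-connected": ["placeholder advisory C"],
}


def notifications_for(tags: List[str]) -> List[str]:
    """Return one notification per catalog entry matching a tagged characteristic."""
    notes = []
    for tag in tags:
        # Normalize the tag so "Windows 7" and "windows 7" match the same key.
        for advisory in LOCAL_CATALOG.get(tag.strip().lower(), []):
            notes.append(f"{tag}: {advisory}")
    return notes


print(notifications_for(["Windows 7", "Vendor-Updated"]))
```

Tags with no catalog entry, such as `Vendor-Updated` here, simply produce no notification.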
Once enough information is tagged and the system reaches a certain threshold of available information from past assessments, it is possible to create automatic rules using the Drools engine to provide specific expert advice about system vulnerabilities, existing controls, and mitigations.
If we apply this approach and methodology to the HVAC system that was compromised during the Target breach, the initial (very simplified) assessment taxonomy would look like this:
System Name: HVAC
Characteristics:
Vendor-Updated
Windows 7
Internet-Connected
Contacts: Fazio Mechanical Services - Vendor
Vulnerabilities:
Remote Access
Active Directory Credentials
Controls: Password Policy
Based on this information, the CERT Assessment Tool with its guidance questions and decision support system would be able to recommend or alert the user that a remote access capability with active directory credentials is a vulnerability that needs to be properly mitigated to prevent a security breach.
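A rule of the kind described above could look like the following sketch. It is written in plain Python rather than Drools, and the rule condition, function name, and message text are assumptions made for illustration. The tool's real rules are not shown in this post.

```python
from typing import List

# Simplified taxonomy from the HVAC example above.
hvac = {
    "name": "HVAC",
    "characteristics": ["Vendor-Updated", "Windows 7", "Internet-Connected"],
    "contacts": ["Fazio Mechanical Services - Vendor"],
    "vulnerabilities": ["Remote Access", "Active Directory Credentials"],
    "controls": ["Password Policy"],
    "mitigations": [],
}


def remote_access_rule(system: dict) -> List[str]:
    """Hypothetical rule: alert when remote access is combined with
    Active Directory credentials and no mitigation has been recorded."""
    vulns = set(system.get("vulnerabilities", []))
    risky = {"Remote Access", "Active Directory Credentials"} <= vulns
    if risky and not system.get("mitigations"):
        return [f"{system['name']}: remote access with Active Directory "
                "credentials is unmitigated"]
    return []


print(remote_access_rule(hvac))
```

Once a mitigation is recorded for the system, this particular rule stops firing; a fuller rule set would check whether the mitigation actually addresses the tagged vulnerability.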
The CERT Assessment Tool can be used for assessments of a wide variety of information systems, which may or may not be deemed critical by the organization:
Industrial Control Systems (ICS) that manage any kind of industrial production, including the following:
SCADA (Supervisory Control and Data Acquisition) computer information systems monitor and control industrial, infrastructure, or facility-based processes, such as HVAC and energy distribution, in commercial buildings, airports, and hospitals.
DCS (Distributed Control System) information systems control a manufacturing system, process, or any kind of dynamic system in which the controller elements are not centralized in one location but are distributed throughout the system with each component sub-system controlled by one or more controllers. Examples of such systems are electrical power grids, water management systems, and traffic signals.
Distributed File Systems (DFS) that allow multiple hosts to access shared files over a computer network, such as the NFS from Sun Microsystems and DFS from Microsoft
Network Information Systems (NIS) that manage networks such as gas supply or telecommunications
Looking Ahead
Given the increasing complexity and frequency of attacks on critical information systems, organizations and security incident responders need a tool that will allow them to effectively collaborate on assessments and security planning.
The CERT Assessment Tool allows users to take a more pro-active role in risk management and gain an enhanced situational awareness of all systems on site. The application is the first line of defense in making users aware of the security of their information systems.
Additional Resources
To read the paper, The CERT Assessment Tool: Increasing a Security Incident Responder’s Ability to Assess Risk, by Anne Connell and Todd Waits, please visit http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=06699006.
SEI Blog, Jul 27, 2015 02:13pm
By David Mundie, Senior Member of the Technical Staff, CSIRT Development Team
Social engineering involves the manipulation of individuals to get them to unwittingly perform actions that cause harm or increase the probability of causing future harm, which we call "unintentional insider threat." This blog post highlights recent research that aims to add to the body of knowledge about the factors that lead to unintentional insider threat (UIT) and about how organizations in industry and government can protect themselves. This research is part of an ongoing body of work on social engineering and UIT conducted by the CERT Insider Threat Center at the Carnegie Mellon University Software Engineering Institute.
UIT is becoming increasingly common. For example, about a year ago, spear phishers from China infiltrated the New York Times website in hopes of gaining access to names and sources that Times reporters had used in a story. A year earlier, Google pulled more than 22 malicious Android apps from the market after they were found to be infected with malware. This year, security blogger Brian Krebs reported that "The breach at Target Corp. that exposed credit card and personal data on more than 110 million consumers appears to have begun with a malware-laced email phishing attack sent to employees at an HVAC firm that did business with the nationwide retailer, according to sources close to the investigation." The Target breach spear phishing attack is an example of social engineering and illustrates how UIT can cause harm to an organization.
Foundations of Our Work
Insider threat remains a major concern among computer and organizational security professionals, more than 40 percent of whom report that their greatest concern is employees accidentally jeopardizing security through data leaks or similar errors. This finding led to our initial research into the field of UIT and the publication of the report, Unintentional Insider Threats: A Foundational Study. In that report, which seeks to understand causes and contributing factors in UITs, we developed the following operational definition:
An unintentional insider threat is (1) a current or former employee, contractor, or business partner (2) who has or had authorized access to an organization’s network, system, or data and who, (3) through action or inaction without malicious intent, (4) unwittingly causes harm or substantially increases the probability of future serious harm to the confidentiality, integrity, or availability of the organization’s information or information systems.
As the examples above illustrate, the impact of UIT can be devastating, even though it is typically the result of actions taken by a non-malicious insider. Our initial work in this field led us to conduct a second phase of research that took a deeper dive into social engineering, specifically the psychological aspects of social engineering exploits.
While technical solutions may be useful on the edges, at its core UIT is a human problem that requires human solutions. Unfortunately, organizations are often loath to report insider incidents out of fear that the news could damage their reputation or value. A very limited amount of information is publicly available through lawsuit records. We also examined news articles, journal publications, and other sources, including blogs, to compile information and identify contributing factors to UIT and social engineering.
Through our analysis, we have compiled information on 28 cases that is now housed in our UIT social engineering database.
Contributing Factors in Social Engineering Vulnerability
In the course of our research, we identified several factors that made individuals more susceptible to attack. Although our sample did not allow us to draw any conclusions on demographic factors, such as gender or age, we were able to identify several organizational and human factors. The organizational factors that we identified in our report are as follows:
Security systems, policies, and practices. Many of the cases that we examined provided insight into organizational policies and procedures. Some cases indicated that the victims violated those policies, but most incident summaries do not provide sufficient information to determine whether those factors are involved.
Management and management systems. Many of the cases reveal that simple authentication credentials provide attackers with access to internal emails, company data, and entire computer networks. In one case that we examined, an attacker gained direct network access from a username-password combination and did not need to place malware or execute any other indirect attack to cause damage. Organizations must regularly perform extensive security audits to determine how best to improve internal controls; they cannot rely on security established during initial installation of a system.
Job pressure. Certain industries, such as news services, place a premium on obtaining and distributing information as quickly as possible. Employees in these types of organizations may be more susceptible to outside influence from social engineering due to this pressure.
The human factors that we identified are as follows:
Attention. In at least one of the cases we examined, we identified fatigue as a contributing factor. In that case, a phishing message was received late at night, and the individual responded before taking the time to analyze the message. The attacker may have information about work hours that could be used as part of an organized attack.
Knowledge and memory. Several cases that we examined indicated that even when employees have been trained, a large percentage will still respond to phishing attacks. It is therefore important that organizations offer constant refreshers or other means to maintain employee knowledge and keep it fresh in their minds.
Reasoning and judgment. In some cases, an employee’s safeguards were lowered, perhaps in response to the realistic nature of a phishing message and/or the pretext created through reverse social engineering (e.g., offers to organizations or employees to assist in preventing or addressing outside attacks, in solving bank account problems, or in supporting system operations).
Stress and anxiety. In one case, the victim knew that the organization and its customers were receiving phishing emails. This knowledge may have increased his desire to accept an offer of mitigation that appeared to be legitimate, but in actuality was just another phishing attack.
I would like to stress that we are not breaking new ground with this publication. Our intent was to add meaningful input to the ongoing discussion on how social engineering relates to the body of research on insider threat and what organizations, specifically federal agencies, can do to mitigate contributing factors. Social engineering is a key component of UIT in that many non-malicious insiders are susceptible to social engineering, and thus become a threat to their organizations.
An example of the impact of social engineering is the "Robin Sage" case where a cyber security analyst and "white hat hacker" contacted security specialists, military personnel, staff at intelligence agencies and defense contractors through bogus accounts that had been established on social networking sites such as Facebook, Twitter, and LinkedIn. The recipients of these communications ended up exposing far more information than their organization or its business partners would have wanted released in the public domain. Other examples similar to this have been made public since the "Robin Sage" study.
Best Practices for Organizations
As we stated in our report, organizations face many challenges in countering UIT social engineering threats, including balancing operational goals with security goals to remain competitive. To stay ahead, or at least keep up with phishers and spear phishers, we suggest the following best practices based on our analysis:
Training. Organizations must continue to develop and deploy effective training and awareness programs so that staff members are aware of social engineering scams and can identify deceptive practices and phishing cues. Training plans should also teach effective coping and incident management behaviors to respond to social engineering.
Minimize stress. When employees are stressed and working fast, they tend to be more susceptible to social engineering attempts. Organizational leaders need to examine whether they are creating a stressful environment or one that fosters a natural workflow. For example, one aspect of a plan to minimize stress could involve allocating time for employees to fulfill information security compliance requirements.
Encourage employees to monitor and limit information posted on networking sites. For example, LinkedIn members often post details about their career history, including past cities where they have lived and worked. Phishers and spear phishers often contact individuals based on the information posted on such sites. They advertise false jobs and ask recipients to send a writing sample, building a sense of trust.
A person seeking a job or a networking opportunity should be trained to avoid posting unnecessary details on social network sites. Moreover, job seekers should not operate in a vacuum. In particular, they should seek the input of a co-worker or friend to review an email inquiry to assess whether it appears legitimate.
One technique for detecting unintended disclosure of information on social networking sites is to put a piece of false information on each social media site the individual uses. For example, a user could list an alternate city or alternate dates of employment on separate sites, so that a social engineering attempt based on information from that site can be detected easily. If someone contacts the individual referencing the false information, the individual would know that this is a social engineering attempt, rather than a legitimate contact.
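The planted-detail technique can be sketched in a few lines. The site names and planted values below are invented for illustration, and `likely_sources` is a hypothetical helper, not something described in the report.

```python
from typing import Dict, List

# Hypothetical planted details: each site carries a different false value.
PLANTED: Dict[str, str] = {
    "SiteA": "Cleveland",   # false city listed only on SiteA
    "SiteB": "2009-2011",   # false employment dates listed only on SiteB
}


def likely_sources(message: str) -> List[str]:
    """Return the sites whose planted detail appears in an incoming message."""
    text = message.lower()
    return [site for site, value in PLANTED.items() if value.lower() in text]


msg = "I saw you worked in Cleveland and have a role you might like."
print(likely_sources(msg))
```

A non-empty result suggests both that the contact is a social engineering attempt and which profile the sender harvested the detail from.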
Many of the best practices listed above are similar to those that our team recommends for intentional insider threat. These include training to heighten awareness and reduce human error, management practices to reduce the likelihood of human error, e-mail safeguards that include anti-phishing and anti-malware, antivirus protection, data encryption on storage devices, password protection, wireless and Bluetooth safeguards, remote memory wipe for lost equipment, and attention to what is posted on social media sites. While not all best practices listed above have been validated in our report, they are strategies that we have found to be successful.
Looking Ahead
Our research on UIT to date has been sponsored by the Department of Homeland Security. In the next phase of our work, we plan to examine UIT in the context of the 14 sectors of the economy identified by the DHS. For example, we will examine if phishing attacks differ based on the sector of the economy where they are executed.
One challenge that we continue to face is the lack of verifiable information regarding social engineering and UIT. It would be ideal if we could set up an information sharing system where organizations could share information about unintentional insider threats without feeling as if their security or reputation were being compromised.
As we stated earlier, socially engineered attacks that result in UIT are very much a human problem. While technical solutions may be useful, further research is needed to identify and mitigate the organizational and human factors of UIT social engineering. We welcome your feedback on our work. Please leave feedback in the comments section below.
If you have experienced a UIT, please let CERT know (also by leaving feedback in the comments section). We are looking to increase the number of cases in our database, and greatly appreciate any help we receive. All your information will be kept strictly confidential.
Additional Resources
To read the SEI technical report, Unintentional Insider Threats: Social Engineering, please visit http://resources.sei.cmu.edu/library/asset-view.cfm?assetID=77455
To read the SEI technical report, Unintentional Insider Threats: A Foundational Study, please visit http://resources.sei.cmu.edu/library/asset-view.cfm?assetID=58744
By Julien Delange, Member of the Technical Staff, Software Solutions Division
The Architecture Analysis and Design Language (AADL) is a modeling language that, at its core, allows designers to specify the structure of a system (components and connections) and analyze its architecture. From a security point of view, for example, we can use AADL to verify that a high-security component does not communicate with a low-security component and, thus, ensure that one type of security leak is prevented by the architecture. The ability to capture the behavior of a component allows for even better use of the model. This blog post describes a tool developed to support the AADL Behavior Annex and allow architects to import behavior from Simulink (or potentially any other notation) into an architecture model.
The amount of software in safety-critical systems, such as automotive, aerospace, and medical domains—where failure could result in loss of life—continues to grow in size and complexity. As systems become more complex, the challenges of architecting them continue to grow. In Safety Critical Systems: Challenges and Directions, John C. Knight took note of this trend: "Breakdown in the interplay between software engineering and systems engineering remain a significant cause of failure. It is essential that comprehensive approaches to total system modeling be developed so that properties of entire systems can be analyzed. Such approaches must accommodate software property and provide high fidelity models of critical software characteristics."
Many software projects capture the behavior of components using state machines or use-case diagrams. These descriptions support development efforts, either as a specifications document or as inputs for code generators. They focus on only one component, however, and do not take into account the execution environment (such as the hardware platform, communication buses, and potential concurrent use of resources) or the integration with other architectural elements (such as connections with other components). Below we describe how a tool developed in accordance with the AADL Behavior Annex can support these capabilities.
Foundations of Our Approach
Integration issues account for more than 70 percent of software defects. As behavior is a key aspect of the system, it is crucial to analyze the interplay between behavior specifications and the overall architecture. Most modeling techniques focus on behavior (what the system is supposed to do) for design and implementation purposes. Although such notations detail behavioral aspects accurately, they do not incorporate the execution environment (how the service is provided). This omission makes it hard to analyze the interactions and dependencies between system functionality and its execution environment, which is of primary importance because one aspect can impact the other. (For example, a task may request a resource that is already locked, leading to an unexpected delay or a timeout.)
In contrast, AADL allows architects to specify the structure of the system as well as the system’s execution environment. The remainder of this blog post, therefore, describes research aimed at integrating a behavior description (what the system performs) into the execution environment defined in AADL (how the system performs the functions: deployment of system functions on processors, buses, etc.).
AADL provides mechanisms to extend component specifications, including user-defined properties and annex languages. While user-defined properties are an extension mechanism for describing non-functional properties and are semantically limited, annex languages allow architects to associate third-party languages with a component to specify various aspects. Several annex languages have been proposed, such as the error-model annex, which allows for the specification of errors and faults that occur within a component or that may propagate across the architecture. The behavior annex brings into the overall architecture the ability to describe the behavior of AADL components as well as their interactions.
SAE International, formerly the Society of Automotive Engineers, published the AADL Behavior Annex language extension to describe component behavior in terms of a state machine that interacts with component members (e.g., modifying data when the component is in a specific state) and interfaces (e.g., sending data through a port when the component is in a state or triggering a state transition upon a specific event received on an AADL event port).
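To make the idea concrete, the sketch below models such a state machine in Python. It is not AADL Behavior Annex syntax; the component, its states (`idle`, `sending`), and its port names are hypothetical.

```python
from typing import List, Tuple


class SensorBehavior:
    """Toy component: an event on 'start' moves it to 'sending', where each
    step updates internal data and emits it on the 'out' port."""

    def __init__(self) -> None:
        self.state = "idle"
        self.counter = 0                       # component-internal data
        self.sent: List[Tuple[str, int]] = []  # (port, value) pairs emitted

    def on_event(self, port: str) -> None:
        """Trigger a transition when an event arrives on an event port."""
        if self.state == "idle" and port == "start":
            self.state = "sending"
        elif self.state == "sending" and port == "stop":
            self.state = "idle"

    def step(self) -> None:
        """Modify data and send it through a port only in the 'sending' state."""
        if self.state == "sending":
            self.counter += 1
            self.sent.append(("out", self.counter))


s = SensorBehavior()
s.step()              # ignored: the component is still idle
s.on_event("start")
s.step()
s.step()
s.on_event("stop")
print(s.state, s.sent)
```

The three ingredients named in the text are all present: state-dependent data modification, state-dependent output on a port, and event-triggered transitions.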
The behavior annex is currently being revised and improved by SEI researchers. Revisions include adding the ability to connect behavior specifications with other annexes such as the error-model annex. By connecting component descriptions, analysis tools can then show the impact between different aspects of the model (e.g. behavior and error specification). This modification would provide architects greater insight into the impact that component behavior can have on other system attributes. For example, this approach will allow architects to visualize how a task activation or communication delay can generate an error that could be propagated through the rest of the architecture.
The behavior annex is now included in the Open-Source AADL Toolset Environment (OSATE). Télécom Paris Tech, an engineering school that has been involved in AADL development for several years, played a significant role in incorporating the behavior annex into the OSATE framework. The behavior annex language is being used in OSATE extensions for research projects such as the System Architecture Virtual Integration (SAVI) program, which was undertaken by the Aerospace Vehicle Systems Institute. To read more about the use of AADL in the SAVI program, which was a recent post in our ongoing series on AADL, please click here.
Adding Behavior to the Execution Platform
The AADL Behavior Annex augments system architecture description for a better design and more accurate analyses. From a security perspective, for instance, component behavior specifies the communication policy so that analysis tools can check that all channels exchange data at the same classification level. For example, if a component has two communication channels—one for highly secured data and another for low-secured data—the behavior description will specify which one is used to send data, thereby avoiding an important type of security leak.
Such validation is infeasible with existing tools because they do not provide the appropriate semantics to capture these aspects. The loose coupling of notation and analysis tools makes it hard to analyze a system’s behavior within its execution environment. Integrating these notations in a single model helps address these issues and provides better end-to-end system analysis.
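The kind of classification check described above can be illustrated with a small sketch. The component names, classification labels, and the `find_leaks` function are assumptions for illustration; an actual analysis would work on the AADL model itself rather than on Python dictionaries.

```python
from typing import Dict, List, Tuple

# Hypothetical classification level assigned to each component.
LEVEL: Dict[str, str] = {
    "crypto_unit": "high",
    "mission_computer": "high",
    "maintenance_log": "low",
}

# Hypothetical connections as (source, destination) pairs.
CONNECTIONS: List[Tuple[str, str]] = [
    ("crypto_unit", "mission_computer"),
    ("crypto_unit", "maintenance_log"),
]


def find_leaks(connections: List[Tuple[str, str]],
               level: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return connections whose endpoints have different classification levels."""
    return [(src, dst) for src, dst in connections if level[src] != level[dst]]


print(find_leaks(CONNECTIONS, LEVEL))
```

This structural check only sees that a channel exists. Knowing which channel a component actually uses to send a given piece of data requires the behavior description, which is the gap the annex fills.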
In the previous post in our series on AADL, we explored the use of AADL in several application domains such as aerospace and aviation as well as projects to specify cyber-physical systems. Current projects focus mostly on defining the system architecture. For example, these projects define functions such as temperature readings and how they are realized (using a dedicated device or sensor) and inter-connected (a shared bus between the sensor and the computer receiving and processing temperature values).
Until recently, however, AADL specifications did not describe how they perform their designated functions. Our latest work, therefore, augments the existing language so that users can associate behavior specifications with each hardware and software component. This capability will integrate another part of the component description and extend potential analysis so that tools can find more issues or refine actual diagnosis of system defects.
Integration with Other Behavior Specification
Several methods and tools already exist to specify component behavior. For example, Simulink or SCADE provide architects the capability to define state machines that characterize how a component processes data and produces its outputs. Thus, when a user establishes the AADL behavior of a component that has already been defined with another language, the following question is raised:
Is the AADL behavior consistent with the other specification?
Indeed, one has to ensure that both behaviors (the AADL model and a second language) are consistent to assure model correctness and consistency. This assurance increases confidence that verifications performed using the AADL model are applicable to other notations as well. For that purpose, as shown in the picture below, current research efforts aim to
automatically generate AADL behavior specification from existing behavior model
validate an AADL behavior specification according to a behavior model defined with another notation
These two different methods to connect behavior specifications (from AADL and other languages) will help engineers ensure correct specifications. In addition, ensuring correctness and consistency in behavior specifications will help architects avoid discrepancies between different behavior specifications, ensuring that the behavior description is actually what is specified in other models and not what the engineer wants to expose.
The approach shown in the picture above would avoid any mismatch of behavior specifications, especially if the models are developed by different engineers (i.e., one who makes the AADL model and another who focuses on behavior specification using another language). Likewise, generating AADL behavior specification from existing models would encourage engineers to integrate the behavior description in their architecture model by reducing the learning curve.
These new methods are being implemented as part of OSATE and are already released under an open-source license in the testing branch. Interested users can already use them by downloading our testing release.
Wrapping Up and Looking Ahead
Current research efforts from the SEI and its collaborators aim to refine and improve the AADL Behavior Annex to provide a better and more accurate description of components behavior (e.g., by providing the ability to associate behavior for system or process components, a feature not currently allowed by the current standard). As part of this effort, SEI researchers are also working on methods to connect the AADL behavior specification with existing behavior models used in industrial projects.
Our research and development activities are yielding an improved method and a set of open-source software components that integrate behavior in the architecture, thereby extending the description of the system and providing more information that can be analyzed and processed by analysis tools. The results of our work should help architects detect more issues when integrating components and support design and analysis. In future work, we plan to investigate the impact of behavior aspects on system quality attributes (such as performance, safety, or security). AADL can capture all these aspects in a single notation. This capability provides the information designers need to reason about several architecture variants and choose the most appropriate one according to their business goals.
We welcome your feedback on this research effort. Please leave us feedback in the comments section below.
Additional Resources
For more information about AADL, please visit www.aadl.info and our wiki https://wiki.sei.cmu.edu/aadl/
SEI Blog | Jul 27, 2015 02:13pm
By Nader Mehravari Senior Member of the Technical Staff CERT Cyber Risk Management Team
This blog post was co-authored by Julia Allen and Pamela Curtis.
In October 2010, two packages from Yemen containing explosives were discovered on U.S.-bound cargo planes of two of the largest worldwide shipping companies, UPS and FedEx, according to reports by CNN and the Wall Street Journal. The discovery highlighted a long-standing problem—securing international cargo—and ushered in a new area of concern for such entities as the United States Postal Inspection Service (USPIS) and the Universal Postal Union (UPU), a specialized agency of the United Nations that regulates the postal services of 192 member countries. In early 2012, the UPU and several stakeholder organizations developed two standards to improve security in the transport of international mail and to improve the security of critical postal facilities. As with any new set of standards, however, a mechanism was needed to enable implementation of the standards and measure compliance to them. This blog post describes the method developed by researchers in the CERT Division at Carnegie Mellon University’s Software Engineering Institute, in conjunction with the USPIS, to identify gaps in the security of international mail processing centers and similar shipping and transportation processing facilities.
Foundations of Our Approach
This engagement was not our first time working with the USPIS. Our first project, in 2011, involved helping USPIS ensure that packages sent to other countries from the United States complied with U.S. export laws. As a result of this initial engagement, USPIS had a well-defined process for addressing export law compliance. In addition to export screening, we have worked with USPIS on projects that involve incident response, authentication services, physical security, aviation screening for international mail, Priority Mail Express revenue assurance, and the development of mail-specific resilience management process areas for mail induction, transportation, and revenue assurance.
The projects that we’ve worked on with the USPIS draw upon the CERT Resilience Management Model (CERT-RMM). CERT-RMM is a capability-focused maturity model that combines aspects of IT operations management with operational risk and resilience management, such as information security and business continuity.
In response to the discovery of explosives in the two packages from Yemen and the subsequent development of new security standards by the UPU, the USPIS asked our team, which included David White, Pamela Curtis, and Julia Allen, to use the CERT-RMM assessment method and process to develop an assessment methodology along with a companion field instrument for the following two UPU standards:
S58, Postal Security Standards—General Security Measures defines the minimum physical and process security requirements available to critical facilities within the postal network.
S59, Postal Security Standards—Office of Exchange and International Airmail Security defines minimum requirements for security operations relating to the air transport of international mail.
Based on the requirements and design criteria that USPIS had specified, our team had to develop a methodology and an associated field instrument that were repeatable (i.e., generate consistent results when used by different teams in the same situation); cost effective and scalable (i.e., economical and functional for all locations); accurate (i.e., evidence-based); meaningful (i.e., results could easily be acted on by owners and operators of the assessed facilities); and transparent (i.e., publicly available and usable for self-assessment).
One challenge in developing an assessment methodology based on UPU standards was contending with the ever increasing interweaving of physical and cyber domains. Securing even a mundane physical asset, such as a parcel, involves controlling both tangible and intangible objects, such as computer systems, networks, processes, and sensors.
Another challenging aspect of this work was that the UPU standards were not developed with much consideration as to how member countries would use them to assess whether mail processing facilities were in compliance. This challenge made it hard to develop the streamlined approach that USPIS was seeking to evaluate mail processing facilities that handle international mail.
A Proven Method
As depicted in the figure below, our team developed a methodology that defined three phases for conducting the assessment:
Preparation: Analyze requirements, develop an assessment plan, select and prepare the assessment team, send and receive the pre-assessment questionnaire, obtain and inventory objective evidence, and prepare for the conduct of the assessment (initial site visit and logistics).
Onsite: Prepare participants, conduct interviews, examine objective evidence, document objective evidence, verify objective evidence, perform characterizations and ratings, formulate and validate preliminary findings, generate the final results of the assessment, and identify improvements to the method and the standards.
Reporting: Deliver assessment results to sponsors and key stakeholders, and preserve and archive assessment results.
The questionnaire-based assessment instrument developed by our team contains a series of questions. Below are some examples of the types of questions included, as well as the areas (in bold type) that they cover:
Risk Management: Do you conduct an annual risk assessment of each critical facility?
Physical Security: Do you have a written facility security plan?
Physical Security: Are facilities constructed to prevent illegal entry?
Access Control: What is the process for the control, issuance, and removal of identification badges?
Human Resources: Do you have a documented personnel selection and hiring policy and process?
Mail Security: How are high risk mail items identified in the mail stream?
Transportation: Are routes, schedules, and planned stops assessed for security?
Our methodology also defines the requirements for evidence collection for each section of the standards. The team conducting the assessment must examine documented artifacts or receive oral and written statements and affirmations confirming or supporting the implementation (or lack of implementation) of a practice. If the assessment team observes specific weaknesses in implementation, team members record them on their assessment worksheets. For example, a weakness in implementation of S58 Section 5.1.1, Risk Assessment and Facility Security Plans, might be that the facility’s security plan covers general lighting requirements, but not interior emergency lighting.
The assessment team uses our field instrument to keep track of the results of interviews and other objective evidence collected. The team uses this information to rate the degree to which each requirement in the standards is implemented. The team also uses the field instrument to generate a summary of the results, in the form of a colored heat map, which is presented to the facility owners and operators. A small segment of a sample heat map is shown in the figure below. The heat map was designed to be easily understood by all relevant stakeholders, ranging from security operations staff to senior decision makers.
(LEGEND: FI = Fully Implemented; LI = Largely Implemented; PI = Partially Implemented; NI = Not Implemented; S = Satisfied; NS = Not Satisfied.)
Through our research, we aim to transition our solutions to our stakeholders by enabling them to conduct assessments independently. In early 2012, the assessment method was piloted by USPIS staff at several international postal administrations. As a result of one of the USPIS pilot assessments, one of the national postal administrations closed an international mail dispatch facility and moved operations to a new facility with improved security controls and conformance with UPU standards. Other assessments have shown that postal administrations largely conform to UPU standards and that having the specific feedback of assessment results encourages them to make the minor improvements needed to ensure full compliance.
One consistent finding of our pilot highlighted the effectiveness of the method for producing accurate assessments. The S58 standard requires a single, written security plan for critical facilities. While each pilot location that the USPIS assessed maintained a security plan, the plans at those sites did not contain all of the elements that the standard requires, including general facility design standards, perimeter barriers, perimeter windows, doors or other openings, lighting, and locking mechanisms and key controls. The methodology that we developed for the USPIS included evidence-discovery procedures such as direct artifacts, indirect artifacts, and affirmations. These evidence-discovery procedures helped the assessment team locate those elements in other documents such as maintenance plans.
Other benefits that participating facilities cited in field reports and assessment results included:
insight into the strengths and weaknesses of current security practices
recognition of a strong security posture by the International Civil Aviation Organization, World Customs Organization, and supply chain partners that rely on postal services for moving goods
guidance to prioritize security-related improvement plans
feedback on the maturity of the organization’s security program
enhanced identification and prioritization of security risks
The method, the results, and the overall explanation of the project are detailed in a technical note titled A Proven Method for Identifying Security Gaps in International Postal and Transportation Critical Infrastructure, which I co-authored with Julia H. Allen and Pamela D. Curtis of the SEI and Gregory Crabb of the USPIS.
Looking Ahead
Our work with the USPIS has demonstrated that, when properly interpreted, CERT-RMM can be applied to a wide range of business objectives. For example, CERT-RMM allows for the addition of new asset types (such as mail items) and the development of new process areas (PAs) (such as mail induction and mail revenue assurance), which can be used in concert with existing CERT-RMM process areas. These new PAs reference established CERT-RMM PAs for specific types of processing including the following:
identification of discrepancies in mail-specific PAs invokes the Incident Management and Control PA, the purpose of which is to identify and analyze events, detect incidents, and determine an appropriate response.
identification of risks in mail-specific PAs invokes the Risk PA, the purpose of which is to identify, analyze, and mitigate risks to organizational assets that could adversely affect the operation and delivery of services.
In January 2014, we presented our work jointly with the USPIS at the 93rd Annual Meeting of the Transportation Research Board Conference. One message we delivered during that presentation was that moving mail, from the time it is accepted to the time it is delivered, is fundamentally a transportation activity, where the entities being transported are mail items. The security and resilience management techniques we developed for postal administrations are thus also applicable to other transportation modes, as well as to other safety and security standards. These transportation modes include those that move people (such as metropolitan area transit systems) and those that transport goods by air, ground, and sea.
Pilot organizations have shown that using our method, which is ultimately a structured and scripted assessment instrument, is an effective way to assess compliance with the UPU postal security standards. During the 25th Universal Postal Congress in Doha, Qatar, in September 2012, our method was recognized as the approach for assessing compliance with the new UPU security standards. The USPIS and other postal sector organizations continue to use the assessment method to achieve initial results and assess progress made after implementing improvements. In 2014, the method will be provided to civil aviation authorities, who will use it primarily to assess the performance of postal administrations in meeting the screening and other international airmail security standards of S59.
Additional Resources
To read the SEI technical report, A Proven Method for Identifying Security Gaps in International Postal and Transportation Critical Infrastructure, please visit http://resources.sei.cmu.edu/library/asset-view.cfm?assetid=77265.
For more information about the CERT Resilience Management Model (CERT-RMM), please visit http://www.cert.org/resilience/products-services/cert-rmm/cert-rmm-model.cfm.
To hear Gregory Crabb, Inspector in Charge of Revenue, Product, and Global Security at the USPIS, discuss his organization’s use of CERT-RMM, listen to the CERT podcast at http://www.cert.org/podcasts/podcast_episode.cfm?episodeid=6E2258D2-DE92-C7DF-D7D2B43BEBCEF8A0&pageid=34576.
SEI Blog | Jul 27, 2015 02:12pm
This post is the second in a series on prioritizing malware analysis.
By Jose Andre Morales, Researcher, Cyber Security Solutions Division
Every day, analysts at major anti-virus companies and research organizations are inundated with new malware samples. From Flame to lesser-known strains, figures indicate that the number of malware samples released each day continues to rise. In 2011, malware authors unleashed approximately 70,000 new strains per day, according to figures reported by Eugene Kaspersky. The following year, McAfee reported that 100,000 new strains of malware were unleashed each day. An article published in the October 2013 issue of IEEE Spectrum updated that figure to approximately 150,000 new malware strains per day. Not enough manpower exists to manually address the sheer volume of new malware samples that arrive daily in analysts’ queues. In our work here at CERT, we felt that analysts needed an approach that would allow them to identify and focus first on the most destructive binary files. This blog post is a follow-up to my earlier post entitled Prioritizing Malware Analysis. In this post, we describe the results of the research I conducted with fellow researchers at the Carnegie Mellon University (CMU) Software Engineering Institute (SEI) and CMU’s Robotics Institute. Our analysis demonstrated the validity (with 98 percent accuracy) of our approach, which helps analysts distinguish between the malicious and benign nature of a binary file.
The purpose of this project was to facilitate malware analysis by placing incoming malware samples in a prioritized queue. Those samples at the top of the queue would be considered the most dangerous and given highest priority for full analysis. This prioritization provides an order to a large set of malware samples, thereby guiding analysts in deciding which malware samples to analyze first.
Background
When the highly dangerous Flame virus was discovered in 2012, it came to light that a sample of this malware had been in the repository of a major anti-malware company for at least two years. This incident motivated our research: it showed that a malware analyst may have no way of choosing which malware to analyze first from a large repository, which may contain a mix of both malicious and benign files.
Several analysis systems currently exist including
Anubis http://anubis.iseclab.org/
BitBlaze http://bitblaze.cs.berkeley.edu/
Cuckoo Sandbox www.cuckoosandbox.org
Joe Security http://www.joesecurity.org/
Malheur http://www.mlsec.org/malheur/
Norman Shark http://normanshark.com/products-solutions/products/malware-analysis-mag2/
Darkpoint https://darkpoint.us/
These systems do a good job of informing a user what a binary under analysis does when executed, but they do not prioritize samples for analysis. Analysts therefore need an automated approach that analyzes these files, gives a sense of whether each one is benign, previously seen malware, previously unseen malware, or highly dangerous malware, and prioritizes the samples based on a set of features representing highly malicious malware that requires high priority for analysis. With such an automated method in place, the analyst’s job of choosing which sample to start with is greatly simplified, producing a streamlined process of deep malware analysis that focuses on the most dangerous samples first.
Approach
To create a prioritized queue of malware samples in an efficient manner for a large set of incoming files, we decided to run each sample through a runtime analysis system for three minutes. The use of runtime analysis helped us quickly understand how a sample interacts with the underlying operating system. These interactions create execution events from which features can be extracted and used in our prioritization.
We used the CERT Malicious Code Automated Run-Time Analysis (MCARTA) Windows runtime analysis system for this research. The system was given five sets of sample inputs:
A set of 2,277 known benign samples, all consisting of WIN32 Portable Executable files
A set of 65 known malicious samples that were the top-ranked most infectious and widespread during the 2008-2013 period, as ranked by Kaspersky’s yearly reports
A set of 291 malware samples named in the Mandiant APT1 report
A set of 1,884 malware samples from the Citadel and Zeus families
A set of 11K known malicious samples downloaded from virusshare.com
The sets described in items 1-4 above were used for training classification and clustering algorithms; the set in item 5 was used for testing. As each sample was fed to MCARTA, the runtime analysis lasted three minutes and the results were stored in reports. We built customized scripts to extract the desired features from the reports and place them in a comma-separated value file. These files were saved for later use in the research. We added the malware names for each sample from the anti-malware engines F-Secure, Microsoft, and Symantec that were being used by the anti-malware scanner service VirusTotal. The queries to VirusTotal were automated via a script that used VirusTotal’s web-based application programming interface (API) set.
The features used in this research were based on our prior experience with runtime analysis and detection of malware. Our methodology was based on our suspicion assessment approach, which asks two questions: "Who are you?" and "What do you do?" The first question addresses provenance (which, for this research, we define as the person or group that developed the source code of the file under analysis) and attribution of the static file image for a given process; we answered it by checking for a verified digital signature. The second question concerns the execution events of a process on a system; we answered it using the features extracted by our MCARTA runtime analysis framework.
The features can be grouped in two broad areas: observed and inferred. Observed features are identified directly from the captured data, whereas inferred features required analyzing the captured data to conclude if a specific event occurred or not. Our focus for feature selection was on behaviors related to infection, injury, and survivability used by malware when entering a system:
Infection is normally an essential component of malware facilitating the spreading of the malware in other files and processes. This assists the malware in achieving several goals such as delegation of nefarious deeds to other processes and detection avoidance.
Injury is used by malware to damage the target system in some way including data stealing, access denial to essential system components, and deletion of files and processes.
Survivability is the malware’s need of running undetected for as long as needed to complete its overall goal. To survive, a malware can carry out several acts, such as stopping or deleting anti-malware software and running secretly as a background process or service.
One can surmise that for any given malware at least one of these three behaviors is in use and can be identified via the various techniques used to implement it during runtime. Our features represent the techniques that can be used to implement these three behaviors of malware within the universe of file system, process, and network activity of an end host computer. The features used in this research were based on the Windows 7 operating system and were as follows:
File System
Observed: Open Files; Find Files; Create Files; Get file attributes; Move Files; Copy Files
Inferred: Attempts to delete self; Copy/Move self to other file system location; Open standard Windows cryptography-related DLL
Registry
Observed: Created registry keys; Open Keys; Set Value; Delete Value; Enumerate Value
Inferred: Values deleted from machine/currentversion/run/ registry key; Disable antimalware from starting on reboot by deleting its value from currentversion\run registry key; Set value currentversion/run to start self or copy of self at reboot; Set value in a registry key created by sample; Delete values with Registry key \CurrentVersion\Internet Settings; Deleted registry keys referring to an anti-malware product
Processes
Observed: Created Processes; Kill Processes; Open Processes
Inferred: Process started from file created by sample; Dynamic code injection: open process with desired access, create thread & write
Windows Service
Observed: Open Services; Create Service
System Environment
Observed: Sleeps; Enumerate System Modules; Check for Kernel Debugger
Threads
Observed: Create Threads
Network Activity
Observed: Create Objects; DNS Requests
Inferred: Connection attempts excluding localhost (127.0.0.1); Maximum connection attempts to same IP address; Different non-localhost IP addresses used in connection attempt
Digital Signature
Observed: Verified Digital Signature
Window GUI
Observed: Create Window of size 0
Our previous research indicated the inferred features above are highly suggestive of malicious behaviors by malware; their occurrence indicates high suspicion of potential malware presence. These features were collected for the process under analysis. During the analysis, we created a malware infection tree and extracted features for each process in the tree.
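As an illustration of how an inferred feature differs from an observed one, the three inferred network features listed above could be derived from a raw list of connection attempts roughly as follows. This is a minimal sketch, not the MCARTA implementation: the input format (one destination IP string per attempt) and the function and key names are assumptions made for the example.

```python
from collections import Counter

LOCALHOST = "127.0.0.1"

def inferred_network_features(connection_attempts):
    """Derive inferred network features from observed connection attempts.

    connection_attempts: list of destination IP strings, one per attempt
    (hypothetical input format, assumed for this sketch).
    """
    # Ignore attempts to localhost, as the feature definitions require.
    remote = [ip for ip in connection_attempts if ip != LOCALHOST]
    per_ip = Counter(remote)
    return {
        "connection_attempts": len(remote),
        "max_attempts_same_ip": max(per_ip.values(), default=0),
        "distinct_remote_ips": len(per_ip),
    }

# Illustrative data only (documentation-range IP addresses).
example = ["127.0.0.1", "203.0.113.5", "203.0.113.5", "198.51.100.7"]
features = inferred_network_features(example)
```

Here the observed data is the list of attempts itself; the inferred values only exist after that list has been filtered and counted.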
Evaluation Criteria
Our research is evaluated based on the usability of the chosen features to differentiate between malicious and benign samples with minimal false positives and false negatives. Further evaluation is based on producing a priority queue with an ordering of samples that allows an analyst to decide which samples to analyze first. The samples analyzed first should be the most malicious in the set, as judged either by their features or by an absence of features that potentially indicates an ability to carry out nefarious deeds in a benign-like manner.
Results
After collecting all needed features, we analyzed the data with different classification and clustering techniques to determine the correctness and usefulness of the feature set for prioritizing malware samples. We labeled each sample with its name from F-Secure, Microsoft, and Symantec by submitting the SHA256 hash value to virustotal.com via their API set. Of the 11K samples, only 8,969 were used to report the research results. The remaining samples had non-identifiable answers or error messages returned from VirusTotal.com and were therefore omitted. The tests performed were as follows:
Test 1: Attempt to separate benign from malicious in the training sets.
Test 2: Prioritize malware samples for analysis based on their execution behavior.
Test 1 was meant to validate if our runtime features were capable of differentiating between known benign and known malicious samples. We selected a group of popular classification algorithms—i.e., K-Nearest Neighbor (K = 10), Random Forest, AdaBoost, and Support Vector Machine (RBF Kernel)—and evaluated them using 10-fold cross validation. K-fold cross validation splits the data into K equally sized chunks; the algorithms are trained on K-1 chunks and evaluated on the remaining chunk. This process is carried out with each of the K chunks serving as the evaluation set, and results are averaged across all K evaluations.
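The K-fold procedure described above can be sketched in a few lines. This is a simplified illustration rather than the code used in the study: it splits indices in order, whereas a real evaluation would normally shuffle or stratify the data first, and `train_and_score` is a hypothetical callback standing in for training a classifier and returning its score on the held-out chunk.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, eval_indices) for each of the k folds."""
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # Spread any remainder over the first folds so sizes differ by at most 1.
        size = base + (1 if fold < extra else 0)
        eval_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, eval_idx
        start += size

def cross_validate(n_samples, k, train_and_score):
    """Average the score returned by train_and_score over all k folds."""
    scores = [train_and_score(train, held_out)
              for train, held_out in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)
```

With K = 10, each sample is used for evaluation exactly once and for training nine times.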
We used two popular metrics for our evaluation of classification accuracy: area under the precision/recall (PR) curve, which measures how well each method can distinguish between malicious and benign samples, and area under the receiver operating characteristic (ROC) curve, which measures the false positive rate required to experience a particular true positive rate. For the PR curves, each algorithm assigns a score to each sample in the data set, where a higher score indicates that the sample is believed to be more malicious, and we measure how well the method ranks true malware above benign samples.
The list of sample scores returned by an algorithm is sorted and iterated through: at each step, we use the score of the current sample as a threshold for classifying anomalies and calculate the algorithm’s precision (number of correctly identified malware divided by the total number of predicted malware) and recall (number of correctly identified malware divided by the total number of true malware). For the ROC curves, the same process is followed, but at each step we calculate the algorithm’s true positive rate (number of correctly identified malware divided by the total number of true malware) and false positive rate (number of benign samples incorrectly identified as malware divided by the total number of true benign samples).
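The threshold sweep behind both curves can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's evaluation code: labels are encoded as 1 for malware and 0 for benign, both classes are assumed to be present, and the false positive rate is computed as benign samples flagged as malware divided by all benign samples. Recall and true positive rate are the same quantity.

```python
def pr_roc_points(scores, labels):
    """Sweep each distinct score as a threshold, highest first.

    scores: higher means more likely malicious.
    labels: 1 for malware, 0 for benign (assumed encoding; both must occur).
    Returns a list of (threshold, precision, recall, false_positive_rate).
    """
    total_malware = sum(labels)
    total_benign = len(labels) - total_malware
    points = []
    for threshold in sorted(set(scores), reverse=True):
        flagged = [s >= threshold for s in scores]
        tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
        fp = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
        precision = tp / (tp + fp)   # at least one sample is always flagged
        recall = tp / total_malware  # identical to the true positive rate
        fpr = fp / total_benign
        points.append((threshold, precision, recall, fpr))
    return points

# Illustrative scores and labels only.
points = pr_roc_points([0.9, 0.8, 0.4, 0.2], [1, 0, 1, 0])
```

Plotting precision against recall gives the PR curve, and plotting recall against the false positive rate gives the ROC curve; the areas under those curves are the metrics reported here.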
According to the graphs below, we observe that these popular classification algorithms are consistently able to distinguish between benign and malicious samples, i.e., experiencing >=98 percent average area under the PR curve and >= 97 percent average area under the ROC curve. Additionally, for the particularly important case of advanced persistent threat malware, Random Forest and AdaBoost, well regarded ensemble methods, experienced more than 96 percent average area under both curves, further indicating that the runtime features we selected are extremely useful in identifying the malicious or benign nature of a sample.
Test 2 was carried out using the 11K samples as the test set. K-means clustering was used to group together malware with similar execution behaviors. The results are shown in the pie chart below.
As shown in the pie chart above, a total of 25 clusters were formed, with three large clusters, a handful in the hundreds, and the remaining samples in clusters of 10 samples or fewer. Each of the clusters was a mixed batch of samples from known malware families along with several samples labeled as a variant or generic. Only cluster 2 had a majority of samples from the same family. Of its seven samples, six were identified by F-Secure as Trojan.Crypt.HO, and the seventh was Win32.Sality.3. Since AdaBoost and Random Forest gave the best results in Test 1, we report their ranking of clusters and individual samples for prioritizing malware analysis.
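For readers unfamiliar with K-means, the clustering step in Test 2 follows the general pattern sketched below. This is a bare-bones illustration, not the configuration used in the study: the feature vectors are made up, the initial centroids are passed in explicitly (real implementations usually pick them at random or with a seeding heuristic), and a fixed iteration count stands in for a convergence check.

```python
def k_means(points, centroids, iterations=10):
    """Group points around the nearest centroid and re-center repeatedly.

    points: list of equal-length numeric tuples (feature vectors).
    centroids: initial cluster centers, supplied by the caller in this sketch.
    """
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Squared Euclidean distance from p to every centroid.
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Illustrative two-dimensional feature vectors only.
samples = [(0, 0), (0, 1), (10, 10), (10, 11)]
centers, groups = k_means(samples, [(0, 0), (10, 10)])
```

Samples with similar runtime feature vectors end up in the same group, which is what allows a whole cluster to be ranked at once.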
The table below shows the confidence percentage, as determined by AdaBoost and Random Forest, that a specific cluster is malicious. The first column is the cluster number, the second is the AdaBoost confidence percentage, the third is the Random Forest confidence percentage, the fourth is the difference between the two confidence percentages, and the fifth is the average of the two confidence percentages for each cluster. The difference in confidence percentage was minimal in most cases, implying strong agreement between AdaBoost and Random Forest on whether a cluster is malicious. Clusters 3, 19, and 21 had 100 percent confidence from AdaBoost; none had similar confidence from Random Forest, with cluster 19 the closest at 93.7 percent. Clusters 18, 19, and 20 ranked highest in Random Forest and had much more similar scores in AdaBoost. The average of the two confidence scores is very revealing, with clusters 18, 19, 15, and 16 ranking the highest. Cluster 18 had the highest average confidence percentage overall; that cluster consisted of four samples, three of which were Trojan.generic and one of which was variant.symmi.
Although AdaBoost and Random Forest are both ensemble methods, they optimize different loss functions and therefore can provide different vantage points on the maliciousness of a sample. Averaging the confidence scores of several machine-learning algorithms can provide a consensus on maliciousness, giving a strong basis for deciding which clusters are truly the most malicious in a given data set. In practice, machine-learning-based malware detection is typically based on a single algorithm’s results, but our study suggests that a majority consensus could be a better approach.
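The consensus ranking can be illustrated with a short C sketch. The ClusterScore type and rank_by_consensus function are hypothetical names introduced for this example, and any values passed in are placeholders rather than our measured results.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One row of a per-cluster confidence table (hypothetical layout). */
typedef struct {
    int    cluster_id;
    double adaboost;      /* confidence in percent, 0..100 */
    double random_forest; /* confidence in percent, 0..100 */
    double difference;    /* absolute difference of the two confidences */
    double average;       /* mean of the two confidences */
} ClusterScore;

/* qsort comparator: highest average confidence first. */
static int by_average_desc(const void *a, const void *b)
{
    const ClusterScore *x = a;
    const ClusterScore *y = b;
    if (x->average < y->average) return 1;
    if (x->average > y->average) return -1;
    return 0;
}

/* Fill in the difference and average columns, then rank by the average. */
void rank_by_consensus(ClusterScore *clusters, size_t n)
{
    assert(clusters != NULL);
    for (size_t i = 0; i < n; i++) {
        double a = clusters[i].adaboost;
        double r = clusters[i].random_forest;
        clusters[i].difference = (a > r) ? a - r : r - a;
        clusters[i].average = (a + r) / 2.0;
    }
    qsort(clusters, n, sizeof clusters[0], by_average_desc);
}
```

With more than two algorithms, the same idea extends by averaging over an array of confidences per cluster instead of two fixed fields.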
The table below shows the number of individual samples at each confidence percentage for AdaBoost and Random Forest. Random Forest never assigned 0 percent confidence of a sample being malicious, implying that it produced no false negatives. AdaBoost assigned 0 percent confidence to 149 samples, which equates to a 1.6 percent false negative rate. Random Forest placed most samples in the 60 to 99 percent confidence range, whereas AdaBoost placed the majority in the 90 to 100 percent range. The results show a tradeoff: no false negatives but lower confidence with Random Forest, or much higher confidence with some false negatives with AdaBoost. In practice, few or no false negatives are preferred, since one missed malware sample can be enough to compromise an entire system, its users, and its data.
In our research, individual malware samples were prioritized by the confidence score given by Random Forest because it produced no false negatives. The names of the top, middle, and bottom 10 ranked individual samples from Random Forest are listed in the table below. The top 10 consist of samples from known malware families; the middle 10 contain one generic and one nondescript sample, as do the bottom 10. Though the top 10 are considered most malicious since they were determined by Random Forest to be the most similar to our training data, the bottom 10 included samples of well-known malware families such as Expiro, PoisonIvy, Anserin, and Ramnit. The presence of these samples in the bottom 10 could suggest they were either analysis aware, unable to execute correctly in our analysis environment, or able to execute quietly without producing the features our analysis needed to rank them higher.
The 402 samples receiving 100 percent confidence from Random Forest were considered the most malicious of the test set and thus received the highest priority. Recall that the algorithms were trained on the observed and inferred features previously discussed, so these 402 samples are seemingly the most similar to the training data. Conversely, the samples ranked lower, in the 10 to 19 percent range and in the 0 to 9 percent range of AdaBoost, may equally deserve high priority, since all analyzed samples were known malware samples. Samples with low confidence, and especially with 0 percent confidence, can therefore be malware that is very stealthy in its execution, is aware that it is under analysis and purposely acts benign, or may not have executed as expected because required resources were missing from our analysis environment. In general, analyzing the highest and lowest ranking samples in a priority queue may be a highly effective approach to prioritizing malware samples for analysis.
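One possible reading of "highest and lowest first" is to walk a confidence-sorted list from both ends toward the middle. The C sketch below shows only that ordering; the both_ends_order function is a hypothetical helper and assumes the samples have already been sorted by descending confidence.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Given n samples already sorted by descending confidence, write into
 * order[] the indices in the sequence an analyst would visit them:
 * top-ranked, bottom-ranked, second from top, second from bottom, ...
 * order[] must have room for n entries.
 */
void both_ends_order(size_t n, size_t *order)
{
    assert(order != NULL || n == 0);
    if (n == 0) {
        return;
    }

    size_t lo = 0, hi = n - 1, k = 0;
    while (lo < hi) {
        order[k++] = lo++; /* next most confident sample */
        order[k++] = hi--; /* next least confident sample */
    }
    if (lo == hi) {
        order[k] = lo;     /* odd count: the middle sample comes last */
    }
}
```

The samples in the middle of the ranking end up at the back of the queue, which matches the intuition that they are neither the closest match to the training data nor the most suspiciously quiet.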
Collaborations
I led this research with support from the CERT Cyber Engineering Solutions Group led by technical manager Hasan Yasar and Dr. Jeff Schneider of CMU’s Robotics Institute. Edward McFowland III (a Ph.D. candidate at the Heinz College and a master’s student in the Machine Learning Department) provided machine-learning expertise.
Concluding Remarks
In this research, we set out to make malware analysis easier by using machine learning to prioritize samples based on their runtime behaviors. The prioritized samples are ordered in a queue with the most malicious at the top. The analyst can simply analyze samples as they appear in the queue. Our feature set of runtime behaviors produced an average area under the precision-recall curve of greater than 98 percent and an average area under the ROC curve of greater than 97 percent in differentiating between known benign and known malicious samples. In the specific case of advanced persistent threats, our analysis produced greater than 96 percent for both curves.
These results verify that our runtime features are highly useful in determining the malicious nature of a sample. Based on these results, we prioritized our samples with highest priority going to the samples at the top and bottom of our results lists. These were considered the most malicious due to their similarity with our training set (top of list) and the potential ability to modify behavior and appear benign (bottom of list). Our results also suggest that averaging confidence scores of several machine-learning algorithms could provide a majority consensus of agreement of maliciousness of a sample. This produces a strong basis for deciding which samples are truly the most malicious in a given data set. In practice, machine-learning-based malware detection is typically based on a single algorithm’s result, but our study suggests that a majority consensus could be a better approach.
Our future work will focus on expanding our feature set to detect new techniques as malware evolves. We will also create real-time malware detection systems for various operating systems using our feature sets. We will continue to improve current machine-learning algorithms to more accurately differentiate between the malicious and benign nature of a given sample.
We welcome your feedback on our approach and analysis of our results. Please leave feedback in the comments section below.
Additional Resources
To listen to the CERT Podcast, Characterizing and Prioritizing Malicious Code with Jose Morales and Julia Allen, please click here.
To read about other malware research initiatives at the SEI, please visit http://blog.sei.cmu.edu/archives.cfm/category/malware
Jul 27, 2015 02:11pm
By Will Klieber, Member of the Technical Staff, CERT Division
This blog post was co-authored by Lori Flynn.
Although the Android Operating System continues to dominate the mobile device market (82 percent of worldwide market share in the third quarter of 2013), applications developed for Android have faced some challenging security issues. For example, applications developed for the Android platform continue to struggle with vulnerabilities, such as activity hijacking, which occurs when a malicious app receives a message (in particular, an intent) that was intended for another app but not explicitly designated for it. The attack can result in leakage of sensitive data or loss of secure control of the affected apps. Another vulnerability is exploited when sensitive information is leaked from a sensitive source to a restricted sink. This blog post is the second in a series that details our work to develop techniques and tools for analyzing code for mobile computing platforms. (A previous blog post, Secure Coding for the Android Platform, describes our team’s development of Android rules and guidelines.)
Our work was led by a team of researchers who, in addition to myself, included Dr. Lori Flynn, also of the CERT Secure Coding Team; Dr. Lujo Bauer and Dr. Limin Jia of Carnegie Mellon University's Department of Electrical and Computer Engineering; and Amar Bhosale.
Improving Dataflow Analysis
Our first tool, for which we recently completed building a prototype, addresses a problem often seen in information flow analysis: the leakage of sensitive information from a sensitive source to a restricted sink. "Sink" and "source" are terms common to flow analysis. We define a source as an external resource (external to the app, not necessarily external to the phone) from which data is read and a sink as an external resource to which data is written.
Sometimes the flow of information can be from a highly sensitive source to a place that’s not authorized to receive the data. So, the source can be high privilege and the sink can be low privilege. Integrity concerns can also be analyzed using the concept of information flow. Sometimes untrusted data is sent to a place that’s supposed to store only high-trusted data that’s been sent by an authorized source. If data travels from a low-trust source to a high-trust sink, that’s also a problem.
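Both concerns can be expressed as simple comparisons of levels. The following C sketch is an illustration under stated assumptions: the two-level Level scale, the Endpoint type, and the function names are hypothetical and are not part of our tool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { LEVEL_LOW = 0, LEVEL_HIGH = 1 } Level;

/* A source or sink, labeled for confidentiality and integrity. */
typedef struct {
    const char *name;
    Level confidentiality; /* source: sensitivity of its data;
                              sink: sensitivity it is authorized to receive */
    Level integrity;       /* source: how trusted its data is;
                              sink: how trusted its contents must be */
} Endpoint;

/* Confidentiality problem: sensitive data reaches a less privileged sink. */
bool leaks_sensitive_data(const Endpoint *source, const Endpoint *sink)
{
    assert(source != NULL && sink != NULL);
    return source->confidentiality > sink->confidentiality;
}

/* Integrity problem: low-trust data reaches a high-trust sink. */
bool taints_trusted_sink(const Endpoint *source, const Endpoint *sink)
{
    assert(source != NULL && sink != NULL);
    return source->integrity < sink->integrity;
}

/* A flow is acceptable only if it raises neither concern. */
bool flow_is_allowed(const Endpoint *source, const Endpoint *sink)
{
    return !leaks_sensitive_data(source, sink) &&
           !taints_trusted_sink(source, sink);
}
```

A real policy would likely need more than two levels and per-resource labels, but the two comparisons stay the same.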
For example, a smartphone user might install an Android game that leaks the user’s entire contact list to a marketing company. Of course, government agencies also have their sources of sensitive information and they don’t want them to be leaked to unauthorized parties.
We designed and implemented a novel taint flow analyzer (which we call "DidFail") that combines and augments the existing Android dataflow analyses of FlowDroid (which identifies intra-component taint flows) and Epicc (which identifies properties of intents, such as their action strings) to track both inter-component and intra-component dataflow in a set of Android applications.
Our analysis of a given set of apps takes place in two phases:
In the first phase, we determine the dataflows enabled individually by each app and the conditions under which these are possible.
In the second phase, we build on the results of the first phase to enumerate the potentially dangerous data flows enabled by the whole set of applications.
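The two phases can be pictured with a deliberately simplified C sketch. It assumes that phase 1 has already produced two lists per app (flows from a source into a sent intent, and flows from a received intent into a sink) and that phase 2 matches them on the action string alone. The type and function names are hypothetical, and the actual analysis is considerably more detailed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Phase 1 result: `app` reads `source` and sends it in an intent. */
typedef struct {
    const char *app;
    const char *source;
    const char *action; /* action string of the sent intent */
} SourceToIntent;

/* Phase 1 result: `app` receives an intent and writes its data to `sink`. */
typedef struct {
    const char *app;
    const char *action; /* action string the receiver accepts */
    const char *sink;
} IntentToSink;

/* Phase 2 result: a source-to-sink flow that spans two components. */
typedef struct {
    const char *source;
    const char *sender;
    const char *receiver;
    const char *sink;
} CrossAppFlow;

/*
 * Phase 2: join the per-app results on the action string. Returns the
 * number of flows written to found[] (at most max_found).
 */
size_t find_cross_app_flows(const SourceToIntent *out, size_t n_out,
                            const IntentToSink *in, size_t n_in,
                            CrossAppFlow *found, size_t max_found)
{
    size_t count = 0;
    assert(out != NULL && in != NULL && found != NULL);

    for (size_t i = 0; i < n_out; i++) {
        for (size_t j = 0; j < n_in; j++) {
            if (count < max_found &&
                strcmp(out[i].action, in[j].action) == 0) {
                found[count].source   = out[i].source;
                found[count].sender   = out[i].app;
                found[count].receiver = in[j].app;
                found[count].sink     = in[j].sink;
                count++;
            }
        }
    }
    return count;
}
```

Matching on the action string alone is coarse: every receiver that accepts the action is treated as a possible recipient, so a join like this one can report flows that never occur in practice.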
Our tool differs from pure FlowDroid, which analyzes flows of tainted information. FlowDroid focuses on information that flows in a single component of an app; our tool analyzes potentially tainted flows between apps and, within a single app, between multiple components.
Our taint flow analyzer prototype for static analysis of sets of Android apps, DidFail (Droid Intent Data flow Analysis for Information Leakage), was completed in March 2014. Our team is continuing to do research and development with this analyzer, focusing on methods to efficiently increase precision.
Challenges in Our Work
One challenge that we encountered in our work is that while we had access to the full source code for FlowDroid, we could access only the binary code for Epicc. Unfortunately, Epicc produced a list of intents, but did not specify where in the app they originated. To integrate the results of FlowDroid and Epicc, we needed to make Epicc produce this missing information.
Since it wasn't feasible to modify Epicc, we instead modified the Android Application Package File (APK) by tagging each intent with a unique ID so that Epicc would print the intent ID. To do that, we used a function of Soot that enables the instrumentation or transformation of APKs. Soot transformed the binary into Jimple, which is an intermediate representation of a Java program that is Soot-specific. In the Jimple, we looked for intent-sending methods (e.g., methods in the startActivity family) and then inserted new Jimple code that added an extra field (with a unique intent ID number) into the intent. Then, once we processed the APK file, we compiled the Jimple back to Dalvik bytecode and wrote a new, transformed APK file.
Another way of explaining this process is that we wrote a piece of code that takes the original APK and adds a unique identification to each place in the code where the APK sends an intent. The use of unique IDs enabled us to match the output of Epicc with the output of FlowDroid. (We modified FlowDroid to also look for these same intent IDs and print them in its output.) We add the unique ID as a dummy extra field using the putExtra method, because Epicc prints out the list of properties added to an intent with putExtra.
Analysis Tool Publication
As of April 2014, we believe that our tool is the most precise (publicly available and/or documented) taint-flow static analysis tool for Android apps. Our tool enables users and organizations to be confident about the security of the set of apps they allow to be installed together while also enabling them to install the greatest number of apps that abide by their security policy.
This tool is freely available to the public for download along with a small test suite of apps that demonstrates the new analytical functionality it provides. Our recently accepted paper, Android Taint Flow Analysis for App Sets, provides details about the DidFail tool and test results of using it with an Android app set. We encourage you to first read the paper, and then download the DidFail tool and the provided test apps.
The DidFail tool has limitations that we hope to address in future research. These limitations include false positives that are caused by a coarse-grained approach to detecting information flows between apps. A finer-grained analysis could reduce the incidence of false positives. Also, the DidFail tool focuses exclusively on Android intents as the method of data communication across applications. Android apps have other means of communicating including
directly querying Content Providers
reading from and writing to an SD card
using native code and communication channels (e.g., sockets or the Binder) implemented by the underlying Android Linux operating system
A Tool to Address Activity Hijacking
The second Android app analysis tool, described below, was developed to be part of the CERT Division’s Source Code Analysis Laboratory (SCALe) suite of tools for testing code for compliance with CERT secure coding rules. This tool was designed specifically to grow our Mobile SCALe tool set that checks against our new Android-focused secure coding rules and guidelines. This new tool is now part of the CERT Division’s compliance checker tool set used for our SCALe code conformance analyses. This tool has been developed for a limited audience and is currently not available for public distribution.
Activity hijacking attacks occur when a malicious app receives a message (an intent) that was intended for another app, but not explicitly designated for it. In the Android middleware, intents are the primary means of inter-app communication and may include a designation of the recipient, an action string, and other data.
If no recipient is designated in an activity intent, then Android tries to find a suitable recipient (e.g., an app that declares in an intent filter in its manifest file that it can handle the specified action string). If there are multiple suitable apps, then Android prompts the user to select which one to use. The user can also designate the chosen recipient as the default to handle all similar intents (e.g., intents with the same action string) in the future and thwart hijacking attempts.
However, a malicious app can trick the user by using a confusing name. In addition, an inattentive user might not give much thought to the choice. Moreover, the device’s touch screen might register a click for the malicious app that the user did not intend. Android does not require confirmation of the user’s selection (which would be helpful in mitigating accidental clicks), even though such an accidental click can irreversibly leak sensitive information. An implicit intent is an intent that does not specifically designate a recipient component by its fully qualified class name, as opposed to an explicit intent, which does. Only implicit intents are vulnerable to activity hijacking.
In addition to inter-app communication, intents are also used for intra-app communication between different components of a single app. The use of implicit intents for intra-app communication has proved to be a common mistake in the development of Android apps, as well as a violation of our secure coding rules. A component might intend to communicate with another component in the same application, but if that component uses an implicit intent (instead of an explicit intent), it might be vulnerable to another app intercepting its message.
Unfortunately, it is easy for a developer to mistakenly make app interfaces public when they should be private, allowing malicious apps to hijack or eavesdrop on apps that have access to sensitive information or resources. Moreover, closely related apps may have been developed to send intents to each other without explicitly designating the recipient, leaving open an avenue for activity hijacking.
In the technical report that the SEI published describing our work on this tool, Mobile SCALe: Rules and Analysis for Secure Java and Android Coding, we detail the design and implementation of our tool, which was constructed using the Soot Java analysis framework. Our tool identifies the method calls that send Android intents. Where possible, our tool identifies the action string associated with the intent and the target of the intent in the case of an explicit intent.
Our activity hijacking vulnerability detection tool analyzes each app individually to
find likely violations of secure coding rules
produce a list of the different types of intents the app registers to receive
produce a list of program sites (source code or bytecode locations) that send intents, along with the action string and target class if known
Looking Ahead
Our goal with this series of blog posts is to share the progress on our Mobile SCALe project work, thereby extending the existing CERT SCALe conformance process to create a source code analysis laboratory environment for mobile computing platforms. This blog post is intended to update you on our work on the Android operating system, the first area of focus for Mobile SCALe.
Our team is currently at work on improving our tool that looks for information flows where the data source is sensitive and the sink is restricted. The research challenge we’re focusing on is to develop an analysis to determine taint flow endpoints with the following (sometimes conflicting) goals in mind: precision, soundness, and the ability to operate within resource limitations (i.e., time and memory).
We welcome feedback on our work. Please leave comments below.
Additional Resources
To download the DidFail tool, please visit https://www.cert.org/secure-coding/tools/didfail.cfm.
Read the SEI technical report, Mobile SCALe: Rules and Analysis for Secure Java and Android Coding, for more information about the work described in this blog post.
Jul 27, 2015 02:10pm
By Robert C. Seacord
Secure Coding Technical Manager
CERT Division
Software developers produce more than 100 billion lines of code for commercial systems each year. Even with automated testing tools, errors still occur at a rate of one error for every 10,000 lines of code. While many coding standards address code style issues (i.e., style guides), CERT secure coding standards focus on identifying unsafe, unreliable, and insecure coding practices, such as those that resulted in the Heartbleed vulnerability. For more than 10 years, the CERT Secure Coding Initiative at the Carnegie Mellon University Software Engineering Institute has been working to develop guidance (most recently, The CERT C Secure Coding Standard: Second Edition) for developers and programmers through the development of coding standards by security researchers, language experts, and software developers using a wiki-based community process. This blog post explores the importance of a well-documented and enforceable coding standard in helping programmers circumvent pitfalls and avoid vulnerabilities.
Community-Based Development of the CERT C Coding Standard
The idea for the CERT C Coding Standard as a community-based development project arose at the Spring 2006 meeting of the C Standards Committee in Berlin, Germany. Experts from the community, including members of the C Standards Committee, were invited to contribute and were provided with editing privileges on the wiki.
The wiki-based community development process has many advantages; most importantly, this form of collaborative development engages a broad group of experts to form a consensus opinion on the content of the rules. Members of the community can sign up for a free account on the wiki and comment on the coding standards and the individual rules. Reviewers who provide high-quality comments frequently receive extended editing privileges so that they can directly contribute to the development and evolution of the coding standard. Today, the CERT Coding Standards wiki has more than 1,500 registered contributors, and coding standards have been completed for C and Java, with additional coding standards for C++, Perl, and other languages under development. These guidelines and standards, if implemented, could have prevented vulnerabilities, such as Heartbleed.
Heartbleed
Heartbleed emerged as a serious vulnerability in the popular OpenSSL cryptographic software library. This vulnerability allows an attacker to steal information that under normal conditions would be protected by Secure Socket Layer/Transport Layer Security (SSL/TLS) encryption.
Despite the seriousness of the vulnerability, Heartbleed is the result of a common programming error and an apparent lack of awareness of secure coding principles. Following is the vulnerable code:
int dtls1_process_heartbeat(SSL *s) {
    unsigned char *p = &s->s3->rrec.data[0], *pl;
    unsigned short hbtype;
    unsigned int payload;
    unsigned int padding = 16; /* Use minimum padding */

    /* Read type and payload length first */
    hbtype = *p++;
    n2s(p, payload);
    pl = p;

    /* ... More code ... */

    if (hbtype == TLS1_HB_REQUEST) {
        unsigned char *buffer, *bp;
        int r;

        /*
         * Allocate memory for the response; size is 1 byte
         * message type, plus 2 bytes payload length, plus
         * payload, plus padding.
         */
        buffer = OPENSSL_malloc(1 + 2 + payload + padding);
        bp = buffer;

        /* Enter response type, length, and copy payload */
        *bp++ = TLS1_HB_RESPONSE;
        s2n(payload, bp);
        memcpy(bp, pl, payload);

        /* ... More code ... */
    }
    /* ... More code ... */
}
This code processes a "heartbeat" packet from a client or server. As specified in the Transport Layer Security (TLS) and Datagram Transport Layer Security (DTLS) Heartbeat Extension, RFC 6520, when the program receives a heartbeat packet, it must echo the packet’s data back to the client. In addition to the data, the packet contains a length field that conventionally indicates the number of bytes in the packet data, but there is nothing to prevent a malicious packet from lying about its data length.
The p pointer, along with payload and pl, holds data taken from the packet. The code allocates a buffer sufficient to contain payload bytes, with some overhead, then copies payload bytes starting at pl into this buffer and sends it to the client. Notably absent from this code are any checks that the payload integer variable extracted from the heartbeat packet corresponds to the size of the packet data. Because the client can specify an arbitrary value of payload, an attacker can cause the server to read and return the contents of memory beyond the end of the packet data, which violates our recommendation, INT04-C, Enforce limits on integer values originating from tainted sources. The resulting call to memcpy() can then copy the contents of memory past the end of the packet data and the packet itself, potentially exposing sensitive data to the attacker. This call to memcpy() violates the secure coding rule ARR38-C, Guarantee that library functions do not form invalid pointers. A version of ARR38-C also appears in ISO/IEC TS 17961:2013, "Forming invalid pointers by library functions [libptr]." This rule would require a conforming analyzer to diagnose the Heartbleed vulnerability.
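The missing check is small. The following C sketch shows one way to validate the claimed payload length against the number of bytes actually received before copying. It is a simplified, self-contained illustration rather than OpenSSL code: the Record type, the HB_* constants, and the build_heartbeat_response function are hypothetical stand-ins, and the zero-filled padding is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define HB_REQUEST  1
#define HB_RESPONSE 2
#define HB_PADDING  16 /* minimum padding length */

/* Simplified stand-in for a received record: raw bytes and their count. */
typedef struct {
    const unsigned char *data;
    size_t length;
} Record;

/*
 * Build a heartbeat response. Returns a malloc'd buffer (caller frees)
 * and stores its size in *out_len, or returns NULL if the record is
 * malformed and must be silently discarded.
 */
unsigned char *build_heartbeat_response(const Record *rec, size_t *out_len)
{
    assert(rec != NULL && rec->data != NULL && out_len != NULL);
    *out_len = 0;

    /* The record must hold at least type (1) + length (2) + padding. */
    if (rec->length < 1 + 2 + HB_PADDING) {
        return NULL;
    }

    const unsigned char *p = rec->data;
    unsigned char hbtype = *p++;
    size_t payload = ((size_t)p[0] << 8) | p[1]; /* tainted length field */
    p += 2;

    /* Reject a claimed payload that exceeds the bytes actually received. */
    if (payload > rec->length - 1 - 2 - HB_PADDING) {
        return NULL;
    }
    if (hbtype != HB_REQUEST) {
        return NULL;
    }

    size_t resp_len = 1 + 2 + payload + HB_PADDING;
    unsigned char *buffer = malloc(resp_len);
    if (buffer == NULL) {
        return NULL;
    }

    unsigned char *bp = buffer;
    *bp++ = HB_RESPONSE;
    *bp++ = (unsigned char)(payload >> 8);
    *bp++ = (unsigned char)(payload & 0xff);
    memcpy(bp, p, payload);     /* now guaranteed to stay inside the record */
    bp += payload;
    memset(bp, 0, HB_PADDING);  /* simplification: zero-filled padding */

    *out_len = resp_len;
    return buffer;
}
```

The comparison is written as a subtraction on the right-hand side, after the minimum-length check, so that the arithmetic cannot wrap around even when the length field holds its maximum value.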
The Latest CERT C Coding Standard
Within two years of launching the wiki, the community had developed 89 rules and 132 recommendations for secure coding in C. At that point, a snapshot of the CERT C Coding Standard was created, and published in October 2008 as The CERT C Secure Coding Standard. CERT’s Coding Standards continue to be widely adopted by industry. Cisco Systems, Inc., announced its adoption of the CERT C Secure Coding Standard as a baseline programming standard in its product development in October 2011 at Cisco’s annual SecCon conference. Recently, Oracle has integrated all of CERT’s secure coding standards into its existing Secure Coding Standards. This adoption is the most recent step of a long collaboration: CERT and Oracle previously worked together in authoring The CERT Oracle Secure Coding Standard for Java.
The CERT C Coding Standard continues to evolve. Existing guidelines are updated as new standards, such as Programming Languages—C, 3rd ed. (ISO/IEC 9899:2011), are introduced. Through large and small modifications—from changing a word in a rule title to writing new code examples—the guidelines continue to be improved by the ongoing activities of contributors. Obsolete guidelines are regularly culled from the wiki, and new rules and recommendations are added as technology and research warrant.
In 2013, a second snapshot of the CERT C Coding Standard was prepared for publication. The wiki had grown in the intervening five years: it now had 98 rules and 178 recommendations. We elected to publish only the rules, not the recommendations, in the second edition of The CERT C Coding Standard, published in April 2014. The rules laid forth in the new edition will help ensure that programmers’ code fully complies with the new C11 standard; they also address earlier versions of the C Standard, including C99.
The CERT C Coding Standard itemizes coding errors that are the root causes of current software vulnerabilities in C, prioritizing them by severity, likelihood of exploitation, and remediation costs. Each rule includes examples of insecure code, as well as secure, C11-conforming, alternative implementations. If uniformly applied, these guidelines eliminate critical coding errors that lead to buffer overflows, format-string vulnerabilities, integer overflow, and other common vulnerabilities when programming in C.
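As an illustration of how such a prioritization can be computed, the C sketch below assumes the scheme used in the standard's risk assessments: each of the three factors is rated from 1 to 3, the priority is their product, and the product is bucketed into three levels. The function and type names are hypothetical.

```c
#include <assert.h>

/* Priority levels, with L1 the most urgent. */
typedef enum { PRIORITY_L1 = 1, PRIORITY_L2 = 2, PRIORITY_L3 = 3 } PriorityLevel;

/*
 * Each factor is rated 1..3:
 *   severity:         1 = low,      2 = medium,   3 = high
 *   likelihood:       1 = unlikely, 2 = probable, 3 = likely
 *   remediation_cost: 1 = high,     2 = medium,   3 = low (cheap fixes score higher)
 * The priority is the product, so it ranges from 1 to 27.
 */
int rule_priority(int severity, int likelihood, int remediation_cost)
{
    assert(severity >= 1 && severity <= 3);
    assert(likelihood >= 1 && likelihood <= 3);
    assert(remediation_cost >= 1 && remediation_cost <= 3);
    return severity * likelihood * remediation_cost;
}

/* Bucket a priority value into a level: 12-27 is L1, 6-9 is L2, 1-4 is L3. */
PriorityLevel priority_level(int priority)
{
    assert(priority >= 1 && priority <= 27);
    if (priority >= 12) return PRIORITY_L1;
    if (priority >= 6)  return PRIORITY_L2;
    return PRIORITY_L3;
}
```

A team with limited time can then address violations of L1 rules first and work down through the levels.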
The CERT C Coding Standard, second edition, covers all aspects of the new C Standard, including best solutions, compliant solutions, and pertinent language and library extensions. It also offers advice on issues ranging from tools and testing to risk assessment. Of the 98 rules, 42 are new—since the first edition, 30 rules have been deprecated, 30 more have been added, and a new section, Concurrency (containing 12 rules), has also been added.
A Tool for Developers
The Source Code Analysis Laboratory (SCALe) provides a means for developers to evaluate the conformance of their code to CERT’s coding standards. CERT coding standards provide a normative set of rules against which software systems can be evaluated. Conforming software systems should demonstrate improvements in their safety, reliability, and security over nonconforming systems.
SCALe analyzes a developer’s source code and provides a detailed report of findings to guide the code’s repair. After the developer has addressed these findings and the SCALe team determines that the improved source code conforms to the standard, CERT issues the developer a certificate and lists the system in a registry of conforming systems.
Conformance to CERT coding standards requires that the source code not contain any rule violations. Occasionally, a developer may claim that code that appears to violate a rule actually is secure because of an exceptional condition. For example, library code with insufficient thread protection could still be secure when run only in single-threaded programs. If an exceptional condition is claimed, the exception must correspond to a predefined exceptional condition, and the application of this exception must be documented in the source code. Conformance with the recommendations is not necessary but, in many cases, will make it easier to conform to the rules and eliminate many potential sources of defects.
SCALe has also been used by the Department of Defense (DoD), which increasingly depends on networked software systems. One result of this dependency is an increase in attacks on both military and non-military systems, as attackers look to exploit these software vulnerabilities. Our technical report on this work, Supporting the Use of CERT Secure Coding Standards in DoD Acquisitions, provides guidance to help DoD acquisition programs address software security in acquisitions. It provides background on the development of secure coding standards, sample request for proposal (RFP) language, and a mapping of the Application Security and Development STIG to the CERT C Secure Coding Standard.
Since its inception, more than 20 SCALe analyses have been performed on a variety of systems from both government and industry for a variety of languages, including C, C++, Java, and Perl.
Addressing Challenges and Future Work
Coding standards are an integral part of the software development lifecycle and increasingly a requirement. The National Defense Authorization Act for FY13, Section 933, "Improvements in Assurance of Computer Software Procured by the Department of Defense," requires evidence that government software development and maintenance organizations, including contractors, conform to DoD-approved secure coding standards during software development, upgrade, and maintenance activities, including through the use of inspection and appraisals. The Application Security and Development Security Technical Implementation Guide (STIG), Section 2.1.5, "Coding Standards," requires that program managers "ensure the development team follows a set of coding standards."
A number of challenges make compliance with these requirements difficult:
Secure coding standards must be developed for ubiquitous languages with no existing standards and, where possible, published by an international standards body to allow easy adoption by the DoD;
The number of actual rule violations discovered in conformance testing is excessive and must be reduced to levels that can be reasonably addressed by the development team;
It must be demonstrated that the adoption of secure coding standards will not degrade system performance and result in slow, bloated code.
To address these challenges in the coming year our work will focus on the following areas:
C++ Coding Standard. To address the lack of secure coding standards, we plan to complete the CERT C++ Secure Coding Standard. C++ is used extensively throughout the DoD including for major weapons systems such as the Joint Strike Fighter. Existing C++ coding standards fail to address security, subset the language, or are outdated and unprofessional.
Reduce Rule Violations by Enforcing Secure Coding Rules in Integrated Development Environments. To address the problem of excessive rule violations, we plan to collaborate with Clang developers and Japan Computer Emergency Response Team Coordination Center (JPCERT) to develop additional analyses for Clang’s static analyzer to check for violations of a prioritized list of secure coding rules. Clang is an open-source compiler that has been integrated into Apple’s Xcode integrated development environment (IDE), which is the primary tool for developing software for iOS and OS X. Catching rule violations early will prevent these errors from propagating throughout the codebase and will allow developers to learn secure coding techniques while programming. New checkers will be submitted into the main trunk of Clang and integrated into Xcode (as well as any other IDEs that support Clang integration), improving software security for all developers who use Clang.
We will also collaborate with Dr. Bill Pugh, professor emeritus in the University of Maryland’s Computer Science Department and FindBugs creator, to develop analysis against unchecked guidelines in The CERT Oracle Secure Coding Standard for Java and to integrate this analysis into Eclipse so that analysis results are immediately available to Android platform developers.
Demonstrate the Costs of Producing Secure Code. We are planning a research project with Igalia to evaluate the costs of producing a CERT-conforming implementation of the Chromium browser project. Igalia is a contributor to Chromium and a member of the World Wide Web Consortium (W3C). Chromium has several properties that make it a compelling demonstration: Chromium is an open-source project, released under a BSD-style license, and is the foundation for the Google Chrome browser. The performance of Chromium, which has a bug bounty program, is intensely scrutinized; developers are unlikely to accept patches that fix theoretical vulnerabilities but adversely affect performance. Chromium is used by hundreds of millions of users, and a successful case study will be widely publicized and replicated. This research will evaluate the effort required to discover and mitigate secure coding violations in the Chromium codebase. We will evaluate the performance, size, and resource consumption of the code before and after remediation and note common anti-patterns and mitigations.
In the long term, we hope to continue to develop and refine coding rules for existing secure coding standards, additional coverage for other languages and platforms, and additional analysis capabilities.
We welcome your feedback on our latest and future secure coding work. Please leave feedback in the comments section below.
Additional Resources
For more information about The CERT C Coding Standard, Second Edition: 98 Rules for Developing Safe, Reliable, and Secure Systems, please visit
http://www.informit.com/store/cert-c-coding-standard-second-edition-98-rules-for-9780133805383.
To sign up for a free account on the CERT Secure Coding wiki, please visit
http://www.securecoding.cert.org.
To subscribe to our Secure Coding eNewsletter, please click here.
SEI Blog, Jul 27, 2015 02:09pm
|
By Will Dormann, Vulnerability Analyst, CERT Division
The Heartbleed bug, a serious vulnerability in the OpenSSL cryptographic software library, enables attackers to steal information that, under normal conditions, is protected by the Secure Sockets Layer/Transport Layer Security (SSL/TLS) encryption used to secure the internet. Heartbleed left many questions in its wake:
Would the vulnerability have been detected by static analysis tools?
If the vulnerability has been in the wild for two years, why did it take so long to become public knowledge?
Who is ultimately responsible for open-source code reviews and testing?
Is there anything we can do to work around Heartbleed to provide security for banking and email web browser applications?
In late April 2014, researchers from the Carnegie Mellon University Software Engineering Institute and Codenomicon, one of the cybersecurity organizations that discovered the Heartbleed vulnerability, participated in a panel to discuss Heartbleed and strategies for preventing future vulnerabilities. During the panel discussion, we did not have enough time to address all of the questions from our audience, so we transcribed the questions and panel members wrote responses. This blog posting presents questions asked by audience members during the Heartbleed webinar and the answers developed by our researchers. (If you would like to view the entire webinar, click here.)
I have been a software vulnerability analyst with the CERT Coordination Center (CERT/CC) since 2004 with a focus on web browser technologies, ActiveX, and fuzzing. In addition to myself, answers to audience questions from the Heartbleed panel discussion are provided by:
Brent Kennedy is a member of the CERT Cyber Security Assurance team focusing on penetration testing operations and research. Kennedy leads an effort that partners with the Department of Homeland Security’s National Cybersecurity Assessments and Technical Services (NCATS) team to develop and execute a program that offers risk and vulnerability assessments to federal, state, and local entities.
Jason McCormick has been with SEI Information Technology Services since 2004 and is currently the manager of network and infrastructure engineering. He oversees datacenter, network, storage, and virtualization services and plays a key role in information security policy, practices, and technologies for the SEI.
William Nichols joined the SEI in 2006 as a senior member of the technical staff and serves as a Personal Software Process (PSP) instructor and Team Software Process (TSP) mentor coach in the Software Solutions Division at the SEI.
Robert Seacord is a senior vulnerability analyst in the CERT Division at the SEI, where he leads the Secure Coding Initiative. Seacord is the author of The CERT C Secure Coding Standard (Addison-Wesley, 2014) and Secure Coding in C and C++ (Addison-Wesley, 2002) as well as co-author of two other books. He is also an adjunct professor at Carnegie Mellon University.
Attendee: Did anyone in the information security industry have suspicions about the security of OpenSSL before the Heartbleed story broke in the media?
Will Dormann: Both Google and Codenomicon had investigated OpenSSL and discovered the Heartbleed vulnerability before its public release. Whether OpenSSL was specifically targeted is unclear. It is likely that a number of SSL/TLS libraries were tested, and it just happened that OpenSSL behaved unexpectedly because of the vulnerability.
Attendee: It seems as though the attacks did not start, as far as we know, until after the vulnerability was publicly announced. Has there been any effort to create avenues to distribute patches to vulnerabilities like this prior to publicly announcing the vulnerability?
Will Dormann: We do not know when the attacks started. Before Heartbleed was publicly disclosed, we did not even know what to look for in an attack. Therefore, it is possible that the vulnerability was being attacked as early as two years ago, when the vulnerability was first introduced. The CERT Coordination Center (CERT/CC) offers support in coordinating vulnerabilities among affected vendors before public release. This minimizes the rushed efforts required by software vendors to produce updates after public disclosure. In this case, the CERT/CC was not involved in the pre-disclosure coordination of the OpenSSL vulnerability.
Attendee: I’m curious; they said this vulnerability has been in the wild for two years. Why did it take so long to bring this to the public knowledge now? It looks fishy.
Will Dormann: With any vulnerability that is discovered, there is some delay between its introduction and its discovery. It is quite common for vulnerabilities to go unnoticed for years.
Attendee: Is the code part of the browser or part of the server-side platform? If the browser, is there anything we can do to work around Heartbleed to provide security for banking and email web browser applications?
Brent Kennedy: Heartbleed mostly affects server-side applications, but there are some client applications (and most likely more to come) that reportedly are affected. For vulnerable servers, the vulnerability exists on the server itself, not the client accessing it. No major browsers implement OpenSSL, so determining if you are safe when accessing email or online banking is dependent on your provider. While most banks reported no issues, it is worthwhile to check their status.
Attendee: What is involved in "fully" patching this vulnerability at each impacted company? This seems to be a gray area at present.
Jason McCormick: There is a black-and-white answer for the concept of being fully patched from the perspective of eliminating the acute vulnerability. That is simply to upgrade to a version of OpenSSL that is not vulnerable to Heartbleed, or to upgrade your software system that is using OpenSSL as a component to a version that addresses the Heartbleed vulnerability. That OpenSSL upgrade "fully patches" the issue.
The big gray area is what to do next. There is no one-size-fits-all solution here; unfortunately, organizations will have to make decisions based on their own risk tolerances and costs.
Every organization should immediately re-issue certificates on Internet-facing systems that were vulnerable to Heartbleed. The potential compromise of private key material that could be used for decryption of captured data or the impersonation of sites makes this an important consideration.
Organizations should consider their risk/cost trade-offs for their insider risk as well. For example, a university-style situation, where you have a large, heterogeneous user base with a culture of limited controls for open access, will have a much different risk/cost analysis for changing internal certificates than a corporation with a well-controlled, well-known user population with strict internal controls.
Finally, you have to conduct a risk assessment of the other contents of the server or service that was affected by Heartbleed:
Are passwords in play for a web application?
What is the risk to compromised account information?
What other information was the server/service working on such that fragments of it may have been in memory?
Only by a thoughtful analysis can each organization determine what, if any, their next steps should be. That calculation will be different for every organization.
Attendee: For our personal home computers, how do we get the updates to eliminate the Heartbleed vulnerability?
Jason McCormick: Unless you are a home hobbyist and are affected by Heartbleed by running a server, there is nothing that home users need to do with their computers other than always keeping their software updated.
The most important step that individual users can take is changing their passwords on the services they consume such as webmail, social media, banks, etc. Most large companies have announced publicly (to some degree) whether they were affected and issued recommendations for their services.
It is always a good idea to perform a regular change of your passwords regardless, so now might be a good time to do it. Additionally, even if a service you use was not affected by Heartbleed, if you used the same password for multiple sites and services, one of them may have been compromised, which means they are all compromised.
Yes, there is value in knowing whether or not an organization has re-issued their certificates, but that is too complicated for the average home user to understand. Beyond that, there is no reasonable way to know whether an organization has re-issued their certificate. By this time, we hope most organizations have done the right thing for their users.
Finally, many larger online services are offering two-factor authentication (TFA) that requires a PIN-like code in addition to a password. This is accomplished using an authenticator client on a computer or smartphone or an SMS text-message based system. As part of your login process, you would enter your username and password and then, on the following screen, the numeric code displayed by the authenticator app or received in a text message. Wherever this service is offered, users should take advantage of it. The use of TFA for logins greatly mitigates the risk of compromised and weak passwords.
Attendee: Shouldn’t organizations also check their applications that act as Secure Socket Layer/Transport Layer Security clients — whether those are desktop or Web applications, developed in-house or externally — if they use outdated versions of OpenSSL? This vulnerability can also be exploited against clients, not just servers. Couldn’t the memory of those client applications also contain sensitive information that could be stolen if they connect to a malicious or compromised server?
Brent Kennedy: The short answer is "yes," although the server-side vulnerability carries a greater risk. Exploiting Heartbleed via a client-side application would be a multi-step attack. The attacker would have to stand up a malicious SSL/TLS server and trick the user into visiting that server using a vulnerable client. In the event that it does happen, memory would be dumped from the user’s host machine. This could contain anything that is actively being processed on the computer, not just data related to the specific client application.
Jason McCormick: It is always good practice to keep all systems updated, including clients. While it is theoretically possible to attack a pure client using Heartbleed, most attacks are not practical en masse, both because browsers do not use OpenSSL (Internet Explorer uses the Microsoft Crypto library; Firefox and Chrome use Network Security Services (NSS)) and because attacking the clients using Heartbleed would require an initial compromise such as phishing a person to connect to a malicious site.
Attendee: You mentioned that website owners might want to get new SSL certificates and revoke the old ones. But how should they mitigate against the fact that browsers and most TLS clients have broken certificate revocation checking where they soft fail when they don’t get an online certificate status protocol (OCSP) response—they accept the connection. This means a man-in-the-middle attacker who obtains a certificate through Heartbleed can impersonate the site to users possibly indefinitely even if the old certificate is revoked by also blocking OCSP responses to those users.
Will Dormann: It is true that a certificate revocation may not be honored by a client application. However, that is no reason to skip the revocation in the first place. For more details, see http://news.netcraft.com/archives/2014/04/24/certificate-revocation-why-browsers-remain-affected-by-heartbleed.html.
Jason McCormick: The short answer here is you can’t, at least not practically, without both increasing the industry-wide robustness of OCSP services and implementing different default behaviors in the browsers. OCSP soft-fail is a deliberate behavior choice by browser makers because hard fail would cause a serious disruption to many user experiences (which is a debate for another time).
OCSP stapling (or TLS Certificate Status Request Extension) is a great step forward here, but unfortunately it is not widely implemented yet. Thought leaders in the IT world need to be pushing concepts and technologies like OCSP stapling forward at every opportunity.
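The soft-fail behavior discussed above can be reduced to a small decision function. The following is only a sketch: the type and function names are hypothetical and are not taken from any browser or TLS library. It shows why an attacker who can block the revocation lookup gets through under a soft-fail policy but not under a hard-fail policy.

```c
#include <assert.h>
#include <stdbool.h>

/* Outcome of a revocation lookup as seen by the client (hypothetical names). */
typedef enum {
    REVOCATION_GOOD,        /* responder says the certificate is valid */
    REVOCATION_REVOKED,     /* responder says the certificate is revoked */
    REVOCATION_NO_RESPONSE  /* lookup timed out or was blocked */
} revocation_status;

/* What the client does when it gets no answer (hypothetical names). */
typedef enum { FAIL_SOFT, FAIL_HARD } failure_policy;

/* Returns true if the client goes ahead with the connection. */
bool should_connect(revocation_status status, failure_policy policy)
{
    switch (status) {
    case REVOCATION_GOOD:
        return true;
    case REVOCATION_REVOKED:
        return false;
    case REVOCATION_NO_RESPONSE:
        /* Soft fail treats silence as success; hard fail refuses. */
        return policy == FAIL_SOFT;
    }
    return false; /* unknown status: refuse */
}

/* Usage check: blocking the lookup only helps an attacker under soft fail. */
void check_revocation_policy(void)
{
    assert(should_connect(REVOCATION_NO_RESPONSE, FAIL_SOFT));
    assert(!should_connect(REVOCATION_NO_RESPONSE, FAIL_HARD));
    assert(!should_connect(REVOCATION_REVOKED, FAIL_SOFT));
}
```

A revoked certificate is rejected under either policy; the gap is entirely in the no-response case, which is the case a man-in-the-middle attacker can force.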
Attendee: Obviously, Heartbleed exposes a number of flaws in our security infrastructure (e.g., OpenSSL is being maintained by a very small number of people). I’d like to hear some about how the panelists view the resiliency of certificate authentication when stressed by something like Heartbleed.
Jason McCormick: I can’t say that I agree with the opening statement of this question, that Heartbleed exposes any fundamental flaw in the architecture of our varying security infrastructures. While the Heartbleed vulnerability is serious and pervasive, it is fundamentally a coding mistake. Heartbleed is not revealing a protocol weakness such as BEAST, CRIME, or the renegotiation attacks.
As for OpenSSL being maintained by a small number of people, that is very true, but it sounds as if plans are in the works to change that through The Linux Foundation. This is great news and hopefully will lead to faster and better evolution of the OpenSSL system. The interesting issue that persists though is how you find quality cryptographers who are also good programmers and who can do work on OpenSSL. This is a hard combination to come by and I hope the financing that is expected to flow to OpenSSL can overcome some of these challenges. This is a very unfortunate bug, and I am sure more bugs will be found in this software just as all other software has bugs. I do not think, however, that OpenSSL is fundamentally broken such that it should be abandoned or considered fundamentally flawed.
Additionally, Heartbleed has nothing to do with certificate authentication. It is an important point that the Heartbleed vulnerability itself, while damaging, is limited to a particular function of the TLS protocol used for "keepalive" checks and for path MTU discovery for DTLS connections. X.509 certificate authentication and authorization is an entirely different function within the TLS protocol specification. It is used during the handshake phase of the TLS session establishment to check identity and establish the encrypted transport session. Certificate authorities and the related constellation of technologies and protocols, some of which do have some interesting challenges, are entirely separate from TLS and OpenSSL.
Attendee: A mature process through the entire software development cycle is essential to reducing this type of vulnerability. Can a mature process that will catch defects like Heartbleed dovetail with an Agile software development approach?
Bill Nichols: The short answer is "yes." There is no wide agreement on the specific practices in Agile, but it is generally agreed that Agile involves delivering value to the user. Vulnerabilities deliver negative value and therefore are anti-Agile by definition.
Attendee: In my 25+ years of consulting in programming circles, MANY large and well-known corporations are merely "maintaining" code (adding Band-Aids, enhancement) and not building from scratch. That aside, developers should not be the main source of evaluating the resulting code — for adherence of standards, injection of vulnerabilities, etc.
Bill Nichols: If developers are maintaining or enhancing code, they should take responsibility for any changes they introduce. On many old code bases, this is hard work indeed. Excessive change is dangerous because changes often introduce new problems. The developers should not be the last line of defense, but they should be among the first and take personal responsibility not to allow vulnerabilities to escape. The tools checking code after development are absolutely essential, but very imperfect. The only way to get clean code out is to put clean code in. Moreover, someone must review each and every find from those tools.
Attendee: Given this issue has existed in practice for many years, what would the panelists suggest are lessons for the various positions in the ecosystem that all missed this: (1) open-source code reviewers, (2) component integrators, (3) testers, and (4) auditors?
Bill Nichols: Reviewing code is hard. The following practices have been known to work:
Review only 200 to 300 lines in a single sitting.
Use a checklist of items.
Review the entire section of code for a single item from your checklist before moving on.
Write your own checklist for review so that you recognize the problem in the code immediately when you see it.
You must inspect, not read the code. If you take less than an hour for 200 lines of code, you have probably gone too fast.
Many separate studies have found that review rates of more than 200 lines per hour are not very effective. For many code bases, it is likely that the inspector will, on average, find less than one issue per hour. This low rate contributes to making inspection difficult to perform because it feels slow and unrewarding. Nonetheless, that find rate is many times faster than integration or system test.
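To make the review-rate guidance concrete, here is a rough planning sketch. It assumes the figures quoted above (no more than 200 lines per hour and no more than 300 lines per sitting); the function names are hypothetical, and the numbers are a lower bound on effort, not a schedule.

```c
#include <assert.h>

#define MAX_LINES_PER_HOUR    200UL  /* review-rate ceiling quoted above */
#define MAX_LINES_PER_SITTING 300UL  /* upper end of the 200 to 300 line sitting */

/* Minimum review time in minutes, rounded up.
 * Assumes loc is small enough that loc * 60 does not overflow. */
unsigned long min_review_minutes(unsigned long loc)
{
    return (loc * 60UL + MAX_LINES_PER_HOUR - 1UL) / MAX_LINES_PER_HOUR;
}

/* Minimum number of separate sittings, rounded up. */
unsigned long min_sittings(unsigned long loc)
{
    return (loc + MAX_LINES_PER_SITTING - 1UL) / MAX_LINES_PER_SITTING;
}

/* Usage check: a 1,000-line module needs at least 5 hours over 4 sittings. */
void check_review_estimates(void)
{
    assert(min_review_minutes(200UL) == 60UL);
    assert(min_review_minutes(1000UL) == 300UL);
    assert(min_sittings(1000UL) == 4UL);
}
```

Numbers like these help explain why inspection feels slow: even a modest module takes most of a working day to inspect at an effective rate.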
For integrators and auditors, I recommend running compilers with all checks turned on. Follow up with a low-cost static check tool such as SonarQube, then run two or more proprietary static analysis tools, for example, Coverity and CAST among others. Our experience has been that using several analysis tools is better than relying on a single product because analysis tools tend to have limited overlap in their results. Resolve each and every issue. There will be a large portion of false positives. If you cannot afford to resolve all the issues, you should recognize that 10 to 15 percent are likely to be real defects. Oddly enough, some people have found that the lower the ratio of total finds (positive and negative), the lower the true-positive to false-positive ratio.
You should definitely consider dynamic checking tools such as fuzzers. You should re-inspect any module in which you discover a defect in test.
Robert Seacord: There are many lessons here. One is that developers, reviewers, and auditors should make sure they have an up-to-date and comprehensive knowledge of secure coding. The SEI provides secure coding training and numerous books have been written on the subject.
Attendee: Robert, is it realistic to burden software developers with an ever-increasing set of things they need to worry about rather than building the solutions to these problems into their languages and tools?
Robert Seacord: I am not sure if it is realistic, but because developers are the last line of defense between the languages and tools they are using and deploying vulnerable products, they need to shoulder the burden. We are heavily involved in language standards committee work to try to help improve the inherent security of these languages. You can look at David Keaton’s blog post to see some of the specific improvements we have made in C11.
Bill Nichols: Mistake-proofing the development environment is desirable, but Robert described some barriers. Moreover, a seemingly safer environment can lead to compensating behavior (i.e., the Peltzman effect) that can undermine the improvements. Regardless of improvements to the environment, developers must hold themselves to a high standard. Our experience suggests that a large portion of the vulnerabilities result from mistakes programmers make routinely, but can be found and removed with some discipline. Developers need to learn and apply the meta-tools required to do good work. They need to understand and use sound design principles, sound coding practices, and effective review, inspection, and test. If these techniques are applied diligently, the exposure can be reduced by a factor of between 5 and 20 without adding overall cost to development.
Attendee: Shouldn’t SSL_malloc()’s return be checked to be a valid pointer prior to using it as destination for memcpy? What if SSL_malloc() returned a NULL on memory exhaustion?
Robert Seacord: Yes. This is almost certainly a violation of ERR33-C, "Detect and handle standard library errors."
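As a minimal sketch of the check being asked about, the following uses the standard malloc() rather than any OpenSSL allocation wrapper. The helper name copy_buffer is hypothetical and this is not OpenSSL code; it only illustrates testing the allocation result before the pointer is handed to memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not OpenSSL code. Returns a heap copy of
 * src[0..len), or NULL if the arguments are unusable or the allocation
 * fails. The caller owns the result and must free() it. */
unsigned char *copy_buffer(const unsigned char *src, size_t len)
{
    if (src == NULL || len == 0) {
        return NULL;
    }
    unsigned char *dst = malloc(len);
    if (dst == NULL) {
        return NULL; /* allocation failed: never reach memcpy() */
    }
    memcpy(dst, src, len);
    return dst;
}

/* Usage check: the caller also tests the result before using it. */
void check_copy_buffer(void)
{
    const unsigned char data[] = { 1, 2, 3, 4 };
    unsigned char *copy = copy_buffer(data, sizeof data);
    if (copy != NULL) {
        assert(memcmp(copy, data, sizeof data) == 0);
        free(copy);
    }
    assert(copy_buffer(NULL, 4) == NULL);
}
```

The point of the pattern is that the failure is reported to the caller instead of being passed on as a null destination pointer.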
Attendee: Is C++ a suitable language for writing security-critical code or are there more secure languages available that would avoid problems such as this?
Robert Seacord: There are languages that are less susceptible than C and C++ to reading or writing memory outside of the bounds of an object, as occurred in Heartbleed. Whether or not these languages are appropriate for a particular system depends on a number of factors including the type of application, existing code, and the knowledge and skills of the developers. Networking applications are frequently written in C because of the need to optimize performance and because of the bit level manipulations frequently required.
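For readers who want to see what an out-of-bounds read of this kind looks like in C, here is a simplified sketch. The message layout (a 1-byte type, a 2-byte big-endian length, then the payload) and the function name are assumptions made for illustration; this is neither the real TLS heartbeat format nor OpenSSL code. The essential step is comparing the length claimed by the sender against the number of bytes actually received before copying.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN 3u /* simplified header: 1-byte type + 2-byte length */

/* Copies the payload of a received message into out.
 * msg/msg_len describe the bytes actually received; out/out_cap describe
 * the destination. Returns the payload length, or -1 if the claimed
 * length is not backed by received data or does not fit in out. */
long echo_payload(const uint8_t *msg, size_t msg_len,
                  uint8_t *out, size_t out_cap)
{
    if (msg == NULL || out == NULL || msg_len < HDR_LEN) {
        return -1;
    }
    /* The length field comes from the sender, so it is untrusted. */
    size_t claimed = ((size_t)msg[1] << 8) | (size_t)msg[2];
    if (claimed > msg_len - HDR_LEN) {
        return -1; /* copying would read past the received data */
    }
    if (claimed > out_cap) {
        return -1; /* copying would write past the destination */
    }
    memcpy(out, msg + HDR_LEN, claimed);
    return (long)claimed;
}

/* Usage check: an honest message is echoed, a lying one is rejected. */
void check_echo_payload(void)
{
    const uint8_t honest[] = { 1, 0x00, 0x02, 'h', 'i' };
    const uint8_t lying[]  = { 1, 0xFF, 0xFF, 'h', 'i' };
    uint8_t out[16];
    assert(echo_payload(honest, sizeof honest, out, sizeof out) == 2);
    assert(echo_payload(lying, sizeof lying, out, sizeof out) == -1);
}
```

Without the first length comparison, the second message would cause the copy to read far beyond the five bytes that were received, which is the general shape of the defect described above.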
Attendee: Had the OpenSSL coders used something like SafeC implementation, do you think HeartBleed bug would not have occurred?
Robert Seacord: Probably not, but OpenSSL might not have been widely adopted if it had not been written in a language for which compilers and tools were widely available. Language is part of the issue, but there is no such thing as a secure language.
Attendee: Has anyone determined if the vulnerability would have been detected by static analysis tools?
Robert Seacord: The vulnerability was not detected by static analysis, either because the analysis was not performed or because many analysis tools such as Coverity would not have detected the problem because of the number of levels between inputting the tainted value and using it. Many static analysis tools (including Coverity) can now detect this vulnerability, particularly if the code is annotated so that the tool is aware that certain macros/functions return tainted inputs.
Attendee: A majority of U.S., federal, civilian, DoD, and Intel websites have unknown anomalies that potentially are very similar to the "Open SSL" issue. The June deadline for FedRamp is quickly approaching that mandates certification prior to Authority to Operate. What actions should the U.S. Federal Government (including Intelligence) be taking to prevent similar unknown vulnerabilities and anomalies? What differentiated this particular proprietary fuzzing tool (rather than open-source that did not find the anomaly) that caused this tool to find Heartbleed? Should the U.S. CERT be advising the U.S. Federal Government to be redirecting resources to determine anomalies prior to "Authority to Operate" (ATO) using a proprietary tool to prevent hacking of their networks to ensure zero day vulnerabilities prior to ATO?
Robert Seacord: In general, there is no one tool that is likely to catch all possible problems. The best solution is to use a collection of dynamic and static analysis tools (and to develop the code securely to begin with). SEI/CERT’s approach is to use the Source Code Analysis Laboratory (SCALe) to provide conformance testing of source code against secure coding standards using a variety of static analysis tools. This is consistent with the requirements of Section 933 of the FY13 National Defense Authorization Act.
Attendee: Could you recommend some static program analysis tools?
Robert Seacord: As an FFRDC, we cannot endorse any tools. In general, the static analysis tools tend to have non-overlapping capabilities, so you may need to use more than one. For the Source Code Analysis Laboratory (SCALe), we use Coverity, FindBugs, Fortify, and Eclipse to analyze Java code and Coverity, Fortify, LDRA, Microsoft Visual C++, PC-lint, GCC, and Compass/ROSE for C language systems.
Attendee: What are the implications of the general use of fuzzing tools for the U.S. federal government? Should a minimum standard be established for what a fuzzing tool should accomplish to ensure no zero day vulnerabilities?
Will Dormann: CERT works with vendors to encourage them to use fuzzing tools. If they do not fuzz on their own, somebody else will, and they may discover vulnerabilities using that technique. It would be nice if there were some sort of standard for fuzzing robustness. However, enforcing a requirement can be difficult. How can one objectively measure the amount of fuzzing that an application or library has endured? There are so many variables at play that it is not as simple as saying that "Application X withstood Y number of fuzzing iterations without crashing."
The vision of ensuring no zero-day vulnerabilities is a worthy goal, but perhaps never achievable. Also consider the fact that some vulnerabilities are discovered without the use of fuzzing at all. The idea of fuzzing metrics has been batted around in the past (http://dankaminsky.com/2011/03/11/fuzzmark/); however, it appears that not much progress has been made in this area.
Attendee: There’s been a lot of chatter about open source being the problem. Do the panelists think a closed-source solution would’ve fared any better? There seem to be plenty of vulnerabilities regardless of open/closed status, so shouldn’t the lesson learned be that regardless of open/closed status, we need to do a better job of coding our software securely and continuing to test it even after it’s released?
Will Dormann: Open-source software is not inherently less secure or more secure than closed-source software. Vulnerabilities can be discovered without the availability of source code. The quality of the code being written is what affects the quality of the applications and libraries that we use.
Looking Ahead
As the answers above demonstrate, Heartbleed is fundamentally a coding mistake and one that could have been prevented. Through open exchanges like this, we hope to prevent future vulnerabilities. We welcome your feedback in the comments section.
|
By Sarah A. Sheard, Senior Engineer, Software Solutions Division
This post is the first in a series on this topic.
The Government Accountability Office (GAO) recently reported that acquisition program costs typically run 26 percent over budget, with development costs exceeding initial estimates by 40 percent. Moreover, many programs fail to deliver capabilities when promised, experiencing a 21-month delay on average. The report attributes the "optimistic assumptions about system requirements, technology, and design maturity [that] play a large part in these failures" to a lack of disciplined systems engineering analysis early in the program. What acquisition managers do not always realize is the importance of focusing on software engineering during the early systems engineering effort. Improving on this collaboration is difficult partly because both disciplines appear in a variety of roles and practices. This post, the first in a series, addresses the interaction between systems and software engineering by identifying the similarities and differences between the two disciplines and describing the benefits both could realize through a more collaborative approach.
Origins of Systems Engineering and Software Engineering
Systems engineering is an interdisciplinary field of engineering that focuses on how to design and manage complex engineering projects over their lifecycles. Systems engineering textbooks first appeared around 1960, long before the advent of software engineering. However, systems engineering only emerged as a discipline (complete with journals, documented practices, and academic departments) in the 1990s, well after software engineering had been an established field in computer science. As part of the maturation of systems engineering, two capability models were created that documented its practices; by the time they had been merged and then folded into the Capability Maturity Model Integration (CMMI) model, they had become mainstream. Standards were written in the 1990s that document systems engineering, and in the 2000s they generally have been harmonized with software standards.
The discipline of software engineering traces its roots to a 1968 NATO conference, where, as Mary Shaw later described it, "software engineering" was named as an aspiration, a field that was needed but not yet developed. Over the next several decades, software engineering both became better defined and grew in new directions. In 1996 Shaw categorized software engineering eras as "Programming-any-which-way" (1955-1965), "Programming-in-the-small" (1965-1975), "Programming-in-the-large" (1975-1985), and "Programming-in-the-world" (1985-1995). The field continues to evolve, with recent focus on topics such as mobile programming and big data.
Present Day
Software engineers today have adopted many of the principles and practices of systems engineering. For example, developers do not merely react to requirements; they elicit, prioritize, and negotiate changes to them. Developers do not just build code; they architect it, plan it, and establish rules to ensure data is appropriately shared and processed. Software project managers do not just distribute development tasks among individual coders and hope they integrate a seamless final product; they ensure that developers collaborate using plans, processes, and daily standup meetings. Software engineers do not just write programs; they design modular pieces, define integration principles, build for interoperability and modifiability, and perform integration testing.
Meanwhile the systems whose development programs employ systems engineers are increasingly dependent on and controlled by software. For example, the effort to write and maintain software on fighter aircraft now exceeds all other costs. Systems engineers are responsible for ensuring that every part of the system works with every other, and because software represents the intelligence of human-made systems, it is especially important to the integration of the system. Yet many chief systems engineers and program managers have had more experience in mechanical or electrical engineering than in software or software engineering.
Overlap and Uniqueness
Today, systems engineering and software engineering activities overlap to a great extent. The figure below shows activities common to both disciplines in the middle (purple), activities related more to systems engineering than software on the left (red), and activities related more to software than systems on the right (blue). The term "hardware" appears only on the right, because software engineers use it to mean anything that isn’t software, but systems engineers generally do not use it. Instead, they call physical subsystems by name (e.g., power, propulsion, structures) or describe disciplines or domains (e.g., mechanical engineering, survivability, software, orbit analysis, launch sequence).
It’s important to note in the figure above that systems engineers’ responsibilities tend to be broad but not deep, certainly not deep in software. Conversely, while software engineers’ responsibilities may be broad across the software, they are deep in areas necessary to create working code. While most of the comparisons in the figure are fairly intuitive, three points require some elaboration: customer interface, non-functional requirements, and what each group does not do.
Although customer interface is a systems engineering role, the software group must participate when discussing software requirements and design. For example, software engineers in Agile sprints often interface one-on-one with a customer representative to select the next development chunk, based on customer priorities.
Systems engineers are responsible for ensuring that non-functional requirements are met across the system (qualities they call "ilities," such as reliability, usability, and producibility). Software engineers focus on software "quality attributes" (such as reliability, usability, and maintainability). The two concepts are clearly similar, but there are some differences; for example, software's counterpart to hardware producibility, namely creating the second and later copies, is trivial, while security and maintainability are more important.
The contrast between systems and software engineers is most evident in what they do not do. The role of systems engineers is to balance concerns and perform tradeoff analyses at a high level and not get immersed in details in any area; they consistently delegate details to specialists. Systems engineers cannot say a task is not their job: if no one else is doing it and it needs to be done, the task defaults to systems engineers. In contrast, software engineers (and mechanical, electrical, and other engineers) focus their attention inside a domain, mastering details and keeping up with evolving practices and technologies.
Theory and Practice
It is important to note that what systems and software engineers should do is not always the same as what they actually do. In fact, in our experience at the SEI, the majority of complaints each group has about the other result from a failure to meet best practices, rather than from significant differences in best practices. If the waterfall model or a Vee lifecycle model is applied as a rigid sequential process, software engineers can conclude that systems engineering does not apply to software. If the program organizes software engineers only into integrated product teams other than the team responsible for systems engineering, integration, and test, then software engineers can get frustrated with their lack of leverage. But good systems engineers also find these situations frustrating. Similarly, both systems and software engineers get frustrated if "agile" is used as an excuse to address only nominal-case functional requirements and not system-level requirements like security and scalability.
Needed Knowledge
Most of today’s systems engineers were not software engineers or computer scientists first, nor do many of them have significant background or experience with software engineering techniques, tools, and methods. What software concepts they need to know to perform systems engineering on software-reliant systems effectively is an interesting question, which I will address in a future blog post. Few universities offer a joint degree in systems and software engineering (exceptions include Cornell University and Carnegie Mellon, which offer a joint certificate in systems and software engineering). Moreover, systems engineering programs ordinarily do not require a foundational understanding of software.
The Graduate Software Engineering (GSWE) 2009 guidelines provide recommendations for how to integrate systems engineering into software engineering education. The Systems Engineering Research Center/University Affiliated Research Center (SERC-UARC) published a Systems Engineering Body of Knowledge and Curriculum to Advance Systems Engineering (BKCASE) in 2012, which includes a graduate reference curriculum for a systems engineering master’s program, but its relationship to software is minimally discussed (in two pages).
The Road Ahead
The earlier part of this blog identified the similarities and differences between the disciplines of software engineering and systems engineering. The remainder of this blog outlines the benefits both could realize through a more collaborative approach. At this time, there are many disconnects between software and systems engineers that could be minimized through better collaboration. Software engineers could get involved in key systems architecting decisions so that the best software architecture could be easier to build. Systems engineers could reduce work that the software engineers find to be non-value added, and would receive help from software engineers in analyzing tradeoffs. In sum, a breadth+depth combination would identify innovative ideas to solve problems and exploit opportunities.
To ensure that future software-reliant systems (and systems of systems) can be built effectively, a new collaborative relationship between systems engineering and software engineering must become the norm. Moreover, there is a clear and compelling need to understand which roles must be filled by which types of professionals and which actions must be performed to ensure the two disciplines work together effectively. Improving the relationship will require overcoming the negative emotions that sometimes appear. (Both groups at a talk on systems and software that I presented at INCOSE Three Rivers Chapter in July 2013 agreed that the others "are arrogant and not as smart as they think they are.") By learning to work together effectively, systems engineers and software engineers can help increase the probability of success of new and planned ultra-large-scale software-reliant systems and systems-of-systems. In particular, they must understand each other’s strengths and objectives and work together to support them.
In future posts in this blog series, I will address the questions systems engineers must ask software engineers at the earliest program phases, and vice versa. I will examine the knowledge that systems engineers should have about software and propose how to provide them with that knowledge. I will also show that typical software engineering tasks map to systems engineering tasks and fulfill much of what customers are looking for when acquiring software-reliant systems.
What do you think systems engineering should know about software engineering? What should software engineers know about systems engineering? Please leave us feedback in the comments section below.
Additional Resources
To read the SEI technical report, The Business Case for Systems Engineering Study: Results of the Systems Engineering Effectiveness Survey, please visit http://resources.sei.cmu.edu/library/asset-view.cfm?assetid=34061.
|
By C. Aaron Cois, Software Engineering Team Lead, CERT Cyber Security Solutions Directorate
This blog post is the second in a series on DevOps.
To maintain a competitive edge, software organizations should be early adopters of innovation. To achieve this edge, organizations from Flickr and IBM to small tech startups are increasingly adopting an environment of deep collaboration between development and operations (DevOps) teams and technologies, which historically have been two disjointed groups responsible for information technology development. "The value of DevOps can be illustrated as an innovation and delivery lifecycle, with a continuous feedback loop to learn and respond to customer needs," Ashok Reddy writes in the technical white paper, DevOps: The IBM approach. Beyond innovation and delivery, DevOps provides a means for automating repetitive tasks within the software development lifecycle (SDLC), such as software builds, testing, and deployments, allowing them to occur more naturally and frequently throughout the SDLC. This blog post, the second in our series, presents a generalized model for automated DevOps and describes the significant potential advantages for a modern software development team.
I oversee a software engineering team that works within CERT and focuses on research and development of solutions to cybersecurity challenges. Our engineers design and implement software solutions that solve challenging problems for federal agencies, law enforcement, defense intelligence organizations, and industry by leveraging cutting-edge academic research and emerging technologies.
The environment and manner in which software engineering teams (not just my own) write code are constantly evolving. Until recently, most software engineers operated in siloed environments, with developers in one silo and an operations staff of IT professionals who maintained systems and software deployments in another. Members of the operations staff were often uninvolved in development processes, sometimes even unfamiliar with technologies used in development, yet expected to handle deployment, release testing, maintenance, and support of software applications produced by development teams.
Likewise, development teams were unable to design or develop software that could be effectively released, tested, and maintained by operations teams. The problem, of course, was that they had no understanding of the needs or processes inherent to their operations teams. This divergence led to wasted time and effort, as well as considerable risk for software products and deployments.
The concept of DevOps initially evolved in 2009 as an effort to remove barriers between the development and operations teams in software development. At the SEI, we are taking that concept and pushing it forward, alongside many others working in the software industry today.
As a federally funded research and development center (FFRDC), the SEI must maintain high standards of efficiency, security, and functionality. Our team in particular develops tools and technologies to help federal agencies assess cybersecurity risks, manage secure systems, and investigate increasingly complex cyber attacks and crimes. Cybersecurity is often misunderstood or even ignored as new systems are designed and developed, falling out of view behind higher-profile quality requirements, such as availability or correctness of software systems. This prioritization often means that security is addressed only after the primary design phase, or that security controls are added after large portions of development have already occurred. Due to CERT’s responsibility to our sponsors and the community, security is consistently a first-tier concern, addressed as an early and fundamental requirement for any system developed by our team. In addition to being a defining factor of our software development methodology, this posture heavily influences our approach to DevOps, weaving security considerations into every facet of our software development operation.
As mentioned in my introduction, forward-thinking approaches to process—including heavily automated DevOps techniques—allow us to systematically implement, maintain, and monitor status and quality standards for each of our projects. One large component of the current DevOps movement is Release Automation, which automates the build and deployment cycles of a software project. In Agile software-development scenarios, this process has two benefits:
automation of final software deployments
enabling the automation of continuous incremental deployments, triggered numerous times daily as changes are made to the software system
This process ensures that the entire development and management team is aware of the up-to-the-minute state of the software, including its test status and ability to be deployed to the expected runtime environment, continually throughout development. This process also enables highly confident development, and a great deal of certainty that when the time comes, the software can be successfully transitioned to an operations team for deployment and maintenance. This certainty is based on the fact that the entire test and deployment process has been automated and performed countless times already throughout the project lifecycle.
Continuous deployment means no surprises for operations staff, which translates into predictable, low-risk releases of provably high-quality software. Many players in the modern software industry share this vision. "When it comes to ensuring quality, release automation is another major asset for development teams. It automates the complex workflows required for agile software movement between development, test and production environments by removing manual interaction so humans don’t introduce avoidable errors," Ruston Vickers wrote in a March 5 blog post on Wired’s Innovation Insights blog. That same post cites a study in which more than half of respondents (52 percent) "identified the ability to simplify, standardize and execute application releases with [fewer] errors as the main benefit of release automation." In my view, release automation allows teams to perform build, deployment, and testing activities thoroughly and continually, dramatically increasing the confidence in the state of the software and all processes surrounding release and transition.
In a traditional, siloed environment, teams of developers deploy software manually, on a periodic basis (such as once per quarter, annum, or project cycle) or as necessitated by a new product release. In contrast, for a team leveraging automated DevOps, continuous integration (the process of building and testing a software project continuously, each time new changes are made) is in place and, as part of the process, the project is continuously deployed to an integration environment for testing and review. Some highly mature DevOps operations practice continuous deployment, an uninterrupted process that actually deploys live software to a production environment.
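As a minimal sketch of the continuous integration idea described above, the Python below runs a list of stages each time a change arrives and stops at the first failure, so a broken build is never deployed to the integration environment. The names `run_pipeline`, `on_change`, and the stage labels are hypothetical and chosen only for illustration; a real build system would invoke actual build, test, and deployment tools in place of the callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A stage is a (name, action) pair; the action returns True on success.
Stage = Tuple[str, Callable[[], bool]]


@dataclass
class StageResult:
    name: str
    passed: bool


def run_pipeline(stages: List[Stage]) -> List[StageResult]:
    """Run stages in order and stop at the first one that fails."""
    results: List[StageResult] = []
    for name, action in stages:
        passed = action()
        results.append(StageResult(name, passed))
        if not passed:
            break  # later stages (for example, deploy) are skipped
    return results


def on_change(commit_id: str, stages: List[Stage]) -> Tuple[bool, List[StageResult]]:
    """Hypothetical hook called once per change detected in source control."""
    results = run_pipeline(stages)
    # The change is good only if every stage ran and every stage passed.
    ok = len(results) == len(stages) and all(r.passed for r in results)
    return ok, results


if __name__ == "__main__":
    # Stand-in stages; real ones would call the project's own tooling.
    demo_stages: List[Stage] = [
        ("build", lambda: True),
        ("test", lambda: True),
        ("deploy-to-integration", lambda: True),
    ]
    ok, results = on_change("example-commit", demo_stages)
    print(ok, [r.name for r in results])
```

Ordering the stages so that deployment comes last is the design choice that matters here: the integration environment only ever receives software that has already built and passed its tests.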
Automating a task requires a substantial level of understanding by all involved of the processes, technologies, and complexities of deployments of the software in question. This requisite increase in understanding makes it likely that teams moving towards automated DevOps and continuous deployment capabilities also see a positive impact in the reliability, testability, security, and other quality attributes of their software. To achieve automation at this level, the team must study and deeply consider the needs of the entire project, from inception to deployment, which will result in a superior product due to increased focus on implementation details and operational realities.
To achieve the level of automation described, a number of autonomous systems must be in place, such as source control, build and deployment systems, an integration environment, and other systems to channel data and communications throughout DevOps processes. As the image below illustrates, DevOps involves a number of systems operating in concert and communicating seamlessly with other systems and with humans, to detect changes, perform the necessary autonomous functions, and notify team members of status and results. These structured interactions ensure that developers, quality-assurance staff, managers, and even external stakeholders receive continuous, real-time information on the status of the project, which is hugely beneficial, especially in Agile environments. Consistent with the Agile Manifesto: more information, more often, will lead to better project outcomes.
Systems of an Automated DevOps Environment
The remainder of this blog post explores our generalized model for automated DevOps, identifying the systems required to support an automated DevOps process (as shown in the illustration above):
Source control. Software developers need to safely store their code and keep track of source-code history and versions. For this reason alone, source control is of critical importance. Moreover, in an automated DevOps system, the chosen version control system (VCS), such as Git or Subversion, becomes the system that defines when changes have been made to the software project, triggering the rest of the automated DevOps build, test, and deploy activities.
Issue tracking system. An issue tracking system allows everyone involved to track current issues, estimates, and deadlines. Communication between this system and systems for code review and source control can allow invaluable traceability between code changes and project goals.
Build system. The build system supports continuous integration by building the software, running unit and integration tests, deploying to the integration environment, and performing any other automated checks defined for new versions of the software. This system must detect changes in source-code files from the source-control system, seamlessly communicate with and push data to the integration environment, and notify team members of status at all times.
Monitoring system. Monitoring systems continuously track all autonomous systems within the DevOps environment, notifying necessary maintenance staff if a system failure occurs. This monitoring requires communication with all autonomous systems used by the DevOps process. As a manager of a software engineering team, I don’t want to spend time debugging issues in my infrastructure. I want my developers to be able to continue coding even if a server fails. If a failure occurs, I want appropriate staff notified immediately so the issue can be fixed as quickly as possible without derailing my project.
Communications system. The constant exchange of information is important. Our team uses email, wikis, and a real-time chat system to enable continuous communication among all members of the project team. Moreover, communications systems allow our automated systems to communicate with the humans involved in the project through channels that are already part of their workflow. On my own team, I don’t have to check a dashboard to see that my latest commit broke the build; the build server will email me and send me an instant message to alert me in real time. The communications system will also alert my team lead, so everyone is in the loop and knows to fix the issue immediately so that no other developer’s work is inhibited.
Integration environment. The integration environment hosts all of the virtual machines that make up our DevOps environment. The integration environment houses machines running all of the DevOps services listed here and provides servers for continual deployment of software projects. Once a project is created, it has deployment servers, so, at any time, stakeholders can visit those servers and see the current state of working software achieved by the project team.
Code review system. To ensure software quality, every line of code must be reviewed by a seasoned developer. The practice of reviewing code also accelerates career growth and learning. Unfortunately, it is hard to find time for code review in a busy schedule. Our automated code review system therefore detects new code committed to source control and automatically assigns it to be reviewed by a senior developer. With automatic management of code reviews, we ensure that no piece of code fails to have multiple developers review it before it ends up in the final version of a software product.
Documentation system. Regrettably, documentation often remains an afterthought in production software projects. To ensure that documentation is written throughout the project, when developers are most conscious of technical details, we have developed an automated system that allows developers to write documentation easily, along with source code. All our documentation is written in Markdown, a simple plain-text format that can be easily edited in an integrated development environment (IDE) and committed to source control repositories along with source code. Once the documentation files are committed, an automated system builds robust documentation artifacts in a variety of formats (such as HTML, PDF, and Microsoft Word) for consumption by project managers and stakeholders. This process operates through the build system already monitoring source code for changes and has proven comfortable for developers by not requiring them to change tools to generate comprehensive project documents.
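Two of the interactions in the list above, automatic assignment of code reviews and build-failure alerts to the committer and team lead, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of the team's actual tooling: the `Commit` type, the round-robin rotation, and the message format are all hypothetical.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Callable, List


@dataclass(frozen=True)
class Commit:
    commit_id: str
    author: str


def make_reviewer_picker(senior_devs: List[str]) -> Callable[[Commit], str]:
    """Return a function that assigns reviewers round-robin (an assumed policy)."""
    rotation = cycle(senior_devs)

    def pick(commit: Commit) -> str:
        # Try each senior developer at most once, skipping the commit's author
        # so that nobody is asked to review their own code.
        for _ in range(len(senior_devs)):
            candidate = next(rotation)
            if candidate != commit.author:
                return candidate
        raise ValueError("no eligible reviewer other than the author")

    return pick


def build_notifications(commit: Commit, build_passed: bool, team_lead: str) -> List[str]:
    """Return one alert per recipient when a commit breaks the build."""
    if build_passed:
        return []
    message = f"Build failed for commit {commit.commit_id} by {commit.author}"
    recipients = [commit.author]
    if team_lead != commit.author:
        recipients.append(team_lead)
    return [f"{recipient}: {message}" for recipient in recipients]


if __name__ == "__main__":
    pick = make_reviewer_picker(["ana", "ben"])
    commit = Commit("c1", "ana")
    print(pick(commit))
    print(build_notifications(commit, build_passed=False, team_lead="ben"))
```

In practice the returned alert strings would be handed to whatever email or chat channel the team already uses, which keeps the notification logic independent of any one communications tool.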
As I see it, all the processes described above—from source control to documentation—require some system to perform them in every software project. Of course, before automated DevOps, many of these processes were performed manually, especially manual testing and deployments. Even now, generation and formatting of system documentation or management of code reviews are still manual processes in most organizations. My goal with this post was to define all tasks necessary throughout the SDLC, and to promote the automation of as many of them as possible, thus freeing up developers to do what they do best: design and write amazing code.
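To illustrate how documentation generation might be automated, the sketch below builds the commands that would convert committed Markdown files into HTML, PDF, and Word outputs. The post does not name the converter its team uses; pandoc is assumed here purely as an example, and the function name, file paths, and output directory are hypothetical.

```python
from pathlib import Path
from typing import List

# Output formats mentioned in the post: HTML, PDF, and Microsoft Word.
OUTPUT_EXTENSIONS = ("html", "pdf", "docx")


def pandoc_commands(markdown_files: List[str], out_dir: str) -> List[List[str]]:
    """Build one pandoc command per (Markdown file, output format) pair.

    pandoc infers the output format from the extension of the -o target.
    """
    commands: List[List[str]] = []
    for md in markdown_files:
        stem = Path(md).stem
        for ext in OUTPUT_EXTENSIONS:
            target = (Path(out_dir) / f"{stem}.{ext}").as_posix()
            commands.append(["pandoc", "--standalone", md, "-o", target])
    return commands


if __name__ == "__main__":
    for cmd in pandoc_commands(["docs/design.md"], "build"):
        print(" ".join(cmd))
```

A build system could run each command with `subprocess.run(cmd, check=True)` whenever a Markdown file changes; note that PDF output from pandoc additionally requires a PDF engine such as a LaTeX installation.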
Looking Ahead
While this post presented a generalized model for DevOps practices, future posts in this series will present the following topics:
advanced DevOps automation
DevOps system integration
continuous integration
continuous deployment
automated software deployment environment configuration
We welcome your feedback on this series, and what DevOps topics would be of interest to you. Please leave feedback in the comments section below.
Additional Resources
To listen to the podcast, DevOps—Transform Development and Operations for Fast, Secure Deployments, featuring Gene Kim and Julia Allen, please visit http://url.sei.cmu.edu/js.
To view the August 2011 edition of the Cutter IT Journal, which was dedicated to DevOps, please visit http://www.cutter.com/promotions/itj1108/itj1108.pdf.
Additional resources include the following sites:
http://devops.com/
http://dev2ops.org/
http://devopscafe.org/
http://www.evolven.com/blog/devops-developments.html
http://www.ibm.com/developerworks/library/d-develop-reliable-software-devops/index.html?ca=dat-



