Training Magazine Network

Toward Safe Optimization of Cyber-Physical Systems

By Dionisio de Niz, Senior Member of the Technical Staff, Research, Technology, and System Solutions

Cyber-physical systems (CPS) are characterized by close interactions between software components and physical processes. These interactions can have life-threatening consequences when they include safety-critical functions that are not performed according to their time-sensitive requirements. For example, an airbag must fully inflate within 20 milliseconds (its deadline) of an accident to prevent the driver from hitting the steering wheel with potentially fatal consequences. Unfortunately, the competition between safety-critical requirements and other demands to reduce cost, power consumption, and device size also creates problems, such as automotive recalls, new aircraft delivery delays, and plane accidents.

Our research leverages the fact that failing to meet deadlines doesn’t always have the same level of criticality for all functions. For instance, if a music player fails to meet its deadlines the sound quality may be compromised, but lives are not threatened. Systems whose functions have different criticalities are known as mixed-criticality systems. This blog posting updates our earlier post to describe the latest results of our research on supporting mixed-criticality operations by giving more central processing unit (CPU) time to functions with higher value while ensuring critical timing guarantees.

During our research, we observed that different functions provide different amounts of utility or satisfaction to the user. For instance, a GPS navigation function may provide higher utility than a music player. Moreover, if we give more resources to these functions (for example, more CPU time) the utility obtained from them increases. In general, however, the amount of utility obtained from additional resources does not grow forever, nor does it grow at a constant rate.
The additional increment in utility for each additional unit of resource instead decreases to a point where the next increment in utility is insignificant. In such cases, it is often more important to dedicate additional computational resources to another function that is currently delivering lower utility and will deliver a larger increment in utility for the same amount of CPU time. For example, assuming that we get a faster route to our destination if more CPU time is dedicated to the GPS functionality, it seems obvious that the first route we get from the GPS will give us the biggest increment in utility. If we lack enough CPU time (due to the execution of other critical functions) to run both the GPS and the music player, we will choose the GPS. We may even prefer to give more CPU time (if we discover that more time is available) to the GPS to help avoid traffic jams before we decide to run the music player. Letting the GPS run even longer to select a less traffic-clogged route, however, may give us less utility than running the music player. At this point, we may prefer to start running the music player if we have more CPU time available. We thus change our allocation preference because the additional utility obtained by giving the GPS more CPU time is less than the utility obtained by giving the music player this time. This progressive decrease in the utility obtained as we give more resources to a function is known as diminishing returns, which can be used to allocate resources to ensure we obtain the maximum total utility possible considering all functions in the system. Our research uses both the diminishing returns characteristics of low-criticality functions and criticality levels to implement a double-booking computation time reservation scheme. 
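The diminishing-returns argument above can be illustrated with a small sketch. It is not the allocation algorithm used in this research; it is a generic greedy allocator that gives each successive unit of CPU time to the function with the largest next increment in utility. The function names and utility numbers are hypothetical. When every function's increments are non-increasing, this greedy choice yields the maximum total utility for the available CPU time.

```python
import heapq


def allocate_cpu(marginal_utility, total_units):
    """Give each CPU time unit to the function with the largest next utility gain.

    marginal_utility maps a function name to a list whose k-th entry is the
    extra utility gained from that function's (k+1)-th unit of CPU time.
    Each list is assumed to be non-increasing (diminishing returns).
    """
    allocation = {name: 0 for name in marginal_utility}
    # heapq is a min-heap, so gains are negated to pop the largest gain first.
    heap = [(-gains[0], name) for name, gains in marginal_utility.items() if gains]
    heapq.heapify(heap)
    total = 0.0
    for _ in range(total_units):
        if not heap:
            break  # no function can use any more CPU time
        neg_gain, name = heapq.heappop(heap)
        allocation[name] += 1
        total += -neg_gain
        used = allocation[name]
        if used < len(marginal_utility[name]):
            # Queue this function's next, smaller, increment.
            heapq.heappush(heap, (-marginal_utility[name][used], name))
    return allocation, total


# Hypothetical utility increments per unit of CPU time.
gains = {"gps": [10, 6, 2, 1], "music": [5, 4, 1]}
allocation, total = allocate_cpu(gains, 4)
print(allocation, total)
```

With four units available, the GPS receives the first two units; its third unit would add only 2, so the remaining two units go to the music player, mirroring the change of preference described above.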
Traditional real-time scheduling techniques use the worst-case execution time (WCET) of each function to ensure that it always completes before its deadline. They do so by reserving CPU time that is used only on the rare occasions when the WCET occurs. We take advantage of this fact and allocate the same CPU time to lower-criticality functions as well. When both functions request the double-booked CPU time at the same time, we favor the higher-criticality function and let the lower-criticality function miss its deadline.

Our double-booking scheme is analogous to the strategies airlines use to assign the same seat to more than one person. In this case, the seat is given to the person with preferred status (e.g., "gold members"). Our project uses utility, in addition to criticality, to ensure that the double-booked CPU time is given to the functions providing the largest utility in case of a conflict (both functions requesting the double-booked CPU time). Our double-booking scheme provides two benefits: (1) it protects critical functions, ensuring that their deadlines are always met, and (2) it uses the time left unused by the critical functions to run the non-critical functions that produce the highest utility.

Our research is aimed at providing real-time system developers with an analysis algorithm that accurately predicts how the system behaves when it is running (runtime). Developers use this algorithm during the design phase (design-time) to test whether critical tasks will meet their deadlines (providing assurance) and to determine how much overbooking is possible. To evaluate the effectiveness of our scheme, we developed a utility degradation resilience (UDR) metric that quantifies the capacity of a CPS to preserve the utility derived from double-booking.
This metric evaluates all possible conflicts that can happen due to double booking and how much total utility is preserved after each conflict is resolved by deciding which function gets the double-booked CPU time and which functions are left without CPU time. The utility derived from the preserved functions is then summed to compute the total utility that a specific conflict resolution scheme can preserve. In theory, a perfect conflict resolution scheme should preserve the maximum possible utility. In reality, however, decisions must be made ahead of time assuming that some critical functions will run for their worst-case execution time (even though they may not) to ensure that they finish before their deadlines. Unfortunately, if they execute for less time, it may already be too late to execute other functions.

Using the UDR metric, we compared our scheme against the Rate-Monotonic Scheduler (RMS) and a scheme called Criticality-As-Priority Assignment (CAPA) that uses the criticality as the priority. Our experiments showed that we can recover up to 88 percent of the ideal utility, that is, the utility we would obtain if we could fully reclaim the unused time left by the critical functions, which would require perfect knowledge of exactly how much time each function needs to finish executing. In addition, we observed that our double-booking scheme can achieve up to three times the UDR that RMS provides.

We implemented a design-time algorithm to evaluate the UDR of a system and generate the scheduling parameters for our runtime scheduler, which performs the conflict resolution of our overbooking scheme (deciding which function gets the overbooked CPU time). This scheduler was implemented in the Linux operating system as a proof-of-concept to evaluate the practicality of our mechanisms. To evaluate our scheme in a real-world setting, we used our scheduler in a surveillance UAV application using the Parrot A.R.
Drone quadricopter with safety-critical functions (flight control) and two non-critical functions (video streaming and vision-based object detection). Our results confirmed that we can recover more CPU cycles for non-critical tasks with our scheduler than with the fixed-priority scheduler (using rate-monotonic priorities) without causing problems for the critical tasks. For example, we avoided instability in the flight controller that can lead to the quadricopter turning upside down. In addition, the overbooking between the non-critical tasks performed by our algorithm allowed us to adapt automatically to peaks in the number of objects to detect (and hence in the execution time of the object detection function) by reducing the frames per second processed by the video streaming function during these peaks.

In future work, we are extending our investigation to multi-core scheduling, where we plan to apply our scheme to hardware resources (such as caches) shared across cores. This research is done in collaboration with Jeffrey Hansen of CMU; John Lehoczky of CMU’s Statistics Department; and Ragunathan (Raj) Rajkumar and Anthony Rowe of the Electrical and Computer Engineering Department at CMU.

Additional Resources: www.contrib.andrew.cmu.edu/~dionisio/
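The conflict resolution described in this post (critical functions always keep their CPU time, and the remaining double-booked time goes to the non-critical functions with the highest utility) can be sketched as follows. This is a deliberately simplified illustration and not the scheduler or the UDR computation from this research: the `Task` fields, the utility-per-unit ordering, and all numbers are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    critical: bool
    demand: int      # CPU time units requested in this period (must be > 0)
    utility: float   # utility delivered if the task gets its CPU time


def resolve_conflict(tasks, capacity):
    """Choose which tasks run when requests exceed the double-booked capacity.

    Critical tasks are always admitted first. Non-critical tasks are then
    admitted in decreasing order of utility per unit of CPU time (an assumed
    tie-breaking rule) while capacity remains.
    """
    ordered = sorted(tasks, key=lambda t: (not t.critical, -t.utility / t.demand))
    admitted, remaining = [], capacity
    for task in ordered:
        if task.demand <= remaining:
            admitted.append(task)
            remaining -= task.demand
        elif task.critical:
            # A design-time analysis should rule this case out.
            raise ValueError(f"capacity cannot cover critical task {task.name}")
    return admitted


def total_utility(admitted):
    """Sum the utility of the tasks that kept their CPU time."""
    return sum(task.utility for task in admitted)


# Hypothetical demands and utilities, loosely modeled on the UAV example.
tasks = [
    Task("video_streaming", critical=False, demand=3, utility=6.0),
    Task("flight_control", critical=True, demand=4, utility=100.0),
    Task("object_detection", critical=False, demand=4, utility=12.0),
]
admitted = resolve_conflict(tasks, capacity=8)
print([task.name for task in admitted], total_utility(admitted))
```

In this example the flight controller is admitted first, object detection takes the remaining four units, and video streaming loses the conflict for this period.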

SEI . Blog .  Jul 27, 2015 02:55pm

Bridging the “Valley of Disappointment” for DoD Software Research with SPRUCE

By Douglas C. Schmidt, Chief Technology Officer, SEI

As noted in the National Research Council’s report Critical Code: Software Producibility for Defense, mission-critical Department of Defense (DoD) systems increasingly rely on software for their key capabilities. Ironically, it is increasingly hard to motivate investment in long-term software research for the DoD. This lack of investment stems, in part, from the difficulty that acquisition programs have making a compelling case for the return on these investments in software research. This post explores how the SEI is using the Systems and Software Producibility Collaboration and Experimentation Environment (SPRUCE) to help address this problem.

Decades of public and private research investments, coupled with the inexorable growth of globalization and connectivity, have commoditized many information technology (IT) products and services. For example, commercial off-the-shelf (COTS) hardware and software is now produced faster, cheaper, and generally at a predictable pace. During the past two decades, users and developers of IT systems have benefitted from the commoditization of hardware and networking elements. More recently, the maturation and widespread adoption of object-oriented programming languages, operating environments, and middleware is helping commoditize many software components and end-system layers.

Due to this IT commoditization trend, acquisition professionals, senior leaders, politicians, and funding agencies often assume that new software innovations will continue to appear at a predictable pace and that the DoD can benefit from these innovations without significant investment in software research. While mainstream IT systems may not need this investment, mission-critical DoD systems, particularly those at the tactical edge, cannot do without it.
Without sustained investment in software research, therefore, the DoD is in danger of "eating the seed corn" and reaching a complexity cap that will make it harder to succeed in an era of budget cuts and other austerity measures.

Challenges to Effective Software Research Impact

One challenge to motivating investment in software research is presenting a convincing pathway for how sponsored research finds its way into practice. The underlying problem for the DoD is the ad hoc and often serendipitous nature by which members of the software community (including academic researchers, defense contractor software architects and developers, DoD acquisition program and research sponsors, as well as commercial tool vendors) collaborate to identify, develop, test, and transition promising software technologies. This lack of systematic collaboration among the groups outlined above has created a dysfunctional, yet all-too-common, situation whereby DoD programs cannot find software technologies that meet their needs, regardless of the inherent promise of those technologies. As a result, acquisition programs across the DoD repeatedly encounter problems developing, validating, and sustaining software.

To exacerbate the problem, the "landing path" for software technologies is typically not DoD program engineers, but organizations (such as commercial vendors or standards bodies) responsible for maintaining the technology. These organizations are often not structured or motivated to leverage the results of advanced research projects effectively. For example, DoD software researchers have historically received funding for research programs of approximately three years in duration. These programs involve creating a project plan, building teams, working on technologies, generating and evaluating prototypes, and writing papers to publicize the work. Throughout this period, there is typically great enthusiasm for the project from the technical community.
Once the program ends, however, the community often disbands and the project descends into the "valley of disappointment," a phenomenon in which researchers struggle to transition their prototypes to the DoD acquisition community, while the practitioners are equally frustrated with not being able to apply research results to practical problems. Getting stuck in the "valley of disappointment" is a common problem in technology research and development projects, as evidenced by Geoffrey Moore’s book Crossing the Chasm. Moore presents this problem from a venture capital perspective: a group of researchers develops a technology and identifies some early adopters, but struggles to transition from the early-adoption to the majority-adoption phase. Some reasons for this valley are that researchers are often required to work on abstracted problems because that’s all that they can access, and that acquisition professionals don’t have the luxury of transitioning "science projects."

Crossing the "Valley of Disappointment" with SPRUCE

To address the challenges described above, the Assistant Secretary of Defense Research & Engineering Enterprise (ASDR&E), through the Air Force Research Lab (AFRL), funded researchers at Lockheed Martin Advanced Technology Laboratories, in partnership with Booz Allen Hamilton, Vanderbilt University (where I worked on SPRUCE before joining the SEI), Drexel University, Virginia Tech University, Lockheed Martin Aeronautics, and Raytheon to create the Systems and Software Producibility Collaboration and Experimentation Environment (SPRUCE). SPRUCE is a collaborative set of web-based services that matches DoD challenge problems with the methods, algorithms, tools, and techniques developed by researchers. One way to think about SPRUCE is as an "eHarmony" portal for researchers that unites domain experts from the DoD acquisition community who face concrete technical challenges with software researchers who can solve them.
For example, acquisition professionals could be searching for an approach that will allow them to run legacy code on a multi-core platform or an algorithm that minimizes the number of processors and the amount of network bandwidth in an avionics system. SPRUCE refers to these people as the "problem providers," who post challenge problems into the SPRUCE portal. Conversely, researchers are "solution providers" who use SPRUCE to post candidate solutions to available challenge problems. SPRUCE allows problem providers to explain their needs in a structured way, along with representative data sets and reproducible experiments, so that software researchers acting as solution providers can decide if they have methods or technologies that would make an impact on the posted problem. If researchers operate in an open environment (which is typical at universities), they can post their solution on the SPRUCE portal. If researchers operate in a closed environment (which is typical at companies), they can contact the problem providers directly and discuss options for collaboration.

SPRUCE addresses many problems facing DoD software researchers:

It allows researchers access to real-world problems and realistic data sets. Even if the problem providers have anonymized their problem (for example, by removing proprietary information), it still represents an actual challenge faced by the DoD. Once researchers demonstrate that their solutions work on abstracted problems that are relevant to a particular domain and derived from real-world scenarios, it is easier to convince the original problem providers that the results are ready to be applied in practice.

SPRUCE facilitates healthy competition among research groups. For example, SEI researchers may believe they have the most effective techniques and tools for detecting similarity in malware, but SPRUCE allows them to compare their results against techniques and tools devised by other researchers for a common data set.
In addition to the obvious competitive benefits, this approach also allows a better collaborative evolution of the solution by incorporating the best parts from each approach into a refined approach.

SPRUCE helps researchers locate sources of funding because it provides an immediate way for them to showcase their results in a forum that has an audience (the challenge problem providers) interested in solutions to real problems. Over time, researchers will populate the SPRUCE repository with their solutions, providing a way for them to find audiences for new funding and additional collaborations on real problems.

In its four years of funding, SPRUCE has focused primarily on capturing DoD challenge problems and helping DoD software researchers collaborate more effectively with other members of the DoD software community. SPRUCE has also influenced the National Science Foundation (NSF), which recently created the Cyber Physical Systems Virtual Organization (CPS-VO) community as a web portal for problem providers in cyber-physical systems. NSF-funded researchers use CPS-VO to post challenges and to work collaboratively to solve challenge problems with their colleagues around the world.

SPRUCE represents part of the trend towards more collaborative research and development among scientists and engineers. For example, UAVForge.net is attempting to use crowd sourcing to go from concept to fly-off of air vehicle designs in under six months. Likewise, a recent article from The Wall Street Journal titled "The New Einsteins Will Be Scientists Who Share" proclaims that "publicly funded science should be open science." Portals like SPRUCE help move researchers and practitioners from isolated pockets of collaboration to mainstream adoption.

Using SPRUCE to Guide SEI Research

At the SEI, we are using SPRUCE to showcase our solutions and, more importantly, to capture real-world challenge problems from our stakeholders.
For example, the SEI hosted a workshop in August 2011 that brought together researchers and problem providers from Lockheed Martin, Boeing, AFRL, Carnegie Mellon University, and Virginia Tech to elicit guidance for our work in real-time scheduling and concurrency analysis for cyber-physical systems. The workshop participants provided the SEI with challenge problems from avionics domain experts to ensure that the research we are doing addresses real DoD problems. As a result of this workshop, the problem providers populated the SPRUCE database with problems that SEI technologists will use to guide our future work. We are in the process of conducting challenge problem workshops for other software research projects at the SEI to ensure that we continue to work on relevant problems that have high impact on DoD operational needs.

SPRUCE also allows us to continually improve our metrics and measures of success. As a federally funded research and development center, the SEI is often requested to substantiate data and success criteria. Problem providers, who by their nature have a close connection to real-world problems, help define the success criteria. This approach allows an external party, like a DoD contractor, to define the success criteria for SEI researchers, who then work to achieve those criteria. At the same time, it showcases the solutions that SEI technologists have developed for various technologies, such as multi-core platforms.

In a commoditized IT environment, human resources are an increasingly strategic asset. In the future, therefore, premium value and competitive advantage will accrue to individuals, universities, companies, and agencies that continue to invest in software research and that master the principles, patterns, and protocols necessary to collaboratively integrate commoditized hardware and software to develop complex systems that cannot yet be bought off-the-shelf. Success in this endeavor requires close collaboration between academia, industry, and government.
The SPRUCE portal described above helps to facilitate this collaboration by bringing key stakeholders to the table and ensuring that government investments in software research have greater impact on DoD acquisition programs.

Additional Resources: For more information about the SPRUCE portal, please visit www.sprucecommunity.org/default.aspx. To read about the need to motivate greater DoD investment in software research, please see the National Research Council’s Critical Code: Software Producibility for Defense report available at www.nap.edu/openbook.php?record_id=12979&page=R1.

SEI . Blog .  Jul 27, 2015 02:55pm

Cloud Computing at the Tactical Edge

By Grace Lewis, Senior Member of the Technical Staff, Research, Technology & System Solutions

Cloudlets, which are lightweight servers running one or more virtual machines (VMs), allow soldiers in the field to offload resource-consumptive and battery-draining computations from their handheld devices to nearby cloudlets. This architecture decreases latency by using a single-hop network and potentially lowers battery consumption by using WiFi instead of broadband wireless. This posting extends our original post by describing how we are using cloudlets to help soldiers perform various mission capabilities more effectively, including facial, speech, and imaging recognition, as well as decision making and mission planning.

An initial goal of our research was to create a prototype application that located cloudlets within close proximity of the handheld devices using them. We initially focused on offloading computations to cloudlets to extend device battery life. In addition to this benefit, we also found that cloudlets significantly reduce the amount of time needed to deploy applications to handheld devices because clients are not tied to a specific server that can take a long time to provision in tactical environments.

Our work together with Mahadev "Satya" Satyanarayanan (the creator of the cloudlet concept and a faculty member at Carnegie Mellon's School of Computer Science) originally focused on face recognition applications as an example of a computation-intensive mission capability. Thus far we have created an Android-based facial recognition application that locates a cloudlet via a discovery protocol, sends the application overlay to the cloudlet (where dynamic VM synthesis is performed), captures images, and sends them to the facial recognition server code that now resides in the cloudlet.
In the context of cloudlets, the application overlay corresponds to the computation-intensive code invoked by the client, which in this case is the face recognition server, written in C++, that processes images from a handheld device client for training or recognition purposes. On execution, the overlay is sent to the cloudlet and applied to one of the VMs running in the cloudlet, a process called dynamic VM synthesis. The application overlay is pre-generated by calculating the difference between a base VM and the base VM with the computation-intensive code installed. The first version of the cloudlet we created is a simple HTTP server. When this server receives the application overlay from the client, it decrypts and decompresses the overlay and performs VM synthesis to configure the cloudlet dynamically. It subsequently returns to the client device the coordinates of the faces it recognizes, along with a measure of confidence.

Constructing the Cloudlet Prototype

The original cloudlet prototype built by Satya’s team used a simple Virtual Network Computing (VNC) client to see what was executing inside the VM. Our cloudlet prototype extended Satya’s work to use a thick mobile client that provides a better experience for users at the edge and allows incorporation of sensor information that would not be possible with the original VNC cloudlet approach. We constructed this prototype in the RTSS Concept Lab. Our design was tricky because the face recognition client needs to know the IP address and the port on which the face recognition server is listening so that it can connect to it. The client uses an HTTP request to start the cloudlet setup and expects an HTTP response from the cloudlet server that includes the face recognition server IP address and port.
Because the VM executes in bridged mode, however, its IP address is assigned by the DHCP server and the host server has no visibility into that assignment, so there was no simple way to obtain the IP address and port. To solve this problem, we included in the VM a Windows service that runs on startup. The Windows service invokes a Python script that performs the following three tasks: (1) start the face recognition server executable in a separate thread inside the Python script, (2) read the face recognition server configuration file that contains the IP address and port that the face recognition server is listening on, and (3) write this information to a file that is accessible by the cloudlet. Although the Windows service creates additional complexity on the cloudlet server, it reduces the complexity of cloudlet setup in the field.

During field operation, servers residing within Tactical Operations Centers (TOCs) and Humvees are provisioned with a set of pre-packaged cloudlets to support a range of applications and versions, which avoids provisioning servers for each supported application platform and version. The handheld devices of soldiers participating in the mission are then loaded with the application overlays that are necessary for a particular mission. A soldier running a computation-expensive application can discover a compatible cloudlet within minutes and offload the expensive computation to the cloudlet running on a server.

What We’ve Learned

Our research has identified the following two types of applications that can be deployed in a cloudlet setting:

Data-source-reliant applications that rely on a particular data source to work. For example, if soldiers need to launch the facial recognition application, they need a database of faces to match images against. Another example would be if the soldier wanted to compare fingerprints and needed a database of fingerprints to match against.
In this setting the cloudlet must be configured to connect to a particular data source.

Non-data-source-reliant applications that are computationally intensive but don’t require a large data source to work. For example, imagine soldiers encountering a sign with characters they don’t understand. They can take a picture of the sign and submit it to a cloudlet to determine the language in which the sign is written. In this case the computationally intensive code residing on the cloudlet relies on complex character recognition algorithms instead of a large database.

As expected, our experiments demonstrated that a larger overlay increases overlay transmission time (which in turn consumes more battery) as well as VM synthesis time. Including the data source inside the overlay would create a large overlay, which suggests that the cloudlet concept is a better fit for non-data-source-reliant applications. We overcame this problem by specifying the location of the data source in a configuration file. The location could be the local server or a server accessible over a network or the Internet. Although this approach requires additional configuration, it is done only once (when the cloudlet is packaged by IT experts), rather than each time a server is configured in the field (potentially by non-IT experts).

Future Work

When testing the cloudlet prototype in the RTSS Concept Lab, we discovered that a reduced deployment time makes it easier to deploy an application in a tactical environment.
We are working to capture those measurements and are developing the following applications to support our findings:

Fingerprint recognition: fingerprints are captured using a fingerprint scanner connected to a handheld device and sent to the cloudlet for processing.
Character recognition: pictures of a written sign are taken with a camera on the handheld device and sent to the cloudlet for character identification and translation.
Speech recognition: the voice of a person speaking a foreign language is captured using the voice recorder on the handheld device and sent to the cloudlet for translation; the same application can be used to translate a response back to the identified foreign language.
Model checking: an app is generated on the handheld on-the-fly using end-user programming capabilities and sent to a model checker in a cloudlet to ensure it does not violate any security (or other) policies and constraints.

We will use these new applications to gather measurements related to bandwidth consumption of overlay transfer and VM synthesis to focus on optimization of cloudlet setup time. Our future research and collaboration will position cloudlets to both reduce battery consumption and simplify application deployment in the field. For example, our goal is to use dynamic VM synthesis to slash the time needed to deploy applications, thereby shielding operators from unnecessary technical details, while also communicating and responding to mission-critical information at an accelerated operational tempo.

Additional Resources: This is the second post in a series exploring the SEI’s research in Cloud Computing in partnership with Satya. To read the initial post, Cloud Computing for the Battlefield, please visit http://blog.sei.cmu.edu/post.cfm/cloud-computing-for-the-battlefield.
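The three tasks of the startup script described earlier in this post (start the server in a separate thread, read its configuration, and write the address where the cloudlet can read it) might look roughly like the sketch below. The original script is not shown in the post, so everything concrete here is an assumption: the INI layout with a `[server]` section, the `ip` and `port` keys, the JSON output file, and the function names are all hypothetical.

```python
import configparser
import json
import subprocess
import threading


def start_server(command):
    """Task 1: run the server executable in a separate thread.

    command is the argument list for the executable (hypothetical here).
    """
    thread = threading.Thread(target=subprocess.run, args=(command,), daemon=True)
    thread.start()
    return thread


def read_endpoint(config_path):
    """Task 2: read the IP address and port from the server's configuration file.

    Assumes an INI file with a [server] section holding 'ip' and 'port'.
    """
    parser = configparser.ConfigParser()
    with open(config_path) as handle:
        parser.read_file(handle)
    section = parser["server"]
    return section["ip"], section.getint("port")


def publish_endpoint(ip, port, out_path):
    """Task 3: write the endpoint to a file the cloudlet host can read."""
    with open(out_path, "w") as handle:
        json.dump({"ip": ip, "port": port}, handle)


def run_startup(command, config_path, out_path):
    """Perform the three tasks in order and return the server thread."""
    thread = start_server(command)
    ip, port = read_endpoint(config_path)
    publish_endpoint(ip, port, out_path)
    return thread
```

A call such as `run_startup(["face_server.exe"], "server.ini", "endpoint.json")` (all three names hypothetical) would then let the cloudlet's HTTP server read `endpoint.json` and return the address to the client.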

SEI . Blog .  Jul 27, 2015 02:54pm

Equipping the Soldier with End-User Programming

By Edwin Morris, Advanced Mobile Systems Initiative Lead, Research, Technology & System Solutions

Whether soldiers are on the battlefield or providing humanitarian relief, they need to capture and process a wide range of text, image, and map-based information. To support soldiers in this effort, the Department of Defense (DoD) is beginning to equip soldiers with smartphones to allow them to manage the vast array and amount of information they encounter while in the field. Whether the information gets correctly conveyed up the chain of command depends, in part, on the soldier’s ability to capture accurate data while in the field. This blog posting, a follow-up to our initial post, describes our work on creating a software application for smartphones that allows soldier end-users to program their smartphones to provide an interface tailored to the information they need for a specific mission.

The software we developed is constructed primarily in Java and operates on an Android platform. We used an object database (DB 4.0) as the underlying data store because it provides flexible and powerful application programming interfaces (APIs) that simplified our implementation. For performance reasons, our application is a native Android app; it does not run in the browser of an Android smartphone.

Our app, called eMONTAGE (Edge Mission Oriented Tactical App Generator), allows a soldier to build customized interfaces that support the two basic paradigms that are common to smartphones: maps and lists. For example, a soldier could build an interface for constructing a list of friendly community members, including names, affiliations with specific groups, information about whether the person speaks English, and the names of the person’s children. If the soldier also specifies a GPS location in the customized interface, the locations of the friendly community members could be plotted on a map.
Likewise, the same soldier could build other customized interfaces that capture specific aspects of a threatening incident, or the names and capabilities of non-governmental organizations responding to a humanitarian crisis.

Challenges We Encountered

The software we built is intended for soldiers who are well-versed in their craft, but are not programmers. Although user testing is still under way, we asked several soldiers to provide feedback after we developed a prototype. Not surprisingly, we found that soldiers who are Android users and relatively young (i.e., digital natives) quickly learned the software programming application and could use it to build a new application on-site. Conversely, non-digital natives had a harder time. Since our goal is to make our software accessible to every soldier, we are simplifying, revising, and improving the user interface.

As with any device used by our military, security is a key concern. Through our work with DARPA’s Transformative Apps program in the Information Innovation Office, we can take advantage of the security strategies they conceive and implement. We are also working to address challenges associated with limited bandwidth and battery consumption in this work and other work within the Research, Technology, and Systems Solutions program at the SEI.

Another area of our work involves enabling our software to connect to back-end data sources that the DoD uses. For example, a soldier on patrol may need to connect to TiGR and other information systems to access current information about people, places, and activities in an area. Our software will enable these soldiers to build customized interfaces to such data sources by selecting fields for display on the phone and by extending the information provided by these sources with additional, mission-specific information. This capability will provide mashups that support soldiers by capturing multiple sources of information for display and manipulation.
Once our full capability is available in spring 2012, it will become much easier to build phone interfaces to new data sources and extend these interfaces with additional information.

Looking to the Future

Currently, eMONTAGE can handle the basic information types that are available on an Android phone, including images, audio, and data. Technologies like fingerprint readers and chemical sensors are being miniaturized and will likely be incorporated into future handheld devices. With each new technology, we’ll need to add that basic type to our capability. Fortunately, this is a relatively straightforward programming operation, but it does require engineering expertise. As a new type becomes available, professional engineers will add it to eMONTAGE, thereby making the type available to soldiers who may have little or no programming expertise.

Our current focus is on ensuring that the software is reliable and does not fail, but we are also looking to extend it to provide features that we believe are essential, such as better support for collections of objects. For example, soldiers may need to classify a single individual into different groups: a family member, translator, or member of an organization. Each of these groups is a collection. Soldiers will have the ability to list and search through collections (e.g., list all members of an NGO who work for Doctors Without Borders) and plot the members of a collection on a map (e.g., display all members of Doctors Without Borders who are within 10 miles of my current position).

While we can provide access to military iconology, eMONTAGE is not DoD-specific by design. This application can be used by other government organizations, or even non-government organizations, that want a user-customizable way to capture information about any variety of people, places, and things, and share this information effectively in the enterprise.
Part of our ongoing research involves testing our applications with soldiers through the Naval Postgraduate School’s Center for Network Innovation and Experimentation (CENETIX). In our initial tests with the soldiers, they told us what capabilities they need and what did not work. These collaborations tie our work firmly into both the research and military communities and keep us focused on providing a useful and cutting-edge capability. In addition to continuing our collaboration with CENETIX, we are working with Dr. Brad Myers of the Carnegie Mellon University Human-Computer Interaction Institute. Dr. Myers is helping us define an appropriate interface for soldiers to use the handheld software in the challenging situations they face.

Additional Resources:

This posting is the second in a series exploring our research in developing software for soldiers who use handheld devices in tactical networks. To read our first post in the series, please visit http://blog.sei.cmu.edu/post.cfm/a-new-approach-for-handheld-devices-in-the-military

SEI . Blog .  Jul 27, 2015 02:54pm

Regression Verification for Real-time Embedded Software Systems

By Arie Gurfinkel, Senior Member of the Technical Staff, Research, Technology, & System Solutions

The DoD relies heavily on mission- and safety-critical real-time embedded software systems (RTESs), which play a crucial role in controlling systems ranging from airplanes and cars to infusion pumps and microwaves. Since RTESs are often safety-critical, they must undergo an extensive (and often expensive) certification process before deployment. This costly certification process must be repeated after any significant change to the RTES, such as migrating a single-core RTES to a multi-core platform, significant code refactoring, or performance optimizations, to name a few. Our initial approach to reducing re-certification effort (described in a previous blog post) focused on the parts of a system whose behavior was affected by changes using a technique called regression verification, which involves deciding the behavioral equivalence of two closely related programs. This blog posting describes our latest research in this area, specifically our approach to building regression verification tools and techniques for static analysis of RTESs.

Although there are many types of RTESs, we concentrate on a class of periodic programs, which are concurrent programs that consist of tasks that execute periodically. The tasks are assigned priorities based on their frequency (higher frequency = higher priority). The RTES executes the tasks using a priority-based preemptive scheduler. Each execution of a task is called a job. Thus, from the perspective of the scheduler, a system’s execution is a constant periodic stream of jobs of different priorities. In the rest of this post, we use RTES to mean periodic programs.

In the beginning of the project, we assumed that automated verification techniques (such as static analysis and model checking) for single-core RTESs could be adapted for regression verification since these techniques have been used for sequential single-core programs.
After conducting an initial survey, however, we found that existing automated verification techniques that apply directly to program source (rather than to a manual abstract model) are not applicable to periodic programs. We therefore changed our original approach of extending static analysis to regression verification in the setting of multi-core RTESs in two ways. First, in phase 1 of our project we developed a new static analysis technique for reasoning about bounded executions of periodic programs. Second, in phase 2 we extended regression verification to multi-threaded programs, of which periodic programs are a restricted subset. The remainder of this blog posting describes these two phases.

Phase 1: Time-Bounded Verification of Periodic Programs

In the first part of our work, we developed an approach for time-bounded verification of safety properties (user-specified assertions) of periodic programs written in the C programming language. Time-bounded verification is the problem of deciding whether a given program violates any user-specified assertion within a given time interval. Time-bounded verification makes sense for RTESs because of their intimate dependence on real-time behavior. The inputs to our approach are (1) a periodic program C; (2) a safety property expressed via an assertion A embedded in C; (3) an initial condition Init of C; and (4) a time bound W. The output is either a counter-example trace showing how C violates the assertion A, or a message saying that the program is safe, in the sense that no execution within the time bound violates any user-specified assertion.

Our solution to time-bounded verification is based on sequentialization, which involves reducing verification of a concurrent program P to verification of a (non-deterministic) sequential program P’. A key feature of our approach is that the size of P’ is linear in the size of P, which means the translation step is not computationally intensive and adds little overhead to the verification effort.
The scalability of our approach is therefore mostly driven by the scalability of the underlying analysis engine, and our approach automatically benefits from constant improvements in the verification area. Our work builds upon previous sequentialization work for context-bounded analysis (CBA) and bounded model checking (BMC). Our approach differs from prior work, however, since it bounds the actual execution time of the program, which is more natural to the designer of an RTES than a bound on the number of context switches (as done in CBA) or a bound on the number of instructions executed (as in BMC). We bound the execution time by translating the input time bound W in our model to a bound on the number of jobs. This translation is a natural consequence of the fact that the tasks are periodic and are therefore activated a finite number of times within W.

We implemented our approach in a tool called REK. REK supports C programs with tasks, priorities, priority ceiling locks, and shared variables. It takes a concurrent periodic program that cannot be analyzed with standard tools for sequential verification and converts it into a form that such tools can analyze. Although in principle REK is compatible with any analyzer for bounded (loop- and recursion-free) C programs, in practice we rely on the CBMC tool by Daniel Kroening, which is one of the first and most mature bounded model checkers for C. CBMC can automatically analyze substantial C programs by encoding assertion violations as Boolean satisfiability queries. It is a robust tool that has been applied extensively to industrial problems.

How REK Works

The analysis problem that REK is designed to solve is to check that a given periodic program is safe under all legal schedulings of its tasks. REK solves a time-bounded version of this problem, e.g., whether the program is safe in the first 100ms, 200ms, 300ms, etc., starting from some user-specified initial condition.
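The translation from a time bound W to a bound on the number of jobs can be illustrated with a minimal Python sketch. It is not taken from REK: the task names and periods are hypothetical, and it assumes every task releases its first job at time 0 and one job per period after that. It also shows the frequency-based priority assignment described earlier (shorter period, higher priority).

```python
def jobs_within_bound(periods_ms, bound_ms):
    """Number of jobs each task releases in the interval [0, bound_ms).

    Assumes a release at time 0 and then one release per period,
    so the count is ceil(bound_ms / period), computed with integers.
    """
    return {name: (bound_ms + period - 1) // period
            for name, period in periods_ms.items()}


def frequency_based_priorities(periods_ms):
    """Rank tasks by period; rank 0 is the highest priority."""
    ordered = sorted(periods_ms, key=lambda name: periods_ms[name])
    return {name: rank for rank, name in enumerate(ordered)}


# Hypothetical task set (periods in milliseconds), for illustration only.
tasks = {"balance": 4, "sonar": 20, "bluetooth": 100}

job_counts = jobs_within_bound(tasks, bound_ms=100)
total_jobs = sum(job_counts.values())
priorities = frequency_based_priorities(tasks)

print(job_counts)   # jobs per task within W = 100 ms
print(total_jobs)   # overall bound on the number of jobs
print(priorities)
```

Because the job bound grows linearly with W, doubling the time bound roughly doubles the number of jobs the analysis has to cover under these assumptions.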
Time-bounded verification makes sense in the context of periodic programs since their execution can be naturally partitioned into time intervals. Of course, in practice, unbounded verification would be preferred, so we are working on extending REK in this direction.

We briefly summarize the sequentialization step done by REK. First, we divide a time-bounded execution into execution rounds (or rounds for short). The execution starts in round 0, and a new round starts (and the old one stops) whenever a job of some task finishes. An execution with X jobs therefore requires X execution rounds. The sequentialization step simulates execution of each round independently and then combines them (using non-deterministic choice) into a single legal execution. Further details of the construction are available in our FMCAD 2011 paper referenced below.

In addition to the basic sequentialization described above, REK is extended with the following features to achieve scalability to realistic programs:

Partial order reduction is a set of techniques used in model checking to reduce the number of interleavings that must be explored in a concurrent system. For example, if there are two independent actions a and b, then only one of the two executions ‘a followed by b’ or ‘b followed by a’ must be explored since they both lead to the same destination state. Although there are many approaches for partial order reduction in explicit state model checking (as opposed to the symbolic model checking used in this work), extending them to symbolic verification is an area of active research. In REK, we developed a new partial order reduction technique that restricts explored executions only to those in which a read statement is preempted by a write statement to the same variable, or a write is preempted by a read or a write. This reduction eliminates many unnecessary interleavings and cuts the search space significantly. Our experiments show that the reduction is quite effective in practice.
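The intuition behind partial order reduction can be shown with a small Python sketch. It only illustrates why independent actions can be explored in a single order and when two accesses to shared variables can matter to each other; it does not reproduce REK's symbolic encoding, and the names `Access`, `may_conflict`, and the three toy actions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    kind: str       # "read" or "write"
    variable: str   # name of the shared variable


def may_conflict(first, second):
    """True when both accesses touch the same variable and at least one writes it."""
    same_variable = first.variable == second.variable
    involves_write = "write" in (first.kind, second.kind)
    return same_variable and involves_write


def run(state, actions):
    """Apply the actions in order to a copy of the state and return the result."""
    result = dict(state)
    for action in actions:
        action(result)
    return result


def a(state):
    state["x"] = state["x"] + 1   # writes x


def b(state):
    state["y"] = state["y"] * 2   # writes y, independent of a


def c(state):
    state["x"] = state["x"] * 2   # writes x, conflicts with a


start = {"x": 1, "y": 3}

# Independent actions: both orders reach the same state, so one order suffices.
independent_orders_agree = run(start, [a, b]) == run(start, [b, a])

# Conflicting actions: the order changes the outcome, so both must be explored.
conflicting_orders_agree = run(start, [a, c]) == run(start, [c, a])

print(independent_orders_agree, conflicting_orders_agree)
```

Only preemptions at points where `may_conflict` would hold can change the final state, which is why restricting the explored executions to such points preserves the set of reachable outcomes in this toy setting.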
A limitation of our approach is that it does not keep track of the actual execution time of each instruction, each job, and each task. As such, it is an over-approximation: it explores more executions than are actually possible and can produce a "false positive," that is, a counter-example trace that is not possible on a given hardware architecture due to timing restrictions. To reduce the number of false positives, we further constrain our sequentialization using information that can be inferred from schedulability analysis. If a periodic program is schedulable, it satisfies the rate monotonic analysis (RMA) equations. Those equations can be used to compute an upper bound on the number of times any given low-priority job can be preempted by any given high-priority job. We call this the preemption bound. REK uses it to further reduce the number of interleavings by keeping track of how many times one task preempts another and ensuring that this value never exceeds the preemption bound for the jobs of that task.

To deal with practical periodic programs, REK provides support for two types of commonly used lock primitives. In particular, it supports preemption locks (preemptions are disabled when the lock is held) and priority ceiling locks (preemption by any task with lower priority than the lock is disabled when the lock is held). We are extending REK to support the third common type of locks, priority-inheritance locks (regular blocking locks, but the priority of a low-priority task that holds a lock l is increased if a high-priority task is waiting for l).

As part of our research, we created a model problem using the NXTway-GS, which is a two-wheeled, self-balancing robot that responds to Bluetooth commands. The robot uses a gyroscope to balance itself upright by applying power to left and right wheels. It also uses a sonar sensor so that when it comes to an obstacle, like a wall or ditch, it can back up.
We have used REK to verify and fix several communication consistency properties between the tasks of the robot. More information on the use of REK for the NXTway-GS is available at http://www.andrew.cmu.edu/~arieg/Rek.

Phase 2: Regression Verification for Multi-threaded Programs

In the second phase of our work, we examined regression verification for multi-threaded programs. We believe that once we have regression verification for multi-threaded programs, we can adapt it to periodic programs as well. Every instance of regression verification is based on some underlying notion of equivalence. The equivalence notion for single-threaded software is called partial equivalence: two functions are partially equivalent if they produce the same output for the same input. A multi-threaded program, by contrast, is not partially equivalent even to itself by the above definition, since the same input can lead to different outputs due to scheduling choices. Our first challenge therefore involved creating a notion of equivalence for multi-threaded software.

Our second challenge was to come up with the right notion of decomposition to establish equivalence of programs from equivalence of their functions. Equivalence of sequential programs is established using Input/Output equivalence. Two sequential programs are equivalent if it is possible to show that their corresponding functions have the same Input/Output behavior (produce the same output given the same input). In the case of multi-threaded programs, however, functions from different threads of a single program affect one another, making simple decomposition at the level of functions much harder because it must take interference from other threads into account.

To check whether two multi-threaded programs are partially equivalent (P = P’) we use a proof rule consisting of a set of premises and a conclusion. Each premise establishes the partial equivalence of a pair of functions f and f’ from P and P’, respectively.
A premise is established by verifying a single-threaded program. As part of this work, we developed two separate proof rules:

The first rule attempts to show equivalence of two programs by showing that their corresponding functions are Input/Output equivalent (produce the same output for a given input) under arbitrary interference, where "interference" means that the value of shared variables can change between execution of instructions of a thread. This rule is "strong" (that is, it does not apply to many equivalent programs) because in practice the functions must be equivalent only in the context of the given program and not under arbitrary interference.

The second rule improves on the first rule by attempting to show that two programs are equivalent by restricting interference to what is consistent with the other functions in the program. For example, if there is no other function in a program that can affect a global variable ‘x’, then no interference that modifies ‘x’ is considered. This rule is "weaker" (more widely applicable) than the first one, but is computationally harder to automate.

In Conclusion

The ability to statically reason about correctness of periodic programs and the ability to perform regression verification add the following key capabilities to an RTES developer’s toolbox: the ability to check prior to deployment that the program does not violate its assertions; the ability to check that top-level application programming interfaces (APIs) are not affected by low-level refactoring and/or performance optimizations; the ability to check that new APIs are backward compatible with old APIs; and the ability to perform impact analysis to determine which functions may be affected by a given source code change and which unit tests must be repeated. We believe these capabilities can lower the cost of developing RTESs, while increasing their reliability and trustworthiness.
Additional Resources

For more information about our tool REK and our experiments, please visit http://www.andrew.cmu.edu/user/arieg/Rek

For more information about the bounded model checker CBMC, please visit http://www.cprover.org/cbmc

B. Goldin and O. Strichman. "Regression Verification," in Proceedings of DAC 2009, pp. 466-471.

S. Chaki, A. Gurfinkel, and O. Strichman. "Time-Bounded Analysis of Real-Time Systems," in Proceedings of FMCAD 2011, pp. 72-80.

S. Chaki, A. Gurfinkel, and O. Strichman. "Regression Verification for Multi-Threaded Programs," to appear in Proceedings of VMCAI 2012.

SEI . Blog .  Jul 27, 2015 02:52pm

Using Predictive Modeling in Software Development: Results from the Field

By Dennis R. Goldenson, Senior Member of the Technical Staff, Software Engineering Measurement and Analysis

As with any new initiative or tool requiring significant investment, the business value of statistically based predictive models must be demonstrated before they will see widespread adoption. The SEI Software Engineering Measurement and Analysis (SEMA) initiative has been leading research to better understand how existing analytical and statistical methods can be used successfully and how to determine the value of these methods once they have been applied to the engineering of large-scale software-reliant systems. As part of this effort, the SEI hosted a series of workshops that brought together leaders in the application of measurement and analytical methods in many areas of software and systems engineering. The workshops help identify the technical barriers organizations face when they use advanced measurement and analytical techniques, such as computer modeling and simulation. This post focuses on the technical characteristics and quantified results of models used by organizations at the workshops.

Participants were invited and asked to present at the workshops only if they had empirical evidence about the results of their modeling efforts. A key component of this work is assembling leaders within the organizations who know how to conduct measurement and analysis and can demonstrate how it is successfully integrated into the software product development and service delivery processes. Understandably, attendees don’t share proprietary information, but rather talk about the methods that they used, and, most importantly, they learn from each other. At a recent workshop, the various models discussed were statistical, probabilistic, and simulation-based.
For example, organizational participants demonstrated the use of Bayesian belief networks and process flow simulation models to define end-to-end software system lifecycle processes requiring coordination among disparate stakeholder groups to meet product quality objectives and efficiency of resource usage; described the use of Rayleigh curve fitting to predict defect discovery (depicted as defect densities by phase) across the software system lifecycle and to predict latent or escaping defects; and described the use of multivariable linear regression and Monte Carlo simulation to predict software system cost and schedule performance based on requirements volatility and the degree of overlap of the requirements and design phases (a surrogate for the risk of proceeding with development prematurely).

Quantifying the Results

The presentations covered many different approaches applied across a large variety of organizations. Some had access to large data repositories, while others used small datasets. Still others addressed issues of coping with missing and imperfect data, as well as the use of expert judgment to calibrate the models. The interim and final performance outcomes predicted by the models also differed considerably, and included defect prevention, customer satisfaction, other quality attributes, aspects of requirements management, return on investment, cost, schedule, efficiency of resource usage, and staff skills as a function of training practices.

One case study, presented by David Raffo, professor of business, engineering, and computer science at Portland State University, described an organization releasing defective products with high schedule variance. The organization’s defect-removal activities were based on unit test, where they faced considerable reliability problems. They knew they needed to reduce schedule variance and improve quality, but they had a dozen ideas to consider for how to actually accomplish that.
They wanted to base their decision on a quantitative evaluation of the likelihood of success of each particular effort. A state-based discrete event model of large-scale commercial development processes was built to address that and other problems. The simulation was parameterized using actual project data. Some outcomes predicted by the model included the following: cost in staff-months of effort or full-time-equivalent staff used for development, inspections, testing, and rework, numbers of defects by type across the life cycle, delivered defects to the customer, and calendar months of project cycle time. Raffo’s simulation model was used as part of a full business case analysis. The model ultimately determined likely return on investment (ROI) and related financial performance under different proposed process change scenarios. Another example presented by Neal Mackertich and Michael Campo of Raytheon Integrated Defense Systems demonstrated the use of a Monte Carlo simulation model they developed. The model was created to support Raytheon’s goal of developing increasingly complex systems with smaller performance margins. One of their most daunting challenges was schedule pressure. Schedules are often managed deterministically by the task manager, limiting the ability of the organization to assess the risk and opportunity involved, perform sensitivity analysis, and implement strategies for risk mitigation and opportunity capture. The model developed at Raytheon allowed them to statistically predict their likelihood of meeting schedule milestones, identify task drivers based on their contribution to overall cycle time and percentage of time spent on the critical path, and develop strategies for mitigating the identified risk. 
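A Monte Carlo schedule model of this general kind can be sketched in a few lines of Python. This is not the Raytheon model: the task list, the triangular duration estimates, and the milestone are hypothetical, and the sketch assumes a purely sequential chain of tasks, whereas the real model also accounted for task sequence relationships and the critical path.

```python
import random


def simulate_schedule(tasks, runs, seed=0):
    """Sample total durations for a sequential chain of tasks.

    Each task is (optimistic, most_likely, pessimistic) in days and is
    sampled from a triangular distribution. Returns sorted totals.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        total = sum(rng.triangular(low, high, mode) for low, mode, high in tasks)
        totals.append(total)
    return sorted(totals)


def percentile(sorted_values, fraction):
    """Simple nearest-rank style percentile on an already sorted list."""
    index = min(len(sorted_values) - 1, int(fraction * len(sorted_values)))
    return sorted_values[index]


# Hypothetical task estimates in days: (optimistic, most likely, pessimistic).
tasks = [(8, 10, 15), (18, 20, 30), (4, 5, 9)]
milestone_days = 40  # hypothetical schedule milestone

totals = simulate_schedule(tasks, runs=10_000)
interval_90 = (percentile(totals, 0.05), percentile(totals, 0.95))
on_time_probability = sum(t <= milestone_days for t in totals) / len(totals)

print("90% interval for total duration (days):", interval_90)
print("Estimated probability of meeting the milestone:", on_time_probability)
```

Replacing single-point task estimates with distributions is what makes the likelihood of meeting a milestone, and the sensitivity of the total to individual tasks, visible at all.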
The primary output of the model was the prediction interval estimate of schedule performance (generated from Monte Carlo simulation) using individual task duration probability estimation and an understanding of the individual task sequence relationships. Engineering process funding was invested in the development and deployment of the model and critical chain project management, resulting in a 15-40% reduction in cycle time duration against baseline.

Encouraging Adoption

While these types of models are used frequently in other fields, they are not as often applied in software engineering, where the focus has often been on the challenges of the system being developed. As the field matures, more analysis should be done to determine quantitatively how products can be built most efficiently and affordably, and how we can best organize ourselves to accomplish that. The initial cost of model development can range from a month or two of staff effort to a year depending on the scope of the modeling effort. Tools can range from $5,000 to $50,000 depending on the level of capability provided. As a result of these kinds of investments, models can save, and have saved, organizations millions of dollars through resultant improvements.

Our challenge is to help change the practice of software engineering, where the tendency is to "just go out and do it," to include this type of product and process analysis. To do so, we know we have to conclusively demonstrate that the information gained is worth the expense and bring these results to a wider audience.

Additional Resource:

To read the SEI technical report, Approaches to Process Performance Modeling: A Summary from the SEI Series of Workshops on CMMI High Maturity Measurement and Analysis, please visit www.sei.cmu.edu/library/abstracts/reports/09tr021.cfm

SEI . Blog .  Jul 27, 2015 02:51pm

A Summary of Key SEI R&D Accomplishments in 2011

By Douglas C. Schmidt Chief Technology Officer A key mission of the SEI is to advance the practice of software engineering and cyber security through research and technology transition to ensure the development and operation of software-reliant Department of Defense (DoD) systems with predictable and improved quality, schedule, and cost. To achieve this mission, the SEI conducts research and development (R&D) activities involving the DoD, federal agencies, industry, and academia. One of my initial blog postings summarized the new and upcoming R&D activities we had planned for 2011. Now that the year is nearly over, this blog posting presents some of the many R&D accomplishments we completed in 2011. Our R&D benefits the DoD and other sponsors by identifying and solving key technical challenges facing developers and managers of current and future software-reliant systems. Our R&D work focuses on the following four major areas of software engineering and cyber security: Innovating software for competitive advantage. This area focuses on producing innovations that revolutionize development of assured software-reliant systems to maintain the U.S. competitive edge in software technologies vital to national security. Securing the cyber infrastructure. This area focuses on enabling informed trust and confidence in using information and communication technology to ensure a securely connected world to protect and sustain vital U.S. cyber assets and services in the face of full-spectrum attacks from sophisticated adversaries. Advancing disciplined methods for engineering software. This area focuses on improving the availability, affordability, and sustainability of software-reliant systems through data-driven models, measurement, and management methods to reduce the cost, acquisition time, and risk of our major defense acquisition programs. Accelerating assured software delivery and sustainment for the mission. 
This area focuses on ensuring predictable mission performance in the acquisition, operation, and sustainment of software-reliant systems to expedite delivery of technical capabilities to win the current fight.

Following is a sampling of the SEI’s R&D accomplishments in each of these areas during 2011 with links to additional information about these projects.

Innovating Software for Competitive Advantage

Although the SEI advocates software architecture documentation as a software engineering best practice, the specific value of software architecture documentation has not been established empirically. The blog posting "Measuring the Impact of Explicit Architecture Documentation" describes a research project we conducted to measure and understand the value of software architecture documentation on complex software-reliant systems, focusing on creating architectural documentation for a major subsystem of Apache Hadoop, the Hadoop Distributed File System (HDFS).

The SEI has developed algorithms and tools for optimizing the performance of cyber-physical systems without compromising their safety. The blog posting "Ensuring Safety in Cyber-Physical Systems" describes a safe double-booking algorithm that reduces the over-allocation of processing resources needed to ensure the timing behavior of safety-critical tasks in cyber-physical systems. A subsequent posting describes an algorithm for supporting mixed-criticality operations by giving more central processing unit (CPU) time to functions with higher value while ensuring critical timing guarantees.

Together with researchers at CMU, the SEI has worked to develop cloudlets, which are localized, lightweight servers running one or more virtual machines on which soldiers can offload expensive computations from their handheld mobile devices, thereby providing greater processing capacity and helping conserve battery power.
The blog posting "Cloud Computing for the Battlefield" describes a cloudlet prototype the SEI developed to recognize faces on an Android smartphone. A subsequent posting describes how the SEI is using cloudlets to help soldiers perform other mission capabilities more effectively, including speech and image recognition, as well as decision making and mission planning.

SEI-developed methods and tools allow soldier end-users to program their smartphones to provide an interface tailored to the information they need for a specific mission. The blog posting "A New Approach for Handheld Devices in the Military" motivates the need for soldiers to access information on a handheld device and describes software we are developing to enable soldiers to tailor the information for a given mission or situation. A subsequent blog posting describes the challenges the SEI encountered when equipping soldiers with end-user programming tools.

Other SEI-developed methods and tools help reduce the time and effort needed to re-certify mission- and safety-critical real-time embedded software systems (RTESs) after significant changes have been made, such as migrating a single-core RTES to a multi-core platform, significant code refactoring, or performance optimizations. The blog posting on "Regression Verification of Real-time Embedded Software" focuses on research in applying regression verification (which involves deciding the behavioral equivalence of two closely related programs) to help the migration of RTESs from single-core to multi-core platforms. A subsequent posting describes regression verification tools and techniques that the SEI is building to conduct static analysis of RTESs.

Securing the Cyber Infrastructure

A large percentage of cybersecurity attacks against DoD and other government organizations are caused by disgruntled, greedy, or subversive insiders, employees, or contractors with access to that organization’s network systems or data.
The blog posting "Protecting Against Insider Threats with Enterprise Architecture Patterns" describes work that researchers at the CERT® Insider Threat Center have been conducting to help protect next-generation DoD enterprise systems against insider threats by capturing, validating, and applying enterprise architectural patterns. These patterns can be used to ensure that the necessary agreements are in place (IP ownership and consent to monitoring), critical IP is identified, key departing insiders are monitored, and the necessary communication among departments takes place to mitigate the impact of insider threats.

The SEI has been conducting research to help organizational leaders manage critical services in the presence of disruption by presenting objectives and strategic measures for operational resilience, as well as tools to help them select and define those measures. The blog posting "Measures for Managing Operational Resilience" describes how the SEI has been exploring the topic of managing operational resilience at the organizational level for the past seven years through development and use of the CERT Resilience Management Model (CERT-RMM). The CERT-RMM is a capability model designed to establish the convergence of operational risk and resilience management activities and apply a capability level scale that expresses increasing levels of process performance.

New malicious code analysis techniques and tools being developed at the SEI will better counter and exploit adversarial use of information and communication technologies. The blog posting "Fuzzy Hashing Techniques in Applied Malware Analysis" describes a technique the SEI has developed to help analysts determine whether two pieces of suspected malware are similar. A subsequent posting discusses types of malware against which similarity measures of any kind (including fuzzy hashing) may be applied.
Other blog postings on Learning a Portfolio-Based Checker for Provenance-Similarity of Binaries and Using Machine Learning to Detect Malware Similarity describe our research on using classification (a form of machine learning) to detect "provenance similarities" in binaries, which means that they have been compiled from similar source code (e.g., differing by only minor revisions) and with similar compilers (e.g., different versions of Microsoft Visual C++ or different levels of optimization). Yet another blog posting A New Approach to Modeling Malware using Sparse Representation describes our use of suffix trees, zero-suppressed binary decision diagrams, and sparse representation modeling to create a rapid search capability that allows analysts to quickly analyze a new piece of malware. Advancing Disciplined Methods for Engineering Software Recent SEI research aims to improve the accuracy of early estimates (whether for a DoD acquisition program or commercial product development) and ease the burden of additional re-estimations during a program’s lifecycle. The blog posting Improving the Accuracy of Early Cost Estimates for Software-Reliant Systems describes challenges we have observed trying to accurately estimate software effort and cost in DoD acquisition programs, as well as other product development organizations. A subsequent post explores a method and tools the SEI is developing to help cost estimation experts get the right information into a familiar and usable form for producing high quality cost estimates early in the lifecycle. 
A notable new approach at the SEI combines elements of the SEI’s Architecture Centric Engineering (ACE) method, which requires effective use of software architecture to guide system development, with its Team Software Process (TSP), which is a team-centric approach to developing software that enables organizations to better plan and measure their work and improve software development productivity to gain greater confidence in quality and cost estimates. The blog postings Combining Architecture-Centric Engineering Within TSP and Using TSP to Architect a New Trading System describe how ACE was applied within the context of TSP to develop system architecture to create a reliable and fast new trading system for Groupo Bolsa Mexicana de Valores (BMV, the Mexican Stock Exchange). Over the last several years, the SEI hosted a series of workshops that brought together leaders in the application of measurement and analytical methods in many areas of software and systems engineering. The workshops helped identify the technical barriers organizations face when they use advanced measurement and analytical techniques, such as computer modeling and simulation. The blog posting on Using Predictive Modeling in Software Development: Results from the Field describes the technical characteristics and quantified results of models used by organizations at the workshops. Accelerating Assured Software Delivery and Sustainment for the Mission The SEI has been assisting large-scale DoD acquisition programs in developing systematically reusable software platforms that provide applications and end-users with many net-centric capabilities, such as cloud computing or Web 2.0 applications. The blog posting A Framework for Evaluating Common Operating Environments explains how the SEI developed a Software Evaluation Framework and applied it to help assess the suitability of common operating environments for the U.S. Army. 
Methods and processes that enable large-scale software-reliant DoD systems to innovate rapidly and adapt products and systems to emerging needs within compressed time frames were another area of exploration for the SEI. A series of blog postings details our research on improving the overall value delivered to users by strategically managing technical debt, which involves decisions made to defer necessary work during the planning or execution of a software project, as well as describing the level of skill needed to develop software using Agile for DoD acquisition programs and the importance of maintaining strong competency in a core set of software engineering processes. Teams at the SEI also have been researching common problems faced by acquisition programs related to the development of IT systems, including communications, command, and control; avionics; and electronic warfare systems. A series of blog postings covers acquisition problems, such as misaligned incentives, which occur when different individuals, groups, or divisions are rewarded for behaviors that conflict with a common organizational goal; the need to sell the program, which describes a situation in which people involved with acquisition programs have strong incentives to "sell" those programs to their management, sponsors, and other stakeholders so that they can obtain funding, get them off the ground, and keep them sold; the evolution of "science projects," which describes how prototype projects that unexpectedly grow in size and scope during development often have difficulty transitioning into a formal acquisition program; and the tragedy of common infrastructure and joint programs, which arises when multiple organizations attempt to cooperate in the development of a single system, infrastructure, or capability that will be used and shared by all parties. The SEI also developed a collaborative method for engineering systems with critical safety and security ramifications. 
A series of blog postings on this topic explores problems with safety and security requirements, examines key obstacles that acquisition and development organizations encounter concerning safety- and security-related requirements, and explains how the Engineering Safety- and Security-related Requirements (ESSR) method overcomes these obstacles. Concluding Remarks As you can see from the summary of accomplishments above, 2011 has been a highly productive and exciting year for the SEI R&D staff. Naturally, this blog posting just scratches the surface of SEI R&D activities. Please come back regularly to the SEI blog for coverage of these and many other topics we’ll be doing in 2012. As always, we’re interested in new insights and new opportunities to partner on emerging technologies and interests. We welcome your feedback and look forward to engaging with you on the blog; as always we invite your comments below.

SEI . Blog .  Jul 27, 2015 02:50pm

The Road Ahead for SEI R&D in 2012

By Douglas C. Schmidt, Chief Technology Officer

After 47 weeks and 50 blog postings, the sands of time are quickly running out in 2011. Last week’s blog posting summarized key 2011 SEI R&D accomplishments in our four major areas of software engineering and cyber security: innovating software for competitive advantage, securing the cyber infrastructure, accelerating assured software delivery and sustainment for the mission, and advancing disciplined methods for engineering software. This week’s blog posting presents a preview of some upcoming blog postings you’ll read about in these areas during 2012. Innovating Software for Competitive Advantage The Value-Driven Incremental Development team is creating quantitative engineering techniques to support rapid delivery of high-value, high-quality software capabilities to the DoD. Their approach is based on quality attribute analysis models that guide incremental development so that DoD acquisition program offices will be able to get warfighters the features they need most, when they need them, while balancing speed-of-delivery, quality, value, and cost tradeoffs. The Cyber-Physical Systems team is developing algorithms and verification techniques that enable the DoD to deliver reliable mission-critical capability cost-effectively by automating more of the development and assurance of cyber-physical embedded control systems. Their approach is based on new algorithms for precise and scalable functional analysis of real-time systems by exploiting scheduling constraints, as well as new resource reclamation algorithms for multi-threaded tasks in multi-core processors. The Socio-Adaptive Systems team is establishing a new class of adaptive socio-technical systems wherein people, networks, and computer applications can locally decide how to respond when the demand for resources (network resources in this case) outstrips supply, while ensuring the best global use of whatever capacity is available. 
Their research combines the adaptability of human social institutions—in particular those based in market institutions—with automated network-resource optimization so that scarce tactical network capacity will automatically, continuously, and effectively be allocated to warfighters based on their needs. The Edge-Enabled Tactical System team is improving the quality and relevance of information available to dismounted (edge) warfighters so the information they receive will be more consistent with and useful for their current missions. They are developing model-driven techniques and tools that will enable tactical units (e.g., squads of soldiers) to consume less battery power, computation, and bandwidth resources when performing their missions. Securing the Cyber Infrastructure The CERT Secure Coding Initiative is conducting research to reduce the number of software vulnerabilities to a level that can be mitigated in DoD operational environments. This work focuses on static and dynamic analysis tools, secure coding patterns, and scalable conformance testing techniques that help prevent coding errors or discover and eliminate security flaws during implementation and testing. The CERT Insider Threat team is evaluating techniques for detecting known insider threats prior to attack, to assist the DoD in preventing future high-impact data loss. This work is leveraging the hundreds of cases in the CERT Insider Threat Database, simulation capacity in CERT’s Insider Threat Laboratory, and system dynamics models of insider crime to create the socio-technical architectural foundations to prevent this kind of damage now and into the future. The CERT Coordination Center is developing methods and tools to reduce the cost to DoD suppliers and acquirers of improving software assurance and reliability during development and testing. 
Their aim is to enable these groups to identify software defects via dynamic blackbox "fuzz testing" in a manner identical to what an attacker would be able to perform, to remediate these vulnerabilities before the software is deployed operationally to the DoD. The CERT Malicious Code team is developing tools to analyze obfuscated malware code to enable analysts to more quickly derive the insights required to protect and respond to intrusions of DoD and other government systems. Their approach uses semantic code analysis to de-obfuscate binary malware to a simple intermediate representation and then convert the intermediate representation back to readable binary that can be inspected by existing malware tools. Accelerating Assured Software Delivery and Sustainment for the Mission The Alternative Methods group is researching methods for increasing adoption of incremental development methods to accelerate delivery of software-related technical capabilities while reducing the cost, acquisition time and risk of major defense acquisition programs. Their approach focuses on developing a contingency model that identifies conditions and thresholds for when and how to use incremental development approaches in a DoD acquisition context. They are also documenting incremental development patterns and guidelines that chart the course for removing barriers to effective adoption of incremental and iterative approaches in the DoD. The Acquisition Dynamics team is evaluating methods that mitigate the effects of misaligned acquisition program organizational incentives and adverse software-reliant acquisition structural dynamics by improving program decision-making. Their objective is to help DoD acquisition programs overcome some of the most severe counter-productive behaviors that stem from inherent social dilemmas by using known solutions drawn from fields such as behavioral economics, and thus deploy higher-quality systems to the field in a more timely and cost-effective manner. 
Advancing Disciplined Methods for Engineering Software The Software Engineering Measurement and Analysis group is developing methods and tools for modeling uncertainties for pre-milestone A cost estimates to minimize the occurrence of severe acquisition program cost overruns due to poor estimates. Their approach involves synthesizing Bayesian belief network modeling and Monte Carlo simulation to model uncertainties among program change drivers, allow subjective inputs, visually depict influential relationships and outputs to aid team-based model development, and assist with the explicit description and documentation underlying an estimate. Concluding Remarks This concludes our blog postings for 2011. It’s been my great pleasure and privilege to work with the technical staff at the SEI this year to better acquaint you with the SEI body of work. We’ve enjoyed reading your comments and hope that you’ve learned more about the R&D activities that we’re pursuing. We wish all of you a happy holiday season and look forward to hearing from you in 2012.

SEI . Blog .  Jul 27, 2015 02:49pm

Modeling Malware with Suffix Trees

By Will Casey, Senior Researcher, CERT

Through our work in cyber security, we have amassed millions of pieces of malicious software in a large malware database called the CERT Artifact Catalog. Analyzing this code manually for potential similarities and to identify malware provenance is a painstaking process. This blog post follows up our earlier post to explore how to create effective and efficient tools that analysts can use to identify malware. At the heart of our approach are longest common substring (LCS) measures, which describe the amount of shared code in malware. In this post we explain how to create measures for similarity studies on malware via a suffix tree, which is a data structure that encodes an entire map of shared substrings in a malware corpus, such as the CERT Artifact Catalog. We characterize the performance of suffix trees and quantify their dependence on memory and input size. We also demonstrate the efficient construction of suffix trees for large malware data sets involving thousands of files. In addition, we compare LCS measures to the laborious and time-intensive process of manually creating signatures (which are regular expressions applied to the binary that are thought to be both specific to and indicative of malware). Building the Suffix Tree By building a suffix tree data structure for the CERT Artifact Catalog we can form a better representation of the malware corpus for studies of malware involving string query, shared string usage, and string similarity. Having uncharacterized data is like being in unexplored, unmapped territory. A suffix tree allows analysts to explore and map the malware landscape. Shared code becomes the topographical features of the mapped landscape. As travelers use a map and the landscape features to reason about where they are and where they want to go, so do malware analysts study the large shared substrings of the suffix tree to reason about what areas to focus on. 
For example, multiple malware pieces from the Zeus malware family have code in common and provide a means to explore and analyze the entire family of malware. A suffix tree can be built in time linear to the size of the input, allowing us to identify any long common substrings in linear time. We augmented the conventional suffix tree data structure and algorithm to include queries based on subsets of files and measures of information (such as Shannon-entropy) on shared strings. To scale our suffix tree data structure to large data sets we also developed external algorithms that operate efficiently beyond the capacity of main memory in a single computer. Using the Suffix Tree to Create an LCS Measure for Similarity Studies on Malware After constructing the suffix tree, we used it to analyze different families of malware, including the Poison Ivy malware family that installs a remote access tool onto an exploited machine. Poison Ivy files were collected by CERT from 2005 to 2008. Although this family of malware is no longer thought to be in active development, analysts have examined it extensively. We used Poison Ivy files as a test set to validate findings from our data structures. For example, we applied clustering based on LCS and compared it to a "ground truth" of known subgroups within the Poison Ivy family. Our suffix tree data structure enabled us to identify several LCSs that were common to many files in the Poison Ivy family. By quickly filtering out strings of low entropy, we were left with meaningful coding sequences from which we can determine sequences that are characteristic of the malicious software family. Validating the Measure After analyzing the code using the suffix trees, we compared our results against signatures that were developed over the course of several years of extensive examination by analysts. 
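The two ideas above, finding long substrings shared between files and discarding those with low information content, can be sketched in a few lines. This is only an illustration under stated assumptions, not the CERT implementation: it swaps the suffix tree for a simple quadratic dynamic-programming search over two byte strings, and the function names, sample bytes, and entropy threshold are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def longest_common_substring(a: bytes, b: bytes) -> bytes:
    """Longest substring shared by a and b.

    Uses an O(len(a) * len(b)) dynamic program, NOT a suffix tree;
    a suffix tree would find the same answer in linear time.
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Hypothetical sample "files": shared padding plus a shared code-like string.
sample_a = b"\x00" * 8 + b"connect_c2_server" + b"AAAA"
sample_b = b"xyz" + b"connect_c2_server" + b"\x00" * 4

shared = longest_common_substring(sample_a, sample_b)
MIN_ENTROPY = 2.0  # illustrative threshold, not a value from the post
is_meaningful = shannon_entropy(shared) >= MIN_ENTROPY
```

A run of identical padding bytes has zero entropy and would be filtered out, whereas a shared string with varied bytes passes the threshold. The quadratic search is fine for a toy pair of inputs but would not scale to a corpus, which is the motivation for the linear-time suffix tree.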
We used suffix trees to identify several critical substrings that matched identically across multiple files, exceeded a certain length, and had satisfactory information content. These landmark substrings were then used to create a feature-vector for each file; these feature-vectors were used to cluster the files into subgroups. We then created dendrograms that suggested relationships among the files based on co-location of long common substrings, as shown in the following diagram. To validate the clusters, we revisited the Poison Ivy files and used the signatures that had been developed by analysts to identify versions in the software. Our evaluation showed that the LCS clustering produced groupings consistent with signatures that were developed by analysts, in many cases exposing additional sub-groups that we were unaware of. Moreover, the LCS clustering can group corrupted files and identify potential incorrect attributions. Results of our Research We used suffix trees to analyze approximately 200 to 1,000 files in about four hours and identify additional details on the structure of the family that analysts could not access via manual inspection alone. Unfortunately, people often view automated methods as a means to replace human analysis. The goal of our research, however, is to use suffix trees to bring computing to bear more effectively on the problem of distinguishing malware from clean-ware. For example, malware may have components that resemble more than one family. Our new tool may allow us to identify those components of malware, as well as those that set off a command-control interface or an element that may install a remote access tool. Future Work In the past year, our research has focused on creating the suffix tree data structures and ensuring that they can provide us with useful information about malware families. 
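The feature-vector step described above can be illustrated with a small sketch. It assumes that each file is represented by a binary vector recording which landmark substrings it contains, that files are compared by Jaccard distance, and that files closer than a threshold are merged into one group. The post does not state which distance or clustering method was used, so these choices, along with all names and sample data, are assumptions. The sketch produces flat groups rather than a full dendrogram.

```python
from itertools import combinations


def feature_vector(data: bytes, landmarks: list) -> tuple:
    """One 0/1 entry per landmark substring: 1 if the file contains it."""
    return tuple(1 if lm in data else 0 for lm in landmarks)


def jaccard_distance(u: tuple, v: tuple) -> float:
    """1 - |shared landmarks| / |landmarks present in either file|."""
    both = sum(1 for x, y in zip(u, v) if x and y)
    either = sum(1 for x, y in zip(u, v) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def cluster(vectors: dict, threshold: float) -> list:
    """Single-linkage grouping: merge files whose distance is <= threshold."""
    clusters = [{name} for name in vectors]
    for a, b in combinations(vectors, 2):
        if jaccard_distance(vectors[a], vectors[b]) <= threshold:
            ca = next(c for c in clusters if a in c)
            cb = next(c for c in clusters if b in c)
            if ca is not cb:
                clusters.remove(cb)
                ca |= cb
    return clusters


# Hypothetical landmark substrings and file contents.
landmarks = [b"PIVY_INIT", b"keylog", b"screen_cap"]
files = {
    "f1": b"..PIVY_INIT..keylog..",
    "f2": b"PIVY_INIT--keylog--",
    "f3": b"screen_cap only",
}
vectors = {name: feature_vector(data, landmarks) for name, data in files.items()}
groups = cluster(vectors, threshold=0.5)
```

Here `f1` and `f2` share both of their landmarks and fall into one group, while `f3` shares none and stays alone. Recording the order in which groups merge as the threshold grows is what would turn this into a dendrogram.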
Our next steps are to scale the data structures to larger data sets and optimize them to allow for even larger input size. We are currently able to build a data structure from approximately 8,000 files. Ideally, we would like to optimize the data structures and algorithms (exploiting parallelism) to include between 80,000 and 100,000 files, the size of which can exceed the main memory of a single computer. Additional Resources For further reading about CERT Program work in malware or malicious code research, see the following SEI Blog postings:
A New Approach to Modeling Malware using Sparse Representation
Using Machine Learning to Detect Malware Similarity
Fuzzy Hashing Techniques in Applied Malware Analysis
Learning a Portfolio-Based Checker for Provenance-Similarity of Binaries
More information about CERT research and development is available in the 2010 CERT Research Report, which may be viewed online at www.cert.org/research/2010research-report.pdf

SEI . Blog .  Jul 27, 2015 02:49pm

The Need to Specify Requirements for Off-Nominal Behavior

By Donald Firesmith, Senior Member of the Technical Staff, Acquisition Support Program

In our work with acquisition programs, we’ve often observed a major problem: requirements specifications that are incomplete, with many functional requirements missing. Whereas requirements specifications typically specify normal system behavior, they are often woefully incomplete when it comes to off-nominal behavior, which deals with abnormal events and situations the system must detect and how the system must react when it detects that these events have occurred or situations exist. Thus, although requirements typically specify how the system must behave under normal conditions, they often do not adequately specify how the system must behave if it cannot or should not behave as normally expected. This blog post examines requirements engineering for off-nominal behavior. Examples of off-nominal behavior that are inadequately addressed by requirements specifications include how robust (i.e., error, fault, and failure tolerant) must the system be, how the system must behave when hardware fails or software defects are executed, how the system must react when incorrect data (e.g., out of range or incorrect data type) is input, and what should happen if the system detects that it is in an improper mode or inconsistent state. This lack of requirements specification can lead to the following omissions and questions that must be asked as a result. All credible conditions and events. How must the system behave under off-nominal sets of preconditions and trigger events that are unlikely and/or infrequent? When these conditions occur, as they invariably will, there is a risk that the system either does not handle them or the developers have been forced to guess (often incorrectly) how the system must behave. The requirements therefore need to specify how the system shall behave under all credible combinations of conditions and trigger events. 
Moreover, how are combinations of rare conditions and events determined to be not credible? Users and requirements engineers often underestimate the probability of rare occurrences, so they are surprised when they occur and the system reacts improperly. If these off-nominal conditions and the desired behavior of the system to them are not identified and documented early in the lifecycle, the decisions about what error/fault conditions should be handled by the system are left to individuals who may not have the proper expertise to identify such conditions, but who nevertheless feel compelled to make such decisions. Detecting off-nominal situations. How will the system recognize off-nominal combinations of conditions and events? Does the system need sensors to determine the existence of these states or occurrence of these events? How available, reliable, accurate, and precise must these sensors and inputs be? Reacting to off-nominal situations. How must the system react when it recognizes an off-nominal combination of conditions (possibly when a specific, associated event occurs)? Must it notify users or operators by providing warnings, cautions, or advisories? Must it do something to ensure that the system remains in a safe or secure state? Must the system be able to shut down in a safe and secure state or must it automatically restart? Must it record abnormal situations and the responses of the users/operators? Incomplete use case models. Use case modeling is the most common requirements identification and analysis method for functional requirements. Each use case has one or more normal (so-called "sunny day") paths (a.k.a., courses and flows) as well as several exceptional ("rainy day") paths. Unfortunately, requirements engineers often concentrate so heavily on normal paths that there is inadequate time and staffing to properly address the credible exceptional paths. 
This omission leads to incomplete requirements specifications that do not adequately address necessary robustness (e.g., error, fault, and failure tolerance), reliability, safety, and security. Coding standards. Programming languages typically include features and reusable code (e.g., base classes that come with the language) that are inherently unreliable, unsafe, and insecure. Because language features may not be well-defined in the language specification, their behavior may be inconsistent. For example, the use of concurrency and automated garbage collection can lead to common defects, such as race conditions, starvation, deadlock, livelock, and priority inversion. Likewise, certain language features may well be used in an incomplete manner. For example, an if/then/else clause may not contain an else clause stating what to do if the if clause precondition is not true. Similarly, a do X followed by do Y may not say what to do if X fails to complete. There are other cases such as divide by zero situations, taking the square root of negative numbers, and a lack of strong typing, as well as no verification of inputs, preconditions, invariants, post conditions, and outputs. These implementation coding defects typically start as requirements defects: incomplete requirements that do not mandate the use of reliable, safe, and secure subsets of the language, safe base classes, and automatically verified coding standards. Lack of subject matter expertise. Exception handling is often left to the programmers, who must ensure their software is error, fault, and failure tolerant and meets its requirements. Programmers will be blamed if defects prevent the system from being available, reliable, robust, safe, and secure, even if there are no relevant requirements. Unfortunately, programmers often make assumptions as to what the software’s off-nominal behavior should be. 
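The coding-level gaps listed above (a missing else branch, an unchecked input type, a divide by zero, a step whose failure is never handled) can be made concrete with a short sketch. The function names and the chosen reactions, such as raising an exception rather than returning a default, are assumptions made for illustration. In a real program the required reaction to each off-nominal case should come from the requirements, which is the point of this post.

```python
import math


class OffNominalInput(ValueError):
    """Raised when an input violates a stated precondition."""


def average_rate(total: float, elapsed_s: float) -> float:
    """Return total / elapsed_s, with every off-nominal case handled explicitly."""
    for name, value in (("total", total), ("elapsed_s", elapsed_s)):
        # Reject wrong types (bool is excluded even though it subclasses int).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise OffNominalInput(f"{name} must be a number")
        # Reject NaN and infinity, which would silently poison later results.
        if not math.isfinite(value):
            raise OffNominalInput(f"{name} must be finite")
    if elapsed_s > 0:
        return total / elapsed_s
    elif elapsed_s == 0:
        raise OffNominalInput("elapsed_s is zero; the rate is undefined")
    else:
        # The explicit final branch: negative time is never silently accepted.
        raise OffNominalInput("elapsed_s is negative")


def run_sequence(step_x, step_y, on_failure) -> bool:
    """Do X, then Y; state what happens if X fails instead of leaving it open."""
    try:
        step_x()
    except Exception as exc:
        on_failure("X", exc)
        return False  # Y is deliberately skipped when X does not complete
    try:
        step_y()
    except Exception as exc:
        on_failure("Y", exc)
        return False
    return True
```

Whether a zero elapsed time should raise, return zero, or trigger an operator warning is exactly the kind of decision a programmer should not have to guess.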
Without adequate domain expertise and sufficient contact with subject matter experts, programmers will incorporate defects and safety/security vulnerabilities. Likewise, poor quality requirements specifications show how requirement engineers struggle to address mandatory off-nominal requirements since they lack sufficient domain expertise and training to determine, analyze, and specify adequate availability, reliability, robustness, safety, and security requirements. Ultimately, the engineering of these quality requirements requires subject matter expertise that is rarely combined in any one developer. There are often many more off-nominal (and rare) combinations of conditions and events than the common nominal ones. There are also often many more ways that a system can fail. Requirements specifications are typically incomplete with regard to the previous problems, often only including 10 to 30 percent of the necessary requirements. This level of incompleteness can result in systems that fail to meet their true availability, reliability, robustness, safety, and security requirements. It is insufficient for requirements specifications to state that the system shall be highly available, reliable, robust, safe, and secure, or that it has no single points of failure. The requirements must specify all credible off-nominal combinations of conditions and events. Otherwise, software developers will make incorrect guesses, have incorrect assumptions, and ignore important off-nominal situations. Without complete requirements, verification will not catch these defects, and the resulting defective system will be fielded with highly unfortunate, if predictable, results. Because program offices cannot safely assume that the contractor will automatically address these issues, programs cannot safely leave it up to their contractors. Off-nominal situations must be properly addressed in the requirements. 
Studies (Knight, Weiss, Leveson) have shown that the vast majority of accidents (safety) and many common software vulnerabilities (security) result at least partially from incomplete requirements. Many availability and reliability defects due to software also result, at least partially, from incomplete requirements. Recommended Solutions To address the problems described above, acquisition program offices should consider the following steps: Address off-nominal requirements in the contract. The program office should contractually mandate all significant off-nominal behavior. The contract should also mandate that contractors address all credible off-nominal conditions and events affecting mission, safety, and security functionality. To address all credible conditions and events, the program office should ensure that the contractor’s requirements engineering plan explicitly states that all credible combinations of conditions and events are to be addressed, even very rare ones, if the corresponding function is mission, safety, and/or security critical. The program office should verify that the requirements engineers collaborate with reliability, safety, and security engineers to ensure that no significant combinations of states and events are overlooked. The program office should also ensure that the proper testing of the associated software in terms of test completion criteria and test case generation criteria is explicitly addressed in the software test plans. To detect off-nominal situations, the program office should ensure that the contractors have properly addressed the detection of off-nominal situations. Specifically, this includes verifying that the system engineering management plan (SEMP) as well as the system requirements, architecture, and design address situational awareness in terms of both sensors and the input of necessary data concerning off-nominal situations. 
When reacting to off-nominal situations, the program office should ensure that the requirements address how the system must behave if it cannot behave in the nominal manner. This process includes ensuring that the system either remains in a safe and secure state or shuts down safely and securely (i.e., is fail-safe). It also includes notifications (warnings, cautions, and advisories) as well as logging any associated error, fault, or failure information. To ensure against incomplete use case models, the program office should ensure that they include all credible normal and exceptional use case paths, and also ensure that adequate project schedule, budget, and staffing are allocated to complete the models. Verification of the requirements and their associated models should explicitly address exceptional as well as normal use case paths. With respect to contractor coding standards, the program office should ensure they explicitly address eliminating common design and coding defects that make the software less available, reliable, robust, safe, and secure. The program office should also ensure these coding standards are properly followed including, where practical, automatic verification via static and dynamic code checking. With respect to subject matter expertise, the program office should ensure that associated quality requirements mandating adequate availability, reliability, robustness, safety, and security are engineered by cross-functional teams of closely collaborating requirements engineers, subject matter experts, stakeholders, and engineers specializing in reliability, safety, and security. This team must identify the appropriate credible off-nominal situations and decide which of these situations should be analyzed and turned into associated requirements given programmatic constraints such as cost, schedule, available development staffing, and critical functionality. 
The program office should also ensure that the contractors and subcontractors use appropriate coding standards and associated foundational software (e.g., a safe and secure subset of C++, including safe and secure base classes). Most acquisition programs suffer from incomplete requirements, especially with regard to dealing with rare combinations of states and events, detecting and reacting to off-nominal situations, use-case models that are incomplete due to missing exceptional use case paths, and either inadequate coding standards or coding standards not being followed. The engineers who actually develop the software often lack adequate expertise in availability, reliability, robustness, safety, and security requirements, which yields systems that do not meet their associated requirements. While the problems are well known, so are their answers. Program offices, therefore, must ensure that these answers are implemented, enforced, and verified to be effective and efficient. Additional Resources: To read Don Firesmith’s series on The Importance of Safety- & Security-Related Requirements, please visit http://blog.sei.cmu.edu/archives.cfm/category/safety-related-requirements

SEI . Blog .  Jul 27, 2015 02:49pm
