Marimba! Marimba! MARIIIIIMBAAAAAAAA! My phone kept ringing wildly as I approached the movie theater this past Thanksgiving. As many of you have guessed by now, I love the movies and I was on my way to see Horrible Bosses 2 (horrendous, I know) when I was taken down a different road by my cousin back in Miami. I answered and he uttered, "You won’t believe the story I have for you. This is absolute MADNESS!" ...
SHRM Blog, Jul 27, 2015 01:15pm
By Suzanne Miller, Principal Researcher, Software Solutions Division
In 2010, the Office of Management and Budget (OMB) issued a 25-point plan to reform IT that called on federal agencies to employ "shorter delivery time frames, an approach consistent with Agile" when developing or acquiring IT. OMB data suggested Agile practices could help federal agencies and other organizations design and acquire software more effectively, but agencies needed to understand the risks involved in adopting these practices. Two years later, OMB directed agencies to consider Agile development in its 2012 contracting guidance. As organizations work to become more agile, they can employ the 12 principles outlined in the Agile Manifesto to assess progress. I work with a team of researchers at the SEI who explore the barriers and enablers to applying Agile in government settings. We have found that each of these principles plays out differently in the federal landscape. While some principles are a natural fit, others are harder to implement. This blog post introduces a series of discussions recorded as podcasts about the application (and challenges) of the 12 Agile principles across the Department of Defense (DoD).
First Agile Principle: Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.
Below is an excerpt from our podcast:
Mary Ann: The problem I see is for those in the field, if this is already a fielded system, or even if it’s being developed, they can’t deal with that frequent of a release. So, they lump releases together, and then they send them out every 8 months, 9 months, 18 months, whatever it is that works for them from a deployment viewpoint.
Suzanne: So, even though we’re producing valuable software as a program, we may not actually get to deliver it, in the way that commercial settings might be able to deliver it.
Mary Ann: Correct.
Suzanne: I’ve seen what we call a "sandbox" as one of the ways that we deal with that. So, in a sandbox setting, I will put each iteration’s software into what we call "a sandbox area." That area could allow user access, but it isn’t a full deployment.
To listen to the complete podcast, please click here.
Second Agile Principle: Welcome changing requirements, even late in development. Agile processes harness change for the customer’s competitive advantage.
Below is an excerpt from our podcast:
Suzanne: One of the ways to get it out to the field faster is to not presuppose that all of the requirements that we think of at the beginning have to be designed and implemented. We need to prioritize them in a way that allows us to get something out there that the people can try, so the learning can occur.
Mary Ann: One of the key things, if you’re going to use Agile methods, is have enough definition up front of what you want to do, but not so much detail that you can’t learn, that it can’t change, because your environment changed.
To listen to the complete podcast, please click here.
Third Agile Principle: Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.
Below is an excerpt from our podcast:
Suzanne: What we see in a lot of acquisitions is we get lots of deliveries, but the early deliveries are more documentation, more review meetings, more review meeting slide decks. That’s a really different focus for a lot of people in the contract settings where, you know, working software comes after all that stuff is delivered. So this is saying, Don’t wait to actually deliver working software. So, you’ve got to really go from a document-centric lifecycle, and view of the world, to an implementation-centric focus.
Mary Ann: That’s true.
Suzanne: And that’s a culture change for the acquisition systems also, isn’t it?
To listen to the complete podcast, please click here.
Fourth Agile Principle: Business people and developers must work together daily throughout the project.
Below is an excerpt from our podcast:
Suzanne: So, from their viewpoint, business people are essentially the marketing people that understand what the market is, what market they are trying to penetrate. It’s the end users who are actually going to use the product. So, they are looking at business people in a little different way than we do in the DoD. So, they don’t necessarily have quite that same difference between he who pays for it or she who pays for it and the person that’s using it.
Mary Ann: And, the thing is with the DoD environment, you obviously have to have the acquirers because those are the people that are trained to do the acquisition. And, they have the warrants, if you will: the permissions and the legal authorities to do it. However, they need to be working with the end users. And, typically, in a traditional environment, they do. They go out. They gather all the information from all the different end users, and they can be multiple groups.
To listen to the complete podcast, please click here.
Fifth Agile Principle: Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.
Below is an excerpt from our podcast:
Mary Ann: …Trust, but verify. That’s very much the moniker of today. However, some organizations trust more than others. In many cases trust is just not there at all. In that environment it would be very difficult to do an agile-type of development without a lot of change in culture.
Suzanne: So, you and I have both seen some settings where a development organization, usually a contractor, is trying to use Agile principles. Yet, at the same time, they are being asked to do, at least the same if not more, documentation than they had in the past. They are asked for the detailed team metrics not just the typical management metrics. They are being asked for a lot of information that would make you think they are not very trusted. In those settings we’ve seen not very good success with Agile methods. I would assert that that’s actually one of the reasons: that there isn’t a feeling of trust.
To listen to the complete podcast, please click here.
Sixth Agile Principle: The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.
Below is an excerpt from our podcast:
Suzanne: If the primary means of communication is phone only, without some other screen sharing, without some other way of understanding what’s going on or tons of email and not a lot of even voice communication, you really are reducing the bandwidth, and you are going to reduce the ability of the team to deal with problems and to deal with the issues that inevitably come up because that’s when you need people to have your back. This is a principle that, in my mind, you can get support for, I think, better than maybe some of the other principles, but you’ve got to ask for it. You’ve got to know that that’s what you need to pay attention to, and you’ve got to pay attention to it.
Mary Ann: Well, and getting the support may require a little bit of investment in infrastructure because not everybody will have those tools available.
To listen to the complete podcast, please click here.
Seventh Agile Principle: Working software is the primary measure of progress.
Below is an excerpt from our podcast:
Suzanne: There is a huge amount of time where you are working on design documents, working on requirements refinement, working on interface descriptions, working on everything, working on essentially everything but the software itself. So, this mindset that working software essentially takes precedence over some of the other artifacts that we are accustomed to. This is a very big shift for our DoD audience.
Mary Ann: It is a big shift. The other thing that this makes me think of when you start talking about measuring, most of the very large systems by mandate are required to do earned value management. It is a fairly rigid system where everything’s defined and you don’t change things. But, if you’re going to do it on working software…
Suzanne: …and, if you’re allowed to change requirements at the lower level.
Mary Ann: …and, you’re allowed to change things, then how does that go? That has all kinds of implications on how you do that.
To listen to the complete podcast, please click here.
Eighth Agile Principle: Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.
Below is an excerpt from our podcast:
Mary Ann: There is one program we are aware of (they are into mostly what we would probably consider more sustainment): the software is already out in the field, but they are enhancing it. They are updating it and fixing bugs and so forth. Every cycle they do, they do it by release. They have 1,200 (what they call) "program points," and their users know they have 1,200 points, and so they know how much each of their wish-list things are worth. There is a lot of horse trading going on behind the scenes, so they get their 1,200 points, but they get what they need. But, they know we only can get 1,200 points, no more, no less, and it is constant.
Suzanne: Part of this is setting up reasonable expectations with the end users and the customer community as well as the sponsor community. If there is a good understanding between the developers and the sponsors of the project as to what actually can be accomplished, then we have got a chance at sustainable development.
To listen to the complete podcast, please click here.
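As a rough illustration of the fixed-capacity planning described in the excerpt above, the sketch below fills a release from a prioritized wish list until a 1,200-point budget is used up. The item names, point costs, and the `plan_release` helper are hypothetical; the excerpt does not say how the program actually selects work, only that the budget per release is constant.

```python
from typing import List, Tuple

RELEASE_BUDGET = 1200  # "program points" per release, as in the excerpt


def plan_release(wish_list: List[Tuple[str, int]], budget: int = RELEASE_BUDGET):
    """Pick items in priority order, skipping any that no longer fit.

    wish_list is assumed to be sorted by user priority, highest first.
    Returns (selected item names, points left unused).
    """
    selected = []
    remaining = budget
    for name, points in wish_list:
        if points <= remaining:
            selected.append(name)
            remaining -= points
    return selected, remaining


# Hypothetical wish list: (item, estimated points)
wish_list = [
    ("fix login defect", 300),
    ("new report export", 700),
    ("map overlay", 400),
    ("update help text", 150),
]
selected, unused = plan_release(wish_list)
print(selected, unused)
```

Because the budget never changes, any trade-off among users shows up as a reordering of the wish list rather than as pressure on the team to take on more work.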
Ninth Agile Principle: Continuous attention to technical excellence and good design enhances agility.
Below is an excerpt from our podcast:
Suzanne: We talk about Agile allowing good teams to really have superior performance. That is one of the basic aspects of Agile; there is an assumption that the people on the team are competent at what they do. So, you have got to have that competency for coding. You have got to have some competency for design, for detailed design. You have got to have some competency in unit testing. You have got to have some competency in integration. So, there are assumptions about what kinds of things your cross-functional team is capable of, and those are things that if you have them, it will enhance what you are able to do. If you don’t have them, you are going to build technical debt. You are going to build in defects….
Mary Ann: It gets into the question that a lot of people say, Well, a good Agile team has to be very highly skilled. Bring your A game, if you will, your A players. An average kind of guy really won’t play well in an Agile team. Well, that’s not true. People will rise to what you expect them to do.
To listen to the complete podcast, please click here.
Tenth Agile Principle: Simplicity--the art of maximizing the amount of work not done--is essential.
Below is an excerpt from our podcast:
Mary Ann: It means maximizing on your return on investment, and getting the value you need, and then determining if some of the bells and whistles aren’t needed.
Suzanne: We are also talking about the difference between creating complex architectures and complex ways of solving a problem and looking at, What is the simplest way? Often, when we say, What is the simplest way to do something? we actually stop having to do a lot of the work that goes along with implementing the complex way. So, the simplicity is not just about looking at getting value, it is also about reducing complexity. We are very good as engineers in figuring out lots of convoluted ways to make things work. So, this is really saying, Don’t go there if you don’t need to. I go back to the old Einstein quote, Make everything as simple as possible; but no simpler.
To listen to the complete podcast, please click here.
Eleventh Agile Principle: The best architectures, requirements, and designs emerge from self-organizing teams.
Below is an excerpt from our podcast:
Mary Ann: When you hear the principle best architectures, requirements and designs emerge from self-organizing teams, any self-respecting DoD manager would run screaming from the room. What do you mean emerge from a self-organizing team? Oh my gosh! You have to understand what those terms mean. That is the key to understanding what this means. It doesn’t mean chaos and people are just saying, Go do something. Heavens no. Self-organizing means you give the team boundaries. And, say, OK. Here is the problem. You give them an initial skeleton, if you will, of an architecture. You don’t just say, Go make one up. It is stuff that people would call maybe sprint zero, depending on who you talk to. Then, you let them go solve the problem.
Suzanne: Let’s talk a little bit about the architecture thing because that, in the larger Agile community, has been a topic of debate for some years. Although I think we are seeing some resolution of that where most Agile teams in commercial settings are starting to acknowledge the role of architecture, not just emergent architecture but actually some design up front. They call it just enough design up front as opposed to big design up front.
To listen to the complete podcast, please click here.
Twelfth Agile Principle: At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.
Below is an excerpt from our podcast:
Suzanne: One of the things you often have when you get into DoD settings is to have multiple teams running. So, one of the things that teams in DoD settings have to be aware of is improvements locally to their own work may affect others. Usually at release time is when you will do larger retrospectives where you look across the teams that are working on a release and get everybody together and say, What do we need to change as a whole group, not just as an individual team?
Mary Ann: That is true. Release time is when they usually do that, but you might want to consider doing it at the end of iterations or sprints. Because, for instance, say you have three or four different teams. And, team A and team B worked really well together, and they had some kind of cool technology they were using. But, team A and C didn’t have that, but they were doing similar kinds of interfaces. They might want to identify that and say, Why don’t we use this for our interface too or our tool? That way you are not waiting until the very end of the release to upgrade the whole group as opposed to just your team. It gets a little more complicated but it’s like Scrum of Scrums.
Suzanne: This is one of the roles of a Scrum Master, to help identify these places where as the different Scrum Teams come up with things, the Scrum Masters get together and help identify where these opportunities are. That is one of the important things about this: tuning, this idea of tuning our work, is about taking advantage of the learning in the moment. It is really about that plan-do-act-check cycle over and over and over again and getting people accustomed to looking at their work from that viewpoint.
To listen to the complete podcast, please click here.
We welcome your feedback on this series as well as ideas for future topics.
Additional Resources
This series of podcasts exploring the application of the 12 Agile principles across the Department of Defense is available in its entirety at sei.cmu.edu/podcasts/agile-in-the-dod.
SEI Blog, Jul 27, 2015 01:15pm
Simon Hurst, User Researcher
My name is Simon and I’m the user researcher on the Personal Independence Payment (PIP) Digital project. It’s my job to make sure the team is fully aware and appreciative of who our users are.
As Leisa Reichelt wrote recently:
"You are not your user and you cannot think like a user unless you're meeting users regularly"
Our users aren’t ‘just’ statistics either:
They aren’t ‘just’ one of several hundred thousand PIP claimants
They aren’t ‘just’ one of the 11 million people in the UK with a long term health condition or disability
They aren’t ‘just’ part of the statistic that 21% of children in families with at least one disabled member are in poverty.
Meeting user needs during user testing
When we’re designing services that meet the needs of DWP’s customers, we meet our users regularly. For PIP, we take extra steps to give our users the right level of support when we invite them to test our services. Our users may need to bring a carer or family member; they might need to take breaks during a user testing session; our users might need to test the service with assistive software or on their own devices; and they may prefer us to work with charities who are already supporting them, or for us to visit them in their own home - we’re able to meet all these needs.
Understanding our users’ lives
Our users are real people, with real lives, families, friends and goals. They are John. I met John and his daughter recently at a user research session. John gave me permission to tell his story and when he saw this blog he said "I’m amazed at how well you’ve captured my story".
John is 59 and a father and grandfather. John used to be a goalkeeper in a football league club in his youth. In one game, John saved two penalties for the youth team. The opposition striker wasn’t best pleased and kicked him in the back, bursting two of his vertebrae. This ended his chance to be a professional footballer, and also made it difficult to pursue his second career choice, as a joiner, so he set up a successful business. The injury resulted in the degeneration of his nerves, and he knew it would deteriorate throughout his life. The rheumatoid arthritis he developed in his teens didn’t help.
In the last few years this has prevented John from working. He had to sell the business he spent over 20 years building and that he loved being a part of. Selling his business meant he lost the lifestyle he took for granted - John had to get rid of his car, downsize his house and stop taking the holidays that meant so much to him and his wife.
This affected his mental health so much that he became irritable and aggressive; he’d have "argued with a brick wall". After his heart attack John realised he needed to get help and was referred to a psychiatrist, who helped a great deal.
John was too proud to apply for Disability Living Allowance (DLA). He was having difficulty coming to terms with the fact that he could no longer do a job he loved, and he felt that applying for DLA would be like "admitting and accepting his condition". His wife and his daughter had to apply for him.
He’s now concerned that because he loses the feeling in his hands he can’t be sure he isn’t gripping his grandson’s hand too tightly or not tightly enough. John worries that he’ll hurt his grandson, or not hold him tightly enough when they’re walking near the road.
To quote Leisa again: "needs can be functional things people need to do, for example, to check eligibility. Needs can also be emotional, perhaps people are stressed and anxious and they need reassurance."
Designing services to meet users’ needs
We can use John’s story to design a better service. We can find out what he needs to do and then try our best to get out of his way. We can make sure our designers and developers are building a service that John can use. We can work to ensure the language and tone support and reassure John and that the questions we are asking him are clear and transparent.
We know from research that when people are completing an application for PIP they can get tired, or they need to take a break to take some medicine, or they lose concentration. People also want to ‘sleep on it’: they have a first go at their answers, then rework them over several days. And they really want to keep a copy of what they’ve sent us. Knowing this, we can design a service that takes these needs into account and allows people to save their application.
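One way to picture the save-and-return behaviour described above is a small draft store. This is only a sketch under stated assumptions: the `DraftStore` class, its method names, and the in-memory dictionary are hypothetical and are not the PIP digital service's actual design.

```python
import copy
from typing import Dict, Optional


class DraftStore:
    """Keeps partially completed applications so a claimant can return later."""

    def __init__(self) -> None:
        self._drafts: Dict[str, Dict[str, str]] = {}

    def save(self, claimant_id: str, answers: Dict[str, str]) -> None:
        # Merge new answers into any existing draft so earlier work is kept.
        draft = self._drafts.setdefault(claimant_id, {})
        draft.update(answers)

    def resume(self, claimant_id: str) -> Optional[Dict[str, str]]:
        # Return a copy so the caller cannot change the stored draft by accident.
        draft = self._drafts.get(claimant_id)
        return copy.deepcopy(draft) if draft is not None else None


store = DraftStore()
store.save("claimant-1", {"q1": "first attempt"})
store.save("claimant-1", {"q1": "reworked answer", "q2": "new answer"})
draft = store.resume("claimant-1")
```

Merging on each save means someone who takes a break, or reworks an answer a few days later, never loses what they have already written.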
We’ll continue to meet our users regularly as we design the PIP digital service. We share our findings with people involved in delivering the service too, to help them to understand John’s life and how we need to design a service that helps John and all users to get to their goal quickly and easily.
DWP Digital Blog, Jul 27, 2015 01:15pm
By Tim Palko, Senior Member of the Technical Staff, CERT Cyber Security Solutions Directorate
This post is the latest installment in a series aimed at helping organizations adopt DevOps.
Some say that DevOps is a method; others say it is a movement, a philosophy, or even a strategy. There are many ways to define DevOps, but everybody agrees on its basic goal: to bring together development and operations to reduce risk, liability, and time-to-market, while increasing operational awareness. Long before DevOps was a word, though, its growth could be tracked in the automation tooling, culture shifts, and iterative development models (such as Agile) that have been emerging since the early 1970s. While its community-driven evolution has given DevOps strength by infusing it with ideas from many corners of the software development world, it has also hindered the movement by not providing the community with a central set of operational guidelines.
Often, a company attempting to adopt DevOps will be doing so against the current of operational red tape and culture of silos. This transition is not easy for companies that have built their enterprise (and their employees’ expectations) on a foundation of "un-DevOps." Moreover, once the decision has been made and a group has the freedom to attempt implementation (which is often its own challenge), the group is faced with the problem of how to implement it properly.
As we’ll discuss below, DevOps adoption is not a one-step process, and it can certainly be done incorrectly (or not at all). An attempt at correctness can be found in the scientific method, with the ability to measure, test, analyze, and repeat DevOps decisions and outcomes. While many leaders in DevOps talk about what needs to be done, there have not been enough eyes and ears tasked with objectively and measurably observing change as a result of implementing DevOps.
This gap is not to say that DevOps does not prescribe monitoring and measuring. In fact, monitoring and measuring is a primary objective in some DevOps circles. The purpose of this monitoring, however, is to compare the state of a project now to that same project last week (or in another sense, to alert the team that the servers are down). This perspective is great when you need to see how well a project is progressing, but fails miserably when you need to answer the question "How far along are we on the road to DevOps implementation?"
Studies of DevOps adoption rates use the phrases "have adopted" or "will adopt," as though they are line items on an organization’s quarterly goals and objectives. Does that mean they have achieved Flickr’s 10 deployments a day, or do they use the word adopt in a softer connotation, where they have simply accepted their fate, and will now begin listening to DevOps philosophy? Given the many definitions DevOps carries, the word adopt has at least that many variations in meaning and probably more. In any case, DevOps is not a one or a zero, but a continuum of positive and negative attributes, and far from linear.
I’m not going to craft arbitrary milestones. In some teams, achieving any level of DevOps behavior is an accomplishment worthy of a catered lunch. But, to understand that DevOps is at once culture and technology goes a long way toward framing the goal. Another perspective is that your goal of DevOps adoption is what you need it to be. In other words, each organization has its own signature of pain points and struggles, and the vast array of solutions that DevOps offers is sure to provide a good start toward fixing them, even if just one or two are needed.
It seems as though the DevOps movement is doing just fine without some dry, boiled-down set of standards and metrics. However, if we focus on making changes without measuring them, we risk being on an endless road to gold-plating our process. This outcome would be fine, except that customers are also investing real money in these cultural overhauls, whether or not they know it or want to. Changes must be planned, with a clear goal and a target date.
Because DevOps doesn’t come with an inclusive guidebook, identifying concrete goals and reasonable timelines can be hard. Seeing a report of, say, 400 percent decrease in release time or 8,000 percent increase in profits, can tempt organizational leaders to chase similar results. In reality, any positive result achieved from focusing on some aspect of DevOps will be proportional to the size or output of a business. While these kinds of measurements are quantifiable and objective, are they targeting the specific problems within an organization? If the current process isn’t noticeably damaging release time or profits, what is? Culture-related issues can be hard to identify, let alone quantify and measure. In many cases, providing channels for a team to report incidents in a genial manner can help to identify distinct properties of those incidents, such as severity (rating on a scale), which groups were involved, or the point during the development cycle at which each incident occurred. By identifying concrete metrics for a problem in this fashion, changes will become observable over time. Starting with the problem and designing a system to measure its change can be a far more effective strategy than jumping in headfirst to implement DevOps.
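To make the idea of starting from the problem concrete, here is a minimal sketch of recording incidents with the properties mentioned above (severity, groups involved, and point in the development cycle) and summarizing them per month so that change becomes observable. The field names, the 1-to-5 severity scale, and the sample data are assumptions for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import mean
from typing import Dict, List, Tuple


@dataclass
class Incident:
    reported: date
    severity: int            # assumed scale: 1 (minor) to 5 (critical)
    groups: Tuple[str, ...]  # teams involved
    phase: str               # e.g. "build", "test", "release"


def monthly_summary(incidents: List[Incident]) -> Dict[str, Dict[str, float]]:
    """Group incidents by YYYY-MM and report count and mean severity."""
    by_month = defaultdict(list)
    for inc in incidents:
        by_month[inc.reported.strftime("%Y-%m")].append(inc.severity)
    return {
        month: {"count": len(sevs), "mean_severity": mean(sevs)}
        for month, sevs in sorted(by_month.items())
    }


# Hypothetical incident reports
incidents = [
    Incident(date(2015, 5, 4), 4, ("dev", "ops"), "release"),
    Incident(date(2015, 5, 20), 2, ("dev",), "test"),
    Incident(date(2015, 6, 11), 3, ("ops",), "release"),
]
summary = monthly_summary(incidents)
```

Comparing the monthly figures before and after a process change gives a team a way to say whether that change helped, instead of relying on impressions.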
A set of standards and metrics might even already exist in some sense, but the casual conference-goer might be led to think they don’t, due to how DevOps is often presented: as a patchwork of stories of individual experiences, do and don’t lists, and vendors hawking automation technology. Developers new to the idea go home refreshed, approaching the task with enthusiasm, but without the clipboard and analytical squint. This approach can be dangerous for businesses that take a real risk in initiating a culture shift and then find themselves without a quantifiable goal. It is important to be aware that we are missing the dense and tabular chart that would define specific and measurable attributes for degrees of DevOps adoption. Simply knowing that we should have reachable goals is not only logical, but also helpful in guiding change as it occurs within a software development and release team.
Every two weeks, the SEI will publish a new blog post offering guidelines and practical advice for organizations seeking to adopt DevOps in practice. We welcome your feedback on this series as well as suggestions for future content. Please leave feedback in the comments section below.
Additional Resources
To view the webinar Culture Shock: Unlocking DevOps with Collaboration and Communication with Aaron Volkmann and Todd Waits, please click here.
To view the webinar What DevOps is Not! with Hasan Yasar and C. Aaron Cois, please click here.
To listen to the podcast DevOps—Transform Development and Operations for Fast, Secure Deployments featuring Gene Kim and Julia Allen, please click here.
To read all of the blog posts in our DevOps series, please click here.
SEI Blog, Jul 27, 2015 01:14pm
We have heard it before. The proposed regulatory changes to the white collar exemption are "imminent." And, then they were delayed. Well, the regulations were sent by the DOL to the OMB. The conventional wisdom is that they will be published on June 18, 2015 (I suspect so the DOL can say "Spring"). We know the purpose and effect of the proposed regulations will be to increase the number of individuals who are non-exempt. At a minimum, exempt status will carry a heavier price tag. The federal minimum weekly salary is going up—the only question is how high. My prediction: ...
SHRM Blog, Jul 27, 2015 01:14pm
Everyone who drives a car understands the importance of a dashboard. How fast are you going? How much gas do you have left? Are there any warning lights flashing?
An executive dashboard can give you the same kind of information in real time for your organization and its health.
Below are the key characteristics of an executive dashboard:
Uses visual indicators as a primary mode of providing information
Connected with databases that provide near real-time information
An executive dashboard runs on your computer, uses graphs and maps as a primary display device and is connected to databases which are updated regularly so you aren’t looking at old information. Just like car dashboards, executive dashboards can vary in appearance.
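As an illustration of visual indicators fed by regularly updated data, the sketch below maps a metric reading to a traffic-light status and flags readings that are too old to trust. The thresholds, the 15-minute freshness limit, and the `indicator` function are hypothetical and not tied to any particular dashboard product.

```python
from datetime import datetime, timedelta


def indicator(value: float, warn_at: float, alert_at: float,
              updated: datetime, now: datetime,
              max_age: timedelta = timedelta(minutes=15)) -> str:
    """Return a traffic-light status for a metric where higher is worse."""
    if now - updated > max_age:
        return "stale"   # data is too old to be shown as current
    if value >= alert_at:
        return "red"
    if value >= warn_at:
        return "amber"
    return "green"


now = datetime(2015, 7, 27, 13, 14)
status = indicator(72.0, warn_at=70.0, alert_at=90.0,
                   updated=now - timedelta(minutes=5), now=now)
print(status)
```

Checking the age of the data before its value keeps the display from showing a reassuring green light based on old information.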
There are many industry-standard frameworks for implementing dashboards. Examples include Balanced Scorecard, Six Sigma, and SCOR. Subsequent posts will discuss these frameworks in greater detail.
Netwoven Blog, Jul 27, 2015 01:14pm
By Kevin Fall, Deputy Director, Research, and CTO, SEI
Software and acquisition professionals often have questions about recommended practices related to modern software development methods, techniques, and tools, such as how to apply agile methods in government acquisition frameworks, systematic verification and validation of safety-critical systems, and operational risk management. In the Department of Defense (DoD), these techniques are just a few of the options available to face the myriad challenges in producing large, secure software-reliant systems on schedule and within budget.
In an effort to offer our assessment of recommended techniques in these areas, SEI built upon an existing collaborative online environment known as SPRUCE (Systems and Software Producibility Collaboration Environment), hosted on the Cyber Security & Information Systems Information Analysis Center (CSIAC) website. From June 2013 to June 2014, the SEI assembled guidance on a variety of topics based on relevance, maturity of the practices described, and the timeliness with respect to current events. For example, shortly after the Target security breach of late 2013, we selected Managing Operational Resilience as a topic.
Ultimately, SEI curated recommended practices on five software topics: Agile at Scale, Safety-Critical Systems, Monitoring Software-Intensive System Acquisition Programs, Managing Intellectual Property in the Acquisition of Software-Intensive Systems, and Managing Operational Resilience. In addition to a recently published paper on SEI efforts and individual posts on the SPRUCE site, these recommended practices will be published in a series of posts on the SEI blog. The following post, Managing Operational Resilience by Julia H. Allen, Pamela Curtis, and Nader Mehravari, presents challenges for managing operational resilience (in this post) and recommended practices for helping organizations manage operational resilience (in the second post in this series).
Managing Operational Resilience - SPRUCE/SEI
https://www.csiac.org/spruce/resources/ref_documents/recommended-practices-managing-operational-resilience
A search at your favorite news aggregator for keywords such as "malware," "computer virus," or "data breach" will return tens of thousands of results. For most organizations it’s not a question of if a cyber attack will occur, but when. When an attack happens, the tempo of response must be fast, so an organization must already have practices in place covering how to respond. These practices should reflect a strategic approach that balances actions that protect assets—such as customer data and intellectual property—with actions that sustain services and operations.
A recommended approach to address both protection and sustainment is the application of resilience management practices. Operational resilience is the ability of an entity to prevent disruptions to its mission from occurring, continue to meet its mission if a disruption or incident does occur, and return to normalcy when the disruption is eliminated. The concept of operational resilience applies to entities such as organizations, systems, networks, supply chains, critical infrastructure, cyberspace, Armed Forces, and even nations.
Operational resilience management includes all the practices of planning, integrating, executing, and governing activities to ensure that an entity can
identify and mitigate operational risks that could lead to service disruptions before they occur
prepare for and respond to disruptive events (realized risks) in a manner that demonstrates command and control of incident response and service continuity
recover and restore mission-critical services and operations following an incident within acceptable time frames
Operational resilience management draws from several complex and evolving disciplines, including risk management, business continuity, disaster recovery, information security, incident and emergency management, information technology (IT), service delivery, workforce management, and supply-chain management, each with its own terminology, principles, and solutions. The practices described here reflect the convergence of these distinct, often siloed disciplines. As resilience management becomes an increasingly relevant and critical attribute of their missions, organizations should strive for a deeper coordination and integration of its constituent activities.
Our discussion of operational resilience management as presented in this post has three parts. First, we set the context by providing an answer to the question "Why is operational resilience management challenging?" The next post in this series will present a set of recommended practices for operational resilience management. Our original SPRUCE post concludes with an extensive list of selected resources to help you learn more about operational resilience management and added links to various sources to help amplify some points.
Every organization is different; judgment is required to implement these practices in a way that benefits your organization. In particular, be mindful of your mission, goals, existing processes, and culture. All practices have limitations. Some of these practices will be more relevant to your situation than others, and their applicability will depend on the context in which you apply them. To gain the most benefit, you need to evaluate each practice for its appropriateness and decide how to adapt it, striving for an implementation in which the practices meet your business objectives. Also, consider additional collections of recommended practices, including those among the various sources at the bottom of the webpage. Monitor your adoption and use of these practices, and adjust as appropriate.
These practices are certainly not complete—they are a work in progress.
Why is Managing Operational Resilience Challenging?
Over the past 10 years, organizations have invested a tremendous amount of resources in cybersecurity. Nevertheless, regardless of how much has been spent on protection, cyber attackers continue to penetrate systems. We have reached a point in the battle for information and cybersecurity where we should change the focus of security investment from a narrow focus on planning how to avoid cyber attacks to a more balanced focus on avoidance and planning how to recover from cyber attacks.
Operational resilience management has two sides—protect and sustain—and both are equally important. An organization must learn about the threat environment, maintain situational awareness of the context in which it operates, and create a risk-management plan that is as thorough and reliable as possible. But when an attack occurs, can the organization sustain its critical services and operations? Can it adequately recover its systems and get them back online as quickly as possible? Can it restore and recover service within a prescribed recovery time and according to its recovery-point objectives? An organization must ask, where can we not afford to have something bad happen, and where can we afford to have something bad happen and bounce back as quickly as we can? The need for organizations to achieve a balance between protect and sustain is why operational resilience management is so important.
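The recovery questions above can be made concrete with a small check. The following is a minimal sketch, assuming an organization records when an outage began, when service was restored, and when the last usable backup was taken. The class, function, and target values are hypothetical illustrations, not part of the SPRUCE guidance.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    """Hypothetical targets an organization might set for one service."""
    rto: timedelta  # recovery time objective: maximum tolerable downtime
    rpo: timedelta  # recovery point objective: maximum tolerable data-loss window


def meets_objectives(objectives, outage_start, service_restored, last_good_backup):
    """Return (rto_met, rpo_met) for a single outage."""
    downtime = service_restored - outage_start
    data_loss_window = outage_start - last_good_backup
    return downtime <= objectives.rto, data_loss_window <= objectives.rpo


# Illustrative values only (assumptions, not data from the post).
objectives = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(hours=1))
outage_start = datetime(2015, 1, 10, 9, 0)
rto_met, rpo_met = meets_objectives(
    objectives,
    outage_start=outage_start,
    service_restored=outage_start + timedelta(hours=3),
    last_good_backup=outage_start - timedelta(minutes=90),
)
print(rto_met, rpo_met)
```

In this example the service came back within the recovery time target, but the most recent backup was older than the recovery-point target allows, which is the kind of gap a sustainment review is meant to surface.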
Operational resilience management is challenging for several reasons:
1. Making a long-term commitment: Operational resilience is an emergent property. An emergent property is not something an organization can buy and put in place or assemble by buying its parts. For a property to emerge within an organization, the organization must execute a certain set of activities in a coordinated manner and do so with consistent discipline. Achieving operational resilience requires an organization to make a long-term commitment to perform certain activities with consistency. The activities involved in operational resilience management must become part of the organization’s daily habits across the enterprise.
2. Understanding the big picture: To be operationally resilient, organizations must address operational risk on many dimensions simultaneously, including people, technology, information, facilities, supply-chain, management, cyber, and physical dimensions. This requires careful planning, coordination, and training across many interdependent domains, as well as understanding how the organization’s capabilities along these dimensions contribute to mission success.
3. Overcoming organizational hurdles: An organization may encounter a number of barriers to operational resilience management, including
the vague and abstract nature of operational risk management
compartmentalization of operational risk-management activities, such as segmenting responsibilities for information security and business continuity/disaster recovery
focusing on technology instead of on all the dimensions listed in Challenge 2
the proliferation of practices for operational resilience management
insufficient funding and staff
insufficient success stories and measurements
(over)reliance on people
regulatory climate
existing policies
the tendency to ignore current information to avoid a painful reality and the need to act
competitive pressures or short-term goals
Looking Ahead
Technology transition is a key part of the SEI’s mission and a guiding principle in our role as a federally funded research and development center. The next post in this series will explore recommended practices for managing operational resilience in organizations as well as strategies for deriving more benefits from those recommended practices.
We welcome your comments and suggestions on this series.
Additional Resources
For comprehensive information about CERT's research on operational resilience management, please see www.cert.org/resilience.
For more information about frameworks and maturity models, please see Buyer Beware: How to be a Better Consumer of Security Maturity Models, presented by Julia Allen and Nader Mehravari at the February 2014 RSA Conference.
Richard A. Caralli, Julia H. Allen, and David W. White also published the book CERT Resilience Management Model (CERT-RMM): A Maturity Model for Managing Operational Resilience (Addison-Wesley Professional, 2011).
SEI
.
Blog
.
<span class='date ' tip=''><i class='icon-time'></i> Jul 27, 2015 01:13pm</span>
|
|
The design of executive dashboards varies depending upon the needs of the executives for which they are designed. However, well-designed executive dashboards commonly have the following characteristics:
Highly graphical, so that executives can read and understand the key metrics in very little time.
Tailored to the needs of the executive who uses them. The VP of Sales probably doesn’t need to see the total inventory turns or human resource information.
Layered, starting with a high-level view from which the user can drill down into more detail by clicking on the relevant graph or map. Navigation is easy and intuitive.
Automatically updated with the latest available data, so that decisions are not based on old information.
One needs to ensure that the dashboard adheres to the following rules of usability:
Relevance - Ensure that only the relevant information is presented at the top level
Clarity - Ensure that the data and information are well organized and presented in an easy-to-use way
Hierarchy - Ensure that the users of the dashboard can easily navigate from high-level metrics to the details
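The hierarchy rule can be pictured as a tree of metrics in which each top-level figure is rolled up from the details beneath it. The following is a minimal sketch under that assumption. The `Metric` class, its method names, and the sales figures are hypothetical and do not come from any particular dashboard product.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Metric:
    name: str
    value: float = 0.0
    children: list[Metric] = field(default_factory=list)

    def total(self) -> float:
        # A leaf reports its own value; a parent rolls up its children.
        if not self.children:
            return self.value
        return sum(child.total() for child in self.children)

    def drill_down(self, *path: str) -> Metric:
        # Follow child names from this node down to a more detailed metric.
        node = self
        for name in path:
            node = next(c for c in node.children if c.name == name)
        return node


# Illustrative figures only.
sales = Metric("Sales", children=[
    Metric("North America", children=[Metric("US", 120.0), Metric("Canada", 30.0)]),
    Metric("Europe", 80.0),
])

print(sales.total())                              # top-level view
print(sales.drill_down("North America").total())  # one level of detail
```

Keeping the roll-up and the detail in one structure means the top-level number and the drill-down views cannot drift apart, which supports both the relevance and hierarchy rules.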
Netwoven
.
Blog
.
<span class='date ' tip=''><i class='icon-time'></i> Jul 27, 2015 01:13pm</span>
|
|
By Kevin Fall, Deputy Director, Research, and CTO, SEI
Software and acquisition professionals often have questions about recommended practices related to modern software development methods, techniques, and tools, such as how to apply agile methods in government acquisition frameworks, systematic verification and validation of safety-critical systems, and operational risk management. In the Department of Defense (DoD), these techniques are just a few of the options available to face the myriad challenges in producing large, secure software-reliant systems on schedule and within budget.
In an effort to offer our assessment of recommended techniques in these areas, SEI built upon an existing collaborative online environment known as SPRUCE (Systems and Software Producibility Collaboration Environment), hosted on the Cyber Security & Information Systems Information Analysis Center (CSIAC) website. From June 2013 to June 2014, the SEI assembled guidance on a variety of topics based on relevance, maturity of the practices described, and the timeliness with respect to current events. For example, shortly after the Target security breach of late 2013, we selected Managing Operational Resilience as a topic.
Ultimately, SEI curated recommended practices on five software topics: Agile at Scale, Safety-Critical Systems, Monitoring Software-Intensive System Acquisition Programs, Managing Intellectual Property in the Acquisition of Software-Intensive Systems, and Managing Operational Resilience. In addition to a recently published paper on SEI efforts and individual posts on the SPRUCE site, these recommended practices will be published in a series of posts on the SEI blog.
The first post in this series by Julia H. Allen, Pamela Curtis, and Nader Mehravari, presented challenges for managing operational resilience. This post presents recommended practices for helping organizations manage operational resilience as well as strategies for making the best use of the recommended practices.
Recommended Practices for Managing Operational Resilience in Organizations https://www.csiac.org/spruce/resources/ref_documents/recommended-practices-managing-operational-resilience
1. Governance and program management. Organizations must oversee and manage the execution of resilience activities. Resilient organizations ensure that all such activities derive their purpose and focus from strategic objectives and critical success factors for operational resilience. The governance and program-management practice ensures that the investment in operational resilience, cybersecurity, service continuity, and other domains is consistent with the organization’s business objectives. This practice entails regular planning, definition of roles and responsibilities, adequate funding, appropriate resource allocations, oversight in executing the plan, and corrections as necessary. In addition, governance and program management involves measuring, analyzing, and reporting the effectiveness of resilience-management practices and implementing improvements. These are all standard business practices for successful, mature organizations, but they are often overlooked when managing operational resilience.
2. Staff preparation and deployment. Organizations must be prepared when a disruptive event occurs. That means making sure that staff at all levels of the organization are trained in how to perform their assigned roles when disruptions occur. Everyone must know his or her role, receive training, and rehearse plans and contingencies. Skill gaps and deficiencies should be identified and training provided to address them.
Training can be designed to help meet the goals of resilience management as well as other goals of the organization that depend on interdisciplinary team performance. For example, teams with members drawn from different disciplines and departments can train together in a scenario that encourages interaction, mutual understanding, and building trust among team members. Such training breaks down barriers that otherwise naturally arise when work must be done across disciplines and departments.
This practice also encompasses establishing staff backup and redundancy at all levels of the organization. For key personnel, it is important not only to have backups who can step in but also to identify qualified successors who can fill key positions if those positions are vacated.
Training is not a one-time event. The organization should provide periodic refresher training for all key functions so that responsibilities and skills are not forgotten in the stress of disruptive events.
3. Communication and awareness. Resilient organizations make establishing and maintaining communications with stakeholders a key objective in all operational resilience-management practices, both during normal operations and during periods of stress. Communication is always important, but it is particularly essential during times of disruption. The organization should plan in advance exactly who will contact whom during and following disruptive events. Plan who will communicate with stakeholders, including both customers and suppliers, to share information and make stakeholders aware of the status of the situation. In addition, develop communication methods (newsletters, email notifications, community meetings, etc.), channels (public relations activities, peer and professional organizations, etc.), infrastructure, and systems (such as emergency alerting via mobile devices).
This practice includes both internal and external communication. An organization should report ongoing measurement of operational performance and resilience-management activities and disseminate that information across the enterprise to ensure that all organizational units are operating with an up-to-date picture of the organization’s operations. External communication tasks may require providing information to news media about the organization’s resilience efforts or its efforts to contain an incident or event. As appropriate, establish responsibility for planning and executing crisis communications among first responders, other emergency and public service staff, and law enforcement.
4. Risk management. Organizations must identify, analyze, and mitigate risks to assets that could adversely affect the operation and delivery of high-value services. Because an organization cannot protect against every possible threat, risk management involves identifying critical services and operations, identifying the assets that enable their delivery, and prioritizing them. Based on the strategic objectives established in Practice 1, an organization identifies, analyzes, and prioritizes the set of risks that it will monitor and mitigate. This means that some risks will not be addressed, whether intentionally or accidentally. The goal of risk management is to limit exposure to the latter, but an organization can simply accept some risks and monitor them as residual risks (e.g., a price increase for a critical purchased component). In this way, the organization knows that it has an exposure but has attempted to intelligently limit that exposure.
Risk management is a continuous process involving identifying new risks, updating the status and disposition of identified risks, determining how to handle the risks (e.g., prevent, mitigate, monitor, or accept), and implementing the selected risk-handling option. For most organizations, this includes cyber risks and, more specifically, software vulnerabilities and malware. A large body of work by the Software Engineering Institute and the MITRE Corporation describes specific vulnerabilities and software weaknesses. In particular, MITRE has established a large resource in its Common Vulnerabilities and Exposures (CVE) repository, where it makes classes of vulnerabilities and solutions available.
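The risk-handling cycle described in this practice can be sketched as a simple risk register. This is a minimal illustration, assuming a hypothetical 1-to-5 scoring scale and a likelihood-times-impact exposure score. That scoring is a common simplification chosen for the example, not a method prescribed by the practice, and the sample risks are invented.

```python
from dataclasses import dataclass
from enum import Enum


class Handling(Enum):
    """The four risk-handling options named in the practice."""
    PREVENT = "prevent"
    MITIGATE = "mitigate"
    MONITOR = "monitor"
    ACCEPT = "accept"


@dataclass
class Risk:
    name: str
    likelihood: int  # hypothetical scale: 1 (rare) to 5 (almost certain)
    impact: int      # hypothetical scale: 1 (minor) to 5 (severe)
    handling: Handling

    @property
    def exposure(self) -> int:
        # Likelihood x impact is an assumed scoring rule for this sketch.
        return self.likelihood * self.impact


def prioritize(risks):
    """Return risks ordered from highest to lowest exposure."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)


def residual(risks):
    """Risks the organization has chosen to accept or only monitor."""
    return [r for r in risks if r.handling in (Handling.ACCEPT, Handling.MONITOR)]


# Illustrative entries only.
register = [
    Risk("Unpatched software vulnerability", 4, 5, Handling.MITIGATE),
    Risk("Price increase for a critical purchased component", 3, 2, Handling.ACCEPT),
    Risk("Data center power outage", 2, 4, Handling.PREVENT),
]

for risk in prioritize(register):
    print(f"{risk.exposure:>2}  {risk.handling.value:<8}  {risk.name}")
```

Listing accepted and monitored risks separately keeps residual exposure visible, so that a risk left unaddressed is a recorded decision and not an accident.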
5. Incident management. Incident management is one of the disciplines that most naturally comes to mind when one considers operational resilience management. It is the end-to-end handling of a disruptive event from the time that something happens to when it is detected, triaged, and resolved. Disruptive events include deliberate or inadvertent harmful actions of people, failed internal processes, technology failures, and external events such as natural disasters and power outages. Implementing this practice begins before an incident occurs, when an organization plans for and assigns roles and responsibilities, including those for key stakeholders and decision makers (for escalation).
Operational staff should be trained not only in delivering the services and conducting the operations for which they have responsibility but also in the results and effects to expect from performing these services and operations. Operational staff are often the first staff capable of detecting an incident; thus such training should make them more sensitive to unexpected deviations from "normal" results and effects. Once an incident is detected, the first step is to carefully note the circumstances of the incident, declare the incident, and preserve evidence. The organization may have prepared an immediate workaround for just such an incident. If so, that workaround is often implemented by the same staff who detect the incident. Otherwise, the organization analyzes the incident to develop an appropriate response, including recovery actions that minimize the disruption. When analyzing the incident, the incident-handling team looks for patterns or similarities to other incidents that they may have seen in the past. The organization may perform a root-cause analysis and identify and evaluate multiple candidate solutions.
The next steps are to implement the solution—respond and recover. The incident-handling team should also ensure that the organization communicates with key stakeholders, who can provide needed resources and expertise immediately or later in the incident resolution.
Once the incident is closed, the organization should conduct a postmortem analysis to determine if the organization should make any improvements to its overall incident management, risk management, and service delivery (operations) processes. The organization should define measures to help evaluate the effectiveness of its responses to disruptive incidents. It will analyze those measures of effectiveness to determine where to improve its practices.
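One way to define measures of effectiveness for incident handling is to record timestamps for each incident and summarize them over time. The sketch below assumes three recorded moments per incident (occurred, detected, resolved) and computes mean time to detect and mean time to resolve. The field names and sample incidents are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    occurred: datetime
    detected: datetime
    resolved: datetime

    @property
    def time_to_detect(self) -> timedelta:
        return self.detected - self.occurred

    @property
    def time_to_resolve(self) -> timedelta:
        return self.resolved - self.detected


def mean_hours(durations):
    """Average a collection of timedeltas and express the result in hours."""
    return mean(d.total_seconds() for d in durations) / 3600


# Illustrative incidents only.
incidents = [
    Incident(datetime(2015, 3, 1, 8, 0), datetime(2015, 3, 1, 10, 0), datetime(2015, 3, 1, 16, 0)),
    Incident(datetime(2015, 4, 12, 22, 0), datetime(2015, 4, 13, 2, 0), datetime(2015, 4, 13, 4, 0)),
]

mttd = mean_hours(i.time_to_detect for i in incidents)
mttr = mean_hours(i.time_to_resolve for i in incidents)
print(f"mean time to detect: {mttd:.1f} h, mean time to resolve: {mttr:.1f} h")
```

Tracking these two figures across postmortems gives a rough signal of whether detection training and response planning are improving, although an organization would likely pair them with measures of impact and recurrence.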
6. Service continuity. This practice entails ensuring the continuity of essential operations and services during and following a disruptive event. Service continuity may include business continuity, disaster recovery, crisis management, and pandemic planning.
Activities encompassed by this practice include developing service-continuity plans, assigning roles and responsibilities, and then testing plans and running exercises to ensure that the plans are robust. For example, the organization should establish plans about what to do with its workforce if it must evacuate its facility and stand up an alternative facility to continue operations. Tests and exercises can cover a wide range of activities and may include computer simulations.
Organizations should ensure the continuity of the services they provide through careful preparation and planning. The resilient organization tracks the location of key personnel and backup personnel, so that in the event of an incident, it can put recovery plans into action. Through exercises and drills, the organization ensures that everyone knows his or her role. When an event like Hurricane Sandy happens, the resilient organization does what it has rehearsed.
7. Critical asset protection. Critical assets (e.g., information, technology, facilities) that support high-value services must be identified, protected, and maintained. In particular, an organization must ensure that it applies adequate controls to protect the confidentiality, integrity, and availability (i.e., information security) of information essential or entrusted to the business. Such controls can include maintaining an up-to-date inventory of the information that the organization must protect, on what devices that information resides, and over what networks it may be transmitted. In addition, an organization should have practices for configuring, tracking, protecting, and maintaining its IT assets (e.g., workstations, laptops, mobile devices, and network components).
Protecting critical assets requires continually identifying and mitigating threats to the asset (e.g., as part of a comprehensive risk-management practice, discussed in Practice 4); improving, retiring, and adding new controls to the asset to maintain its integrity; and establishing appropriate identity and access management to limit access to the asset. Critical asset protection also includes facility protection, such as for an organization’s IT assets, and includes facilities for backup and recovery.
8. External-dependencies management. An organization must identify and manage dependencies on external entities, such as its supply chain. Key elements of this practice include prioritizing external dependencies, managing risks arising from external dependencies, and formalizing relationships with external entities. Organizations should make sure that formal and contractual agreements are in place with external entities and that everyone understands what is expected from each party, in particular with respect to disruptions in delivery of critical components or services. To ensure preparedness, an organization should proactively monitor and manage the performance of external entities to make sure they meet expectations.
9. Secure software development and integration. Organizations must ensure that software that enables or performs the delivery of critical services and operations satisfies resilience requirements. An organization derives resilience requirements for such software in part from its resilience-management activities, including governance and program management (Practice 1), service continuity (Practice 6), and critical asset protection (Practice 7). For example, mitigating a particular threat to an asset may impose resilience (and security) requirements on the software that controls it or access to it. An organization should also elicit or collect requirements from stakeholders, including customers, end users, suppliers, other partners, and regulatory authorities. Multiple frameworks provide recommended practices for software development that address security and other resilience-related topics (see Learn More for more information). Many of the challenges noted for practices at the top of this webpage apply to the practices described in such frameworks as well.
How Can Organizations Derive More Benefit from the Recommended Practices for Managing Operational Resilience?
1. Coordinate the implementation of these practices. Implementing these practices requires competence in several disciplines (incident management, asset protection, risk management, etc.). Organizations that create a separate solution or team to deal with each practice will find their operational resilience-management activities to be inefficient and difficult to manage due to the overlaps (e.g., where do incident management, disaster recovery, and asset protection and sustainment begin or end?). Just as the implementation of each operational resilience-management practice should be driven by business objectives, so should their collective implementation. Organizations will improve their operational resilience by taking an integrated approach to implementing these activities and ensuring that there is adequate coordination among them.
Begin by gathering representatives from the different disciplines and departments to develop end-to-end scenarios that describe how the organization should respond to particular threats (as described in Practice 2). Identify which disciplines or departments (e.g., incident analysis, disaster recovery, and crisis communication) to involve at each stage of the response, including afterward, when making improvements to processes and training for service delivery, service continuity, and information security. Then determine how the organization should coordinate its activities in such scenarios. Such rehearsals or simulations help identify superior ways to implement the operational resilience-management practices.
The following diagram may help you remember the purpose of each resilience-management practice. The two practices in the "Stop the bleeding" row deal primarily with resolving incidents. The "Improve and manage" row of the diagram depicts the practices that provide infrastructural and foundational support for establishing, facilitating, measuring, and improving asset protection and operations sustainment activities. The position of those practices in the diagram also indicates their role in protecting and sustaining the health of the organization and continually improving operational resilience-management activities. The diagram illustrates the need for all the operational resilience-management practices to work together.
2. Maintain currency with relevant standards. In the past 10 years, standards have exploded across all disciplines in national and international efforts to deal with the growing number of cybersecurity failures. The number of standards dealing with preparedness planning has quadrupled since 2005. An organization should develop an integrated approach to updating its processes to maintain compliance with standards relevant to its business. For example, when ISO/IEC Standard 27034 Information Technology—Security Techniques—Application Security was published, its guidance affected business managers, IT managers, developers, auditors, and end users. An organization should involve designers, programmers, acquisition managers, IT staff, and users to determine what changes are needed to preserve the effectiveness of operational resilience-management activities while addressing this standard.
3. Understand compliance issues. Compliance issues affect all the recommended practices. An organization must not only follow federal and state legislation and regulations but also be aware that state-by-state differences exist. For example, state requirements vary for notifications about data breaches, and this will inform the organization’s communication practices. However, an organization should view compliance as an outcome of an integrated operational resilience-management program, not a goal. Simply following a rule may not be sufficient to plan for and mitigate risk; new risks arise much faster than the rate of legislation.
Looking Ahead
Technology transition is a key part of the SEI’s mission and a guiding principle in our role as a federally funded research and development center. The next post in this series will present recommended practices for the software development of safety-critical systems.
We welcome your comments and suggestions on this series in the comments section below.
Additional Resources
For comprehensive information about CERT's research on operational resilience management, please see www.cert.org/resilience.
For more information about frameworks and maturity models, please see Buyer Beware: How to be a Better Consumer of Security Maturity Models, presented by Julia Allen and Nader Mehravari at the February 2014 RSA Conference.
Richard A. Caralli, Julia H. Allen, and David W. White also published the book CERT Resilience Management Model (CERT-RMM): A Maturity Model for Managing Operational Resilience (Addison-Wesley Professional, 2011).
For a detailed list of resources on managing operational resilience, frameworks and maturity models, risk management, external dependencies management, resilience engineering, operational resilience management, and resilience policy development, please visit https://www.csiac.org/spruce/resources/ref_documents/recommended-practices-managing-operational-resilience.
SEI
.
Blog
.
<span class='date ' tip=''><i class='icon-time'></i> Jul 27, 2015 01:12pm</span>
|
|
In just a few weeks, our profession will gather at the premier HR event...
SHRM
.
Blog
.
<span class='date ' tip=''><i class='icon-time'></i> Jul 27, 2015 01:12pm</span>
|







