Wednesday, December 31, 2014

Here’s Your CIP-002-5.1 R1 Compliance Methodology!*


This is the third in a series of posts on the serious problems with CIP-002-5.1 R1, and what entities and NERC need to do to deal with them.  The first post is here.  The exciting conclusion - in which I chide NERC for their mishandling of these problems and say what I think needs to be done to address them - is here.

June 11, 2015: I'm afraid I've come to the conclusion that there can be no definitive guidance developed for complying with CIP-002-5.1 R1; there are just too many contradictions and ambiguities in the standard.  I would like to see the standard rewritten, but since that is a multi-year process, it obviously won't help entities preparing for compliance next April. I will continue to discuss different aspects of the R1 (and Attachment 1) compliance process, such as in this post.  But my hope to keep updating this post as a kind of comprehensive guide for R1 compliance is officially ended.  There is probably nothing in this post that is simply wrong, but keep in mind that my ideas on how you should comply with R1 have moved beyond what's here.

*I’m worried the FTC lawyers will be contacting me any minute to hit me up for pulling a bait-and-switch.  The fact is, I have no intention of telling you what your CIP-002-5.1 R1 and Attachment 1 (“R1” for short) compliance methodology should be.  That’s because I’ve become convinced that it is impossible to write down a single procedure for R1 compliance that takes up something less than the length of War and Peace; or if you prefer, I don’t think you could document the process with a diagram that takes up less than the size of my living room.  And this leads to a couple important conclusions.  For the whole sorry story, read on.

A brief history:  I’ve made three main efforts to come up with a single methodology for R1 compliance.  The first was when I rewrote R1 as a comment to FERC in June 2013.  I cringe when I read it now, since a lot of it is simply wrong – although I still think it’s better than the requirement as currently written.  The second was at the very beginning of 2014, when I thought I finally knew exactly how to comply with R1 and I laid it all out in three posts (the first is here).  Yet within three weeks, I knew I’d missed a couple really important nuances, most notably why “Facilities” was the subject of criteria 2.3-2.8.

My most recent attempt was last April, when I wrote Part 1 of what was to be a couple posts that would lay out once and for all exactly how to comply with R1 for substations (and substations, of course, account for 95% of the v5 compliance effort).  I don’t think there was anything in that post that was strictly wrong; however, I gave up on it when it became clear that criterion 2.5 – and probably others as well – needed a different compliance methodology from the other criteria, which added a whole new layer of complexity I hadn’t anticipated.

I’ll be honest:  Since that time, I’ve just kept on discovering more layers of complexity as I realize new problems with R1 (for a list of the 21 primary problems I see now, see my last post).  One of the biggest new sources of complexity is the various areas where an entity needs to “roll your own” definitions and interpretations.  You could think of each of these areas – e.g. the definition of “programmable” – as being its own “subroutine”[i].  In other words, you need to roll your own definition of programmable and insert it at the appropriate place in the overall R1 methodology.  The same for your interpretation of “affect the reliable operation of the BES”, your definition of “associated with”, etc.

So it’s not like I’m saying there could never be a complete methodology for R1 compliance, although it might take something approaching the remaining lifetime of the universe to put it down on paper.  But the real problem is something that I first learned in my statistics classes in college: Uncertainty is multiplicative.  If you have only a 50% certainty that a particular sub-methodology (such as the definition of “programmable” you’ve just developed) is correct, and you have 9 other sub-methodologies each with only a 50% certainty, the percent of certainty you have about the whole string of processes together is 0.09765625% (i.e. less than a tenth of a percent).  This means you simply have no clue whether the whole thing makes sense or not.  Of course, I think there are a lot more than ten major areas of uncertainty in complying with R1; even this very low level of certainty is probably too high an estimate for any methodology you might develop. 
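To make the arithmetic concrete, here is a minimal sketch in Python. It assumes, as the paragraph above implicitly does, that the sub-methodologies are independent of one another, so their certainties simply multiply. The function name `combined_certainty` is purely illustrative.

```python
from math import prod


def combined_certainty(certainties):
    """Chance that every sub-methodology is correct, assuming independence."""
    return prod(certainties)


# Ten sub-methodologies (the first one plus nine others), each 50% certain.
ten_at_half = combined_certainty([0.5] * 10)
print(f"{ten_at_half:.8%}")  # 0.09765625%
```

Adding more uncertain sub-methodologies to the list only drives the product lower, which is the point of the paragraph above.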

So I’m not exaggerating at all when I say there is simply no way to write a single methodology for R1 compliance that has some reasonable probability of being correct.  The likelihood of being able to do that is similar to the proverbial likelihood of a bunch of monkeys, pounding on keyboards, being able to recreate the works of Shakespeare.

Now that I’ve established the fact that I lied to you in the title, what is the point of this post, anyway?  It’s quite simple: The fact that no comprehensive R1 compliance methodology can be written down doesn’t mean you’re off the hook for complying.  Regardless of the difficulty, every entity does need to come up with a documented process for R1 compliance, and to follow that process in identifying their Medium and High impact BES Cyber Systems, as well as their Low impact BES assets. 

So in this post I’m going to try to lay out, at a high level, the main tasks that need to be performed to comply with R1, as well as what I think are the primary considerations that need to be addressed under each task.  There is no way I can put together even a reasonably detailed methodology that will work for your organization.  Hopefully, you can use this post to guide your effort to put that methodology together – perhaps using the help of a knowledgeable consulting organization such as the one I work for.  Where I’ve discussed a topic in a previous post, I’ll include a link.

Note that the discussion below is geared toward substations, since as I said above, I believe literally 90-95% of the CIP v5 compliance effort will be for Medium impact substations.  I believe the discussion also applies to Medium impact generating plants that meet criteria other than 2.1.  The methodology for plants that do meet criterion 2.1 is quite different from what I outline below, mainly because of the huge numbers of devices that may meet the definition of Cyber Asset in large coal plants, as well as because of the “exemption” for BES Cyber Systems that don’t affect 1500 MW.  I hope to address those plants in a future post.

Task 1: Preliminary Identification and Classification of Substations and Facilities
The first step of the process is to decide which substations may be Medium impact – or more correctly, which substations are likely to contain Medium impact BES Cyber Systems.  Criteria 2.4 to 2.8 are the ones that can potentially apply to substations.[ii] 

The next step is to make a decision that every owner of Medium impact substations needs to make: whether to classify BCS based on the asset (substation) or on the Facility (line, transformer, etc) they’re associated with.  There are a couple key questions that your entity needs to answer in order to decide which route it will take.  If you decide to classify based on substations, you will probably end up identifying more Medium impact BCS than you would if you classified based on Facilities.  On the other hand, classifying based on Facilities may involve more work, and potentially more cost if networks need to be separated.  I will generally refer to “substations/Facilities” below; you need to read this as meaning one of the two terms, depending on your answer to this question.

If you do choose the Facilities route, you need to identify and list the Facilities at each substation.  To simplify things going forward, you can remove from the list Facilities that don’t have any cyber assets associated with them, since they obviously won’t have any BES Cyber Systems (the same applies to substations that don’t have any cyber assets associated with them, if you’re going the “substations” route).  Finally, you need to identify the Medium impact Facilities using Criteria 2.4 – 2.8 (and you should document a methodology for doing this, both to guide the people who do this work, as well as to show the auditors how you did it). 

A final step in this task is to develop a methodology for dealing with substations that are jointly owned or operated with one or more other NERC entities.  NERC has promised this will be one of the upcoming Lessons Learned, but you probably can’t wait for that to come out (of course, when it does, it will just be a draft for comment.  And even when it’s finalized, it won’t be binding on auditors or entities.  This applies to all the Lessons Learned[iii]).

Task 2: Inventory of Cyber Assets
This task (and the remaining tasks except for the last two) only needs to be performed for substations/Facilities that meet a Medium impact criterion.  You need to identify all of the electronic devices “associated with” the substation/Facility that meet the definition of Cyber Asset: “programmable electronic device”.  Of course, there are three important details embedded in this seemingly simple task:

  1. Definition of “programmable” – This word is the heart of the NERC definition of Cyber Asset, but isn’t itself defined, so you need to come up with your own definition.  NERC recently released a draft Lessons Learned document on this topic.  You need to consult it, but keep in mind that you aren't required to follow it; if you don't, you do need to at least document why.
  2. Jointly Owned Substations – Your entity needs to decide how it will allocate responsibility for BES Cyber Systems with any joint ownership partners.
  3. “...affect the reliable operation of the BES” – This undefined phrase is an important part of the definition of BES Cyber Asset.  How your entity defines this phrase will have an impact on the number of BCAs (and BCS) you identify.  See this post.
Once you have addressed these three items, you need to inventory every device that could possibly be a Cyber Asset associated with a Medium impact substation.  For each device, the inventory needs to record the Facility/substation with which it is associated, as well as the asset (whether that is a substation or another of the six asset types in R1, like a control center) where it is actually located.  If you’re going the “Facility” route described above, you’ll first need to inventory the Transmission Facilities located at each Medium substation.
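One way to picture an inventory entry is as a small record that keeps "associated with" and "located at" separate, since the two can differ. The following Python sketch is only an illustration: the class name, field names and sample devices are all hypothetical, and a real inventory would carry far more detail.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class InventoryRecord:
    device: str           # hypothetical device name
    associated_with: str  # Medium impact substation or Facility it is associated with
    located_at: str       # asset where the device is actually located
    location_type: str    # e.g. "substation" or "control center"


def group_by_association(records):
    """Map each substation/Facility to the devices associated with it."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.associated_with].append(record.device)
    return dict(grouped)


# Made-up sample data: one device sits in a control center but is
# associated with a line at a Medium impact substation.
inventory = [
    InventoryRecord("relay-1", "Line A", "Substation North", "substation"),
    InventoryRecord("rtu-1", "Line A", "Control Center 1", "control center"),
    InventoryRecord("relay-2", "Transformer T1", "Substation North", "substation"),
]
print(group_by_association(inventory))
```

Keeping the two fields apart means the later classification step can key on the association, while the physical location is still available for drawing network boundaries.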

Task 3: “Top-Down” Identification of BES Cyber Systems / BES Cyber Assets
I have written extensively about the two main approaches to identifying BCS/BCA: “top-down” and “bottom-up”.  Until recently, I thought that entities should combine the two approaches for all types of assets, since not using both can lead to under- or over-identification of BES Cyber Systems.  However, I've recently been persuaded that, for substations, only the bottom-up approach is needed; combining the two remains a good idea for generating plants (except those that meet Criterion 2.1, which, as I said above, need a completely different approach) and perhaps for Control Centers.

Since this post is partly about generating plants, I'll start by describing the top-down approach.  For generating plants (other than 2.1 plants, of course) I think it is better to start with the top-down approach, then use the bottom-up as a check on it (you’ll see this will reduce the work required, since starting with bottom-up will result in your doing some unnecessary classification work).  It would be nice if NERC, or for that matter the regions, provided some guidance on this issue.  But that is pretty unlikely at this point, so you’ll need to decide for yourself whether to use just one or both approaches.
 
The first step for applying the top-down approach is to develop a methodology for it.  While the Guidelines and Technical Basis in CIP-002-5.1 does provide a good overall description of how the BES Reliability Operating Services (the heart of the top-down approach, of course) can be used to identify BES Cyber Systems, this is far from being a complete methodology for this task.  You need to develop that methodology first, both to guide whoever will be doing this and to show the auditors how you came up with your list of BCS. 

The methodology you develop should show how you will apply the BROS to develop a list of systems that are potentially BCS.  Very briefly, “potential BCS” are systems that support one or more BROS and are associated with a Medium impact substation/Facility.  For each of these potential BCS, you then need to ask, “Does this affect the reliable operation of the BES within 15 minutes?”  If the system doesn’t do that, it isn’t a BCS; if it does, it is one.  This is because a BCS is made up of BCAs, and the 15-minute criterion is part of the definition of BCA; what doesn’t meet this definition isn’t a BCA, and a system with no BCAs isn’t a BCS.
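The two-stage filter just described can be sketched in a few lines of Python. This is a simplified illustration, not a methodology: the class and field names are hypothetical, and each boolean stands in for a judgment your entity has to make and document (including its own interpretation of the 15-minute question).

```python
from dataclasses import dataclass


@dataclass
class CandidateSystem:
    name: str
    supports_bros: bool              # supports at least one BROS
    medium_association: bool         # associated with a Medium impact substation/Facility
    affects_bes_within_15_min: bool  # outcome of the entity's own 15-minute assessment


def top_down_bcs(systems):
    """Return the names of systems that survive both stages of the top-down filter."""
    # Stage 1: "potential BCS" support a BROS and are tied to a Medium substation/Facility.
    potential = [s for s in systems if s.supports_bros and s.medium_association]
    # Stage 2: only systems that pass the 15-minute question are BCS.
    return [s.name for s in potential if s.affects_bes_within_15_min]


systems = [
    CandidateSystem("protection", True, True, True),
    CandidateSystem("metering", True, True, False),
    CandidateSystem("hvac", False, True, True),
]
print(top_down_bcs(systems))  # ['protection']
```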

At this point, you should make a list of the component Cyber Assets within each BES Cyber System, but you don't need to actually classify these into BES Cyber Assets or Protected Cyber Assets (as I originally thought; see this correction to this post).

Task 4: “Bottom-Up” Identification of BCS
Whether you're dealing with a substation or a generating plant, you do need to do the bottom-up analysis to identify BCS.  In the bottom-up approach, you start with the definition of BES Cyber Asset, apply it to each Cyber Asset, and then determine which Cyber Assets are BCAs (and remember to use the "interpretation" you did of "affect the reliable operation of the BES" in the BCA definition, as described in Task 2 above).  Finally, you aggregate BCAs into BCS, so that every BCA is included in at least one BCS (and a BCS can consist of a single BCA).

There is a big difference in how you apply this process to generating plants vs. substations, though. For plants, you only need to apply the bottom-up analysis to the Cyber Assets that haven't already been included in a BES Cyber System as a result of the top-down analysis.  Since these are already in scope for v5, you don't need to spend your time deciding again whether they're in scope.  For substations, since you aren't doing the top-down analysis, you need to apply the bottom-up analysis to every Cyber Asset you've identified.

For both substations and plants, you now need to include all BES Cyber Assets in at least one BES Cyber System.  But there is a difference in how you do this for the two asset types.  For substations, you simply create as many BCS as are required to include every BCA.  But for generating plants, since you have already identified a number of BCS in the top-down analysis, you can always look first to those BCS when you're trying to find "homes" for the BCAs you've just identified in bottom-up.  You can then create new BCS to "house" any BCAs that don't fit into an existing one (and of course, you always have the option of making every BCA its own BCS).
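As a rough illustration of the bottom-up pass, the Python sketch below skips Cyber Assets already covered by a top-down BCS (the generating plant case) and, for everything else, applies a BCA test supplied by the caller. The function name, the `is_bca` callback and the "BCS-" naming are all hypothetical. For simplicity the sketch uses the most basic grouping option mentioned above, making each newly found BCA its own BCS; how you actually group them is up to your entity.

```python
def bottom_up(cyber_assets, is_bca, existing_bcs=None):
    """Return a dict mapping BCS name -> set of Cyber Asset names.

    cyber_assets: iterable of Cyber Asset names from the inventory.
    is_bca:       callable applying the entity's own BCA determination.
    existing_bcs: BCS already found top-down (plants); None for substations.
    """
    bcs = {name: set(members) for name, members in (existing_bcs or {}).items()}
    already_in_scope = set().union(*bcs.values())
    for asset in cyber_assets:
        if asset in already_in_scope:
            continue  # already in a top-down BCS, no need to assess it again
        if is_bca(asset):
            bcs[f"BCS-{asset}"] = {asset}  # simplest grouping: one BCA per BCS
    return bcs


# Substation: no top-down list, so every Cyber Asset is assessed.
print(bottom_up(["relay-1", "relay-2", "printer"], lambda a: a.startswith("relay")))

# Plant: "dcs-1" was already placed in a BCS by the top-down analysis.
print(bottom_up(["dcs-1", "plc-9"], lambda a: True, {"Unit control": {"dcs-1"}}))
```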

Of course, when I talk about combining BCAs into BCS, I'm not saying how that should be done. This is because there are no "directions" for this in CIP-002-5.1, the NERC Glossary definition of BCS, or the CIP-002 Guidelines and Technical Basis.  NERC did publish a draft Lessons Learned document that at least gives you some ideas on what you can do - but in the end, this is something that's up to the entity to determine, with the idea that some choices will lead to a much easier CIP v5 compliance job than others will.

Speaking of this Lessons Learned document, I think it provides several good ideas and one spectacularly bad one: the idea that you could group BCAs into BCS differently in order to comply with different requirements - i.e. you would comply with one requirement using one set of BCS and another requirement with a potentially different set.  There is nothing illegal about doing this, but in my opinion it would create all sorts of problems in other parts of the v5 compliance effort.  I'll hopefully have a post out on this soon.

For a substation, the list of BES Cyber Systems you develop in this step is your final list.  However, for generating plants - since you have already developed a BCS list as part of the top-down analysis - you need to combine the bottom-up with the top-down list.  This, then, is the list of BCS that is the final outcome of R1.

Task 5: Draw Preliminary ESP
You may wonder what drawing the ESP has to do with complying with CIP-002-5.1 R1.  After all, that requirement is solely concerned with identifying and classifying BES Cyber Systems, not PCAs and certainly not ESPs.  The reason I have this task here is that, as we get into classifying BCS as Medium or High vs. Low impact, it is important to know which of these are networked with which others.  This is important not from a strictly compliance point of view but more from an operational point of view: if you don’t know where your ESP is drawn, you don’t know whether a Cyber Asset – that isn’t a BES Cyber Asset – is a PCA or not.  As a consequence, you may make decisions about classifying BCAs/BCSs that will result in your over-classifying Cyber Assets as BCAs.

There is a big difference between drawing the ESP in CIP v5 vs. in v3.  In v3, you needed to include all Critical Cyber Assets in an ESP.  In v5, you just need to include those BCS that are connected on an internal routable network (this is of course a different issue than whether an asset has external routable connectivity).  Other than that, you should draw your ESP(s) just as you did in v3, trying to include only those Cyber Assets that need to be in it, and making networking changes to reduce their number as much as possible.

Task 6: Classifying BES Cyber Systems
We now come to probably the most important task, and the only one that is explicitly called out in R1.  Each of your BES Cyber Systems needs to be classified according to the Facility/substation with which it is associated.  If the Facility/substation is Medium impact, the BCS will be Medium; if it is Low, the BCS will be Low.

There is an exception to this rule in the case of relays, located in an otherwise Low impact substation, that are associated with a line that meets criterion 2.5 as Medium impact (i.e. “far-end” relays in a transfer-trip scheme).  NERC’s recent Lessons Learned document on this topic, and their previous pronouncements, seem to make clear that such relays will not be Medium impact but Low; this is because of the specific wording of criterion 2.5 (thus, this provision only applies to that criterion). 

I need to point out that, IMHO, if you are taking the interpretation that criteria 2.4 – 2.8 apply to entire substations, not to particular Facilities in those substations (as discussed at the beginning of this post), then it isn’t clear that this “exemption” will apply to your far-end relays.  The Lessons Learned document and the wording of criterion 2.5 make it quite clear that this exemption applies to the Facility – i.e. a line between 200 and 499 kV – not to the substation itself.  So if you interpret 2.5 as classifying an entire substation[v] as Medium impact, then strictly speaking you should classify your far-end relays as Medium BCS.  However, my guess is that no auditor would issue you a PV for calling the far-end relays Low impact, at least if they didn’t want to come out to find their tires had been slashed. 

You finish this task by listing your Medium impact BES Cyber Systems, and the substation/Facility with which they are associated.
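The classification rule and its one exception can be written down as a very small Python function. This is only a sketch under the assumptions in this post: it covers Medium and Low only (the substation case), the function and parameter names are hypothetical, and it takes the Lessons Learned reading that far-end relays on a criterion 2.5 line, located in an otherwise Low impact substation, are Low.

```python
def classify_bcs(associated_impact,
                 far_end_relay_on_2_5_line=False,
                 in_otherwise_low_substation=False):
    """Return "Medium" or "Low" for a BES Cyber System.

    associated_impact: impact of the Facility/substation the BCS is associated with.
    The two flags describe the far-end relay exception tied to criterion 2.5.
    """
    if associated_impact not in ("Medium", "Low"):
        raise ValueError("this sketch only handles Medium and Low")
    if far_end_relay_on_2_5_line and in_otherwise_low_substation:
        return "Low"  # exception: applies to criterion 2.5 only
    return associated_impact  # general rule: the BCS inherits the impact


print(classify_bcs("Medium"))              # Medium
print(classify_bcs("Medium", True, True))  # Low
```

If your entity reads criteria 2.4 – 2.8 as applying to whole substations, the exception branch may not be available to you, for the reasons discussed above.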

Task 7: List of Low Impact Assets
The final list you need is one of Low impact assets (although for not-very-good reasons they are called “assets containing Low impact BES Cyber Systems” in R1).  This will of course be all of the Transmission substations that aren’t Medium impact.  However, if some of the Medium substations do contain Low BCS (meaning you identified BCS based on the Facility they’re associated with, not the substation – this can then result in a mixture of Medium and Low BCS at the same substation), you need to list these as Lows as well.

Task 8: Distribution Provider Assets
There are many NERC entities that are registered both as a TO or TOP (and thus could have Medium impact substations) and as a Distribution Provider.  Section 4.2.1 of CIP-002-5.1 lists four types of assets, owned by some Distribution Providers, which are in scope for CIP Version 5.  In other words, if an entity is registered as a DP, it needs to treat any of these assets that it owns as in scope for CIP v5, even though they are Distribution assets and wouldn’t otherwise be subject to CIP.  Each of these assets needs to be added to the Low impact list, since none would meet one of the Medium criteria.
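Putting Tasks 7 and 8 together, the Low impact list is a union of three groups. A minimal Python sketch, with a hypothetical function name and made-up asset names, might look like this:

```python
def low_impact_assets(all_substations, medium_substations,
                      medium_with_low_bcs, dp_assets):
    """Build the Low impact asset list from the three sources described above."""
    # 1. Transmission substations that aren't Medium impact.
    lows = set(all_substations) - set(medium_substations)
    # 2. Medium substations that also contain Low BCS (Facility route only).
    lows |= set(medium_substations) & set(medium_with_low_bcs)
    # 3. Distribution Provider assets in scope under Section 4.2.1.
    lows |= set(dp_assets)
    return sorted(lows)


print(low_impact_assets(
    all_substations={"S1", "S2", "S3"},
    medium_substations={"S1", "S2"},
    medium_with_low_bcs={"S2"},
    dp_assets={"UFLS-1"},
))  # ['S2', 'S3', 'UFLS-1']
```

If you classified by substation rather than by Facility, the second group will simply be empty.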

And Now, the Moral of Our Story
The purpose of this post has been twofold.  First, it has hopefully at least given you an idea of everything I currently see that needs to be included in a CIP v5 methodology, so you can use it as a completeness check for your own methodology.  But more importantly, I hope it has shown you (with "you" meaning NERC entities, FERC, the NERC regions and NERC itself) how serious the problems are with this requirement.  If you can't develop a defined methodology for complying with a requirement, you simply have a requirement that can't be complied with in any meaningful sense of the word; and that's what R1 is.  More on this in the next post, which brings this four-part series to its exciting conclusion.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Honeywell.


[i] The fact that the best way I can express this idea is to use a term from the days when programming involved spending a lot of time typing punch cards perhaps tells you the last time I did any serious programming (in FORTRAN).  I’m told you no longer have to use punch cards to write programs, which I’m glad to hear.

[ii] Criteria 2.2, 2.9 and 2.10 could also identify substations that contain Medium BES Cyber Systems, although you need to make some modifications to the remaining Tasks to properly deal with these criteria.  Of course, I'm completely glossing over the fact that technically none of the criteria apply to substations.  They apply to Facilities that may be found at substations.

[iii] In fact, the term Lesson Learned is quite inappropriate here, since the documents NERC is coming out with address topics that nobody has had to address so far, meaning no lessons can yet have been learned.  But “Lessons Learned” is an established category of documents with NERC, which provides a convenient fig leaf covering the fact that these are really quasi-Interpretations (capital I), and thus pushing the bounds of legality in the NERC world.

[v] In my opinion, to say that criteria 2.4 – 2.8 apply to substations, not Facilities, is to ignore the clear meaning of the words in those criteria.  The reason why you might nevertheless want to do this is that it will make your job of classifying BES Cyber Systems easier, although it will most likely increase the number of BCS that you end up classifying as Medium impact.
