Managing Resource Tags in Azure

16 October 2022 by Martin Dahlborg

Perspective

As a cloud security architect, I analyze the state of Azure, AWS and GCP environments. Checking how tags have been applied is one of the two fastest ways to get an indication of the current state of an environment. The second is checking whether a naming convention has been followed. In this article I’ll provide recommendations on how to lay a foundation for a tag management process. Although I’ll focus on how to define tag management requirements, I’ll also briefly discuss how to assign responsibility and how to measure improvements over time. I’ve put together a “Resource Tagging Guideline” document, which you’ll receive access to by subscribing to our newsletter. The guideline document is a real-life example based on how I’ve defined tagging requirements in the past while working as a consultant. It contains a long list of tags that cover the most common use cases. I hope this article helps you improve the maturity of the tag management process within your organization.

While the primary concerns of a security architect are related to IT security, many of the prerequisites for achieving a certain IT security baseline depend on governance processes being in place, i.e., having control over changes, having quality-assured configurations, issuing approvals in a formal way, and so on. Having such governance controls in place reduces the risk of security incidents. From my perspective, the main benefit of adding tags is that it makes it easier to see whether responsibility has been assigned, whether governance activities are performed, and whether the supporting platform services (backups, monitoring, logging) are in place. Reviewing tags is a major part of how I reach a conclusion on whether the organization has established control of the environment and the resources therein.

Methodology

In my experience, things don’t improve unless a few things come together. For example, there’s almost never an improvement in security if policies are in place but no one adheres to them, if responsibility is assigned but no additional resources are allocated, or if someone excels at performing a task manually but there’s no consistency because everyone else does it differently. To succeed, you need to ensure that people have the necessary knowledge, access to appropriate tools, and a clear understanding of what is required and why. You also need to automate whenever possible, as time is always a limitation.

My recommendation is to base your approach on a light-weight version of best practices that heavily regulated organizations are using. The basic steps would be to:

  1. Define the requirements
  2. Assign responsibility
  3. Implement a technical solution (automated analysis)
  4. Measure improvements over time

Defining Requirements

The requirements should be clearly stated in a policy, guideline, or standard, i.e., a document that is part of the organization’s Information Security Management System (ISMS). The “Resource Tagging Guideline” mentioned in this article is an example of what that could look like (see screenshot below). It’s primarily aligned to Microsoft’s recommendations and best practices. However, I’ve also looked at the recommendations from Amazon and Google and added those into the mix. Check out the references at the bottom of this article if you want to read more of the tagging-related information they have published.

[Screenshot: Resource Tagging Guideline]


Subscribe to our newsletter to receive your copy of the Resource Tagging Guideline.


Resource Tagging Guideline

The foundational principle is that each tag that you add to a subscription, resource group or resource should provide a clear value to at least one stakeholder. Try to reduce the number of tags included in the guideline based on what is actually needed and on what you see when performing reviews.

The audience of such a guideline is cloud architects, engineers, and developers. It must be sufficiently detailed so that confusion can be avoided and mistakes can be prevented. One tell-tale sign that the guideline itself needs improving is that you see a lot of mistakes being made once you start measuring compliance. When you reach out to people, they’ll likely tell you that they couldn’t understand what they should have done instead, or how. This is one reason any guideline should be revised at least annually to address any issues that come to light during that time.

Here is a screenshot of the “Resource Tagging Guideline” so you can get a sense of what it contains:

[Screenshot: Resource Tags]


Looking at the screenshot above, you will notice some details. The first thing to point out is that tags are separated into two categories: mandatory and optional. Mandatory tags should be applied to everything. Any tag that can’t be applied to all resources should be optional. There will likely be exceptions to this rule, but the general principle still applies.
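The mandatory/optional split lends itself to a simple automated check. The following is a minimal sketch, assuming the tags of a resource have already been exported into a plain dictionary. The tag keys and the function names are hypothetical and are not taken from the guideline document.

```python
# Hypothetical tag keys; the actual guideline defines its own lists.
MANDATORY_TAGS = {"CostCenter", "Owner", "Environment"}
OPTIONAL_TAGS = {"Project", "BackupPolicy"}


def missing_mandatory_tags(tags):
    """Return mandatory keys that are absent or have an empty value."""
    present = {key for key, value in tags.items() if value.strip()}
    return MANDATORY_TAGS - present


def unknown_tags(tags):
    """Return keys that are listed as neither mandatory nor optional."""
    return set(tags) - MANDATORY_TAGS - OPTIONAL_TAGS


resource_tags = {"CostCenter": "1234", "Owner": "", "costcentre": "1234"}
print(sorted(missing_mandatory_tags(resource_tags)))  # ['Environment', 'Owner']
print(sorted(unknown_tags(resource_tags)))            # ['costcentre']
```

Treating an empty value as missing is a design choice in this sketch; an organization could just as well decide to report empty values as a separate finding.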

Any tags that are generated by the cloud service itself can be disregarded as there’s no value in including them in the guideline.

The purpose of using a tag can be said to tie into one of four contexts. The term “context” describes which type of processes in the organization benefit from the tag being assigned.

Tagging contexts:

  • Business: Specifying which department, or project, is accountable and/or responsible for the resources and related costs.
  • Service: Identifying specific workloads, services, applications, or groupings of resources that stretch across different Azure subscriptions and/or resource groups. These tags are also used to provide insight into solution architecture, design, and documentation.
  • Security: Providing visibility into how security non-functional requirements (NFRs) have been implemented to secure access, limit exposure and reduce risk.
  • Governance: Describing who is responsible for performing governance activities. These tags also provide insight into which governance controls have been implemented to perform monitoring, updates, and to ensure availability.
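If the guideline is also kept in a machine-readable form, the context can be recorded next to each tag definition. This is only a sketch of one possible data structure. The four contexts come from the list above, while the tag keys, the class names and the helper function are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Context(Enum):
    BUSINESS = "Business"
    SERVICE = "Service"
    SECURITY = "Security"
    GOVERNANCE = "Governance"


@dataclass(frozen=True)
class TagDefinition:
    key: str
    context: Context
    mandatory: bool


# Hypothetical entries, for illustration only.
GUIDELINE = [
    TagDefinition("CostCenter", Context.BUSINESS, True),
    TagDefinition("ApplicationName", Context.SERVICE, True),
    TagDefinition("DataClassification", Context.SECURITY, False),
    TagDefinition("BackupPolicy", Context.GOVERNANCE, False),
]


def keys_by_context(definitions):
    """Group tag keys by the context they belong to."""
    grouped = {context: [] for context in Context}
    for definition in definitions:
        grouped[definition.context].append(definition.key)
    return grouped


for context, keys in keys_by_context(GUIDELINE).items():
    print(context.value, keys)
```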

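In Azure, tag values are case-sensitive. Tag names are case-insensitive for operations, but the casing you provide may be preserved and can show up in exports such as cost reports. It’s therefore necessary to specify the exact format and casing that tag keys should use. As an example, the cost center tag key might be written as “costcenter”, “cost-center”, “CostCenter” (upper camel case), or “costCenter” (lower camel case).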
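Deviations in casing and format can be found by folding every observed key to a common form and comparing it with the canonical spelling. A minimal sketch follows, assuming upper camel case is the agreed convention. The list of canonical keys and the function names are hypothetical.

```python
import re

# Hypothetical canonical keys, written in upper camel case.
CANONICAL_KEYS = ["CostCenter", "Owner", "Environment"]


def fold(key):
    """Lowercase a key and drop separators so that variants compare equal."""
    return re.sub(r"[^a-z0-9]", "", key.lower())


FOLDED_TO_CANONICAL = {fold(key): key for key in CANONICAL_KEYS}


def casing_deviations(observed_keys):
    """Map each deviating key to the canonical key it most likely stands for."""
    deviations = {}
    for key in observed_keys:
        canonical = FOLDED_TO_CANONICAL.get(fold(key))
        if canonical is not None and key != canonical:
            deviations[key] = canonical
    return deviations


observed = ["costcenter", "cost-center", "CostCenter", "costCenter", "Team"]
print(casing_deviations(observed))
```

Keys that do not fold to any canonical key, such as “Team” here, are left for a separate check on unknown tags.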

A tag could also have a pre-defined set of allowed values. Such tags should be evaluated to see whether each value corresponds to an allowed value or not.
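Checking values against an allowed set is a straightforward lookup. The sketch below assumes a hypothetical mapping from tag keys to allowed values. The keys and values shown are examples, not the ones from the guideline document.

```python
# Hypothetical allowed values per tag key.
ALLOWED_VALUES = {
    "Environment": {"Dev", "Test", "Prod"},
    "DataClassification": {"Public", "Internal", "Confidential"},
}


def invalid_values(tags):
    """Return the tags whose value is not in the allowed set for that key."""
    return {
        key: value
        for key, value in tags.items()
        if key in ALLOWED_VALUES and value not in ALLOWED_VALUES[key]
    }


tags = {"Environment": "prod", "DataClassification": "Internal", "Owner": "team-a"}
print(invalid_values(tags))  # {'Environment': 'prod'}
```

The comparison is deliberately exact, so “prod” is reported even though “Prod” is allowed. Tags without a pre-defined set, such as “Owner” here, are not evaluated.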

Specifying these detailed requirements for both tag keys and tag values makes it easier to filter and sort output from scripts or other forms of automated analysis. It also makes it possible to detect mistakes and to see to it that they are corrected.

If you also have a Cost Management Guideline in place, then you might also need to specify which tags are mandatory for resources that must be deallocated according to a schedule (using automation scripts or similar). The purpose of this is to reduce cost when resources are not in use.
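How a schedule tag could drive such automation is sketched below. The tag key “DeallocationWindow” and its “HH:MM-HH:MM” value format are assumptions made for this example; neither is an Azure convention nor taken from the guideline document. The function only decides whether a resource should currently be deallocated and leaves the actual deallocation to whatever automation is in use.

```python
from datetime import time

SCHEDULE_TAG = "DeallocationWindow"  # hypothetical tag key


def parse_window(value):
    """Parse a value such as '19:00-07:00' into a (start, end) pair of times."""
    start_text, end_text = value.split("-")
    return time.fromisoformat(start_text), time.fromisoformat(end_text)


def should_be_deallocated(tags, now):
    """Return True if 'now' falls inside the window given by the schedule tag."""
    value = tags.get(SCHEDULE_TAG)
    if value is None:
        return False  # no schedule tag: leave the resource alone
    start, end = parse_window(value)
    if start <= end:
        return start <= now < end
    # The window wraps around midnight, e.g. 19:00-07:00.
    return now >= start or now < end


vm_tags = {SCHEDULE_TAG: "19:00-07:00"}
print(should_be_deallocated(vm_tags, time(23, 0)))  # True
print(should_be_deallocated(vm_tags, time(12, 0)))  # False
```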


Resource Tagging Guideline video on YouTube

I’ve recorded a video that is available on our YouTube channel. In it I go through the Resource Tagging Guideline document itself and provide more details regarding each tag, what the reason could be behind wanting to use it, and a few other tagging related topics.

Responsibility

The “Responsibility assignment matrix” model, also known as the “RACI” model, is commonly used when there’s a need to clarify responsibility. Using this model, it’s quite easy to include this information in a document, on an Intranet, in Confluence or similar. This is what I usually work with when it comes to documenting who, or which teams, have some form of responsibility for governance and security related activities.

The RACI model consists of these four categories:

  • Responsible: The person or team responsible for carrying out one or more repeatable tasks to deliver an outcome. Usually consists of architects, engineers, developers, or a team lead.
  • Accountable: A single decision-maker who is accountable for the success or failure of tasks that are carried out. Usually consists of a manager or project manager.
  • Consulted: Person(s) that are required to provide details or input on requirements before work is started. These also carry out reviews once tasks have been completed. For functional requirements this would be a product owner or backlog owner. For non-functional requirements it’s usually the Architect Board or the IT security department.
  • Informed: Those that should be kept up to date on major changes. Usually senior management, portfolio management, or the information security department.
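A RACI matrix kept as structured data can be checked for gaps automatically. The sketch below assumes a hypothetical matrix keyed by activity, with made-up activities and teams. It verifies the two rules that follow directly from the definitions above: every activity needs at least one Responsible party and exactly one Accountable decision-maker.

```python
# Hypothetical activities and teams, for illustration only.
RACI_MATRIX = {
    "Maintain the tagging guideline": {
        "R": ["Cloud architects"],
        "A": ["IT manager"],
        "C": ["IT security department"],
        "I": ["Senior management"],
    },
    "Correct tagging deviations": {
        "R": ["Engineers", "Developers"],
        "A": [],
        "C": ["Cloud architects"],
        "I": ["IT security department"],
    },
}


def raci_problems(matrix):
    """Return a list of human-readable problems found in the matrix."""
    problems = []
    for activity, roles in matrix.items():
        if not roles.get("R"):
            problems.append(f"{activity}: no Responsible party")
        if len(roles.get("A", [])) != 1:
            problems.append(f"{activity}: needs exactly one Accountable")
    return problems


for problem in raci_problems(RACI_MATRIX):
    print(problem)
```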

Measuring maturity

Once requirements have been defined, responsibility has been assigned, and a technical solution has been implemented, one last step remains: verifying that everyone is adhering to the requirements and following the guidelines correctly, and that improvements are being made over time. Measuring is also beneficial if, as a security architect, you have to provide proof of compliance to an internal audit department or to external auditors who have been contracted to review regulatory compliance.

To this end, the (outdated) “Capability Maturity Model” (CMM) can be used as a basic framework for measuring maturity. Although it’s been replaced by the CMMI model, the CMM model will still serve its purpose if you only want a light-weight structured way to measure how well a process is performing.

The term “maturity” relates to the degree of formality and optimization of processes from ad hoc practices, to formally defined steps, to managed result metrics, to active optimization of the processes.

Capability Maturity Model levels:

  • Level 1: Initial. Chaotic, ad hoc, undocumented, unstructured, unrepeatable, manual.
  • Level 2: Repeatable. Documented, repeatable, consistent results.
  • Level 3: Defined. The process has been defined as a standard business process.
  • Level 4: Managed. The process is performing according to agreed-upon Key Performance Indicators (KPIs) or similar metrics.
  • Level 5: Optimizing. Process management includes deliberate optimization and improvements.
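A metric of the kind Level 4 refers to could be as simple as the share of resources that carry all mandatory tags, tracked from one review to the next. The following is a sketch under stated assumptions: the tag keys, the snapshot labels and the tag data are all made up for illustration, and each resource is represented only by its tag dictionary.

```python
MANDATORY_TAGS = {"CostCenter", "Owner"}  # hypothetical keys


def compliance_rate(resources, mandatory=MANDATORY_TAGS):
    """Return the percentage of resources that carry every mandatory tag."""
    if not resources:
        return 0.0
    compliant = sum(1 for tags in resources if mandatory.issubset(tags))
    return 100.0 * compliant / len(resources)


# Hypothetical review snapshots: label -> list of per-resource tag dictionaries.
snapshots = {
    "review-1": [
        {"CostCenter": "1234"},
        {"CostCenter": "1234", "Owner": "team-a"},
    ],
    "review-2": [
        {"CostCenter": "1234", "Owner": "team-b"},
        {"CostCenter": "1234", "Owner": "team-a"},
    ],
}

trend = {label: compliance_rate(resources) for label, resources in snapshots.items()}
print(trend)  # {'review-1': 50.0, 'review-2': 100.0}
```

Returning 0.0 for an empty snapshot is an arbitrary choice in this sketch; reporting it as “not measured” may be more appropriate in practice.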

Performing reviews

One final note on what to look for if you are performing reviews yourself. When I perform a review of an environment, these are the things that I look for. They are basically the opposites of the recommendations that I have provided in this article and included in the guideline document.

Possible issues:

  • Untagged resource groups
  • Possible inconsistencies between tag values when the same tag has been applied at the resource group level and at the resource level (mutual exclusivity).
  • Missing mandatory tags
  • Misspellings
  • Wrong casing used
  • Tagging applied differently in different environments
  • Certain tags have values which deviate from what’s allowed
  • Tags contain contact information that is no longer relevant i.e., a person has left the organization, changed roles or the organization itself has changed structure
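Two of the issues above, untagged resource groups and inconsistent values between a resource group and its resources, can be detected with a few lines of code. This is a minimal sketch, assuming the tags have already been exported into plain dictionaries. The group names, tag keys and function names are hypothetical.

```python
def untagged_groups(groups):
    """Return the names of resource groups that carry no tags at all."""
    return sorted(name for name, tags in groups.items() if not tags)


def conflicting_tags(group_tags, resource_tags):
    """Return tags set on both levels whose values differ, as (group, resource)."""
    shared = group_tags.keys() & resource_tags.keys()
    return {
        key: (group_tags[key], resource_tags[key])
        for key in sorted(shared)
        if group_tags[key] != resource_tags[key]
    }


# Hypothetical export of resource group tags.
groups = {
    "rg-app-prod": {"Environment": "Prod", "CostCenter": "1234"},
    "rg-sandbox": {},
}
vm_tags = {"Environment": "Test", "CostCenter": "1234", "Owner": "team-a"}

print(untagged_groups(groups))                          # ['rg-sandbox']
print(conflicting_tags(groups["rg-app-prod"], vm_tags))  # {'Environment': ('Prod', 'Test')}
```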

Possible conclusions:

  • Work is done ad hoc (CMM level 1)
  • The Change Management process is deficient
  • Responsibility hasn’t been clearly defined
  • Unnecessary costs are accrued due to a lack of automated deallocation of resources when they aren’t needed
  • Unacceptable residual risks exist due to insufficient security levels regarding confidentiality and availability

Final note

In this article I’ve presented an approach to implementing Resource Tags successfully in an Azure environment. I hope that this information was of value and that it will help you get started if you haven’t done so already, or that it might help you improve certain aspects of what you already have in place.

As a reminder, if you haven’t done so already, subscribe to our newsletter to receive your copy of the Resource Tagging Guideline.



References

Cloud Service Provider recommendations:

Implementing tagging in Azure:

Other references: