Machine identity management is the process of governing and orchestrating the identities – digital certificates and keys – of machines – devices, workloads, applications, containers, IoT, etc. Machine identity management is essential for data security, integrity, and compliance, as it authenticates communicating parties and ensures all traffic is encrypted.
When a user signs up for an application, they choose a username and a password, which get stored in a database. During subsequent logins, the application checks the username and password that they enter against the credentials stored in the database. If they match, the user is authenticated and can access the application. The username and password here serve as the user’s identity.
Just as humans go through an authentication process to securely access applications, machines, too, need to be authenticated to communicate securely with each other. Machines here mean just about anything in the digital world other than humans: servers, applications, websites, software, virtual machines, container workloads, IoT devices, and so on. And just as humans have usernames and passwords to establish their identity, machines have digital certificates and keys that form their identity. Secure communication protocols such as HTTPS, SSH, and FTPS validate and authenticate machine identities to secure network communications.
Here’s how machine identity works in the most common machine-to-machine communication – server and client communication:
When a client initiates a session with a web server, the server responds to the connection request by sending its digital certificate (TLS/SSL certificate) to the client. The client then validates the certificate and authenticates the server. Sometimes, the server also asks the client to prove its identity by presenting its own certificate (mutual TLS), which is usually the case with sensitive applications. Once authentication succeeds, the server and client agree on session keys for encryption and integrity checking, and the session is established.
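The client side of this exchange can be sketched with Python's standard `ssl` module. This is a minimal illustration, not a production client: the function names `build_client_context` and `fetch_server_identity` are made up for this example, and the optional client certificate paths are assumptions about where a client identity might live.

```python
import socket
import ssl


def build_client_context(client_cert=None, client_key=None):
    """Return a TLS context that validates the server's certificate."""
    # create_default_context() loads the system's trusted CA store and turns on
    # certificate validation and hostname checking.
    context = ssl.create_default_context()
    # Optional mutual TLS: present our own certificate if the server asks for one.
    if client_cert is not None:
        context.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return context


def fetch_server_identity(host, port=443, timeout=5.0):
    """Connect to host, complete the handshake, and return certificate details."""
    context = build_client_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The handshake runs inside wrap_socket(); it raises
        # ssl.SSLCertVerificationError if the certificate cannot be validated.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
            return {
                "subject": cert.get("subject"),
                "issuer": cert.get("issuer"),
                "not_after": cert.get("notAfter"),
                "tls_version": tls.version(),
                "cipher": tls.cipher()[0],
            }
```

Calling `fetch_server_identity("example.com")` needs network access and returns the server's subject, issuer, and expiry date only if validation succeeds, which mirrors the "validate, then establish the session" order described above.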
As we discussed earlier, machine identities are established using digital certificates and keys. But, the certificates and key types differ based on the machine, communication protocol, and usage. Here, we’ll see some of the commonly used certificates and keys that make up machine identity.
X.509 certificates: These are the most widely used machine identity certificates and are the foundation of PKI (Public Key Infrastructure). X.509 certificates are used for server-client authentication over the HTTPS protocol (which in turn is based on the TLS/SSL protocol), and in digitally signing offline applications.
SSH keys and certificates: SSH keys are used by system administrators and power users to connect to and manage network devices remotely. Because SSH keys authenticate access to critical IT systems, often with elevated privileges, they need to be protected as carefully as TLS/SSL certificates. While using SSH certificates for authentication is not common practice, it is recommended as it eliminates the manual, insecure process of key approval and distribution.
Code-signing certificates: These certificates are used to digitally sign software, applications, scripts, and executable files, as a way of establishing their origin and preventing them from being tampered with.
Symmetric keys: Symmetric keys are used in data encryption-in-transit (after the client-server session has been secured with public-key cryptography), data encryption-at-rest (in databases), and encrypting credit card and other personally identifiable information (PII) in banking. Symmetric-key encryption is a lot faster and more efficient than public-key cryptography, but both parties must hold the same secret key, so the key has to be distributed and stored securely.
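To make one of these identity types concrete, here is a small sketch of how an SSH public key can be reduced to the SHA256 fingerprint that OpenSSH tools display, which is a handy identifier when tracking keys. The function name `ssh_key_fingerprint` is hypothetical, and the sketch assumes a plain public key line without leading `authorized_keys` options.

```python
import base64
import hashlib


def ssh_key_fingerprint(public_key_line):
    """Return the SHA256 fingerprint of one OpenSSH public key line."""
    # A public key line looks like: "<key-type> <base64-blob> [comment]".
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<key-type> <base64-blob> [comment]'")
    blob = base64.b64decode(fields[1], validate=True)
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest as base64 with the trailing '=' padding removed.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")
```

Because the fingerprint depends only on the key material, the same key found on two servers produces the same value, which makes duplicates and forgotten copies easier to spot.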
Imagine someone stealing your identity. They have access to all of your personal information: your credit, social security number, club passes, social media accounts, etc. In effect, they have everything they need to impersonate you. They can withdraw huge sums from the ATM. They can perform wire transfers from your account to theirs. They can even trick your friends into doing them favors.
Something similar happens when machine identities get stolen, only on a much larger and more severe scale. This is because the “machine” in question, say a banking application, holds information on tens of thousands of individuals. When such an application’s identity is compromised, the threat actor gains unhindered access to sensitive customer data and can exploit it however they want.
When the identity of a critical network device, like a web server or a load balancer, is compromised, the attacker gains access to the deep network. Sometimes they even gain administrative privileges, and with that, they can insert malicious code into critical devices and cause them to malfunction, or even shut down systems. This can cause the organization’s applications to go down or get breached, resulting in irreparable damage that affects both customers and internal users.
Apart from the loss in revenue and the damage to reputation, organizations that face a data breach can be deemed non-compliant with government or industry regulations and may need to pay hefty sums in reparation to their customers and the respective bodies.
Also, with the growth of virtual, cloud, and containerized workloads, machine identities are becoming more numerous and varied in nature. Container-based services can have a lifespan of mere seconds, and their TLS certificates, too, are ephemeral. Without careful management of container PKI, the workloads could be exposed to insecure traffic.
Another reason why machine identity management is critical is the rise of connected devices. IoT has a strong presence in the healthcare, automotive, manufacturing, and consumer electronics industries. As in IT, secure, encrypted communication between IoT devices involves X.509 certificate enrollment and authentication of the devices, but since they’re usually present in highly insecure environments, their identities need extra protection.
Below are a few of the common reasons that lead to machine identity theft:
Expired certificates: This is one of the most common reasons behind application outages and data breaches. When a user tries accessing a website with an expired certificate, the browser displays a warning that the connection is insecure, making the application effectively inaccessible. Sometimes, a certificate in software that’s responsible for monitoring and filtering out dubious traffic can expire, which causes the software to stop functioning (as happened in the Equifax breach). Applications connected to the vulnerable network are thus rendered defenseless against cyber attacks, leading to crippling data breaches.
Unknown revoked certificates: Digital certificates are revoked before they expire due to private key compromise or when the domain or application to which the certificate is attached is no longer operational. Certificate revocations get broken for the following reasons – the CA may fail to revoke the certificate or delay updating their CRL (certificate revocation list), which leads to the browser recognizing a revoked certificate as valid. Sometimes, browsers may validate a certificate despite its being on the CRL due to shoddy policy implementation. The result is that applications that use revoked certificates lose out on the security that certificates offer and become vulnerable to attacks. If an application or website has been taken down, but its certificate hasn’t been revoked on time, threat actors can use the orphan certificate for phishing attacks.
CA compromise: A CA is compromised when attackers steal the private key with which it signs the certificates that it issues to organizations. The attacker can then use the stolen private key to sign certificates of malicious applications and fool browsers into trusting them. Such certificates, called rogue certificates, are widely used by attackers to propagate phishing and man-in-the-middle attacks. CA-related attacks also happen when a trusted CA fails to properly inspect and validate a sub-CA (internal or external) and makes it an intermediate CA. The rogue intermediate CA can then misuse its authority to sign certificates of fraudulent servers and applications and go undetected for a very long time.
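The first two failure modes above, expiry and revocation, are easy to check mechanically. The following is a minimal sketch using Python's standard library; the function names are hypothetical, the expiry string is assumed to be in the `notAfter` format returned by `ssl.SSLSocket.getpeercert()`, and `revoked_serials` is assumed to be a set of uppercase hex serial numbers taken from an up-to-date CRL.

```python
import ssl
from datetime import datetime, timezone


def _expiry_time(not_after):
    """Parse a 'Jan  1 00:00:00 2030 GMT' style string into an aware datetime."""
    return datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)


def days_until_expiry(not_after, now=None):
    """Return whole days left before the certificate expires (negative if past)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return (_expiry_time(not_after) - now).days


def certificate_status(not_after, serial, revoked_serials, now=None):
    """Classify a certificate as 'revoked', 'expired', or 'valid'."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Revocation takes priority: a revoked certificate must not be trusted
    # even if its validity period has not ended yet.
    if serial.upper() in revoked_serials:
        return "revoked"
    if now >= _expiry_time(not_after):
        return "expired"
    return "valid"
```

A check like this is only as good as its inputs: if the revocation list is stale, a revoked certificate still comes back as valid, which is exactly the failure described above.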
By now, it should be clear that machine identity management is critical to protect data, secure communication, and keep threat actors at bay. However, effectively managing machine identities is no mean task and comes with a fair share of challenges. One reason why machine identities are difficult to manage is that there are so many of them; the rapid rise in both IT machines (applications, web servers, workloads) and OT machines (endpoints such as IoT, mobile, laptops, etc.) has led to a corresponding increase in certificates and keys. Traditional methods cannot keep up with the demands of identity management in the digital era, and if not replaced, can leave organizations open to widespread attacks and outages.
Some challenges are:
Visibility: With the colossal certificate and key numbers, it’s difficult to keep track of how many there are and where they are. Hidden certificates pose the greatest risk to machine identity management as they can expire quietly and cause outages. While it’s hard enough to discover all the certificates within an organization’s network, certificates that reside on devices outside the network perimeter, such as edge and IoT devices, are rarely detected and escape auditing.
Compliance: Properly implementing certificate policies that regulate issuance, validity, trust levels, access privileges, and so on is critical to ensure compliance. Lack of centralized policy management for TLS/SSL certificates and SSH keys creates security loopholes that can be exploited by threat actors.
Storage and distribution: Some PKI teams store certificates and keys in spreadsheets and distribute them via email. As the numbers go up, tracking and managing certificates on spreadsheets becomes error-prone, unsafe, and unwieldy.
Manual management: Manually managing certificate lifecycles is slow, error-prone, and highly inefficient. Manual certificate enrollment and provisioning keeps applications and devices from going online quickly, while manual renewal, revocation, and auditing can potentially cause downtimes and outages.
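The compliance challenge in particular lends itself to a mechanical check. As a rough sketch, a central policy can be expressed as data and every certificate request tested against it before issuance. The class names, field names, and limits used here are illustrative assumptions, not values from any specific standard or product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CertificatePolicy:
    max_validity_days: int
    min_key_bits: int
    approved_issuers: frozenset


@dataclass(frozen=True)
class CertificateRequest:
    common_name: str
    issuer: str
    key_bits: int
    not_before: date
    not_after: date


def policy_violations(request, policy):
    """Return a list of human-readable policy violations (empty if compliant)."""
    violations = []
    validity_days = (request.not_after - request.not_before).days
    if validity_days > policy.max_validity_days:
        violations.append(
            f"validity of {validity_days} days exceeds {policy.max_validity_days}"
        )
    if request.key_bits < policy.min_key_bits:
        violations.append(
            f"key size {request.key_bits} is below {policy.min_key_bits} bits"
        )
    if request.issuer not in policy.approved_issuers:
        violations.append(f"issuer '{request.issuer}' is not approved")
    return violations
```

Keeping the policy in one place means every team is checked against the same rules, instead of each spreadsheet owner applying their own.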
Ramp up visibility: The only way to make sure no machine identity is left unmanaged is to have a thorough scanning process that brings every certificate and key to the light. The scanning should go both broad and deep – exposing identities of devices outside the network perimeter, such as edge and IoT, and also certificates that are buried deep within the network. The scanning tool should also provide details about each certificate, such as its location, CA, and expiration date.
Centralize management: Centralized management of machine identities helps streamline policy implementation across devices, workloads, and environments. You can also group certificates based on their type (internal or external, server-side or client-side, etc.), expiry dates, criticality, etc., and implement group policies. Proper policy management helps machine identities do their job of securing communication and preventing illegal access effectively.
Enable self-servicing: Allow application and network teams to self-service certificate provisioning, renewal, and revocation to make machine identity management faster and more robust. PKI teams may be wary of letting other teams handle identities, but implementing role-based access controls and privileges can limit what the teams can see and do and keep the identities well-protected.
Rethink storage: Store digital certificates and SSH keys away from the user network in a centralized, secure location. Ideally, it should be an encrypted device such as a USB thumb drive, token, or an HSM (Hardware Security Module). This way, even if the users’ network or system is compromised, the certificates and keys remain safe. Restrict access to the storage unit to privileged users using strong passwords and RBAC.
Rotate SSH keys: SSH key rotation is essential to stave off vulnerabilities resulting from ex-employees having access to keys, old cryptographic algorithms, and policy changes. Replacing old SSH keys with new ones based on a set cycle after testing is highly recommended to keep remote access secure.
Embrace automation: Automation is a vital but often missing ingredient in machine identity management, and it addresses many of the problems that stem from manual handling. Automating certificate and key lifecycle management (enrollment, provisioning, renewal, and revocation) helps keep machine identities up-to-date and greatly reduces the risk of expiry-related outages. Processes such as policy management and SSH key rotation can also be automated for better security. Automation also helps enable cryptographic agility: machine identities can stay on top of protocol and algorithm upgrades to offer the best possible protection as requirements change.
Conduct frequent audits: Auditing machine identities helps you find and weed out vulnerabilities such as weak passwords, old, unused keys, and rogue or expiring certificates. Knowing the status of your certificates and keys helps you prevent outages, preempt attacks, and improve your management strategy continuously. You can also automate auditing using a trusted third-party auditing tool.
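Several of these practices (inventory, rotation, and auditing) come together in a simple automated check over a central inventory. The sketch below assumes a hypothetical `IdentityRecord` structure and illustrative thresholds of 30 days for renewal and 90 days for SSH key rotation; real values would come from your own policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class IdentityRecord:
    name: str
    kind: str                        # "certificate" or "ssh_key"
    location: str                    # host or device where the identity lives
    created: date
    expires: Optional[date] = None   # SSH keys usually carry no expiry date
    issuer: Optional[str] = None     # issuing CA, if any


def audit(records, today, renew_window_days=30, ssh_max_age_days=90):
    """Return (name, finding) pairs for identities that need attention."""
    findings = []
    for record in records:
        if record.expires is not None:
            if record.expires < today:
                findings.append((record.name, "expired"))
            elif record.expires - today <= timedelta(days=renew_window_days):
                findings.append((record.name, "expiring soon"))
        # SSH keys do not expire on their own, so flag them by age instead.
        if record.kind == "ssh_key" and (today - record.created).days > ssh_max_age_days:
            findings.append((record.name, "rotation overdue"))
    return findings
```

Run on a schedule, a report like this surfaces certificates before they expire and keys that have outlived their rotation cycle, rather than after an outage reveals them.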
The machine identity lifecycle may be defined as the duration of the identity’s existence, from when it is issued, to when a client chooses to retire it. There are a handful of well-defined components in this cycle. Each stage has to be carefully vetted and monitored, either by an engineer or by software, to ensure that it fully plays its role in securing an endpoint on a network.
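One way to have software monitor each stage is to model the lifecycle as a small state machine that rejects out-of-order transitions. The stage names below are an assumption, borrowed from the enrollment, provisioning, renewal, and revocation steps mentioned earlier; an actual lifecycle may define more or different stages.

```python
from enum import Enum


class Stage(Enum):
    ENROLLED = "enrolled"
    PROVISIONED = "provisioned"
    RENEWED = "renewed"
    REVOKED = "revoked"


# Allowed moves between stages; REVOKED is terminal in this sketch.
ALLOWED_TRANSITIONS = {
    Stage.ENROLLED: {Stage.PROVISIONED, Stage.REVOKED},
    Stage.PROVISIONED: {Stage.RENEWED, Stage.REVOKED},
    Stage.RENEWED: {Stage.PROVISIONED, Stage.REVOKED},
    Stage.REVOKED: set(),
}


def advance(current, target):
    """Return the new stage, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

With a guard like this, an attempt to renew or redeploy a revoked identity fails loudly instead of silently putting a retired certificate back into service.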