Log4j2 more zero days

By William Entriken

3 minutes

The log4j2 API is underspecified and still vulnerable to undefined behavior. Affects all versions up to and including the current 2.16.0.

This quick PR fixes the documentation from its current state of DANGEROUS into the better CORRECT, BUT ANNOYING WITH CAPITAL LETTERS.

The problem

I have also discussed this at https://github.com/apache/logging-log4cxx because every implementation is vulnerable and so they should all have defect reports opened on them. And they should all be blocking on this issue (PR) here.

In the API specification, i.e. Logger.java, function documentations include the word “message”.

Each time it is unspecified whether that means:

Of course because of CVE-2021-44228 we all know that this undefined behavior can and has led to dire consequences. And this confusion has still not been fixed.

Proposed solution

Rename the “Message” class to “InterpretedMessage”

The current class named Message still includes dangerous functionality including allowing interpolation. That means inputs still need to be sanitized. (There are other vulnerabilities here, less than shell access, which are out of scope for this PR.)

Therefore the name Message is inappropriate, in the same way that naming a machine gun as “Cuddly Cat” is irresponsible.

This should be renamed to InterpretedMessage. This better indicates that the content may be interpreted. And a reasonable programmer at the Balmer Peak will recognize that using InterpretedMessage might require sanitizing the inputs.

To maintain backwards compatibility in the version 2 release series, the existing Message class can be a deprecated alias for the new InterpretedMessage.

Remove all usage of the lowercase, unqualified word “message” from parameter names and documentation

As per above, the word “message” is dangerously confusing. And for the kinds of people that don’t recognize the difference between a lower case “message” and a title case “Message” as referring to two totally different things, we can help them by using better words.

Instead, how about:

@param text the content to be directly logged, without interpretation

and

@param interpretedText the content to be interpreted via {@code InterpretedMessage} and logged

Do not refer to InterpretedMessage as a “string”

Currently, function documentation in Logger.java refers to the Message object: “the message string to be logged”. This is a dangerous way to refer to the Message object.

A Message object is not a String, nor a string.

Instead, use refer to an InterpretedMessage only using “InterpretedMessage”.

Specify which inputs are interpreted or not

Currently, the Logger.java specification includes these API signatures for the debug(...) functions. This list is exhaustive except for duplicative functions where the first parameter is Marker marker or where the trailing parameters are Object p### or where the trailing parameter is Throwable t:

void debug(CharSequence message);
void debug(Message msg);
void debug(MessageSupplier msgSupplier);
void debug(Object message);
void debug(String message, Object... params);
void debug(String message, Supplier<?>... paramSuppliers);
void debug(String message);
void debug(Supplier<?> msgSupplier);

In all the functions, only one of them, with documentation read by a VERY attentive and skilled programmer, is specified whether the input is interpreted or not. Can you figure out which one? I won’t tell you.

Now just for one example, let’s look specifically at the CharSequence variant documentation:

/**
* Logs a message CharSequence with the {@link Level#ERROR ERROR} level.
*
* @param marker the marker data specific to this log statement.
* @param message the message CharSequence to log.
*/
void error(Marker marker, CharSequence message);

There can be no definitive reading of this specification as to whether the CharSequence will be logged as-is, or whether it may be interpreted. This ambiguity is dangerous.

Severity

William Entriken / https://phor.net / 2021-12-15

It is my opinion that logging is an important part of retrospective analysis of security incidents. And therefore any underspecified behavior in a logging library (including the above) which is vulnerable to threat actor manipulation is automatically a high-impact vulnerability.

Timeline

Comments

There are no comments yet.

Please discuss this topic anywhere and let me know any great comments or media coverage I should link here.