For security, prevent Function0 execution during LazyList deserialization #10118

Merged
merged 1 commit into from Aug 31, 2022

@lrytz lrytz commented Aug 23, 2022

This PR ensures that LazyList deserialization will not execute an arbitrary Function0 when being passed a forged serialization stream.

Overview

The most important takeaway: deserializing untrusted data in a Scala (or Java) application is always a security risk. This recommendation does not change after this PR is merged.

A forged serialization stream can cause code unrelated to the expected data to run, which an attacker can exploit. The next section explains in detail the case that this PR addresses. However, any class on an application's classpath with similar code can be exploited in the same way. Therefore, untrusted data should never be deserialized.

A presentation about the topic: https://www.youtube.com/watch?v=MTfE2OgUlKc

Technical details

The following details interact:

  • LazyList has a var lazyState: () => LazyList.State[A] field, which is of type Function0 in bytecode
  • Serialization uses a "proxy" object with custom serialization / deserialization methods:
    • Evaluated elements of the lazy list are serialized first
    • Then the unevaluated tail of the lazy list is serialized using standard Java object serialization
  • The deserialization method invokes tail.prependedAll(init) to reconstruct the lazy list (init are the evaluated elements)
  • The prependedAll call may invoke the lazyState() function of the tail object; this does not happen under normal circumstances, but a forged serialization stream can force the control flow into this situation
  • The lazyState object is created when deserializing the tail object
    • On a typical application classpath, there are a lot of Function0 subclasses; for example, every argument to a by-name parameter is encoded into a Function0
    • A forged serialization stream can contain any Function0 instance for the lazyState field, for example one that interacts with the file system or the network
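
As a minimal sketch of the point about by-name arguments (the names `ThunkDemo`, `defer` and `log` are hypothetical and only used for illustration), the following shows that a by-name argument becomes a deferred function, and that the tail operand of `#::` is only run once the tail is forced:

```scala
import scala.collection.mutable.ListBuffer

object ThunkDemo {
  // Hypothetical log used to observe when deferred code actually runs.
  val log: ListBuffer[String] = ListBuffer.empty

  // A by-name parameter: the caller's argument is passed as a Function0.
  def defer[A](body: => A): () => A = () => body

  // Nothing is logged here; the block runs only when thunk() is called.
  val thunk: () => Int = defer { log += "thunk ran"; 42 }

  // The tail operand of #:: is by-name too, so this block does not run yet.
  val list: LazyList[Int] = 1 #:: { log += "tail ran"; LazyList.empty[Int] }
}
```

Calling `ThunkDemo.thunk()` or forcing `ThunkDemo.list.tail` is what triggers the deferred code. In the forged-stream scenario, the deferred function is chosen by whoever produced the stream rather than by the application.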

In this scenario, the invocation of the lazyState function happens before the deserialized LazyList is actually used (for example assigned to a local variable). So even if the deserialized LazyList later leads to a ClassCastException, the forged lazyState function has already been executed. In other words, a serialization stream containing a forged LazyList causes the Function0 to run wherever the application deserializes data, no matter what object type the application expects to read.
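
The ordering described here can be sketched with a benign stand-in. In this hypothetical example (the object name `CastAfterReadDemo` and its helpers are assumptions, not code from the PR), the application expects a String but the stream holds a Vector. `readObject` rebuilds the whole object first, and the type mismatch only surfaces afterwards at the cast:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import scala.util.Try

object CastAfterReadDemo {
  // Serialize any object to bytes using standard Java serialization.
  def toBytes(value: AnyRef): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(value) finally out.close()
    buffer.toByteArray
  }

  // readObject returns only after the object graph has been rebuilt,
  // including any custom readObject / readResolve logic of its classes.
  def read(bytes: Array[Byte]): AnyRef = {
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes))
    try in.readObject() finally in.close()
  }

  // The stream holds a Vector, fully reconstructed by the time read returns.
  val raw: AnyRef = read(toBytes(Vector(1, 2, 3)))

  // Only now is the expected type checked; this fails with ClassCastException.
  val asString: Try[String] = Try(raw.asInstanceOf[String])
}
```

Any side effect performed while the object is being rebuilt has therefore already happened by the time the ClassCastException is thrown.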

Change in this PR

In this PR, we change the LazyList deserialization method to ensure that this process cannot evaluate the state of the lazy tail.

The forged Function0 can still be executed later when evaluating the tail of the lazy list. However, this is only possible if the application actually expects a serialized LazyList and evaluates its tail. This makes an exploit much less likely.
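
The property the change relies on, combining already-evaluated elements with a tail without evaluating that tail, can be illustrated with the public `lazyAppendedAll` method mentioned in the review. This is only a sketch under that assumption, not the code merged in this PR; the object name `LazyAppendDemo` and the `suffixEvaluated` flag are hypothetical:

```scala
object LazyAppendDemo {
  // Hypothetical flag recording whether the suffix expression was evaluated.
  var suffixEvaluated = false

  // Stands in for the already-evaluated elements read from the stream.
  val init: List[Int] = List(1, 2)

  // lazyAppendedAll takes its suffix by name, so the block does not run here.
  val combined: LazyList[Int] =
    LazyList.from(init).lazyAppendedAll { suffixEvaluated = true; LazyList(3) }
}
```

Reading `LazyAppendDemo.combined.head` leaves the flag untouched; the suffix is only evaluated once the list is traversed past the initial elements, for example with `toList`.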

Credits

We would like to thank Marc Bohler for finding this issue and reporting it to us.

This issue is reported as CVE-2022-36944.

@lrytz lrytz added the library:collections PRs involving changes to the standard collection library label Aug 23, 2022
@lrytz lrytz added this to the 2.13.9 milestone Aug 23, 2022
@lrytz lrytz requested a review from NthPortal as a code owner August 23, 2022 07:52
@@ -1370,7 +1377,7 @@ object LazyList extends SeqFactory[LazyList] {
         case a => init += a.asInstanceOf[A]
       }
       val tail = in.readObject().asInstanceOf[LazyList[A]]
-      coll = init ++: tail
+      coll = tail.withNullLazyState(tail.prependedAll(init))

@NthPortal NthPortal Aug 23, 2022

why don't we use a Builder[LazyList] for init instead of an ArrayBuffer, and just call lazyAppendedAll on the result?

@NthPortal NthPortal Aug 23, 2022

in fact, because tail is also a LazyList, I believe we can use LazyBuilder and just call addAll with tail as an argument, and we don't even rebuild the LazyList ever (not that I think it has a huge performance impact)

@NthPortal NthPortal Aug 23, 2022

side note: there was a discussion at some point somewhere suggesting that no Builders should be lazy (don't remember where), and that LazyList.newBuilder should return an eager Builder (probably for List) that calls mapResult to create a LazyList. On the off-chance that ever happens, we should explicitly create a new LazyBuilder rather than calling LazyList.newBuilder if we go with the change I suggested

@NthPortal NthPortal Aug 23, 2022

to put it differently: evaluating tail there does not represent a fundamental flaw, shortcoming or gap in the design of LazyList, it represents a careless programming error likely due to copy-pasting

@lrytz lrytz (Member, Author)

Looking at LazyBuilder, I actually think we'd run into the same issue. If we change the code to

      val tail = in.readObject().asInstanceOf[LazyList[A]]
      coll = lazyBuilder.addAll(tail).result()

we have

    override def addAll(xs: IterableOnce[A]): this.type = {
      if (xs.knownSize != 0) {
        ...

  override def knownSize: Int = if (knownIsEmpty) 0 else -1
...
  private[this] def knownIsEmpty: Boolean = stateEvaluated && isEmpty
...
  override def isEmpty: Boolean = state eq State.Empty

So calling addAll could evaluate the state lazy val, which calls the lazyState function.
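
For contrast, a small sketch of the normal, non-forged case (the object name `KnownSizeDemo` and the `forced` flag are hypothetical): on an unevaluated LazyList, `knownSize` reports -1 without forcing anything, because `stateEvaluated` is still false.

```scala
object KnownSizeDemo {
  // Hypothetical flag recording whether the first element was computed.
  var forced = false

  // continually takes its element by name, so nothing is evaluated yet.
  val ll: LazyList[Int] = LazyList.continually { forced = true; 1 }

  // knownSize on an unevaluated LazyList does not evaluate its state.
  val sizeBefore: Int = ll.knownSize
}
```

The concern in this comment is the case where that assumption no longer holds for a tail object read from a forged stream.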

@lrytz lrytz (Member, Author)

Yeah, a forged serialization stream can have that field set to true

Contributor

point. I guess we fall back to lazyAppendedAll then

@NthPortal NthPortal Aug 25, 2022

@lrytz I whipped up my own fix that uses stateFromIteratorConcatSuffix. I was going to push an alternate PR to consider, but I haven't figured out how to actually test that the fix works. Been having issues deserializing the lambda.

If you'd like to use my code, the only changes are to the SerializationProxy (below)

  final class SerializationProxy[A](@transient protected var coll: LazyList[A]) extends Serializable {

    private[this] def writeObject(out: ObjectOutputStream): Unit = {
      out.defaultWriteObject()
      var these = coll
      while(these.knownNonEmpty) {
        out.writeObject(these.head)
        these = these.tail
      }
      out.writeObject(SerializeEnd)
      out.writeObject(these)
    }

    private[this] def readObject(in: ObjectInputStream): Unit = {
      in.defaultReadObject()
      val init = new ListBuffer[A]
      var initRead = false
      while (!initRead) in.readObject match {
        case SerializeEnd => initRead = true
        case a => init += a.asInstanceOf[A]
      }
      val tail = in.readObject().asInstanceOf[LazyList[A]]
      coll = newLL(stateFromIteratorConcatSuffix(init.toList.iterator)(tail.state))
    }

    private[this] def readResolve(): Any = coll
  }

and to import ListBuffer instead of ArrayBuffer.

Edit: I think the only change is to readObject?

@lrytz lrytz (Member, Author)

Thank you, this looks good, nicer than my workaround 👍

@lrytz lrytz (Member, Author)

Also added a test.

@NthPortal NthPortal left a comment

it feels weird to me to add a new method used just for this when we already have lazy methods. I think the main issue was using an ArrayBuffer (which is on me)

@SethTisue SethTisue added the prio:blocker release blocker (used only by core team, only near release time) label Aug 24, 2022
@lrytz lrytz added the release-notes worth highlighting in next release notes label Aug 25, 2022
@lrytz lrytz force-pushed the deser branch 3 times, most recently from a8aacb0 to a40c17a August 29, 2022 18:17
@lrytz lrytz requested a review from SethTisue August 30, 2022 07:17
This PR ensures that LazyList deserialization will not execute an
arbitrary Function0 when being passed a forged serialization stream.

See the PR description for a detailed explanation.
@lrytz lrytz enabled auto-merge August 31, 2022 06:41
@lrytz lrytz merged commit 3a12413 into scala:2.13.x Aug 31, 2022
@SethTisue SethTisue removed the prio:blocker release blocker (used only by core team, only near release time) label Sep 1, 2022
@SethTisue SethTisue changed the title from "Prevent Function0 execution during LazyList deserialization" to "For security, prevent Function0 execution during LazyList deserialization" Sep 1, 2022
@SethTisue SethTisue changed the title from "For security, prevent Function0 execution during LazyList deserialization" to "For security, prevent Function0 execution during LazyList deserialization" Sep 1, 2022

lrytz commented Sep 16, 2022

👍 thank you for the pointer!
