Conclusions from CVE-2024-3094 (libxz disaster)

Sat Mar 30 17:07:26 PDT 2024

Reordering since I want to respond to different bits in a different
order...

On Sat, Mar 30, 2024 at 03:30:49PM +0000, Daniel Golle wrote:
>
> Hiding a malicious change in a commit is infinitely harder than hiding
> it in a tarball.

Yet most of the exploit/payload found so far was in commits, disguised as
test cases.

On Sat, Mar 30, 2024 at 03:30:49PM +0000, Daniel Golle wrote:
>
> unchanged. Git has a lot of security built-in, and by using tarballs
> as a base for our package builds we are basically throwing all that
> away, for the sake of saving a negligible amount of resources on
> the build infrastructure.

I partly agree and partly disagree with this.  Having a cryptographic
hash at the center of everything makes the whole system about as secure
as that hash.  Alas, it also makes replacing that hash difficult.

The design is good, but SHA-1 is no longer appropriately secure.
Replacing SHA-1 is a work in progress, but until that completes SHA-1 is
still the core of *everything*.  I've been monitoring the situation;
early work started in 2017, but the replacement isn't usable yet.  Until
it is ready there is this rather oversized elephant in the room.

https://git-scm.com/docs/hash-function-transition

(SHA-1 collisions aren't known to have been used for anything /yet/,
but it is only a matter of time; this *really* worries me)
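
To make the "hash at the center of everything" point concrete, here is a
minimal Python sketch of how Git derives the ID of a file (a "blob").
The function name `git_blob_id` is mine, not anything from Git; the
"blob <size>\0" header is Git's object format as I understand it.  Every
tree, commit and tag ID is built on top of IDs like this one, which is
why swapping the hash function touches every object in a repository.

```python
import hashlib


def git_blob_id(content: bytes, algo: str = "sha1") -> str:
    """Return the object ID Git would assign to a blob with this content.

    `git_blob_id` is a hypothetical helper for illustration only.
    """
    # Git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob %d\x00" % len(content)
    return hashlib.new(algo, header + content).hexdigest()


if __name__ == "__main__":
    data = b"hello\n"
    # SHA-1 ID, as used by essentially all repositories today.
    print(git_blob_id(data))
    # Same content hashed for the SHA-256 object format (assumption: same
    # header layout, only the hash function differs).
    print(git_blob_id(data, "sha256"))
```

The SHA-1 value should match what `git hash-object` reports for the same
bytes.  Note the two IDs have nothing in common, so nothing that refers
to an object by its SHA-1 name survives the transition unchanged.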

On Sat, Mar 30, 2024 at 03:30:49PM +0000, Daniel Golle wrote:
> 
> However, after reading up about the details of this backdoored release
> tarball, I believe that the current tendency to use tarballs rather
> than (reproducible!) git checkouts is also problematic to begin with.
> 
> Stuff like 'make dist' seems like a weird relic nowadays, creates more
> problems than it could potentially solve, bandwidth is ubiquitous, and
> we already got our own tarball mirror of git checkouts done by the
> buildbots (see PKG_MIRROR_HASH). So why not **always** use that
> instead of potentially shady and hard to verify tarballs?

I don't think the issue is so much that tarballs are archaic, but that
*everyone* is using Git now.  One proposed patch from a pull request:

https://github.com/openwrt/openwrt/pull/14280/commits/1b29aadbbf07cb77498a0eb92fe7c171c65dab2e

I don't see a single reference to a version control system besides Git
anywhere in OpenWrt at this point.  Tarballs were a reasonable common
denominator when there were more than four version control systems in
use, but now Git itself is that common point.  So if everything is in
Git, how does handling tarballs help builds?

> Always using git checkouts instead of tarballs would also makes it
> much easier for maintainers to at least have a quick look at the
> changes made in an upstream project between versions (a quick scroll
> over  'git diff oldtag..newtag' or even just 'git log --stat
> oldtag..newtag' doesn't take much more time than manually validating a
> release tarball GPG signature in most cases, if there even is any...).

I see several issues with your argument, but I mostly agree with your
conclusion.  Git is *everywhere*, so why use tarballs?

I disagree with your approach though.  Git already has two tools for
handling this situation and I think one of them should be chosen.

The first is `git submodule`.  My understanding is that it is pretty
similar to OpenWrt's current approach.  The difference is that this lets
`git` handle downloading other repositories instead of doing it in a
Makefile.  Since Git is already designed to handle this sort of task, I
suspect this will be rather more reliable than the existing system.

The second is `git subtree`.  This is a tool for including other projects
in a repository.  The end result is that the other project's history is
merged into local history.  One advantage is that you download everything
all at *once*, rather than grabbing each project individually.  The other
is that having their full history makes upgrades easier, since
differences will be more obvious.

Either of these will need major changes to the build system.

On Sat, Mar 30, 2024 at 10:54:00PM +0100, Oldřich Jedlička wrote:
>
> so 30. 3. 2024 v 16:31 odesílatel Daniel Golle <daniel at makrotopia.org> napsal:
> > Hiding a malicious change in a commit is infinitely harder than hiding
> > it in a tarball.
>
> Just a note: The malicious code was part of the tarball because it was
> part of the main Git repository in the first place. Using Git would
> not help in any way in this particular case. Just check [1] together
> with findings [2].
>
> [1]: https://git.tukaani.org/?p=xz.git;a=shortlog
> [2]: https://boehs.org/node/everything-i-know-about-the-xz-backdoor

One of the information sources (haha, one can wonder about *any* source
of information):
https://gist.github.com/thesamesam/223949d5a074ebc3dce9ee78baad9e27

Under "Design":

>Normally upstream publishes release tarballs that are different than the
>automatically generated ones in GitHub. In these modified tarballs, a
>malicious version of build-to-host.m4 is included to execute a script
>during the build process.

So the malicious source code was part of all tarballs, but only the
tarballs with the modified `build-to-host.m4` would trigger the malicious
payload.

So obtaining GitHub's tarballs, which are generated directly from the Git
repository, *does* avoid the breach.
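
For anyone wanting to check a release tarball against a checkout of the
matching tag, below is a rough Python sketch.  The function name
`compare_tarball_to_tree` and the `strip` parameter are my own invention,
and this is only one way of doing it.  Keep in mind that autotools
release tarballs legitimately carry generated files (`configure` and
friends), so a file showing up here is a prompt to go look, not proof of
anything.

```python
import hashlib
import tarfile
from pathlib import Path


def compare_tarball_to_tree(tarball: str, tree: str, strip: int = 1) -> dict:
    """List files that exist only in the tarball or differ from the tree.

    Hypothetical helper for illustration.  `strip` drops leading path
    components, since release tarballs usually wrap everything in a
    top-level "name-version/" directory.
    """
    root = Path(tree)
    only_in_tarball, differs = [], []
    with tarfile.open(tarball) as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, symlinks, devices, ...
            parts = Path(member.name).parts[strip:]
            if not parts:
                continue  # nothing left after stripping the prefix
            rel = Path(*parts)
            local = root / rel
            if not local.is_file():
                only_in_tarball.append(rel.as_posix())
                continue
            # Compare content by hash; the data is read but never
            # extracted to disk or executed.
            packed = hashlib.sha256(tar.extractfile(member).read()).hexdigest()
            checked_out = hashlib.sha256(local.read_bytes()).hexdigest()
            if packed != checked_out:
                differs.append(rel.as_posix())
    return {"only_in_tarball": sorted(only_in_tarball),
            "differs": sorted(differs)}
```

Usage would be along the lines of
`compare_tarball_to_tree("pkg-1.0.tar.gz", "pkg-checkout")` with the
checkout sitting at the release tag; both names are placeholders.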

(as does avoiding systemd, as does not building RPMs or .debs, or using
an architecture other than amd64)

-- 
(\___(\___(\______          --=> 8-) EHM <=--          ______/)___/)___/)
 \BS (    |         ehem+sigmsg at m5p.com  PGP 87145445         |    )   /
  \_CS\   |  _____  -O #include <stddisclaimer.h> O-   _____  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445