Since Linux 6.9, LUKS suspend stopped wiping disk-encryption keys from memory

196 points by IngoBlechschmid 2 hours ago | 98 comments

bbminner
I am far from a security expert, but from the number of "we missed a single line C check across files during refactoring" critical security bugs discovered on a regular basis these days, the whole premise of a "giant secure open source C codebase" seems questionable. It is not specific to C of course, but invariants are arguably even harder to enforce and track consistently (esp under changes to code) in C. Unsure if FP with invariants encoded in types is a practically feasible scalable solution either. Model checking? [LLM] fuzzing? Fewer primitives with clear boundaries? Is that how seLinux was "checked"?
- fsddfsdfssdf
  While I can see the shortcomings of C and generally don't recommend it for new projects I don't see this particular bug as a good example of something Rust's borrow checker or some other language's type system will catch. I don't think even static analyzers can catch this.
  It's basically something like this:
  original: DoTheThing()
  new: DoTheThingSlightlyDifferentButKeepMyCredentialsAlive()
  fix: DoTheThingSlightlyDifferentButDoInFactNOTKeepMyCredentialsAlive()
  In my experience a substantial portion of gnarly bugs come down to a violation of a high-level system invariant and those do not strike me as something that can be automated. Even with something like Lean you can prove your program satisfies certain properties but you need to have thought about those properties in the first place. The proof doesn't discover the invariant for you.
  If you had thought about the relevant security property, you could have written a regression test for it, which is not hard. IMO the really hard part isn't expressing the implementation safely; it's the realization that this was a property the implementation needed to preserve.
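  A minimal sketch of such a regression test, assuming you can already obtain a memory image after suspend as raw bytes (how to get that dump is out of scope here, and the helper names are made up for illustration). It only looks for the raw key bytes, so a key held in some transformed form would slip past it:

```python
def find_key(dump: bytes, key: bytes) -> int:
    """Return the offset of `key` inside `dump`, or -1 if it is absent."""
    return dump.find(key)


def assert_key_wiped(dump: bytes, key: bytes) -> None:
    """Raise AssertionError if the volume key is still present in the image."""
    offset = find_key(dump, key)
    if offset != -1:
        raise AssertionError(f"volume key still in memory at offset {offset:#x}")


# Toy stand-ins for real memory images (assumption: a real test would
# capture these from a VM after triggering suspend).
key = bytes(range(1, 65))                      # 64-byte dummy volume key
leaky_dump = bytes(4096) + key + bytes(4096)   # key left behind
clean_dump = bytes(8192)                       # key wiped

assert_key_wiped(clean_dump, key)              # passes silently
```

  Running `assert_key_wiped(leaky_dump, key)` raises, which is the failure you would want a CI job to report.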
- WhitneyLand
  The premise of a secure open codebase is fine.
  The problem is being more auditable does not automatically make it more audited.
  There have to be enough people with skill taking enough time to work on it.
  pixl97
  If you think open source is bad, wait till you see enterprise code. I'm talking full auth bypass due to the stupidest crap. You can do that in any language if you have fools working on the code base.
- moritzwarhier
  > The whole premise of a "giant secure open source C codebase" seems questionable
  Because code review is sometimes not much different from an idealized version of the halting problem, where you would have access to a formalized version of a specification.
  In other words, there is no strict definition of what is a security issue.
- russdill
  The lesson here is that if a feature does not (at a minimum) have an associated test case, it is not actually a feature.
- lazide
  In open source, someone (many, many someones) can at least check.
  Closed source…..
  Twirrim
  Not sure why you're getting downvoted, this is the entire point of open source.
  Does such a bug exist in Windows? OSX? Who checks? If someone finds the key in memory, can they tell what conditions might be causing it and where?
  Their only recourse under those situations is to hand it off to the OS Vendor and trust that what they implement does solve the problem, and trust that it wasn't a deliberate back-door that is now being replaced by another back-door.
  charcircuit
  Security researchers find security bugs in closed source operating systems all of the time.
  lazide
  Yup, it’s just harder to know for sure.
  pixl97
  Oh, and large companies quite often fix these horrific issues silently, especially in online services where the customer can't diff bins. We're talking auth bypasses and RCE's that you'll never know about.
kokada
While it is certainly an interesting bug, I kinda feel that the title is click bait? Because this `cryptsetup luksSuspend` from what I understood is not really officially supported but an extension done in Debian, so if anything this regression only affected Debian? I am not sure if you can blame the kernel for something that is not supported or even widely tested.
I still find this impressive, and it is nice that we now have a test (NixOSTests BTW are awesome, I agree with OP) to avoid this regression from coming back. But from the title it seems to be a widespread issue, not something that affects only one Distro.
- IngoBlechschmid
  Sorry, aimed for a technically precise title and didn't want to bait clicks.
  Yes, this does not affect people on stock configurations for the plain reason that they wouldn't expect the volume key to be safe during suspend anyway.
  Debian's solution was ported to several (most?) other distributions and I guess quite a few people maintained private ports.
  The thread-keyring(7) manpage promises: "A thread keyring is destroyed when the thread that refers to it terminates." For their key upload (from userspace to kernelspace) mechanism, the cryptsetup project relied on this property; but kernel 6.9 introduced a regression invalidating this property.
tombert
I don't think this bothers me.
The only reason that I do the disk encryption is so that I don't have to worry about people going through my laptop to steal tax documents and/or credit card stuff when I sell the laptop. I of course also wipe the laptop too, but I figure that if the data is encrypted at the drive level then there's very little risk of anyone being able to use some kind of forensics tool and recover data.
bitbasher
I don't see any other way? When you sleep (suspend to RAM), everything is stored in RAM and is encrypted but the master key is present in kernel memory (if I recall correctly).
However, if you hibernate (suspend to disk) the entire contents of RAM (including the master key) is written/encrypted to disk and the RAM is cleared.
When you wake the machine up you have to re-enter the passphrase to decrypt the master key to re-load disk contents back to memory.
- IngoBlechschmid
  Yes, if you simply suspend your laptop on most stock Linux distributions, then everything including the master key is still kept in memory. But Debian pioneered the (optional) cryptsetup-suspend addon. This issues a luksSuspend command which is supposed to wipe the key from memory, and on resume asks you to resupply your passphrase.
  Up to kernel 6.8, this worked as described; starting with kernel 6.9, it silently didn't.
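  For illustration, a hedged sketch of the two cryptsetup calls involved, for a hypothetical non-root mapping named cryptdata. It only builds and prints the commands (a dry run); as far as I understand, the real cryptsetup-suspend addon does considerably more, since it has to keep working while the root filesystem is frozen:

```python
import shlex


def luks_suspend_cmd(mapping: str) -> list[str]:
    # Freezes I/O on the mapping and is supposed to wipe the volume key
    # from kernel memory.
    return ["cryptsetup", "luksSuspend", mapping]


def luks_resume_cmd(mapping: str) -> list[str]:
    # Prompts for the passphrase again and reinstates the volume key.
    return ["cryptsetup", "luksResume", mapping]


# "cryptdata" is a hypothetical mapping name; substitute your own.
for cmd in (luks_suspend_cmd("cryptdata"), luks_resume_cmd("cryptdata")):
    print(shlex.join(cmd))
```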
  herywort
  So you would still be asked for a passphrase, even though it's already available?
  IngoBlechschmid
  Exactly. Cryptsetup wouldn't know about the extra copy of the volume key in kernel memory. Which is why, dramatically, it appeared secure ("surely I wouldn't be asked to resupply the passphrase if the volume key is still in memory, right?").
  pedrocr
  It was still more secure than the default if I understand this correctly. On resume from suspend the laptop would still be locked by the encryption key and without access to the disk even if you can somehow circumvent the lock. The only insecurity was that somewhere in the kernel memory the key still exists so if you can somehow extract that from the live system you can unlock it.
  IngoBlechschmid
  Yes, you are right: LUKS encryption protects your data at rest. An attacker who steals your disk gains only a little, like the information that you have used LUKS (unless you put your LUKS headers elsewhere, separated from the disk) and perhaps disk and disk sector usage statistics.
  naturalmovement
  FYI: VeraCrypt is not the defacto encryption software for Windows.
  IngoBlechschmid
  Oh, which one is it?
  (You don't mean BitLocker, right?)
  naturalmovement
  It absolutely is and they have most of the enterprise market.
  IngoBlechschmid
  Okay, yes, sure. It definitely is the most-used encryption software for Windows.
  But I would never trust it for a second, being proprietary and known for issues. You likely know that, but for the benefit of others:
  38C3 - Windows BitLocker: Screwed without a Screwdriver https://media.ccc.de/v/38c3-windows-bitlocker-screwed-withou... https://www.youtube.com/watch?v=5eNtT2p12cM
  noinsight
  If you’re at all serious about security and not user convenience, you deploy BitLocker with a PIN instead of TPM only. And then a whole class of vulnerabilities goes away.
  bri3d
  The issues you linked with BitLocker are obvious properties of BitLocker-with-SecureBoot-only architecture. If you configure Linux that way, you get similar issues (for example, it's pretty easy to mis-configure TPM sealed disk encryption on Linux to still allow a recovery shell, which will run with the disk unsealed).
  BitLocker with a password (the equivalent of the LUKS configuration in question) does not share these issues.
  saidnooneever
  veracrypt lost their drivers license so afaik you should avoid it since it cannot update its drivers any longer. didnt see any news about them reacquiring that license
  snailmailman
  Assuming this is what you are referring to, it was resolved within a few days. The incident being resolved just didn't make headlines. https://sourceforge.net/p/veracrypt/discussion/general/threa...
  nacs
  Reminder that by using Bitlocker, you're using a closed source encryption for which Microsoft will happily hand out your recovery key on request.
  https://www.forbes.com/sites/thomasbrewster/2026/01/22/micro...
  andrewpiroli
  Only if you store your key with Microsoft, which is not required or the default if you're using a local account which I assume most privacy sensitive people are.
  gruez
  Not to mention that unless the bitlocker activation flow changed recently, it specifically asks you how to store your backup keys, with a choice given between local options (e.g. usb drive, printing it off, etc.) and saving it to your microsoft account.
  briHass
  Bitlocker can use keys that are local only, but the default for home editions of Windows was to use the online account to back it up.
  'Happily' is also a stretch, as they really don't have a choice if served a valid court order.
  If you want encryption that is safe from the US government, keys need to be stored in your head. Anything physical is subject to court orders.
  john_strinlai
  for enterprises, where this doesn't really matter, bitlocker is great.
  dijit
  if by "great" you really mean "fine".
  It's still brittle, awkward and puzzlingly awful UX despite being the literal standard for the platform.
  Compare it to any of the actively maintained alternatives, Filevault for MacOS (which is wonderful and never sends your key to be kept somewhere else) or LUKS on Linux.. heck, even Veracrypt is actually easier to understand and more robust.
  john_strinlai
  >if by "great" you really mean "fine".
  no, i mean great.
  managing a fleet of 100+ laptops with bitlocker is a breeze. it's so seamless that the users don't even realize it's enabled (i.e. no UX issues, at all).
  on the other hand, i am not managing 100+ laptops that use veracrypt. sounds absolutely awful. i've never managed an apple fleet, so i can't speak to that, and will take your word on it.
  for personal use, i do not recommend bitlocker (or windows, really), but for already-windows enterprises? absolutely
  dijit
  Flicking a button to turn something on is not what I'm talking about, that's normally the easy part of any setup, and I judge people harshly who only take that aspect of something into consideration when discussing systems.
  Brittle is what happens when you haven't logged on to the machine in 60 days, trust with AD is broken, TPM has a glitch and wipes the in device key and forces you into recovery... or god forbid you service the laptop and now you have to enter recovery mode.
  Then you're in a nightmare: trying to give someone a super long passphrase over the phone is a not-too-uncommon occurrence.
  That's assuming you have a good policy for storing the recovery keys. Too loose and they're handed out to everyone, sort of defeating the purpose; too strict and you need the IT department (or specific members), and it's still predicated on the notion that you have a policy for it... Given that Admins are a dying breed... I don't think this is workable.
  If you compare with Filevault on MacOS: which tracks the credentials of the logged in user; there's no "issue" if the device loses trust because ultimately you always use the real unlock key: not something cached in a "secure storage".
  bri3d
  Having dealt with FileVault in this context, it's also frustrating; it's really common to have it fail to follow the logged-in user's credentials, and if you use any kind of federated login, you will frequently get users with FileVault passwords that are either ahead of or behind their system login password.
  I think both approaches are valid trade-offs and I think that the default Secure Boot BitLocker configuration, for all its architectural tradeoffs, can probably be credited for an enormous amount of data loss mitigation originating from used hard drives alone.
  john_strinlai
  maybe i am missing something, but how did veracrypt solve all of the admin and policy issues you’re bringing up? (specifically for large enterprise fleets)
  dijit
  If you use your key every day you tend not to forget it.
  If I as an admin give you your key: it is “leaked” effectively.
  john_strinlai
  >If you use your key every day you tend not to forget it.
  hoping users don’t forget their password is a very weak policy.
  specifically, the policy and admin points you brought up above, how does veracrypt solve them?
  dcrazy
  Have you never gone on vacation and forgotten your daily-use password upon return?
  akerl_
  Managing an Apple fleet is similarly fine, and that includes using any of the MDM tooling that also does key escrow on enterprise Filevault devices.
  dcrazy
  FileVault absolutely has an optional iCloud Keychain escrow. That’s how the “unlock with Apple Account” feature works. Apple doesn’t have the keys for iCloud Keychain, but it is still stored in iCloud.
  j16sdiz
  > Filevault for MacOS (which is wonderful and never sends your key to be kept somewhere else)
  Did you read the documentation?
  https://support.apple.com/guide/mac-help/protect-data-on-you...
  "iCloud account: Click “Allow my iCloud account to unlock my disk” if you already use iCloud. Click “Set up my iCloud account to reset my password” if you don’t already use iCloud."
  https://developer.apple.com/documentation/devicemanagement/f...
  "FileVault Full Disk Encryption (FDE) recovery keys are, by default, sent to Apple if the user requests them. Only one payload of this type is allowed per system."
  dijit
  Can.
  If you click "Allow my iCloud account to unlock my disk", your recovery key is escrowed to Apple, tied to your Apple Account.
  If you don't select that option it never does.
  I should have said "without your explicit permission", but I assumed we were all adults and understood that.
  The main point is that it's using your account password to unlock, the recovery key is for if you forget your account password.
  dcrazy
  No, you were just plain wrong. You said “never”, when in reality BitLocker and FileVault both have optional escrow.
  Arainach
  Veracrypt is more difficult to set up - whether on one machine or a fleet. Bitlocker is a few buttons in the UI, configurable via Group Policy, and so much more.
  What is brittle or awkward?
  dijit
  "PLEASE ENTER YOUR BITLOCKER RECOVERY KEY"
  Where is it?
  A) Uploaded to microsoft
  B) Somewhere in EntraID?
  C) Somewhere in our onprem AD?
  D) Written down on a scrap of paper when I set up the laptop
  the fact that they never ask for the passphrase is a weakness of the system. Because now you have an extremely difficult situation as soon as you're off the happy path.
  It's also like 64 characters alphanumeric with no capability to copy/paste.
  Compare it to Vera/Filevault where the access key is the users passphrase. In MacOS it's literally your account password, which follows along with your in-OS account credentials.
  philipallstar
  Does that mean it's not the de facto standard on Windows?
  naturalmovement
  So exactly like FileVault?
- dist-epoch
  Both Intel/AMD CPUs produced in the last 5 years or so support full transparent (to the OS) memory encryption. So cold boot attacks are a thing of the past if you enable this feature (it's typically disabled because it reduces RAM speed by about 0.5%).
johnathan101
This is one of those regressions that's easy to miss because everything still "works." Security bugs often don't announce themselves.
- IngoBlechschmid
  Right! Which is why integration tests for these kinds of features are all the more important.
  It was also fun to write, and enabled git-bisecting to isolate the specific kernel refactoring which introduced this bug: https://github.com/NixOS/nixpkgs/pull/532499
CodesInChaos
I don't have to re-enter my boot password after Sleep, so obviously the encryption key is still in memory.
- wrs
  Obviously your distro isn’t using cryptsetup-luksSuspend.
  unethical_ban
  Correct.
  The point being made is: If one isn't re-entering their passphrase after suspend, how are they surprised that the encryption keys are somewhere in memory during suspend?
  ksbd-pls-finish
  Because debian users with luks-suspend have to re-enter their boot password after sleep.
  weaksauce
  > The point being made is: If one isn't re-entering their passphrase after suspend, how are they surprised that the encryption keys are somewhere in memory during suspend?
  If that was the case for the people using the debian extra secure extension that should have wiped the memory clean then someone would have found this bug much earlier than two years. Their password was required to be re-entered even though the key was still in memory somewhere.
  akerl_
  The reason this bug is unexpected is that the user is expecting to have to enter their password (because they expect the key to be wiped on suspend), and then _they are_ asked for their password. But there was a copy of the key elsewhere in kernel memory that was never cleared.
  killerstorm
  Well, potentially a key might be stored in TPM. But I don't think that's better
fpoling
On my laptop with Fedora I just configured Linux to hibernate to disk after 15 minutes of suspend. Powering memory off ensures that bugs like this Debian-specific one would not matter.
Plus, what the Debian extension to the Linux tooling does is nice in theory, but in practice, if one really worries about cold-boot attacks, then all keys and important documents have to be wiped from memory, not only LUKS keys.
So hibernating is really the only proper way to protect against cold boot.
- IngoBlechschmid
  > So hibernating is really the only proper way to protect against cold boot.
  I agree; or resurrecting FridgeLock: https://www.sec.in.tum.de/i20/publications/fridgelock-preven...
- killerstorm
  Hmm, where does it get a key to decrypt memory on resume?
  AFAIK it's practical only if you make use of TPM. And if you do, you're basically at the mercy of the TPM.
  teravor
  > where does it get a key to decrypt memory on resume?
  you enter it...
quotemstr
It's because of vulnerabilities like this that I enable Intel's "total memory encryption" feature. No plaintext leaves the CPU package. DIMM swap attacks become useless. Moreover, it's basically free: the cryptography happens directly in the memory controller, in hardware, inline with the bus transactions the CPU is doing anyway.
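A rough way to check whether the CPU advertises this, assuming the kernel exposes Intel TME and AMD SME as the flags tme and sme in /proc/cpuinfo (flag names quoted from memory). A flag being present does not by itself prove that the firmware has the feature switched on:

```python
def memory_encryption_flags(cpuinfo_text: str) -> set[str]:
    """Return which of the assumed flags ("tme", "sme") appear in cpuinfo text."""
    wanted = {"tme", "sme"}
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags":
            # All cores normally list the same flags, so the first hit is enough.
            return wanted & set(value.split())
    return set()


# Shortened, made-up sample in the shape of /proc/cpuinfo.
SAMPLE = "processor\t: 0\nflags\t\t: fpu vme sse2 aes tme\n"
print(sorted(memory_encryption_flags(SAMPLE)))
```

On a real machine you would pass in the contents of /proc/cpuinfo instead of SAMPLE.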
teravor
on the subject of encryption keys and memory there is something you can do:
- if your CPU supports it, enable memory encryption.
- if your TPM module supports this look for MemoryOverwriteRequestControl & MemoryOverwriteRequestControlLock (/sys/firmware/efi/efivars/) and toggle them. make sure that your computer always reboots and never powers off. memory will always be wiped on boot.
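A small sketch for inspecting those variables before toggling anything, assuming the usual efivarfs layout where each file starts with a 4-byte little-endian attribute field followed by the payload. The function names are mine, and reading bit 0 of the payload as "clear memory on next boot" is my understanding of the TCG spec rather than something verified here. Reading usually needs root:

```python
import glob
import struct

EFIVARS_DIR = "/sys/firmware/efi/efivars"


def parse_efivar(raw: bytes) -> tuple[int, bytes]:
    """Split the contents of an efivarfs file into (attributes, payload)."""
    if len(raw) < 4:
        raise ValueError("efivarfs entries start with a 4-byte attribute field")
    (attributes,) = struct.unpack("<I", raw[:4])
    return attributes, raw[4:]


def read_mor_variables(directory: str = EFIVARS_DIR) -> dict[str, tuple[int, bytes]]:
    """Read every MemoryOverwriteRequestControl* variable found in `directory`."""
    found = {}
    # The glob also matches the ...ControlLock variable; the GUID suffix is
    # not hard-coded here on purpose.
    for path in sorted(glob.glob(f"{directory}/MemoryOverwriteRequestControl*")):
        with open(path, "rb") as handle:
            found[path.rsplit("/", 1)[-1]] = parse_efivar(handle.read())
    return found


# Example with a made-up raw value: attributes 0x7, payload byte 0x01.
attrs, payload = parse_efivar(b"\x07\x00\x00\x00\x01")
print(hex(attrs), payload.hex())
```

Writing the variables is deliberately left out: efivarfs files can be marked immutable, and a wrong write to a firmware variable is not something to sketch casually.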
- someothherguyy
  https://trustedcomputinggroup.org/wp-content/uploads/TCG-PC-...
deng
> Except that, for more than two years, the encryption key remained resident in memory across suspend, leaving it there for the taking by anyone who seized the still-powered laptop.
I don't get it. Obviously, the laptop is locked when it resumes, how is that key "for the taking by anyone"? I'm not saying it is impossible to read out RAM from a locked laptop, but surely not by "anyone".
- nicce
  Anyone with physical access. I think it is understandable from the phrase.
  There is a common misconception about how lock screens in general work: they usually just prevent someone from using the current hardware and software as-is to access the running OS. But the disk encryption is the main thing that prevents modification of, and other kinds of access to, the actual data. And if the disk encryption key is lying in memory, then the disk encryption is effectively bypassed if someone can access the machine physically, assuming that there are no sufficient tampering protections in place for that machine.
  acdha
  Anyone with physical access, significant tools, and experience. The FBI has people who can pull data out of memory after freezing the RAM, but the average laptop thief doesn’t, so how serious this is depends significantly on your threat model. If you’re not a major criminal, bitcoin whale, or intelligence target this is almost certainly academic.
  deng
  > If you’re not a major criminal, bitcoin whale, or intelligence target this is almost certainly academic.
  Thanks, that's what I thought.
  deng
  > Anyone with physical access. I think it is understandable from the phrase.
  Sorry, I'm probably dense, I still don't get it. You steal a laptop, you open it, the screen is locked with a password/fingerprint whatever. How do you read out the RAM from that laptop?
  IngoBlechschmid
  Several options. One is you restart and boot from a live system where you are root, and then dump all memory. This is described in the paper with the witty title "Lest We Remember: Cold Boot Attacks on Encryption Keys":
  https://www.usenix.org/legacy/event/sec08/tech/full_papers/h...
  Other options: DMA attacks. Also you never know what the Intel Management Engine hidden in your computer is doing. It's running a version of Minix you don't have any control over, and it has full access to memory.
  john_strinlai
  >How do you read out the RAM from that laptop?
  the term to look up is "cold boot attack" (https://en.wikipedia.org/wiki/Cold_boot_attack).
  tons of cool live demonstrations of how it works on youtube if you've got the 20-40 minutes to spare
  deng
  Still, this is a pretty crazy definition of "anyone".
- jakewins
  There are attacks that allow dumping RAM if the device is powered on though and you have physical access. Depending on config it may be very easy (just plug in a dumper over Thunderbolt on USB C and do direct memory access) or hard (freeze and swap physical RAM to an unlocked machine).. but the idea was defense-in-depth here; a well configured device should both be hard to dump RAM on and it should not give encryption keys if an attacker succeeds.
- saidnooneever
  you dump the physical memory, then decrypt the disk offline
panny
[flagged]
- palata
  And I don't use GUIs, but it doesn't mean I have to be a jerk to people who are happy when their GUI gets better :-).
  panny
  Suggesting I'm an exotic animal for being budget and environmentally friendly is being a jerk too.
- ekunazanu
  > That's a you problem. I shutdown my machine when I'm not using it.
  "We designed the antennas correctly, you're holding the phone the wrong way."
  adontz
  It's not a good analogy. Something is still on in suspend. It's good that you can control the Linux kernel, but what about all the other chips which may be an attack vector?
  1718627440
  Except shutting down and hibernate are two actions the user can literally select from the same menu.
- bwat49
  I shutdown mine too but only because suspend is still a crapshoot on linux
  jchw
  There will always be more suspend/resume bugs to work through. It varies a lot per device. I feel it's necessary to paint the picture for people who are curious what it means for it to be a crapshoot, so indulge me while I share my experiences.
  For work I have a ThinkPad T16 Gen 4 with the newer AMD gfx1151 iGPU. Works great. I have yet to witness any issues with suspend/resume. I suspect this is the case because it is running Ubuntu with Lenovo's own support package. Theoretically, from firmware to kernel, this is all tested and validated by Lenovo, like what certainly happens with every Windows laptop and all of the components that go into them.
  I also have a gen 1 Framework 16. I have seen it crash on suspend, but it is pretty rare, so I've just shrugged it off for now. It would be hard to debug, I don't see it every month despite using the thing every day.
  All of my desktops currently have perfectly reliable suspend resume, you can slam it all day and all night. The last time I ran into issues was a use-after-free issue in AMDGPU. Pretty alarming, although to be clear it never hit any LTS or vendor kernels that I am aware of. I hit it because I prefer to run the latest kernel on my personal machines.
  I have certainly owned laptops where suspend basically didn't work, or it would not stay suspended. I think this mainly went away when I started specifically picking laptops for Linux support.
  For Intel iGPUs and dGPUs, the track record has been flawless for me. I have a few of the new Battlemage cards that default to the xe kernel driver and those have been working very well as expected. So that's nice.
  I don't think this situation will be fixed until more hardware vendors are taking part in validating their stuff on desktop Linux and keeping track of the kernels. The current Linux model seems to be just dealing with whatever the vendors crap out for Windows, often full of weird ACPI behaviors and buggy firmware. That's not to say the fault doesn't often lie with code in the Linux kernel, but they do not seem to wish to be bug-compatible with Windows, and I think that is perfectly reasonable. So for problems that come from essentially broken firmware, it simply is going to need vendors to actually fix their shit.
  (And that includes AMD. The drivers are good in some regards, but it's hard to ignore AMD's stability issues even still. At this rate, more of the long outstanding AMD driver issues will get resolved by Claude than AMD engineers... Like with Panel Self Refresh on 7040 iGPU, apparently.)
- codedokode
  I am too lazy for that, and I hate that after boot you need to launch everything again.
  IngoBlechschmid
  Suspend to (encrypted) swap might be a good middle ground between you and grandparent. Suspend to memory will (at best) protect your LUKS volume key, but other sensitive data remains.
  A couple of years ago, three security researchers from the TU Munich implemented a prototype for also encrypting (most) parts of the memory just before suspend, to address this limitation; but as far as I know, it was not upstreamed or developed further: https://www.sec.in.tum.de/i20/publications/fridgelock-preven...
  1718627440
  You can usually change that in the settings of the Desktop environment.
  codedokode
  There is no universal support for restoring state between the apps. For example, Terminal won't run the scripts that were running, the browser will not automatically restore the pages etc, some apps might not launch or launch with wrong state.
  Gnome desktop environment cannot even remember the position and size of console windows, you are expecting too much.
naturalmovement
Definitely not a symptom of Linux being a hodgepodge of code thrown together from a thousand different sources and no one person could tell you how it all fits.
- cevn
  Bugs happen in all code. The difference is, anyone can fix stuff in open source. Closed source bugs are out of control and must be worked around. Usually by switching to OSS
- steve918
  I wonder if you think other OSes are any different?
  TempleOS is the only thing that comes to mind that doesn't fit your description and it's not practically useful.
  Any sufficiently large codebase is a mix of ideas and concepts implemented by different people with different priorities over a large timespan and if you can fit the entire thing in your head it's not very interesting or complex.
  IngoBlechschmid
  Qubes OS, the Linux distribution aspiring to offer a reasonably secure operating system, pioneering an "every app runs in a virtual machine" approach in the Linux laptop/desktop space, tracks this at the following issue:
  https://github.com/QubesOS/qubes-issues/issues/2890
  saidnooneever
  QubesOS is Xen based. Not Linux.
  naturalmovement
  The *BSDs, Mac, and Windows all keep critical code in the same tree as the OS.
  Something like disk encryption would be immediately visible.
  So you don't have this mess of 80 different distros with 60 different versions of systemd, 20 that don't use it, a million kernel versions and it's all thrown together in a Costco-sized trash bag and we call the output "Linux".
  yaris
  In my experience any software system (not just an operating system), after crossing a certain limit on complexity and age, looks exactly like a hodgepodge of code pieces thrown together, sometimes from different sources even if developed by one org. All major OSs have long crossed those limits, I believe.
  brainwad
  Windows for ages did not really keep all the code in one repo. There were like a dozen parallel repos for e.g. the shell, kernel, IE, etc. Also every feature was developed on team-level branches; integrating all those branches often caused unexpected bugs.
- stackghost
  Of course it's (indirectly) a symptom of that.
  What's the alternative? Proprietary closed-source operating systems owned by corps who can be compelled to insert covert backdoors?
  If BSD was as popular as Linux it would have the exact same problems.
- dist-epoch
  "Mythos, find me a bug in LUKS. I know there is one in there".