Category: Labs

  • linux-hardened-unredacted: a hardened kernel for Debian

    Internet censorship and surveillance have gotten worse every year. Security matters more than ever as AI-driven vulnerability discovery gets better. The people we build for depend on our machines to reach the open Internet safely, and every one of those machines runs Linux.

    The kernel underneath is the part that arguably matters most. Our Tor exit relays, FreeSocks, our proxies, XMPP.is, our Matrix server – they all sit on top of it. Compromise the kernel and nothing above it is safe.

    So we built our own. linux-hardened-unredacted is a hardened kernel we build, sign, and publish as .deb packages for Debian 13 (trixie) on amd64, with Ubuntu support coming soon. It starts from upstream Linux, applies the linux-hardened patch set maintained by Levente Polyak (anthraxx), and adds our own config, signing, and an automated, reproducible build pipeline.

    Repository: git.unredacted.org/unredacted/linux-hardened-unredacted

    Why we built it

    We run a lot of servers – over 300 of them. Hundreds of terabytes a month move across our Tor exits and our censorship-evasion and secure-infrastructure services. Most of that runs on Debian, increasingly on hardware we own and rack ourselves.

    Those machines carry other people’s traffic. A kernel bug on one of them is about the worst case we can think of, which is why the kernel sat near the top of our list to harden. We built it for ourselves, ran it across nearly the whole fleet, then released it for anyone else on Debian.

    Finding kernel bugs used to mean a person reading the code, spotting the flaw, and writing the exploit. Less so now. Fuzzing and automated analysis turn up bugs faster every year, and they work for attackers exactly as well as they work for us – more serious bugs, less time between a flaw going public and someone pointing it at a real person.

    We respond on two fronts. We cut the attack surface so fewer bugs are reachable and the survivors are harder to use, and when a fix lands upstream we rebuild, re-sign, and ship as fast as we can.

    Where linux-hardened came from

    We didn’t write the hardening patches. linux-hardened was started by Daniel Micay – the developer behind GrapheneOS – as the mainline-Linux counterpart to that work. It builds on grsecurity/PaX and the kernel’s own Kernel Self-Protection Project, and some of its ideas have since reached mainline. The slab freelist hardening in every kernel today began as Micay’s patches before it landed upstream.

    These days Levente Polyak (anthraxx), an Arch Linux developer, maintains it. GrapheneOS archived its own copy and points to his as the project’s home; it’s what Arch ships, and it’s what we build on. Tracking a tree that plenty of other people already run beats maintaining a private fork only we ever look at. What we add on top is small enough to audit in an afternoon.

    What the hardened kernel changes

    This kernel makes the bugs that slip through harder to exploit, and rips out features that mostly serve as a way in. A kernel runs to millions of lines of code, most of which never executes on boxes like ours, and every piece we don’t need is one more door. We keep what we use, harden it, and drop the rest.

    It clears memory both when it hands it out and when it takes it back. Normally freed memory keeps whatever was in it – a password, a key, part of someone’s traffic – until something overwrites it. Ours zeroes it on the way out and again on the way back, so a bug that reads memory it shouldn’t finds zeroes, not leftovers. Same for the kernel’s own stack: when a syscall returns, the space it used is wiped, so stale arguments and pointers aren’t sitting there for the next call to leak.

    It treats memory errors as attacks. A big share of kernel exploits come down to one thing: a write past the end of a buffer, or a copy with the wrong length. This kernel checks the size of every copy between itself and your programs, randomizes where objects land on the heap, and wraps allocations in canaries, so an overflow trips a check instead of becoming an exploit. A lighter check runs the entire time the machine is up, sampling a fraction of allocations to catch use-after-free and out-of-bounds bugs in real traffic – cheap enough to leave on in production. And it loads itself at a random address on every boot, so an attacker can’t assume anything sits where it did last time.

    Nothing gets to modify the running kernel. Kernel memory is never writable and executable at once, so injected code has nowhere to run. Drivers won’t load unless they carry a signature we created, which keeps tampered modules out. The kernel runs in integrity lockdown from boot – on every machine, not just Secure Boot ones – closing the interfaces that could modify it at runtime, root included. On a Secure Boot machine, the signed boot chain goes further and stops anything tampering with the kernel before it even loads.
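
    The lockdown mode of a running kernel can be read from /sys/kernel/security/lockdown, which lists the available modes and puts the active one in square brackets. Below is a minimal Python sketch for parsing that line. The function name is hypothetical and chosen only for illustration.

```python
import re


def active_lockdown_mode(text: str) -> str:
    """Return the bracketed (active) mode from the lockdown file's contents."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError(f"no active mode found in {text!r}")
    return match.group(1)


# Example contents as the kernel prints them when integrity mode is active.
sample = "none [integrity] confidentiality\n"
print(active_lockdown_mode(sample))  # prints "integrity"
```

    On a real machine, pass the function the contents of the file, for example open("/sys/kernel/security/lockdown").read().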

    We also cut whole features such as direct access to physical memory (/dev/mem), live kernel memory dumps (/proc/kcore), loading a new kernel at runtime (kexec), hibernation (which writes an unverified image of the kernel to disk), and a pile of old, rarely-used syscalls. Heavy tracing tools – perf, BPF, ptrace – are limited to the administrator instead of every process.

    All of this comes with tradeoffs, and you should know them before installing:

    • Out-of-tree modules (NVIDIA, ZFS, VirtualBox) need to be rebuilt against our headers and signed. DKMS handles this with a per-machine key you enroll once.
    • Rootless Docker and Podman, the Chromium sandbox, Snap, and Flatpak all want unprivileged user namespaces, which we leave off. A single config line turns them back on.
    • kexec, kdump, and hibernation are gone. Losing hibernation is mostly a problem for laptops.

    Every workaround we know about lives in the incompatibilities doc. And none of this is permanent: the hardened kernel installs alongside your existing one, GRUB lists both, and if something breaks you reboot into the stock kernel and you’re back where you started.

    How it compares to a stock Debian kernel

    Debian’s kernel is already hardened. Out of the box it ships stack-protector-strong, KASLR, FORTIFY_SOURCE, hardened usercopy, W^X (write-xor-execute) memory, slab freelist hardening, and heap zeroing on allocation. Plain Debian is a good place to start. We keep all of it and push further:

    Protection | Stock Debian kernel | Our hardened kernel
    KASLR, stack protector, FORTIFY_SOURCE, hardened usercopy, W^X, slab freelist hardening | Yes | Yes
    linux-hardened patch set (slab canaries, freed-memory verification) | No | Yes
    Zero memory on allocation | Yes | Yes
    Zero memory on free | Off by default | On
    Reject unsigned kernel modules | Only under Secure Boot | Always
    Direct physical-memory access (/dev/mem) | Enabled (restricted) | Removed
    Load a new kernel at runtime (kexec) | Enabled | Removed
    Hibernation (suspend to disk) | Enabled | Removed
    Checkpoint/restore | Enabled | Removed
    Slab merging | On | Off

    “Reject unsigned modules: always” means a tampered driver won’t load on our kernel even on a machine without Secure Boot, where stock Debian would load it anyway. “Removed” means more than “disabled”: the code isn’t compiled into the kernel at all, so it can’t be turned back on at runtime.
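
    To see where a given kernel stands on the rows above, its build config can be audited. On Debian, the config of an installed kernel is at /boot/config-<version>. The Python sketch below is an illustration under assumptions: the mapping from table rows to upstream Kconfig symbols is our best reading, not a list taken from the project's overlay, and the function names are hypothetical.

```python
# Assumed mapping from the table rows to upstream Kconfig symbols.
# "y" means the option must be enabled; None means it must not be set.
EXPECTED = {
    "CONFIG_INIT_ON_ALLOC_DEFAULT_ON": "y",   # zero memory on allocation
    "CONFIG_INIT_ON_FREE_DEFAULT_ON": "y",    # zero memory on free
    "CONFIG_MODULE_SIG_FORCE": "y",           # reject unsigned modules
    "CONFIG_DEVMEM": None,                    # /dev/mem
    "CONFIG_KEXEC": None,                     # kexec
    "CONFIG_HIBERNATION": None,               # hibernation
    "CONFIG_CHECKPOINT_RESTORE": None,        # checkpoint/restore
    "CONFIG_SLAB_MERGE_DEFAULT": None,        # slab merging
}


def parse_config(text):
    """Parse kernel .config text into {symbol: value}; unset symbols are absent."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        # "# CONFIG_FOO is not set" lines start with "#" and are skipped.
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def audit(config_text, expected=EXPECTED):
    """Return (symbol, wanted, found) for every symbol that differs."""
    options = parse_config(config_text)
    return [
        (name, want, options.get(name))
        for name, want in expected.items()
        if options.get(name) != want
    ]


# A made-up fragment, not a real Debian config.
stock_like = """\
CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
# CONFIG_INIT_ON_FREE_DEFAULT_ON is not set
CONFIG_KEXEC=y
CONFIG_HIBERNATION=y
"""
for name, want, got in audit(stock_like):
    print(f"{name}: want {want}, got {got}")
```

    An empty result from audit() means every listed symbol matches the assumed expectation. It says nothing about protections that come from the patch set itself rather than from config options.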

    None of this makes Debian’s kernel a bad choice. It’s a sensible default that has to boot on every laptop, server, and obscure embedded board anyone might own. We’re making narrower choices. These are our machines, so dropping hibernation and kexec costs us nothing.

    How it works

    Most of the actual work went into the build process and automation.

    The config

    We don’t keep a finished .config in the repo. A checked-in config goes stale – it drifts from upstream and quietly sheds protections one kernel version at a time, and you don’t notice until you go looking for something else.

    Instead we build it fresh each time from two pieces. The base is anthraxx’s own config, the one that tracks the patch set; we pin it to an exact commit and hash and check its GPG signature before the build touches it. On top goes our overlay – around ninety lines pinning the hardening options we care about. A couple dozen of them diverge from anthraxx. Some go further: we refuse unsigned modules, run lockdown in integrity mode, and zero registers on function return, among others. The rest are either the same protection done differently or a deliberate trade-off, like dropping a plugin that would break reproducible builds. The remaining lines pin defenses we want kept – including ones anthraxx already sets – so a later change can’t silently drop one.

    Then the build checks that every line of our overlay actually made it into the final config. Miss one and the build fails inside two minutes, naming the option that didn’t land. That check has earned its keep more than once: it caught an upstream rename of a stack-hardening option, and a missing build dependency that had silently dropped a whole family of protections. A second check diffs each build against our last release and fails if an inherited protection went missing – so we pick up new hardening from anthraxx automatically and don’t lose old hardening by accident.
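
    The first check can be sketched in a few lines of Python. This is an illustration of the idea, not the code the pipeline runs: the function names are hypothetical, and we assume the overlay uses the kernel's usual .config syntax, including "# CONFIG_X is not set" for disabled options.

```python
import re

NOT_SET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Parse .config text into {symbol: value}, recording 'is not set' as 'n'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = NOT_SET.match(line)
        if match:
            options[match.group(1)] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def missing_overlay_options(overlay_text, final_text):
    """Return overlay symbols whose value did not land in the final config."""
    overlay = parse_config(overlay_text)
    final = parse_config(final_text)
    # A symbol that is absent from the final config counts as disabled ("n").
    return sorted(
        name for name, want in overlay.items() if final.get(name, "n") != want
    )


overlay = "CONFIG_INIT_STACK_ALL_ZERO=y\n# CONFIG_KEXEC is not set\n"
final = "CONFIG_INIT_STACK_ALL_ZERO=y\nCONFIG_KEXEC=y\n"
print(missing_overlay_options(overlay, final))  # ['CONFIG_KEXEC']
```

    A build script would exit non-zero when the returned list is non-empty and print the names, which is what lets the failure point at the exact option that didn't land.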

    The build

    Push a version tag and the pipeline takes over. It pulls the kernel source from kernel.org and checks the signature against the release keys, downloads the linux-hardened patch and checks it against anthraxx’s key, and stops if either fails. Everything is re-hashed before use, so nothing can be swapped between steps. Then it builds the config, runs the fidelity check above, and compiles.
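
    The re-hashing step amounts to comparing each input against a pinned digest right before it is used. Here is a minimal Python sketch, assuming SHA-256 pins. The function names and the example file name are hypothetical, and signature verification (the GPG step) is a separate check that this sketch does not cover.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_pinned(name: str, data: bytes, pinned: str) -> str:
    """Raise ValueError unless the bytes hash to the pinned SHA-256 digest."""
    actual = sha256_hex(data)
    if not hmac.compare_digest(actual, pinned.lower()):
        raise ValueError(f"{name}: expected {pinned}, got {actual}")
    return actual


# Stand-in bytes; a real pipeline would read the downloaded file.
patch_bytes = b"example patch bytes"
pin = sha256_hex(patch_bytes)
verify_pinned("linux-hardened.patch", patch_bytes, pin)  # passes silently
```

    Calling verify_pinned() again immediately before each step that consumes a file is what catches a swap between steps.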

    Out comes a kernel image around 230 MB, headers, an optional debug package (~1.3 GB), and small meta-packages – so you install one package name and apt carries you forward to new kernels over time. The pipeline signs the image for Secure Boot, signs the modules during the build, records everything in a manifest, and attaches it all to a draft release for a person to review. Timestamps come from the tag’s date and the build container is pinned by digest, so the result is reproducible.
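
    Deriving timestamps from the tag could look like the Python sketch below. It assumes the build is driven through the kernel's documented KBUILD_BUILD_TIMESTAMP, KBUILD_BUILD_USER, and KBUILD_BUILD_HOST variables plus SOURCE_DATE_EPOCH. Which variables the pipeline actually sets is not stated here, and the "build" placeholder values are ours.

```python
from datetime import datetime, timezone


def reproducible_env(tag_epoch: int) -> dict:
    """Build environment variables that fix timestamps to the tag's date."""
    stamp = datetime.fromtimestamp(tag_epoch, tz=timezone.utc)
    return {
        # Honoured by Debian packaging tools and many compilers.
        "SOURCE_DATE_EPOCH": str(tag_epoch),
        # Embedded in the kernel's version banner instead of "now".
        "KBUILD_BUILD_TIMESTAMP": stamp.strftime("%a %b %d %H:%M:%S UTC %Y"),
        # Fixed placeholders so the builder's identity does not leak in.
        "KBUILD_BUILD_USER": "build",
        "KBUILD_BUILD_HOST": "build",
    }


env = reproducible_env(0)
print(env["KBUILD_BUILD_TIMESTAMP"])  # Thu Jan 01 00:00:00 UTC 1970
```

    The tag's epoch can be obtained from git, for example with git log -1 --format=%ct <tag>.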

    Signing and Secure Boot

    On a Secure Boot machine you trust our kernel once, by enrolling our certificate through MOK at the next reboot. There’s no Microsoft-signed shim – we left it out on purpose.

    Microsoft’s program requires the signing key to live in a hardware security module (HSM), or an equivalent FIPS 140-2 device, with documented two-factor authorization. Ours doesn’t – it lives in our CI’s secret store and loads only for the build job. That’s what lets the pipeline sign on its own and turn an upstream fix into a signed kernel within hours. A key that needs a human at a hardware token for every signature can’t move that fast.

    The cost is real: both signing keys are online, so anyone who fully compromised our build infrastructure could sign a malicious kernel. Our security policy says exactly that. It’s the price of automated builds without an HSM, for now.

    Reproducible builds

    Because the key is online, we give you a way to check what we shipped. Every release carries a manifest: the exact build container, every input and its hash, the hash of each package before signing, the signing certificate’s fingerprint, and a full parts list of the build environment down to the compiler version. Take the manifest, rebuild from the same public inputs and toolchain, strip the signatures off both sides, and the bytes should match. No private key needed. We run that check every night.
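
    The comparison at the end of that process is a hash check against the manifest. Below is a minimal Python sketch, assuming a JSON manifest with a hypothetical "unsigned_sha256" field that maps package names to pre-signing hashes. The real manifest layout may differ, and stripping the signatures is assumed to have happened already.

```python
import hashlib
import json


def compare_to_manifest(manifest_json: str, rebuilt: dict) -> dict:
    """Compare rebuilt, signature-stripped packages against manifest hashes.

    rebuilt maps package name -> bytes. Returns {name: (expected, actual)}
    for every mismatch; actual is None if the package was not rebuilt.
    """
    expected = json.loads(manifest_json)["unsigned_sha256"]
    mismatches = {}
    for name, want in expected.items():
        data = rebuilt.get(name)
        got = hashlib.sha256(data).hexdigest() if data is not None else None
        if got != want:
            mismatches[name] = (want, got)
    return mismatches


# Stand-in data; real inputs are the release manifest and your own rebuild.
image = b"kernel image bytes"
manifest = json.dumps(
    {"unsigned_sha256": {"linux-image.deb": hashlib.sha256(image).hexdigest()}}
)
print(compare_to_manifest(manifest, {"linux-image.deb": image}))  # {}
```

    An empty dictionary means every package matched byte for byte. Anything else names the package that differs.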

    It’s also why we turn off one optional plugin that mixes randomness into the build: it would break a byte-for-byte match, and proving what we shipped matters more to us than the bit of entropy it adds.

    Responding to CVEs

    The pipeline watches upstream and handles the repetitive parts; the decisions stay with us. Every hour it checks anthraxx for a new release, and when one lands it opens a tracking issue with a best-effort list of the CVEs fixed since the version we’re on. It won’t start a build by itself – a person verifies the upstream signature first. Once a week it checks anthraxx’s config and opens a pull request if anything changed. The release is hand-done too: the build hands a person a draft to review and boot-test, and because the kernel is a root of trust for boot, a human signs off before it ships.

    For a serious CVE we aim to queue a build within 24 hours of the upstream fix and publish within 48 once the reproducibility check passes. That timeline is what the automation buys us – and donations are what pay the people who run it.

    Shipping it

    Packages go to our own Debian repository behind a CDN, which keeps the load off our infrastructure. The repository metadata is signed with a separate key you verify out of band before your first install.

    One caveat: CDNs usually log which IP pulled which kernel version, and when. The packages are signed and tamper-evident, but if your kernel version is something you’d rather not advertise, that download timing is still metadata. Pull updates over Tor or a VPN, or mirror the repository locally. For us it’s a fair trade for getting updates to the people who need them, ourselves included.

    Where does it go from here?

    There’s plenty left to do.

    • The toolchain isn’t fully pinned yet – the last step toward a build anyone can reproduce exactly.
    • It’s amd64-only on Debian today. Ubuntu, Debian 12, arm64, and more kernel flavors are next.
    • An HSM would eventually close the online-key gap and let us ship a Microsoft-signed shim, so Secure Boot users could skip the manual key enrollment. That one comes down to funding and time.

    We’ll keep writing about it as it develops, like we do with our other projects.

    We need your help

    Unredacted is a 501(c)(3) non-profit. We build free and open infrastructure that helps people evade censorship and protect their privacy. All of it runs on donations, and responding to a kernel CVE in hours takes real engineering time.

    If you find any of this useful, please consider making a donation. If you’re an organization that wants to collaborate, get in touch.

    And if you run Debian: install it, read the config policy, rebuild a release, and check your bytes against ours. Tell us if they don’t match. We appreciate every one of you testers.

    A huge and special thank you to our donors for making this work possible.

  • Unredacted Labs, experiments for anti-censorship & privacy-focused infrastructure

    Censorship is on the rise, and has been for a while. Censors are working hard every day to restrict access to crucial information that challenges their authoritarian regimes across the world. It’s important that we continue work to counter their efforts, and develop new and innovative ways to build out anti-censorship infrastructure which helps people evade unjust Internet restrictions.

    Since late last year, we’ve been experimenting with various ideas and projects to aid in the advancement of anti-censorship and privacy-focused infrastructure. Experimentation is a big part of what we do, and it’s important to continue with that experimentation. To house all of these experiments, we created the concept of Unredacted Labs. While most of the work was conducted in secret, we’ve dropped hints about our work via our community chats and social media channels in the past.

    It’s time that we show the public what is and has been happening inside the Unredacted Labs.

    Unredacted Labs Experiments:


    NoiseNet (AS401401)

    NoiseNet is an experimental anycast network designed by Unredacted that attempts to introduce noise (randomness) and an additional layer of encryption into parts of our network, making it more difficult for censors to monitor and perform traffic correlation attacks against parts of our infrastructure. NoiseNet also helps us distribute load across our network and mitigate DDoS attacks.

    NoiseNet operates on top of AS401401, and has edge routers distributed all over the world which can be seen below.

    Understanding the NoiseNet flow

    When a packet is destined to an anycasted NoiseNet prefix, it’ll generally reach the closest PoP to where the packet originated. Depending on the specific destination, the NoiseNet edge node will route the packet through an encrypted tunnel to the core router(s) it should be delivered to. The router(s) will then pass it to the server it’s destined for, where the encrypted tunnel terminates. This means that not even the core router(s) in our datacenters can inspect the true nature of inbound packets. Only the network before the edge node (the upstream), the edge node itself, and the server the packet is destined for can see what an inbound packet contains. This reduces the number of intermediary networks that can potentially snoop on our traffic, while also making it generally more difficult to do so.

    Once the server processes an inbound packet, it can generate a response which is sent directly out via our upstream provider(s) in the datacenter where the server lives.

    Breakdown of the flow of packets through NoiseNet

    • Packet gets sent to a NoiseNet prefix.
      • A NoiseNet PoP receives the packet.
        • The router forwards the packet to the proper core router via an encrypted tunnel.
          • A core router forwards the tunnel flows to a server, where the tunnel terminates.
            • The server generates a response, and sends the packet back out to the core router.
              • The core router forwards the packet directly via its upstream(s) back to the source.

    Visual examples of how packets can flow through NoiseNet

    User -> NoiseNet PoP -> encrypted tunnel -> core router(s) -> endpoint

    This design makes it harder for snoopers to snoop due to the following:

    1. It results in the ingress and egress network paths being asymmetrical, and makes correlating ingress and egress flows more difficult for a censor or would-be snooper.
    2. It encapsulates each packet destined to our core network PoPs, which reduces the number of intermediary networks that can observe and track flows going through NoiseNet.
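
    Both points can be made concrete with a toy model in Python. The hop names are illustrative labels based on the flow described above, and the function names are hypothetical. This is a sketch of the reasoning, not a description of how NoiseNet is implemented.

```python
# Inbound and outbound paths as ordered lists of hops (illustrative labels).
INGRESS = ["user", "upstream of edge PoP", "edge PoP", "core router", "server"]
EGRESS = ["server", "core router", "datacenter upstream", "user"]
TUNNEL_ENDPOINTS = ("edge PoP", "server")


def hops_seeing_inner_packet(path, tunnel):
    """Hops that can see the unencapsulated packet on the given path.

    Hops strictly between the tunnel endpoints only see the encrypted
    outer packet, so they are left out.
    """
    first, last = path.index(tunnel[0]), path.index(tunnel[1])
    return [hop for i, hop in enumerate(path) if not first < i < last]


def shared_intermediaries(ingress, egress):
    """Hops, other than the two endpoints, that sit on both directions."""
    return sorted(set(ingress[1:-1]) & set(egress[1:-1]))


print(hops_seeing_inner_packet(INGRESS, TUNNEL_ENDPOINTS))
# ['user', 'upstream of edge PoP', 'edge PoP', 'server']
print(shared_intermediaries(INGRESS, EGRESS))
# ['core router']
```

    In this model the only intermediary on both directions is the core router, and on the inbound side it only handles tunnel traffic.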

    There are also several ways that NoiseNet introduces additional noise and resiliency into our network.

    • Anycasting our prefixes (advertising them from all of our PoPs) creates variance in network paths when packets are en route to our edge network. The more upstreams and Internet Exchange points (IXPs) we’re connected to, the more diversity there is in the ways different networks can reach ours. The more paths to our edge network, the harder it is for snoopers to conduct dragnet surveillance.
    • No single point of failure. We have a large number of edge nodes, and additional core network PoPs can be created at will, creating more network path diversity and general resiliency.
    • Our distributed edge network allows us to more easily mitigate DDoS attacks, because the load is spread out across our entire network rather than a single point of ingress.

    Diversity is our strength

    The more upstreams we have, and the more Internet Exchange Points (IXPs) we’re on, the better. Many of our upstreams are tier 2 or tier 3 networks, and they advertise our prefixes over their own interconnections with other networks. We’re continually working on expanding NoiseNet by deploying new edge nodes around the world and connecting directly to more Internet Exchanges.

    Our upstream network providers
    Our Internet Exchange points

    How we utilize NoiseNet today

    At the time of writing, NoiseNet ingests and routes packets for all of our Tor exit relays. As we assess the stability and feasibility of NoiseNet, we’ll begin to host more of our services behind it. Until then, we may periodically disable NoiseNet by unicasting our IP prefixes for research & testing purposes.

    https://metrics.torproject.org/rs.html#search/as:AS401401

    The future of NoiseNet

    While the foundations of NoiseNet are in place, we have a lot of work to do.

    Below is a list of things we’re planning to develop over time.

    • Partner with more companies & organizations that can take us on as a downstream network. A special thank you to Triplebit, a nonprofit that has partnered with us and become one of our upstream networks.
    • Consider exchanging routes between our edge & core, so that packets can egress from our core back over the tunnels to edge nodes. This would keep traffic more “local” when packets can be routed over an IXP or otherwise kept close to their source.
    • The development of noisenet-randomizer, a program that utilizes Traffic Control (TC) and/or eBPF to introduce random delays in packet processing/routing on our edge or core network.
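
    The random-delay idea in the last item can be illustrated with a toy model in Python. It is not the design of noisenet-randomizer, which is still planned and would run in TC and/or eBPF. The uniform distribution, the function name, and the parameters are all assumptions made for this sketch.

```python
import random


def jittered_release_times(arrivals_ms, max_delay_ms, rng=None):
    """Add an independent random delay in [0, max_delay_ms] to each arrival.

    arrivals_ms is a list of packet arrival times in milliseconds. The
    result holds the time at which each packet would be forwarded.
    """
    rng = rng or random.Random()
    return [t + rng.uniform(0.0, max_delay_ms) for t in arrivals_ms]


# Three packets arriving 10 ms apart, each held for up to 5 ms.
arrivals = [0.0, 10.0, 20.0]
releases = jittered_release_times(arrivals, 5.0, random.Random(1))
for arrived, released in zip(arrivals, releases):
    print(f"arrived {arrived:.1f} ms, forwarded {released:.1f} ms")
```

    Independent delays blur the timing an observer would use for correlation. If the maximum delay exceeds the gap between packets, they can also reorder packets, which a real implementation would need to decide how to handle.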

    There are some glaring issues that we need to resolve before NoiseNet is considered stable as well.

    • Some of our upstreams appear to track L4 connections and drop packets for flows which they have not observed us initiating (an anti-spoofing DDoS protection mechanism). This conflicts with our current design, where outbound packets leave through a different upstream than the one inbound packets arrive on, and it can cause packet loss or blackholed inbound traffic in certain PoPs. We are working on a way to detect this behavior and disable prefix advertisement via the upstreams that exhibit it.
    • Network health is not currently measured, nor is it a factor in prefix advertisement status. We need to develop automations that observe automatic speed tests and packet loss metrics from edge <-> core and edge <-> Internet. These automations will be able to decide when to disable prefix advertisement, or use path prepending or BGP communities to shift traffic away from edge nodes, to improve network performance and reliability.
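
    The health-based decision in the last item could be sketched in Python as follows. This is a hypothetical illustration: the thresholds, names, and the three-way outcome are assumptions, not a description of automation that exists today.

```python
from dataclasses import dataclass


@dataclass
class EdgeHealth:
    loss_pct: float         # measured packet loss, in percent
    throughput_mbps: float  # result of the latest automatic speed test


def advertisement_action(
    health: EdgeHealth,
    *,
    withdraw_loss_pct: float = 5.0,   # assumed threshold
    prepend_loss_pct: float = 1.0,    # assumed threshold
    min_throughput_mbps: float = 100.0,  # assumed threshold
) -> str:
    """Decide what an edge node should do with its prefix advertisement."""
    if health.loss_pct >= withdraw_loss_pct:
        return "withdraw"  # stop advertising the prefix from this node
    if (
        health.loss_pct >= prepend_loss_pct
        or health.throughput_mbps < min_throughput_mbps
    ):
        return "prepend"   # make the path less attractive to shift traffic away
    return "advertise"


print(advertisement_action(EdgeHealth(loss_pct=0.2, throughput_mbps=900)))  # advertise
print(advertisement_action(EdgeHealth(loss_pct=2.0, throughput_mbps=900)))  # prepend
print(advertisement_action(EdgeHealth(loss_pct=8.0, throughput_mbps=900)))  # withdraw
```

    A real system would likely add hysteresis so that a node hovering around a threshold does not flap between states.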

    Interested in what we’re doing and want to help? Reach out! We’re always looking for students or professional volunteers to help us with our ambitious projects.

    GreenWare

    GreenWare started as a general proof-of-concept to operate Tor relays on top of power efficient hardware that is easy to scale. It’s also an initiative which helps us reduce our overall carbon footprint, and sets us on the right path to powering some of our services entirely by renewable energy in the future.

    In addition to reducing our carbon footprint, we attempt to offset the footprint of our services through Stripe Climate by automatically allocating 1% of the donation revenue we receive to it. As of writing, we’ve helped remove 184 kg of CO₂ from the atmosphere, which is roughly equivalent to the carbon output from driving 417 mi in a car. This simply isn’t enough, and we need to work to reduce that potential footprint as well.

    Last year, in our “year in review (2024)” blog post we showcased some of our experimentation with this idea using PoE powered Raspberry Pi 5s. Over several months, we validated that this would actually work – and it worked well.

    It progressed into something that we’re quite confident can be deployed on a large scale, and could be powered 100% by renewable energy in the future with the proper battery storage.

    There were some quirks with the Raspberry Pi 5s, though. We found the PoE HATs to be generally buggy and unreliable. The exposed boards were easy to bump into, which could cause a Pi to reboot. The density also wasn’t high enough for what we truly had in mind.

    For quite a long time, we’d been tracking the development of the ComputeBlade – modular CM4/5 blade servers with the potential for very high density. They’re so dense that you can fit 20 blades in a single 1U chassis. We recently got our hands on a kit and began tinkering with them right away.

    Deployment time

    We ended up deploying all 20 blades in our datacenter and have migrated the majority of our Tor exit relays to them. They’re currently running great. The chassis has several design flaws, but we’re confident that power efficient hardware like this can help reduce the carbon footprint of the Tor network and more of our services in the future.

    All together, they only use a little over 100W of power.
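
    As a rough worked example in Python, treating 100 W as a lower bound (the figure above is "a little over" that) and assuming the load is constant all year:

```python
TOTAL_WATTS = 100          # lower bound taken from the figure above
BLADES = 20
HOURS_PER_YEAR = 24 * 365  # assumes a constant, year-round load

watts_per_blade = TOTAL_WATTS / BLADES
kwh_per_year = TOTAL_WATTS * HOURS_PER_YEAR / 1000

print(watts_per_blade)  # 5.0
print(kwh_per_year)     # 876.0
```

    That works out to roughly 5 W per blade, and a little under 900 kWh per year for the whole chassis.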

    Purple cables to represent the Tor network
    Purple RGB!
    PoE!

    The future of GreenWare

    As time goes on, we’ll be exploring new ways to experiment with power efficient hardware. We’re even thinking about ways that we can operate micro-PoPs in cheaper, less important locations where we can harvest the energy of the sun and use it to power our hardware.

    We’re also exploring all-in-one carbon tracking solutions such as the “Compute Energy & Emissions Monitoring Stack” (CEEMS) to track our carbon footprint, and find ways to lower it or offset it across the board.

    Have questions or want to deploy your own power efficient hardware like we did? Join any one of our community chats!

    Parting thoughts

    Now more than ever, it’s time to fight against Internet censorship and dragnet surveillance! Our hope is that the experiments we create inspire others to build more tech in the anti-censorship & privacy space. We can’t do it alone, nor could we do it without our generous donors. In the years to come, we’ll continue building and updating innovative experiments to help create the most impact.

    If you like what we do, and want to help us fund our mission – consider making a donation.

    If you’re an organization or company, and want to collaborate – contact us.

    A huge and special thank you to the Human Rights Foundation, our Supporters and various donors for making this work possible!
