❌

Lees weergave

Dirk Eddelbuettel: nanotime 0.3.15 on CRAN: Coping

Another very minor update, now at 0.3.15, for our nanotime package is now on CRAN, and has been built for r2u and Debian. nanotime relies on the RcppCCTZ package (as well as the RcppDate package for additional C++ operations) and offers efficient high(er) resolution time parsing and formatting up to nanosecond resolution, using the bit64 package for the actual integer64 arithmetic. Initially implemented using the S3 system, it has benefitted greatly from a rigorous refactoring by Leonardo who not only rejigged nanotime internals in S4 but also added new S4 types for periods, intervals and durations.

This release adjusts the package for the maybe overly hasty switch R 4.6.0 has undertaken with respect to using C++20 as a default C++ compilation standard. I am of course largely in favour of such a switch to more modern C++. But I am also cognizant of the fact that not all compilers and machines are ready. And just as I have already seen one other package fail to compile on a particular CRAN system (!!) under C++20, this package all of a sudden, and only on that same system, started to throw two (harmless) compiler warnings. We could call these erroneous as newer versions of the same compiler do not throw them but it does not matter. The decision to default to C++20 has been made, and now we live with it. But maybe some hardware platforms should be moved behind the barn. Either way, this release both adds an explicit cast to two lines that may not really need it (but this will not hurt) and also dials the compilation standard down to C++17 on one particular platform. So once again there are no user-facing changes, or behavioural changes or enhancements, in this release.

The NEWS snippet below has the fuller details.

Changes in version 0.3.15 (2026-05-21)

  • Add extra const_cast as one CRAN machine with more ancient setup whines otherwise and is obviously less C++20 ready than it thinks

  • tools/configure also checks where this is being built and ’as needed' downgrades the compilation to C++17

Thanks to my CRANberries, there is a diffstat report for this release. More details and examples are at the nanotime page; code, issue tickets etc at the GitHub repository – and all documentation is provided at the nanotime documentation site.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can now sponsor me at GitHub. You can also sponsor my Tour de Shore 2026 ride in support of the Maywood Fine Arts Center.

  •  

Tianon Gravi: Containers Are a Security Boundary (some assembly required)

I've heard "containers are not a security boundary" enough times that it's started to feel like received wisdom, and my honest read (after 13+ years) is that it's technically defensible but practically sloppy – and the sloppiness matters.

The part that's true: containers share a kernel, and a kernel exploit crosses the container boundary where a VM would not. That difference is real and non-trivial, and the CVE history backs it up – CVE-2019-5736, CVE-2022-0492, and CVE-2024-21626 all happened in "correctly configured" production containers.

The part I'd push back on is that the comparison point is almost never stated. "Containers aren't a security boundary" is being used as shorthand for "containers aren't a VM boundary" – but the conclusion people seem to draw from that is "therefore don't bother", which doesn't actually follow. The more honest version is that default Docker doesn't provide strong isolation between mutually untrusting parties, but a hardened configuration does.

What ships by default in Moby is actually a pretty reasonable foundation: seccomp is enabled (with a builtin profile blocking ~50 syscalls – credit where it's due: this is mostly @jessfraz's work; she even ran contained.af as a public CTF for years daring people to escape a container under her seccomp profile, and to my knowledge it was never claimed), AppArmor is enabled (the docker-default profile), and several sensitive /proc paths are masked. What's not on by default: no-new-privileges (setuid binaries inside can escalate), CAP_NET_RAW is still granted to every container (even though the kernel has supported unprivileged ICMP sockets for over a decade, meaning most modern distributions no longer need CAP_NET_RAW for ping), and user namespace remapping – though user namespaces aren't quite the silver bullet they might sound like; Debian left them disabled by default for years because the kernel attack surface they exposed hadn't been hardened against unprivileged callers.

The boundary isn't absent – it doesn't come completely pre-assembled. With VMs, the hypervisor is there whether you asked for it or not; with containers, assembling the boundary is left as an exercise for the operator. That's a much more solvable problem than "the technology is incapable", but it does mean the work falls to whoever's running the containers.

So, some things you can do today without waiting for defaults to change:

--user (or USER in your Dockerfile) is worth calling out specifically, because I think it's arguably stronger than user namespace remapping in one important way – and partly for the same reason Debian was hesitant about user namespaces in the first place. User namespace remapping protects the host from a root-in-container escape: if you do escape, you land as an unprivileged user on the host. But you were still root inside the container the whole time. Running as a non-root user means you were never root anywhere. The blast radius of a compromised process is limited whether or not it escapes, including for things like reading secrets, modifying container contents, or lateral movement within the container itself. Most application containers have no legitimate reason to be root.

Beyond that, a short list of things that are easy to enable and hard to justify leaving off:

  • --security-opt no-new-privileges – prevents setuid binaries from escalating; can also be set daemon-wide in daemon.json with "no-new-privileges": true
  • --read-only – a read-only root filesystem means a compromised process can't easily persist tooling or modify the container (pair with a writable tmpfs mount for /tmp etc as needed)
  • --cap-drop NET_RAW – or --cap-drop ALL and add back only what you actually need; CAP_NET_RAW is almost never legitimately needed by application containers
  • never --privileged – if something seems to require it, the right answer is almost always a more targeted capability grant or bind mount, not the nuclear option
docker run \
  --user 1234:5678 \
  --security-opt no-new-privileges \
  --read-only \
  --tmpfs /tmp \
  --cap-drop ALL \
  acme/untrusted-workload:latest

None of these require a daemon restart or infrastructure changes, and stacked together they go a long way toward actually building the boundary that the defaults leave unbuilt.

(this post was written with the assistance of "claude my eyes right out" but all thoughts and understanding are Tianon's)

  •  
❌