Meltdown fix committed to OpenBSD

bcantrill · on Feb 22, 2018

Great to see this, and congratulations to OpenBSD on landing KPTI! From the illumos/SmartOS perspective, our collaboration with OpenBSD and DragonFly engineers has been invaluable[1]; working together, all projects made quicker progress. Indeed, the cross-project collaboration has been a silver lining to the very dark cloud of Intel's selective (and irresponsible) disclosure.

[1] https://twitter.com/arekinath/status/951321167708606464

twunde · on Feb 22, 2018

Congratulations to the OpenBSD team! How crazy is it that the BSDs were not given notice about the Meltdown/Spectre bugs? Especially considering the strong security focus of several of these variants

cthalupa · on Feb 22, 2018

OpenBSD has a reputation for not respecting embargoes, even as recently as the KRACK wifi thing last year.

If you have a history of breaking embargoes, you're going to stop getting included in them.

Edit: Changed to reflect the explicit nature, as it seems the answer is not all BSDs ;)

bcantrill · on Feb 22, 2018

Please stop defending (if implicitly) Intel's irresponsible disclosure. We at Joyent are (1) a public cloud, (2) have our own operating system (SmartOS), (3) have always respected embargoes and (4) have a close relationship with Intel (!!) -- and we weren't notified either. OpenBSD was not excluded -- rather they (and we and every other system that isn't Windows, MacOS and Linux) were not included. What Intel did here was hugely irresponsible, leaving many in a vulnerable position for an extended period of time; there is nothing that OpenBSD (or anyone else) did to deserve such shabby treatment.

runlevel1 · on Feb 22, 2018

They dropped the ball all around.

Even with 6 months lead time, they were startlingly unprepared in all but their works-as-designed press release.

mbakke · on Feb 23, 2018

I think HN user pinewurst nailed it back in 2016: "[Intel] are basically a PR firm with a good fab in the basement".

https://news.ycombinator.com/item?id=11820078

Their handling of recent events has really cemented this view in my mind.

raverbashing · on Feb 23, 2018

Paraphrasing Linus, "talk is cheap, show me the silicon"

While it's fair to criticize them, sometimes it comes across as "those hardware bozos don't know what they're doing", and hardware design at the speeds and with the features Intel delivers is hard (just look at the number of competitors).

jrockway · on Feb 23, 2018

I think that was a typo, though, and he meant to refer to IBM.

You have to remember that IBM's PR is stuff like "Watson is sitting here curing cancer right now while his best friend Deep Blue beats people at chess!!11!" Intel doesn't do any of that.

mbakke · on Feb 23, 2018

No, it really was in reference to Intel (the thread was about Intel obtaining and developing a dinosaur-era distributed file system called Lustre).

bayindirh · on Feb 23, 2018

Lustre may be old, but it's hardly useless. Some of its capabilities cannot be replaced by any modern storage system, object and file alike.

Ceph comes close, but cannot completely replace it.

When you connect a thousand nodes to a single storage pool with high throughput and low latency requirements, not everything can cut it.

gcb0 · on Feb 23, 2018

and that was sadly the right thing to do. why? because they will have zero consequences. not even the inside (pun intended) trading thing will stick.

ksec · on Feb 23, 2018

Well, in a perfect world, a lot of these users would revolt and at least buy more AMD EPYC, which is very price competitive. But Intel is going to do what they do best: give you lots of discounts on upcoming chip orders to keep you happy. And in business it is really hard to say no to that.

Then there are other segments, including gamers and casual users, who really believe in Intel's brand and couldn't care less about what Intel did.

Basically, we don't see any damage being done to Intel. And once their PR spins how good their 10nm is and how Intel is the underdog against Qualcomm in the modem race, everyone will have forgotten about Meltdown and Spectre.

cthalupa · on Feb 22, 2018

This is fair. If it wasn't a matter of conscious exclusion based on past actions, then I would agree it's poorly handled, and for FreeBSD, Illumos, et al., I can't disagree with you.

benchaney · on Feb 23, 2018

This defense of Intel is made even more ridiculous by the fact that they chose to exclude the BSDs before the KRACK controversy happened.

binthere · on Feb 23, 2018

Not defending Intel, just trying to understand their position other than just being negligent. Wouldn't it undermine most users if the security flaw had leaked before the major OSes fixed it? The more you share, the easier it is for a leak to happen. So could they have decided to share only with the major OSes first and then later, once fixes had been implemented, expand this to other interested parties?

cperciva · on Feb 22, 2018

> The BSDs have a reputation for not respecting embargoes

I can't speak for other BSDs, but I'm not aware of any time in the past 15 years when FreeBSD has failed to respect an embargo.

cthalupa · on Feb 22, 2018

Thanks. I've updated my post to reflect this

tephra · on Feb 22, 2018

Now I don't have all the information at my fingertips, but IIRC with KRACK it was OpenBSD (not all the BSDs) that committed the fix early, and they _did_ have permission to do this, but the security researcher did regret giving it to them.

I think (but I'm not really sure, someone please correct me) that OpenBSD has a policy of not signing NDAs, but I think FreeBSD devs are willing. Why they, NetBSD, or illumos (although I have no information on how they handle these kinds of disclosures) weren't alerted is beyond me.

DCKing · on Feb 22, 2018

I've heard other variations of this story that stated that OpenBSD simply had the fix ready first, asked permission to commit it (without fanfare), and got it. As far as I remember from the discoverer's CCC talk, OpenBSD got the fix first because he discovered the vulnerability first in OpenBSD's WPA implementation.

It's not that OpenBSD did not want to cooperate, it's that there was some miscommunication that ended up not fitting in the coordinated disclosure.

akerro · on Feb 23, 2018

>I've heard other variations of this story that stated that OpenBSD simply had the fix ready first

Why don't you go to the source?

>Note that I wrote and included a suggested diff for OpenBSD already, and that at the time the tentative disclosure deadline was around the end of August. As a compromise, I allowed them to silently patch the vulnerability.

https://www.krackattacks.com/#openbsd

akerro · on Feb 23, 2018

> OpenBSD has a reputation for not respecting embargoes, even as recently as the KRACK wifi thing last year.

Stop spreading this FUD. OpenBSD developers were allowed to make the patch silently; they contacted the KRACK researcher and he agreed to this. HE CHANGED HIS MIND LATER, after the patch went in, and he started blaming OBSD and spreading bullshit.

>Note that I wrote and included a suggested diff for OpenBSD already, and that at the time the tentative disclosure deadline was around the end of August. As a compromise, I allowed them to silently patch the vulnerability.

https://www.krackattacks.com/#openbsd

cthalupa · on Feb 23, 2018

There's a whole other thread covering this.

He asked them to respect the extension. They pushed back against it. Rather than making a big fight about it, he shrugged and went "Well, fine, whatever"

It was obviously not an amenable thing to him, because he's specifically not even going to give them the same level of warning with the understanding they must respect the full embargo period, because right after the bit you quoted, he says:

>To avoid this problem in the future, OpenBSD will now receive vulnerability notifications closer to the end of an embargo.

If it was just him going "Yeah dudes go for it it's cool everyone else will have to wait out the extended embargo but you can just release the patch immediately!" he wouldn't be punishing OpenBSD for it.

brynet · on Feb 22, 2018

You know, it's really tiring that people keep up with the same old shtick every time the topic of embargoes and OpenBSD come up.

OpenBSD followed [0] the original embargo from the KRACK researcher, and he gave OpenBSD the go-ahead to commit it. Only later, when he got CERT involved, did they want to delay it for more months. He changed his mind retroactively. It's even stated in the original FAQ for that particular security issue.

Enough already? This has nothing to do with Intel's very selective disclosure of Meltdown.

[0] https://lobste.rs/s/dwzplh/krack_attacks_breaking_wpa2#c_pbh...

cthalupa · on Feb 22, 2018

OpenBSD followed the original embargo, and got asked to extend it based on the CERT response. You guys said no, and the security researcher shrugged and went 'Well, okay, I guess.' When everyone else is agreeing to the extended embargo, I personally can't say that I feel like OpenBSD respected the embargo there.

It's also not limited to just KRACK. With the OpenSSL/LibreSSL stuff back in 2015, OpenBSD wasn't part of the disclosure because Theo said he wasn't going to deal with an email-list or embargoes prior to that. Marc Espie has said he thinks it's a good thing that the KRACK embargo was broken early.

brynet · on Feb 22, 2018

That's not how it happened, that's the narrative you're spinning, certainly.

https://twitter.com/vanhoefm/status/921425715903463425

cthalupa · on Feb 22, 2018

Please explain how that is not what happened? That tweet does not contradict me in the slightest.

In fact, your original link is quite explicit.

>Then he got CERT (and, thus, US gov agencies) involved and had to extend the embargo even further until today. At that point we already had the ball rolling and decided to stick to the original agreement with him, and he gave us an agreeing nod towards that as well.

So, stsp pretty specifically says "It got extended, we decided not to follow the new date, and he went 'well, okay'"

https://www.krackattacks.com/#openbsd

>To avoid this problem in the future, OpenBSD will now receive vulnerability notifications closer to the end of an embargo.

If OpenBSD did nothing wrong, it seems quite odd that the same person who you are saying was fully on board with it is now explicitly saying he is going to provide late notifications.

Mordak · on Feb 22, 2018

Wow, that's some seriously selective quoting from the krackattacks link there. The one that is relevant to the question of whether or not OpenBSD patched early without permission is this:

> As a compromise, I allowed them to silently patch the vulnerability.

It's pretty easy to argue that OpenBSD doesn't like embargoes, but it's pretty hard to argue that they ignore embargoes and patch whenever they want to. In this specific case, the project asked and received permission to patch early. The fact that the researcher regrets this in hindsight is beside the point.

cthalupa · on Feb 23, 2018

They were asked to respect the extended embargo from CERT. They argued against doing so. The researcher going "Well, fine, we had previously agreed on this date, so do it" after trying to get them to respect the extension is not the same as the researcher going "yeah man just release it now it's all good."

He regrets coming to that compromise when they pushed back. They still pushed back and did not want to follow the extension.

brynet · on Feb 22, 2018

OpenBSD had no idea who the other insiders were, only that it received a patch from the researcher, and the go-ahead to fix an important security problem. Suggesting that the correct response was to sit on the patch for months leaving users exposed to a security bug is unreasonable to place on any open source project.

You should read the rest of what you linked, and check the dates carefully.

cthalupa · on Feb 23, 2018

Yes or no: Was OpenBSD asked by the researcher to respect the CERT requested embargo extension?

tedunangst · on Feb 23, 2018

Denying a request you haven't agreed to isn't breaking an agreement.

cthalupa · on Feb 23, 2018

It's still irresponsible in the face of everyone else agreeing to it. The researcher informed them in good faith, they did not return it when the situation changed.

Fnoord · on Feb 24, 2018

Hmm, not sure what exactly happened with [1] (remotely exploitable vulnerability in OpenSSH in 2006). Did the OpenSSH team follow responsible disclosure there?

[1] https://www.openssh.com/txt/preauth.adv

ams6110 · on Feb 22, 2018

Kernel code is not my area. Is this substantially different from how Linux is addressing the issue?

hansendc · on Feb 22, 2018

I work on the Linux KPTI/KAISER code (and work at Intel).

The high-level description, along with the set of hardware things that get mapped into both kernel/app page tables, is basically identical. Creating that list of things was painful, at least for me while I was getting KAISER working. If I were in their position I definitely would have leveraged the list of things that we found in the Linux work. I hope they were able to do that.

Some minor differences in the implementations: In Linux, we allocate the two top-level page tables next to each other so we can just flip a bit to switch, and in the OpenBSD commit they seem to be allocating them separately and then storing the two pointers in a per-cpu data structure. Both are fine ways to do it.

I don't see anything like the Linux "espfix" mechanism. Guess there isn't one.

The BTS/PEBS buffers seem missing. This hardware may just not be supported.

waynecochran · on Feb 22, 2018

Having separate page tables for user space vs kernel space seems like the right thing to do from the beginning -- I was surprised that they were shared.

Could you elaborate on "we allocate the two top-level page tables next to each other so we can just flip a bit to switch"?

hansendc · on Feb 23, 2018

Sharing an address space is nice. Programs pass lots of pointers into the kernel (think read()/write()). For the kernel to get at that data, if the kernel and userspace share an address space you can just use the pointer. If userspace has a truly separate set of page tables, you have to do some relatively crazy (and slow) stuff to go and access the data from inside the kernel. It's much harder than just accessing a pointer.

We can still do this with KPTI because despite having two page tables, the kernel still continues to map all of userspace.

BTW, the bit flipping is pretty much

// Allocate 8k which is also 8k aligned:
pgd = alloc_pages(PAGE_SIZE*2);

That allocates something which might go from (physical address) 0x12340000->0x12342000. The first top-level pagetable is at 0x12340000 (for the kernel) and the second is at 0x12341000 (for running userspace).

So, when you leave the kernel you just set a bit in your CR3 register (well, you do it in assembly, but here it is logically in C):

write_cr3(read_cr3() | 0x1000); // 0x12340000 -> 0x12341000

When you come back in to the kernel you clear the bit:

write_cr3(read_cr3() & ~0x1000); // 0x12341000 -> 0x12340000
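
For anyone who wants to play with the arithmetic outside a kernel, here is a minimal userspace sketch of the same bit flip. It is only a model under stated assumptions: `fake_cr3`, `read_cr3_sim`, `write_cr3_sim`, `load_kernel_pgd` and the two switch helpers are hypothetical names made up for illustration, the "register" is an ordinary variable, and it ignores every other bit a real CR3 carries. It is not the actual Linux or OpenBSD code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout: a userspace model, not kernel code. */
#define PAGE_SIZE_BYTES 0x1000ULL        /* 4 KiB page */
#define USER_PGD_BIT    PAGE_SIZE_BYTES  /* bit 12 selects the second page */

/* Stand-in for the CR3 register; real code reads/writes the register. */
static uint64_t fake_cr3;

static uint64_t read_cr3_sim(void) { return fake_cr3; }
static void write_cr3_sim(uint64_t value) { fake_cr3 = value; }

/*
 * Point the simulated CR3 at the kernel top-level table. The base must be
 * 8 KiB aligned so that bit 12 is clear, which is what makes the single
 * bit flip below safe.
 */
static void load_kernel_pgd(uint64_t base)
{
    assert((base & (2 * PAGE_SIZE_BYTES - 1)) == 0);
    write_cr3_sim(base);
}

/* Leaving the kernel: set bit 12 to select the userspace table. */
static void switch_to_user_pgd(void)
{
    write_cr3_sim(read_cr3_sim() | USER_PGD_BIT);
}

/* Entering the kernel: clear bit 12 to select the kernel table. */
static void switch_to_kernel_pgd(void)
{
    write_cr3_sim(read_cr3_sim() & ~USER_PGD_BIT);
}
```

Calling load_kernel_pgd(0x12340000) and then the two switch helpers in turn walks the simulated value through the same addresses as the example above. The alignment check is the whole point of allocating the pair as one 8k-aligned block: without it, setting or clearing bit 12 could land on an unrelated page.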

blattimwind · on Feb 23, 2018

Other architectures (like SPARC) have had split PTs for like, always. (And they're not optional, either)

brynet · on Feb 22, 2018

The implementation is of course different, but the workaround is the same: separating the user/kernel page tables. One key difference is that OpenBSD doesn't have to deal with 32-bit binary compatibility [0] and such, and has also made considerable effort to minimize the number of kernel pages needed, AFAICT 6+(2*ncpus).

[0] https://marc.info/?l=openbsd-misc&m=148926149318522&w=2

Scaevolus · on Feb 22, 2018

No, but note that OpenBSD didn't learn about the issue until the public announcement. Linux developers had months to work on addressing the issue.

caf · on Feb 22, 2018

I have heard it mentioned that the core Linux kernel developers didn't get told months ahead - only some people at Red Hat specifically. Anyone know the truth of that?

cheeseprocedure · on Feb 22, 2018

Was any reasoning made public for this decision? It seems bizarre to have left the *BSD projects in the dark.

kiallmacinnes · on Feb 22, 2018

While there are obvious disadvantages to not being told early, could there have been some advantages? For example, Linux and Windows mitigations already worked out and ready as a reference?

(I'm not saying those advantages might outweigh the disadvantages, just speculating on the pros and cons!)

tedunangst · on Feb 22, 2018

The basic concept of separate page tables for userland and kernel is well known. It's how sparc64 has been implemented from the very beginning. But knowing this is the objective is almost meaningless to actually accomplishing it. The devil is in the details, of which there are many, and they aren't necessarily shared.

mfoy_ · on Feb 22, 2018

I mean... if you want to be really pedantic, there are technically some silver linings. But the disadvantages outweigh that stretch of a silver lining by so many orders of magnitude that it's almost insulting.

It would be like telling a mugging victim: "Hey, at least now you can justify buying a new purse!"

kiallmacinnes · on Feb 22, 2018

If you're downvoting, I'd love to know why.

I asked an honest question, made no statements I can imagine people disagreeing with, and went out of my way to make sure it couldn't be taken in a bad way...

tclover · on Feb 22, 2018

In Theo we trust

rurban · on Feb 23, 2018

In this case not really. The OpenBSD impl. looks very costly and is still incomplete. You may wonder why they didn't add a benchmark number to the commit. I suspect there's a lot of pagetable cache thrashing going on, with two completely separate page tables.

See https://news.ycombinator.com/item?id=16441341

amelius · on Feb 22, 2018

Is there an automated test for this? Otherwise, I'm worried that at some point somebody will accidentally break the code and open up the vulnerability again.

mschuster91 · on Feb 23, 2018

I can't think of a reliable (!) way to do scalable reproducible regression tests for this sort of hardware bug. Probably you'd need a dog slow version of qemu with full cpu emulation, no acceleration, and specific hacks to make the emulated cpu affected by spectre/meltdown.

jankotek · on Feb 23, 2018

Title should be "Meltdown workaround...". It is a workaround for a buggy CPU, not a problem in BSD itself.

yorby · on Feb 23, 2018

Any chance that Meltdown or Spectre were not backdoors (statistically)?

PhantomGremlin · on Feb 23, 2018

How, exactly, does that happen? Do dozens of Intel people all get together in a conference room?

So an Intel Project Manager comes into the room and says to a bunch of Intel's CPU designers: "the ghost of Andy Grove says we have to put a backdoor in for the NSA. They want it to be obscure. What they want us to do is to speculatively execute instructions across the user/kernel protection boundary. But wait ... don't allow the effects of those instructions to be easily visible. After all, it is nominally a protection boundary. Instead, just make sure to disturb the CPU cache. NSA can then easily observe cache timing changes and use that to read kernel memory. That should be good enough for NSA's needs. But, boy, won't people have a meltdown if this ever becomes widely known?"

"Oh, and by the way, top Intel engineers, you must keep this secret 'forever'. You can't tell anyone outside the company. You need to keep this on a 'need to know' with all your fellow design, verification, and test engineers, whom you must swear to secrecy. You can't anonymously leak this, or the NSA will find out and send you off to Gitmo for rendition. Mum's the word."

"Also, guys, there's no point in putting in this backdoor for a single chip. We must carry this backdoor forward to every new CPU design team that we bring on for every new CPU chip Intel makes. Sure, some chips are designed in Oregon and some in Israel, and some may even be designed in cyberspace. But we'll be sure to keep this backdoor in all future implementations. We need to let future CPU architects know that they must forever honor this promise we made to the NSA."

Yeah, that could have happened.

Or, more likely, out of the 1,000,000 design decisions, big and small, that go into creating a modern CPU which has literally billions of transistors, nobody involved thought this was a serious enough concern. If they thought of it at all. Because, ultimately, they get paid for shipping functional silicon rather than for worrying about information leakage in every possible corner case.

If this were so simple, why didn't someone figure it out before? Meltdown is mostly Intel, but the similar Spectre exploit affects Intel, ARM, and AMD. And has for many years.

Are all the engineers at all of those companies in on this backdoor? Or was the backdoor so diabolically clever that the "CPU architect cabal" at these companies were the only ones who needed to know?

speedplane · on Feb 23, 2018

An intentional backdoor: no. But an intentional ignorance of a major security vulnerability: extremely likely.

Computer architects have been aware of covert channels for literally decades, from heat signatures to CPU usage... heck, it's even taught in most CS or ECE Master's degrees with respect to caching. I wouldn't go as far as saying these two vulnerabilities were "obvious" to most programmers or computer engineers, but they must have been known to many people at the major chip-makers. Someone must have raised it in some meeting, probably multiple meetings, and was probably shot down.

So not a conspiracy theory, just lack of proper incentives to keep products secure.

microcolonel · on Feb 23, 2018

If you make any effort to understand how these bugs work, it's hard to imagine they were designed intentionally.

Though you could make the argument that it's likely some people knew about them and didn't say, but that's a different argument I think.

davidkuhta · on Feb 22, 2018

First thought was 'Who is Meltdown Fix, and why are they committed to OpenBSD?' *smh

Edit: Just meant to comment on the mild linguistic ambiguity in the title.