Skip to main content

A Microsoft Employees Rant: “I Contribute to the Windows Kernel. We Are Slower Than Other Operating Systems. Here Is Why.”


Image result for windows logo png

I was having a discussion in a forum with Marc Bevand about how windows is still slow despite of the “fast” windows 8.

And out of nowhere an anonymous Microsoft developer who contributes to the Windows NT kernel wrote a fantastic and honest response acknowledging this problem and explaining its cause.

The post has since been deleted.
But I am re-posting it since it’s too insightful.


PS: The anonymous poster himself deleted his post as he thought it was too cruel and did not help make his point, which is about the social dynamics of spontaneous contribution. However he let me know he does not mind the re-post at the condition I redact the SHA1 hash info, which I did


"I’m a developer in Windows and contribute to the NT kernel. (Proof: the SHA1 hash of revision #102 of [Edit: filename redacted] is [Edit: hash redacted].) I’m posting through Tor for obvious reasons.
Windows is indeed slower than other operating systems in many scenarios, and the gap is worsening. The cause of the problem is social. There’s almost none of the improvement for its own sake, for the sake of glory, that you see in the Linux world.
Granted, occasionally one sees naive people try to make things better. These people almost always fail. We can and do improve performance for specific scenarios that people with the ability to allocate resources believe impact business goals, but this work is Sisyphean. There’s no formal or informal program of systemic performance improvement. We started caring about security because pre-SP3 Windows XP was an existential threat to the business. Our low performance is not an existential threat to the business.
See, component owners are generally openly hostile to outside patches: if you’re a dev, accepting an outside patch makes your lead angry (due to the need to maintain this patch and to justify in in shiproom the unplanned design change), makes test angry (because test is on the hook for making sure the change doesn’t break anything, and you just made work for them), and PM is angry (due to the schedule implications of code churn). There’s just no incentive to accept changes from outside your own team. You can always find a reason to say “no”, and you have very little incentive to say “yes”.
There’s also little incentive to create changes in the first place. On linux-kernel, if you improve the performance of directory traversal by a consistent 5%, you’re praised and thanked. Here, if you do that and you’re not on the object manager team, then even if you do get your code past the Ob owners and into the tree, your own management doesn’t care. Yes, making a massive improvement will get you noticed by senior people and could be a boon for your career, but the improvement has to be very large to attract that kind of attention. Incremental improvements just annoy people and are, at best, neutral for your career. If you’re unlucky and you tell your lead about how you improved performance of some other component on the system, he’ll just ask you whether you can accelerate your bug glide.
Is it any wonder that people stop trying to do unplanned work after a little while?
Another reason for the quality gap is that that we’ve been having trouble keeping talented people. Google and other large Seattle-area companies keep poaching our best, most experienced developers, and we hire youths straight from college to replace them. You find SDEs and SDE IIs maintaining hugely import systems. These developers mean well and are usually adequately intelligent, but they don’t understand why certain decisions were made, don’t have a thorough understanding of the intricate details of how their systems work, and most importantly, don’t want to change anything that already works.
These junior developers also have a tendency to make improvements to the system by implementing brand-new features instead of improving old ones. Look at recent Microsoft releases: we don’t fix old features, but create new ones. New features help much more at review time than improvements to old ones.
(That’s literally the explanation for PowerShell. Many of us wanted to improve cmd.exe, but couldn’t.)
More examples:

  • We can’t touch named pipes. Let’s add %INTERNAL_NOTIFICATION_SYSTEM%! And let’s make it inconsistent with virtually every other named NT primitive. 
  • We can’t expose %INTERNAL_NOTIFICATION_SYSTEM% to the rest of the world because we don’t want to fill out paperwork and we’re not losing sales because we only have 1990s-era Win32 APIs available publicly. 
  • We can’t touch DCOM. So we create another %C#_REMOTING_FLAVOR_OF_THE_WEEK%! 
  • XNA. Need I say more? 
  • Why would anyone need an archive format that supports files larger than 2GB? 
  • Let’s support symbolic links, but make sure that nobody can use them so we don’t get blamed for security vulnerabilities (Great! Now we get to look sage and responsible!) 
  • We can’t touch Source Depot, so let’s hack together SDX! 
  • We can’t touch SDX, so let’s pretend for four releases that we’re moving to TFS while not actually changing anything! 
  • Oh god, the NTFS code is a purple opium-fueled Victorian horror novel that uses global recursive locks and SEH for flow control. Let’s write ReFs instead. (And hey, let’s start by copying and pasting the NTFS source code and removing half the features! Then let’s add checksums, because checksums are cool, right, and now with checksums we’re just as good as ZFS? Right? And who needs quotas anyway?) 
  • We just can’t be fucked to implement C11 support, and variadic templates were just too hard to implement in a year. (But ohmygosh we turned “^” into a reference-counted pointer operator. Oh, and what’s a reference cycle?) 

Look: Microsoft still has some old-fashioned hardcore talented developers who can code circles around brogrammers down in the valley. These people have a keen appreciation of the complexities of operating system development and an eye for good, clean design. The NT kernel is still much better than Linux in some ways — you guys be trippin’ with your over-commit-by-default MM nonsense — but our good people keep retiring or moving to other large technology companies, and there are few new people achieving the level of technical virtuosity needed to replace the people who leave. We fill headcount with nine-to-five-with-kids types, desperate-to-please H1Bs, and Google rejects. We occasionally get good people anyway, as if by mistake, but not enough. Is it any wonder we’re falling behind? The rot has already set in."

Comments

Popular posts from this blog

Visualizing large scale Uber Movement Data

Last month one of my acquaintances in LinkedIn pointed me to a very interesting dataset. Uber's Movement Dataset. It was fascinating to explore their awesome GUI and to play with the data. However, their UI for exploring the dataset leaves much more to be desired, especially the fact that we always have to specify source and destination to get relevant data and can't play with the whole dataset. Another limitation also was, the dataset doesn't include any time component. Which immediately threw out a lot of things I wanted to explore. When I started looking out if there is another publicly available dataset, I found one at Kaggle. And then quite a few more at Kaggle. But none of them seemed official, and then I found one released by NYC - TLC which looked pretty official and I was hooked.
To explore the data I wanted to try out OmniSci. I recently saw a video of a talk at jupytercon by Randy Zwitch where he goes through a demo of exploring an NYC Cab dataset using OmniSci. A…

ARCore and Arkit, What is under the hood: SLAM (Part 2)

In our last blog post (part 1), we took a look at how algorithms detect keypoints in camera images. These form the basis of our world tracking and environment recognition. But for Mixed Reality, that alone is not enough. We have to be able to calculate the 3d position in the real world. It is often calculated by the spatial distance between itself and multiple keypoints. This is often called Simultaneous Localization and Mapping (SLAM). And this is what is responsible for all the world tracking we see in ARCore/ARKit.
What we will cover today:How ARCore and ARKit does it's SLAM/Visual Inertia OdometryCan we D.I.Y our own SLAM with reasonable accuracy to understand the process better Sensing the world: as a computerWhen we start any augmented reality application in mobile or elsewhere, the first thing it tries to do is to detect a plane. When you first start any MR app in ARKit, ARCore, the system doesn't know anything about the surroundings. It starts processing data from cam…

ARCore and Arkit: What is under the hood : Anchors and World Mapping (Part 1)

Reading Time: 7 MIn
Some of you know I have been recently experimenting a bit more with WebXR than a WebVR and when we talk about mobile Mixed Reality, ARkit and ARCore is something which plays a pivotal role to map and understand the environment inside our applications.
I am planning to write a series of blog posts on how you can start developing WebXR applications now and play with them starting with the basics and then going on to using different features of it. But before that, I planned to pen down this series of how actually the "world mapping" works in arcore and arkit. So that we have a better understanding of the Mixed Reality capabilities of the devices we will be working with.
Mapping: feature detection and anchors Creating apps that work seamlessly with arcore/kit requires a little bit of knowledge about the algorithms that work in the back and that involves knowing about Anchors. What are anchors: Anchors are your virtual markers in the real world. As a develope…