The ‘Open’ Heartbleed


Before I start, let me let everyone know that I'm not a fan of open source, to the extent that you could call me anti open source.

So, on 7th April 2014, the world woke up to Heartbleed, a vulnerability in the VERY popular OpenSSL software, which is used by almost the entire world to secure their websites and systems. When I say the entire world, that includes giants like Google, Facebook, Yahoo, Netflix, Adobe and PayPal, to name a few. OpenSSL is an open source cryptographic library used to implement Transport Layer Security (TLS), known previously as Secure Sockets Layer (SSL). In simpler terms, it's the very engine that makes your secure websites and servers "secure", so a critical vulnerability being found in OpenSSL means nothing less than Armageddon, and Armageddon it has been in the digital world for the last two weeks.

The bug was discovered by Neel Mehta of Google Security on 1st April 2014 (All Fools' Day) but was made public seven days later. Some engineers (Riku, Antti and Matti) at a Finnish security firm called Codenomicon discovered the same bug on 3rd April, unaware that Google had already found it two days earlier. The term "Heartbleed" was coined by Ossi Herrala, a systems administrator at Codenomicon. The name refers to an extension of OpenSSL called Heartbeat, which is used to keep connections open even when there is no activity between the client and the server. Herrala thought it was fitting to call the bug Heartbleed because it was bleeding important information out of memory; let me try to explain how. By the way, Mehta donated the $15,000 bounty he was awarded for helping find the bug to the Freedom of the Press Foundation's campaign for the development of encryption tools for journalists to use when communicating with sources.

The RFC 6520 Heartbeat Extension tests TLS/DTLS secure communication links by allowing a computer at one end of a connection to send a “Heartbeat Request” message, consisting of a payload, typically a text string, along with the payload’s length as a 16-bit integer. The receiving computer then must send the exact same payload back to the sender.
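Just to make that layout concrete, here is a small sketch in Python of how such a heartbeat message could be put together. It is deliberately simplified (the real message travels inside a TLS record and carries mandatory padding, which I'm leaving out), and the function and variable names are mine, not OpenSSL's.

```python
import struct

def build_heartbeat_request(payload: bytes, claimed_length: int = None) -> bytes:
    """Simplified RFC 6520 heartbeat message:
    1 byte  - message type (1 = heartbeat_request)
    2 bytes - payload length as a big-endian 16-bit integer
    N bytes - the payload itself
    (TLS record framing and padding are omitted to keep the sketch short.)"""
    if claimed_length is None:
        claimed_length = len(payload)              # an honest request
    return struct.pack(">BH", 1, claimed_length) + payload

# Honest request: the declared length matches the actual 4-byte payload.
honest = build_heartbeat_request(b"bird")

# Malicious request: a 1-byte payload that claims to be 16 KB long.
malicious = build_heartbeat_request(b"x", claimed_length=0x4000)
```

That 16-bit length field is also what caps a single malicious request at roughly 64 KB of leaked memory.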

The affected versions of OpenSSL allocate a memory buffer for the message to be returned based on the length field in the requesting message, without regard to the size of the actual payload in that message. Because of this failure to do proper bounds checking, the message returned consists of the requested payload followed by whatever else happened to be in the allocated memory buffer. The problem was compounded by OpenSSL's decision to write its own version of the C dynamic memory allocation routines. As a result, the oversized memory buffer returned to the requestor was likely to contain data from memory blocks that had been previously requested and freed by OpenSSL. Such memory blocks may contain sensitive data sent by users or even the private keys used by OpenSSL. In addition, by using its own memory management routines OpenSSL bypassed mitigation measures in some operating systems that might have detected or neutralized the bug.
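The actual bug lives in OpenSSL's C code, but the logic of the missing bounds check is easy to model. Here is a rough Python sketch of what the vulnerable echo effectively did versus what the fix does; the buffer contents and names are invented purely for illustration.

```python
import struct

def vulnerable_heartbeat_echo(request: bytes, adjacent_memory: bytes) -> bytes:
    """Model of the flawed behaviour: echo back as many bytes as the request
    *claims* to contain, reading past the real payload into neighbouring memory."""
    _msg_type, claimed_length = struct.unpack(">BH", request[:3])
    memory = request + adjacent_memory        # request buffer followed by leftover heap data
    return memory[3:3 + claimed_length]       # BUG: no check against the real payload size

def fixed_heartbeat_echo(request: bytes, adjacent_memory: bytes) -> bytes:
    """Model of the fix: discard requests whose declared length exceeds the payload."""
    _msg_type, claimed_length = struct.unpack(">BH", request[:3])
    actual_payload = request[3:]
    if claimed_length > len(actual_payload):
        return b""                            # silently drop the malformed message
    return actual_payload[:claimed_length]

leftover = b"...-----BEGIN RSA PRIVATE KEY-----...admin:hunter2..."   # pretend heap contents
evil = struct.pack(">BH", 1, 0x4000) + b"x"                           # 1-byte payload, claims 16 KB

print(vulnerable_heartbeat_echo(evil, leftover))   # leaks the 'leftover' secrets
print(fixed_heartbeat_echo(evil, leftover))        # returns nothing
```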

The Heartbleed bug is exploited by sending a malformed heartbeat request with a small payload and a large length field to the server in order to elicit the server's response, permitting attackers to read up to 64 KB of server memory that was likely to have been used previously by SSL. Attackers in this way could receive sensitive data, compromising the security of the server and its users. Vulnerable data include the server's private master key, which would enable attackers to decrypt current or stored traffic via a passive man-in-the-middle attack (if perfect forward secrecy is not used by the server and client), or an active man-in-the-middle attack if perfect forward secrecy is used. The attacker cannot control which data are returned, as the server responds with a random chunk of its own memory.

The bug might also reveal unencrypted parts of users' requests and responses, including any form post data in users' requests, session cookies and passwords, which might allow attackers to hijack the identity of another user of the service. CVE-2014-0160 has been assigned to this vulnerability. At its disclosure, some 17 percent (around half a million) of the Internet's secure web servers certified by trusted authorities were believed to be vulnerable to the attack. The Electronic Frontier Foundation, Ars Technica, and Bruce Schneier all deemed the Heartbleed bug "catastrophic." Forbes cybersecurity columnist Joseph Steinberg described the bug as potentially "the worst vulnerability found (at least in terms of its potential impact) since commercial traffic began to flow on the Internet."

The vulnerability was introduced into OpenSSL's source code repository on December 31, 2011 by Dr. Stephen N. Henson, one of OpenSSL's four core developers, following a request from Dr. Robin Seggelmann, the change's author. The vulnerable code was adopted into widespread use with the release of OpenSSL version 1.0.1 on March 14, 2012. Dr. Seggelmann, the German programmer who wrote the code (are the Germans going to cause another world war? This time in the digital world??😉 ), worked on the OpenSSL project while doing his Ph.D. from 2008 to 2012 at the University of Münster. Adding to the drama of the situation, he submitted the code at 11:59 p.m. on New Year's Eve 2011, though he claims the timing has nothing to do with the bug. "I am responsible for the error, because I wrote the code and missed the necessary validation by an oversight. I failed to check that one particular variable, a unit of length, contained a realistic value. This is what caused the bug, called Heartbleed," said Seggelmann, now an employee of the German telecommunications provider Deutsche Telekom AG.

He said the developer who reviewed the code failed to notice the bug, which enables attackers to steal data without leaving a trace. “It is impossible to say whether the vulnerability, which has since been identified and removed, has been exploited by intelligence services or other parties,” he said.

Now, providing you with all the info given above is not the reason I've written this blog; the reason is to highlight one simple fact: you can't trust anything that is open source. This entire open source phenomenon started right in front of me, but unlike everyone else, I was never excited about it. The whole world going mad about open source looked very weird to me at the time, and the feeling is still the same. I used to share my thoughts with my peers, but nobody would agree with me. I used to sign off with the statement that one day this whole open source thing would cause trouble for all of us, BIG time. I could see that they thought I was a weirdo, but I was OK with that, because I'm not a person who would change his thought process just because everybody else thought differently. It's difficult to swim against the tide when you have no company and you are alone, but if your will is strong enough, you will make it to the shore.

Now, the heart of the open source community is bleeding, and it's bleeding heavily in the open. Something like this was bound to happen; it was imminent, just a matter of time as they say. Free stuff excites everybody, but all of us should remember that there are no free lunches in this world. If you are getting something for free, you've got to doubt it. These open source fanatics call Microsoft all kinds of names for their heavy licensing costs, and that looks very weird to me. If I create something of great value, then I would like to get paid handsomely for it. I believe in philanthropy, but giving away my valuable work for free doesn't qualify as philanthropy for me. Even as a user, if I could compare buying software with buying a car, I would like to buy a car from a very reputed brand like Ford, General Motors, Chevrolet or Honda and not a car built by my neighbourhood mechanic in his garage. Those big brands will charge me big bucks compared to the mechanic, but I'm sure the branded car is still going to be the better deal. I always thought that open source only looks good to novice users and to people who think they are great technocrats but don't really know anything about it; reading four blogs a day doesn't make you a techie. I'm still OK with someone liking the concept of open source, but when people started to dish out all this open software, that again proved my point that all of it was nothing but crap. To give you one very simple example, when Firefox came out, it was tagged as being way faster than Microsoft's IE, but the Firefox programmers were doing nothing but trying to make a fool out of their users. When you close Firefox, it disappears much faster than IE does, but it does just that: it disappears from the user's view, it doesn't get closed any faster than IE. If you try to open Firefox again right after closing it, you get an error saying that another instance is already running, and if you go to the Task Manager, you will see the process still running. This is nothing but gimmickry, and the novice users and the world's great technocrats still think that Firefox or Chrome are better browsers than IE because they are fooled by the eyewash that these open source guys indulge in. I always thought that even on the security front these browsers are no good, but I never tried to do any kind of research into that, mainly because I was just so sure of it. I always tell my wife never to use Firefox for any online shopping, bill payments or net banking. Now, this Heartbleed bug has once again confirmed my theory that open source software is less secure; here's how. Any website would need to do the following to get around Heartbleed.

  • Upgrade your server software to a non-vulnerable version. I can't give you general advice on how to do this because it depends on which software you are running.
  • After upgrading your software, generate a new SSL/TLS key and get a certificate for the new key. Start using the new key and certificate. (This is necessary because an attacker could have gotten your old key.)
  • Revoke the certificate you were previously using. (This is necessary because an attacker who got your old key could be using your old key and certificate to impersonate your site.)
  • Have your users change the passwords that they use to log in to your site. (This is necessary because users’ existing passwords could have been leaked.)

However, even if all of the affected certificates were to be revoked, contemporary web browser software handles certificate revocation poorly. The most frequent users of a site (often its administrators) can continue using a revoked certificate for weeks or months without the browser notifying them that anything is amiss. In this situation, an attacker can perform a man-in-the-middle (MITM) attack by presenting the certificate to unsuspecting users, whose browsers will behave as if they were connecting to the legitimate site. For example, some browsers only perform OCSP revocation checks for Extended Validation certificates, while others ignore certificate revocation lists completely.

Some background: SSL certificates are used to secure communication between browsers and websites by providing a key with which to encrypt the traffic and by providing third-party verification of the identity of the certificate owner. There are varying levels of verification a third-party Certificate Authority (CA) may carry out, ranging from just confirming control of the domain name (Domain Validation [DV]) to more extensive identity checks (Extended Validation [EV]). However, an SSL certificate (or any of the certificates which form a chain from the server's certificate to a trusted root installed in the browser or operating system) may need to be revoked. A certificate should be revoked when its private key has been compromised, when the owner of the certificate no longer controls the domain for which it was issued, or when the certificate was mistakenly signed. An attacker who holds an un-revoked certificate along with its private key can impersonate the legitimate site exactly as described above.

There are two main technologies for browsers to check the revocation status of a particular certificate: the Online Certificate Status Protocol (OCSP) and Certificate Revocation Lists (CRLs). OCSP provides revocation information about an individual certificate from an issuing CA, whereas CRLs provide a list of revoked certificates and may be received by clients less frequently. Browser support for the two forms of revocation varies from no checking at all to the use of both methods where necessary.

Firefox does not download CRLs for websites which use the most popular types of SSL certificate (all types of certificate except EV, which is usually displayed with a green bar). Without downloading the CRL, Firefox is happy to carry on as usual, letting people visit the website and transfer sensitive personal information while relying on a certificate that is no longer valid. In any case, even if OCSP were available, by default Firefox will only check the validity of the server's certificate and not attempt to check the entire chain of certificates (again, except for EV certificates).

Google Chrome, by default, does not make standard revocation checks for non-EV certificates either. Google does aggregate a limited number of CRLs and distribute them via its update mechanism, but it's not very efficient in that area. For the majority of Chrome users with the default settings, as with Firefox, nothing will appear to be amiss. For the security conscious, Google Chrome does have the option to enable proper revocation checks, but in this case the end result depends on the platform. On Windows, Google Chrome can make use of Microsoft's CryptoAPI to fetch the CRL, and it correctly prevents access to the site. However, RSA's CRL is not delivered in the conventional way: instead of providing the CRL in a binary format, it is encoded into a text-based format which is not the accepted standard. Mozilla's NSS (which is used by Firefox on all platforms and by Google Chrome on Linux) does not support that format, so on Linux, Google Chrome does make a request for the CRL but cannot process the response and instead carries on as normal.

Microsoft's web browser, Internet Explorer, is one of the most secure browsers in this context. It fetches revocation information (with a preference for OCSP, but it will fall back to CRLs) for the server's certificate and the rest of the certificate chain. Now, I think that should make my wife proud of my advice to her😉

All this gimmickry and all these security flaws in open source software stem from the fact that the motto of these guys is NOT to make great software; it is just to make something that at least 'looks' better than MS products. It's always easy to build a product that betters the competition's flaws, or at least its perceived flaws. For example, what makes IE a tad slower than some of these so-called faster browsers is things like the ones I just explained above, and these things are not only in the security area, they are in all perceivable areas. If I have an accident in my branded car, I will have a better chance of staying alive than in the car that a mechanic built, overlooking and compromising on safety features to make his car cheaper and faster. Microsoft's implementation of SSL/TLS remains completely unaffected by this bug, and while people might tell me that OpenSSL is a respected piece of open source software (or at least it was, till the time its heart bled), I would argue that the reputation is based largely on wishful thinking and open source mythology. What good can software be when it is made by a small group of developers, most of them volunteers and all but one working part-time? Huge parts of the Internet, multi-zillion dollar businesses, implicitly trust the work these people do, basically because of only two factors: a) it's free, and b) it's open source. The code being freely available makes people think that a lot of people would be able to review it and find bugs, but since this Heartbleed bug took years to be discovered (and it was not a highly complex thing to find; it can even be called a silly programming mistake), we now know that this is more of a myth than a reality. So, just to save a few millions, enterprises have chosen to put their prestige at stake. It just never ceases to amaze me.

You might think that at least now these big guys might stop trusting open source with their lives and start paying up for better software, but I don't think that's going to happen, because a) Heartbleed, even after all of its publicity, still remains a bug known only to geeks, and b) people think that lightning doesn't strike the same place twice. So, in my opinion, these people are not going to learn unless something more devastating happens yet again, something which makes people lose money from their bank accounts and makes them lose their emails and other confidential data. So, my advice to all of you is: don't take that car from the neighbourhood mechanic even if he gives it to you for free.


How to set the Default Excel Version when you have multiple versions on your server/PC


As per Microsoft, you need to install the oldest version first and the newest version last if you intend to use multiple versions of MS Office on the same machine. There are KB articles for each version of Office that explain this; http://support.microsoft.com/kb/2121447 is for Office 2010 and it has the KB numbers for the other supported versions of Office as well. But there is a big assumption in these articles: that all of us would like to make the latest version the default version. This is not always true, and hence the problem. In my scenario, there was a need to install Office 2010 on a server that already had Office 2003, and the 2003 version needed to remain the default Office application. If I install Office 2010 on top of 2003, the newer one becomes the default version, which is NOT what is needed, so the workaround was to repair the Office 2003 installation AFTER the installation of the 2010 version so that it becomes the default one again.

This is not a one-time affair though; Microsoft recommends following the same order (oldest first, newest last) while installing patches to the Office versions as well. Now, what happens if the only patches released in a given month apply to Office 2010? After you apply those patches, the 2010 version will become the default version. So, we again need to follow the same workaround, i.e. repair the Office 2003 installation to make it the default app.

I have only tested this for Excel, but I assume it should work for the others, such as Word and PowerPoint, as well. Also, let me clarify that I'm talking about "installing" the Office versions here and not using the 'Click-to-Run' technology discussed at http://support.microsoft.com/kb/982434

This is one of the reasons why it's not such a good idea to install multiple versions of the same app on the same machine. Microsoft does not recommend installing two or more versions of MS Office on the same machine, and they don't even support it if it has been done on a terminal server. So, the best way out is to have just one good version, but if you can't make do with that, then the above workaround is there for you.


Server has the RAID configuration intact but still can’t find the OS to boot up


I think all of us have been in this situation at some time or the other: we have a server whose RAID config has not been lost, but the server is still not able to find the OS. It is indeed a strange situation, because if the RAID config is not lost then the server should be able to see the Windows partition, and if it's able to see that then it should be able to boot off the files present there. This normally happens after a system board or HDD0 replacement. I found an explanation for this problem while working on such a situation with an IBM HW CE; he helped me fix the problem, and I asked him for the link where this info is documented. The link is here – http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5079636

It seems that loading the defaults in the Unified Extensible Firmware Interface (UEFI) deletes the boot order entries for any UEFI-aware operating system. These entries are required to boot the operating system correctly. The workaround for this problem is given below.

The workaround is dependent upon the particular UEFI operating system. Refer to the UEFI operating system information for the actual file name and path used to boot that particular operating system.

For example, for Microsoft Windows Server 2008, the steps are as follows:

1) Power on the system, and, when prompted, press F1 to enter setup.
2) Select Boot Manager.
3) Select Boot from File.
4) Select the GUID Partition Table (GPT) System Partition where you installed Microsoft Windows Server 2008. (This is usually the first entry in the available list of options.)
5) Select EFI.
6) Select Microsoft.
7) Select Boot.
8) Select bootmgfw.EFI.

The next question in my mind was whether the server will boot up normally (without the need to follow the above steps) the next time it is restarted, or especially if it is powered off and then powered on. Although I haven't tested it myself, the answer should be yes, because the operating system is supposed to recreate the UEFI boot order entry once we point the system at the boot file as described above.


I’m Back!!


OK, so here I am with my next blog after about two and a half years. That's a loooong time, and it flew really fast. I got married in the interim and life has turned on its head since then; I'm finding that being married is quite "different" from being single, your priorities change completely and you start leading …. ummmm a very 'different' life, for lack of a better word. But my absence from the blogging sphere was not entirely due to that. One of the other, more important reasons was my original motto behind writing blogs: that I would write only on topics on which not many people have written on the internet. I knew it would be difficult to find such topics, but then writing a blog a day was never my intent. Now, I have decided to write about things that I encounter in my everyday tech life; it could be a difficult-to-solve problem or just something that attracts my attention. If nothing else, it will serve as my online diary of tech solutions, which will at least ensure that I don't end up slogging through the same kind of problem again. If I end up helping some folks along the way, then I will consider myself lucky🙂


WINSXS


Who ate my disk ?????????????????????????????

It's that folder which resides in C:\Windows and takes up a lot of space on your Windows Server 2008 machine, something it never used to do in Windows Server 2003. What does it contain? Can we delete it? Can we move it to a different drive? Can we delete something that's under it? These are the questions that almost all of my team members have asked me over the recent past, and my short answer to them has been 'NO'. Explaining all about this folder (I know, geeks are supposed to call a folder a "directory" as the word 'folder' sounds a bit too laymanish, but I like to call it a folder) during work hours, and that too separately to every individual, demands a lot of time, and time is at a premium at work as I try to finish all of my day's tasks before the day ends (this line is for my manager, might just be useful during my next appraisal). This was not a problem for my team at my previous organization, as I used to have whiteboard sessions for them after work hours or sometimes even during weekends (sessions which later turned into "glassboard" sessions as I started using a glass wall partition for my illustrations instead of the usual whiteboard. Now, that's what I call being innovative ;)). I must also admit that they are the only ones in the entire world who read my blog😉. For my current team, I thought about sending an email to the team DL so that they get their answers pertaining to the WINSXS folder, and then I thought, why can't it be my next blog? So, here's my blog entry on the WINSXS folder.

WINSXS refers to 'Windows Side by Side', the concept of hosting different versions of the same file. Side-by-side technology is a standard for executable files in Microsoft Windows that attempts to reduce DLL hell. DLL hell designates a group of problems that arise from the use of dynamic-link libraries in Microsoft Windows, including version conflicts, missing DLLs, duplicate DLLs, and incorrect or missing registration. In SxS, Windows stores multiple versions of a DLL in the WinSXS subdirectory of the Windows directory and loads them on demand. This reduces dependency problems for applications that include an SxS manifest.

To start with, this folder takes up a LOT of space on the C: drive and keeps growing over time. The normal size is around 6-7 GB, but I've even seen it as big as 15 GB. One of the major differences between W2K8 and the previous versions is the move from an INF-described OS to 'componentization'. A component in Windows is one or more binaries, a catalog file, and an XML file that describes everything about how the files should be installed, from associated registry keys and services to the kind of security permissions the files should have. Components are grouped into logical units, and these units are used to build the different Windows editions.

Now, let us look at the folders that reside within the WINSXS folder
1. \Winsxs\Catalogs: Contains security catalogs for each manifest on the system
2. \Winsxs\InstallTemp: Temporary location for install events
3. \Winsxs\Manifests: Component manifest for a specific component, used during operations to make sure files end up where they should
4. \Winsxs\Temp: Temp directory used for various operations, you’ll find pending renames here
5. \Winsxs\Backup: Backups of the manifest files in case the copy in \Winsxs\Manifests becomes corrupted
6. \Winsxs\Filemaps: File system mapping to a file location
7. \Winsxs\: The payload of the specific component, typically you will see the binaries here.
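If you want to see how much each of these locations actually contributes on a given box, here is a quick sketch of the kind of script I'd use (run it from an elevated prompt; the path and the grouping of "everything else" into component payload are my own assumptions). One caveat: WinSxS makes heavy use of hard links into places like System32, so a naive walk like this overstates how much extra disk the folder really consumes.

```python
import os

WINSXS = r"C:\Windows\winsxs"            # adjust if Windows lives elsewhere
NAMED = {"catalogs", "installtemp", "manifests", "temp", "backup", "filemaps"}

def folder_size(path):
    """Sum file sizes under a path, skipping anything we aren't allowed to read."""
    total = 0
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass                     # locked or access-denied files are skipped
    return total

if __name__ == "__main__":
    component_payload = 0
    for entry in os.listdir(WINSXS):
        full = os.path.join(WINSXS, entry)
        if not os.path.isdir(full):
            continue
        size = folder_size(full)
        if entry.lower() in NAMED:
            print(f"{entry:<15} {size / 1024**3:7.2f} GB")
        else:
            component_payload += size    # the thousands of per-component folders
    print(f"{'(components)':<15} {component_payload / 1024**3:7.2f} GB")
```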

The WINSXS folder contains all of the OS components; it is called the component store. Each component has a unique name that includes the version, language, and processor architecture that it was built for. So, calling the WINSXS folder the entirety of the whole OS won't be all that wrong. This folder is also the reason why Windows 2008 doesn't ask for the installation DVD while installing additional features and roles or while running SFC, as used to happen with the previous versions of Windows. Windows 2008 copies the entire media content to the installed server, with all components placed in the Single Instance Storage (SIS) folder, C:\Windows\winsxs. When a feature or role is installed on the server, the required files are either copied or linked from the SIS folder, so no media is needed.

But why does this folder get bigger and bigger? The answer is 'servicing' (patches and service packs). In previous versions of Windows the atomic unit of servicing was the file; in Windows Server 2008 it's the component. When a particular binary is updated, Microsoft releases a new version of the whole component, and that new version is stored alongside the original one in the component store. The higher version of the component is used by the system, but the older version in the store isn't deleted (why it is kept around is explained a little further down). The third part of why the component store gets so large is covered next.

Not every component in the component store is applicable. For example, on systems where IIS is available but has not been installed, the IIS components are present in the store but not installed into any location on the system where they might be used. If you're familiar with how multi-branch servicing works in previous versions of Windows, then it'll make sense to you that there is a different version of the component for each distribution branch and service pack level, and that all these different versions are also stored in the WINSXS folder, even if they're not immediately applicable. So a single package that contains an update to one component will end up installing four versions of that component in the WINSXS folder; double that on a 64-bit operating system for some components.

Now that you know why the store can grow to be so large, your next question is probably why the older versions of the components aren't removed. The reason is reliability. The component store, along with other information on the system, allows Windows to determine at any given time which version of a component is the best one to use. That means that if you uninstall a security update, Windows can install the next highest version present on the system; there is no longer an "out of order uninstall" problem. It also means that if you decide to install an optional feature, Windows doesn't just choose the RTM version of the component, it looks for the highest version available on the system. As each component on the system changes state, that may in turn trigger changes in other components, and because the relationships between all the components are described on the system, Windows can respond to those requirements in ways that it couldn't in previous OS versions.

The only way to safely reduce the size of the WINSXS folder is to reduce the set of possible actions that the system can take, and the easiest way to do that is to remove the packages that installed the components in the first place. This can be done by uninstalling superseded versions of packages that are on your system. Windows Vista Service Pack 1 contains a binary called VSP1CLN.EXE, a tool that will make the Service Pack package permanent (not removable) on your system and remove the RTM versions of all superseded components. This can only be done because making the Service Pack permanent guarantees that the RTM versions will never be needed again. As far as I know, it only works on Vista and not on Windows Server 2008, but I haven't tested that yet.

Here is how to use this tool.
NOTE 1: After you use this cleanup tool, you will no longer be able to remove Service Pack 1, should any problems occur. Make sure that the system is stable before using.
NOTE 2: This tool is a one-time use tool. Once it’s used it will no longer work on the same installation.
Open Windows Explorer and navigate to C:\Windows\System32. Look for the file “vsp1cln.exe.”
Right click this file and select the ‘Run as Administrator’ option.
The Vista Service Pack 1 Cleanup Tool will remove all of the redundant files that it has replaced.
The amount of disk space you gain will depend on the system, what files are installed, etc.

For Windows Vista SP2, the Windows Component Cleanup Tool (compcln.exe), located in the system32 folder, seems to be the tool that can be used to clear the superseded versions, and it can be run even on Windows Server 2008 SP2 machines. The steps to run it are given below.

Running this compcln.exe tool is pretty simple:
1. Execute the command "Compcln.exe" at an elevated command prompt. The path is "c:\Windows\System32\compcln.exe".
2. You will be prompted with a question asking whether to keep SP2 permanently on the system:
This operation will make all service packs and other packages permanent on this computer.
Upon completion you will not be able to remove any cleaned packages from this system.
Would you like to continue? (Y/N):
3. Once you type "Y" and press Enter, the system will start performing the Windows components cleanup.

On my test run, it freed up 1 GB of space in the WINSXS folder on a Windows Server 2008 SP2 test box, and I know that's not much.

In short, if you have a dwindling C: drive and the WINSXS folder is the main culprit, sadly enough, you can't do much about it. I have seen people blaming Microsoft after running into disk space issues on servers where they have installed Windows Server 2008 on 20 GB drives. Their argument is that it was enough for W2K3; yes, it was enough for the previous version, but skipping due diligence while upgrading to W2K8 or while going for a new server build is the sole reason for their problems. Microsoft came out with this as the solution to the ubiquitous DLL hell problem of the previous versions of Windows. Who was not fed up with missing DLLs? How many times have you seen something like "A required DLL file, Z.DLL, was not found" or "The procedure entry point Y could not be located in X.DLL" when trying to run an application, or during startup? Expecting that your Ferrari would only need the same amount of fuel as your dad's car is nothing but stupidity. Microsoft recommends a minimum of 32 GB or greater for the OS alone (some application files also go onto the C: drive, so consider that as well), but going by that minimum requirement would be something like housing a family of four in a studio flat: it's not impossible, but it's obviously full of problems. I have always liked to have at least 50 GB allocated to the C: drive on servers, and the humongous size of the WINSXS folder is something that justifies my liking.

But then, I was wondering, what would happen if we delete WINSXS? And yes, I tried deleting the folder from a Windows Server 2008 SP2 test box. First, it wouldn't let me delete/move/rename it, so I had to mount its hard drive on another system and then delete it. But when I tried to boot the system up again, it didn't come up. So, DO NOT DELETE/RENAME/MOVE WINSXS.


Watson


From Science Fiction to Reality

The other day, I was looking at a VB Script which needed to be modified just a bit in order to be useful for me. Being a scripting illiterate, I was overwhelmed by the complexity of the code; I couldn't understand anything in the script, forget about successfully modifying it as per my needs. My failure to get the desired result drove me to think: why can't computers understand human language?? Why do we have to tell them things in their language and not ours?? Why couldn't I just type in what I wanted the script to do in plain English and get the computer to execute it?? I didn't ponder over it much as it was already very late and I needed to catch up on some sleep. The next day in the office was very much like any other day. Sometime during the middle of the day, an email from our chairman popped up in my inbox. I didn't bother to read it immediately as I was in the middle of something; later that day, I opened the mail and started reading it. Sam was announcing the success of WATSON in the game show Jeopardy!. As I read further, I found that Watson is the latest supercomputer built by IBM and that it could understand human language. What?????????? A computer that can understand human language?? Oh my God, isn't that what I was thinking about last night?? Something like that had never happened to me: what was science fiction for me the night before was a reality the very next morning. I was once again overwhelmed, this time by the enormity of the achievement. I always wanted to witness something like this, the next BIG stride in the world of technology, something that will change the way people use computers and something that will change the world itself. It had to be IBM to come up with something like this: thirty years after building the first PC, we have now given the world the first computer that can understand plain English.

The name Watson made me think of Dr. Watson, the Windows program error debugger that gathers information about your computer when an error occurs with a program. That was named after the character of the same name from the Sherlock Holmes stories; the original name of that diagnostic tool was Sherlock. But it was more than obvious that IBM's Watson was named after our founder, Thomas J. Watson. Sam's mail was about Watson winning Jeopardy!. Jeopardy! is an American quiz show featuring trivia in history, literature, the arts, pop culture, science, sports, geography, wordplay, and more. The show has a unique answer-and-question format in which contestants are presented with clues in the form of answers, and must phrase their responses in question form. The show has a decades-long broadcast history in the United States since its creation by Merv Griffin in 1964. It first ran in the daytime on NBC from March 30, 1964 until January 3, 1975; concurrently ran in a weekly syndicated version from September 9, 1974 to September 5, 1975; and later ran in a revival from October 2, 1978 to March 2, 1979. All of these versions were hosted by Art Fleming. Its most successful incarnation is the Alex Trebek-hosted syndicated version, which has aired continuously since September 10, 1984, and has been adapted internationally. Although I have never watched Jeopardy!, I immediately understood that making Watson compete on Jeopardy! was just a demonstration of its capabilities at understanding human language and providing the desired results. Sam himself said in his mail that we have not spent the last four years and millions of dollars just to win a game show.

The project under which Watson was built is known as the DeepQA project. Watson is designed according to the Unstructured Information Management Architecture, UIMA for short. This software architecture is the standard for developing programs that analyze unstructured information such as text, audio and images. Watson doesn't use any kind of database to answer questions, for the simple reason that it's impossible to build a database that can answer each and every question in the world. Thus, it has to rely on text, which in other words means unstructured data. Computers until now could only understand data presented to them in a structured format, but that's where Watson is different. The biggest challenge that Watson overcomes is simply to understand written text, and in order to understand text it has to understand the language very well. For example, when Watson searches for the answer to the Jeopardy! clue "This Greek King was Born in Pella in 356 BC", it might stumble upon a sentence like "Pella is regarded as the birthplace of prince Alexander". Now Watson has to understand that 'birthplace' and 'being born in' mean the same thing. It has to understand that although the sentence has no reference to any king, a prince will grow up to be a king. You might say that a simple keyword search could produce similar results, but this goes much beyond keyword searches. To demonstrate that, let me carry my Alexander example forward: while searching for the answer, Watson might also stumble upon another sentence like "Sreeraj, the king of laziness, was born in Pella". This sentence would be a better match if we were to rely on keyword searches, since it has a direct match for the words 'king' and 'born', which the previous sentence does not have. But, since Watson has been designed by much smarter people than me, it uses some smart algorithms which enable it to ignore this sentence and arrive at the correct answer, Alexander. In short, it has to analyze the sentence just the way a human would. Watson searches through its text data and generates hundreds of possible answers, evaluates each simultaneously and narrows its responses down to its top choice in about the same time it takes human champions to come up with their answers. Speed counts in a system doing this amount of processing in so little time. Humans get better with experience and so does Watson; it has been designed to learn from its mistakes and adapt dynamically to achieve a higher percentage of accuracy.
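To illustrate that keyword-search point with a toy example of my own (this is emphatically not how DeepQA scores candidates), a naive word-overlap count actually prefers the "wrong" sentence:

```python
clue = "This Greek King was Born in Pella in 356 BC"
candidates = [
    "Pella is regarded as the birthplace of prince Alexander",   # contains the answer
    "Sreeraj, the king of laziness, was born in Pella",          # the red herring
]

def keyword_overlap(clue: str, sentence: str) -> int:
    """Count how many distinct clue words also appear in the sentence."""
    return len(set(clue.lower().split()) & set(sentence.lower().split()))

for s in candidates:
    print(keyword_overlap(clue, s), "-", s)

# The red herring wins ('king', 'was', 'born', 'in', 'pella' all match), even though
# the first sentence is the one that actually answers the clue - which is why Watson
# has to understand language rather than just match keywords.
```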

Being a systems guy, I HAVE to tell you the system details of Watson. The system powering Watson consists of 10 server racks and 90 IBM Power 750 servers based on the POWER7 processor. The computing power of Watson can be compared to over 2,880 computers with a single processor core, linked together in a super high-speed network. A computer with a single processor core takes more than 2 hours to perform the deep analytics needed to answer a single Jeopardy! clue; Watson does this in less than three seconds. Up to now, Watson has been utilizing about 75% of its total processing resources. 500 gigabytes of disk hold all of the information Watson needs to compete on Jeopardy!. 500 GB might not seem like enough knowledge to compete on the quiz show, but consider this: Watson mainly stores natural language documents, which require far less storage than the image, video and audio files on a personal computer. The information is derived from about 200 million printed pages of text. The Watson server room is cooled by two industrial-grade, 20-ton air conditioning units, enough to cool a room about one-third the size of a football field. The hardware that powers Watson is one hundred times more powerful than Deep Blue, the IBM supercomputer that defeated the world's greatest chess player in 1997. The POWER7 processor inside the Power 750 is designed to handle both computation-intensive and transactional processing applications, from weather simulations, to banking systems, to competing against humans on Jeopardy!. Watson is optimized to answer each question as fast as possible; the same system could also be optimized to answer thousands of questions in the shortest time possible, and this scalability is what makes Watson so appealing for business applications. In the past, the way to speed up processing was to speed up the processor, which consumed more energy and generated more heat. Watson instead scales its computations over 90 servers, each with 32 POWER7 cores running at 3.55 GHz, which provides greater performance and consumes less power. Watson is not connected to the Internet; however, the system's servers are wired together by a 10 Gigabit Ethernet network.

Now, what is so great about Watson from a layman's point of view?? Having a computer understand human language is a BIG deal for us, but what does it change for the average computer user?? Imagine a person who could read all the information ever printed on planet earth. Think about the amount of knowledge that person would have, and now to the key part: imagine you could ask that person a question, and ask it in your everyday language. There would be many who would say that Google does the same job and answers all our questions, but what if you knew that the person sitting next to you knows the answer to your question? Wouldn't you prefer directing your question to him rather than sitting and thinking about how to phrase your question correctly for a Google search so that you get the most relevant results?? Anyway, Google doesn't provide direct answers to your queries; it just throws up a lot of relevant information which you need to sit and read to get the correct answer to your question. Google can't answer questions like "How much will I gain if I were to sell my stocks now?" or "What are the chances of my patient recovering from this disease without surgery?" Unlike with Google, you get straight answers here: you don't have to wade through tons of information, interpret it correctly and arrive at the answer, because Watson does all that for you. It's just like speaking to a knowledgeable person. My comparing Watson with Google doesn't mean that they are peers or competitors; Google is just a search engine, nothing more, whereas Watson is peerless. The difference is chalk and cheese, or should I say God and man.

At the moment, Watson has been tuned to win Jeopardy!, but it can easily be tuned to be useful in some of the world's biggest industries. IBM plans to make use of Watson in key areas like finance and medicine. For example, Rice University uses a workload-optimized system based on POWER7 to analyze the root causes of cancer and other diseases. To the researchers, it's not simply a more versatile server; it's a giant leap towards understanding cancer.

I don't deny the possibility that we may never hear about Watson again after the current euphoria disappears. It could very well be the next best thing that couldn't live up to the expectations. But even then, it marks one of the greatest achievements in computing history, one which will eventually help us build a smarter planet.


How Kerberos Works


What happens at the Windows logon screen between the moment you press Enter after typing your credentials and the moment the 'Loading Your Personal Settings' message appears

It hardly takes a second for your password to be accepted at the logon screen, but what goes on behind the scenes to log you on to your workstation in a domain environment takes much more than a second to explain.

My primary area of expertise is Active Directory, but if you have read my previous blogs then you would know that I haven't written anything on my favourite subject till now. That's because so much has been written on AD that there is nothing new I can write about. Moreover, I don't like to write a blog just to update my blog site. In order to write something, I need a subject which is not discussed on the web as much as other stuff is. This is because I want to contribute to the IT Infrastructure community in general and the Windows folks in particular through my blog, and writing on subjects which have been written about zillions of times by great authors would not contribute anything to our community. One example is ABE (Access Based Enumeration): there is not too much about ABE on the internet apart from the Microsoft website, and that came as a motivation for me to write about it. But I always wanted to write about AD, and now I have found something related to AD which does not have too much written about it. It's Kerberos, the preferred authentication protocol in Active Directory environments. This also gives me the opportunity to explain the behind-the-scenes action during a logon.

Kerberos replaces LM, NTLM and NTLMv2, which were used in the pre-Windows 2000 era (and are still used in some cases; we will come to that later). The Massachusetts Institute of Technology (MIT) developed Kerberos to protect network services provided by Project Athena. Project Athena was a joint project of MIT, Digital Equipment Corporation, and IBM (my current employer) to produce a campus-wide distributed computing environment for educational use. It was launched in 1983, and research and development ran until June 30, 1991, eight years after it began. As of 2010, Athena is still in production use at MIT. Project Athena was important in the early history of desktop and distributed computing. It created the X Window System, Kerberos, and the Zephyr Notification Service. It influenced the development of thin computing, LDAP, Active Directory, and instant messaging.

The name Kerberos comes from Greek mythology; it is the three-headed dog that guarded the entrance to Hades (hell). According to Hindu mythology, Lord Yama (the God of Death) has a dog named 'Sarvara', which sounds similar to Kerberos. Why this name was chosen remains a mystery to me.

Now, let's get to know this dog better. Kerberos sees users (which are usually the clients) as UPNs (User Principal Names) and services as SPNs (Service Principal Names). Your AD logon name, the one that looks like an email address (e.g., username@bigfirm.com), is your UPN, and Kerberos "introduces" UPNs to SPNs by giving a UPN a "ticket" to the SPN's service. Let's try to understand the user logon process through an example. Let's call our user OM. OM comes to the office in the morning and starts his workstation. At the login screen he enters his username and password. At this point, his workstation sends a pre-authenticator to the Authentication Service (AS) of his local KDC (Key Distribution Centre). The KDC is better known as the domain controller, or rather, the KDC is one of the roles of a domain controller. The KDC has two components: the Authentication Service, or AS, and the Ticket Granting Service, or TGS.

The pre-authenticator contains the current date and time in YYYYMMDDHHMMSSZ format; the Z at the end denotes that the date and time are in universal (Zulu) time. This info is encrypted with OM's password. Upon receiving the pre-authenticator, the AS decrypts it using OM's password, which it already has. If the AS is unable to decrypt the pre-authenticator, it means the user entered a password that doesn't match the user's password on the domain controller, and hence the user receives a message saying his/her password is wrong. If the AS is able to decrypt the pre-authenticator, it compares the date and time inside with the DC's own date and time; if the difference is not more than 5 minutes (the default value, which can be changed), the AS sends the user a TGT (Ticket Granting Ticket). This is the reason why all domain-joined machines need to have a time that does not differ from the DC's time by more than five minutes. The TGT is valid for 10 hours (again, a default value that can be changed). The TGT contains a temporary password for the user and is encrypted with the password of the krbtgt user account, an account that is created by default when the first DC is installed. The user does not need to decrypt the TGT and hence doesn't need to know the krbtgt account's password.

As you might have noticed, the user has still not logged on to his workstation; that's because the authentication process is yet to be completed. The next step starts when the user sends the newly acquired TGT to the TGS, which, as we discussed earlier, is the second component of the KDC. The TGS decrypts the TGT, which confirms that OM is indeed OM and not an impersonator, and then assigns OM a Service Ticket (ST) for his workstation. Service tickets generally contain the start time, total lifetime, security token and so on. This service ticket is encrypted with the workstation's computer account password in AD. OM presents this ST to his workstation, the workstation decrypts it (as it knows its own password) and then allows OM to log in. At this point OM sees the "Loading Your Personal Settings" message, then his profile gets loaded/created, and then he is ready to work on his workstation.

Now, to better understand Kerberos, let's take this example a bit further and see what happens when OM tries to access a file server. When the user tries to access any particular service, like a file server or print server, he needs to authenticate himself to that particular server or service.
This process starts with the user sending his TGT to the TGS; upon verifying it, the TGS assigns OM a service ticket for the file server, again encrypted with the password of the file server's computer account. OM presents this ST to the file server, the file server decrypts it, and OM is then allowed to view the list of shared folders on that server. Whether OM can enter any or all of those shares depends on whether he has the required NTFS and share permissions. This emphasises the fact that Kerberos is used for authentication, not authorization.
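To tie OM's morning together, here is a toy model of the whole exchange in Python. It is only a sketch of the idea: real Kerberos derives keys with proper string-to-key functions, uses Kerberos-specific encryption types and puts far more into its tickets. I'm using the third-party cryptography package's Fernet as a stand-in cipher, and every account name and password below is made up.

```python
import base64
import hashlib
from datetime import datetime, timedelta, timezone
from cryptography.fernet import Fernet, InvalidToken

def key_from_password(password: str) -> Fernet:
    # Toy key derivation; real Kerberos uses dedicated string-to-key functions.
    return Fernet(base64.urlsafe_b64encode(hashlib.sha256(password.encode()).digest()))

# Long-term secrets the KDC already knows (all names and passwords invented).
om_key         = key_from_password("OMs-password")
krbtgt_key     = key_from_password("krbtgt-secret")
fileserver_key = key_from_password("fileserver-machine-password")

# 1) Pre-authentication: the workstation encrypts the current UTC timestamp with OM's key.
pre_auth = om_key.encrypt(datetime.now(timezone.utc).strftime("%Y%m%d%H%M%SZ").encode())

# A wrong password means the AS simply cannot decrypt the pre-authenticator:
try:
    key_from_password("wrong-password").decrypt(pre_auth)
except InvalidToken:
    print("bad password - this is the failure OM would see at the logon screen")

# 2) The AS decrypts it with its own copy of OM's key and checks the clock skew (5 min default).
stamp = datetime.strptime(om_key.decrypt(pre_auth).decode(), "%Y%m%d%H%M%SZ")
skew = abs(datetime.now(timezone.utc).replace(tzinfo=None) - stamp)
assert skew <= timedelta(minutes=5), "clock skew too large - logon refused"

# 3) The AS issues a TGT: session data sealed with the krbtgt key, unreadable to the client.
tgt = krbtgt_key.encrypt(b"user=OM;session-key=abc;valid-for=10h")

# 4) The client hands the TGT to the TGS, which decrypts it (proving the KDC issued it)
#    and returns a service ticket sealed with the file server's machine-account key.
assert krbtgt_key.decrypt(tgt).startswith(b"user=OM")
service_ticket = fileserver_key.encrypt(b"user=OM;service=fileserver;lifetime=10h")

# 5) The file server decrypts the ticket with its own key and lets OM in; what he can
#    open from there is decided by share/NTFS permissions, not by Kerberos.
print(fileserver_key.decrypt(service_ticket))
```

The only point of the sketch is to show which key seals which message at each step; session keys, authenticators and renewals are all left out.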

Now that we have understood how Kerberos works, let's see why it is considered so cool and so much better than NTLM. There are several reasons; we will discuss the two most important ones. One of them is that with Kerberos, the user's actual password comes into play on the network only once a day (or once every 10 hours, to be exact); for the rest of the day the user uses his Ticket Granting Ticket (TGT) to authenticate to the various services he might need. The second is the advanced encryption techniques available to Kerberos. If you are using Windows Server 2008 or above, you can opt to use AES (Advanced Encryption Standard), which is one of the best generally available encryption techniques and hasn't been broken yet. For older versions of Windows you can use RC4-HMAC, which isn't a bad encryption technology either. Microsoft had to opt for comparatively weaker encryption technology in older versions of Windows because, until the late nineties, it was unlawful in the United States to export software which used encryption technology beyond a certain level. LM and NTLM were ridiculously easy to hack through replay/mirror attacks; NTLMv2 was much better, and Kerberos is extremely difficult or even impossible to hack.

It's time to look at the scenarios where Kerberos is NOT used for authentication. The first scenario is when you use an IP address in a UNC path; in this case Kerberos is not used because Kerberos needs SPNs and SPNs need DNS names. The second is trying to connect to a computer which is in a workgroup. The third is trying to connect to a pre-Windows 2000 computer. The most interesting scenario is when the domain controller is inundated with logon requests: it starts to log users in with the older authentication protocols to avoid the extra work that comes with using Kerberos. I think this is one of the contributing reasons why Microsoft recommends keeping the hardware resource usage of domain controllers at a maximum of 30%. But how would you know whether you have been logged in with Kerberos or something else?? There are several indicators: if you are not logged in with Kerberos then you wouldn't be able to add machines to the domain, you won't get any group policies, and so on, but the simplest way to find out is to run the command klist. Klist allows us to see the tickets that we currently have. It comes by default with Windows 7, Windows Server 2008 and later, and can be installed on previous versions; it's part of the Windows Server 2003 resource kit. If you are not logged in with Kerberos then there won't be any tickets.

That was the story of everything that happens within the blink of an eye during the logon process. I hope you liked what I had to share with you in my first blog on my favourite subject, Active Directory.
