Inside Manage Patch Status

Back in August, I stumbled across a new type of DCOM 10016 error in SharePoint 2010, caused by the Product Version Job timer job. When I found the error, I was primarily concerned with keeping my event logs clean. Since then, the inelegance of my original work-around and the incomplete picture I contented myself with at the time began to nag at me, but I only recently started digging deeper, prompted largely by the fact that this topic has generated more traffic to my blog in the last quarter than any other.

This is a fairly lengthy consideration, but I think it’s necessary to cover these details because the information in the Manage Patch Status (AKA Check Product and Patch Installation Status) page in Central Administration may not be revealing what we’d reasonably infer.

In this post and the posts to follow, I’ll cover a few things:

  • Why I think granting Local Activation rights to the Windows Installer Service puts a dent in the least-privileged model.
  • What this DCOM error means to the reliability of data displayed in the new Manage Patch Status page in SharePoint 2010 Central Administration.
  • What the job does and doesn’t do, with or without rights to launch the Windows Installer Service.
  • Considerations for disabling the Product Version Job timer job.

The Problems

I believe most people will come to this problem in the way that I have, which I’ve seen repeated on many TechNet fora since then. People want to know why they are getting inundated with approximately 100 DCOM 10016 System event log errors and twice that many MsiInstaller Application event log warnings and informational events nightly, at around 00:52. The exact number of messages will vary based on the SharePoint products installed in the farm, including related products such as Project Server, Office Web Apps, FAST Search, etc. For a more detailed review of these events and how they can be identified, please refer to my original post.

Additionally, we have a nightly timer job which seems to be failing, per these DCOM errors. The job itself claims to check, “the install state of the machine and puts that data into the database”. This is rather vague. As of August this is what I understood:

  • The timer job appeared to fail to use the Windows Installer Service to perform a check of installed SharePoint products.
  • I didn’t know anything about how that check happened or how the data was used afterwards.
  • I didn’t know if the event log messages were ephemeral (annoying only because they generate clutter), as they are for the IIS WAMREG DCOM 10016 errors.
  • I felt it would be bad to grant rights to launch the Windows Installer Service to the farm account in an otherwise-least-privileged configuration (where the Farm account does not already have local administration rights).

In this post I want to dwell on the inner workings of the job itself, and then come back to the implications for our event logs, permissions and job scheduling.

Inside the Job

In order to find out how the job works, I had to crack it open in .NET Reflector and SQL Management Studio. I need to disclaim this post, because I’m not a developer, and to be perfectly honest I’m in a bit over my head with Reflector, but I was prompted to investigate in this way based on the apparent misinformation in one of the TechNet threads I mentioned above. Geoff Belair went to considerable lengths to work through this topic with Microsoft support, but from what I can tell, there are a number of mistakes in the answer he received. It suggests the wrong database gets updated and is a fairly inaccurate description of what this job does, by my reading of the following clues.

(In)validating the Microsoft Support Explanation

It’s unfair to take a Microsoft Support e-mail which has been re-posted on the web as authoritative, but this was the closest thing to official information I’ve found, other than the brief words about this job on TechNet and MSDN. The key bit of that reply that I wanted to immediately verify was this:

The Timer Job “Product Version Job” runs every night at 12:52 A.M and analyze which are the dlls are updated, once it get the information then it’s put the updated version data on to Content Database “dbo.version” table.

So I took a look at the dbo.Versions table in the Central Admin Configuration database (never do this in production, of course).

What caught my eye was that there was no product information in this table whatsoever. I knew that the job was checking for the state of individual products based on the MsiInstaller informational events in the Application logs. So I poked around a little more and found what I was expecting in the dbo.ServerVersionInformation table:

Having looked at this data, I realised it was pretty familiar. I went back to Central Administration, looked in the Upgrade and Migration section and clicked on Check Product and Patch Installation Status, which took me to this Manage Patch Status page. The key thing to note is that the version numbers and the Patch Status columns match the data on the page below precisely. I’ve actually manually updated that data just to give it a sneaky check, and this page is definitely pulling it in from that source. You’d never do this on a real system, however. I wouldn’t even do it without having a recent snapshot for my development environment.

At this point I was pretty confident the timer job was trying to update this table, but I wanted to get a bit better assurance before testing the job in anger. I also wanted to understand how the Windows Installer Service gets involved, as this activity seems to take place outside the ULS logs.

Analysing the job in .NET Reflector

If you crack open ULS Viewer while running the timer job, you immediately see the fourth event in my screenshot below: job-admin-product-version calls SPProductVersionJobDefinition.

This is where I opened Reflector. I started with Microsoft.SharePoint.dll and drilled down to Microsoft.SharePoint.Administration.SPProductVersionJobDefinition, which executes SPServerProductInfo.UpdateProductInfoInDatabase(Server.Local.Id);

SPServerProductInfo calls the GetMsiData method, which works with a number of SPMsi methods (SPMsi.GetPropertyUsingProductCode, SPMsi.MsiEnumPatchesEx, SPMsi.MsiGetPatchInfoEx, SPMsi.SPMsiSafeHandle, SPMsi.MsiOpenDatabase, SPMsi.MsiDatabaseQuery). Further down, SPProductVersionRow is clearly collecting the same data as the columns of the SQL dbo.ServerVersionInformation table I examined earlier. If you’re interested in these workings, I’d recommend perusing them with Reflector at a more leisurely pace than this.

Note: all of the SPMsi.Msi* methods are using msi.dll, which is the Windows Installer.

The Product Version Job’s use of the Windows Installer

From here, I could explore the workings of the Windows Installer in finer detail, but for the purposes of our SharePoint knowledge, all that’s really important to know is that the timer job is using the Windows Installer’s own methods to query the installed product versions on the servers. As I understand it, the Windows Installer typically stores this data in the registry, at HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall. That’s an oversimplification, but stick with me for now.

We can see a number of SharePoint products in this location. They are all keys beginning with “90140000”. Taking a look at the data in these keys, it’s pretty clear that it aligns with the data that’s written to SQL’s dbo.ServerVersionInformation table (down to the registry key value in the “Patchable Unit” column). Additionally, these are all the same products that are identified in our Application event log messages. You can even see the patched products have a longer key, with a suffix that looks something like “_Office14.OSERVER_{48017E90-141F-4948-A576-F4B9B6284B70}”.
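If you’d rather not click through regedit, here is a rough Python sketch of the same inspection. It is only an illustration, not anything the timer job does: the function names are my own, the sample key names in the usage note are made up (only the “_Office14.OSERVER_{...}” suffix comes from my registry), and I’m assuming the key names may or may not carry a leading brace before “90140000”.

```python
UNINSTALL_PATH = r"SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall"
PREFIX = "90140000"  # prefix seen on the SharePoint 2010 product keys


def list_uninstall_key_names():
    """Return the subkey names under the Uninstall key (Windows only)."""
    import winreg  # imported lazily so the helpers below work on any OS

    names = []
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, UNINSTALL_PATH) as key:
        subkey_count = winreg.QueryInfoKey(key)[0]
        for index in range(subkey_count):
            names.append(winreg.EnumKey(key, index))
    return names


def split_key_name(name):
    """Split a key name at its first underscore into (product, suffix).

    Keys without a suffix return an empty string as the suffix.
    """
    product, _, suffix = name.partition("_")
    return product, suffix


def sharepoint_keys(names):
    """Keep only the key names that start with the 90140000 prefix."""
    matches = []
    for name in names:
        # Tolerate a leading brace (assumption: names may be written as GUIDs).
        if name.lstrip("{").startswith(PREFIX):
            matches.append(split_key_name(name))
    return matches
```

On a SharePoint server, sharepoint_keys(list_uninstall_key_names()) should return one (product, suffix) pair per matching key, with a non-empty suffix for the longer, patched-looking keys. The filtering can be tried anywhere by passing in a hand-written list of names instead.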

Perhaps most importantly, the ProductVersion Property of the Windows Installer is what defines the four “version” values (including “DisplayVersion”) of the Uninstall keys above. This is the key information that the Product Version Job is after, and the name of this timer job feels like an even better fit in this context.
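To make that relationship concrete, here is a small Python sketch of how I understand a ProductVersion string maps onto those values. Treat it as an assumption rather than gospel: I’m assuming the other three values are Version, VersionMajor and VersionMinor, and that Version is a DWORD packing the major version into the top byte, the minor version into the next byte and the build into the low 16 bits. The function name is mine, and 14.0.4763.1000 is just an example version string.

```python
def decompose_product_version(product_version):
    """Derive the Uninstall-key version values from a ProductVersion string.

    Assumption: Version is packed as (major << 24) | (minor << 16) | build,
    and the fourth field of the string is not part of the packed value.
    """
    fields = [int(part) for part in product_version.split(".")]
    major, minor, build = fields[0], fields[1], fields[2]
    packed = (major << 24) | (minor << 16) | build
    return {
        "DisplayVersion": product_version,
        "VersionMajor": major,
        "VersionMinor": minor,
        "Version": packed,
    }


info = decompose_product_version("14.0.4763.1000")
print(info["VersionMajor"], info["VersionMinor"])  # prints: 14 0
print(f"{info['Version']:#010x}")                  # prints: 0x0e00129b
```

If the packing assumption holds, the hexadecimal Version value in the registry can be read back by eye: 0x0e is 14, 0x00 is 0 and 0x129b is 4763.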

While unravelling the job in this way has given me a fair amount of confidence about how SharePoint retrieves this information, there are still a number of issues to consider. For starters, I suspect people look at Manage Patch Status data and feel pretty confident about that representation of the installation state of their servers. Being a fairly skeptical type, I suspected that the Windows Installer’s “record keeping” would be good up to a point, but no further, so I put on my demolition hat and started breaking stuff, in an effort to place that point. In my next post I’ll review those test results, then consider the implications for DCOM rights to the Windows Installer Service and the timer job scheduling options.
