I’ve had a couple of customers lately who have had sudden issues with Azure Pack reporting an error 500 after logging on when it is used in combination with ADFS.
It’s because the ADFS certificate has been updated and the thumbprint in WAP no longer matches the one presented by ADFS.
I’ve modified Mark’s script a little so I can easily run it at various customers without modifying the URLs. It basically reads the old value from the config and re-uses that hostname for the ADFS DNS entry.
This script assumes you are using ADFS for both the tenant and admin sites.
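As a rough illustration of the hostname re-use described above, the sketch below takes a previously configured endpoint URL, keeps only its host name, and builds a federation metadata URL from it. The function name `Get-AdfsMetadataUrl`, the sample hostname, and the assumption that the stored value is a full URL are mine for illustration only; the actual script may read and write the configuration differently.

```powershell
# Hypothetical helper for illustration; not part of the original script.
function Get-AdfsMetadataUrl {
    param(
        [Parameter(Mandatory)]
        [string]$OldEndpoint
    )
    # Parse the old value as a URI and keep only the DNS host name.
    $adfsHost = ([System.Uri]$OldEndpoint).Host
    # Build the federation metadata URL on that same host.
    "https://$adfsHost/FederationMetadata/2007-06/FederationMetadata.xml"
}

# Example with a made-up hostname:
Get-AdfsMetadataUrl -OldEndpoint 'https://adfs.contoso.com/adfs/ls/'
```

Deriving the URL from the existing value is what lets the same script run unchanged at different customers.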
One more post in my WSUS/Hotfix series of blogposts. I’ve been asked a couple of times how we approve Hotfixes and if we include them in the images.
I’ve made an Autoapproval Rule where we approve all Hotfixes automatically to the various Computer Groups with a Deadline, like this.
And this is what the details look like:
First of all, any server that could cause problems if it rebooted automatically doesn’t have a deadline. That means servers like Hyper-V hosts and SOFS nodes. Those servers are managed by SCVMM’s (System Center Virtual Machine Manager) patch management. VMM has a feature that puts a cluster node in maintenance mode, automatically drains the node of VMs, patches it, and then brings the node back online before it takes the next node. So we handle all patching of clustered servers from SCVMM, while we let the WSUS client handle all other servers. We might add SCCM to the mix some day and let it handle all of the servers, but as most of our customers don’t want to run SCCM to manage their Fabric, this is the way we do it now.
By setting a deadline, we know the hotfix will be installed sooner or later. And if there is a Patch Tuesday before that date, the hotfixes will be installed at the same time.
Notice that the hotfix is NOT approved for All Computers and NOT for Unassigned Computers. How come?
When we build a VM image for any OS, it’s done automatically through MDT. Those VMs end up in Unassigned Computers as they don’t have a role yet, and we don’t want any hotfixes in the images. Of course, if there is a mandatory hotfix which is needed to make the image or deploy it, that one will be included!
The reasons we don’t want any hotfixes in an image is quite simple if you think about it. There are two main reasons really.
The first one is that if we make an image in August that contains hotfixes and deploy that image three months later, there is a good chance that the hotfix in the image has been replaced by a proper update from Microsoft, so there was no use for the hotfix in the first place.
Second, when we create an image, we don’t add Clustering, Hyper-V and other roles and features to the image, right? So Windows will only install the hotfixes for the core OS. When the image is later deployed and someone adds the Hyper-V role, Windows would install the hotfixes for that role then. The server wouldn’t be fully patched anyway, so adding 5 or 15 hotfixes automatically after deployment doesn’t really make much of a difference.
Third, a minor reason is also that we normally use the same images for Fabric, Workload and Tenants and we like to keep them quite generic.
When doing a Live Migration from SCVMM (System Center Virtual Machine Manager) with VSM, moving a Virtual Machine from one cluster to another cluster and at the same time to a new storage location, you get an error message similar to this:
Error (12700)
VMM cannot complete the host operation on the HOST07.FABRIC.DOMAIN.COM server because of the error: Virtual machine migration operation for 'markustest01' failed at migration source 'HOST07'. (Virtual machine ID FAA0957A-4AF9-4B84-8AE9-2E9BC56CA9A6)
Migration did not succeed. Could not start mirror operation for the VHD file '\\FASOFSL02.FABRIC.DOMAIN.COM\CSV02\markustest01-1\FAS2012R2-001G2.vhdx' to '\\FASOFS01.FABRIC.DOMAIN.COM\vDisk03\markustest01-6\FAS2012R2-001G2.vhdx': 'General access denied error' ('0x80070005').
Unknown error (0x8001)
Recommended Action
Resolve the host issue and then try the operation again.
The strange thing is that the destination folder is created in the new location, but nothing is copied to that folder and the operation aborts with the Access Denied error. But if you shut down the VM first, so it’s just a migration over the network, it works!
The solution is to give the SOURCE cluster write access on the DESTINATION storage. When you do a VSM migration, the destination Hyper-V host creates the directory on the SOFS node, but it’s the Hyper-V host that currently owns the VM that copies the VHD files to the destination storage. As the current owner does not have write access there by default, the copy fails. One could think that VMM should grant permissions to a host when VMM knows that the host needs to write in the location?
Maybe it’s fixed in the next version, but until then, there are two ways to do this.
Solution 1) In VMM, add the destination SOFS shares as storage on the source VM hosts like this. That will make VMM add the VM hosts with Modify permissions on the SOFS shares so they can write there.
This works quite well if the Hyper-V clusters and all storage are located in roughly the same location. But if you have one compute cluster with storage in one location and another compute cluster with storage in another location, there is a risk that you end up running VMs across the WAN link.
Solution 2) This is the one we used. By not using VMM to grant permissions to the shares, but rather doing it manually, we achieve the same result as above with the added benefit that a new VM will always be provisioned on the local storage, and there is no (or a lot less) risk of running a VM across the WAN link. Yes, it’s still technically possible to do it, but no one will provision a VM that uses storage in the other datacenter by accident.
You can add each node manually, but we have instead created a “Domain Servers Hyper-V Hosts” security group in AD to which we add ALL Hyper-V hosts during deployment, and then added that group to the share and NTFS permissions. All Hyper-V hosts will then automatically have write access to all locations they may need.
I wrote these two short scripts to query the VMM database for the available SOFS nodes and use PowerShell to grant permissions to the share and to NTFS.
As all our SOFS shares were called vDiskXX or CSVXX (where XX is a number), I just used vDisk* and CSV* to make the change on all those shares. You might have to modify it a little to suit your naming standard.
PowerShell
# Grant SOFS FileShare permissions to group: Domain Server Hyper-V Hosts
$group = (Get-ADGroup "Domain Server Hyper-V Hosts").Name
Updated Script (2016-02-04):
I got a report that the script was getting an error on some servers, which I managed to reproduce. Here is an alternative version that connects to the server and executes the ACL change locally via Invoke-Command. It also only changes permissions on Continuously Available (SOFS) shares.
A customer had a lot of VMs with dynamic MAC addresses, rather than the preferred method of using static MAC addresses.
Here is a small PowerShell script that will shut down each of the VMs with a dynamic MAC address, change it to a static MAC address and then start the VM.
I’m running the script on the System Center Virtual Machine Manager (SCVMM) server, and to make sure VMM does not shut itself down, I’ve added an exclude for the SCVMM server.
This blog post is about using System Center Virtual Machine Manager (SCVMM) Availability Sets to spread similar VMs to different Hyper-V hosts to increase reliability, both when using Failover Clustering and when using stand-alone Hyper-V hosts.
First of all, what are Availability Sets?
In SCVMM 2012 SP1, Microsoft added Availability Sets. Failover Cluster Manager users are probably familiar with AntiAffinityClassNames, and Availability Sets are a very similar concept. This allows the user to specify a set of VMs which they would prefer to keep on separate hosts, and the Intelligent Placement engine works hard to make sure that all our features respect that preference.
Attempting to place multiple VMs with the same Availability Set onto a single host will generate a placement warning, meaning that the host will be prioritized last in the placement dialog.
Placing a VM with an Availability Set into a cloud or as part of a service will avoid hosts with another VM from the same Availability Set, and warn the user if that was the only choice.
Dynamic Optimization will never move 2 VMs from the same Availability Set onto the same host. It will also actively attempt to separate any VMs with the same Availability Set that are on the same host.
Power Optimization will never power off a host that would lead to 2 VMs with the same Availability Set sharing a host.
Putting a host in maintenance mode will attempt to spread VMs with the same availability set to different target hosts.
If your VMs are highly available and hosted on a Hyper-V failover cluster, VMM will create AntiAffinityClassNames on the VMs with an Availability Set, so that even during cluster failover, the VMs opt to fail over to different hosts, if possible.
You can manually create Availability Sets through SCVMM by selecting Properties on a VM.
Just click Create to make a new name and assign it to the VMs you want to keep on separate hosts. When an Availability Set is no longer assigned to any VM, it will be deleted automatically, thus cleaning up the list for you.
For example, for your SQL Server cluster, you may want to create an Availability Set called SQL and assign it to your SQL Server nodes. Easy!
Also, if you are using Service Templates, you can opt in to automatically create Availability Set names for your services.
Though I like to control things like that automatically. Depending on your naming convention for your Virtual Servers, this might or might not be possible for you.
In our case we have a strict naming policy to name servers with:
PREFIX FUNCTION NUMBER as seen in this picture:
That makes it very easy for me to define that all servers called CLAZSQ* are similar and should be kept on different hosts.
But, if all servers were called SRV0001-SRV9999 it would not be possible to utilize the ServerName for setting Availability Set names, and you would have to query the CMDB for info first.
Also, in our environment we have multiple tenants, who could each have servers called DomainController01 and DomainController02. So just having an Availability Set called DomainController would not be enough. I have to make it DomainController_TenantName or something similar.
I wrote this quick and short PowerShell script to automatically assign an Availability Set to all VMs. It removes the numbers from the VM name and uses the VM name + UserID (tenant subscription ID) as the Availability Set name. Clean, simple and easy. Just schedule it to run regularly, or even make an SMA job that triggers when a VM is created through Azure Pack.
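The naming rule itself can be sketched as a small function. This is only an illustration of the rule described above, not the script itself: the function name `Get-AvailabilitySetName`, the underscore separator (borrowed from the DomainController_TenantName example) and the sample tenant value are assumptions.

```powershell
# Hypothetical helper illustrating the naming rule; not the original script.
function Get-AvailabilitySetName {
    param(
        [Parameter(Mandatory)][string]$VMName,
        [Parameter(Mandatory)][string]$UserId
    )
    # Remove every digit from the VM name, e.g. CLAZSQ01 becomes CLAZSQ.
    $baseName = $VMName -replace '\d', ''
    # Append the tenant identifier so equal names in different tenants stay apart.
    '{0}_{1}' -f $baseName, $UserId
}

# Two nodes of the same function in the same tenant map to the same set name.
Get-AvailabilitySetName -VMName 'CLAZSQ01' -UserId 'tenant-a'
Get-AvailabilitySetName -VMName 'CLAZSQ02' -UserId 'tenant-a'
```

In VMM, a name produced this way could then be applied with the `-AvailabilitySetNames` parameter of `Set-SCVirtualMachine`. Note that this sketch strips digits anywhere in the name, not only a trailing number, which may or may not fit your naming convention.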
I just noticed that the VM Usage History has been fixed in UR7. It’s been broken as long as I remember and used to only show the last 3 days of data, even if you selected 1 month. I don’t think it was mentioned in the ChangeLog for UR7?
I just updated our System Center Virtual Machine Manager 2012 R2 environment to Update Rollup 7. SCVMM then reported that the SCVMM managed computers had an out-of-date agent which needed to be upgraded.
It’s possible to do it manually by right-clicking each server and choosing “Update Agent”, or you can use this short PowerShell script to do it on all machines at the same time.
PowerShell
# Change name to the RunAs Admin Account specified in SCVMM
Here is the list of hotfixes I’m deploying in our production environment and that I deploy regularly at customers. Those production environments are a Fabric (Private Cloud) running Hyper-V, Storage Spaces, SOFS, ADFS, Domain Controllers, Azure Pack, System Center, SQL Servers, and more. Yes, everything you need in a Fabric, though not Exchange, Lync, SharePoint and so on. So this list might not be complete for your system.
And as always, use your own judgement as to which hotfixes you would like to deploy in your environment. Hotfixes are not tested as much as Service Packs used to be and Update Rollups are, so it’s possible there are problems with them.
My philosophy is that I like to have everything updated to reduce the risk of having a problem. The number of times I have had issues with a hotfix is, as far as I can remember, one (1), and that includes the several years I worked at Microsoft Premier Support, where I was assisting customers with problems and now and then provided a hotfix for an issue. So I would rather install the hotfixes I know of that are relevant, to reduce the risk of hitting a real problem, than wait for that issue to actually happen and then find a hotfix or open a case with Microsoft.
A hotfix includes all previous fixes for that module too, so when troubleshooting a problem, it’s common that Microsoft Support asks you to install hotfixes X, Y and Z to get the components involved in the problem to the latest revision. Thus, it might look like some of the KB articles and hotfixes below do not apply to you, or that you don’t have that problem in your environment. But if a hotfix is related to Cluster, Hyper-V or any other component that you do use, it might be wise to install it anyway, as it could fix 10 other problems that you are not aware of.
There is as far as I know (and I’ve also asked Premier Support) no way to script the import of updates into WSUS directly from Windows Catalog. You will have to manually use a Web Browser to import them. Click, Click, Click, wait, Click, Click….
The list is ordered by release date, so the latest hotfixes are at the top. Looking at a fresh Fabric deployment, it looks like most hotfixes older than 10/14/2014 have been superseded, except for KB2965733, which was still needed by a couple of servers in this fresh environment. But things might be different for you.
Interrupts to the Intelligent Platform Management Interface driver are missed in Windows Server 2012 R2 http://support.microsoft.com/kb/3061460 Released: 6/9/2015
Unexpected ASP.Net application shutdown after many App_Data file changes occur on a server that is running Windows Server 2012 R2 http://support.microsoft.com/kb/3052480 Released: 6/9/2015
Backup application that calls the VSS service becomes unresponsive when the DFSR service is running in Windows http://support.microsoft.com/kb/3054249 Released: 5/12/2015
Resolution of external DNS records on a Windows Server 2012 R2 Hyper-V guest cluster fails through a Hyper-V Network Virtualization Gateway http://support.microsoft.com/kb/3049448 Released: 5/12/2015
Shared Hyper-V virtual disk is inaccessible when it’s located in Storage Spaces on a Windows Server 2012 R2-based computer http://support.microsoft.com/kb/3025091 Released: 5/12/2015
“The URL cannot be resolved” error in DirectAccess and routing failure on HNV gateway cluster in Windows Server 2012 R2 http://support.microsoft.com/kb/3047280 Released: 5/12/2015
Hyper-V host crashes and has errors when you perform a VM live migration in Windows 8.1 and Windows Server 2012 R2 http://support.microsoft.com/kb/3031598 Released: 4/14/2015
Hotfix enables AD FS token replay protection for Web Application Proxy authentication tokens in Windows Server 2012 R2 http://support.microsoft.com/kb/3042121 Released: 4/14/2015
“HTTP 400 – Bad Request” error when you open a shared mailbox through WAP in Windows Server 2012 R2 http://support.microsoft.com/kb/3042127 Released: 4/14/2015
Files cannot be copied when drive redirection is enabled in Windows 8.1 or Windows Server 2012 R2 http://support.microsoft.com/kb/3042841 Released: 4/14/2015
“STATUS_PURGE_FAILED” error when you perform VM replications by using SCVMM in Windows Server 2012 R2 http://support.microsoft.com/kb/3044457 Released: 4/14/2015
“Your computer can’t connect to the remote computer” error because RD Gateway service freezes in Windows Server 2012 R2 http://support.microsoft.com/kb/3042843 Released: 4/14/2015
A SQL Server that is running in a Hyper-V virtual machine takes a long time to restore a database to a dynamic VHD http://support.microsoft.com/kb/2970653 Released: 3/10/2015
DNS server does not try the second forwarder and other DNS improvements in Windows Server 2012 R2 http://support.microsoft.com/kb/3038024 Released: 3/10/2015
“0x000000D1” Stop error when you fail over a cluster group in Windows Server 2012 or Windows Server 2012 R2 http://support.microsoft.com/kb/3036614 Released: 3/10/2015
Hotfix for update password feature so that users are not required to use registered device in Windows Server 2012 R2 http://support.microsoft.com/kb/3035025 Released: 3/10/2015
Added 7/18/2015 “0x0000003B” or “0x0000007E” Stop error on a Windows-based computer that has 4K sector disks https://support.microsoft.com/kb/3027108 Released: 2/10/2015
System may freeze if a reserved disk is mounted accidentally in Windows 8.1 or Windows Server 2012 R2 http://support.microsoft.com/kb/3027110 Released: 2/10/2015
RemoteApp window is too large or too small when you use RDP to run a RemoteApp application in Windows Server 2012 R2 http://support.microsoft.com/kb/3026738 Released: 2/10/2015
Operation fails when you try to save an Office file through Web Application Proxy in Windows Server 2012 R2 http://support.microsoft.com/kb/3025080 Released: 2/10/2015
You are not prompted for username again when you use an incorrect username to log on to Windows Server 2012 R2 http://support.microsoft.com/kb/3025078 Released: 2/10/2015
You are prompted for authentication when you run a web application in Windows Server 2012 R2 AD FS http://support.microsoft.com/kb/3020813 Released: 2/10/2015
Time-out failures after initial deployment of Device Registration service in Windows Server 2012 R2 http://support.microsoft.com/kb/3020773 Released: 2/10/2015
You are prompted for a username and password two times when you access Windows Server 2012 R2 AD FS server from intranet http://support.microsoft.com/kb/3018886 Released: 2/10/2015
RDS License Manager shows no issued free or temporary client access licenses in Windows Server 2012 R2 http://support.microsoft.com/kb/3013108 Released: 12/9/2014
iSCSI SAN server that’s running Windows Server 2012 R2 restarts unexpectedly on a high-speed network http://support.microsoft.com/kb/3000123 Released: 11/11/2014
TRIM and UNMAP activities for thin provisioning on one volume block all activities on other volumes http://support.microsoft.com/kb/2996802 Released: 11/11/2014
SMBv1 named pipe requests do not time out when the remote server hangs in Windows 7, Windows Server 2008, Windows 8.1, and Windows Server 2012 R2 http://support.microsoft.com/kb/2995054 Released: 10/14/2014
SMB 3.0 Transparent Failover feature does not work after you disconnect a drive cable in Windows http://support.microsoft.com/kb/2991247 Released: 10/14/2014
WTSQuerySessionInformation API function always returns zero bytes for WTSIncomingBytes and WTSOutgoingBytes http://support.microsoft.com/kb/2981330 Released: 10/14/2014
“0x00000018” Stop error when volumes are mounted in Windows Server 2012 R2 or Windows Server 2012 http://support.microsoft.com/kb/2973052 Released: 8/12/2014
Updates to improve the compatibility of Azure RemoteApp in Windows 8.1 or Windows Server 2012 R2 http://support.microsoft.com/kb/2977219 Released: 8/12/2014
Error 58 when an application calls BackupRead function to back up files that are shared by using SMB in Windows http://support.microsoft.com/kb/2973055 Released: 7/8/2014
2965733 The guest cluster is not available to service users after failover in a Hyper-V Network Virtualization environment https://support.microsoft.com/kb/2965733 Released: 6/10/2014
Windows Server 2012 R2 or Windows 8.1 crashes when virtual volumes are exposed to hyper-v virtual machines http://support.microsoft.com/kb/2925766 Released: 2/11/2014
Memory and deadlock issues for the RD Virtualization Host and RD Connection Broker role services in Windows 8.1 http://support.microsoft.com/kb/2908810 Released: 2/11/2014
Hotfix improves storage enclosure management for Storage Spaces in Windows 8.1 and Windows Server 2012 R2 http://support.microsoft.com/kb/2913766 Released: 1/14/2014
OffloadWrite is doing PrepareForCriticalIo for the whole VHD in a Windows Server 2012 or Windows Server 2012 R2 Hyper-V host http://support.microsoft.com/kb/2913695 Released: 1/14/2014
Earlier this week I needed to move a lot of VMs from a couple of hosts to another cluster. Instead of doing it one by one in VMM (Virtual Machine Manager), I wrote a small quick and dirty script that I had not really planned on publishing. But a customer needed that script today, so I figured more people might need it.
Enter the name of the current host where the VMs are running.
Enter the name of the destination host group (as seen in VMM). Start the script.
PowerShell
$VMHost = "FAHOST11"     # Name of current Host where VMs are running
$HostGroup = "Workload"  # Name Hostgroup of Destination Servers
$SleepTime = "10"        # Seconds to wait between each move. To give the Move time to complete before next start. Else all server might end up on the same Host.
The script will calculate the best possible host to move the VM to, then move it there and make it highly available.
I didn’t initially have the sleep line in my script, but I noticed while it was executing that it tried to move too many VMs at the same time (I think the default limit is 2), so some failed. Another issue is that the host rating may be wrong if it’s doing a lot of calculations while there are no VMs on the destination host, and then suddenly lots of VMs end up there at the same time. So a sleep should hopefully take care of both those problems.
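The rate-then-move loop described above could look roughly like the sketch below. It is not the published script: the function names `Select-TopRatedHost` and `Move-VMsFromHost` are made up for illustration, and the second function needs the VMM PowerShell module, so it is only defined here and not invoked. Depending on the environment, `Move-SCVirtualMachine` may also need a `-Path` for the destination storage.

```powershell
# Hypothetical helper: pick the highest-rated candidate with a rating above zero.
function Select-TopRatedHost {
    param(
        [Parameter(Mandatory)]
        [AllowEmptyCollection()]
        [object[]]$Rating
    )
    $Rating |
        Where-Object { $_.Rating -gt 0 } |
        Sort-Object -Property Rating -Descending |
        Select-Object -First 1
}

# Sketch of the move loop. Requires the VMM module; defined but not invoked here.
function Move-VMsFromHost {
    param(
        [Parameter(Mandatory)][string]$VMHost,
        [Parameter(Mandatory)][string]$HostGroup,
        [int]$SleepTime = 10
    )
    $targets = Get-SCVMHost -VMHostGroup (Get-SCVMHostGroup -Name $HostGroup)
    foreach ($vm in Get-SCVirtualMachine -VMHost (Get-SCVMHost -ComputerName $VMHost)) {
        # Re-rate the hosts for every VM so earlier moves are taken into account.
        $ratings = @(Get-SCVMHostRating -VM $vm -VMHost $targets -IsMigration)
        $best = Select-TopRatedHost -Rating $ratings
        if ($null -eq $best) {
            Write-Warning "No suitable host found for $($vm.Name)"
            continue
        }
        Move-SCVirtualMachine -VM $vm -VMHost $best.VMHost -HighlyAvailable $true -RunAsynchronously
        # Pause so the next rating sees the effect of this move.
        Start-Sleep -Seconds $SleepTime
    }
}
```

Rating the hosts inside the loop, rather than once up front, is what keeps all the VMs from landing on whichever host looked best before the first move.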
It looks like the events occur every 30 minutes, at the same time as Windows, for a so far unknown reason, reinstalls a lot of MSI packages. The above Interactive Service error is triggered at the same moment as DHCPExt.msi is reinstalled.
I can so far unfortunately not find anything that’s logging why Windows is reconfiguring all MSI Packages on the server every 30 minutes.
It does look like it’s the DHCP Server extension that’s causing the Interactive Service errors, as they always happen at the same time. Though, the DHCP Server extension shouldn’t be reconfiguring in the first place.
We always enable the Reliability History on all servers, which can be handy at times to see when a problem began happening.
Check this Out!
It looks like the problem started on April 28 at 8:42 PM.
As the Reliability History tool is disabled by default, I’ll make another blogpost showing how you can enable this feature for all your servers.
When I wanted to see what had happened around April 28th, I noticed that those were the oldest entries in the Application log. When the log became full, the oldest entries were removed according to the settings.
So I don’t think I’ll get any more details that way, and it does look like this problem has gone on for quite some time.
I’ll just reinstall the Hyper-V Host as it’s done in a few minutes compared to spending hours trying to fix the problem.
AND… I’ll create a Group Policy that increases the event log size to 10 times the default. So the next time something like this happens, I’ll have information to dig deeper.
You have a group policy with a WMIFilter that queries Win32_Product class.
You have an application installed on the machine that queries Win32_Product class.
As the problem is not happening every 90-120 minutes, which would be the case if it was triggered by Group Policy, I would say it’s an application that uses the Win32_Product class. And after doing some digging, it turns out it’s a known problem with VMM which will be fixed in UR7. Or hopefully earlier with a hotfix.
Updated 2015-05-19 10:12:
Wow, I got a hotfix for the issue within 15 minutes after contacting the VMM Team.
I’ve just installed it in our test environment and will later install it in the customer’s production environment.
Unfortunately I don’t have a KB or Hotfix ID for this, but if you contact Premier Support I think you can mention that you need a hotfix for Engine.Adhc.Operations.dll which gives support for RegKey: UpdateDHCPExtension
That info should make them able to find the correct hotfix.