Thor Projects LLC - Welcome : Blog - Robert Bogue [MVP]
Monday, March 31, 2008

The Monster Server Project - VMWare, iSCSI, and learning

So I’ve been working on a project for quite some time, and I’m finally seeing the light at the end of the tunnel.  Doing work with SharePoint means getting to know virtual machines and, relatively speaking, throwing a lot of hardware at the problem.  I, like many other SharePoint consultants, have ended up with a high end laptop.  However, that’s scored me the reputation as the guy with the heaviest bag.  Not only do I have to carry a big laptop but I’m also carrying two or three external hard drives at any one time.  That may make my chiropractor happy but it’s not something that I’m real fond of.

So my idea is that I’d get a big virtual server up and running and I’d just connect to the images running on that server – instead of trying to run them locally.  The idea sounds good, particularly when I’m in my office with reliable high speed connectivity.  However, there’s some question as to how this will work at clients and on the road.  I’ll let everyone know how that goes when I get there.  It is, however, the impetus for this project.

I had purchased a server to do a conversion project.  The short story is I needed a server with fast disks and a fast processor to handle some conversion work.  Yes, the project was big enough to just bundle in the cost of the hardware.  I ended up with a Dell 1950 1U server with a Quad Core 2.66 GHz processor and 4GB of RAM.  I also configured it with 4x 146GB SAS 10K drives.  I didn’t need that much space, but I was concerned about the speed of the drive arms.  While I was at it I added a Dell Remote Access card so I could get to the server even if something bad happened.  The server is better than my core network infrastructure by a wide margin – it’s the first truly server class machine I have.

The conversion project came to an end and I was left with a server that by all accounts could make a good virtual host.  However, there wasn’t enough RAM to run multiple machines (to do things like test farm configurations for SharePoint).  So I went to buy RAM for the machine.  I thought it would hold 16GB of RAM and was prepared to make that purchase.  However, I realized upon further review that it would take 32GB of RAM.  So I maxed it out.  My wife will tell you this is a standard response from me.  I rarely go halfway.  (I think if she understood she’d be confused by the fact that it only has one quad core processor in it.)

I had configured the disks on the machine in a RAID 1+0 for performance … I wanted to keep that because in addition to my development machines I wanted to be able to run my web site off of the machine.  When I started to look at the RAM to disk ratio I could tell it was out of whack.  By this time I had decided to use VMWare Infrastructure (ESX) to host my virtual machines.  My friends at Bluelock had already convinced me that the performance was in some cases better than on physical hardware.  (Watch Windows boot and you still won’t believe how fast it is.)  So that meant I had to figure out storage options that would work with VMWare.

Well, VMWare doesn’t support USB drives, and their support for storage controllers is VERY limited.  Basically, I couldn’t find a direct attach solution for SATA drives that was supported.  Why SATA drives?  Well, because they’re cheap for large capacities.  Since I had fast storage, I needed some Tier 2 slow storage.  Honestly, there’s not much you can do to beat $250 for a 1TB drive.

What is supported in VMWare is iSCSI.  iSCSI is the poor man’s SAN technology.  I call it the poor man’s SAN because it’s much less expensive than a Fibre Channel solution.  The performance can be relatively close to Fibre Channel performance if correctly configured.  For what I needed iSCSI was going to be more than adequate.

Well, that’s fine except that iSCSI cabinets are on the somewhat expensive side – even those that take SATA drives.  After a relatively exhaustive search I found a 2U unit sold by Enhance Technologies called RS8IP.  It can hold a total of 8 SATA drives.  For my initial bite at the apple I settled on 4x 1TB Seagate drives.  The MSRP for the RS8IP is slightly over $3K and the 4 drives cost about another $1K.  Yes, I am talking about spending $4K on storage – but it’s 4TB of unprotected storage.  I opted for RAID5, which left me with 3TB of usable space (actually about 2,793 GB as the operating system reports it, since 1TB drives are really slightly smaller than that when you do your division by 1024 instead of 1000).
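The arithmetic behind those numbers is easy to check.  Here is a minimal sketch in Python, assuming each drive holds exactly 10^12 bytes (real drives vary a little) and that RAID5 gives up one drive’s worth of space to parity.  The function and constant names are just for illustration.

```python
DECIMAL_TB = 1000 ** 4  # how drive makers count a terabyte
BINARY_GB = 1024 ** 3   # how Windows counts a "GB"


def raid5_usable_bytes(drive_count: int, drive_bytes: int) -> int:
    """RAID 5 spends one drive's worth of capacity on parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drive_count - 1) * drive_bytes


# Assumption: each "1TB" drive is exactly 10**12 bytes.
usable = raid5_usable_bytes(4, 1 * DECIMAL_TB)
print(usable // DECIMAL_TB, "TB as sold")                # 3
print(usable // BINARY_GB, "GB as the OS reports it")    # 2793
```

So the same array is 3TB on the invoice and 2793 GB on the screen.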

I’ve had VMWare up and running for a while on the system, but honestly I’ve not had the time to really get to know it too well.  I do know that it’s really quick with the virtual machines I’ve been running on it.  I hadn’t fully converted to using it for my development systems for space reasons – and it hadn’t been moved to its permanent home, it’s been sitting here at the house so I couldn’t really run multiple instances and get to them from the outside.  Anyway, I was impressed by Windows boot times and general performance.

Configuring the RS8IP was pretty easy, except for some mistakes I made.  There’s a quick install that can be accessed from the front of the unit.  I did that and when I got asked about LBA64 – I said Yes.  That, I would learn later, would be a problem.

I started out trying to figure out how to get VMWare to talk to the unit.  I got an error message about VMotion and iSCSI having not been licensed… Interesting since the licensing screen said I was licensed for it.

A quick call to a buddy and I was told that I had to create a VMKernel port for iSCSI on my virtual switch in VMWare.  Once I created that I could create the iSCSI storage adapter.  Once I had the adapter I scanned for the iSCSI targets (LUNs) I exposed on the RS8IP.  No dice.

It was about this time I was really not feeling good.  It’s not too often that I wander off into areas that I don’t have a way to test and troubleshoot but I was concerned this might be one of those times.  However, I saw instructions for the MS iSCSI initiator.  I downloaded it and installed it into a virtual machine with Windows XP on it.  The MS iSCSI initiator loaded up, it saw the RS8IP, but it didn’t show any volumes.  As a verification, I went over to a Vista machine I have and set it up on the Vista machine.  Presto… I had a 2793 GB drive showing up.

It was about this time that I started thinking about what might be different between Vista and XP… and I realized that LBA64 support was a likely candidate.  I thought that perhaps VMWare would have the same limitations so … I rebuilt the drive array without LBA64 support.

And … Presto, after a few short hours I could see the drive in XP.  (By the way, had I paid better attention I could have started the test while the array was rebuilding, but I forgot to set up the LUN and, well, I didn’t remember until after the array was rebuilt.)

So it’s visible in XP and in Vista… but still not visible in VMWare.  A few emails and a little while later I had a buddy at Bluelock respond to ask me if I had enabled iSCSI through the VMWare firewall… Doh!  Once I enabled it through the VMWare firewall and rescanned, I had a LUN show up.  Success … or so I thought.

I went to add the storage to VMWare and I was told that there wasn’t a partition table, it was corrupt, the world was coming to an end… well, you get the point.  I briefly played around with the fdisk utility from the command prompt and decided that there were enough problems VMWare was having that I should probably turn LBA 64 back on and see what happened.

And I didn’t see the drive.  So I thought, well, maybe it has to fully rebuild.  A few hours later … after the rebuild was completed, I tried again.  No dice.  By this time I’m getting pretty frustrated.  I’m feeling like I’m trying to guess the combination to a lock.

After some more research it became apparent that the LBA64 question comes up right at 2TB.  So once again I deleted the array and this time created two volumes and two LUNs.  One volume was 1396 GB and the other was 1397 GB.  Neither had LBA64 turned on.  So I rescanned from VMWare and found both LUNs.  I went in and got VMWare to add them to the storage and even copied files to them almost immediately.  Sure the array was still building in the background, but I copied a non-trivial amount of data over pretty quickly.
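For what it’s worth, the 2TB line is where you’d expect it if sector addresses are 32 bits wide and sectors are 512 bytes: 2^32 × 512 bytes is exactly 2048 GB (counting by 1024).  Whether that is precisely the mechanism inside the RS8IP and VMWare is my assumption, not something I verified.  Here is a small Python sketch that splits a capacity into the fewest roughly equal LUNs that each stay under that line.  The function name is made up for illustration.

```python
SECTOR_BYTES = 512
# Largest size addressable with 32-bit sector numbers: 2 TiB.
MAX_32BIT_LBA_BYTES = (2 ** 32) * SECTOR_BYTES


def split_into_luns(total_gb: int, limit_bytes: int = MAX_32BIT_LBA_BYTES) -> list:
    """Split total_gb (binary GB) into the fewest near-equal LUNs,
    each strictly smaller than the limit."""
    limit_gb = limit_bytes // 1024 ** 3  # 2048 for the default limit
    count = 1
    # -(-a // b) is ceiling division: the size of the largest piece.
    while -(-total_gb // count) >= limit_gb:
        count += 1
    base, extra = divmod(total_gb, count)
    return [base + (1 if i < extra else 0) for i in range(count)]


print(split_into_luns(2793))  # [1397, 1396]
```

That matches the two volumes I ended up creating by hand.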

Sunday morning while my family was sleeping I decided to have some fun.  So I simultaneously installed: Windows XP, Windows Vista, Windows Server 2003 R2, Windows Server 2003 R2 x64, Windows Server 2008, and Windows Server 2008 x64 on the environment.  In less than 2 hours I had six operating systems installed.  OK, that’s what I’m talking about!  I figure most of the reason it took 2 hours was I was having to remember to check on the installations and press keys.  Oh, yeah, I patched all of those operating systems in the same 2 hours.  If you’ve ever patched a new installation of XP you know that can take 2 hours on its own.

I know I’ve still got a few gremlins in the system but they’re minor at this point.  First, I don’t think VMWare has the adapters teamed correctly.  I was looking at the switch diagnostics and it was showing 50% utilization on one port and almost none on the other port that the NICs were attached to.  Second, the drives are being recognized by the RS8IP as SATA 1.5 Gb/s instead of SATA 3.0 Gb/s.  (It’s got a cool screen that shows me that information though.)  Third and finally, I don’t know enough about VMWare yet to figure out how to convert my new operating systems into templates that I can use to create new systems.

So what did I learn from this exercise?

1.       It’s good to have friends.  OK, I already knew that but it’s worth repeating.  Seriously – thanks Ben and Andy for your help.

2.       Don’t create volumes that are larger than 2TB no matter how tempted you are.  It’s just going to be painful.

3.       VMWare is a powerful tool but one that requires a lot of learning.  (Oh, and finding information on it or the problems you’re facing is pretty difficult.)

4.       iSCSI is cool.  It performs pretty well and can be set up by mortals (notwithstanding the 2TB volume size issue).

5.       To set up iSCSI on VMWare: a) Make sure you have a VMKernel port, b) Make sure you let VMWare make outbound iSCSI connections, c) Set up the iSCSI cabinet in the dynamic discovery tab of the iSCSI adapter, d) Rescan the iSCSI adapter (takes 60+ seconds), e) Don’t create LUNs that are greater than 2TB.
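One more sanity check that would have saved me some guessing: confirm the cabinet is actually listening before blaming VMWare.  iSCSI targets listen on TCP port 3260 by default.  This is a minimal Python sketch to run from any workstation on the same network; it only proves the cabinet answers, and says nothing about the VMWare firewall or the VMKernel port.  The function name is just for illustration.

```python
import socket

ISCSI_PORT = 3260  # registered default TCP port for iSCSI targets


def target_reachable(host: str, port: int = ISCSI_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name did not resolve.
        return False
```

Call it with the cabinet’s address, for example target_reachable("192.168.1.50") (a made-up address).  If it returns True from a workstation while VMWare still sees nothing, the problem is on the VMWare side.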


Categories: Professional | 0 Comments
 
 
Thursday, March 13, 2008

What is not a Huge MOSS Workflow Issue (Take 2)

Several months ago I responded (in what has to be the worst titled blog post “SPWorkflowAssociation.AutoCleanupDays”) to an inflammatory post by Dave Wollerman titled “Huge MOSS Workflow Issue… What is Microsoft Thinking!!!” I was recently pointed to Dave’s response to my response.

So I’m with Dave in that I think everyone should know about the situation.  Perhaps where I differ from Dave’s thinking is I believe people should have solutions to the problem so that’s what I’m going to offer here.

First, a recap.  By default, Workflows will clean themselves up 60 days after they “end”.  This process doesn’t delete the actual workflow history table entries but does disconnect them from the user interface, so they’re harder to get to.

So there are two key things to note about this:

1)      It’s settable.  You can change it in your element manifest as my original post on this topic pointed out.  You can also change it in the API for every existing workflow association if you want (SPWorkflowAssociation.AutoCleanupDays).  If you never want SharePoint to delete workflow history, the workaround is simple.  Write a tool that runs every 59 days that goes through every site, web, list, and workflow association and sets the AutoCleanupDays property to 9999.  The end.  No issue.  It won’t clean anything up for you.  It’s probably 4 hours’ worth of coding including testing.  I don’t understand how something that can be fixed so easily is a “Huge MOSS Workflow Issue”.
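To show how little there is to that tool, here is the shape of it as a Python sketch.  To be clear, this is not SharePoint code: the real thing would be written against the SharePoint object model, and every class below is a made-up stand-in that only mimics the web, list, and workflow association hierarchy.  The only name taken from SharePoint is the idea of the AutoCleanupDays property, modeled here as auto_cleanup_days.

```python
from dataclasses import dataclass, field
from typing import Iterator, List

NEVER_CLEAN_UP_DAYS = 9999


@dataclass
class Association:
    """Stand-in for a workflow association; 60 days is the default."""
    name: str
    auto_cleanup_days: int = 60


@dataclass
class ListStub:
    """Stand-in for a list that has workflows attached."""
    title: str
    associations: List[Association] = field(default_factory=list)


@dataclass
class WebStub:
    """Stand-in for a web, which holds lists and child webs."""
    url: str
    lists: List[ListStub] = field(default_factory=list)
    subwebs: List["WebStub"] = field(default_factory=list)


def walk_associations(web: WebStub) -> Iterator[Association]:
    """Yield every association in this web and all webs below it."""
    for lst in web.lists:
        yield from lst.associations
    for child in web.subwebs:
        yield from walk_associations(child)


def extend_cleanup(root: WebStub, days: int = NEVER_CLEAN_UP_DAYS) -> int:
    """Set the cleanup window everywhere; return how many were changed."""
    changed = 0
    for assoc in walk_associations(root):
        if assoc.auto_cleanup_days != days:
            assoc.auto_cleanup_days = days
            changed += 1
    return changed
```

In the real tool each changed association also has to be saved back through the object model, and the whole thing gets scheduled so new associations are picked up before their 60 days run out.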

2)      Workflow History != Audit (for you VB programmers, Workflow History <> Audit).  Workflow history isn’t an audit.  It’s not secured by default.  It’s not tamper proof.  If you’re using Workflow history for audit – and the auditors are letting you – then there’s something wrong.  If you need an audited record then have the workflow bundle everything up and record it in a record center (or insert your solution here) so that it is a final record.  I can put an entry into the list saying that the Easter Bunny approved something if I’ve got contributor access to the web site (and no one has changed the default permissions of the history list).  I can delete the records indicating that I did approve something.  It’s just not secure, so using it as an audit record not only doesn’t work, it doesn’t make sense.

So in Dave’s defense, I’ve definitely had those “What were they thinking?!?” moments.  The support engineers, product managers, and program managers will all be happy to pipe in and confirm that I’ve had them.  However, I have them over things that are more pervasive and don’t have good workarounds.  (Like say checking for the existence of a list without throwing an exception, or not allowing for the creation of a content type with a specific ID from the API, etc.)  I know it can be frustrating to expect that SharePoint operates one way only to be shown that it operates another.

However, ultimately, this is a non-issue.  It’s easy to work around and using workflow history as an audit trail is a bad approach to start with.


Categories: Professional | 2 Comments
 
Tuesday, March 11, 2008

InfoPath Forms Services – The form template is not browser-compatible

Every once in a great while Murphy meets me at my doorstep and decides to do a “Ridealong” all day.  Occasionally I keep one step ahead of him.  Due to some much appreciated help from the InfoPath product team – today is one of those days.

A quick rewind: just more than a week ago I ran out of time before going to the SharePoint Conference.  I was trying to publish an InfoPath form for a client.  I opened it in design mode with InfoPath that day and I got a complaint about a duplicate key violation.  I quickly changed the XSN file extension to CAB, extracted the files, fixed the manifest.xsf (I think it was that file) and used MakeCab to create a new cab file.  I renamed the file to XSN and I thought everything was fine.  InfoPath opened in design mode.  However, as I said I ran out of time before publishing the form.

I arrived at the client this morning to publish the form and started by getting some drop down list data on to their environment.  During this process I noticed something was wrong.  Knowing that I hadn’t rebooted in several days I decided to reboot my laptop.  For good measure once my laptop was rebooted I decided to take the virtual machine I developed the form in and restart it.  I figured I’d start clean to eliminate problems.

It’s at this point in the story that Murphy decided that he needed to get involved.  So what happened was Check Disk ran when I restarted the VPC.  Normally I don’t pay much attention.  When I started seeing it scrolling the screen with sequential sector numbers I sat up and paid attention.  Somehow a substantial amount of the master file table and directory structure was unreadable.  I’m still not sure what caused this meltdown.  So after quite some time I got the system to boot up and I got logged in.  There was, however, a serious problem.  Explorer couldn’t start because SHELL32.DLL was missing.  It turns out that Task Manager (Ctrl-Shift-Esc) also needs SHELL32.DLL.  So I had an unusable system.

I attached the drive as a secondary drive to a working VPC and turned on undo disks… of course the files that I needed were gone.  I tried a file recovery tool that I bought and it didn’t work.  That’s surprising because I’ve seen it bring back entire directory structures in the past.  I’m becoming impressed at the amount of damage that Murphy did in a really short time.

I come back to my office … I look to my backups and find that I have a backup of the image from a week ago – after I had done all my work.  After some restore time … I had nearly everything (if not everything) that I had lost.

So I start working on my publishing in my test environment – so I can verify that I have everything.  Here’s where Murphy thought he had me.  I publish the form and get a message on the Upload Form Template page:

 

The form template is not browser-compatible.  It might be possible to correct the problem by opening the form template in Microsoft Office InfoPath, and then republishing it.

 

So I go back to InfoPath and rerun the Design Checker.  I got 8 messages about some post backs that needed to happen – and no errors.

So I open the ULS Logs (C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\LOGS) and find an interesting error:

 

GetSolutionInformation failure: The following file is either missing or is not part of included in the form template: OSD220.OSD To add a file to the form template in design mode, use the Resource Files dialog box on the Tools menu, and then add the file you want.
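Finding that line by eye in a folder full of logs is tedious.  Here is a minimal Python sketch that scans a log folder for a phrase and reports the file and line number.  The .log extension and the text encoding are assumptions on my part, so the encoding is a parameter you may need to change for your logs.  The function name is made up.

```python
from pathlib import Path
from typing import Iterator, Tuple


def find_in_logs(log_dir: str, needle: str,
                 encoding: str = "utf-8") -> Iterator[Tuple[str, int, str]]:
    """Yield (file name, line number, line) for each line containing needle."""
    # Assumption: the logs are plain text files ending in .log.
    for path in sorted(Path(log_dir).glob("*.log")):
        # errors="replace" keeps the scan going past undecodable bytes.
        with path.open("r", encoding=encoding, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    yield path.name, number, line.rstrip("\n")
```

Pointed at the LOGS folder with the phrase "GetSolutionInformation failure", it would list each place that error was written.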

 

So I crack my XSN by renaming it .CAB and see that the file it’s complaining about actually exists.  It looks like this:

<?XML version="1.0" ENCODING='UTF-8'?>
<!DOCTYPE SOFTPKG SYSTEM "http://www.microsoft.com/standards/osd/osd.dtd">
<?XML::namespace href="http://www.microsoft.com/standards/osd/msicd.dtd" as="MSICD"?>
<SOFTPKG NAME="Cab1" VERSION="1,0,0,0">
    <TITLE> Cab1 </TITLE>
    <MSICD::NATIVECODE>
        <CODE NAME="AddMoveChangeRequests">
            <IMPLEMENTATION>
                <CODEBASE FILENAME="AddMoveChangeRequests.dll">
                </CODEBASE>
            </IMPLEMENTATION>
        </CODE>
    </MSICD::NATIVECODE>
</SOFTPKG>

Well, at this point I’m stumped.  So I drop an email to some very nice folks (Thanks Nick & Brian).  They inform me that InfoPath will show the same error if I try to fill the form out (and not use Preview as I’m in the habit of doing).

It seems that when I put the file back together from fixing the duplicate ID problem I somehow accidentally sucked up the OSD file and it’s not supposed to be there.  (Because it’s not listed in the manifest.xsf)

They are kind enough to tell me how to fix it too.  Take the form in design mode and do a File-Save as Source Files.  Then open up the Manifest.xsf in design mode and finally do a Save As to turn it back into a packaged XSN file.

Apparently the OSD file is an Open Software Description file … I still don’t know exactly what it’s supposed to be.

So here’s what I learned today:

1)      If you get a message that a form isn’t browser enabled… Check it.

2)      If you check it and see that it is browser enabled and there are no errors, check the ULS logs.

3)      If you get a message about a missing file in the ULS logs it could be the file is present and shouldn’t be.

4)      You should always try to fill out the form.  Just testing with Preview isn’t enough.

5)      It helps to have good people willing to bail you out when Murphy tries to do a “Ridealong”.
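Lesson 3 can be checked mechanically once the XSN has been extracted to a folder.  Here is a minimal Python sketch that compares the files in the folder against the files named in manifest.xsf and reports the extras.  It assumes the manifest lists each packaged file as a file element with a name attribute, and it ignores XML namespaces so it doesn’t depend on the exact schema URI.  The function names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Set


def files_listed_in_manifest(manifest_path: Path) -> Set[str]:
    """Collect the name attribute of every 'file' element, lowercased."""
    names = set()
    for element in ET.parse(str(manifest_path)).iter():
        # Tags look like "{namespace}file"; keep only the local part.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "file" and "name" in element.attrib:
            names.add(element.attrib["name"].lower())
    return names


def unlisted_files(extracted_dir: str, manifest_name: str = "manifest.xsf") -> Set[str]:
    """Return files present in the folder but not named in the manifest."""
    root = Path(extracted_dir)
    listed = files_listed_in_manifest(root / manifest_name)
    present = {p.name.lower() for p in root.iterdir() if p.is_file()}
    # The manifest itself is expected to be there, so leave it out.
    return present - listed - {manifest_name.lower()}
```

Run against my broken form, a check like this should have flagged the stray OSD file before I ever tried to publish.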

 


Categories: Professional | 0 Comments