Sunday, December 13, 2009

Office Interop Object Collection Technique – The right way.

In my previous article, I described the ghost Excel process issue and how to solve it by releasing all interop references. In this article I describe the right way to release those references.

Every COM object that we access from .NET has an RCW (Runtime Callable Wrapper) associated with it, and each RCW wraps exactly one COM instance. The RCW itself is a lightweight object allocated on the managed heap. The COM object it wraps, however, is allocated on the native heap and may be a resource hog.

Each RCW also keeps a reference count (refCount) to track its lifetime. Once the refCount of an RCW reaches zero, the underlying COM object is released and is no longer accessible. If the code tries to use the RCW after this point, an exception is thrown saying “COM object that has been separated from its underlying RCW cannot be used.”
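
The behaviour described above can be reproduced with a short program. This is only a sketch, assuming Excel is installed and the project references Microsoft.Office.Interop.Excel; the class name SeparatedRcwDemo is made up for illustration. The exception type caught here is System.Runtime.InteropServices.InvalidComObjectException.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel;

namespace RCW
{
    // Illustrative class name; not part of the interop library.
    class SeparatedRcwDemo
    {
        static void Main(string[] args)
        {
            Excel.Application xlApp = new Excel.Application();
            Excel.Workbooks workbooks = xlApp.Workbooks;
            Excel.Workbook workbook = workbooks.Add(Missing.Value);

            // Release the only reference taken on the workbook's RCW.
            Marshal.ReleaseComObject(workbook);

            try
            {
                // The RCW no longer has a COM object behind it,
                // so any member access is expected to fail.
                Console.WriteLine(workbook.Name);
            }
            catch (InvalidComObjectException ex)
            {
                Console.WriteLine(ex.Message);
            }

            // Release the remaining references so that Excel can exit.
            Marshal.ReleaseComObject(workbooks);
            xlApp.DisplayAlerts = false;
            xlApp.Quit();
            Marshal.ReleaseComObject(xlApp);
        }
    }
}
```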


Note: Using System.Runtime.InteropServices.Marshal.ReleaseComObject(object) you can decrease the refCount of an RCW. However, if this is not done carefully and judiciously, you can fall into the trap of killing an object prematurely, which is difficult to trace.


The refCount of an RCW is increased each time a client obtains a reference to the COM object through it. It is decreased either automatically by the GC or explicitly by calling the System.Runtime.InteropServices.Marshal.ReleaseComObject(object) method.

Below is how it works programmatically.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Excel = Microsoft.Office.Interop.Excel;
using System.Reflection;
using System.Runtime.InteropServices;
namespace RCW
{
    class Program
    {
        static void Main(string[] args)
        {           
            Excel.Application xlApp = new Microsoft.Office.Interop.Excel.Application();
            Excel.Workbooks workbooks = xlApp.Workbooks;

            //The first reference of new workbook added to app
            Excel.Workbook newWorkbook = workbooks.Add(Missing.Value);

            //The second reference to the added workbook
            Excel.Workbook secondReference = workbooks[1];

            //The third reference to the added workbook
            Excel.Workbook thirdReference = workbooks[1];

            int workbookRCWCounter = Marshal.ReleaseComObject(newWorkbook);
            Console.WriteLine(workbookRCWCounter.ToString());

            workbookRCWCounter = Marshal.ReleaseComObject(secondReference);
            Console.WriteLine(workbookRCWCounter.ToString());

            workbookRCWCounter = Marshal.ReleaseComObject(thirdReference);
            Console.WriteLine(workbookRCWCounter.ToString());

            Console.Read();
        }
    }
}
In the above code, we add a new workbook to the Excel application and store the reference in the newWorkbook variable. This increases the RCW counter for that workbook object to 1. We then fetch the newly created workbook from the workbooks collection and store it in the secondReference variable, which increases the RCW counter to 2. We do the same again and store the third reference in the thirdReference variable, which in turn increases the RCW counter to 3.

Observe that each time we obtained the newly created workbook object, whether by adding the workbook or by indexing into the workbooks collection, the refCount of its RCW increased. These can be termed direct references to the RCW.

Running the code prints 2, 1, 0, showing that the reference counter is decremented by one on each call to Marshal.ReleaseComObject.

In the direct reference pattern, it is advisable to call Marshal.ReleaseComObject for each direct reference once you are done with it.
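
One way to follow this advice is sketched below. It assumes Excel is installed and the project references Microsoft.Office.Interop.Excel. The class name DirectReferenceCleanup and the Release helper are hypothetical names introduced for this example and are not part of the interop API. Each direct reference is released exactly once in a finally block, so the cleanup also runs if an exception is thrown.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel;

namespace RCW
{
    class DirectReferenceCleanup
    {
        // Hypothetical helper: releases one direct reference and clears the
        // variable so it cannot be used or released a second time by mistake.
        static void Release<T>(ref T comObject) where T : class
        {
            if (comObject != null && Marshal.IsComObject(comObject))
            {
                Marshal.ReleaseComObject(comObject);
            }
            comObject = null;
        }

        static void Main(string[] args)
        {
            Excel.Application xlApp = null;
            Excel.Workbooks workbooks = null;
            Excel.Workbook newWorkbook = null;
            Excel.Workbook secondReference = null;

            try
            {
                xlApp = new Excel.Application();
                workbooks = xlApp.Workbooks;

                // Two direct references to the same workbook.
                newWorkbook = workbooks.Add(Missing.Value);
                secondReference = workbooks[1];

                Console.WriteLine(secondReference.Name);
            }
            finally
            {
                // One release per direct reference, most recent first.
                Release(ref secondReference);
                if (newWorkbook != null)
                {
                    newWorkbook.Close(false, Missing.Value, Missing.Value);
                }
                Release(ref newWorkbook);
                Release(ref workbooks);
                if (xlApp != null)
                {
                    xlApp.Quit();
                }
                Release(ref xlApp);
            }
        }
    }
}
```

Setting each variable to null after releasing it is a design choice for this sketch: a later accidental use then fails with a NullReferenceException at the exact line, rather than with the harder to trace separated-RCW exception.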

On the other hand, if we replace the direct RCW references with variable-to-variable copies, the RCW counter does not increase.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Excel = Microsoft.Office.Interop.Excel;
using System.Reflection;
using System.Runtime.InteropServices;
namespace RCW
{
    class Program
    {
        static void Main(string[] args)
        {
            Excel.Application xlApp = new Microsoft.Office.Interop.Excel.Application();
            Excel.Workbooks workbooks = xlApp.Workbooks;

            //The first reference of new workbook added to app
            Excel.Workbook newWorkbook = workbooks.Add(Missing.Value);

            //The second reference to the added workbook
            Excel.Workbook secondReference = newWorkbook;

            //The third reference to the added workbook
            Excel.Workbook thirdReference = newWorkbook;

            int workbookRCWCounter = Marshal.ReleaseComObject(newWorkbook);
            Console.WriteLine(workbookRCWCounter.ToString());

            workbookRCWCounter = Marshal.ReleaseComObject(secondReference);
            Console.WriteLine(workbookRCWCounter.ToString());

            workbookRCWCounter = Marshal.ReleaseComObject(thirdReference);
            Console.WriteLine(workbookRCWCounter.ToString());

            Console.Read();
        }
    }
}
In the above code, observe that we took the first direct workbook RCW reference into the newWorkbook variable, which increased the RCW count to 1. Subsequently, however, we just copied the newWorkbook reference into the secondReference and thirdReference variables. Since these two variables were not obtained directly from the workbook RCW, the refCount never increased and stayed at 1 in spite of there being three references. Thus the output of running the above code is 0, -1, -1. When we called Marshal.ReleaseComObject the first time, on the newWorkbook reference, the RCW refCount dropped to zero and the underlying COM object was released. The subsequent calls to Marshal.ReleaseComObject on the second and third reference variables simply returned -1, signalling that the RCW is in an invalid state.

An interesting thing to note here is that if, after releasing the original direct reference variable (newWorkbook), you try to access the workbook through secondReference or thirdReference, you will end up with the “COM object that has been separated from its underlying RCW cannot be used.” exception. This is premature object killing, which can be annoying and difficult to trace.

In the variable-to-variable reference copy pattern, it is advisable to call Marshal.ReleaseComObject only when all references are out of scope. This helps prevent premature object killing.
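
A minimal sketch of this advice follows, under the same assumptions as the listings above (Excel installed, a reference to Microsoft.Office.Interop.Excel). The class name CopiedReferenceCleanup is illustrative only. The copies are confined to the try block, and the single release happens in the finally block, after they have gone out of scope.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel;

namespace RCW
{
    class CopiedReferenceCleanup
    {
        static void Main(string[] args)
        {
            Excel.Application xlApp = new Excel.Application();
            Excel.Workbooks workbooks = xlApp.Workbooks;

            // The only direct reference to the workbook.
            Excel.Workbook newWorkbook = workbooks.Add(Missing.Value);
            try
            {
                // Plain variable copies; these do not touch the RCW counter.
                Excel.Workbook secondReference = newWorkbook;
                Excel.Workbook thirdReference = newWorkbook;

                Console.WriteLine(secondReference.Name);
                Console.WriteLine(thirdReference.Name);
            }
            finally
            {
                // The copies are out of scope here, so one release is enough
                // and nothing can reach the separated RCW afterwards.
                newWorkbook.Close(false, Missing.Value, Missing.Value);
                Marshal.ReleaseComObject(newWorkbook);
                Marshal.ReleaseComObject(workbooks);
                xlApp.Quit();
                Marshal.ReleaseComObject(xlApp);
            }
        }
    }
}
```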

Conclusion:
I explained the two patterns of RCW object references and the behaviour of refCount in each. Your code may well contain a mix of these patterns, so use your judgement when deciding which release strategy applies at each point.
