Monday, November 1, 2010

Why IEnumerator<T> is IDisposable

I recently posted that IEnumerator<T> is IDisposable, but did not explain why.

The explanation is rather simple: it is because of "yield return". Let's track this down.

Suppose we have the following class:
public class Numbers : IDisposable
 {
  public int One()
  {
   return 1;
  }
  public int Two()
  {
   return 2;
  }
  public void Dispose()
  {
   Console.WriteLine("In Dispose()");
  }
 }
We also have a method that uses "yield return" in the following manner:
static IEnumerable<int> DisposableEnumerable()
{
 using (var numbers = new Numbers())
 {
  yield return numbers.One();
  yield return numbers.Two();
  // some code
 }
}
And, of course, a method that calls DisposableEnumerable():
static void Main(string[] args)
{
   foreach (var s in DisposableEnumerable())
   {
    Console.WriteLine(s);
   }
}
A good C# developer has an intuitive rule: an IDisposable object declared in a using statement is disposed as soon as the using block is exited. But when should numbers be disposed here? During execution, control leaves the using block three times: after numbers.One(), after numbers.Two(), and after "some code". Intuitively, we still expect Numbers.Dispose() to be called only after "some code". Let's check our guess with Reflector.

A method that uses "yield return" is compiled into a compiler-generated class nested inside the class that declares the method.

Here is the reformatted code from the compiler-generated class (some members are omitted):
sealed class DisposableEnumerableImpl : IEnumerable<int>, IEnumerator<int>
 {
  // Fields
  private int _state;
  public Numbers _n;

  // Methods
  public DisposableEnumerableImpl(int state)
  {
   this._state = state;
  }

  public bool MoveNext()
  {
   try
   {
    switch (_state)
    {
     case 0:
      _state = -1;
      _n = new Numbers();
      _state = 1;
      _current = _n.One();
      _state = 2;
      return true;

     case 2:
      _state = 1;
      _current = _n.Two();
      _state = 3;
      return true;

     case 3:
      _state = 1;
      Finally();
      break;
    }
   }
   catch
   {
    ((IDisposable)this).Dispose();
   }
   return false;
  }


  void IDisposable.Dispose()
  {
   switch (_state)
   {
    case 1:
    case 2:
    case 3:
     try
     {
     }
     finally
     {
      Finally();
     }
     break;
   }
  }
  private void Finally()
  {
   _state = -1;
   if (_n != null)
   {
    _n.Dispose();
   }
  }

 }
According to the code, the class has six states (some are not shown in the code above):
State  Description
-2     Initial state. Set when the class is created to be used as IEnumerable.
-1     Finished state.
0      Initial state. Set when the class is created to be used as IEnumerator.
1      Interim state. Used during state changes.
2      State after numbers.One() is called.
3      State after numbers.Two() is called.
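
The transitions in the table can also be observed from the outside by driving the enumerator by hand. The following is only a minimal sketch: it reuses the Numbers class and the DisposableEnumerable() method from this post (without the "some code" placeholder), and the Program wrapper and the extra messages are added purely for illustration.

```csharp
using System;
using System.Collections.Generic;

public sealed class Numbers : IDisposable
{
    public int One() { return 1; }
    public int Two() { return 2; }
    public void Dispose() { Console.WriteLine("In Dispose()"); }
}

public static class Program
{
    static IEnumerable<int> DisposableEnumerable()
    {
        using (var numbers = new Numbers())
        {
            yield return numbers.One();
            yield return numbers.Two();
        }
    }

    public static void Main()
    {
        // Creating the enumerator runs none of the iterator body yet.
        IEnumerator<int> e = DisposableEnumerable().GetEnumerator();
        Console.WriteLine("Enumerator created");

        // First MoveNext(): Numbers is constructed, One() is returned.
        Console.WriteLine(e.MoveNext() + " " + e.Current);

        // Second MoveNext(): Two() is returned, still inside the using block.
        Console.WriteLine(e.MoveNext() + " " + e.Current);

        // Third MoveNext(): the using block ends, so "In Dispose()" is
        // printed by the iterator before MoveNext() returns false.
        Console.WriteLine("Third MoveNext()");
        Console.WriteLine(e.MoveNext());

        // The enumeration has already finished, so this prints nothing more.
        e.Dispose();
    }
}
```

Run this way, "In Dispose()" should appear only after the "Third MoveNext()" message, which matches the guess that the using block really ends after the last yield return.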

So, according to this, the Numbers instance held by DisposableEnumerableImpl is disposed in three cases: when MoveNext() is called in state 3 (the enumeration runs to the end), when an exception is thrown inside MoveNext(), or when the enumerator itself is disposed.
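
The "enumerator is disposed" path is the one foreach relies on: when the loop is left early, whether by break or by an exception, foreach disposes the enumerator and the using block inside the iterator is unwound. A small sketch, again reusing the Numbers class and DisposableEnumerable() from this post; the Program wrapper, the messages and the InvalidOperationException are illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;

public sealed class Numbers : IDisposable
{
    public int One() { return 1; }
    public int Two() { return 2; }
    public void Dispose() { Console.WriteLine("In Dispose()"); }
}

public static class Program
{
    static IEnumerable<int> DisposableEnumerable()
    {
        using (var numbers = new Numbers())
        {
            yield return numbers.One();
            yield return numbers.Two();
        }
    }

    public static void Main()
    {
        Console.WriteLine("Early break:");
        foreach (var n in DisposableEnumerable())
        {
            Console.WriteLine(n);
            break; // foreach disposes the enumerator when the loop is left
        }

        Console.WriteLine("Exception in loop body:");
        try
        {
            foreach (var n in DisposableEnumerable())
            {
                // Thrown by the caller while the iterator is suspended
                // at its first yield return.
                throw new InvalidOperationException("boom");
            }
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("Caught");
        }
    }
}
```

In both cases "In Dispose()" should be printed even though the sequence was never enumerated to the end; in the second case it appears before "Caught", because the enumerator is disposed while the exception propagates out of the loop.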

This is actually a very important rule: always dispose IEnumerator<T>. Consider the following buggy implementation of a FirstOrDefault() method:
Invalid solution
static T FirstOrDefault<T>(IEnumerable<T> source)
{
 var enumerator = source.GetEnumerator();
 if (enumerator.MoveNext())
  return enumerator.Current;
 else
  return default(T);
}
Look at this method in light of DisposableEnumerableImpl. Since the source is not enumerated to the end and the enumerator is not disposed, the Numbers instance inside DisposableEnumerableImpl will never be disposed! So the correct implementation is:

static T FirstOrDefault<T>(IEnumerable<T> source)
{
 using (var enumerator = source.GetEnumerator())
 {
  if (enumerator.MoveNext())
   return enumerator.Current;
  else
   return default(T);
 }
}
Now the Numbers instance will be disposed.
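
The difference between the two implementations above can be made visible with a counter. This is a rough sketch, not a definitive test: the names disposeCount, Source(), FirstOrDefaultLeaky() and FirstOrDefaultSafe() are hypothetical and exist only for this example, and a try/finally stands in for the using block around Numbers.

```csharp
using System;
using System.Collections.Generic;

public static class Program
{
    // Hypothetical counter: how many times the iterator's cleanup has run.
    static int disposeCount;

    // Hypothetical source; the finally block plays the role of Numbers.Dispose().
    static IEnumerable<int> Source()
    {
        try
        {
            yield return 1;
            yield return 2;
        }
        finally
        {
            disposeCount++;
        }
    }

    // The buggy version: the enumerator is never disposed.
    static T FirstOrDefaultLeaky<T>(IEnumerable<T> source)
    {
        var enumerator = source.GetEnumerator();
        if (enumerator.MoveNext())
            return enumerator.Current;
        else
            return default(T);
    }

    // The corrected version: the using statement disposes the enumerator.
    static T FirstOrDefaultSafe<T>(IEnumerable<T> source)
    {
        using (var enumerator = source.GetEnumerator())
        {
            if (enumerator.MoveNext())
                return enumerator.Current;
            else
                return default(T);
        }
    }

    public static void Main()
    {
        Console.WriteLine(FirstOrDefaultLeaky(Source())); // 1
        Console.WriteLine(disposeCount);                  // 0: cleanup never ran

        Console.WriteLine(FirstOrDefaultSafe(Source()));  // 1
        Console.WriteLine(disposeCount);                  // 1: cleanup ran once
    }
}
```

Both versions return the same element; only the second one lets the iterator release what it holds.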
