Wednesday, September 13, 2006

Possible bug with interior_ptr<> in C++/CLI

UPDATE: It's fixed! It will be in the next release. Now THAT is service.

UPDATE: Bug is confirmed and reported. Thanks to Brian Kramer for his help.

I encountered some strange behaviour when using interior_ptr in a generic function. It appears that, for the purposes of pointer arithmetic, the byte size of the generic type parameter is always assumed to be 4, regardless of whether it is actually a char or a double. This doesn't happen when interior_ptr is given a concrete type directly.

The following example is a generic version of the online help example that demonstrates the problem:


// interior_ptr bug example
// compile with: /clr
 
generic<typename T>
void ProbeFirstTwoElements(array<T>^ arr)
{
   // create an interior pointer into the array
   interior_ptr<T> ipi = &arr[0];
 
   System::Console::WriteLine("1st element in arr holds: {0}", arr[0]);
   System::Console::WriteLine("ipi points to memory address whose value is: {0}", *ipi);
 
   ipi++;
 
   System::Console::WriteLine("after incrementing ipi, it points to memory address whose value is: {0}", *ipi);
   System::Console::WriteLine();
}
 
int main(array<System::String ^> ^args)
{
    array<int>^ arrInt = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    ProbeFirstTwoElements(arrInt);  
 
    array<double>^ arrDouble = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    ProbeFirstTwoElements(arrDouble);  
 
    array<char>^ arrChar = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    ProbeFirstTwoElements(arrChar);  
 
    return 0;
}
 
// Expected output:
//
// 1st element in arr holds: 1
// ipi points to memory address whose value is: 1
// after incrementing ipi, it points to memory address whose value is: 2
//
// 1st element in arr holds: 1
// ipi points to memory address whose value is: 1
// after incrementing ipi, it points to memory address whose value is: 2
//
// 1st element in arr holds: 1
// ipi points to memory address whose value is: 1
// after incrementing ipi, it points to memory address whose value is: 2
//
// Actual output:
//
// 1st element in arr holds: 1
// ipi points to memory address whose value is: 1
// after incrementing ipi, it points to memory address whose value is: 2
//
// 1st element in arr holds: 1
// ipi points to memory address whose value is: 1
// after incrementing ipi, it points to memory address whose value is: 5.29980882362664E-315
//
// 1st element in arr holds: 1
// ipi points to memory address whose value is: 1
// after incrementing ipi, it points to memory address whose value is: 5


Most seriously, this bug has the potential for buffer overflows, even though the code might appear to pass various "correctness" tests.
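
The odd value in the double case is what a 4-byte stride would produce. Here is a minimal sketch in standard C++ (not C++/CLI) that reproduces the arithmetic by hand, assuming a little-endian machine with 8-byte IEEE 754 doubles. The helpers read_double_at and bits_of are hypothetical names made up for this illustration; they are not part of any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret the 8 bytes that start `byte_offset` bytes
// past `base` as a double. memcpy keeps the unaligned read well defined.
// The caller must ensure byte_offset + 8 stays inside the array.
double read_double_at(const double* base, std::size_t byte_offset)
{
    double result;
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(base);
    std::memcpy(&result, bytes + byte_offset, sizeof result);
    return result;
}

// Hypothetical helper: the raw bit pattern of a double.
std::uint64_t bits_of(double d)
{
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

With double arr[] = { 1, 2, 3 }, read_double_at(arr, sizeof(double)) is the correct second element, 2. read_double_at(arr, 4) instead straddles the first two elements: under the assumptions above it combines the upper half of 1.0 with the lower half of 2.0, giving the bit pattern 0x000000003FF00000, a denormal of roughly 5.2998E-315 as in the actual output. The same 4-byte stride on the char array lands on the fifth element, hence the 5.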

Tuesday, September 12, 2006

Blue Ball Machine

Here is a neat animated tessellation tile; it has probably been around for ages, but it is new to me. Click on the picture to see it as a tessellated background.

Wednesday, September 06, 2006

Say no to 0870!

Many UK companies think they have the right to charge you for the privilege of speaking to them. SAYNOTO0870.COM is a website that gives free or local rate alternative phone numbers.

[Thanks to a chatty tradesman my sister met for this one]

Friday, September 01, 2006

.NET Generics : Numerical limitations

Generics were added to the .NET framework in version 2.0, and I for one gave a big cheer. I am a massive fan of templates in C++ and use them extensively, especially as part of the STL. Unfortunately, .NET generics have some serious limitations when used for numerical programming that are worth highlighting.

The Problem

Consider the following STL algorithm "accumulate" that simply adds a list of numbers:

template<typename T>
T accumulate(T* begin, T* end, T initialValue)
{
    T sum = initialValue;
    for(T* i = begin; i != end; ++i)
    {
        sum = sum + *i;
    }
    return sum;
}

I know this is not what the STL algorithm looks like, but I have simplified things (such as raw pointers instead of iterators) for those readers not as well acquainted with the beautiful language.

Now consider the .NET generic equivalent (in C++/CLI for easy comparison):

using namespace System::Collections::Generic;

generic<typename T>
T accumulate(IEnumerable<T>^ items, T initialValue)
{
    T sum = initialValue;
    for each(T i in items)
    {
        sum = sum + i;   // compile error: no operator+ is known for T
    }
    return sum;
}

Great... except this won't compile: an unconstrained T is only known to support what Object supports, Object has no + operator, and that is how generics work.

The Reason

For templates, all wildcard parameters (like T in our example) are resolved at compile time. They are like smart (and safe) macros. For generics, this resolution is done at runtime, so any method call or operator that depends on the actual type (like +) could fail at runtime if that type does not support it.
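
To make the compile-time side concrete, here is a minimal sketch in standard C++ (no /clr needed). The name sum_range is a hypothetical stand-in for the simplified accumulate above, and Vec2 and Opaque are invented types; none of them come from the STL.

```cpp
#include <string>

// Same shape as the simplified accumulate in the text. The body only
// requires that `sum + *i` is valid for whatever T turns out to be.
template<typename T>
T sum_range(const T* begin, const T* end, T initialValue)
{
    T sum = initialValue;
    for (const T* i = begin; i != end; ++i)
    {
        sum = sum + *i;
    }
    return sum;
}

// A user-defined type the template author never saw; it works because it
// supplies operator+.
struct Vec2 { int x; int y; };
Vec2 operator+(Vec2 a, Vec2 b) { return Vec2{ a.x + b.x, a.y + b.y }; }

// A type without operator+. Declaring it is harmless; only an actual call
// to sum_range with Opaque arguments would be rejected, at compile time.
struct Opaque {};
```

The same template body serves int, std::string and Vec2, because each use is checked against the concrete type when the template is instantiated. Nothing is left to fail at runtime.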

To avoid lots of runtime exceptions, the compiler enforces some measure of constraint on the method calls a generic function makes, through the where keyword. This can apply an interface constraint ("must support method..."), an inheritance constraint ("must be a..."), and various other constraints. Crucially for our purposes, though, it cannot be applied to the numeric operators.

There is a good reason for delaying generic type resolution until runtime: it allows you to use generic libraries with parameters the library designer never expected. In C++ templates, this can only be done by distributing the header files, which is inconvenient from a source management point of view, and makes it difficult to control commercial IP.

Some people argue that runtime resolution reduces code bloat. It does, but who cares? Check your hard drive: it is not full of binary code, it is full of data.

The Solutions

The plural nature of this section's heading should warn you that no perfect fix is obvious. I present the solutions to date and leave the reader to make up their own mind:

  • Eric Gunnerson acknowledges the problem and suggests reflection could be used, or that we should wait for a language update.
  • Andrew Clymer has some good solutions involving delegate functions, perhaps stored in a factory. He also suggests how reflection might be used.
  • Juan Wajnerman shows us what the generic algorithms look like when compiled to IL. He points out that this is not a problem of the framework, just a restriction in the language syntax.
  • Frédéric Didier has written a port of STL in C# that overcomes the problem with delegates (albeit at some syntactic cost).
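
The delegate-based idea in the list above boils down to passing the addition in as an argument, so the generic body never needs + on T. The following is a rough sketch in standard C++ rather than .NET, so it is only an analogy: std::function stands in for a delegate, and accumulate_with and add are hypothetical names that do not come from any of the linked articles.

```cpp
#include <functional>
#include <vector>

// Hypothetical sketch: the caller supplies the operation, so this function
// compiles without knowing whether T has an operator+ of its own.
template<typename T>
T accumulate_with(const std::vector<T>& items, T initialValue,
                  const std::function<T(T, T)>& add)
{
    T sum = initialValue;
    for (const T& item : items)
    {
        sum = add(sum, item);   // indirect call, like invoking a delegate
    }
    return sum;
}
```

A call looks like accumulate_with<double>(values, 0.0, [](double a, double b) { return a + b; }). T is named explicitly because a lambda cannot be used to deduce the std::function parameter. The extra argument at every call site, plus the indirect call per element, is the price paid for this approach.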

Personally I would like more expressive syntax in the where clause, but it will be hard even then to capture all eventualities. Perhaps a where * clause that permits all function calls and tells the compiler: "if this goes wrong at runtime, I will take the rap".

After conferring with Steve, we think the constraints could be inferred from the functions you actually use. This is basically type inference for function prototypes. The "contract" is then automatically added to the generic function's prototype, and violations will still be detected at compile time by the function's clients. No need for runtime exceptions, and no need for the where clause at all.
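
For comparison, this is roughly what such a contract looks like when written out by hand in standard C++. It is only an analogy and not a proposal for .NET syntax: supports_plus and checked_sum are hypothetical names, and the trait is an ordinary compile-time check for whether T + T is a valid expression.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical trait: false by default...
template<typename T, typename = void>
struct supports_plus : std::false_type {};

// ...and true whenever the expression `T + T` is well formed.
template<typename T>
struct supports_plus<T, decltype(void(std::declval<T>() + std::declval<T>()))>
    : std::true_type {};

// The "contract" sits at the top of the function, so a client that passes an
// unsuitable type gets a compile-time error naming the missing requirement.
template<typename T>
T checked_sum(const T* begin, const T* end, T initialValue)
{
    static_assert(supports_plus<T>::value, "T must support operator+");
    T sum = initialValue;
    for (const T* i = begin; i != end; ++i)
    {
        sum = sum + *i;
    }
    return sum;
}

struct Opaque {};   // has no operator+, so supports_plus<Opaque> is false
```

The inference idea above would have the compiler derive the equivalent of supports_plus from the body of the function, rather than asking the author to spell it out.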

Maybe STL/CLR (née STL.NET) will solve all these problems (it must address them somehow, I guess). Any comments welcome.