Wednesday, March 3, 2010

DateTime.Date not supported by EF GroupBy

I ran into this issue today where the Entity Framework does not support the GroupBy method against DateTime.Date and you won't find out until runtime. Here is what the original statement looked like:

myTable.GroupBy(mt => mt.ActivityDate.Date).
OrderBy(mt => mt.Key).
Select(mt => new{Date= mt.Key, Count = mt.Count()}).Take(90).ToList()


This resulted in the following exception:
NotSupportedException: The specified type member 'Date' is not supported in LINQ to Entities. Only initializers, entity members, and entity navigation properties are supported.

This was problematic because I need to report on a count of items in the database for each day of the last 90 days.

I found a post on the MSDN forums that gave a workaround for doing the group by that worked great, MSDN. So I updated my original statement to this:

myTable.GroupBy(mt => new {mt.ActivityDate.Year, mt.ActivityDate.Month, mt.ActivityDate.Day}).
OrderBy(mt => mt.Key).
Select(mt => new{Date= mt.Key, Count = mt.Count()}).Take(90).ToList()


This works but then you have this funny anonymous type and displaying it isn't pretty. So my next problem was figuring out how to display this data. I updated my statement to convert the Key into a DateTime type:

myTable.GroupBy(mt => new {mt.ActivityDate.Year, mt.ActivityDate.Month, mt.ActivityDate.Day}).
OrderBy(mt => mt.Key).
Select(mt => new{Date = new DateTime(mt.Key.Year, mt.Key.Month, mt.Key.Day), Count = mt.Count()}).Take(90).ToList()


At first glance you might think this works but it doesn't. Instead you get the following exception at runtime.

NotSupportedException: Only parameterless constructors and initializers are supported in LINQ to Entities.

A link over on Don't Be Iffy took me back to MSDN to find out
The idea of supporting only parameter-less constructor was one of the hard decisions we made as a product team. The main idea with this approach was we shouldn't open up new surface area in LINQ over Entities that is not supported by EDM. EDM in general doesn't allow you to construct in random objects (using the NEW constructor in eSQL). To make it consistent with the whole stack, we decided to implement it like wise. Limiting the constructions only to:
- Parameterless constructors
- Anonymous types (as it’s the only way to do multi-project in LINQ over Entities)


I obviously wasn't going to be able to do what I needed in Linq to Entities so I had to find a work around. I figured once I had the data from the DB that I needed I shouldn't have any problems doing this, especially since I didn't have these same problems in Linq to SQL. Note: I considered going the Linq to SQL route but I wasn't quite ready to give up on Linq to Entities yet.

What I decided to do instead was to grab the data from EF without using deferred execution and then create a new anonymous type locally. Below is the completed solution.

var data = myTable.
GroupBy(ra => new {ra.ActivityDate.Year, ra.ActivityDate.Month, ra.ActivityDate.Day}).
OrderBy(ra => ra.Date).
Select(ra => new{Key = ra.Key, Count = ra.Count()}).
Take(90).ToList();

data.Select(ra => new {Date = new DateTime(ra.Key.Year, ra.Key.Month, ra.Key.Day), Count = ra.Count});


This worked exactly as I expected it to and it only required some minor modification on my part to get it there. Hopefully this helps someone else out there that is having a similar issue.

Wednesday, May 6, 2009

Joining DataTables

Recently, I needed to perform several joins in a SQL query and return that data to a DataGrid. Normally this works great because I only have 2-3 tables I am performing a join on and the number of records being returned is relatively low. In this latest case I had many tables to join together and some of them had millions of rows in them.

The query to pull all the data I needed would run for several minutes before returning anything to me. I decided that to speed this up I would handle the joins in the application and just pull the rows I needed in the queries themselves. This took a query that would return in 15+ minutes down to less than 1 minute.

Microsoft has some code here: on how to do this. The code they had was over-engineered for what I needed to do so I decided to write my own.

The main method takes 2 tables and returns a new table. It is quick and dirty and can still be fleshed out if needed.

public static class MergeData
{
public static DataTable Merge(DataTable tableOne, DataTable tableTwo)
{
DataTable output;

output = new DataTable();

CopyColumns(tableOne, ref output);
CopyColumns(tableTwo, ref output);
CopyData(tableOne, tableTwo, ref output);

return output;
}

public static void CopyData(DataTable tableOne, DataTable tableTwo, ref DataTable destinationTable)
{
foreach (DataRow row in tableOne.Rows)
{
DataRow nr;
DataRow[] joinedRows;

nr = destinationTable.NewRow();

foreach (DataColumn dc in tableOne.Columns)
{
nr[dc.Table.TableName + "." + dc.ColumnName] = row[dc.ColumnName];
}

string selectStatement = tableTwo.Columns[1].ColumnName + " = '" + row[tableOne.Columns[1].ColumnName] + "'";
joinedRows = tableTwo.Select(selectStatement);

foreach (DataColumn dc in tableTwo.Columns)
{
nr[dc.Table.TableName + "." + dc.ColumnName] = joinedRows[0][dc.ColumnName];
}

destinationTable.Rows.Add(nr);
}
}

public static void CopyColumns(DataTable table, ref DataTable output)
{
foreach (DataColumn dc in table.Columns)
{
DataColumn nc;

nc = new DataColumn(table.TableName + "." + dc.ColumnName, dc.DataType);
AddColumn(nc, ref output);
}
}

public static void AddColumn(DataColumn dc, ref DataTable dt)
{
if (!dt.Columns.Contains(dc.ColumnName))
{
dt.Columns.Add(dc);
}
}
}

Friday, April 10, 2009

Back to C#

After about 6 months of being away from C# I am finally able to get back to working with the language again. My first project went very smooth. It consisted of creating a class to read and write INI files. The most difficult part was the regex needed to determine what sections, settings and values. I need to perform some more research on regex, tokens and how C# uses them.

My next project is a bit more interesting and complicated. The basics of it are I need to parse a csv file and import it into a database. It isn't just 1 csv file but literally 10s of thousands of csv files and they get updated twice a day or more. The app will have to update the database at least once a day possibly more.

Parsing the csv will be interesting since some fields have quotes and just about any character can appear inside those quotes. I also need to look for certain strings and set them to DBNull since most fields from the csv file will be Ints or Floats in the database.

Some new tools in my arsenal since I last posted here.
XtUnit - Extension framework provides better support for unit tests with a database.
Log4Net - If you aren't using it for logging you don't know what you are missing.

I am glad to be back and look forward to updating this blog again.

Thursday, October 2, 2008

Does C# need the Template pattern?

My understanding of the Template pattern is that it allows a programmer to provide a hook for outside use into a classes methods. With the advent of C# a new type has come into use, the Delegate. From the MSDN site
A delegate is a new type that references a method. Once a delegate is assigned a method, it behaves exactly like that method. The delegate method can be used like any other method, with parameters and a return value...


This sounds to me like a perfect candidate to hook into a method. Now maybe the Template pattern is still useful in other ways, although I haven't come up with one, but maybe Delegates can make it largely obsolete.

Some quick sample code I did to just test it out.

namespace DelegateTest
{
public delegate int Test(int MyInt);
class Program
{
public static Test MyTest;
static void Main(string[] args)
{
int SomeNumber = 5;
MyTest += new Test(MyInt);

if (MyTest != null)
{
SomeNumber = MyTest(SomeNumber);
}
Console.WriteLine(SomeNumber);
Console.Read();
}

public static int MyInt(int TestInt)
{
return 6;
}
}
}

Friday, September 19, 2008

Strategy Pattern

I found a very useful purpose for the strategy pattern. I am writing an application that will move all or part of a database depending on the users selection. Since 90% of the code was the same I decided to use an abstract class with concrete methods for the business logic that was the same and abstract classes for where it differed. The strategies I have to choose from are:

1. Move entire DB
2. Move all rows from X number of tables.
3. Move X number of rows from X tables.

Doing a quick mock I came up with

public abstract class DBMoveStrategy
{
//properties...

//events...

//concrete methods

public abstract void GetSource();
public abstract void AddConstraints(DataSetFacade Data);

}

Then I sub-class for the different branches of logic.

public class DBMoveAll : DBMoveStrategy
{
public override void GetSource()
{
//some code
}

public override void AddConstraints(DataSetFacade Data)
{
//some code
}
}

public class DBTables : DBMoveStrategy
{
public override void GetSource()
{
//some code
}

public override void AddConstraints(DataSetFacade Data)
{
//some code
}
}

public class DBMoveRows : DBMoveStrategy
{
public override void GetSource()
{
//some code
}

public override void AddConstraints(DataSetFacade Data)
{
//some code
}
}

To use this inside another class I just do the following:

public class MyControllerClass
{
private DBMoveStrategy movedb;

public MyControllerClass()
{
if (IsMoveAll)
movedb = new DBMoveAll();
}

public void Connect(string Server, string Database, string UserName, string Password)
{
movedb.connect(string Server, string Database, string UserName, string Password)
}

public void GetData()
{
movedb.GetSource();
movedb.AddConstraints();
}
}


There is more logic involved but this will give a flavor for how I am using the pattern. There are probably some better ways I could use it or other patterns that might be better but I am still learning the language and better ways to write code.

Friday, September 5, 2008

Event Driven Error Handling

I have been reading a new titled "Design Patterns in C#". It is a great book, tons of information and it is quite understandable. While reading through the observer pattern it struck me. Why do we need to use so may try/catch statements in our code when we could fire off the exceptions from an event handler.

So I wrote up some quick code to test it out.

public delegate void TestErrorHandler(string message);

public class TestClass
{
public event TestErrorHandler test;

public void TriggerHandler(object blah)
{
if (blah == null)
{
test("Null argument in TriggerHandler");
}
}
}

static void Main(string[] args)
{
TestClass MyTest = new TestClass();
MyTest.test += new TestErrorHandler(OhCrap);
MyTest.TriggerHandler(null);
}

public static void OhCrap(string e)
{
Console.WriteLine(String.Format("Oh CRAP! {0}", e));
Console.Read();
}

I can see some pitfalls to this solution like:
1. You fire off the same Exception Event from multiple methods, i.e. ArgumentNullException.
2. How do you tell what observing method called the method that threw the Exception Event? For instance what if you had a private property that had the same methods being called multiple times?
3. What about if I call the same method multiple times from the same piece of code. a for loop for instance?

These are definitely challenging issues to solve but I feel it may be possible. I already have some ideas but right now don't have the time to fully test these out. If it is possible then it could take decoupling code to a whole new level.

Monday, August 11, 2008

Updating identity columns in a Dataset

One of the things I have noticed about working with DataSets is that there is not a lot of information out there on how to work with Foreign Key constraints and merging 2 databases together. Let me rephrase that, there is a but it feels scattered about and I felt it didn't teach me what I needed to know. After spending some time researching the subject here is the best way I have found to deal with the issue.

First you need to setup your connection.

string ConnectionString = "server=" + _DatabaseServer + ";user id=" + UserName + ";password=" + Password + ";Trusted_Connection=no;connection timeout=30";

Establish your connection

Connection = new SqlConnection(ConnectionString);
Connection.Open();

Now we setup our adapter
string ParentQuery = "SELECT * FROM ParentTable";
string ChildQuery = "SELECT * FROM ChildTable";
SqlDataAdapter ParentAdapter = new SqlDataAdapter(ParentQuery, Connection);
SqlDataAdapter ChildAdapter = new SqlDataAdapter(ChildQuery, Connection);

At this point we can fill our DataSet. Before we do we need to setup an event handler for updating the Identity column in each row when it gets written back to the database. The reason for this is depending on how you you merge DataSets or add rows to a DataSet the Identity column will have a place holder. This place holder will not necessarily be the same value as what the database will contain. This wouldn't be a problem if you were only updating 1 table but when you have child tables that need the same value you will run into issues.

Here is where you add the event handler

ParentAdapter.RowUpdated += new SqlRowUpdatedEventHandler(AdapterRowUpdated);

And here is the event handler

private void AdapterRowUpdated(object sender, SqlRowUpdatedEventArgs e)
{
SqlCommand IdentityCommand = new SqlCommand("SELECT @@IDENTITY", e.Command.Connection);

e.Row["ID"] = IdentityCommand.ExecuteScalar();
e.Row.AcceptChanges();
}

Now when you Update the database all your rows will have the same IDs.

Go ahead and fill our DataSet.

ParentAdapter.FillSchema(MyDataSet, SchemaType.Source, "ParentTable");
ProfileAdapter.Fill(MyDataSet, "ParentTable");
ChildAdapter.FillSchema(MyDataSet, SchemaType.Source, "ChildTable");
ChildAdapter.Fill(MyDataSet, "ChildTable");

We have 1 DataSet with 2 DataTables in it, ParentTable and ChildTable. The database has an Identity field and a Foregn Key constraint. Those are not replicated down to the DataSets so we have to do this manually.

The Unique constraint
UniqueConstraint ParentConstraints = new UniqueConstraint(new DataColumn[] { MyDataSet.Tables["ParentTable"].Columns["ID"] });
MyDataSet.Tables["ID"].Constraints.Add(ParentConstraints);

Pay attention to the UpdateRule setting for the Foreign Key constraint. We set it as Rule.Cascade. This is done so that when we update the ID column for a row that ID gets propagated to any row within the child tables that contain the same Identity.

ForeignKeyConstraint ParentFK = new ForeignKeyConstraint("IDFK",
MyDataSet.Tables["ParentTable"].Columns["ID"],
MyDataSet.Tables["ChildTable"].Columns["ID"]);
ProfileFK.UpdateRule = Rule.Cascade;
MyDataSet.Tables["ChildTable"].Constraints.Add(ParentFK);

If you are merging 2 databases together I would recommend following the same steps above for the second database. That way you have 2 identical DataSets when you go to merge them together and you can reuse all the same code again. You don't have to do this if you don't want to as there are ways around it by using MissingSchemaAction but I prefer to have identical DataSets to work from.

We have 2 DataSets and we need to merge them together. The problem is that when you Fill your DataSet by default it sets all Rows to a RowState of NoChange. This means when you merge your 2 DataSets nothing will get written to your database because nothing has changed. Any new data should be marked as Added so the Adapter will Insert it into the database when you perform the Update. This is not very efficient if you have multiple tables that need to be updated. In that case wrap the below foreach loop in another foreach loop to iterate through each table.

foreach (DataRow row in NewData.Tables["ParentTable"].Rows)
{
row.BeginEdit();
row.SetAdded();
row.EndEdit();
}


Merge the 2 DataSets. This is very easy using the Merge method in your Destination DataSet, NewData being the DataSet I want to merge into my existing data. You have to set preserveChanges to true or the Merge will throw out all your changes and nothing will make it into the new database.

MyDataSet.Merge(NewData, true, MissingSchemaAction.Add);

Our DataSets are now merged and ready to be Updated in the new location. This is pretty trivial. The only thing to remember is that the Identity column is marked as ReadOnly, therefore you can't update it without first changing it.

SqlCommandBuilder ParentCommandBuilder = new SqlCommandBuilder(ParentAdapter);
SqlCommandBuilder ChildCommandBuilder = new SqlCommandBuilder(ChildAdapter);
MyDataSet.Tables["ParentTable"].Columns["ID"].ReadOnly = false;
ParentAdapter.Update(MyDataSet.Tables["ParentTable"]);
MyDataSet.Tables["ParentTable"].Columns["ID"].ReadOnly = true;
ChildAdapter.Update(MyDataSet.Tables["ChildTable"]);

Now just check the destination database and everything should be there. Just repeat the steps above for multiple tables and multiple databases and you can merge as many databases and tables as you need to.