LINQ joins when working with huge datasets

This post is more of a research-based finding.
I was trying to create an object map. The objects were POCOs, and I had to recursively build each object's child-items collection, of the same type, into a nested structure, so every object held a collection of its own type. Inside that code I was using LINQ with two simple joins across three in-memory C# collections. These collections were created to mirror database tables so that we could work with LINQ instead of database queries, which is definitely faster.
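
To make the setup concrete, here is a minimal sketch of that kind of code. The original classes are not shown in this post, so every name here (ItemRow, LinkRow, DetailRow, Node, JoinTreeBuilder) and the table layout are assumptions made for illustration. It uses C# 9 records and assumes the links contain no cycles.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows mirroring three normalised tables (not the original schema).
public record ItemRow(int Id, string Name);
public record LinkRow(int ParentId, int ChildId);
public record DetailRow(int ItemId, string Description);

// POCO that holds a collection of its own type.
public class Node
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public string Description { get; set; } = "";
    public List<Node> ChildItems { get; } = new List<Node>();
}

public static class JoinTreeBuilder
{
    // Returns the nodes directly under parentId, each with its own subtree.
    // The two joins are evaluated again on every recursive call.
    public static List<Node> BuildChildren(
        int parentId,
        List<ItemRow> items,
        List<LinkRow> links,
        List<DetailRow> details)
    {
        var query =
            from link in links
            where link.ParentId == parentId
            join item in items on link.ChildId equals item.Id
            join detail in details on item.Id equals detail.ItemId
            select new Node
            {
                Id = item.Id,
                Name = item.Name,
                Description = detail.Description
            };

        var nodes = query.ToList();
        foreach (var node in nodes)
        {
            node.ChildItems.AddRange(BuildChildren(node.Id, items, links, details));
        }
        return nodes;
    }
}
```

Calling JoinTreeBuilder.BuildChildren(0, items, links, details) would return the top-level nodes if root links use a parent id of 0 (again, an assumption). One thing worth knowing about code shaped like this: in LINQ to Objects, each join builds a lookup over its whole inner collection every time the query is enumerated, so a join placed inside a recursive method repeats that work once per node.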

The total number of records was more than 46,000. We found that the LINQ joins were slowing down the whole process: it took 75 seconds to build the whole hierarchical object tree. When I moved all the data from the three tables into one flat table, it took less than 15 seconds to create the same object tree.
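
For comparison, a flat-table version might look something like the sketch below. This is not the original code: FlatRow, TreeItem and FlatTreeBuilder are hypothetical names, a ParentId of 0 is assumed to mark a top-level row, and grouping the rows once with ToLookup is my own choice, since the post does not say how the flat table was queried.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flat row: everything needed for one node lives on one record.
public record FlatRow(int Id, int ParentId, string Name, string Description);

// POCO that holds a collection of its own type.
public class TreeItem
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public string Description { get; set; } = "";
    public List<TreeItem> ChildItems { get; } = new List<TreeItem>();
}

public static class FlatTreeBuilder
{
    // Assumption: rows with ParentId == 0 are the top-level items.
    public static List<TreeItem> Build(IEnumerable<FlatRow> rows)
    {
        // Group the rows by parent once, up front; no joins are needed afterwards.
        ILookup<int, FlatRow> byParent = rows.ToLookup(r => r.ParentId);
        return BuildChildren(0, byParent);
    }

    private static List<TreeItem> BuildChildren(int parentId, ILookup<int, FlatRow> byParent)
    {
        var nodes = new List<TreeItem>();
        // Indexing a lookup with a missing key yields an empty sequence,
        // so leaf nodes simply end the recursion.
        foreach (var row in byParent[parentId])
        {
            var node = new TreeItem
            {
                Id = row.Id,
                Name = row.Name,
                Description = row.Description
            };
            node.ChildItems.AddRange(BuildChildren(row.Id, byParent));
            nodes.Add(node);
        }
        return nodes;
    }
}
```

With this shape, each recursive step only reads the rows already grouped under one parent id, rather than running a query over the full collections again.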

So, for all to note: when working with huge data, it is better and more efficient to use LINQ against flat tables than against multiple normalised tables that have to be joined. For a small set of data, the difference will be negligible and can be ignored.
