Thursday, September 3, 2015

Distinct counts will not play nicely with my giant foreach loop plans.

var foo = _data.Select(x => x.Bar).Distinct().Count();
var baz = _data.Select(x => x.Qux).Distinct().Count();

This idea won't fly. A loop that gathers a bunch of totals in one pass can't also find distinct counts efficiently, not when there is more than one distinct count to find, because the list being looped through can only be sorted up front by one property. If we want to refactor the two lines of C# above into our foreach loop, we are in trouble. The only way to do so would entail pushing Bar and Qux values into other collections and looping through those collections to see whether the next Bar or Qux already exists therein. It would be expensive. You could reduce the overhead by, again, sorting the whole of the core collection by either Bar or Qux up front, so that the sorted property can be counted by comparing each value with the previous one and only the values for the other property need to be tucked away in a goofy second collection, but even then the notion is still nasty.
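For what it's worth, here is a rough sketch of what folding the two distinct counts into the loop might look like. Only Bar and Qux come from the lines above; the Item class, its Amount property, and the Tally.Run method are made-up names for illustration, and Amount just stands in for whatever the loop is really totaling.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type: only Bar and Qux appear in the post.
public class Item
{
    public string Bar { get; set; }
    public string Qux { get; set; }
    public int Amount { get; set; } // assumed stand-in for a value being totaled
}

public static class Tally
{
    // One pass over the data: keeps a running total and, on the side,
    // tracks which Bar and Qux values have been seen so far.
    public static (int Total, int DistinctBars, int DistinctQuxes) Run(IEnumerable<Item> data)
    {
        int total = 0;
        var seenBars = new List<string>();   // secondary collection for Bar
        var seenQuxes = new List<string>();  // secondary collection for Qux

        foreach (var x in data)
        {
            total += x.Amount;

            // List<T>.Contains walks the list, so each check costs more
            // as the number of distinct values grows.
            if (!seenBars.Contains(x.Bar)) seenBars.Add(x.Bar);
            if (!seenQuxes.Contains(x.Qux)) seenQuxes.Add(x.Qux);
        }

        return (total, seenBars.Count, seenQuxes.Count);
    }
}
```

Every item triggers a scan of both side lists, which is the expense described above. Swapping the lists for HashSet<string> would turn each check into a hash lookup, though the loop would still be carrying one extra collection per distinct property.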
