Compare ArrayList  
Author Message
TheSniipe





PostPosted: Visual C# General, Compare ArrayList Top

Hello,

I have 2 sorted array lists.

e.g.  arrayList1 = a,b,c,d,e,f,h,k,l  and  arrayList2 = a,c,d,f,g,h,i,j

ArrayList1 contains the list of strings that are to be compared against.  Unfortunately there are thousands of strings in each.  So I don't know if it would be any good doing something like; does arrayList[0] == arrayList[0]||arrayList[1]||.... (in a loop of course).

Another thing, arrayList2 is getting its (distinct) data from a database so I can't really check it against arrayList1 there and then, as it would leave the connection open for longer... I don't know maybe this is an ok trade off

I don't suppose there is an efficient arrayList1.compare(arrayList2) method out there   Or some other alternative.

Thanks.



Visual C#10  
 
 
TheSniipe





PostPosted: Visual C# General, Compare ArrayList Top

or something like

foreach item in arrayList1

if (arrayList2.binarysearch(item) < 0)

arrayListDiscrepency.add(item)

...


 
 
TheSniipe





PostPosted: Visual C# General, Compare ArrayList Top

anyone got any ideas on this


 
 
webbes





PostPosted: Visual C# General, Compare ArrayList Top

Have a look at the abstract KeyedCollection class(http://msdn2.microsoft.com/en-us/library/ms132438.aspx) and the Hashtable.

"The KeyedCollection class provides both O(1) indexed retrieval and keyed retrieval that approaches O(1)."

Replace array2 bij a KeyedCollection or Hashtable and compare using the Contains method.

Cheers,

Wes



 
 
louthy





PostPosted: Visual C# General, Compare ArrayList Top

A hashtable won't work as it won't keep the sorted order.

Use a SortedDictionary instead of an ArrayList, as that will allow for instant look-up of the key in the second list.  I've provided an example below.

 class Program
 {
  static void Main(string[] args)
  {
   SortedDictionary<string, object> list1 = new SortedDictionary<string, object>();
   SortedDictionary<string, object> list2 = new SortedDictionary<string, object>();

   list1.Add("A", null);
   list1.Add("B", null);
   list1.Add("C", null);
   list1.Add("D", null);
   list1.Add("E", null);
   list1.Add("F", null);
   list1.Add("G", null);

   list2.Add("A", null);
   list2.Add("B", null);
   list2.Add("C", null);
   list2.Add("D", null);
   list2.Add("E", null);
   list2.Add("F", null);
   list2.Add("G", null);
   list2.Add("K", null);

   bool equal = Compare(list1, list2);

   if (equal)
   {
    Console.WriteLine("They're the same!");
   }
   else
   {
    Console.WriteLine("They differ");
   }
  }

  static bool Compare(SortedDictionary<string, object> list1, SortedDictionary<string, object> list2)
  {
   // Early check for difference
   if (list1.Count != list2.Count)
   {
    return false;
   }

   // Compare each value in list 1 with each value in list 2
   foreach (KeyValuePair<string, object> item in list1)
   {
    if (!list2.ContainsKey(item.Key))
    {
     // Return the moment we find a difference
     return false;
    }
   }

   // Must be the same
   return true;
  }
 }


 
 
TheSniipe





PostPosted: Visual C# General, Compare ArrayList Top

Darn, that looks sweet, I think its a .net2 code tho. I'm using 1.1
 
 
TheSniipe





PostPosted: Visual C# General, Compare ArrayList Top

I'm thinking of doing something like this: The Call, and below that the function

if(sb.Length != codesList.Length)

{

string diff = getDiferences(sb, codesList);

throw new System.ArgumentException("The following codes do not exist in the database, please ammend the input file: " + diff );

}

 

private string getDiferences(StringBuilder sb, string codesList)

{

string sRetVal = null;

//The sb are the origionals and codesList has to be checked against it

//the strings look like this: 'IRE','ENG','SCO'

//So I will split the strings in codesList and check each one sb.contains(codesListArray[ i ])

//if it exists, should I remove it from both... to bring the computation down There could be 10000 codes in the DB and 400 in the file

//if it doesn't exsist add it to the string sRetVal

return sRetVal;

}


 
 
louthy





PostPosted: Visual C# General, Compare ArrayList Top

Here you go, this should work:

 

using System;
using System.Collections;
using System.Text;
using System.Collections.Specialized;
using System.Collections.Generic;

namespace Test
{
 class Program
 {
  static void Main(string[] args)
  {
   SortedList list1 = new SortedList();
   SortedList list2 = new SortedList();

   list1.Add("A", null);
   list1.Add("B", null);
   list1.Add("C", null);
   list1.Add("D", null);
   list1.Add("E", null);
   list1.Add("F", null);
   list1.Add("G", null);
   list1.Add("K", null);

   list2.Add("A", null);
   list2.Add("B", null);
   list2.Add("C", null);
   list2.Add("D", null);
   list2.Add("E", null);
   list2.Add("F", null);
   list2.Add("G", null);
   list2.Add("K", null);

   bool equal = Compare(list1, list2);

   if (equal)
   {
    Console.WriteLine("They're the same!");
   }
   else
   {
    Console.WriteLine("They differ");
   }
  }

  static bool Compare(SortedList list1, SortedList list2)
  {
   // Early check for difference
   if (list1.Count != list2.Count)
   {
    return false;
   }

   IDictionaryEnumerator iter = (IDictionaryEnumerator)list1.GetEnumerator();

   // Compare each value in list 1 with each value in list 2
   while( iter.MoveNext() )
   {
    if (!list2.ContainsKey((string)iter.Key))
    {
     // Return the moment we find a difference
     return false;
    }
   }

   // Must be the same
   return true;
  }
 }
}


 
 
TheSniipe





PostPosted: Visual C# General, Compare ArrayList Top

Thanks louthy, I'll try and implement your code in my app. I really appreciate your help. Thanks again.
 
 
louthy





PostPosted: Visual C# General, Compare ArrayList Top

Actually, I've just modified it to use a SortedList. I don't think the StringDictionary is sorted.
 
 
louthy





PostPosted: Visual C# General, Compare ArrayList Top

Here's a version that produces a list of items that are missing from the second list:

class Program
{
static void Main(string[] args)
{
SortedList list1 = new SortedList();
SortedList list2 = new SortedList();

list1.Add("A", null);
list1.Add("B", null);
list1.Add("C", null);
list1.Add("D", null);
list1.Add("E", null);
list1.Add("F", null);
list1.Add("G", null);
list1.Add("K", null);

list2.Add("C", null);
list2.Add("D", null);
list2.Add("E", null);
list2.Add("F", null);
list2.Add("G", null);
list2.Add("K", null);

SortedList notPresent = Compare(list1, list2);

if (notPresent.Count == 0)
{
Console.WriteLine("All items present in DB");
}
else
{
Console.WriteLine("These items are not present in DB");

IDictionaryEnumerator iter = (IDictionaryEnumerator)notPresent.GetEnumerator();

// Compare each value in list 1 with each value in list 2
while (iter.MoveNext())
{
Console.WriteLine(iter.Key.ToString());
}
}
}

static SortedList Compare(SortedList list1, SortedList list2)
{
SortedList notPresentList = new SortedList();
IDictionaryEnumerator iter = (IDictionaryEnumerator)list1.GetEnumerator();

// Compare each value in list 1 with each value in list 2
while( iter.MoveNext() )
{
if (!list2.ContainsKey((string)iter.Key))
{
notPresentList.Add(iter.Key, iter.Value);
}
}

return notPresentList;
}
}


 
 
TheSniipe





PostPosted: Visual C# General, Compare ArrayList Top

ok, which is a better method Your way, or this way (arrayList1 can be 1000+ strings, and arrayList2 can be 500 strings->

arrMissingCodes = compareArrays(arrCodesList, arrDB);

private string compareArrays(ArrayList ar1, ArrayList ar2)

{

StringBuilder sb = new StringBuilder();

ar1.Sort();

ar2.Sort();

foreach (string item in ar1)

{

if(ar2.BinarySearch(item) < 0)

{

sb.Append(item + ", ");

}

}

return sb.ToString();

}


 
 
louthy





PostPosted: Visual C# General, Compare ArrayList Top

My way would be much faster as it works with lists which are already sorted (each item is sorted as you add it to the list). Also the SortedList works like a hash-table too which allows far faster random access. Doing a binary search would be far slower than doing hashtable access.

You could run the two side by side and profile them, but I think you'll find my solution will be orders of magnitude faster.


 
 
louthy





PostPosted: Visual C# General, Compare ArrayList Top

Just profiled the two:

Louthy method: Compare two random lists with 10000 items in each 31 milliseconds
Louthy method: Compare two identical lists with 10000 items in each 31 milliseconds
TheSniipe method: Compare two random lists with 10000 items in each 93 milliseconds
TheSniipe method: Compare two identical lists with 10000 items in each 125 milliseconds

Hope that helps :)


 
 
James Curran





PostPosted: Visual C# General, Compare ArrayList Top

Here's my take on it. It assumes that both lists are sorted, and returns the differences.

It is also O(N), while every other one presented here has been either O(N^2) or O(N ln N)



class ArrayCompare

{
static public string Compare(IList list1, IList list2)
{
int i = 0;
int j = 0;
StringBuilder sb1 = new StringBuilder("Items missing from List2: ");
StringBuilder sb2 = new StringBuilder("Items missing from List1: ");
while (i < list1.Count)
{
if (j == list2.Count)
break;
IComparable obj1 = list1[ i ] as IComparable;
IComparable obj2 = list2[ j ] as IComparable;
int cmp =obj1.CompareTo(obj2);
switch (Math.Sign(cmp))
{
case 0:
++
i;
++
j;
break;
case 1: // list1Idea. > list2[j]);

sb2.AppendFormat("\"{0}\", ", obj2);
++
j;
break;
case -1:
sb1.AppendFormat("\"{0}\", ", obj1);
++
i;
break;
}
}
while (i < list1.Count) // we reached the end of list 2 first.

{
sb1.AppendFormat("\"{0}\", ", list1[i]);
++
i;
}
while (j < list2.Count) // we reached the end of list 1 first.

{
sb2.AppendFormat("\"{0}\", ", list2[j]);
++
j;
}
return sb1.ToString() + Environment.NewLine + sb2.ToString();
}
}