LINQ: .Any() vs. .Exists() - what's the difference?

Using LINQ on a collection, what is the difference between the following lines of code?

if(!coll.Any(i => i.Value))

and

if(!coll.Exists(i => i.Value))

Update 1

When I disassembled .Exists, it looked like there was no code.

Update 2

Does anyone know why there is no code for this one?

Answer #1

TL;DR: Performance-wise, Any seems to be slower (assuming I have set this up properly to evaluate both calls at almost the same time).

        var list1 = Generate(1000000);
        var forceListEval = list1.SingleOrDefault(o => o == "0123456789012");
        if (forceListEval != "sdsdf")
        {
            var s = string.Empty;
            var start2 = DateTime.Now;
            if (!list1.Exists(o => o == "0123456789012"))
            {
                var end2 = DateTime.Now;
                s += " Exists: " + end2.Subtract(start2);
            }

            var start1 = DateTime.Now;
            if (!list1.Any(o => o == "0123456789012"))
            {
                var end1 = DateTime.Now;
                s +=" Any: " +end1.Subtract(start1);
            }

            if (!s.Contains("sdfsd"))
            {

            }
        }

Test list generator:

private List<string> Generate(int count)
    {
        var list = new List<string>();
        for (int i = 0; i < count; i++)
        {
            list.Add( new string(
            Enumerable.Repeat("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 13)
                .Select(s =>
                {
                    var cryptoResult = new byte[4];
                    new RNGCryptoServiceProvider().GetBytes(cryptoResult);
                    return s[new Random(BitConverter.ToInt32(cryptoResult, 0)).Next(s.Length)];
                })
                .ToArray())); 
        }

        return list;
    }

10M records

"Any: 00:00:00.3770377 Exists: 00:00:00.2490249"

5M records

"Any: 00:00:00.0940094 Exists: 00:00:00.1420142"

1M records

"Any: 00:00:00.0180018 Exists: 00:00:00.0090009"

500k records (I also flipped the order in which the two are evaluated, to see whether there is any extra cost attached to whichever operation runs first)

"Exists: 00:00:00.0050005 Any: 00:00:00.0100010"

100k records

"Exists: 00:00:00.0010001 Any: 00:00:00.0020002"

It would seem that Any is slower by a factor of about 2.

Edit: For the 5M and 10M record runs I changed the way the list is generated, and Exists suddenly became slower than Any, which implies there is something wrong with the way I am testing.

New testing mechanism:

private static IEnumerable<string> Generate(int count)
    {
        var cripto = new RNGCryptoServiceProvider();
        Func<string> getString = () => new string(
            Enumerable.Repeat("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 13)
                .Select(s =>
                {
                    var cryptoResult = new byte[4];
                    cripto.GetBytes(cryptoResult);
                    return s[new Random(BitConverter.ToInt32(cryptoResult, 0)).Next(s.Length)];
                })
                .ToArray());

        var list = new ConcurrentBag<string>();
        var x = Parallel.For(0, count, o => list.Add(getString()));
        return list;
    }

    private static void Test()
    {
        var list = Generate(10000000);
        var list1 = list.ToList();
        var forceListEval = list1.SingleOrDefault(o => o == "0123456789012");
        if (forceListEval != "sdsdf")
        {
            var s = string.Empty;

            var start1 = DateTime.Now;
            if (!list1.Any(o => o == "0123456789012"))
            {
                var end1 = DateTime.Now;
                s += " Any: " + end1.Subtract(start1);
            }

            var start2 = DateTime.Now;
            if (!list1.Exists(o => o == "0123456789012"))
            {
                var end2 = DateTime.Now;
                s += " Exists: " + end2.Subtract(start2);
            }

            if (!s.Contains("sdfsd"))
            {

            }
        }
    }

Edit 2: OK, to eliminate any influence from generating the test data, I wrote it all to a file and now read it from there.

 private static void Test()
    {
        var list1 = File.ReadAllLines("test.txt").Take(500000).ToList();
        var forceListEval = list1.SingleOrDefault(o => o == "0123456789012");
        if (forceListEval != "sdsdf")
        {
            var s = string.Empty;
            var start1 = DateTime.Now;
            if (!list1.Any(o => o == "0123456789012"))
            {
                var end1 = DateTime.Now;
                s += " Any: " + end1.Subtract(start1);
            }

            var start2 = DateTime.Now;
            if (!list1.Exists(o => o == "0123456789012"))
            {
                var end2 = DateTime.Now;
                s += " Exists: " + end2.Subtract(start2);
            }

            if (!s.Contains("sdfsd"))
            {
            }
        }
    }

10M

"Any: 00:00:00.1640164 Exists: 00:00:00.0750075"

5M

"Any: 00:00:00.0810081 Exists: 00:00:00.0360036"

1M

"Any: 00:00:00.0190019 Exists: 00:00:00.0070007"

500K

"Any: 00:00:00.0120012 Exists: 00:00:00.0040004"

Answer #2

As a continuation of Matas's answer on benchmarking.

TL;DR: Exists() and Any() are equally fast.

First of all: benchmarking with Stopwatch is not precise (see series0ne's answer on a different but similar topic), but it is far more precise than DateTime.

The way to get really precise readings is to use performance profiling. However, one way to get a sense of how the two methods compare is to execute both many times and then compare the fastest execution time of each. That way, it does not really matter that JITing and other noise give us bad readings (and they do), because both measurements are "equally wrong" in a sense.

static void Main(string[] args)
    {
        Console.WriteLine("Generating list...");
        List<string> list = GenerateTestList(1000000);
        var s = string.Empty;

        Stopwatch sw;
        Stopwatch sw2;
        List<long> existsTimes = new List<long>();
        List<long> anyTimes = new List<long>();

        Console.WriteLine("Executing...");
        for (int j = 0; j < 1000; j++)
        {
            sw = Stopwatch.StartNew();
            if (!list.Exists(o => o == "0123456789012"))
            {
                sw.Stop();
                existsTimes.Add(sw.ElapsedTicks);
            }
        }

        for (int j = 0; j < 1000; j++)
        {
            sw2 = Stopwatch.StartNew();
            if (!list.Exists(o => o == "0123456789012"))
            {
                sw2.Stop();
                anyTimes.Add(sw2.ElapsedTicks);
            }
        }

        long existsFastest = existsTimes.Min();
        long anyFastest = anyTimes.Min();

        Console.WriteLine(string.Format("Fastest Exists() execution: {0} ticks\nFastest Any() execution: {1} ticks", existsFastest.ToString(), anyFastest.ToString()));
        Console.WriteLine("Benchmark finished. Press any key.");
        Console.ReadKey();
    }

    public static List<string> GenerateTestList(int count)
    {
        var list = new List<string>();
        for (int i = 0; i < count; i++)
        {
            Random r = new Random();
            int it = r.Next(0, 100);
            list.Add(new string('s', it));
        }
        return list;
    }

After executing the above code 4 times (each run performs 1,000 Exists() and Any() calls on a list of 1,000,000 elements), it is not hard to see that the methods are pretty much equally fast.

Fastest Exists() execution: 57881 ticks
Fastest Any() execution: 58272 ticks

Fastest Exists() execution: 58133 ticks
Fastest Any() execution: 58063 ticks

Fastest Exists() execution: 58482 ticks
Fastest Any() execution: 58982 ticks

Fastest Exists() execution: 57121 ticks
Fastest Any() execution: 57317 ticks

There is a slight difference, but it is small enough to be explained by background noise. My guess is that if one ran 10,000 or 100,000 Exists() and Any() calls instead, that slight difference would more or less disappear.

Answer #3

When you correct the measurement in the answer above so that it really compares Any and Exists (its second loop calls Exists() again instead of Any()), and add an average, you get the following output:

Executing search Exists() 1000 times ... 
Average Exists(): 35566,023
Fastest Exists() execution: 32226 

Executing search Any() 1000 times ... 
Average Any(): 58852,435
Fastest Any() execution: 52269 ticks

Benchmark finished. Press any key.
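
The corrected code is not shown in this answer. As a rough sketch of what such a measurement might look like (the class name AnyExistsBenchmark, the helper Measure and the fixed random seed are assumptions made for this example, not the original author's code), each search is timed with Stopwatch on every run, whether or not the item is found, and both the average and the fastest run are reported:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class AnyExistsBenchmark
{
    // Runs the given search `runs` times and returns the elapsed ticks of each run.
    public static List<long> Measure(Func<bool> search, int runs)
    {
        var times = new List<long>(runs);
        for (int j = 0; j < runs; j++)
        {
            var sw = Stopwatch.StartNew();
            search();                      // result is ignored; only the time matters
            sw.Stop();
            times.Add(sw.ElapsedTicks);
        }
        return times;
    }

    public static void Main()
    {
        // Same shape of test data as above: strings of 's' with random length 0..99.
        var random = new Random(42);       // fixed seed is an assumption, for repeatability
        var list = new List<string>();
        for (int i = 0; i < 1000000; i++)
        {
            list.Add(new string('s', random.Next(0, 100)));
        }

        // The searched value never occurs, so every call scans the whole list.
        List<long> existsTimes = Measure(() => list.Exists(o => o == "0123456789012"), 1000);
        List<long> anyTimes = Measure(() => list.Any(o => o == "0123456789012"), 1000);

        Console.WriteLine("Average Exists(): {0}", existsTimes.Average());
        Console.WriteLine("Fastest Exists() execution: {0} ticks", existsTimes.Min());
        Console.WriteLine("Average Any(): {0}", anyTimes.Average());
        Console.WriteLine("Fastest Any() execution: {0} ticks", anyTimes.Min());
    }
}
```

The absolute numbers depend on the machine, the runtime version and the build configuration, so only the relative difference between the two methods is meaningful.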

Answer #4

The difference is that Any is an extension method for any IEnumerable<T>, defined on System.Linq.Enumerable. It can be used on any IEnumerable<T> instance.

Exists does not appear to be an extension method. My guess is that coll is of type List<T>. If so, Exists is an instance method that works very similarly to Any.

In short, the methods are essentially the same. One is more general than the other.

  • Any also has an overload that takes no parameters and simply checks whether the enumerable contains any item.
  • Exists has no such overload.
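
A minimal sketch of these points, assuming small made-up sample collections (the class name AnyVsExistsDemo is only for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVsExistsDemo
{
    public static void Main()
    {
        int[] array = { 1, 2, 3 };
        List<int> list = new List<int> { 1, 2, 3 };
        List<int> empty = new List<int>();

        // Any is an extension method on IEnumerable<T>, so it works on arrays and lists alike.
        Console.WriteLine(array.Any(n => n > 2));    // True
        Console.WriteLine(list.Any(n => n > 2));     // True

        // Exists is an instance method of List<T>; arrays have no such instance method.
        Console.WriteLine(list.Exists(n => n > 2));  // True

        // The parameterless Any only checks whether there is at least one element.
        Console.WriteLine(list.Any());               // True
        Console.WriteLine(empty.Any());              // False
    }
}
```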

Answer #5

Additionally, this will only work if Value is of type bool. Normally these methods are used with predicates: a predicate is generally used to find out whether there is any element that satisfies a given condition. Here you are simply mapping from your element i to a bool property. The call searches for an "i" whose Value property is true, and returns true as soon as it finds one.
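
To make this concrete, here is a small sketch assuming a hypothetical element type Item with a bool Value property (the question does not show the real type, and the helper NoneSet is made up for this example):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type; the question does not show the real one.
public class Item
{
    public bool Value { get; set; }
}

public static class PredicateDemo
{
    // True when no element of coll has Value == true (the condition from the question).
    public static bool NoneSet(List<Item> coll)
    {
        return !coll.Any(i => i.Value);
    }

    public static void Main()
    {
        var coll = new List<Item>
        {
            new Item { Value = false },
            new Item { Value = true }
        };

        Console.WriteLine(coll.Any(i => i.Value));     // True
        Console.WriteLine(coll.Exists(i => i.Value));  // True
        Console.WriteLine(NoneSet(coll));              // False
        Console.WriteLine(NoneSet(new List<Item>()));  // True
    }
}
```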


Posted on Mon, 10 Feb 2020 08:54:35 -0500 by GaryAC