BitArray is great, but don't abuse it: another production memory blowup

One: Background

1. The story

Last month I wrote a blog post about using a bitmap to aggressively compress a List<CustomerID>, which shrank the list's memory footprint by a factor of nearly 106. But a bitmap is not always the right choice. You have to use it in the right scenario instead of reaching for it with your eyes closed. The bitmap-style collection in C# is BitArray.

It turned out to be rather dramatic 😄: this time, abuse of BitArray caused memory to surge and drop by around 10 GB. The point of a story like this is still the troubleshooting process, though. I'm writing it down for my future self so that this rare piece of hands-on experience doesn't evaporate.

Two: Troubleshooting approach

1. Take a look at the managed heap

Looking at the managed heap is a good first step, but it doesn't pay off every time. After all, memory surges have many different causes, just as a cold can come from a chill, from heat, or from a virus. Start with the old standby, !dumpheap -stat -min 102400, which lists the objects on the managed heap that are at least 102,400 bytes (100 KB) in size.

0:030> !dumpheap -stat -min 102400
Statistics:
              MT    Count    TotalSize Class Name
00007ffe094ec988        1      1438413 System.Byte[]
00007ffdab934c48        1      1810368 System.Collections.Generic.Dictionary`2+Entry[[System.Int32, mscorlib],[System.Collections.Generic.HashSet`1[[System.Int64, mscorlib]], System.Core]][]
00007ffe094e6948        1      2527996 System.String
00007ffdab9ace78        4     29499552 System.Collections.Generic.Dictionary`2+Entry[[System.Int64, mscorlib],[System.DateTime, mscorlib]][]
00007ffe094e4078        4    267342240 System.String[]
00007ffe094e9220      135    452683336 System.Int32[]
00007ffdab8cd620      123   1207931808 System.Collections.Generic.HashSet`1+Slot[[System.Int64, mscorlib]][]
00007ffe094c8510      185   1579292760 System.Int64[]
00007ffdab9516b0      154   1934622720 System.Linq.Set`1+Slot[[System.Int64, mscorlib]][]
000001cc882de970      347   3660623866      Free
Total 1371 objects

After removing some sensitive classes, nothing in the list looks like a particularly conspicuous collection. Types such as System.Int64[] and System.Linq.Set`1+Slot[[System.Int64, mscorlib]][] are generally the backing storage of other collections, and !gcroot often fails to trace them. The largest entry, however, is the Free row: 347 free blocks totaling 3,660,623,866 bytes (about 3.4 GB), which indicates that the large object heap is badly fragmented at this point. It would be nice if the GC could compact it.
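For reference, the runtime does offer an opt-in way to compact the large object heap. The sketch below is not what this service did; it only illustrates the GCSettings.LargeObjectHeapCompactionMode switch available since .NET Framework 4.5.1. The class and method names are hypothetical.

```csharp
using System;
using System.Runtime;

public static class LohCompaction
{
    // Hypothetical helper: asks the GC to compact the large object heap during
    // the next blocking full collection, then triggers that collection.
    // Returns the managed heap size in bytes reported afterwards.
    public static long CompactLargeObjectHeapOnce()
    {
        GCSettings.LargeObjectHeapCompactionMode =
            GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect(); // blocking full collection; this is where the LOH is compacted
        return GC.GetTotalMemory(false);
    }
}
```

Compaction only reclaims the free blocks between live objects. It does not make the live objects themselves any smaller, so it treats the symptom rather than the cause found below.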

2. View the call stack for each thread

Out of habit, first take a look at how many threads the program has.

0:000> !threads
ThreadCount:      74
UnstartedThread:  0
BackgroundThread: 72
PendingThread:    0
DeadThread:       0
Hosted Runtime:   no
                                                                                                        Lock  
       ID OSID ThreadOBJ           State GC Mode     GC Alloc Context                  Domain           Count Apt Exception
   0    1 2958 000001cc882e5a40    2a020 Preemptive  0000000000000000:0000000000000000 000001cc882d8db0 1     MTA 
   2    2 2358 000001cc883122c0    2b220 Preemptive  000001D41B132930:000001D41B1348A0 000001cc882d8db0 0     MTA (Finalizer) 
   3    4 2204 000001cc883ae5d0  102a220 Preemptive  0000000000000000:0000000000000000 000001cc882d8db0 0     MTA (Threadpool Worker) 
   5    7 278c 000001cca29d8ef0  202b220 Preemptive  000001D41AB53A98:000001D41AB55A58 000001cc882d8db0 1     MTA 
   6   40 2a64 000001cca3048f10  1020220 Preemptive  0000000000000000:0000000000000000 000001cc882d8db0 0     Ukn (Threadpool Worker) 
   7   46  e34 000001cca311c390  202b220 Preemptive  0000000000000000:0000000000000000 000001cc882d8db0 0     MTA 
   8   47 27d8 000001cca3115e00    2b220 Preemptive  0000000000000000:0000000000000000 000001cc882d8db0 0     MTA 

...

You can see that there are currently 74 threads, 72 of them background threads. Next, use ~*e !clrstack to see what each managed thread is doing. There is too much output to show in full, so here is a selection.

0:000> ~*e !clrstack
OS Thread Id: 0x2d64 (29)
        Child SP               IP Call Site
000000d908cfe698 00007ffe28646bf4 [GCFrame: 000000d908cfe698] 
000000d908cfe768 00007ffe28646bf4 [HelperMethodFrame_1OBJ: 000000d908cfe768] System.Threading.Monitor.ObjWait(Boolean, Int32, System.Object)

OS Thread Id: 0x214c (30)
        Child SP               IP Call Site
000000d90957e6e8 00007ffe28646bf4 [GCFrame: 000000d90957e6e8] 
000000d90957e7b8 00007ffe28646bf4 [HelperMethodFrame_1OBJ: 000000d90957e7b8] System.Threading.Monitor.ObjWait(Boolean, Int32, System.Object)

OS Thread Id: 0x1dc0 (40)
        Child SP               IP Call Site
000000d950ebe878 00007ffe28646bf4 [GCFrame: 000000d950ebe878] 
000000d950ebe948 00007ffe28646bf4 [HelperMethodFrame_1OBJ: 000000d950ebe948] System.Threading.Monitor.ObjWait(Boolean, Int32, System.Object)

OS Thread Id: 0x274c (53)
        Child SP               IP Call Site
000000d9693fe518 00007ffe28646bf4 [GCFrame: 000000d9693fe518] 
000000d9693fe5e8 00007ffe28646bf4 [HelperMethodFrame_1OBJ: 000000d9693fe5e8] System.Threading.Monitor.ObjWait(Boolean, Int32, System.Object)
000000d9693fe700 00007ffe09314d05 System.Threading.ManualResetEventSlim.Wait(Int32, System.Threading.CancellationToken)
000000d9693fe790 00007ffe0930d996 System.Threading.Tasks.Task.SpinThenBlockingWait(Int32, System.Threading.CancellationToken)
000000d9693fe800 00007ffe09c9b7a1 System.Threading.Tasks.Task.InternalWait(Int32, System.Threading.CancellationToken)

Something strange shows up here: four threads (29, 30, 40 and 53) are all stuck in Monitor.ObjWait. Judging from their call stacks, all four are about to bulk insert data into MongoDB through InsertBatch, while another thread acquired the lock first and is running its own InsertBatch. In other words, one thread holds the lock and the other four are queued up behind it.

3. Find the collection passed to InsertBatch

Let's pick thread 30. In its call stack you should see a System.Collections.Generic.IEnumerable`1<System.__Canon> parameter. From IEnumerable you can guess that the implementing class is a collection such as List or HashSet, so use !dso to dump all the objects on thread 30's stack.

The one we want should be the List<Xxx.Common.GroupConditionCustomerIDCacheModel>. Next, use !objsize and !do to measure the List and dump it.

0:030> !objsize 000001d3fa581518 
sizeof(000001d3fa581518) = 1487587080 (0x58aac708) bytes (System.Collections.Generic.List`1[[DataMipCRM.Common.GroupConditionCustomerIDCacheModel, DataMipCRM.Common]])
0:030> !do 000001d3fa581518
Name:        System.Collections.Generic.List`1[[DataMipCRM.Common.GroupConditionCustomerIDCacheModel, DataMipCRM.Common]]
MethodTable: 00007ffdab9557d0
EEClass:     00007ffe08eb22a0
Size:        40(0x28) bytes
File:        C:\Windows\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ffe09478740  4001871        8     System.__Canon[]  0 instance 000001d3fa5b9bf8 _items
00007ffe094e9288  4001872       18         System.Int32  1 instance             1520 _size
00007ffe094e9288  4001873       1c         System.Int32  1 instance             1520 _version
00007ffe094e6f28  4001874       10        System.Object  0 instance 0000000000000000 _syncRoot
00007ffe09478740  4001875        8     System.__Canon[]  0   static  <no information>


You can see that the list occupies 1487587080/1024/1024 ≈ 1,419 MB, roughly 1.4 GB, which is frighteningly large. Yet _size is only 1520, so the weight must be in the elements of the _items array. Use !da to take out the first element and dissect it.

0:030> !da -length 1 -details 000001d3fa5b9bf8
Name:        DataMipCRM.Common.GroupConditionCustomerIDCacheModel[]
MethodTable: 00007ffdab955e10
EEClass:     00007ffe08eaaa00
Size:        16408(0x4018) bytes
Array:       Rank 1, Number of elements 2048, Type CLASS
Element Methodtable: 00007ffdab955740
[0] 000001d3fa581540
    Name:        DataMipCRM.Common.GroupConditionCustomerIDCacheModel
    MethodTable: 00007ffdab955740
    EEClass:     00007ffdab94b9e8
    Size:        64(0x40) bytes
    File:        D:\LuneceService\DataMipCRM.Common.dll
    Fields:
                      MT    Field   Offset                 Type VT     Attr            Value Name
        00007ffdaac69258  4000589       28     ...oDB.Bson.ObjectId      1     instance     000001d3fa581568     <_id>k__BackingField
        00007ffe094e9288  400058a       20             System.Int32      1     instance                 1901     <ShopId>k__BackingField
        00007ffe094e6948  400058b        8            System.String      0     instance     000001d3f7154070     <GroupConditionHasCode>k__BackingField
        00007ffe094e6948  400058c       10            System.String      0     instance     000001cca7b46ac0     <unit>k__BackingField
        00007ffe094f1cb0  400058d       18     ...lections.BitArray      0     instance     000001d3fa581580     <customeridArray>k__BackingField

The Type column of the last row shows a BitArray. As before, measure it first and then dump it.

0:030> !objsize 000001d3fa581580     
sizeof(000001d3fa581580) = 956008 (0xe9668) bytes (System.Collections.BitArray)
0:030> !do 000001d3fa581580     
Name:        System.Collections.BitArray
MethodTable: 00007ffe094f1cb0
EEClass:     00007ffe08ead968
Size:        40(0x28) bytes
File:        C:\Windows\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ffe094e9220  40017e2        8       System.Int32[]  0 instance 000001d5320c6d18 m_array
00007ffe094e9288  40017e3       18         System.Int32  1 instance          7647524 m_length
00007ffe094e9288  40017e4       1c         System.Int32  1 instance                2 _version
00007ffe094e6f28  40017e5       10        System.Object  0 instance 0000000000000000 _syncRoot

From the output, this BitArray occupies 956008/1024/1024 ≈ 0.91 MB, which is really large for a single cache entry. m_length shows about 7.64 million bits, which means a CustomerID of 7647524 - 1 = 7647523 is stored in it. With that, the problem is basically found, and I finally know why the memory is so large. Let's walk through it.
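The numbers in the dump can be checked with a little arithmetic. The sketch below estimates the footprint of a BitArray from its bit length. The class name and the two overhead constants are assumptions for a 64-bit CLR: 40 bytes for the BitArray object (the Size reported by !do above) and 24 bytes of header for the backing int[].

```csharp
using System;

public static class BitArrayFootprint
{
    // Assumed 64-bit CLR overheads (not read from the runtime):
    const long BitArrayObjectBytes = 40;   // Size shown by !do for the BitArray itself
    const long Int32ArrayHeaderBytes = 24; // header of the backing int[]

    // Number of 32-bit words needed to hold bitLength bits.
    public static long WordCount(long bitLength)
    {
        return bitLength > 0 ? (bitLength - 1) / 32 + 1 : 0;
    }

    // Estimated bytes retained by a BitArray with the given bit length.
    public static long EstimateBytes(long bitLength)
    {
        return BitArrayObjectBytes + Int32ArrayHeaderBytes + WordCount(bitLength) * 4;
    }
}
```

For m_length = 7647524 this gives 238,986 words and 40 + 24 + 955,944 = 956,008 bytes, matching the !objsize result. The cost depends only on the bit length, that is, on the largest CustomerID, and not on how many IDs are actually set. If all 1520 list entries carried a BitArray of about this size, they would add up to 1520 × 956,008 ≈ 1.45 GB, which is in line with the 1,487,587,080 bytes measured for the whole list.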

Three: Summary

Finally, I asked the developer why it was built this way. It turned out to be for a report shown to customers. To speed things up, the population (the set of customers) at each point is saved in a BitArray and cached in MongoDB, so that when the customer later selects a point to drill down into, the population can be restored directly from MongoDB without repeating the computation.

The populations at different points are mutually exclusive, and each point may hold only dozens, hundreds, or a few thousand people. But this customer has more than 8 million people, so the CustomerID values are naturally very large, and a small number of large values is exactly the case BitArray handles worst. Put these factors together and you get a textbook case of BitArray abuse. What do you think?
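One way to guard against this is to compare the two payload sizes before choosing a representation. The following is a minimal sketch, not the fix applied in this service. The class and method names are hypothetical, and it assumes IDs are non-negative Int64 values stored as 8 bytes each when kept as a plain array.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PopulationEncoding
{
    // Payload bytes of a bitmap that must cover bit positions 0..maxId.
    public static long BitmapBytes(long maxId) => (maxId / 32 + 1) * 4;

    // Payload bytes of the same IDs stored as a plain long[].
    public static long IdArrayBytes(int count) => (long)count * 8;

    // True when a bitmap is no larger than a plain array of the IDs.
    public static bool PreferBitmap(IReadOnlyCollection<long> ids)
    {
        if (ids.Count == 0) return false;
        return BitmapBytes(ids.Max()) <= IdArrayBytes(ids.Count);
    }
}
```

For a point holding 500 customers whose largest ID is 7647523, the bitmap needs 955,944 bytes while the array needs 4,000 bytes, so the bitmap loses by a wide margin. A bitmap only pays off when the IDs are dense relative to the largest one.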


Tags: C# Windows MongoDB

Posted on Thu, 21 May 2020 20:59:50 -0400 by ealderton