Android ANR Log Analysis Guide

When your project becomes more and more complex, or your users reach a certain order of magnitude, your code accidentall...
definition
ANR type
Causes of ANR
ANR case analysis process
1, View events_log
2, traces.txt log analysis
ANR case sorting

When your project becomes more and more complex, or your users reach a certain order of magnitude, your code accidentally has small problems, and you will receive all kinds of bug s, among which you will not be unfamiliar with the problem of ANR. This paper will explain in detail the types of anr, the reasons for its emergence, the detailed analysis of anr cases and classic cases.

definition

ANR(Application Not Responding)   The application is not responding. If your application is blocked in the UI thread for too long, anr will appear. Usually anr will appear, and a prompt box will pop up to let the user know whether to continue to wait or close the program.

ANR type

There are generally the following types of ANR:
1:KeyDispatchTimeout (common)
The input event was not processed within 5S, and ANR occurred.
logcat log keyword: Input event dispatching timed out

2:BroadcastTimeout
The foreground Broadcast: onReceiver does not complete processing within 10S, and an ANR occurs.
Background Broadcast: onReceiver does not complete processing within 60s, and ANR occurs.
logcat log keyword: Timeout of broadcast BroadcastRecord

3:ServiceTimeout
Foreground Service: ANR occurs when the life cycle of onCreate, onStart and onBind is not processed within 20s.
Background Service: ANR occurs when the life cycle of onCreate, onStart and onBind is not processed within 200s
logcat log keyword: Timeout executing service

4: ContentProviderTimeout
The ContentProvider did not complete processing within 10S, and an ANR occurred. logcat log keyword: timeout publishing content providers

Causes of ANR

1: The main thread frequently performs time-consuming IO operations, such as database reading and writing
2: Deadlock of multithreading operation, and the main thread is block ed;
3: The main thread is blocked by the Binder peer block;
4: ANR in WatchDog in system server;
5:service binder is online and cannot communicate with System Server
6: System resources are exhausted (pipeline, CPU, IO)

ANR case analysis process

We will analyze ANR step by step, which makes us better understand how to find, analyze and solve problems.

1, View events_log

View events under the mobilelog folder_ Log, search for the keyword from the log: am_anr, find the time point where ANR occurs, process PID and ANR type.

Such as log:

07-20 15:36:36.472 1000 1520 1597 I am_anr : [0,1480,com.xxxx.moblie,952680005,Input dispatching timed out (AppWindowToken}}, Waiting because no window has focus but there is a focused application that may eventually add a window when it finishes starting up.)]

From the above log, we can see that the application is com.xxxx.moblie   At 07-20 15:36:36.472, an ANR of KeyDispatchTimeout type occurred, and its process number was 1480. Sort out the key information:
ANR time: 07-20 15:36:36.472
Process pid: 1480
Process name: com.xxxx.moblie
ANR type: KeyDispatchTimeout

We already know that the ANR of KeyDispatchTimeout occurs because   The input event did not complete processing within 5 seconds. So at this time 07-20 15:36:36.472   What did the program do in the first five seconds of the program, that is, around the time period (15:36:30 ~ 15:36:31)? This is simple, because we already know the pid, and then search the logs with pid = 1480. These logs represent the track of the process. The key logs are as follows:

07-20 15:36:29.749 10102 1480 1737 D moblie-Application: [Thread:17329] receive an intent from server, action=com.ttt.push.RECEIVE_MESSAGE 07-20 15:36:30.136 10102 1480 1737 D moblie-Application: receiving an empty message, drop 07-20 15:36:35.791 10102 1480 1766 I Adreno : QUALCOMM build : 9c9b012, I92eb381bc9 07-20 15:36:35.791 10102 1480 1766 I Adreno : Build Date : 12/31/17 07-20 15:36:35.791 10102 1480 1766 I Adreno : OpenGL ES Shader Compiler Version: EV031.22.00.01 07-20 15:36:35.791 10102 1480 1766 I Adreno : Local Branch : 07-20 15:36:35.791 10102 1480 1766 I Adreno : Remote Branch : refs/tags/AU_LINUX_ANDROID_LA.UM.6.4.R1.08.00.00.309.049 07-20 15:36:35.791 10102 1480 1766 I Adreno : Remote Branch : NONE 07-20 15:36:35.791 10102 1480 1766 I Adreno : Reconstruct Branch : NOTHING 07-20 15:36:35.826 10102 1480 1766 I vndksupport: sphal namespace is not configured for this process. Loading /vendor/lib64/hw/gralloc.msm8998.so from the current namespace instead. 07-20 15:36:36.682 10102 1480 1480 W ViewRootImpl[MainActivity]: Cancelling event due to no window focus: KeyEvent { action=ACTION_UP, keyCode=KEYCODE_PERIOD, scanCode=0, metaState=0, flags=0x28, repeatCount=0, eventTime=16099429, downTime=16099429, deviceId=-1, source=0x101 }

As we can see from the above, the program received an action message at time 07-20 15:36:29.749.

07-20 15:36:29.749 10102 1480 1737 D moblie-Application: [Thread:17329] receive an intent from server, action=com.ttt.push.RECEIVE_MESSAGE.

It turned out to be app com.xxxx.moblie   A push message (com.ttt.push.RECEIVE_MESSAGE) was received, which led to blocking. Let's connect the information we have obtained so far: at time 07-20 15:36:29.749   App com.xxxx.moblie   Received the push message action=com.ttt.push.RECEIVE_MESSAGE is blocked, and ANR of KeyDispatchTimeout occurs 5 seconds later.

Although I know how to start, the specific reason has not been found. Was it because the CPU was very nervous and various apps seized resources again? Let's look at the CPU information,. Search keywords:   ANR IN

07-20 15:36:58.711 1000 1520 1597 E ActivityManager: ANR in com.xxxx.moblie (com.xxxx.moblie/.ui.MainActivity) (Process name) 07-20 15:36:58.711 1000 1520 1597 E ActivityManager: PID: 1480 (process pid) 07-20 15:36:58.711 1000 1520 1597 E ActivityManager: Reason: Input dispatching timed out (AppWindowToken}}, Waiting because no window has focus but there is a focused application that may eventually add a window when it finishes starting up.) 07-20 15:36:58.711 1000 1520 1597 E ActivityManager: Load: 0.0 / 0.0 / 0.0 (Load Indicates 1 minute,5 minute,15 minute CPU Load of) 07-20 15:36:58.711 1000 1520 1597 E ActivityManager: CPU usage from 20ms to 20286ms later (2018-07-20 15:36:36.170 to 2018-07-20 15:36:56.436): 07-20 15:36:58.711 1000 1520 1597 E ActivityManager: 42% 6774/pressure: 41% user + 1.4% kernel / faults: 168 minor 07-20 15:36:58.711 1000 1520 1597 E ActivityManager: 34% 142/kswapd0: 0% user + 34% kernel 07-20 15:36:58.711 1000 1520 1597 E ActivityManager: 31% 1520/system_server: 13% user + 18% kernel / faults: 58724 minor 1585 major 07-20 15:36:58.711 1000 1520 1597 E ActivityManager: 13% 29901/com.ss.android.article.news: 7.7% user + 6% kernel / faults: 56007 minor 2446 major 07-20 15:36:58.711 1000 1520 1597 E ActivityManager: 13% 32638/com.android.quicksearchbox: 9.4% user + 3.8% kernel / faults: 48999 minor 1540 major 07-20 15:36:58.711 1000 1520 1597 E ActivityManager: 11% (CPU Utilization rate of)1480/com.xxxx.moblie: 5.2%(Usage of user status) user + (Kernel usage) 6.3% kernel / faults: 76401 minor 2422 major 07-20 15:36:58.711 1000 1520 1597 E ActivityManager: 8.2% 21000/kworker/u16:12: 0% user + 8.2% kernel 07-20 15:36:58.711 1000 1520 1597 E ActivityManager: 0.8% 724/mtd: 0% user + 0.8% kernel / faults: 1561 minor 9 major 07-20 15:36:58.711 1000 1520 1597 E ActivityManager: 8% 29704/kworker/u16:8: 0% user + 8% kernel 07-20 15:36:58.711 1000 1520 1597 E ActivityManager: 7.9% 24391/kworker/u16:18: 0% user + 7.9% kernel 07-20 15:36:58.711 1000 1520 1597 E ActivityManager: 7.1% 30656/kworker/u16:14: 0% user + 7.1% kernel 07-20 15:36:58.711 1000 1520 1597 E ActivityManager: 7.1% 9998/kworker/u16:4: 0% user + 7.1% kernel

I have marked the relevant meaning in the log. com.xxxx.moblie   It takes up 11% of the CPU, which is not much. Today's mobile phones are basically multi-core CPUs. If your CPU is 4 cores, the upper limit is 400%, and so on.

Since it is not the CPU load, what is the reason? At this time, it depends on our ultimate killer - traces.txt.

2, traces.txt log analysis

When the APP does not respond, the response is slow, or the WatchDog monitoring does not receive a response, the system will dump a traces.txt file and store it in the file directory: / data/anr/traces.txt. Through the traces file, we can get the thread name, stack information, current thread status, binder call and other information.
Get the file through the adb command: adb pull /data/anr/traces.txt
trace: Cmd line:com.xxxx.moblie

"main" prio=5 tid=1 Runnable | group="main" sCount=0 dsCount=0 obj=0x73bcc7d0 self=0x7f20814c00 | sysTid=20176 nice=-10 cgrp=default sched=0/0 handle=0x7f251349b0 | state=R schedstat=( 0 0 0 ) utm=12 stm=3 core=5 HZ=100 | stack=0x7fdb75e000-0x7fdb760000 stackSize=8MB | held mutexes= "mutator lock"(shared held) // java stack call information, you can view the call relationship and locate the specific location at ttt.push.InterceptorProxy.addMiuiApplication(InterceptorProxy.java:77) at ttt.push.InterceptorProxy.create(InterceptorProxy.java:59) at android.app.Activity.onCreate(Activity.java:1041) at miui.app.Activity.onCreate(SourceFile:47) at com.xxxx.moblie.ui.b.onCreate(SourceFile:172) at com.xxxx.moblie.ui.MainActivity.onCreate(SourceFile:68) at android.app.Activity.performCreate(Activity.java:7050) at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1214) at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2807) at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:2929) at android.app.ActivityThread.-wrap11(ActivityThread.java:-1) at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1618) at android.os.Handler.dispatchMessage(Handler.java:105) at android.os.Looper.loop(Looper.java:171) at android.app.ActivityThread.main(ActivityThread.java:6699) at java.lang.reflect.Method.invoke(Native method) at com.android.internal.os.Zygote$MethodAndArgsCaller.run(Zygote.java:246) at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:783)

I'll analyze some fields in traces.txt in detail to see what information it can provide us
Main: Main identifies the main thread. If it is a thread, it is named in the format of "Thread-X". X represents the thread id and increases gradually.
prio: thread priority. The default value is 5
tid: tid is not the ID of the thread, but the unique ID of the thread
Group: is the thread group name
sCount: the number of times the thread was suspended
dsCount: is the number of times the thread was suspended by the debugger
obj: object address
self: the Native address of the thread
sysTid: is the thread number (the thread number of the main thread is the same as the process number)
nice: is the scheduling priority of the thread
sched: indicates the scheduling policy and priority of threads respectively
cgrp: scheduling home group
handle: the address of the thread processing function.
State: is the scheduling state
Schedstat: from  / Read out proc/[pid]/task/[tid]/schedstat. The three values respectively represent the execution time of the thread on the cpu, the waiting time of the thread and the length of the time slice of thread execution. The three values that do not support this information are 0;
utm: is the time value used in the thread user mode (unit: jiffies)
stm: is the scheduling time value in kernel mode
Core: is the serial number of the cpu core that last executed this thread.

We are most concerned about the stack information of Java, which can locate the specific location. From the above traces, we can judge ttt.push.InterceptorProxy.addMiuiApplicationInterceptorProxy.java:77   This leads to ANR in com.xxxx.moblie. At this time, you can check the source code, find the problem and solve it.

Summarize the analysis process: first, we search am_anr to find the time point, process PID and anr type of anr, and then search PID to find the log about the first 5 seconds. Filter ANR IN to view CPU information, then view traces.txt, find java stack information, locate code location, and finally view the source code to analyze and solve problems. This process can basically find the reason for ANR sequence of events.

ANR case sorting

1, The main thread is lock ed by other threads, resulting in deadlock

waiting on <0x1cd570> (a android.os.MessageQueue)

DALVIK THREADS: "main" prio=5 tid=3 TIMED_WAIT | group="main" sCount=1 dsCount=0 s=0 obj=0x400143a8 | sysTid=691 nice=0 sched=0/0 handle=-1091117924 at java.lang.Object.wait(Native Method) - waiting on <0x1cd570> (a android.os.MessageQueue) at java.lang.Object.wait(Object.java:195) at android.os.MessageQueue.next(MessageQueue.java:144) at android.os.Looper.loop(Looper.java:110) at android.app.ActivityThread.main(ActivityThread.java:3742) at java.lang.reflect.Method.invokeNative(Native Method) at java.lang.reflect.Method.invoke(Method.java:515) at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:739) at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:497) at dalvik.system.NativeStart.main(Native Method) "Binder Thread #3" prio=5 tid=15 NATIVE | group="main" sCount=1 dsCount=0 s=0 obj=0x434e7758 | sysTid=734 nice=0 sched=0/0 handle=1733632 at dalvik.system.NativeStart.run(Native Method) "Binder Thread #2" prio=5 tid=13 NATIVE | group="main" sCount=1 dsCount=0 s=0 obj=0x1cd570 | sysTid=696 nice=0 sched=0/0 handle=1369840 at dalvik.system.NativeStart.run(Native Method) "Binder Thread #1" prio=5 tid=11 NATIVE | group="main" sCount=1 dsCount=0 s=0 obj=0x433aca10 | sysTid=695 nice=0 sched=0/0 handle=1367448 at dalvik.system.NativeStart.run(Native Method) ----- end 691 -----

2, The main thread does time-consuming operations, such as database reading and writing.

"main" prio=5 tid=1 Native held mutexes= kernel: (couldn't read /proc/self/task/11003/stack) native: #00 pc 000492a4 /system/lib/libc.so (nanosleep+12) native: #01 pc 0002dc21 /system/lib/libc.so (usleep+52) native: #02 pc 00009cab /system/lib/libsqlite.so (???) native: #03 pc 00011119 /system/lib/libsqlite.so (???) native: #04 pc 00016455 /system/lib/libsqlite.so (???) native: #16 pc 0000fa29 /system/lib/libsqlite.so (???) native: #17 pc 0000fad7 /system/lib/libsqlite.so (sqlite3_prepare16_v2+14) native: #18 pc 0007f671 /system/lib/libandroid_runtime.so (???) native: #19 pc 002b4721 /system/framework/arm/boot-framework.oat (Java_android_database_sqlite_SQLiteConnection_nativePrepareStatement__JLjava_lang_String_2+116) at android.database.sqlite.SQLiteConnection.setWalModeFromConfiguration(SQLiteConnection.java:294) at android.database.sqlite.SQLiteConnection.open(SQLiteConnection.java:215) at android.database.sqlite.SQLiteConnection.open(SQLiteConnection.java:193) at android.database.sqlite.SQLiteConnectionPool.openConnectionLocked(SQLiteConnectionPool.java:463) at android.database.sqlite.SQLiteConnectionPool.open(SQLiteConnectionPool.java:185) at android.database.sqlite.SQLiteConnectionPool.open(SQLiteConnectionPool.java:177) at android.database.sqlite.SQLiteDatabase.openInner(SQLiteDatabase.java:808) locked <0x0db193bf> (a java.lang.Object) at android.database.sqlite.SQLiteDatabase.open(SQLiteDatabase.java:793) at android.database.sqlite.SQLiteDatabase.openDatabase(SQLiteDatabase.java:696) at android.app.ContextImpl.openOrCreateDatabase(ContextImpl.java:690) at android.content.ContextWrapper.openOrCreateDatabase(ContextWrapper.java:299) at android.database.sqlite.SQLiteOpenHelper.getDatabaseLocked(SQLiteOpenHelper.java:223) at android.database.sqlite.SQLiteOpenHelper.getWritableDatabase(SQLiteOpenHelper.java:163) locked <0x045a4a8c> (a com.xxxx.video.common.data.DataBaseHelper) at com.xxxx.video.common.data.DataBaseORM.<init>(DataBaseORM.java:46) at com.xxxx.video.common.data.DataBaseORM.getInstance(DataBaseORM.java:53) locked <0x017095d5> (a java.lang.Class<com.xxxx.video.common.data.DataBaseORM>)

3, Too much binder data

07-21 04:43:21.573 1000 1488 12756 E Binder : Unreasonably large binder reply buffer: on android.content.pm.BaseParceledListSlice$1@770c74f calling 1 size 388568 (data: 1, 32, 7274595) 07-21 04:43:21.573 1000 1488 12756 E Binder : android.util.Log$TerribleFailure: Unreasonably large binder reply buffer: on android.content.pm.BaseParceledListSlice$1@770c74f calling 1 size 388568 (data: 1, 32, 7274595) 07-21 04:43:21.607 1000 1488 2951 E Binder : Unreasonably large binder reply buffer: on android.content.pm.BaseParceledListSlice$1@770c74f calling 1 size 211848 (data: 1, 23, 7274595) 07-21 04:43:21.607 1000 1488 2951 E Binder : android.util.Log$TerribleFailure: Unreasonably large binder reply buffer: on android.content.pm.BaseParceledListSlice$1@770c74f calling 1 size 211848 (data: 1, 23, 7274595) 07-21 04:43:21.662 1000 1488 6258 E Binder : Unreasonably large binder reply buffer: on android.content.pm.BaseParceledListSlice$1@770c74f calling 1 size 259300 (data: 1, 33, 7274595)

4, binder communication failed

07-21 06:04:35.580 <6>[32837.690321] binder: 1698:2362 transaction failed 29189/-3, size 100-0 line 3042 07-21 06:04:35.594 <6>[32837.704042] binder: 1765:4071 transaction failed 29189/-3, size 76-0 line 3042 07-21 06:04:35.899 <6>[32838.009132] binder: 1765:4067 transaction failed 29189/-3, size 224-8 line 3042 07-21 06:04:36.018 <6>[32838.128903] binder: 1765:2397 transaction failed 29189/-22, size 348-0 line 2916

8 October 2021, 00:43 | Views: 2239

Add new comment

For adding a comment, please log in
or create account

0 comments