Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(crash): Fix for #887 #1113

Closed
wants to merge 2 commits into from
Closed

fix(crash): Fix for #887 #1113

wants to merge 2 commits into from

Conversation

DickSmith
Copy link
Contributor

@DickSmith DickSmith commented Jul 16, 2018

https://www.telerik.com/account/support-tickets/view-ticket?threadid=1175125
#887

We enabled "markingMode: none" some month back which noticeably improve performance, however, there has been an increase in crashes that seem to appear related to these garbage collected objects when traversing the app rapidly.

All are caused by "Fatal Exception: com.tns.NativeScriptException"

They generally fall into a few different types:

  1. Calling js method onCreateView failed Error: java.lang.NoSuchMethodError: no non-static method
    Seen with multiple classes/methods:
    • "Landroid/widget/TextView;.setHint()Ljava/util/Set;"
    • "Lorg/nativescript/widgets/GridLayout;.addView()Z"
    • "Lcom/facebook/drawee/view/SimpleDraweeView;.setImageURI(Ljava/lang/Object;)Ljava/lang/Object;"
    • "Landroid/widget/TextView;.setPadding()Ljava/util/Set;"
    • "Landroid/animation/AnimatorSet;.clone()Ljava/util/Iterator;"
    • "Lorg/nativescript/widgets/GridLayout;.addView()Ljava/lang/Object;"
    • "Lorg/nativescript/widgets/GridLayout;.addView(Ljava/lang/Object;)Ljava/lang/Object;"
    • "Lorg/nativescript/widgets/GridLayout;.addView()Z"
  2. Attempt to use cleared object reference id=
  3. Calling js method onCreateView failed Error: java.lang.Exception: Failed resolving method addView on class
    Various classes:
    • Calling js method onCreateView failed Error: java.lang.Exception: Failed resolving method addView on class android.view.ViewGroup com.tns.Runtime.resolveMethodOverload(Runtime.java:1062) com.tns.Runtime.callJSMethodNative(Native Method) com.tns.Runtime.dispatchCallJSMethodNative(Runtime.java:1101)
    • Calling js method onCreateView failed Error: java.lang.Exception: Failed resolving method addView on class org.nativescript.widgets.GridLayout com.tns.Runtime.resolveMethodOverload(Runtime.java:1062) com.tns.Runtime.callJSMethodNative(Native Method) com.tns.Runtime.dispatchCallJSMethodNative(Runtime.java:1101)
  4. Calling js method onDismiss failed Error: java.lang.IllegalStateException: Activity has been destroyed android.app.FragmentManagerImpl.enqueueAction(FragmentManager.java:1460)
    4b. Calling js method onDismiss failed TypeError: Cannot read property 'setOnTouchListener' of null File: "<embedded>, line: 37, column: 522034 StackTrace: Frame: function:'t.onUnloaded',
  5. Possible others:
    • Unable to start activity ComponentInfo{com.ugroupmedia.pnp14/com.tns.NativeScriptActivity}: com.tns.NativeScriptException: Calling js method onCreate failed Error: Could not find a page for fragment3[3]. File: "<embedded>, line: 37, column: 640522 StackTrace: Frame: function:'c', file:'<embedded>', line: 37, column: 1307106 Frame: function:'L', file:'<embedded>', line: 37, column: 637193 Frame: function:'e.onCreate', file:'<embedded>', line: 37, column: 638611 Frame: function:'n.onCreate', file:'<embedded>', line: 37, column: 1690605 Frame: function:'e.onCreate', file:'<embedded>', line: 37, column: 640523 Frame: function:'n.onCreate', file:'<embedded>', line: 37, column: 1689023
    • Failure delivering result ResultInfo{who=null, request=64206, result=-1, data=Intent { (has extras) }} to activity {com.ugroupmedia.pnp14/com.tns.NativeScriptActivity}: com.tns.NativeScriptException: Calling js method onActivityResult failed Error: Cannot convert object to Lcom/google/android/gms/tasks/OnCompleteListener; at index 0 File: "<embedded>, line: 37, column: 729120 StackTrace: Frame: function:'onSuccess', file:'<embedded>', line: 37, column: 746398 Frame: function:'', file:'<embedded>', line: 37, column: 729121 Frame: function:'e.notify', file:'<embedded>', line: 37, column: 779309 Frame: function:'e.onActivityResult', file:'<embedded>', line: 37, column: 642514 Frame: function:'n.onActivityResult', file:'<embedded>', line: 37, column: 1689820
    • Calling js method run failed TypeError: Cannot read property 'getWindow' of undefined
    • Calling js method run failed Error: java.lang.IllegalStateException: Can not perform this action after onSaveInstanceState android.app.FragmentManagerImpl.checkStateLoss(FragmentManager.java:1411)

While all, or none of these may not all be related to the garbage collection, I though I would just include some of them for now.

I think the main issue is that the method may through an error, but is never caught, and many of these are occurring in asynchronous contexts. Strangely enough, the same method when not called on the main thread (for workers?) is silently caught but no logging is preformed, which may explain some difficulty I've encounter in the pas when workers have failed silently; so this PR catching and logging both. (I wasn't sure if there is a preference for logging, so I included both the logger.write and e.printStackTrace().)

@darind
Copy link
Collaborator

darind commented Jul 16, 2018

The biggest issue with markingMode: none is that all plugins used by the application must be written with it in mind. We have tested the {N} core modules being compatible with this mode, but we cannot guarantee that other plugins will respect it (by properly storing objects that need to persist longer in higher scopes).

Globally swallowing exceptions in the runtime could leave the application in a completely unpredictable state, and lead to even more exceptions and crashes (in the unmanaged C++ code that cannot be caught at all). Even worse, it will make the application behave in an unexpected way and might provide wrong results to the end user.

For this reason we do not recommend wrapping those method calls in try/catch statements by default, but rather make it optional. For example there could be a switch in package.json which would enable this exception swallowing mode and only application developers willing to use it could enable it.

@etabakov
Copy link
Contributor

Thanks for chiming in @darind. I think your suggestion aligns very well with what we have already discussed with @vakrilov and @NathanWalker - to have two modes for reporting errors. One for release when exception are caught and fail slightly and are properly logged and one for debug when errors are actually thrown and visible to the developer.

@DickSmith
Copy link
Contributor Author

Hey @darind @etabakov,
Just to clarify, all this PR is doing is add the logging to line 1112 so that it doesn't fail silently, and then doing the same for the invocation of the method on the line 1101.

Whether this is a global solution, or enabled/disabled via a flag or the build type, I believe both invocations of this method should likely be handled consistently.

This makes development more efficient as now both failures on and off the main thread are logged to the console, which is easier to navigate and copy/paste from than the black-screen Error Activity, while also not crashing the user unnecessarily in production, which leads to poorer user perception, poorer ratings, and more app uninstalls.

Our internal QA has reported no adverse affects to this change and new or additional crashes. We will also be releasing this change in isolation into production today and will continue to monitor and report any additional results.

I'm also going to try a build that will report caught exceptions as non-fatals to Crashlytics using Timber or similar: https://medium.com/@sembozdemir/using-timber-with-crash-reporting-tools-931eafd1296c

@darind
Copy link
Collaborator

darind commented Jul 19, 2018

@DickSmith, in addition to adding logging, this PR is swallowing the exception, which is a change in the behavior of the runtime. This behavioral change can also be observed by the corresponding failing unit test which expects that the application will crash if you attempt to invoke a non-existent method.

If your initial intent was to only add logging, please do not hesitate to rethrow the original exception from the catch clause tha you added.

As far as whether swallowing exceptions in a production application is concerned or leaving the application to crash, I have already expressed my opinion - I would have preferred this to be left as a choice of the developper by introducing a config option rather than making it the default behavior

@NathanWalker
Copy link
Contributor

NathanWalker commented Jul 19, 2018

@darind The intent is definitely to change the behavior of the runtime since it's bug and crash inducing and not helpful to even developers since a crash in even local dev forces a developer to relaunch the runtime which causes undue development time delays as well as great team frustration as well as reduction in management confidence in NativeScript as an effective development solution to any product.

Not sure how that's not clear from @DickSmith intent in this PR and his description as to the details.

The unit test should change to effectively test a valid way to handle this situation.

Certainly the proposal and discussion set forth by @vakrilov and myself is acceptable. However I'm questionable even on why the default would be to crash rather than to handle appropriately to allow a developer to do his job effectively.

@darind Your comments leave me to question a lot here. Your opinions appear to suggest that it is your intent to crash as a default and I'm left wondering how under any circumstances that would be good for the developer or any framework user for that matter.

@NathanWalker
Copy link
Contributor

NathanWalker commented Jul 19, 2018

@darind if there's something I'm missing perhaps an example case may help. The closest I can think of is a EXC_BAD_ACCESS exception in XCode for instance. The debugger stops dead in XCode when that occurs. A developer needs to stop the runtime and relaunch to start debugging again. This is good and makes sense however with NativeScript could the runtime not catch those since we have the luxury of doing so, allow developer to see it in log - make adjustment, save and have livesync pick it up and continue developing or maybe I'm missing something?

@DickSmith
Copy link
Contributor Author

@darind
To be clear, why is this same method call caught silently on line 1112, but not on line 1101?

I have no idea at all why line 1112 is caught without so much as a log, but more so, if there are any concerns about possible unpredictable states or C++ exceptions, then why catch line 1112 at all, as the same should be true there, as well?

Line 1112 is not only being 'swallowed', but completely hidden from the developer.

The intent of this PR is to handle both the same way, and clearly notify the developer of any failures. If a try is fine on line 1112 it should be fine on line 1101. If it isn't fine on 1101 then it isn't fine on 1112.

@darind
Copy link
Collaborator

darind commented Jul 19, 2018

@DickSmith, on line 1112, there’s a try/finally statement, which is quite different from a try/catch. A finally statement will not prevent the exception from propagating. It will just ensure that the block in this statement will execute before crashing, just like it would do with line 1101.

@NathanWalker, I see your point about the developer experience. That’s why I suggest adding this as an option to the runtime. My concern with making it the default behavior is that it might hide very subtle bugs which might only be discovered by the users in production.

@NathanWalker
Copy link
Contributor

NathanWalker commented Jul 19, 2018

@darind I think we've gotten to the clarity we've all been after. Thanks for listening.

My concern with making it the default behavior is that it might hide very subtle bugs which might only be discovered by the users in production.

This is actually precisely our concern. You don't want user's to discover crashes in production in a mobile app if they don't cause "irreversible damage to users data or device". None of the stacks that @DickSmith has demonstrated would fall in that category and in fact all of the production crashes I've ever seen in my ~3 years development with {N} (I think - been long time now I think) have ever fallen into that category and would beg you to understand would ever fall in that category to be quite frank. Which is really why "crash by default" is never acceptable in production. It is in fact the #1 problem area of {N} in production.

You want desperately for production end users to encounter weirdness over crashes 100% of the time. Period.

That is exactly the defining line behind a review like:

"Crashed my phone. Removed app immediately. Don't install. 1 star".
vs.
"Looks promising. Hope they work out the bugs, little unstable. 2 stars".

@DickSmith
Copy link
Contributor Author

@darind
While Android doesn't have a review process, platforms that do, such as Apple or game consoles, require full stability and no crashes, even hard to reproduce ones. If they encounter any crashes during their testing they will bounce the app and you'll need to resubmit a new one and just hope they won't encounter it on the next review.

This is because crashes are toxic to user perception, not only of the app, but of the platform. While Google may be the exception here and not care, app developers do care and will avoid frameworks that are unable to fail elegantly for random edge cases.

Even non-technical users know that app crashes are bad. They may not know everything that your app intended to do, so some failures may be invisible, but they do know it didn't intend to crash. No app should intend to crash, and no framework should intend that to be the 'default' behaviour; once an app has compiled and launched, maintaining stability, even over full functionality, should always be the priority.

@vtrifonov
Copy link
Contributor

vtrifonov commented Jul 20, 2018

@DickSmith @NathanWalker I totally agree that throwing exceptions in production is not something that anyone would want. However especially when developing I would prefer my app to crash and show me the exception immediately as I may not see in the console that there's an error in the application.
We have added a flag for this so that you can use it instead of changing the runtime, so please let us know how it behaves. You can check this PR: #1120. It is available in the android@next version.
The flag is disabled by default, so if you want to enable it you should add in your app/package.json file this:

{
	...
	"android": {
		...
		"discardUncaughtJsExceptions": true
	}
}

@DickSmith
Copy link
Contributor Author

DickSmith commented Jul 20, 2018

@vtrifonov
This is excellent. :D I will update our testing app to use this today with the new flag.

Also, I released a version yesterday of our app containing a custom build of the runtime with my PR changes, so I'll keep the support ticket up-to-date with the results as they come in (more user activity on weekends).

Since #1120 is the same as this PR, but with the config flags added, I will close this PR.

@DickSmith DickSmith closed this Jul 20, 2018
@vtrifonov
Copy link
Contributor

@DickSmith , @NathanWalker just to let you know that the name of the setting is changed from autoCatchJSMethodNativeCalls to discardUncaughtJsExceptions which seems more clear.

@vtrifonov
Copy link
Contributor

@DickSmith, @NathanWalker one more update :) we have moved the setting in the **app/package.json ** root instead of in the android section as we have implemented the same functionality in ios runtime.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants