feat(logging): persist and export application logs #814

nsakaimbo · 2020-03-23T14:17:31Z

Description

This series of PRs adds activity logs to the Desktop application, exportable via the ~~File~~ Help menu. User logs are archived and exported to the user's desktop, at which point they can be shared for troubleshooting/debugging purposes.

The log format follows industry-standard best practices of categorization by functional area and priority (info, debug, error and warn), and includes details of both key events and uncaught exceptions (with associated stack traces 🎉).

To-date, logging in the application has been limited solely to the developer environment via the npm debug module. The goals of realizing this feature are to (1) minimize disruption of the developer environment by maintaining behavior of the existing debug module, (2) extend logging to incorporate more useful format, and (3) make persisted user logs available for troubleshooting purposes.

[2020-03-20 16:17:46.594] [desktop:settings] [error] Failed to create settings file
[2020-03-20 16:17:46.597] [desktop:boot] [info] Booting WordPress.com v5.0.0
[2020-03-20 16:17:46.599] [desktop:settings] [info] Get default setting for proxy-type = 
[2020-03-20 16:17:46.600] [desktop:index] [info] Starting app handlers
[2020-03-20 16:17:46.627] [desktop:updater] [info] Updater config: {"downloadUrl":"https://apps.wordpress.com/desktop/","apiUrl":"https://api.github.com/repos/Automattic/wp-desktop/releases","delay":2000,"interval":43200000}
[2020-03-20 16:17:46.699] [desktop:server] [info] Checking server port: 41050 on host 127.0.0.1
[2020-03-20 16:17:46.632] [desktop:runapp] [info] Checking for other instances
[2020-03-20 16:17:46.597] [desktop:boot] [info] App Path: /Users/distiller/wp-desktop/release/mac/WordPress.com.app/Contents/Resources/app
[2020-03-20 16:17:46.598] [desktop:boot] [info] Server: http://127.0.0.1:41050

...

Motivation and Context

Troubleshooting issues reported by users has primarily been driven by educated (as much as possible) guesswork and back-and-forth to probe for more details about any given problem. In an effort to minimize this feedback loop and make debugging more efficient, users now have the option of exporting an archive of their logs to their desktop which they can attach to support requests.

Organization

This feature is staggered into three PRs to make code review more manageable.

PR 1. Add custom logging module (lib/logger)

PR 2. Replace usage of prior debug module with the custom logger

PR 3. Add log archival and export functionality via the application's "Help" menu

[1] WP-Desktop Custom Logger Implementation

This PR adds a custom logging solution to wp-desktop. The lib/logger implementation is a wrapper for the popular winston logging library, which provides a lot of great functionality and flexibility out-of-the-box. Key benefits of utilizing winston are (1) the ability to customize log formatting, (2) the ability to log to multiple transports and (3) capturing uncaught exceptions.

Key Features

Custom log format includes timestamp, functional area and priority
Log to both a file and, in a developer environment, to the console
Maintain use of the DEBUG environment variable to filter logs by functional area (e.g. DEBUG=desktop:*)
Maintain the same initialization API as prior debug module, i.e. log = require( 'lib/logger')( 'desktop:<function>')

The location of the application's log file during end-to-end testing is redirected to the test/logs folder.

dmsnell · 2020-03-23T16:40:28Z

desktop/env.js

+			throw err;
+		}
+	} );
+}


there shouldn't be a need for existsSync() right? We always want to make sure the directory is there and so adding the existsSync() only creates a race-condition where the directory could be there when it's called and gone by the time it's done, leaving us in the same state as if we just assumed it was there.

mkdirSync() shouldn't fail if the directory already exists.

const logPath = process.env.WP_DEBUG_LOG || path.join( appData, 'logs', 'wp-desktop.log' ); mkdirSync( path.dirname( logPath ), { recursive: true }, error => { … } );

Thanks - good catch. Addressed in d7d3aea.

Interestingly enough, a plainmkdirSync( ... { recursive: true} ) will throw an error (code EEXIST) if the directory already exists (at least, it does in CI). Updated with try/catch in 029cb26 and it's working again as expected.

dmsnell · 2020-03-23T16:45:49Z

desktop/lib/state/index.js

+
+State.prototype.getLogPath = function( ) {
+	return this.logPath;
+};


Would these be necessary if we moved the logging directory logic into the logging module? It doesn't look obvious that it needs to occur in desktop/env.js especially since this logic is specific to the logger and there's a logger module. Anything that relies on the logger module will pull it in and that module-init code that we write will run before anything can call it, so we still have the ability to initialize things before any logging call could be made.

Yeah I'm not thrilled about this either, but this seemed like a reasonable choice given the alternatives. The logs path needs to be set relative to the user's userData directory, but the userData directory itself is overriden early in the application's lifecyle in env.js.

/** * Initialize core components */ const state = require( './lib/state' ); const config = require( './lib/config' ); const appData = path.join( app.getPath( 'appData' ), config.appPathName ); // Initialize log directory prior to requiring any modules that log const logPath = process.env.WP_DEBUG_LOG ? process.env.WP_DEBUG_LOG : path.join( appData, 'logs', 'wp-desktop.log' ); mkdirSync( path.dirname( logPath ), { recursive: true }, ( err ) => { if ( err ) { err.message = 'Failed to initialize log directory: ' + err.message; throw err; } } ); state.setLogPath( logPath ); // Initialize settings const Settings = require( './lib/settings' ); // Catch-all error handler // We hook in very early to catch issues during the startup process require( './app-handlers/exceptions' )(); // if app path set to asar, switch to the dir, not file var apppath = app.getAppPath(); if ( path.extname( apppath ) === '.asar' ) { apppath = path.dirname( apppath ); } process.chdir( apppath ); process.env.CALYPSO_ENV = config.calypso_config; // Set app config path app.setPath( 'userData', appData ); if ( Settings.isDebug() ) { process.env.DEBUG = config.debug.namespace; }

It's definitely awkward having lib/logger know in advance about "where the overidden userData directory is", and it would be an easy footgun if the path to userData ever changes and we forget to change the path buried in lib/logger as well. Having this path set in env.js, next to the userData override, is far from perfect but makes it a little more obvious what's going on.

The prior implementation (that wasn't really implemented) seemed to have the log path set in the application's config (ref). This seems like a reasonable choice as well, save for the fact that the config values are set at buildtime, not runtime, so a path relative to userData in the config object would also smell a little funny.

dmsnell

Did we consider replacing debug's logging mechanism? Do we need the additional functionality that Winston brings? If we're mostly trying to expose the logs then we could probably shrink the scope of this work by augmenting debug instead, plus I think that would allow us to grab logs for any subsystem in Calypso, not just the ones we specifically convert.

My guess is that getting the logs as they are is probably 1000x better than not getting any and getting additional structures is about 1200x better. Obviously there's no evidence yet but in terms of adding a new dependency without a clear need I'd rather question if it's needed before adding it.

If you have seen it already and know it's super helpful then please go ahead and introduce it. That being said, please make sure that there's good integration with our existing debug calls as well.

It doesn't look like this PR actually does any logging. While this may help separate the concerns on the PR it'd be nice to more clearly see what it's supposed to be doing. Can we compare this to the existing logging that it looks like is supposed to be happening as evidenced in the deleted code from desktop/env.js?

What were the tradeoffs in this work vs. adding a menu option to download the existing log data?

nsakaimbo · 2020-03-24T21:34:17Z

Did we consider replacing debug's logging mechanism?

This feature replaces debug in the Desktop application. In order to make the diff a little easier to navigate, the feature is broken into 3 PRs. This PR is scoped to the implementation of lib/logger. The swap of debug for our own implementation is done in subsequent PR #815, and finally the export of persisted logs is handled in #816.

... please make sure that there's good integration with our existing debug calls as well

Yep. One of the goals of the implementation is to maintain parity with the behavior of debug . By wrapping winston, we maintain all of debug's key behaviors:

Same initialization API when replacing the debug calls (example)
Namespacing of log items by function
Logging to console in a development environment (e.g. when running make dev)
Usage of the DEBUG environment variable for log filtering

We preserve the functionality of debug mentioned above while going beyond those basic functions by adding persistence, additional logging elements (e.g. severity, localized timestamps) and custom formatting which make log inspection significantly more helpful for debugging purposes.

It doesn't look like this PR actually does any logging.

To reduce noise and make these changes easier to review, this PR is solely focused on the logger implementation. The actual replacement of debug with this implementation is done in #815 (here is an example of the swap).

For context, here is the diff in its entirety, which you're more than welcome to inspect. IMO conceptualizing and discussing changes is a little easier when changes can be broken up this way.

What were the tradeoffs in this work vs. adding a menu option to download the existing log data?

The key limitation of the debug module is that log formats cannot be customized, whereas winston is highly flexible in this regard. winston provides the following basic functionality that debug doesn't give us (at least, not out of the box).

Being able to categorize log items according to severity and functional area makes logs significantly more helpful and easier to navigate with even simple text search. Being able to add a local timestamp is also significant, as we can inspect logs relative to when a user says they experienced an issue.
winston supports multiple transports, meaning we can support both file output and console logging in a dev environment (see my earlier point about preserving all of the behaviors we got with debug.)
Forward looking: if we decided that another format might serve us better in the future, e.g. structured logs, this would be relatively trivial to implement with winston.

If you have seen it already and know it's super helpful then please go ahead and introduce it ... Can we compare this to the existing logging that it looks like is supposed to be happening as evidenced in the deleted code from desktop/env.js?

The availability and formatting of debug logs in this manner is fairly standard and provides significant value when debugging. Anything that reduces the feedback cycle between our users' reporting an issue and our ability to identify/reproduce a problem is a step in the right direction.

Log formatting with debug is relatively limited (not to mention clumsy to customize any further):

desktop:index Starting app handlers +0ms
desktop:index Waiting for app window to load +69ms
desktop:server Checking server port: 41050 on host 127.0.0.1 +63ms
desktop:server Server started +6ms

Log format looks like so:

[2020-03-20 16:17:46.594] [desktop:settings] [error] Failed to create settings file
[2020-03-20 16:17:46.597] [desktop:boot] [info] Booting WordPress.com v5.0.0
[2020-03-20 16:17:46.599] [desktop:settings] [info] Get default setting for proxy-type = 
[2020-03-20 16:17:46.600] [desktop:index] [info] Starting app handlers
[2020-03-20 16:17:46.627] [desktop:updater] [info] Updater config: {"downloadUrl":"https://apps.wordpress.com/desktop/","apiUrl":"https://api.github.com/repos/Automattic/wp-desktop/releases","delay":2000,"interval":43200000}
[2020-03-20 16:17:46.699] [desktop:server] [info] Checking server port: 41050 on host 127.0.0.1
[2020-03-20 16:17:46.632] [desktop:runapp] [info] Checking for other instances
[2020-03-20 16:17:46.597] [desktop:boot] [info] App Path: /Users/distiller/wp-desktop/release/mac/WordPress.com.app/Contents/Resources/app
[2020-03-20 16:17:46.598] [desktop:boot] [info] Server: http://127.0.0.1:41050
// ...

Nothing new here and this is a fairly common format (see VSCode, for example). Side note: I don't think the millisecond difference item is particularly useful, but this is also trivial to add with winston.

To inspect log output for yourself, you can download an artifact from #815 (or build that branch locally) (instructions).

... [ability to] grab logs for any subsystem in Calypso, not just the ones we specifically convert

A couple of points worth noting:

Logging in the preload script is currently (1) limited in scope and (2) implemented with IPC to mitigate any performance overhead. If we decide on logging certain items from within Calypso at a later date, this would be the recommended way to go. That being said ...
winston does offer some flexiblitly in this regard, in that we can replace Calypso's calls to debug or console.log. If we go in that direction, we'd have to carefully consider and measure impact on application responsiveness. IMO we should be very deliberate about any log items we want to capture from Calypso, and send those via IPC. (Edit: worth pointing out that applications like Slack and VSCode don't combine browser and node process logging, and for good reason.)

dmsnell · 2020-03-24T23:01:25Z

This feature replaces debug in the Desktop application. In order to make the diff a little easier to navigate, the feature is broken into 3 PRs. This PR is scoped to the implementation of lib/logger. The swap of debug for our own implementation is done in subsequent PR #815, and finally the export of persisted logs is handled in #816.

I may not have communicated what I wanted. I understand that you're replacing debug - that was very clear. What I was asking is if you considered using debug's standard approach to replacing the logging mechanism it uses internally. It's a trivial interface.

winston supports multiple transports, meaning we can support both file output and console logging in a dev environment (see my earlier point about preserving all of the behaviors we got with debug.)

For example, debug is not limited in this way and winston isn't providing functionality we don't already have available.

Forward looking: if we decided that another format might serve us better in the future, e.g. structured logs, this would be relatively trivial to implement with winston.

Not something I'm worried about until we actually have a need for it. You can probably persuade me that winston is more feature-packed. I'm asking about if we have the need now and if it warrants adding another logger into our mix. Although our API supports debug statements the way we're used to, what if we take advantage of winstons functionality and then try to go back to debug?

Put another way, what about winson is necessary for the goal. I for one know that in my experience the logging levels have mostly been useless when I'm trying to debug. In most of my scouting the log levels have been arbitrary and left me having to treat them all the same particularly because different developers use different subjective guidelines for determining which level they should use to report an event.

Also on local time I consider that more of a liability than a help due to the way that local clocks can change so dramatically. As I write this comment I've opened the console and run new Date(), changed my timezone, and run new Date() again - I do get the timezone but the actual numbers change willy-nilly in accordance to system time. We can know the local timezone offset if we want it, if it helps, by printing it when the app boots. Has the lack of local time been a hindrance to us so far? Has it made things confusing to diagnose?

adding persistence

Can you elaborate here? Based on the deleted code it looks like debug was supposed to already be writing to a file? Was that a wrong inference from the code? Was something broken preventing our existing system from actually writing to that file?

In order to make the diff a little easier to navigate, the feature is broken into 3 PRs

In this case for me, at least since this involves changes whose implications lie beyond this PR I'm not finding it easier to understand. It's fine to leave them separate, but maybe at the same time the work itself is bigger in scope than it needs to be if it's hard to contain in a single PR. I know logging extends throughout the whole app so maybe this isn't proper to consider it the case here. Just responding to the comment from a personal take (the PRs are now separate but I have to open three tabs to review any one of them 😉)

nsakaimbo · 2020-03-25T00:48:11Z

Although our API supports debug statements the way we're used to, what if we take advantage of winstons functionality and then try to go back to debug

Sure, down the road we may decide to use another logging solution. In the case of debug specifically, part of the intent in keeping the API as close to the original as possible would be to make the revert relatively straightforward (that, and I wanted to maintain the namespacing feature, which I think we should retain regardless of logging implementation). However I believe I've stated the case for the fact that we're adding to, and not losing, any of the functionality with debug.

I for one know that in my experience the logging levels have mostly been useless when I'm trying to debug

YMMV, but I strongly disagree with this point, based on my own experience 🤷‍♂. A log level may incorporate some human judgement, but at the end of the day it's one more data point that, in addition to others, can make inspecting and filtering logs a lot less tedious and provide some hint to context/intent. If an error is thrown (or is uncaught), that's a fairly obvious categorization. Also, there is a strong case to be made for warnings w.r.t unexpected conditions. There's some judgement involved to categorization, sure, but those judgements can and should be made collectively as part of PR review w.r.t the developer's/team's priorities.

Also on local time I consider that more of a liability than a help due to the way that local clocks can change so dramatically ... As I write this comment/changed my timezone ...

Well, you did change your timezone. 😉Generally speaking, we wouldn't expect that kind of deliberate manipulation prior to users sharing their logs, so I would strongly disagree with this argument as well. Sure, there may be cases in which a timestamp may not be reflective of the user's locale, but generally speaking we can expect this to be representative. Again, this is something I've found helpful in my own experience as it's been another data point that's helpful in tracing user issues (e.g. if a user mentions a specific timeframe).

In summary, I don't expect any element to be perfect on its own, but the more information/context I have to debug a problem, the better. From my own experience, it would be more tedious not to have any additional context such as severity or timestamps(!). (Also, emphasis on user logs, as this is different from viewing streaming output in your console for development purposes, in which case you may/may not care as much about a timestamp).

I'm also going to emphasize that we're not re-inventing the wheel here. Logs for various applications we use every day (include other Automattic applications) incorporate exactly these items, and for good reason.

multiple PRs vs single

I'm fine with making this one giant PR. However I often find that large features/diffs tend to result in less thorough reviews, as the sheer volume of the changes can be overwhelming (which is understandable). Either approach has its limitations, and chances are I would be prompted to break things up if I went with One Giant PR™.

That being said, let me know and it would be no problem to close these initial PRs and simply point the final one at develop. 👍

Based on the deleted code it looks like debug was supposed to already be writing to a file?

Yep, it seems that was the intent at some point earlier in the application's history, but that idea wasn't fleshed out and no file was being written to whatsoever. It looks like part of the goal was to upload logs with crash reports, however crash reporting wasn't fully implemented. If and when we do get around to it, we now have logs we can upload with our crash reports (if we so choose).

dmsnell · 2020-03-25T01:02:21Z

Generally speaking, we wouldn't expect that kind of deliberate manipulation prior to users sharing their logs

😅

However I believe I've stated the case for the fact that we're adding to
we're not re-inventing the wheel here

Perfectly clear. In fact, this is the primary reason I'm hesitant about these changes. In this
case we are reinventing the wheel since we're replacing the existing logging functionality
with one that presumably does the same thing but in a completely different way 😉

I believe that the the size of this PR is probably a result of rebuilding the existing system
instead of using it and expanding it.

Yep, it seems that was the intent at some point earlier in the application's history, but that idea wasn't fleshed out and no file was being written to whatsoever.

Sounds like that was a bug and probably part of why this has been more work than it should have been. Would have been nice if we could have done this all in one small PR adding a menu option to download that file.

Based on your responses I don't think any of your original points were unclear or miscommunicated. In short, my feedback is that it seems like this work could be accomplished with much less by relying on debug and what's already there but I do the work to see how much it would take.

If you didn't explore using debug or try to reuse what's already built then I would encourage you to at least try. Otherwise it's your call.

nsakaimbo · 2020-03-25T23:27:17Z

I briefly explored using debug, but I encountered a number of low-hanging limitations w.r.t customizing log output that were cumbersome to work with. Even something as simple as generating and formatting a timestamp is clumsy at best (see here and here for examples). The same goes for adding custom elements such as severity. There's an open "master" issue against the debug repo which aggregates a lot of these customization pain points.

My assessment of debug is it's a tool that's primarily designed for use in a developer console or terminal, and not so much a tool that has an end-user environment in mind. Case-in-point: as far as I can tell, the library requires the DEBUG environment variable to be set for any logging to be output. (IMO this in and of itself is a deal breaker for our use case.)

There's probably clever hacks to all of the above issues. However there are clear benefits to a solution that addresses most (if not all) all of these items out-of the-box, while preserving behaviors we've been using to-date.

Regardless of underlying library (winston or debug), the logging interface requires updating in order to support log levels (e.g. replacing debug(...) with log.info(...)). This means all logging calls in the application would follow suit, and the extent of the diff would largely be the same. Going forward, however, an advantage of implementing a logging wrapper in this way is we can swap out logging libraries relatively seamlessly (i.e. isolate changes only to lib/logger).

belcherj

I'm happy with the justifications made for replacing debug. If we want we can re-evaluate in a future PR.

[logging] Add custom logger

1640464

nsakaimbo requested a review from a team March 23, 2020 14:17

nsakaimbo mentioned this pull request Mar 23, 2020

feat(logging): replace debug with custom logger #815

Merged

[logging] Fix typo in namespaces docs

3facc86

nsakaimbo added the [Status] Needs Review label Mar 23, 2020

nsakaimbo changed the title ~~[Feature] User Activity Logs [1]~~ [Feature] User Application Logs [1] Mar 23, 2020

dmsnell reviewed Mar 23, 2020

View reviewed changes

nsakaimbo added 2 commits March 26, 2020 10:24

Avoid existsSync race condition

d7d3aea

Don't use underscore to mark private

da7b276

nsakaimbo mentioned this pull request Mar 26, 2020

feat(logging): archive and export logs [3] #816

Merged

nsakaimbo added 5 commits March 26, 2020 14:34

Check for EEXIST error code

cedac6f

Try dir create w/ try-catch

717944e

Simplify mkdirSync call

8e6a540

Try with try/catch

029cb26

Let winston initialize log directory

929d053

nsakaimbo changed the title ~~[Feature] User Application Logs [1]~~ feat(logging): persist and export application logs Mar 30, 2020

Merge branch 'develop' into feature/activity-logging/add-custom-logger

750980c

belcherj approved these changes Apr 1, 2020

View reviewed changes

feat(logging): replace debug with lib/logger (#815)

7cb5327

nsakaimbo merged commit d2044bd into develop Apr 1, 2020

nsakaimbo deleted the feature/activity-logging/add-custom-logger branch April 1, 2020 22:17

matticbot removed the [Status] Needs Review label Apr 1, 2020

nsakaimbo mentioned this pull request Apr 2, 2020

chore: fix a couple references broken from conflict resolution #828

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(logging): persist and export application logs #814

feat(logging): persist and export application logs #814

nsakaimbo commented Mar 23, 2020 •

edited

dmsnell Mar 23, 2020

nsakaimbo Mar 26, 2020

nsakaimbo Mar 26, 2020

dmsnell Mar 23, 2020

nsakaimbo Mar 26, 2020 •

edited

dmsnell left a comment

nsakaimbo commented Mar 24, 2020 •

edited

dmsnell commented Mar 24, 2020

nsakaimbo commented Mar 25, 2020

dmsnell commented Mar 25, 2020

nsakaimbo commented Mar 25, 2020

belcherj left a comment

feat(logging): persist and export application logs #814

feat(logging): persist and export application logs #814

Conversation

nsakaimbo commented Mar 23, 2020 • edited

Description

Motivation and Context

Organization

[1] WP-Desktop Custom Logger Implementation

Key Features

dmsnell Mar 23, 2020

Choose a reason for hiding this comment

nsakaimbo Mar 26, 2020

Choose a reason for hiding this comment

nsakaimbo Mar 26, 2020

Choose a reason for hiding this comment

dmsnell Mar 23, 2020

Choose a reason for hiding this comment

nsakaimbo Mar 26, 2020 • edited

Choose a reason for hiding this comment

dmsnell left a comment

Choose a reason for hiding this comment

nsakaimbo commented Mar 24, 2020 • edited

dmsnell commented Mar 24, 2020

nsakaimbo commented Mar 25, 2020

dmsnell commented Mar 25, 2020

nsakaimbo commented Mar 25, 2020

belcherj left a comment

Choose a reason for hiding this comment

nsakaimbo commented Mar 23, 2020 •

edited

nsakaimbo Mar 26, 2020 •

edited

nsakaimbo commented Mar 24, 2020 •

edited