The roads I take...

KaiRo's weBlog

Mai 2016
1
2345678
9101112131415
16171819202122
23242526272829
3031

Zeige Beiträge veröffentlicht im Mai 2016 und mit "stability" gekennzeichnet an. Zurück zu allen aktuellen Beiträgen

Populäre Tags: Mozilla, SeaMonkey, L10n, Status, Firefox

Verwendete Sprachen: Deutsch, Englisch

Archiv:

Juli 2023

Februar 2022

März 2021

weitere...

16. Mai 2016

Tools I Wrote for Crash (Stats) Analysis

Now that I'm off the job that dominated my life (and almost burned me out) for the last years, I finally have some time again to blog. And I'll start with stuff I actually did for that job, as I still am happy to help others to continue from where I left.

The more fun part of the stability management job was actually creating new analysis - and tools. And those tools are still helpful to people working on crash analysis or crash stats analysis now - so as my last task on the job, I wrote some documentation for the tools I had created.

One of the first things I created (and which was part of the original job description when I started) was a prototype for detecting crash "explosiveness", i.e. a detector for crashes that are rising significantly in volume. This turned out to be quite helpful for me and others to use, and the newest reports of it are listed in my Report Overview. I probably should talk about it in more detail at some point, but I did write up a plan on the wiki for the tool, and the (PHP) code is on hg.m.o (that was the language I knew best and gave me the fastest result for a prototype). I had plans to port/rewrite it in python, but didn't get to it. Calixte, who is looking after most of "my" tools now, is working on that though, and I have already promised to review his work as a volunteer so we can make sure we have this helpful capability in better code (and hopefully better UI in the end) for future use.

In general, I have created one-line docs for all the PHP scripts I had in the Mercurial repository, and put them into the run-reports script that is called by a daily cron job. Outside of the explosiveness script, most of those have been obsoleted by Socorro Super Search (yay for Adrian's work and for the ElasticSearch backend!) nowadays.

Also, the scripts that generate the summed-up data for Are We Stable Yet dashboard and graphs (also see an older blog post discussing the graphs) have been ported to python (thanks Peter for helping me to get started there) - and those are available in the Magdalena repository on GitHub. You'll see that this repository doesn't just have more modern code, using python instead of PHP and the public Socorro API instead of private PostgreSQL access, it also has a decent README documenting what it and every script in it does. :)

The most important tools for people analyzing crash stats are in the Datil repository on GitHub (and its deployment on crash-analysis), though. I used all those 4 dashboards/tools daily in the last months to determine what to report to Release Managers and other parties, find out what we need to file as bugs and/or push to get fixed. Datil, like Magdalena, has good docs right in the repository now, readable directly on GitHub.

So, what's there?
Well, the before-mentioned "Are We Stable Yet" dashboard and graphs, for sure (see the longtermgraph docs for what graphs you can get and a legend of what the lines mean).
There's also a tool/prototype for "what's important" weighed top crash lists that I called "Top Crash Score", see the score docs for what it does and examples on how to use that tool.
And finally, I created a search query comparison tools that did let me answer questions like "which crashes happen more with or without multi-process support (e10s) being active?" or "which crashes have vanished with the new beta and which have appeared (instead)?" - which was incredibly helpful to me at least. Read the searchcompare docs for more details and examples.

I probably won't spend a lot of time with those tools any more, neither in usage nor in development, but I'm still happy about people using them, giving me feedback, and I'm also happy to review and merge pull requests that feel like making sense to me!

Von KaiRo, um 22:33 | Tags: analysis, CrashKill, explosiveness, Mozilla, stability | keine Kommentare | TrackBack: 0

4. Mai 2016

Projects Done, Looking For New Ones

I haven't been blogging much recently, but it's time to change that - like multiple things in my life that are changing right now.

I'll start with the most important piece first: My contract with Mozilla is ending in a week.

I had been accumulating frustration with pieces of my role that were founded in somewhat tedious routine like the whack-a-mole on crash spikes which was not very rewarding as well as never really giving time to breath and then overworking myself trying to get the needed success experiences in things like building dashboards and digging into data (which I really liked).
Being very passionate about Mozilla's Mission and Manifesto and identifying with the goals of my role I could for years paper over this frustration and fatigue but it kept building up in the background until it started impairing my strongest skill: communication with other people.

So, we had to call an end to this particular project - a role like this is never "finished", but it's also far from "failed" as I accomplished quite a bit over those 5 years, in various variants of the role.

After some cooldown and getting this out of my system, I'm happy to take on a new role of project management, possibly combined with some data analysis, somewhere, hopefully in an innovative area that aligns with my interests and possibly my passion for people being in control of their own lives.

As for Mozilla, no matter if an opportunity for work comes up there, I will surely stay around in the community, as I was before - after all, I still believe in the project and our mission and expect to continue to do so.

In other project management news, I just successfully finished the project of taking over my new condo and move in within a week. It took quite some coordination and planning beforehand, being prepared for last-minute changes, communicating well with all the different involved people and making informed but swift decisions at times - and it worked out perfectly. Sure, to put it into IT terms, there are still a few "bugs" left (some already fixed) and there's still a lot of followup work to do (need more furniture etc.) but the project "shipped" on time.

I'm looking forward to doing the same for future work projects, wherever they will manifest.

Von KaiRo, um 16:51 | Tags: burnout, CrashKill, Mozilla, project management, stability, stress | keine Kommentare | TrackBack: 0

Feeds: RSS/Atom