Monday, 19 February 2018

Re: More diagnostics data from desktop

Will Cooke wrote on 14/02/18 15:22:
> We want to be able to focus our engineering efforts on the things that
> matter most to our users, and in order to do that we need to get some
> more data about the sort of setups our users have and which software they
> are running on it.
> We would like to add a checkbox to the installer, exact wording TBD,
> but along the lines of "Send diagnostics information to help improve
> Ubuntu".  This would be checked by default.

I've just drafted a design for this. It's
basically a subset of the System Settings screen.

> The result of having that box checked would be:
> * Information from the installation would be sent over HTTPS to a
> service run by Canonical's IS team.  This would be saved to disk and
> sent on first boot once there is a network connection.

Information from the installation would be fascinating, for improvement
of the installer in particular. However, I don't think it would give you
an accurate idea about the "sort of setups our users have", for
improvement of Ubuntu in general. It could lead you to think that, for
example:

* Internet connection is less common than it really is. (Because of
things like proxies, as mentioned by Ernst Sjöstrand, or
not-yet-installed wi-fi drivers. And because if people still can't
get online later, they might uninstall Ubuntu and you'll never get
a report.)

* Wired Internet connections are more common than they really are.
(Because they're being used temporarily during installation, while
wi-fi isn't working.)

* Typical screen resolution is lower than it really is. (Because
people don't tweak the resolution until after installation. And
because if they fail to do so, they might uninstall Ubuntu,
resulting in a report for a system that soon stops existing.)

* Bluetooth devices are much less common than they really are.

I think it would be much more interesting to measure these things month
by month.

>   The file
> containing this data would be available for the user to inspect.
> That data would include:
>    * Ubuntu Flavour
>    * Ubuntu Version
>    * Network connectivity or not

If I understand "on first boot once there is a network connection", that
would exclude devices that were offline until the second startup or later.

>    * CPU family
>    * RAM
>    * Disk(s) size
>    * Screen(s) resolution
>    * GPU vendor and model
>    * OEM Manufacturer
>    * Location (based on the location selection made by the user at
> install).  No IP information would be gathered
>    * Installation duration (time taken)
>    * Auto login enabled or not
>    * Disk layout selected
>    * Third party software selected or not
>    * Download updates during install or not
>    * LivePatch enabled or not
> * Popcon would be installed.  This will allow us to spot trends in
> package usage and help us to  focus on the packages which are of most
> value to our users.

This effectively singles out .deb package installation as the only thing
that should be reported periodically, with everything else reported
one-off. Is that just for ease of implementation, or is there a reason
not to report the other things periodically too?

For example, if we could see how often people change their reported
location, we'd have info on how accessible the time zone UI should be.
And if it turns out that only a tiny fraction of Livepatch users turn it
on during install, vs. afterwards, that would influence future installer

> Any user can simply opt out by unchecking the box, which triggers one
> simple POST stating, "diagnostics=false".

What is the purpose of this?

> There will be a
> corresponding checkbox in the Privacy panel of GNOME Settings to
> toggle the state of this.

This checkbox was implemented in Ubuntu 13.10 and later. I've just
tweaked the design to update the examples of metrics collected.

Juerg Haefliger wrote on 15/02/18 07:10:
> Please make this an opt-in rather than an opt-out. This just smells
> like a trend towards a Windows/Android installation where you have to
> unset gazillions of check boxes to prevent the machine from posting
> your life to the vendor. We shouldn't go there.

The diagnostics checkbox was introduced in System Settings five years
ago. So if this "smells like a trend", it's a glacially slow trend.

One characteristic of "big data" is that users often can't be expected
to foresee the kinds of ways the data can be combined in future. So if
there is a privacy problem, making collection opt-in won't necessarily
solve it.

On the other hand, making it opt-in would give us results that were just
as useful as making it opt-out, unless the resulting sample was (a) too
small to be useful or (b) too biased in some way.
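
To make (b) concrete, here is a small worked sketch. The group names,
sizes, and opt-in rates are invented purely for illustration; the only
point is that when willingness to opt in differs between groups, the
proportions in the sample drift away from those in the population.

```python
def observed_share(groups, target):
    """Share of `target` among submitted reports.

    `groups` maps a group name to (population size, opt-in rate).
    """
    submitted = {name: size * rate for name, (size, rate) in groups.items()}
    return submitted[target] / sum(submitted.values())


# Invented numbers: 20% of machines are laptops, but laptop users
# opt in three times as often as desktop users.
groups = {"laptop": (200, 0.30), "desktop": (800, 0.10)}

true_share = 200 / (200 + 800)
sample_share = observed_share(groups, "laptop")
print(f"true: {true_share:.0%}, observed: {sample_share:.0%}")
```

With these made-up rates, laptops are a fifth of the population but about
three sevenths of the reports. The same arithmetic applies to an opt-out
scheme if the people who bother to opt out differ from those who don't, so
the question is how uneven the rates are, not which default is chosen.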


ubuntu-devel mailing list