Thursday, 22 February 2018

Re: More diagnostics data from desktop

Robie Basak wrote on 21/02/18 14:10:
> On Wed, Feb 21, 2018 at 12:18:35PM +0000, Matthew Paul Thomas wrote:
>> Now, I'm not a statistician, so maybe I've made a silly
>> miscalculation or misunderstanding.
> I'm also not a statistician. The assumption you're making is that the
> people who opt-in would be a representative sample of the whole
> population, and I think this is quite plainly not true.

I'm not making that assumption. Originally I wrote, "making it opt-in
would give us results that were just as useful as making it opt-out,
unless the resulting sample was (a) too small to be useful or (b)
*too biased in some way*" (emphasis added). And when I'd calculated that
(a) almost certainly wouldn't be an issue, I said, "If you were planning
to do any sub-sample analysis, *or reweighting for known biases*, then
the original sample would need to be bigger" (emphasis added).

It might be the case that for Ubuntu, opt-out is substantially biased
too, because the people who opt out are both numerous enough and
different enough from the rest.

If we're going to tackle sampling bias, that's great, but we should have
an actual mathematical plan for doing it. (For which, again, consult a
statistician.) Making it opt-out might be part of an effective plan, or
it might not.
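
As an illustration only of what "reweighting for known biases" could
look like, here is a minimal post-stratification sketch. Everything in
it is hypothetical: the function name, the strata ("laptop" and
"desktop"), the population shares, and the sample values are made up for
the example, not taken from any Ubuntu data. It assumes you already know
each group's true share of the whole population from some other source,
which is the hard part, and it does nothing about bias in traits you
haven't measured.

```python
from collections import Counter


def poststratify_mean(sample, population_share):
    """Estimate a population mean from a sample with known group bias.

    sample: list of (stratum, value) pairs from the people who reported.
    population_share: dict mapping stratum -> assumed share of the whole
        population (must come from an outside source; shares sum to 1).
    """
    counts = Counter(stratum for stratum, _ in sample)
    n = len(sample)
    if n == 0:
        raise ValueError("empty sample")

    # A stratum with no reports cannot be reweighted at all.
    missing = set(population_share) - set(counts)
    if missing:
        raise ValueError(f"no reports from strata: {sorted(missing)}")

    weighted_sum = 0.0
    total_weight = 0.0
    for stratum, value in sample:
        sample_share = counts[stratum] / n
        # Over-represented strata get weight < 1, under-represented > 1.
        weight = population_share[stratum] / sample_share
        weighted_sum += weight * value
        total_weight += weight
    return weighted_sum / total_weight


# Hypothetical data: laptop users are over-represented among reporters.
reports = [("laptop", 4.0)] * 8 + [("desktop", 16.0)] * 2
shares = {"laptop": 0.5, "desktop": 0.5}

raw_mean = sum(value for _, value in reports) / len(reports)
adjusted_mean = poststratify_mean(reports, shares)
print(raw_mean, adjusted_mean)
```

With these made-up numbers the raw mean is pulled toward the
over-represented group, and the adjusted mean moves back toward what the
assumed shares imply. Note that the weights get large, and the estimate
noisier, as a group becomes scarcer in the sample, which is one reason
the original sample would need to be bigger.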

