Wednesday, April 30, 2008

How did the Government statistician get his figures in the smacking referendum petition?

There's been a bit of misinformation around the smacking petition. I've seen news reports, blogs and media releases that have all given incorrect figures, while correctly stating that the Government Statistician has deemed the petition not to have enough valid signatures to reach the required 10%.

According to petitioners, they handed in 324,511 signatures on February 29 (not 324,316 as DPF noted). The sample was 1/11th, so 29,501 signatures were checked. That's a pretty large sample, which should mean a smaller margin of error, not a larger one.

Of those signatures checked, 25,754 were valid, meaning 3,747 were not. Of the signatures that did not qualify, 3,373 could not be found on the electoral roll, 214 were illegible, 158 were duplicates and 2 were triplicates. So 25,754 x 11 = 283,294. The number required is 285,027, so this indicates a shortfall of just 1,733 signatures, give or take errors.
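For anyone who wants to check the arithmetic, here is a minimal sketch that reproduces the naive scale-up from the figures quoted above. The variable names are mine, and this is only the simple "multiply the sample by 11" calculation, not the Government Statistician's method.

```python
# Figures as quoted in the post; variable names are illustrative only.
submitted = 324_511        # signatures handed in, per the petitioners
one_in = 11                # the sample was 1/11th of the petition
required = 285_027         # signatures needed

checked = submitted // one_in                      # signatures in the sample
not_on_roll, illegible, duplicates, triplicates = 3_373, 214, 158, 2
invalid = not_on_roll + illegible + duplicates + triplicates
valid_in_sample = checked - invalid

# Naive estimate: assume the rest of the petition looks like the sample.
naive_estimate = valid_in_sample * one_in
naive_shortfall = required - naive_estimate

print(checked, invalid, valid_in_sample, naive_estimate, naive_shortfall)
```

The catch with this scale-up is that it only counts duplicates where both copies happened to land inside the sample.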

However, the Government Statistician's best estimate is just 266,903, a shortfall of more than 18,000, which is about 16,400 greater than the 1/11th sample would indicate. The standard error is +/- 1,600. He did a one-sided hypothesis test at the 95% confidence level, and at the top confidence level (99%) the figure is 269,500. So why use the highest confidence level? How did he get his 266,903 number?

He used the estimator of Goodman and Kiranandana. No idea what that is. Can anyone tell me?

Update: We have answers. Over at No Right Turn, Idiot/Savant turned to the Google god to subvert the satanic statistics to assist me. Because there's 160 replications, that suggests there will be a further 1,600 replicates in the rest of the population at minimum, and up to a further 1,600 x 10 hidden replicates in the population as a whole. You also check the invalids (see comments for explanations of all this). So it may well not be as dodgy as Family First's Bob McCoskrie makes it out to be. Just pretty complicated. They'll get the signatures.
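To make the duplicate argument from the comments concrete, here is a rough back-of-envelope sketch. It is not the Goodman and Kiranandana estimator. It assumes a simple random 1-in-11 sample, treats all 160 replications as plain pairs (ignoring that 2 were triplicates), and assumes both copies of a duplicated signature land in the sample with a chance of roughly (1/11) x (1/11). The function name is hypothetical.

```python
def scale_duplicate_pairs(pairs_in_sample: int, one_in: int) -> dict:
    """Compare linear and quadratic scale-ups of duplicate pairs seen in a sample.

    Assumption: a duplicated pair is only detected when BOTH copies are sampled,
    which happens with probability of roughly (1 / one_in) ** 2.
    """
    linear = pairs_in_sample * one_in          # what scaling the sample by 11 implies
    quadratic = pairs_in_sample * one_in ** 2  # what the pair argument implies
    return {
        "linear": linear,
        "quadratic": quadratic,
        "hidden": quadratic - linear,          # duplicates the linear scale-up misses
    }


result = scale_duplicate_pairs(pairs_in_sample=160, one_in=11)
print(result)
```

Under those assumptions the linear figure is the 1,760 mentioned in the comments, the quadratic figure is in line with the "around 19,000" quoted there, and the gap between them matches the extra 17,600. It will not reproduce 266,903 exactly, since the statistician's estimator also handles triplicates and the other invalid signatures.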

Update 2: After comments here, I have added in an extra sentence and shortened the post for clarity. And I see Gordon Copeland is also querying this.




Blogger Idiot/Savant said...

Answer here.

April 30, 2008 at 3:34 AM  
Blogger Graeme Edgeler said...

I wrote this in answer on Kiwiblog, so thought I'd drop it here too, though you seem to have the answer from I/S.

Dave - there were 160 signatures in the sample that were proved duplicated (because they appeared twice or three times within the sample).

If you were to then individually check the other samples (i.e. look at the remaining ten elevenths), you would likely find a similar number within each sample. That is, you would find 160 signatures within each sample that were duplicated within that sample.

But this would ignore any signature that was in both sample A and sample B. Or sample A and sample C (or any of the 110 combinations of elevenths).

You should consider: if they found 160 signatures that were duplicated within a sample, how many would they find if they compared that sample to the remaining 10/11ths of signatures? Approximately another 1600.

Consider the next 1/11. There would be 160 internal duplicates, but also duplicates that look valid because you haven't checked everyone else, a further 1440. In the third 1/11 sample, there'd be another 160 internal duplicates, and 1280 external ones. Keep doing this and you get around 19,000 duplicates - not the 1760 you'd estimate.

April 30, 2008 at 8:13 AM  
Blogger Swimming said...

Thanks Graeme,
Yeah, that's why I stopped, I knew there must have been some compounding factor.

April 30, 2008 at 10:35 AM  
Blogger Graeme Edgeler said...

No problem - I realised after writing this that it might be helpful to consider this: even with the check done, it's technically possible that every signature in the sample is an illegal duplicate of another signature somewhere else in the petition.

April 30, 2008 at 11:09 AM  
Blogger Idiot/Savant said...

Because there's 160 replications, that suggests there will be a further 1600 replicates in the rest of the population.

Not quite. 160 duplicates in the sample suggests another 1600 duplicates in the sample where the other match is in the rest of the population. Which, as Graeme points out, means an extra 17,600 duplicates over and above the number you'd get if you just counted internal duplicates.

April 30, 2008 at 4:00 PM  
Blogger Swimming said...

I/S, quite right. What I meant when I wrote that was that there are at least 1600 replicates *plus* the other matches in the rest of the population. Have corrected the post to clarify that. If you sample x, and y is a duplicate outside the sample, but x is removed as he is not on the electoral roll, unsampled y is erroneously included if you use the formula outlined by Family First, but is taken into account by the statistician.

April 30, 2008 at 4:29 PM  
Blogger Muerk said...

Ooooh. NOW I get it (well kind of). Thanks guys.

April 30, 2008 at 7:10 PM  
Blogger Swimming said...

it's technically possible that every signature in the sample is an illegal duplicate of another signature somewhere else in the petition.
And, given the illegible signatures, you'd never prove it, as it's pretty hard to match two illegible signatures, or an illegible one with its matching legible one. It's also technically possible that every signature in the sample is a triplicate, some of which have two in the sample :-) Unlikely, of course.

It's also technically possible that none of the signatories outside the sample are on the electoral roll.

We could go on...

May 1, 2008 at 1:30 AM  
Blogger Idiot/Savant said...

Anyway, I don't think they'll have any trouble getting the extra signatures they need.

May 1, 2008 at 2:23 AM  
