<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" >

<channel><title><![CDATA[INTEGRATIVE STATISTICS - Blog]]></title><link><![CDATA[https://www.integrativestatistics.com/blog]]></link><description><![CDATA[Blog]]></description><pubDate>Tue, 18 Nov 2025 11:35:55 -0500</pubDate><generator>Weebly</generator><item><title><![CDATA[Plot the Data and the Answer Emerges]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/plot-the-data-and-the-answer-emerges]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/plot-the-data-and-the-answer-emerges#comments]]></comments><pubDate>Mon, 18 Aug 2025 15:31:37 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/plot-the-data-and-the-answer-emerges</guid><description><![CDATA[In this two-page .pdf file, see how puzzling patterns in student test scores become abundantly clear with the use of data visualization.  			  			 				 					Your browser does not support viewing this document. Click here to download the document. 				 				 				  				 			 [...] ]]></description><content:encoded><![CDATA[<div class="paragraph">In this two-page .pdf file, see how puzzling patterns in student test scores become abundantly clear with the use of data visualization.</div>  <div class="wsite-scribd">			  			 				<div id="828989790854164873-pdf-fallback" style="display: none;"> 					Your browser does not support viewing this document. Click <a href="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/data_visualization_solves_a_higher-education_puzzle.pdf" target="_blank" rel="noopener noreferrer">here</a> to download the document. 				
</div> 				<div id="828989790854164873-pdf-embed" style="display: none; height: 500px;"> 				</div>  				 			</div>]]></content:encoded></item><item><title><![CDATA[The Truth About Uplift Modeling]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/the-truth-about-uplift-modeling]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/the-truth-about-uplift-modeling#comments]]></comments><pubDate>Fri, 08 Aug 2025 19:39:05 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/the-truth-about-uplift-modeling</guid><description><![CDATA[You&rsquo;ve designed a program or intervention to try to influence outcomes for a large number of people.&nbsp; Whom should you target:&nbsp; those closest to deciding the desirable way, to nudge them &ldquo;over the line&rdquo;?&nbsp; Those farthest away?&nbsp; Or those on the fence?&nbsp; Which strategy will have the greatest impact?&nbsp;An analyst and VP at Fidelity Investments, Victor S. Y. Lo, has devised and evaluated uplift models&nbsp; for topics including election email campaigns, per [...] ]]></description><content:encoded><![CDATA[<div class="paragraph">You&rsquo;ve designed a program or intervention to try to influence outcomes for a large number of people.&nbsp; Whom should you target:&nbsp; those closest to deciding the desirable way, to nudge them &ldquo;over the line&rdquo;?&nbsp; Those farthest away?&nbsp; Or those on the fence?&nbsp; Which strategy will have the greatest impact?<br />&nbsp;<br />An analyst and VP at Fidelity Investments, <a href="https://www.niss.org/people/victor-sy-lo">Victor S. Y. 
Lo</a>, has devised and evaluated <em><a href="https://en.wikipedia.org/wiki/Uplift_modelling" target="_blank">uplift models</a></em>&nbsp;for topics including election email campaigns, personalized medicine, credit card marketing, and supply-chain modeling.&nbsp; I thought Victor might have an answer to the question, &ldquo;At what place along the likelihood spectrum is a person most influenceable by an intervention?&rdquo;&nbsp; That is, for greatest &ldquo;lift,&rdquo; is it the high-likelihood people who should be targeted, or the low, or those on the fence?&nbsp;<br />&nbsp;<br />His ultimate answer was that it&rsquo;s context-dependent.&nbsp; No research has shown that an intervention <em>generally</em> has the most effect on one particular sort of person.&nbsp; It could be one answer for getting young second-generation Americans in the Northeast to apply for credit cards, and another answer for getting middle-aged California diabetics to avoid hospital readmission.&nbsp; In each context the uplift research will point your way forward.&nbsp; This lesson has been borne out in my own work over the years in higher education and health care.<br />&nbsp;<br />The takeaway here is to distrust any blanket statement claiming that you should always give the greatest focus to those people at a certain place along the probability spectrum.</div>]]></content:encoded></item><item><title><![CDATA[US College Enrollment:  What Matters Most]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/us-college-enrollment-what-matters-most]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/us-college-enrollment-what-matters-most#comments]]></comments><pubDate>Mon, 30 Dec 2024 14:30:26 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/us-college-enrollment-what-matters-most</guid><description><![CDATA[In a well-conceived 2008&nbsp;article,&nbsp;Su Jin 
Jez&nbsp;examines the factors affecting whether a given US student will attend college.&nbsp; It is no small feat to establish cause and effect relationships when factors of interest -- race, academic preparation, and family wealth -- can all so easily confound one another.Jez astutely uses statistical analysis (regression) to disentangle these complex relationships.&nbsp; Working&nbsp;out of a UC Berkeley think tank, the author draws on a large [...] ]]></description><content:encoded><![CDATA[<div class="paragraph"><span style="color:rgb(2, 37, 81)">In a well-conceived 2008&nbsp;<a href="https://files.eric.ed.gov/fulltext/ED503340.pdf" target="_blank">article</a>,&nbsp;</span>Su Jin Jez<span style="color:rgb(2, 37, 81)">&nbsp;examines the factors affecting whether a given US student will attend college.&nbsp; It is no small feat to establish cause and effect relationships when factors of interest -- race, academic preparation, and family wealth -- can all so easily confound one another.</span><br /><br /><span style="color:rgb(2, 37, 81)">Jez astutely uses statistical analysis (regression) to disentangle these complex relationships.&nbsp; Working&nbsp;out of a UC Berkeley think tank, the author draws on a large-scale, nationally-representative study of data from the Integrated Postsecondary Education Data System (IPEDS).&nbsp; Her research shows, in a nutshell, that</span><ul style="color:rgb(2, 37, 81)"><li>when you control for wealth, race is no factor in college enrollment;</li><li>when you control for academic preparation, wealth is no factor either!</li></ul> <span style="color:rgb(2, 37, 81)">So students from wealthier families will more likely attend, regardless of race, and this connection is true only because wealth is paired with better academic preparation.</span><br /><br /><span style="color:rgb(2, 37, 81)">The article is long on methods and findings and a little short on discussion of implications for K-12 or higher education.&nbsp; But even so it 
constitutes a good example of the way one can use sequential regression, in planned stages, to clarify otherwise puzzling relationships among variables.</span></div>]]></content:encoded></item><item><title><![CDATA[How to Read a (Good) Research Abstract]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/how-to-read-a-good-research-abstract]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/how-to-read-a-good-research-abstract#comments]]></comments><pubDate>Fri, 08 Nov 2024 21:40:59 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/how-to-read-a-good-research-abstract</guid><description><![CDATA[       Just how risky is that breakfast?&nbsp; I like&nbsp;this abstract&nbsp;published in the Journal of the American Medical Association.&nbsp; It describes a study of the effect of egg intake on mortality risk over the course of 17.5 years of follow-up.&nbsp; &nbsp;If you know how to read and interpret the abstract's key points, you can translate them into very meaningful, concrete terms.&nbsp; Here's how to read the abstract's key sentence:"Each additional half an egg consumed per day was si [...] 
]]></description><content:encoded><![CDATA[<div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/data-scrabble-style.jpg?1731103622" alt="Picture" style="width:219;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph"><span style="color:rgb(2, 37, 81)">Just how risky is that breakfast?&nbsp; I like&nbsp;</span><a href="https://jamanetwork.com/journals/jama/article-abstract/2728487" target="_blank">this abstract</a><span style="color:rgb(2, 37, 81)">&nbsp;published in the Journal of the American Medical Association.&nbsp; It describes a study of the effect of egg intake on mortality risk over the course of 17.5 years of follow-up.&nbsp; &nbsp;<br /><br />If you know how to read and interpret the abstract's key points, you can translate them into very meaningful, concrete terms.&nbsp; Here's how to read the abstract's key sentence:</span><br /><br /><strong><span style="color:rgb(2, 37, 81)">"Each additional half an egg consumed per day was significantly associated with higher risk of [...]&nbsp; mortality (adjusted HR, 1.08 [95% CI, 1.04-1.11]; adjusted ARD, 1.93% [95% CI, 1.10%-2.76%])."</span></strong><br /><br /><strong style="color:rgb(2, 37, 81)">1. Translation</strong><span style="color:rgb(2, 37, 81)">:&nbsp; For each extra half an egg eaten per day, mortality from all causes was estimated as...</span><ul style="color:rgb(2, 37, 81)"><li>8% higher in relative terms (multiplied), which meant</li><li>1.93% higher in absolute terms (added).</li></ul><br /><strong><span style="color:rgb(2, 37, 81)">2. 
Three definitions to help decipher that key sentence:</span></strong><ul style="color:rgb(2, 37, 81)"><li>HR = hazard ratio:&nbsp; the <u>multiplier</u> for the risk of mortality linked with half an egg extra per day, while controlling for other factors.</li><li>ARD = absolute risk difference:&nbsp; the <u>added</u>&nbsp;risk, in percentage points.</li><li>CI = confidence interval:&nbsp; the range of values reasonably consistent with the data for each result.&nbsp;</li></ul><br /><strong style="color:rgb(2, 37, 81)">3. The findings in concrete terms</strong><span style="color:rgb(2, 37, 81)">:</span><ul style="color:rgb(2, 37, 81)"><li>Background fact:&nbsp; 21% of all study participants died of whatever cause during the 17.5 years.</li><li>For those who ate half an egg above the average per day, mortality risk was estimated as roughly 21% times the HR of 1.08,&nbsp;or about 23%.&nbsp;</li><li>This works out to 21% plus 1.93 percentage points, which is also about 23%.</li><li>Using the CI for the ARD, for these extra egg-eating people we arrive at a range for mortality risk between 22% and 24%.&nbsp; The CI tells us that we add at least 1.10 percentage points of risk.&nbsp; With 95% confidence, then, the added risk is greater than zero; thus at the .05 level the finding is statistically significant.</li></ul><br /><span style="color:rgb(2, 37, 81)"><strong>4. 
Does it matter?</strong><br />Armed with this information we can each make our own informed decision as to whether the findings have not just&nbsp;</span><em style="color:rgb(2, 37, 81)"><a href="https://www.integrativestatistics.com/blog/what-is-statistical-significance">statistical significance</a></em><span style="color:rgb(2, 37, 81)">&nbsp;but&nbsp;</span><a href="http://www.yellowbrickstats.com/insidious.htm" target="_blank"><em>practical</em>&nbsp;<em>significance</em></a><span style="color:rgb(2, 37, 81)">.&nbsp; What do you think?&nbsp; Would you skip an extra omelette per week to avoid a 2-percentage-point increase in 17.5-year mortality?</span><br /><br /><span style="color:rgb(2, 37, 81)">&#8203;</span><u style="color:rgb(2, 37, 81)">Contact:&nbsp; Info@IntegrativeStatistics.com</u></div>]]></content:encoded></item><item><title><![CDATA[Showing Cause and Effect:  4 Approaches]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/showing-cause-and-effect-4-approaches]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/showing-cause-and-effect-4-approaches#comments]]></comments><pubDate>Sat, 19 Oct 2024 19:03:34 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/showing-cause-and-effect-4-approaches</guid><description><![CDATA[       Analysts, researchers and statisticians&nbsp;tend to fall into one of four categories with respect to the way they handle claims about causation.A.&nbsp; Many who report bivariate results&nbsp;(e.g., correlations or group differences)&nbsp;as if they indicate causal effects, plain and simple.&nbsp; This approach is unfortunately common among most of us when just beginning our research careers.&nbsp; I encounter this group frequently when I peer-review manuscripts.B.&nbsp; Some who build i [...] 
]]></description><content:encoded><![CDATA[<div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/cause-effect.jpg?1729364825" alt="Picture" style="width:217;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph"><span style="color:rgb(2, 37, 81)">Analysts, researchers and statisticians</span><span style="color:rgb(2, 37, 81)">&nbsp;tend to fall into one of four categories with respect to the way they handle claims about causation.</span><br /><br /><strong style="color:rgb(2, 37, 81)">A.&nbsp; Many who report bivariate results&nbsp;</strong><span style="color:rgb(2, 37, 81)">(e.g., correlations or group differences)&nbsp;</span><strong style="color:rgb(2, 37, 81)">as if they indicate causal effects, plain and simple.</strong><span style="color:rgb(2, 37, 81)">&nbsp; This approach is unfortunately common among most of us when just beginning our research careers.&nbsp; I encounter this group frequently when I peer-review manuscripts.</span><br /><br /><strong style="color:rgb(2, 37, 81)">B.&nbsp; Some who build in controls</strong><span style="color:rgb(2, 37, 81)">&nbsp;for obvious variables or variables easily obtainable&nbsp;</span><strong style="color:rgb(2, 37, 81)">and then&nbsp;&nbsp;</strong><span style="color:rgb(2, 37, 81)">(maybe with unjustified optimism, or even hubris)&nbsp;&#8203;</span><strong style="color:rgb(2, 37, 81)">report those results as if they indicate causal effects.</strong><br /><br /><strong style="color:rgb(2, 37, 81)">C.&nbsp; Some</strong><strong style="color:rgb(2, 37, 81)">&nbsp;who</strong><span style="color:rgb(2, 37, 81)">&nbsp;try their best to use statistical or other means to control for relevant variables as fits the situation; who try multiple methods; and who then take pains 
to&nbsp;</span><strong style="color:rgb(2, 37, 81)">report those results as indicating causal effects&nbsp;<em>to one degree or another</em>.</strong><br /><br /><strong style="color:rgb(2, 37, 81)">D.&nbsp; A few purists</strong><span style="color:rgb(2, 37, 81)">&nbsp;such as Gregory Miller, Jean Chapman, Donald Rubin, and Elazar Pedhazur.&nbsp; They stand by the phrase "no causation without randomization" and claim that almost no published analysis of non-experimental data ever succeeds at revealing causal effects.</span><br /><br /><span style="color:rgb(2, 37, 81)">Being a card-carrying member of group C, I am constantly on the lookout for good ways to isolate causes and effects using quantitative methods.&nbsp; Somewhere I ran across the following ingenious approach.</span><br /><br /><span style="color:rgb(2, 37, 81)">Suppose we want to explain the daily volume of men's shoe sales using the amount of money spent daily on radio ads for a store's men's shoes.&nbsp; Sometimes such ads coincide with days when shoe sales are high anyway.&nbsp; We might say there's "anticipation" in the timing of the ads.&nbsp; So on that score there would be correlation even in the absence of causation.&nbsp;</span><br /><br /><span style="color:rgb(2, 37, 81)">A surprisingly helpful approach is to see whether that correlation is much higher than the correlation between the amount of advertising money spent and the volume of sales at another shoe store down the street, or the first store's sales of women's shoes.&nbsp; Those comparison sales share the same busy days but should not respond to the men's-shoe ads, so a markedly higher correlation for the advertised shoes points to an effect of the ads themselves.</span><br /><br /><span style="color:rgb(2, 37, 81)">This is a low-tech method requiring no specialized statistical skills, but it promises to productively isolate the connection of interest, keeping at bay the confounding variable that threatens the causal claim.</span><br /><br /><span style="color:rgb(2, 37, 81)">What smart quantitative designs have you encountered lately?<br /><br />***<br />&#8203;</span><br /><u style="color:rgb(2, 37, 81)">Contact:&nbsp; 
Info@IntegrativeStatistics.com</u><br /><span style="color:rgb(2, 37, 81)">&#8203;</span></div>]]></content:encoded></item><item><title><![CDATA[When Statistics Are MAGIC]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/august-27th-2024]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/august-27th-2024#comments]]></comments><pubDate>Tue, 27 Aug 2024 17:52:28 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/august-27th-2024</guid><description><![CDATA[       Whether judging the worth of someone else&rsquo;s statistical work or your own, it&rsquo;s always helpful to keep in mind the &ldquo;MAGIC&rdquo; criteria spelled out by Robert Abelson in&nbsp;Statistics As Principled Argument.&nbsp; Here are the five:"1. &nbsp;Magnitude:&nbsp; How big is the effect?"&nbsp; Statistically significant&nbsp;or not, is the size of the effect&nbsp;too trivial to matter?"2. &nbsp;Articulation:&nbsp; How precisely stated is it?"&nbsp; Is the finding too vague or [...] ]]></description><content:encoded><![CDATA[<div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/ruby-slippers.jpeg?1724781351" alt="Picture" style="width:222;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph" style="text-align:left;"><span style="color:rgb(2, 37, 81)">Whether judging the worth of someone else&rsquo;s statistical work or your own, it&rsquo;s always helpful to keep in mind the &ldquo;MAGIC&rdquo; criteria spelled out by Robert Abelson in&nbsp;</span><em style="color:rgb(2, 37, 81)">Statistics As Principled Argument.</em><span style="color:rgb(2, 37, 81)">&nbsp; Here are the five:</span><br /><br /><span style="color:rgb(2, 37, 81)">"1. 
&nbsp;</span><strong style="color:rgb(2, 37, 81)">M</strong><span style="color:rgb(2, 37, 81)">agnitude:&nbsp; How big is the effect?"&nbsp; </span><a href="https://www.integrativestatistics.com/blog/what-is-statistical-significance">Statistically significant</a><span style="color:rgb(2, 37, 81)">&nbsp;or not, is the size of the effect&nbsp;</span><a href="https://www.yellowbrickstats.com/insidious.htm" target="_blank">too trivial to matter</a><span style="color:rgb(2, 37, 81)">?</span><br /><span style="color:rgb(2, 37, 81)">"2. &nbsp;</span><strong style="color:rgb(2, 37, 81)">A</strong><span style="color:rgb(2, 37, 81)">rticulation:&nbsp; How precisely stated is it?"&nbsp; Is the finding too vague or muddled to be useful?</span><br /><span style="color:rgb(2, 37, 81)">"3. &nbsp;</span><strong style="color:rgb(2, 37, 81)">G</strong><span style="color:rgb(2, 37, 81)">enerality:&nbsp; How widely does it apply?"&nbsp; Does it only matter for one city, one college major, one health condition?</span><br /><span style="color:rgb(2, 37, 81)">"4. &nbsp;</span><strong style="color:rgb(2, 37, 81)">I</strong><span style="color:rgb(2, 37, 81)">nterest:&nbsp; How interesting is it?"&nbsp; Will it get anyone&rsquo;s attention?</span><br /><span style="color:rgb(2, 37, 81)">"5. 
&nbsp;</span><strong style="color:rgb(2, 37, 81)">C</strong><span style="color:rgb(2, 37, 81)">redibility:&nbsp; How believable is it?"&nbsp; Not that counter-intuitive findings should be ignored.&nbsp; But they should be especially questioned: &nbsp;extraordinary claims require extraordinary evidence.</span><br /><br /><span style="color:rgb(2, 37, 81)">These five criteria are worth keeping close at hand to help you decide when a statistical finding is really actionable.</span><br /><span style="color:rgb(2, 37, 81)">&#8203;</span><br /><u style="color:rgb(2, 37, 81)">Contact:&nbsp; Info@IntegrativeStatistics.com</u><br /><span style="color:rgb(2, 37, 81)">&#8203;</span></div>]]></content:encoded></item><item><title><![CDATA[Watch Out for Unsound Research Practices]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/watch-out-for-unsound-research-practices]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/watch-out-for-unsound-research-practices#comments]]></comments><pubDate>Tue, 23 Jul 2024 11:14:18 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/watch-out-for-unsound-research-practices</guid><description><![CDATA[A savvy consumer or sponsor of research must avoid being taken in by common tricks and fallacies.&nbsp; You&rsquo;ve heard of unsound practices such as cherry-picking and fishing expeditions.&nbsp; Maybe you&rsquo;ve heard of the Texas Sharpshooter Fallacy.&nbsp; Can you put your finger on why these are problematic?         &#8203;In cherry-picking, results are chosen and presented that best fit the idea being promoted, at the exclusion of the other findings.&nbsp; In other words, what you're sh [...] 
]]></description><content:encoded><![CDATA[<div class="paragraph" style="text-align:left;"><span>A savvy consumer or sponsor of research must avoid being taken in by common tricks and fallacies.&nbsp; You&rsquo;ve heard of unsound practices such as <strong>cherry-picking</strong> and <strong>fishing expeditions</strong>.&nbsp; Maybe you&rsquo;ve heard of the <strong>Texas Sharpshooter Fallacy</strong>.&nbsp; Can you put your finger on why these are problematic?</span></div>  <div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/cherry-picking_orig.png" alt="Picture" style="width:auto;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph" style="text-align:left;">In <strong>cherry-picking</strong>, the results chosen and presented are those that best fit the idea being promoted, to the exclusion of the other findings.&nbsp; In other words, what you're shown is a biased selection.<br />&#8203;</div>  <div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/net2.png?1721733996" alt="Picture" style="width:189;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph" style="text-align:left;">A <strong>fishing expedition</strong> is related.&nbsp; In this questionable practice, researchers continue to seek out findings (whether group differences, relationships, or what have you) until they come upon some that support their desired position.&nbsp; They analyze for as long as it takes to find the &ldquo;right&rdquo; results.&nbsp; Then they report those, downplaying or excluding all the others obtained along the 
way.&nbsp; (Another related term:&nbsp; &ldquo;torturing the data until they confess.&rdquo;)<br />&#8203;</div>  <div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/target.png?1721733576" alt="Picture" style="width:163;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph" style="text-align:left;">The <strong>Texas Sharpshooter Fallacy </strong>is related as well.&nbsp; Imagine a person who sprays the side of a barn with a shotgun.&nbsp; Then he walks up to the barn and locates a spot where a few hits have formed a tight cluster.&nbsp; He paints a target around these; paints <em>over</em> all the rest; and proudly proclaims that "the target" is where he was aiming all along.<br /><br />Underlying all three types of errors is the principle that <em>the more analyses one conducts on a given topic, the greater the chance of a false positive.</em>&nbsp; In a false positive, one is fooled into thinking a result is noteworthy when in fact it is caused by nothing more than chance.<br /><br />In addition, all three can be seen as examples of the unsound practice of Hypothesizing After Results are Known, or HARKing.&nbsp; HARKing is opportunistic; it inadvisedly focuses on what often turn out to be chance findings.&nbsp; You will find these types of errors discussed in the context of the Multiple Comparison Problem and, more subtly, The Garden of Forking Paths as described by leading statistician <a href="https://statmodeling.stat.columbia.edu/2021/03/16/the-garden-of-forking-paths-why-multiple-comparisons-can-be-a-problem-even-when-there-is-no-fishing-expedition-or-p-hacking-and-the-research-hypothesis-was-posited-ahead-of-time-2/" target="_blank">Andrew Gelman in his blog</a>.<br />&#8203;<br />Recognizing these errors when others fall 
for them will make you a savvier interpreter of research.&nbsp; Avoiding these types of mistakes will go a long way toward making your own work more sound.&nbsp;<br />&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Contact:&nbsp; Info@IntegrativeStatistics.com</div>]]></content:encoded></item><item><title><![CDATA[Special Considerations for Evaluating Statistical Evidence on Police Impartiality or Bias]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/guidelines-for-evaluating-statistical-evidence-on-police-impartiality-or-bias]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/guidelines-for-evaluating-statistical-evidence-on-police-impartiality-or-bias#comments]]></comments><pubDate>Thu, 14 Sep 2023 15:02:49 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/guidelines-for-evaluating-statistical-evidence-on-police-impartiality-or-bias</guid><description><![CDATA[Statistical evidence on this topic has become pivotal to&nbsp;increasingly&nbsp;&#8203;many criminal cases.&nbsp; In the tradition of&nbsp;Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993), defense attorneys, prosecutors, and judges increasingly seek to evaluate the soundness, validity, or credibility of findings&nbsp;created by statistician-experts.&nbsp;&nbsp;This 9-page&nbsp;.pdf&nbsp;piece offers evaluation guidelines tailored to such cases and discusses pitfalls to avoid.    [...] ]]></description><content:encoded><![CDATA[<div class="paragraph"><span style="color:rgb(2, 37, 81)">Statistical evidence on this topic has become pivotal to&nbsp;</span><span style="color:rgb(2, 37, 81)">increasingly&nbsp;</span><span style="color:rgb(2, 37, 81)">&#8203;many criminal cases.&nbsp; In the tradition of&nbsp;</span><em style="color:rgb(2, 37, 81)">Daubert v. 
Merrell Dow Pharmaceuticals, Inc</em><span style="color:rgb(2, 37, 81)">., 509 U.S. 579 (1993), defense attorneys, prosecutors, and judges increasingly seek to evaluate the soundness, validity, or credibility of findings&nbsp;</span><span style="color:rgb(2, 37, 81)">created by statistician-experts.&nbsp;</span><span style="color:rgb(2, 37, 81)">&nbsp;This 9-page&nbsp;<a href="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/long_daubert___acceptance_of_methods_-_roland_b_stark_-_sep._13_2023.pdf">.pdf</a>&nbsp;piece offers evaluation guidelines tailored to such cases and discusses pitfalls to avoid.</span></div>  <div><div style="margin: 10px 0 0 -10px"> <a title="Download file: long_daubert___acceptance_of_methods_-_roland_b_stark_-_sep._13_2023.pdf" href="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/long_daubert___acceptance_of_methods_-_roland_b_stark_-_sep._13_2023.pdf"><img src="//www.weebly.com/weebly/images/file_icons/pdf.png" width="36" height="36" style="float: left; position: relative; left: 0px; top: 0px; margin: 0 15px 15px 0; border: 0;" /></a><div style="float: left; text-align: left; position: relative;"><table style="font-size: 12px; font-family: tahoma; line-height: .9;"><tr><td colspan="2"><b> long_daubert___acceptance_of_methods_-_roland_b_stark_-_sep._13_2023.pdf</b></td></tr><tr style="display: none;"><td>File Size:  </td><td>279 kb</td></tr><tr style="display: none;"><td>File Type:  </td><td> pdf</td></tr></table><a title="Download file: long_daubert___acceptance_of_methods_-_roland_b_stark_-_sep._13_2023.pdf" href="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/long_daubert___acceptance_of_methods_-_roland_b_stark_-_sep._13_2023.pdf" style="font-weight: bold;">Download File</a></div> </div>  <hr style="clear: both; width: 100%; visibility: hidden"></hr></div>]]></content:encoded></item><item><title><![CDATA[Analyzing Student Retention - a Peculiar Case Study, with 
Regression]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/analyzing-student-retention-a-peculiar-case-study-with-regression]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/analyzing-student-retention-a-peculiar-case-study-with-regression#comments]]></comments><pubDate>Tue, 26 Jul 2022 20:10:31 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/analyzing-student-retention-a-peculiar-case-study-with-regression</guid><description><![CDATA[Presenting at the Best Practice Solutions higher-education Enrollment Management Symposium in Philadelphia (July 22, 2022) spurred me to solve a peculiar problem.&nbsp; The way financial aid related to attrition at a certain selective NY college had been stymying me.&nbsp; The solution took actual thought.&nbsp; You might find it instructive or entertaining.&#8203;Techniques includedCorrelationLinear regressionLogistic regressionA variety of data graphicsIt's about a 10-minute read.&nbsp; Enjoy! [...] 
]]></description><content:encoded><![CDATA[<div class="paragraph">Presenting at the Best Practice Solutions higher-education Enrollment Management Symposium in Philadelphia (July 22, 2022) spurred me to solve a peculiar problem.&nbsp; The way financial aid related to attrition at a certain selective NY college had been stymying me.&nbsp; The solution took actual thought.&nbsp; You might find it instructive or entertaining.<br />&#8203;<br />Techniques included<ul><li>Correlation</li><li>Linear regression</li><li>Logistic regression</li><li>A variety of data graphics</li></ul><br />It's about a 10-minute read.&nbsp; Enjoy!&nbsp;</div>  <div><div style="margin: 10px 0 0 -10px"> <a title="Download file: analyzing_student_retention_-_a_peculiar_case_study_with_regression.pdf" href="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/analyzing_student_retention_-_a_peculiar_case_study_with_regression.pdf"><img src="//www.weebly.com/weebly/images/file_icons/pdf.png" width="36" height="36" style="float: left; position: relative; left: 0px; top: 0px; margin: 0 15px 15px 0; border: 0;" /></a><div style="float: left; text-align: left; position: relative;"><table style="font-size: 12px; font-family: tahoma; line-height: .9;"><tr><td colspan="2"><b> analyzing_student_retention_-_a_peculiar_case_study_with_regression.pdf</b></td></tr><tr style="display: none;"><td>File Size:  </td><td>213 kb</td></tr><tr style="display: none;"><td>File Type:  </td><td> pdf</td></tr></table><a title="Download file: analyzing_student_retention_-_a_peculiar_case_study_with_regression.pdf" href="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/analyzing_student_retention_-_a_peculiar_case_study_with_regression.pdf" style="font-weight: bold;">Download File</a></div> </div>  <hr style="clear: both; width: 100%; visibility: hidden"></hr></div>]]></content:encoded></item><item><title><![CDATA[The St. 
Petersburg Paradox:  Infinite Payouts, and Code You Can Try]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/the-st-petersburg-paradox-infinite-payouts-and-code-you-can-try]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/the-st-petersburg-paradox-infinite-payouts-and-code-you-can-try#comments]]></comments><pubDate>Wed, 01 Jun 2022 17:41:14 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/the-st-petersburg-paradox-infinite-payouts-and-code-you-can-try</guid><description><![CDATA[       This account draws from George Pipis&rsquo;s &ldquo;predictive &lsquo;hacks&rsquo;&rdquo; page and&nbsp;Wikipedia.&nbsp;Introduction The St. Petersburg Paradox results from an imaginary lottery game.&nbsp; The game pays out winnings that, in the truly long run, are infinite.&nbsp; Despite this, when people are asked how much they would pay to play, they typically name a small amount such as $20 or $30.&nbsp;How the Game WorksAn ordinary coin is flipped until it comes up heads. &nbsp;When  [...] ]]></description><content:encoded><![CDATA[<div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0px;margin-right:0px;text-align:left"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/coin-flip.jpg?1654116464" alt="Picture" style="width:398;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph"><em>This account draws from <a href="https://predictivehacks.com/st-petersburg-paradox/">George Pipis&rsquo;s &ldquo;predictive &lsquo;hacks&rsquo;&rdquo; page</a> and&nbsp;<a href="https://en.wikipedia.org/wiki/St._Petersburg_paradox" target="_blank">Wikipedia</a>.</em><br />&nbsp;<br /><strong>Introduction </strong><br />The St. 
Petersburg Paradox results from an imaginary lottery game.&nbsp; The game&rsquo;s expected payout is, in theory, infinite.&nbsp; Despite this, when people are asked how much they would pay to play, they typically name a small amount such as $20 or $30.<br />&nbsp;<br /><strong>How the Game Works</strong><br />An ordinary coin is flipped until it comes up heads. &nbsp;When it does, the player wins some amount. If heads occurs on the first flip, the payout is $2.&nbsp; If on the second flip, $4.&nbsp; If on the third, $8.&nbsp; The payout keeps doubling with each additional flip.&nbsp; You can imagine that it&rsquo;s possible, though not likely, for the first heads to &ldquo;wait&rdquo; until the 14th flip.&nbsp; In that case, winnings would be 2^14 or $16,384.&nbsp;<br />&nbsp;<br />How much would you pay for the chance to play?<br />&nbsp;<br /><strong>The Paradox</strong><br />Each possible outcome adds $1 to the expected payout:&nbsp; a 1/2 chance of $2, a 1/4 chance of $4, a 1/8 chance of $8, and so on without end.&nbsp; In theory, then, the expected winnings from this hypothetical game are unlimited.&nbsp; Even so, few people say they would risk a large amount.&nbsp; <a href="https://en.wikipedia.org/wiki/St._Petersburg_paradox" target="_blank">Wikipedia</a> has good information on why, bringing in work on behavioral economics from researchers such as the legendary pair, Amos Tversky and Daniel Kahneman.<br />&nbsp;<br /><strong>Code:&nbsp; Try It Out Yourself&nbsp;</strong><br />The text file below contains several versions of code you can use to simulate results from this game.&nbsp; Some are written for R, and one is designed for SPSS.<br />Each portion of code contains comments explaining the purpose or function of different commands.<br />Enjoy, and feel free to share your observations or your feedback about the exercise.<br /><br /></div>  <div><div style="margin: 10px 0 0 -10px"> <a title="Download file: st._petersburg_paradox_in_6_lines_of_code.txt" href="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/st._petersburg_paradox_in_6_lines_of_code.txt"><img 
src="//www.weebly.com/weebly/images/file_icons/txt.png" width="36" height="36" style="float: left; position: relative; left: 0px; top: 0px; margin: 0 15px 15px 0; border: 0;" /></a><div style="float: left; text-align: left; position: relative;"><table style="font-size: 12px; font-family: tahoma; line-height: .9;"><tr><td colspan="2"><b> st._petersburg_paradox_in_6_lines_of_code.txt</b></td></tr><tr style="display: none;"><td>File Size:  </td><td>4 kb</td></tr><tr style="display: none;"><td>File Type:  </td><td> txt</td></tr></table><a title="Download file: st._petersburg_paradox_in_6_lines_of_code.txt" href="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/st._petersburg_paradox_in_6_lines_of_code.txt" style="font-weight: bold;">Download File</a></div> </div>  <hr style="clear: both; width: 100%; visibility: hidden"></hr></div>]]></content:encoded></item><item><title><![CDATA[Correlation and Causation:  Chocolate and the Nobel Prize]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/correlation-and-causation-chocolate-and-the-nobel-prize]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/correlation-and-causation-chocolate-and-the-nobel-prize#comments]]></comments><pubDate>Tue, 31 May 2022 18:56:35 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/correlation-and-causation-chocolate-and-the-nobel-prize</guid><description><![CDATA[       This short case study draws on an ingenious article by a Swiss-born physician.&nbsp; His classic article presents excellent material for deriving some lessons about data analysis, for novice and intermediate researchers.&nbsp; See the .pdf below.    correlation_and_causation_-_chocolate_and_nobel_prizes.pdfFile Size:  308 kbFile Type:   pdfDownload File    [...] 
]]></description><content:encoded><![CDATA[<div><div class="wsite-image wsite-image-border-medium " style="padding-top:5px;padding-bottom:10px;margin-left:0px;margin-right:10px;text-align:left"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/chocolate-nobel-graph.png?1654024146" alt="Picture" style="width:553;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph">This short case study draws on an ingenious article by a Swiss-born physician.&nbsp; His classic article presents excellent material for deriving some lessons about data analysis, for novice and intermediate researchers.&nbsp; See the .pdf below.</div>  <div><div style="margin: 10px 0 0 -10px"> <a title="Download file: correlation_and_causation_-_chocolate_and_nobel_prizes.pdf" href="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/correlation_and_causation_-_chocolate_and_nobel_prizes.pdf"><img src="//www.weebly.com/weebly/images/file_icons/pdf.png" width="36" height="36" style="float: left; position: relative; left: 0px; top: 0px; margin: 0 15px 15px 0; border: 0;" /></a><div style="float: left; text-align: left; position: relative;"><table style="font-size: 12px; font-family: tahoma; line-height: .9;"><tr><td colspan="2"><b> correlation_and_causation_-_chocolate_and_nobel_prizes.pdf</b></td></tr><tr style="display: none;"><td>File Size:  </td><td>308 kb</td></tr><tr style="display: none;"><td>File Type:  </td><td> pdf</td></tr></table><a title="Download file: correlation_and_causation_-_chocolate_and_nobel_prizes.pdf" href="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/correlation_and_causation_-_chocolate_and_nobel_prizes.pdf" style="font-weight: bold;">Download File</a></div> </div>  <hr style="clear: both; width: 100%; visibility: hidden"></hr></div>]]></content:encoded></item><item><title><![CDATA[What Is Statistical 
Significance?]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/what-is-statistical-significance]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/what-is-statistical-significance#comments]]></comments><pubDate>Mon, 23 May 2022 18:11:45 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/what-is-statistical-significance</guid><description><![CDATA[ Here is the principle behind tests of statistical significance.There are two dice.&nbsp; One is given to you, one to me.&nbsp; We each roll just a handful of times and check our own average:&nbsp; somewhere from 1 to 6.&nbsp; With such a small number of rolls, your average and my average could be quite far apart. &nbsp;Maybe 3 vs. 5.&nbsp; &nbsp;Maybe 2 vs. 4.5.&nbsp; Even so, we&rsquo;d probably trust that the dice were both the same.Now, suppose we each rolled thousands of times.&nbsp; Random [...] ]]></description><content:encoded><![CDATA[<span class='imgPusher' style='float:left;height:0px'></span><span style='display: table;width:auto;position:relative;float:left;max-width:100%;;clear:left;margin-top:0px;*margin-top:0px'><a><img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/signpost.png?250" style="margin-top: 10px; margin-bottom: 10px; margin-left: 0px; margin-right: 0px; border-width:0; max-width:100%" alt="Picture" class="galleryImageBorder wsite-image" /></a><span style="display: table-caption; caption-side: bottom; font-size: 90%; margin-top: -10px; margin-bottom: 10px; text-align: center;" class="wsite-caption"></span></span> <div class="paragraph" style="display:block;">Here is the principle behind tests of statistical significance.<br /><br />There are two dice.&nbsp; One is given to you, one to me.&nbsp; We each roll just a handful of times and check our own average:&nbsp; somewhere from 1 to 6.&nbsp; With such a small number of rolls, your average and my 
average could be quite far apart. &nbsp;Maybe 3 vs. 5.&nbsp; &nbsp;Maybe 2 vs. 4.5.&nbsp; Even so, we&rsquo;d probably trust that the dice were both the same.<br /><br />Now, suppose we each rolled thousands of times.&nbsp; Randomness, chance, works according to certain known rules.&nbsp; In thousands of rolls, differences ought to get smoothed out.&nbsp; Your average and mine should be very, very close together.&nbsp; Maybe 3.51 vs. 3.48.&nbsp; &nbsp;&nbsp;<br />&#8203;<br />If they are not very close, most anyone observing would conclude:&nbsp; &ldquo;Something else besides chance must have been inserted into the process.&nbsp; The dice must not be the same; this isn&rsquo;t the sort of difference chance alone would produce.&rdquo;&nbsp; Thus they would call the difference &ldquo;statistically significant.&rdquo;</div> <hr style="width:100%;clear:both;visibility:hidden;"></hr>]]></content:encoded></item><item><title><![CDATA[“Significance” and Why Large Samples Confuse:  Can You Tell When a Finding Is Significant?]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/significance-and-why-large-samples-confuse]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/significance-and-why-large-samples-confuse#comments]]></comments><pubDate>Sun, 27 Mar 2022 04:00:00 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/significance-and-why-large-samples-confuse</guid><description><![CDATA[       The short presentation below explains the crucial difference between statistical and practical significance.&nbsp; It lets you test your ability to recognize each in the context of mean differences (T-tests) and relationships (correlations).    statistical_significance_vs._practical_significance_-_an_exercise_v3.pdfFile Size:  341 kbFile Type:   pdfDownload File    [...] 
]]></description><content:encoded><![CDATA[<div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0px;margin-right:0px;text-align:center"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/significance-exercise.png?1654021207" alt="Picture" style="width:auto;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph">The short presentation below explains the crucial difference between statistical and practical significance.&nbsp; It lets you test your ability to recognize each in the context of mean differences (T-tests) and relationships (correlations).</div>  <div><div style="margin: 10px 0 0 -10px"> <a title="Download file: statistical_significance_vs._practical_significance_-_an_exercise_v3.pdf" href="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/statistical_significance_vs._practical_significance_-_an_exercise_v3.pdf"><img src="//www.weebly.com/weebly/images/file_icons/pdf.png" width="36" height="36" style="float: left; position: relative; left: 0px; top: 0px; margin: 0 15px 15px 0; border: 0;" /></a><div style="float: left; text-align: left; position: relative;"><table style="font-size: 12px; font-family: tahoma; line-height: .9;"><tr><td colspan="2"><b> statistical_significance_vs._practical_significance_-_an_exercise_v3.pdf</b></td></tr><tr style="display: none;"><td>File Size:  </td><td>341 kb</td></tr><tr style="display: none;"><td>File Type:  </td><td> pdf</td></tr></table><a title="Download file: statistical_significance_vs._practical_significance_-_an_exercise_v3.pdf" href="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/statistical_significance_vs._practical_significance_-_an_exercise_v3.pdf" style="font-weight: bold;">Download File</a></div> </div>  <hr style="clear: both; width: 100%; visibility: hidden"></hr></div>]]></content:encoded></item><item><title><![CDATA[Paradoxical 
Reversals After Analysis]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/paradoxical-reversals-after-analysis]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/paradoxical-reversals-after-analysis#comments]]></comments><pubDate>Mon, 20 May 2019 04:00:00 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/paradoxical-reversals-after-analysis</guid><description><![CDATA[ Does it drive you crazy to see two analyses of the same data reaching opposite conclusions? I just discovered&nbsp;Simpson's Paradox, Lord's Paradox, and Suppression Effects are the same phenomenon &ndash; the reversal paradox, by Yu-Kang Tu, David Gunnell, and Mark S. Gilthorpe (Emerging Themes in Epidemiology 5.1, 2008).Such contradictory results are all too common. It might seem at first that more of X causes an increase in Y, but when we control (or adjust) for Z, we find the opposite! I&rs [...] ]]></description><content:encoded><![CDATA[<span class='imgPusher' style='float:left;height:0px'></span><span style='display: table;width:auto;position:relative;float:left;max-width:100%;;clear:left;margin-top:1px;*margin-top:2px'><a><img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/editor/180.jpg?1648900264" style="margin-top: 10px; margin-bottom: 10px; margin-left: 0px; margin-right: 10px; border-width:0; max-width:100%" alt="Picture" class="galleryImageBorder wsite-image" /></a><span style="display: table-caption; caption-side: bottom; font-size: 90%; margin-top: -10px; margin-bottom: 10px; text-align: center;" class="wsite-caption"></span></span> <div class="paragraph" style="display:block;"><span>Does it drive you crazy to see two analyses of the same data reaching opposite conclusions? 
I just discovered&nbsp;</span><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2254615/" target="_blank">Simpson's Paradox, Lord's Paradox, and Suppression Effects are the same phenomenon &ndash; the reversal paradox</a><span>, by Yu-Kang Tu, David Gunnell, and Mark S. Gilthorpe (</span><em>Emerging Themes in Epidemiology 5</em><span>.1, 2008).</span><br /><br /><span>Such contradictory results are all too common. It might seem at first that more of X causes an increase in Y, but when we control (or adjust) for Z, we find the opposite! I&rsquo;m continually interested in ways to better use analysis to understand cause and effect, and to distinguish causation from mere correlation. So it&rsquo;s important to get a handle on when and why such contradictions can occur, and what&rsquo;s the best way to interpret them.</span><br /><br /><span>The authors methodically explain what conditions can lead to such reversals. They show how each of three types of reversal effects can occur when statistical control is introduced, and they explain how variables&rsquo;&nbsp;</span><a href="https://en.wikipedia.org/wiki/Level_of_measurement" target="_blank"><em>level of measurement</em></a><span>&nbsp;(categorical or continuous) affects the type of reversal that can occur.</span><br /><br /><strong>Most important</strong><span>, Tu et al. stress that when we decide whether to control for some&nbsp;</span><a href="https://en.wikipedia.org/wiki/Confounding" target="_blank"><em>confounder</em></a><span>, or nuisance variable lurking in the background, we shouldn&rsquo;t make this decision purely on statistical grounds. 
It takes sound knowledge of the subject matter in question, and not merely statistical know-how, to design an analysis that will produce solid and believable cause-and-effect results.</span><br /><br /><em>&ldquo;It's easy to lie with statistics; it's easier to lie without them.&rdquo; Frederick Mosteller</em></div> <hr style="width:100%;clear:both;visibility:hidden;"></hr>]]></content:encoded></item><item><title><![CDATA[Striking Findings on Baseball Umpires]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/striking-findings-on-baseball-umpires]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/striking-findings-on-baseball-umpires#comments]]></comments><pubDate>Mon, 07 Jan 2019 05:00:00 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/striking-findings-on-baseball-umpires</guid><description><![CDATA[       An ingenious&nbsp;FiveThirtyEight&nbsp;article by Michael Lopez, Brian Mills, and Gus Wezerek tries to show that "Everyone Wants To Go Home During Extra Innings &mdash; Maybe Even The Umps." They find that in extra innings major league umpires, probably unwittingly, change their patterns of ball and strike calls in ways that tend to end the game quickly.The authors analyzed a sample of roughly 32,000 pitches thrown between 2008 and 2016. They obtained data using Bill Petti&rsquo;s basebal [...] 
]]></description><content:encoded><![CDATA[<div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/umpire.jpg?1648903899" alt="Picture" style="width:238;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph">An ingenious&nbsp;<a href="https://fivethirtyeight.com/features/everyone-wants-to-go-home-during-extra-innings-maybe-even-the-umps/" target="_blank">FiveThirtyEight</a>&nbsp;article by Michael Lopez, Brian Mills, and Gus Wezerek tries to show that "Everyone Wants To Go Home During Extra Innings &mdash; Maybe Even The Umps." They find that in extra innings major league umpires, probably unwittingly, change their patterns of ball and strike calls in ways that tend to end the game quickly.<br /><br />The authors analyzed a sample of roughly 32,000 pitches thrown between 2008 and 2016. They obtained data using Bill Petti&rsquo;s baseballr package, scraping pitch locations from Baseballsavant.mlb.com.<br /><br />I love the fact that they undertook this work, and their nifty data graphic, but I wish it were clearer what question each result answers.<br /><br />At one point the main question is presented as a) How much umpires tend to favor calls that would hasten an ending, comparing certain extra-inning scenarios vs. ordinary scenarios.<br /><br />At another point it's stated as b) Strike rates in certain extra-inning scenarios for "teams that are in a position to win vs. 
teams that look like they&rsquo;re about to lose."<br /><br />A third and more complex comparison is implied by c) How umps "changed their behavior in these situations between 2008 and 2016," but I doubt this is what the authors intended to say.<br /><br />Comments to the article abound, but until we know for sure what each finding means....<br /><br />Finally, not that&nbsp;<a href="http://yellowbrickstats.com/insidious.htm">statistical significance</a>&nbsp;is the be-all and end-all, but it wouldn't have hurt to run a significance test or two, to let us know just how unusual the differences cited would be if one supposes they occurred by chance.<br /><span></span></div>]]></content:encoded></item><item><title><![CDATA[Key Issue Missing from Reporting on Harvard's Race-Conscious Admissions]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/key-issue-missing-from-reporting-on-harvards-race-conscious-admissions]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/key-issue-missing-from-reporting-on-harvards-race-conscious-admissions#comments]]></comments><pubDate>Sat, 13 Oct 2018 04:00:00 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/key-issue-missing-from-reporting-on-harvards-race-conscious-admissions</guid><description><![CDATA[       I've looked in vain for a good, in-depth treatment of the Harvard case centering on anti-Asian bias. The Oct. 11&nbsp;New Yorker&nbsp;column by Harvard Law professor Jeannie Suk Gersen introduces the problem but declines to cite a single number. Elsewhere, reporting commonly cites Asian-Americans' outsized percentage of the Harvard student body vs. their percentage of the US population. What I don't see is any source definitively reporting this group's admission&nbsp;rate&nbsp;as compared [...] 
]]></description><content:encoded><![CDATA[<div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/harvard.jpg?1648995073" alt="Picture" style="width:300;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph" style="text-align:left;">I've looked in vain for a good, in-depth treatment of the Harvard case centering on anti-Asian bias. The Oct. 11&nbsp;<a href="https://www.newyorker.com/news/our-columnists/anti-asian-bias-not-affirmative-action-is-on-trial-in-the-harvard-case">New Yorker</a>&nbsp;column by Harvard Law professor Jeannie Suk Gersen introduces the problem but declines to cite a single number. Elsewhere, reporting commonly cites Asian-Americans' outsized percentage of the Harvard student body vs. their percentage of the US population. What I don't see is any source definitively reporting this group's admission&nbsp;<em>rate</em>&nbsp;as compared with other races'--let alone pinpointing that difference&nbsp;<strong>when one controls for other relevant factors</strong>. That's the crux of the matter.<br /><br />The Oct. 12 Nell Gluckman article in the&nbsp;<a href="https://www.chronicle.com/article/Harvard-s-Race-Conscious/244793">Chronicle of Higher Education</a>&nbsp;suffers from this deficiency. So does Colleen Walsh's Aug 31&nbsp;<a href="https://news.harvard.edu/gazette/story/2018/08/hundreds-of-experts-scholars-back-harvard-in-admissions-suit/">Harvard Gazette</a>&nbsp;story. Somewhat more helpful is this passage from Julie J. Park's Sep. 
24&nbsp;<a href="https://www.insidehighered.com/admissions/views/2018/09/04/harvards-admissions-policies-are-being-distorted-lawsuit-charging-anti">Inside Higher Ed</a>&nbsp;column:<br /><br />"According to an expert report filed in the case on the side of Harvard by David Card of the University of California, Berkeley, the admit rate for the Classes of 2014-2019 was 5.15 percent for Asian Americans and 4.91 percent for white applicants who are not recruited athletes, legacies, on a special dean&rsquo;s list or children of faculty/staff members. It is problematic that white people are more likely to fall into these special categories [....]"<br /><br />This leaves me to imagine that an apples-to-apples comparison, one which adds back all such special categories for Whites, could yield racial admit-rates that are sharply different, on the order of 12% vs. 5%, or rather similar, such as 7% vs. 5%.<br /><br />More helpful still is the&nbsp;<a href="https://www.economist.com/united-states/2018/06/23/a-lawsuit-reveals-how-peculiar-harvards-definition-of-merit-is">Economist</a>&nbsp;story from June 23. It describes an intriguing result from the plaintiff's consulting economist, Peter Arcidiacono, using an unspecified "statistical model." Controlling for other (unspecified) factors,<br /><br />"He estimates that a male, non-poor Asian-American applicant with the qualifications to have a 25% chance of admission to Harvard would have a 36% chance if he were white. If he were Hispanic, that would be 77%; if black, it would rise to 95%."<br /><br />This summary, of course, describes a special, narrow case. The full analysis would presumably cover students from the entire socio-economic spectrum, from all genders, and so on, and those findings could hardly be as striking as these. We can only hope Arcidiacono's methods are given adequate scrutiny. 
Models purported to be establishing cause and effect, especially those that rely on statistical control, can go awry in so many ways. And they can lead to bizarre conclusions. The late statistician Elazar Pedhazur used to spoof analyses that in effect answered questions akin to "How tall would this corn plant have grown if it had been a tomato plant?"<br />&nbsp;<br /><br /></div>]]></content:encoded></item><item><title><![CDATA[Ingenious Research Linking Tree Cover with Student Learning]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/ingenious-research-linking-tree-cover-with-student-learning]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/ingenious-research-linking-tree-cover-with-student-learning#comments]]></comments><pubDate>Mon, 01 Oct 2018 04:00:00 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/ingenious-research-linking-tree-cover-with-student-learning</guid><description><![CDATA[       It's heartening to see the original, high-quality research reflected in&nbsp;Might School Performance Grow on Trees? Examining the Link Between &ldquo;Greenness&rdquo; and Academic Achievement in Urban, High-Poverty Schools, a joint project of the U. of Illinois and the U.S. Forest Service. Ming Kuo, Matthew H. E. M. Browning, Sonya Sachdeva, Kangjae Lee and Lynne Westphal have admirably investigated the connection between amount of tree cover around Chicago schools and the extent of stud [...] 
]]></description><content:encoded><![CDATA[<div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/leafy-playground.jpg?1648901103" alt="Picture" style="width:371;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph">It's heartening to see the original, high-quality research reflected in&nbsp;<a href="https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01669/full">Might School Performance Grow on Trees? Examining the Link Between &ldquo;Greenness&rdquo; and Academic Achievement in Urban, High-Poverty Schools</a>, a joint project of the U. of Illinois and the U.S. Forest Service. Ming Kuo, Matthew H. E. M. Browning, Sonya Sachdeva, Kangjae Lee and Lynne Westphal have admirably investigated the connection between amount of tree cover around Chicago schools and the extent of student learning in math and reading, while striving to rule out other factors that could explain the variation in student performance.<br /><br />How unusual among educational research projects to gather data using "Light Detection and Ranging (LiDAR) collected with a scanning laser instrument mounted onto a low-flying airplane"!<br /><br />One might be impatient to suggest, as I was, that amount of tree cover at school could be serving as a proxy for level of affluence in the neighborhood generally-- which would perhaps be a truer cause of achievement level. The authors thought of this too and controlled for it effectively in their sequential regression analysis:<br /><br />"School Trees contribute uniquely to the prediction of academic achievement even after Neighborhood Trees are statistically controlled for. Neighborhood Trees, however, showed [little relationship with achievement] once School Trees were statistically controlled for. 
These findings suggest School Trees are stronger drivers of academic performance than other types of greenness, including grass cover and trees in surrounding neighborhoods."<br />&#8203;<br />I also recommend this article for its intelligent Limitations section.<br />&nbsp;<br /></div>]]></content:encoded></item><item><title><![CDATA[How (Not) to Assess the Effect of Images in Warning Labels]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/how-not-to-assess-the-effect-of-images-in-warning-labels]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/how-not-to-assess-the-effect-of-images-in-warning-labels#comments]]></comments><pubDate>Fri, 29 Jun 2018 04:00:00 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/how-not-to-assess-the-effect-of-images-in-warning-labels</guid><description><![CDATA[       "Ours is the first study to evaluate the effectiveness of sugary drink warning labels," touts Grant Donnelly, a lead author of a joint&nbsp;study&nbsp;by the Harvard Business School and Harvard University Behavioral Insights Group. Kudos for their smart approach to testing the effect of images as part of those warning labels (objective measures showed that images indeed brought about the desired reduction in purchases).But shame on the researchers for ignoring or missing decades of psycho [...] 
]]></description><content:encoded><![CDATA[<div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/soda-label.jpg?1648901401" alt="Picture" style="width:339;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph">"Ours is the first study to evaluate the effectiveness of sugary drink warning labels," touts Grant Donnelly, a lead author of a joint&nbsp;<a href="https://news.harvard.edu/gazette/story/2018/06/graphic-warning-labels-on-sugary-drinks-linked-to-reduced-purchases/" target="_blank">study</a>&nbsp;by the Harvard Business School and Harvard University Behavioral Insights Group. Kudos for their smart approach to testing the effect of images as part of those warning labels (objective measures showed that images indeed brought about the desired reduction in purchases).<br /><br />But shame on the researchers for ignoring or missing decades of psychological and behavioral-economics research on the best ways of investigating cause and effect. For the study also incorporated a naive direct question asking participants "how seeing a graphic warning label would influence their drink purchases." An abundant literature, from&nbsp;<a href="https://www.researchgate.net/publication/229060046_Telling_More_Than_We_Can_Know_Verbal_Reports_on_Mental_Processes" target="_blank">Nisbett and Wilson (1977)</a>&nbsp;to my own recent&nbsp;<a href="http://www.yellowbrickstats.com/documents/BehavioralEconomicsHigherEducation.pdf">article</a>, shows that it would be foolish to trust in such subjective interpretations of the factors behind each person's decision-making process. After acquiring such good, objective information, why would Donnelly et al. 
water it down with subjective findings that are sure to introduce bias?<br /><br />UPDATE: The original study materials made available by the authors at&nbsp;<a href="https://osf.io/wqk65/register/5771ca429ad5a1020de2872e" target="&rdquo;_blank&rdquo;">Open Science Framework</a>&nbsp;tell a different story than the summary in the Harvard&nbsp;<em>Gazette</em>&nbsp;quoted above. The survey did&nbsp;<em>not</em>&nbsp;ask respondents "how seeing a graphic warning label would influence their drink purchases." Instead, the survey asked for reactions to the images and then separately asked about intention to buy a soft drink.&nbsp; Evaluated in this way, each topic was much more amenable to unbiased reporting by a participant than that person's causal assessment would be. The responses would then be linked "in the back end" by the researchers to investigate any causal connection. A good design after all.</div>]]></content:encoded></item><item><title><![CDATA[Test What You've Learned]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/test-what-youve-learned]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/test-what-youve-learned#comments]]></comments><pubDate>Thu, 15 Mar 2018 04:00:00 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/test-what-youve-learned</guid><description><![CDATA[The most impressive users of research and analysis are those who take the results and feed them into further experimentation.&nbsp; The instinct to test, to experiment, creates progress regardless of whether the industry is business or academia, for-profit or non-profit. Once, after a lengthy discussion of admissions and financial aid policy options with a college administrator, one of us suggested what was for that school a novel course:&nbsp; experiment.&nbsp; Take the alternative [...] 
]]></description><content:encoded><![CDATA[<div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/httpspixabay-comvectorschemistry-laboratory-experiment-148044.png?1648906006" alt="Picture" style="width:107;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph"><span style="color:rgb(2, 37, 81)">&#8203;The most impressive users of research and analysis are those who take the results and feed them into further experimentation.&nbsp; The instinct to test, to experiment, creates progress regardless of whether the industry is business or academia, for-profit or non-profit.</span><br /><br /><span style="color:rgb(2, 37, 81)">Once, after a lengthy discussion of admissions and financial aid policy options with a college administrator, one of us suggested what was for that school a novel course:&nbsp; experiment.&nbsp; Take the alternatives that were the subject of so much protracted consideration and put each into practice with a subset of prospective students.&nbsp; In two months, evaluate each action.&nbsp;</span><br /><br /><span style="color:rgb(2, 37, 81)">"This is the real world," the administrator responded.&nbsp; "We can't play games!"&nbsp; Think about how you would respond to this.&nbsp; Is empirical testing somehow risky?&nbsp; If so, is it riskier than setting a course without the benefit of evidence?&nbsp; &nbsp;You can compare it to the choice to keep money under the mattress.&nbsp; The person making that choice has to be unaware that investment risk is overshadowed by the near-certainty of inflation eating into the value of that cash.</span><br /><br /><span style="color:rgb(2, 37, 81)">Kudos to those who find ways to use analytic results to enhance further learning and to push toward the next peak.&nbsp; If that describes 
you, we'd be proud to help.&nbsp;</span></div>]]></content:encoded></item><item><title><![CDATA["Of Poohsticks and p-values:  Hypothesis Testing in the Hundred Acre Wood"]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/of-poohsticks-and-p-values-hypothesis-testing-in-the-hundred-acre-wood]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/of-poohsticks-and-p-values-hypothesis-testing-in-the-hundred-acre-wood#comments]]></comments><pubDate>Tue, 13 Mar 2018 04:00:00 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/of-poohsticks-and-p-values-hypothesis-testing-in-the-hundred-acre-wood</guid><description><![CDATA[       I just discovered Eric D. Nordmoe's fun and informative&nbsp;creation&nbsp;from 2004. "A walk through Milne's Enchanted forest leads to an unexpected encounter with hypothesis testing." This enjoyable little article is instructive for those new to statistics and full of pleasing connections for the initiated. [...] ]]></description><content:encoded><![CDATA[<div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/winnie-the-pooh.png?1648901630" alt="Picture" style="width:403;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph">I just discovered Eric D. Nordmoe's fun and informative&nbsp;<a href="http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9639.2004.00163.x/pdf" target="&rdquo;_blank&rdquo;">creation</a>&nbsp;from 2004. "A walk through Milne's Enchanted forest leads to an unexpected encounter with hypothesis testing." 
This enjoyable little article is instructive for those new to statistics and full of pleasing connections for the initiated.</div>]]></content:encoded></item><item><title><![CDATA[Gun Control:  the Right Research Evidence Makes Policy Decisions Easy]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/gun-control-the-right-research-evidence-makes-policy-decisions-easy]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/gun-control-the-right-research-evidence-makes-policy-decisions-easy#comments]]></comments><pubDate>Mon, 12 Mar 2018 04:00:00 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/gun-control-the-right-research-evidence-makes-policy-decisions-easy</guid><description><![CDATA[Suppose a nationally-scaled, 30-year, multiple-author, peer-reviewed, non-partisan, public-health-oriented study concluded the following: "Where guns are more widely available, no more of the burglars and intruders are getting shot, but more of the gun-owners' family and friends are." This is the central finding of&nbsp;The Relationship Between Gun Ownership and Stranger and Nonstranger Firearm Homicide Rates in the United States, 1981&ndash;2010. The authors explain, "Our models co [...] 
]]></description><content:encoded><![CDATA[<div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/gun.jpg?1648901754" alt="Picture" style="width:214;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph">&#8203;Suppose a nationally-scaled, 30-year, multiple-author, peer-reviewed, non-partisan, public-health-oriented study concluded the following: "Where guns are more widely available, no more of the burglars and intruders are getting shot, but more of the gun-owners' family and friends are."<br /><br />This is the central finding of&nbsp;<a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4167105/">The Relationship Between Gun Ownership and Stranger and Nonstranger Firearm Homicide Rates in the United States, 1981&ndash;2010</a>. The authors explain, "Our models consistently failed to uncover a robust, statistically significant relationship between gun ownership and stranger firearm homicide rates (Tables 3 and 4). All models, however, showed a positive and significant association between gun ownership and nonstranger firearm homicide rates."<br /><br />&#8203;They add: "for each 1 percentage point increase in the gun ownership proxy, [stranger firearm homicide rates stayed the same, whereas] nonstranger firearm homicide rates increased by 1.4%. [Similarly,] a 1 standard deviation increase in gun ownership [13.8%] was associated with a 21.1% increase in the nonstranger firearm homicide rate."<br /><br />The research is very sound.<br />&#8203;<ul><li>Siegel, Negussie, Vanture, Pleskunas, Ross, and King paid close attention to the validity of the indicators they used, and they made intelligent use of a proxy when a direct measurement was not available. 
For their main predictor, "the annual prevalence of household firearm ownership in a given state," they substituted the percentage of suicides committed using a firearm, and they clearly explained why this would be effective.</li></ul> &nbsp;<ul><li>The authors took great care to isolate the relationship of greatest interest by controlling for nuisance variables.</li></ul> &nbsp;<ul><li>They conducted a sensitivity analysis: where a judgment call might result in the choice of one analytic approach or another, they analyzed their data in multiple ways to see how much the results changed. One example of this was their treatment of missing data.</li></ul><br />Can you refute their findings?<br /><br /></div>]]></content:encoded></item><item><title><![CDATA[How *Not* to Attribute Causality from Statistical Results]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/how-not-to-attribute-causality-from-statistical-results]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/how-not-to-attribute-causality-from-statistical-results#comments]]></comments><pubDate>Sat, 09 Sep 2017 04:00:00 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/how-not-to-attribute-causality-from-statistical-results</guid><description><![CDATA[[From a major outlet for health care research findings, Fierce Health Care. I've reproduced key passages in blue-black and commented inline in orange.] Employment status is the top socioeconomic factor affecting 30-day [US hospital] readmissions for heart failure, heart attacks or pneumonia, according to a new study from Truven Health Analytics. [Such a conclusion is on very shaky ground, as you'll see.] As readmission penalties reach record highs, analyzing causes is more important than e [...] 
]]></description><content:encoded><![CDATA[<div class="paragraph"></div>  <div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/cause-effect.jpg?1648902257" alt="Picture" style="width:241;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph"><strong><span style="color:#CC4400">[From a major outlet for health care research findings, <em>Fierce Health Care</em>. I've reproduced key passages in blue-black and commented inline in orange.]</span></strong><br /><br /><span>Employment status is the top socioeconomic factor affecting 30-day [US hospital] readmissions for heart failure, heart attacks or pneumonia, according to a new study from Truven Health Analytics.</span><br /><br /><strong><span style="color:#CC4400">[Such a conclusion is on very shaky ground, as you'll see.]</span></strong><br /><br /><span>As readmission penalties reach record highs, analyzing causes is more important than ever.</span><br /><br /><strong><span style="color:#CC4400">[Granted!]</span></strong><br /><br /><span>Researchers, led by David Foster, Ph.D., collected 2011 and 2012 data from the Centers for Medicare &amp; Medicaid Services and used a statistical test called the Variance Inflation Factor (VIF) for correlations among the nine factors in the Community Need Index (CNI): elderly poverty, single parent poverty, child poverty, uninsurance, minority, no high school, renting, unemployment and limited English.</span><br /><br /><strong><span style="color:#CC4400">[In truth, the VIF tells&nbsp;<u>not</u>&nbsp;what is the most important factor, but only to what extent the different factors, or independent variables, overlap with one another, potentially confounding the results. 
In this case, trying to isolate one indicator of socioeconomic status (SES) while controlling for eight others will surely distort any connections found. These SES indicators are too much "part and parcel of" one another, too inseparable, to allow for valid use of control in this way.<br /><br />To explain further:&nbsp; it's a mistake to ask "How much does SES (indicator 1) relate to readmission if we statistically remove SES (indicators 2-9) from the relationship?" That'd be much like saying, "How addicted am I to desserts if you discount my intake of cookies, pie, and ice cream?" Or there's Monty Python's question, "Apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, the fresh-water system, and public health, what have the Romans ever done for us?"]</span></strong><br /><br /><span>Their analysis found unemployment and lack of high school education were the only statistically significant factors in connection with readmissions, carrying a risk of 18.1 percent and 5.3 percent, respectively, according to the study.</span><br /><br /><strong><span style="color:#CC4400">[As explained above, these are not valid conclusions to be drawn. But even if the numbers were somehow accurate, what could such statements mean? That readmission risk becomes on average 5.3% for non-high-school graduates? It can't be -- that'd be far too low. That it's 5.3 points higher than it would be otherwise? It can't be that either -- too high. How about 5.3% higher in relative terms? Maybe, but that's about 1 point, which would hardly merit calling high school education an important factor. 
So what's left?]</span></strong></div>]]></content:encoded></item><item><title><![CDATA[A Brilliant Look at Public Protest Using a Natural Experiment]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/a-brilliant-look-at-public-protest-using-a-natural-experiment]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/a-brilliant-look-at-public-protest-using-a-natural-experiment#comments]]></comments><pubDate>Sat, 09 Sep 2017 04:00:00 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/a-brilliant-look-at-public-protest-using-a-natural-experiment</guid><description><![CDATA[       Want to know to what degree political demonstrations produced results in elections? Track the rain.&nbsp;The rain?&nbsp;It actually makes a beautiful example of what's termed an&nbsp;instrumental variable.&nbsp;Read Dan Kopf's excellent&nbsp;Quartz summary&nbsp;or the full&nbsp;article&nbsp;by Andreas Madestam, Daniel Shoag, Stan Veuger, and David Yanagizawa-Drott from Harvard and Stockholm Universities.&nbsp;&#8203;&#8203;Whether it rains at protest locations can scarcely have anything&n [...] ]]></description><content:encoded><![CDATA[<div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/published/protest.jpg?1648902068" alt="Picture" style="width:342;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph"><span>Want to know to what degree political demonstrations produced results in elections? 
Track the rain.&nbsp;</span><em>The rain?</em><span>&nbsp;It actually makes a beautiful example of what's termed an&nbsp;</span><em>instrumental variable</em><span>.&nbsp;</span><span style="color:rgb(2, 37, 81)">Read Dan Kopf's excellent&nbsp;</span><a href="https://qz.com/901411/political-protests-are-effective-but-not-for-the-reason-you-think/" target="&rdquo;_blank&rdquo;">Quartz summary</a><span style="color:rgb(2, 37, 81)">&nbsp;or the full&nbsp;</span><a href="https://www.hks.harvard.edu/fs/dshoag/Documents/Political%20Protests%20--%20Evidence%20from%20the%20Tea%20Party.pdf" target="&rdquo;_blank&rdquo;">article</a><span style="color:rgb(2, 37, 81)">&nbsp;by Andreas Madestam, Daniel Shoag, Stan Veuger, and David Yanagizawa-Drott from Harvard and Stockholm Universities.&nbsp;</span>&#8203;<span><br /><br />&#8203;Whether it rains at protest locations can scarcely have anything&nbsp;</span><em>directly</em><span>&nbsp;to do with ultimate election results, but it unquestionably relates to turnout for each demonstration. If the size of turnout relates to election results, then the rain should, statistically (if not causally), relate to them as well. 
"If the absence of rain means bigger protests, and bigger protests actually make a difference, then local political outcomes ought to depend on whether or not it rained [on protest days]...As it turns out, protest size really does matter."</span></div>]]></content:encoded></item><item><title><![CDATA[Hospital Readmission Rates:  58% of Variance Explained?!?]]></title><link><![CDATA[https://www.integrativestatistics.com/blog/hospital-readmission-rates-58-of-variance-explained8079616]]></link><comments><![CDATA[https://www.integrativestatistics.com/blog/hospital-readmission-rates-58-of-variance-explained8079616#comments]]></comments><pubDate>Wed, 18 Nov 2015 05:00:00 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.integrativestatistics.com/blog/hospital-readmission-rates-58-of-variance-explained8079616</guid><description><![CDATA[ "Fifty-eight percent of national variation in hospital readmission rates was explained by the county in which the hospital was located," announce Jeph Herrin et al. in&nbsp;Community Factors and Hospital Readmission Rates, published in 2014 in&nbsp;Health Services Research. Sound odd to you? After all, for most readmission studies the percent explained is in single digits. Being able to account for 4 or 5% of the variation translates to an ability to&nbsp;assess individual risk&nbsp;that can me [...] 
]]></description><content:encoded><![CDATA[<span class='imgPusher' style='float:left;height:0px'></span><span style='display: table;width:auto;position:relative;float:left;max-width:100%;;clear:left;margin-top:0px;*margin-top:0px'><a><img src="https://www.integrativestatistics.com/uploads/1/1/7/8/117816039/editor/sand-thru-hands.jpg?250" style="margin-top: 10px; margin-bottom: 10px; margin-left: 0px; margin-right: 10px; border-width:0; max-width:100%" alt="Picture" class="galleryImageBorder wsite-image" /></a><span style="display: table-caption; caption-side: bottom; font-size: 90%; margin-top: -10px; margin-bottom: 10px; text-align: center;" class="wsite-caption"></span></span> <div class="paragraph" style="display:block;">"Fifty-eight percent of national variation in hospital readmission rates was explained by the county in which the hospital was located," announce Jeph Herrin et al. in&nbsp;<a href="http://onlinelibrary.wiley.com/doi/10.1111/1475-6773.12177/pdf" target="&rdquo;_blank&rdquo;">Community Factors and Hospital Readmission Rates</a>, published in 2014 in&nbsp;<a href="http://www.hsr.org/" target="&rdquo;_blank&rdquo;"><em>Health Services Research</em></a>. Sound odd to you? After all, for most readmission studies the percent explained is in single digits. Being able to account for 4 or 5% of the variation translates to an ability to&nbsp;<a href="http://www.reinforcedcare.com/wp-content/uploads/2016/09/ReInforced-Care-Risk-Scoring-Overview-Sep-2016.pdf" target="&rdquo;_blank&rdquo;">assess individual risk</a>&nbsp;that can meaningfully aid in clinical decisions. Even Harlan Krumholz and his team of 17 researchers and statisticians, the ones whose predictive models underpin the national readmission penalty system, have usually explained&nbsp;<span style="color:rgb(2, 37, 81)">only</span>&nbsp;3-8%. And those models have taken into account about 50 input variables.<br /><br />It turns out that Herrin et al. 
took their data on 4,073 hospitals and broke it down by&nbsp;<em>2,254 counties</em>. That is more than one county for every two hospitals, and many counties contained only a single hospital.<br /><br />Now, suppose the authors had divided the 4,073 hospitals into, say, 4 groups defined by region, and found that the 4 groups had sizeable differences in readmission rate. That would have been a meaningful way to summarize the data. Even with somewhat more groups -- say, one for each of the 50 states -- that&nbsp;<em>might</em>&nbsp;have been meaningful, though the data would have been spread pretty thin for some states. But to "explain" differences using 2,254 groups? It's not a far cry from simply listing the readmission rates of all 4,073 hospitals and claiming victoriously to have "explained" 100% of the variance in the hospital-to-hospital rate. Sounds like a feat for&nbsp;<a href="https://twitter.com/CaptainObvious/" target="&rdquo;_blank&rdquo;"><strong>Captain Obvious</strong></a>.&nbsp; It's a tautology.&nbsp; With such an approach, any apparent "explanation" of the outcome is empty.<br /><br />One reason this matters a great deal: to the extent that some geographic factor is considered responsible for this outcome, hospital performance will not be. So if county legitimately explained 58% of the variance, then hospital performance, it might be argued, couldn't account for more than 42%. This is the incorrect conclusion that was reported in unqualified fashion by news outlets such as&nbsp;Becker's Hospital Review.<br /><br />The article by Herrin and colleagues makes contributions in other ways, of course, but the chief findings are very misleading. Watch for dialogue, in&nbsp;<em>Health Services Research&nbsp;</em>or elsewhere, on how to interpret the results. The upshot should be quite a bit more nuanced and moderated than what we've seen above. 
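<br /><br />To see how much grouping alone can "explain," here is a minimal Python sketch. It is not the method Herrin et al. used. It assumes a simple comparison of group means, made-up rates that are pure random noise, and a hypothetical assignment of 4,073 hospitals to groups; the function names are illustrative only. It computes the share of variance attributed to the grouping when the groups have no real effect at all.<br />
<pre>
```python
import random
from collections import defaultdict


def share_of_variance_explained(values, groups):
    """Between-group sum of squares divided by total sum of squares."""
    grand_mean = sum(values) / len(values)
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    between = sum(
        len(members) * (sum(members) / len(members) - grand_mean) ** 2
        for members in by_group.values()
    )
    total = sum((value - grand_mean) ** 2 for value in values)
    return between / total


def simulate(n_hospitals, n_groups, seed=0):
    """Share of variance 'explained' by grouping pure-noise rates."""
    rng = random.Random(seed)
    # Hypothetical assignment: each group gets at least one hospital,
    # and the remaining hospitals are placed at random.
    groups = list(range(n_groups))
    groups += [rng.randrange(n_groups) for _ in range(n_hospitals - n_groups)]
    # Assumed data: rates are random noise, unrelated to group.
    rates = [rng.gauss(0.0, 1.0) for _ in range(n_hospitals)]
    return share_of_variance_explained(rates, groups)


if __name__ == "__main__":
    for n_groups in (4, 50, 2254):
        print(n_groups, round(simulate(4073, n_groups), 3))
```
</pre>
With noise-only data, the expected share is about (groups - 1) / (hospitals - 1): well under 1% for 4 groups, about 1% for 50, and somewhere near 55% for 2,254. The study's own model may partition variance differently, so treat this only as an illustration of why so many groups call for caution.<br /><br />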
And if you're interested in the role of socioeconomic factors in hospital readmission, you'll find some eye-opening results out of Missouri at <em><a href="https://www.sciencedaily.com/releases/2014/05/140505211048.htm" target="_blank">Science News</a></em>.</div> <hr style="width:100%;clear:both;visibility:hidden;"></hr>]]></content:encoded></item></channel></rss>