<div class="xblock xblock-public_view xblock-public_view-vertical" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@vertical+block@62442c3160d94ae2bd3d823de6c628f0" data-init="VerticalStudentView" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="vertical" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<h2 class="hd hd-2 unit-title">Introduction</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@43bab6dfa7924fe8b7de825cbd9362a6">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@43bab6dfa7924fe8b7de825cbd9362a6" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<p>An outlier is a data point that differs markedly from the remaining data. Outliers are also referred to as abnormalities, discordants, deviants, and anomalies. Whereas noise can be defined as mislabeled examples (class noise) or errors in the values of attributes (attribute noise), an outlier is a broader concept that includes not only errors but also discordant data that may arise from the natural variation within the population or process. As such, outliers often contain interesting and useful information about the underlying system. These particularities have been exploited in fraud control, intrusion detection systems, web robot detection, weather forecasting, law enforcement and medical diagnosis, generally using methods of supervised outlier detection (see below).</p>
<p>Within the medical domain in general, the main sources of outliers are equipment malfunctions, human errors, anomalies arising from patient-specific behaviors and natural variation within patients. Consider for instance an anomalous blood test result. Several reasons can explain the presence of outliers: severe pathological states, intake of drugs, food or alcohol, recent physical activity, stress, menstrual cycle, poor blood sample collection and/or handling. While some reasons may point to the existence of patient-specific characteristics discordant with the “average” patient, in which case the observation being an outlier provides useful information, other reasons may point to human errors, and hence the observation should be considered for removal or correction. Therefore, it is crucial to consider the causes that may be responsible for outliers in a given dataset before proceeding to any type of action.</p>
<p>The consequences of not screening the data for outliers can be catastrophic. The negative effects of outliers can be summarized as: (1) increase in error variance and reduction in statistical power; (2) decrease in normality for the cases where outliers are non-randomly distributed; (3) model bias by corrupting the true relationship between exposure and outcome.</p>
<p>A good understanding of the data itself is required before choosing a model to detect outliers, and several factors influence the choice of an outlier identification method, including the type of data, its size and distribution, the availability of ground truth about the data, and the need for interpretability in a model. For example, regression-based models are better suited for finding outliers in linearly correlated data, while clustering methods are advisable when the data is not linearly distributed along correlation planes. While this chapter provides a description of some of the most common methods for outlier detection, many others exist.</p>
<p>Evaluating the effectiveness of an outlier detection algorithm and comparing different approaches is complex. Moreover, the ground truth about outliers is often unavailable, as in unsupervised scenarios, which hampers the use of quantitative methods to assess the effectiveness of the algorithms rigorously. The analyst is then left with a qualitative and intuitive evaluation of the results. To overcome this difficulty, this chapter uses logistic regression models to investigate the performance of different outlier identification techniques in a medically relevant case study.</p>
</div>
</div>
<div class="vert vert-1" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@b8f1102016394b7bb111c2f90b22d9c3">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@b8f1102016394b7bb111c2f90b22d9c3" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<h3>Learning Objectives</h3>
<ul>
<li>What common methods for outlier detection are available.</li>
<li>How to choose the most appropriate methods.</li>
<li>How to assess the performance of an outlier detection method and how to compare different methods.</li>
</ul>
</div>
</div>
<div class="vert vert-2" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@1f21e8a5a9cf4e4ea028d31660f27087">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@1f21e8a5a9cf4e4ea028d31660f27087" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<h3>Subsection Credits</h3>
<p>Textbook chapter 14 - Noise Versus Outliers - Cátia M. Salgado, Carlos Azevedo, Hugo Proença and Susana M. Vieira</p>
<p>edX content: Kimiko Kechun Huang</p>
<p>Videos: Videos in this unit are presented by <span style="color: #313131; font-family: 'Open Sans', 'Helvetica Neue', Helvetica, Arial, sans-serif;">Jesse Raffa</span></p>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@vertical+block@4abc417eda434cc997de210f6afb2457" data-init="VerticalStudentView" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="vertical" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<h2 class="hd hd-2 unit-title">Theoretical Concepts</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@c92f4e3db1ee4ffc9ea31f7d256db901">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@c92f4e3db1ee4ffc9ea31f7d256db901" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<h3>Outlier Detection</h3>
<p>Outlier identification methods can be classified into supervised and unsupervised methods, depending on whether prior information about the abnormalities in the data is available or not. The techniques can be further divided into univariable and multivariable methods, depending on the number of variables considered in the dataset of interest.</p>
<p>The simplest form of outlier detection is extreme value analysis of unidimensional data. In this case, the core principle of discovering outliers is to determine the statistical tails of the underlying distribution and assume that values that are either too large or too small are outliers. To apply this type of technique to a multidimensional dataset, the analysis is performed one dimension at a time. In a truly multivariable analysis, by contrast, outliers are samples whose combination of values is unusual relative to the other samples in the multidimensional space. It is possible to have outliers with reasonable marginal values (i.e. the value appears normal when confining oneself to one dimension), but due to linear or non-linear combinations of multiple attributes these observations reveal unusual patterns with regard to the rest of the population under study.</p>
</div>
</div>
<div class="vert vert-1" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@b2bc3e07fd4b4982baf8a434f85cd02b">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@b2bc3e07fd4b4982baf8a434f85cd02b" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<h3><em><strong>Fig. 2.06.1</strong></em> Univariable (boxplots) versus multivariable (scatter plot) outlier investigation</h3>
<p><img src="/assets/courseware/v1/260c2843ef441f4a2cd453d16ca7c501/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_003.jpg" alt="" width="874" height="879" /></p>
<p>To better understand this, Fig. 2.06.1 provides a graphical example of a scenario where outliers are only visible in a 2-dimensional space. An inspection of the boxplots reveals no outliers (no data point lies more than 1.5 IQR above the third quartile or below the first quartile, a widely used outlier identification rule), whereas a close observation of the natural clusters present in the data uncovers irregular patterns. Outliers can be identified by visual inspection, highlighting data points that lie relatively far from the inherent 2-D data groups.</p>
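<p>The idea that a point can look ordinary in every single dimension and still be unusual in combination can be sketched numerically. The example below is only an illustration under stated assumptions: the data are hypothetical (not the data behind the figure), and the squared Mahalanobis distance is used as a stand-in for visual inspection of the scatter plot; it is one of several possible multivariable scores. The function names are illustrative.</p>

```python
import numpy as np

def outside_inner_fences(values):
    """Boolean mask of values lying beyond the 1.5 IQR fences."""
    x = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)

def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each row from the sample mean."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(pts, rowvar=False))
    # Row-wise quadratic form: centered_i^T * cov_inv * centered_i
    return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)

# Hypothetical data: 21 points close to the line y = x, plus one point
# (1, -1) whose coordinates are individually unremarkable.
x = np.linspace(-2.0, 2.0, 21)
y = x + 0.1 * (-1.0) ** np.arange(21)
data = np.vstack([np.column_stack([x, y]), [1.0, -1.0]])

flag_x = outside_inner_fences(data[:, 0])   # univariable check on x
flag_y = outside_inner_fences(data[:, 1])   # univariable check on y
d2 = mahalanobis_sq(data)                   # multivariable score
most_unusual = int(np.argmax(d2))

print("univariable flags:", int(flag_x.sum()), int(flag_y.sum()))
print("most unusual row:", most_unusual, data[most_unusual])
```

<p>In this sketch neither univariable check flags anything, while the multivariable score singles out the appended point, which sits away from the diagonal trend followed by the rest of the data.</p>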
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@vertical+block@ce42173d1e4d4ff58262030455fb0df2" data-init="VerticalStudentView" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="vertical" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<h2 class="hd hd-2 unit-title">Statistical Methods</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@7d557256b6d447bc8f186975fdefdfca">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@7d557256b6d447bc8f186975fdefdfca" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<p>In the field of statistics, the data is assumed to follow a distribution model (e.g., normal distribution) and an instance is considered an outlier if it deviates significantly from the model. Common statistical methods for outlier detection are listed below:</p>
<p><strong>Tukey’s Method</strong> - This method is based on the interquartile range (IQR), the distance between the first quartile (Q1) and the third quartile (Q3). Inner “fences” are located at a distance of 1.5 IQR below Q1 and above Q3, and outer fences at a distance of 3 IQR below Q1 and above Q3. A value between the inner and outer fences is a possible outlier, whereas a value falling outside the outer fences is a probable outlier.</p>
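<p>The fences described above can be sketched in a few lines. This is a minimal illustration, assuming NumPy's default (linear interpolation) quartile definition; other quartile conventions give slightly different fences. The function name and the sample values are hypothetical.</p>

```python
import numpy as np

def tukey_labels(values):
    """Label each value 'ok', 'possible' or 'probable' using Tukey's fences."""
    x = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr

    labels = np.full(x.shape, "ok", dtype=object)
    # Beyond the inner fences: at least a possible outlier.
    labels[(x < inner_low) | (x > inner_high)] = "possible"
    # Beyond the outer fences: upgraded to a probable outlier.
    labels[(x < outer_low) | (x > outer_high)] = "probable"
    return labels

# Hypothetical measurements with two suspicious values at the end.
sample = [10, 11, 12, 12, 13, 13, 14, 15, 20, 40]
labels = tukey_labels(sample)
for value, label in zip(sample, labels):
    print(value, label)
```

<p>With these values, Q1 = 12 and Q3 = 14.75, so the inner upper fence is 18.875 and the outer upper fence is 23: the value 20 is a possible outlier and 40 is a probable outlier.</p>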
<p><strong>Z-Score</strong> - The z-score test computes the number of standard deviations by which a data point deviates from the mean. It is defined as:<br /><strong><img src="/assets/courseware/v1/606a5a86f9e042149d082e4a26044bde/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_057.png" alt="Z-Score is defined as:" width="80" height="42" /></strong></p>
<p>where <i>x̄</i> and <i>s</i> denote the sample mean and standard deviation, respectively. In cases where the mean and standard deviation of the distribution can be accurately estimated (or are available from domain knowledge), a good “rule of thumb” is to consider values with |<i>z</i><sub><i>i</i></sub>| ≥ 3 as outliers. Of note, this method is of limited value for small datasets, since the maximum z-score is at most<strong> <img src="/assets/courseware/v1/b02afa7ee4fcc8d4eee9a0cb6e1ab0b2/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_058.png" alt="" width="74" height="19" /></strong>.</p>
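<p>The small-sample limitation can be checked numerically. The sketch below assumes that the bound shown in the image above is the standard result (n − 1)/√n for z-scores computed with the sample standard deviation (n − 1 in the denominator); the data are hypothetical.</p>

```python
import numpy as np

def z_scores(values):
    """Z-scores based on the sample mean and sample standard deviation."""
    x = np.asarray(values, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)

# Hypothetical small sample with one value that is obviously extreme.
small = [1.0, 1.0, 1.0, 1.0, 1000.0]
z = z_scores(small)

n = len(small)
bound = (n - 1) / np.sqrt(n)   # assumed upper bound on |z| for n points

print("largest |z|:", np.abs(z).max())
print("bound (n - 1) / sqrt(n):", bound)
print("flagged by |z| >= 3:", int((np.abs(z) >= 3).sum()))
```

<p>Even though 1000 is clearly discordant, its z-score only reaches the bound of about 1.79, so the |z| ≥ 3 rule flags nothing. Under the same assumed bound, a sample needs at least 11 observations before any z-score can reach 3.</p>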
<p><strong>Modified Z-Score</strong> - The estimators used in the z-score, the sample mean and sample standard deviation, can be affected by the extreme values present in the data. To avoid this problem, the modified z-score uses the median and the median absolute deviation (MAD) instead of the mean and standard deviation of the sample:</p>
<p><img src="/assets/courseware/v1/3eae8ca1d5fdf3df6b2892a749f7fadd/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_059.png" alt="" width="922" height="157" /></p>
<p><strong>Interquartile Range with Log-Normal Distribution</strong> - If a variable follows a log-normal distribution, then the logarithms of the observations follow a normal distribution. A reasonable approach is therefore to apply the natural logarithm (ln) to the original data and then apply the tests intended for normally distributed data to the transformed values. We refer to this method as the log-IQ.</p>
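<p>Both of the preceding methods can be sketched briefly. The code below is an illustration under assumptions: for the modified z-score it uses the commonly cited scaling constant 0.6745 and cutoff 3.5, which may differ from what is shown in the image above; for the log-IQ it applies the 1.5 IQR fences to ln-transformed data, which requires strictly positive values. Function names and data are hypothetical.</p>

```python
import numpy as np

def modified_z_scores(values):
    """Modified z-scores based on the median and the MAD.

    The constant 0.6745 is the commonly used scaling factor (assumption).
    """
    x = np.asarray(values, dtype=float)
    median = np.median(x)
    mad = np.median(np.abs(x - median))
    return 0.6745 * (x - median) / mad

def log_iq_flags(values):
    """Flag values beyond the 1.5 IQR fences on the ln scale (log-IQ)."""
    logged = np.log(np.asarray(values, dtype=float))  # needs values > 0
    q1, q3 = np.percentile(logged, [25, 75])
    iqr = q3 - q1
    return (logged < q1 - 1.5 * iqr) | (logged > q3 + 1.5 * iqr)

# Modified z-score on a hypothetical sample; 3.5 is an assumed cutoff.
sample = [10, 11, 12, 12, 13, 13, 14, 15, 20, 40]
mz = modified_z_scores(sample)
mz_flags = np.abs(mz) > 3.5

# Log-IQ on a hypothetical right-skewed sample.
skewed = [1, 2, 2, 3, 3, 4, 5, 8, 20, 400]
log_flags = log_iq_flags(skewed)

print("modified z-score outliers:", [v for v, f in zip(sample, mz_flags) if f])
print("log-IQ outliers:", [v for v, f in zip(skewed, log_flags) if f])
```

<p>In the skewed sample, the value 20 would fall outside the 1.5 IQR fence on the original scale, but it is no longer flagged after the log transformation; only 400 remains an outlier.</p>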
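<p><strong>Ordinary and Studentized Residuals</strong> - In a regression model, ordinary residuals are the differences between the observed and the predicted values, and data points with large residuals may represent outliers. Because the magnitude of ordinary residuals depends on the units of measurement, it is difficult to define a general threshold for them. Studentized residuals eliminate the units of measurement by dividing the residuals by an estimate of their standard deviation. One limitation of this approach is that it assumes the regression model is correctly specified.</p>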
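<p>As a minimal sketch of this idea, the code below fits a simple linear regression with NumPy and computes internally studentized residuals, i.e. each residual divided by its estimated standard deviation s·√(1 − h), where h is the leverage of the observation. The choice of the internal variant, the function name and the data are assumptions made for illustration.</p>

```python
import numpy as np

def studentized_residuals(x, y):
    """Internally studentized residuals of a simple linear regression."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones_like(x), x])   # intercept + slope
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta                    # ordinary residuals

    # Leverage values are the diagonal of the hat matrix.
    hat = design @ np.linalg.inv(design.T @ design) @ design.T
    leverage = np.diag(hat)

    n, p = design.shape
    s2 = residuals @ residuals / (n - p)             # residual variance
    return residuals / np.sqrt(s2 * (1.0 - leverage))

# Hypothetical data: y = 2x + 1 with small perturbations and one
# observation (index 5) shifted upwards by 10 units.
x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0 + 0.1 * (-1.0) ** np.arange(10)
y[5] += 10.0

r = studentized_residuals(x, y)
suspect = int(np.argmax(np.abs(r)))
print("largest |studentized residual| at index", suspect)
```

<p>Because the result is unit-free, the same threshold (for example, an absolute value above 2 or 3) can be applied regardless of the scale of the outcome variable.</p>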
<p><strong>Cook’s Distance</strong><!-- [if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="false"
DefSemiHidden="false" DefQFormat="false" DefPriority="99"
LatentStyleCount="376">
<w:LsdException Locked="false" Priority="72" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" QFormat="true"
Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" QFormat="true"
Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" QFormat="true"
Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" QFormat="true"
Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" SemiHidden="true"
UnhideWhenUsed="true" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" SemiHidden="true"
UnhideWhenUsed="true" QFormat="true" Name="TOC Heading"/>
<w:LsdException Locked="false" Priority="41" Name="Plain Table 1"/>
<w:LsdException Locked="false" Priority="42" Name="Plain Table 2"/>
<w:LsdException Locked="false" Priority="43" Name="Plain Table 3"/>
<w:LsdException Locked="false" Priority="44" Name="Plain Table 4"/>
<w:LsdException Locked="false" Priority="45" Name="Plain Table 5"/>
<w:LsdException Locked="false" Priority="40" Name="Grid Table Light"/>
<w:LsdException Locked="false" Priority="46" Name="Grid Table 1 Light"/>
<w:LsdException Locked="false" Priority="47" Name="Grid Table 2"/>
<w:LsdException Locked="false" Priority="48" Name="Grid Table 3"/>
<w:LsdException Locked="false" Priority="49" Name="Grid Table 4"/>
<w:LsdException Locked="false" Priority="50" Name="Grid Table 5 Dark"/>
<w:LsdException Locked="false" Priority="51" Name="Grid Table 6 Colorful"/>
<w:LsdException Locked="false" Priority="52" Name="Grid Table 7 Colorful"/>
<w:LsdException Locked="false" Priority="46"
Name="Grid Table 1 Light Accent 1"/>
<w:LsdException Locked="false" Priority="47" Name="Grid Table 2 Accent 1"/>
<w:LsdException Locked="false" Priority="48" Name="Grid Table 3 Accent 1"/>
<w:LsdException Locked="false" Priority="49" Name="Grid Table 4 Accent 1"/>
<w:LsdException Locked="false" Priority="50" Name="Grid Table 5 Dark Accent 1"/>
<w:LsdException Locked="false" Priority="51"
Name="Grid Table 6 Colorful Accent 1"/>
<w:LsdException Locked="false" Priority="52"
Name="Grid Table 7 Colorful Accent 1"/>
<w:LsdException Locked="false" Priority="46"
Name="Grid Table 1 Light Accent 2"/>
<w:LsdException Locked="false" Priority="47" Name="Grid Table 2 Accent 2"/>
<w:LsdException Locked="false" Priority="48" Name="Grid Table 3 Accent 2"/>
<w:LsdException Locked="false" Priority="49" Name="Grid Table 4 Accent 2"/>
<w:LsdException Locked="false" Priority="50" Name="Grid Table 5 Dark Accent 2"/>
<w:LsdException Locked="false" Priority="51"
Name="Grid Table 6 Colorful Accent 2"/>
<w:LsdException Locked="false" Priority="52"
Name="Grid Table 7 Colorful Accent 2"/>
<w:LsdException Locked="false" Priority="46"
Name="Grid Table 1 Light Accent 3"/>
<w:LsdException Locked="false" Priority="47" Name="Grid Table 2 Accent 3"/>
<w:LsdException Locked="false" Priority="48" Name="Grid Table 3 Accent 3"/>
<w:LsdException Locked="false" Priority="49" Name="Grid Table 4 Accent 3"/>
<w:LsdException Locked="false" Priority="50" Name="Grid Table 5 Dark Accent 3"/>
<w:LsdException Locked="false" Priority="51"
Name="Grid Table 6 Colorful Accent 3"/>
<w:LsdException Locked="false" Priority="52"
Name="Grid Table 7 Colorful Accent 3"/>
<w:LsdException Locked="false" Priority="46"
Name="Grid Table 1 Light Accent 4"/>
<w:LsdException Locked="false" Priority="47" Name="Grid Table 2 Accent 4"/>
<w:LsdException Locked="false" Priority="48" Name="Grid Table 3 Accent 4"/>
<w:LsdException Locked="false" Priority="49" Name="Grid Table 4 Accent 4"/>
<w:LsdException Locked="false" Priority="50" Name="Grid Table 5 Dark Accent 4"/>
<w:LsdException Locked="false" Priority="51"
Name="Grid Table 6 Colorful Accent 4"/>
<w:LsdException Locked="false" Priority="52"
Name="Grid Table 7 Colorful Accent 4"/>
<w:LsdException Locked="false" Priority="46"
Name="Grid Table 1 Light Accent 5"/>
<w:LsdException Locked="false" Priority="47" Name="Grid Table 2 Accent 5"/>
<w:LsdException Locked="false" Priority="48" Name="Grid Table 3 Accent 5"/>
<w:LsdException Locked="false" Priority="49" Name="Grid Table 4 Accent 5"/>
<w:LsdException Locked="false" Priority="50" Name="Grid Table 5 Dark Accent 5"/>
<w:LsdException Locked="false" Priority="51"
Name="Grid Table 6 Colorful Accent 5"/>
<w:LsdException Locked="false" Priority="52"
Name="Grid Table 7 Colorful Accent 5"/>
<w:LsdException Locked="false" Priority="46"
Name="Grid Table 1 Light Accent 6"/>
<w:LsdException Locked="false" Priority="47" Name="Grid Table 2 Accent 6"/>
<w:LsdException Locked="false" Priority="48" Name="Grid Table 3 Accent 6"/>
<w:LsdException Locked="false" Priority="49" Name="Grid Table 4 Accent 6"/>
<w:LsdException Locked="false" Priority="50" Name="Grid Table 5 Dark Accent 6"/>
<w:LsdException Locked="false" Priority="51"
Name="Grid Table 6 Colorful Accent 6"/>
<w:LsdException Locked="false" Priority="52"
Name="Grid Table 7 Colorful Accent 6"/>
<w:LsdException Locked="false" Priority="46" Name="List Table 1 Light"/>
<w:LsdException Locked="false" Priority="47" Name="List Table 2"/>
<w:LsdException Locked="false" Priority="48" Name="List Table 3"/>
<w:LsdException Locked="false" Priority="49" Name="List Table 4"/>
<w:LsdException Locked="false" Priority="50" Name="List Table 5 Dark"/>
<w:LsdException Locked="false" Priority="51" Name="List Table 6 Colorful"/>
<w:LsdException Locked="false" Priority="52" Name="List Table 7 Colorful"/>
<w:LsdException Locked="false" Priority="46"
Name="List Table 1 Light Accent 1"/>
<w:LsdException Locked="false" Priority="47" Name="List Table 2 Accent 1"/>
<w:LsdException Locked="false" Priority="48" Name="List Table 3 Accent 1"/>
<w:LsdException Locked="false" Priority="49" Name="List Table 4 Accent 1"/>
<w:LsdException Locked="false" Priority="50" Name="List Table 5 Dark Accent 1"/>
<w:LsdException Locked="false" Priority="51"
Name="List Table 6 Colorful Accent 1"/>
<w:LsdException Locked="false" Priority="52"
Name="List Table 7 Colorful Accent 1"/>
<w:LsdException Locked="false" Priority="46"
Name="List Table 1 Light Accent 2"/>
<w:LsdException Locked="false" Priority="47" Name="List Table 2 Accent 2"/>
<w:LsdException Locked="false" Priority="48" Name="List Table 3 Accent 2"/>
<w:LsdException Locked="false" Priority="49" Name="List Table 4 Accent 2"/>
<w:LsdException Locked="false" Priority="50" Name="List Table 5 Dark Accent 2"/>
<w:LsdException Locked="false" Priority="51"
Name="List Table 6 Colorful Accent 2"/>
<w:LsdException Locked="false" Priority="52"
Name="List Table 7 Colorful Accent 2"/>
<w:LsdException Locked="false" Priority="46"
Name="List Table 1 Light Accent 3"/>
<w:LsdException Locked="false" Priority="47" Name="List Table 2 Accent 3"/>
<w:LsdException Locked="false" Priority="48" Name="List Table 3 Accent 3"/>
<w:LsdException Locked="false" Priority="49" Name="List Table 4 Accent 3"/>
<w:LsdException Locked="false" Priority="50" Name="List Table 5 Dark Accent 3"/>
<w:LsdException Locked="false" Priority="51"
Name="List Table 6 Colorful Accent 3"/>
<w:LsdException Locked="false" Priority="52"
Name="List Table 7 Colorful Accent 3"/>
<w:LsdException Locked="false" Priority="46"
Name="List Table 1 Light Accent 4"/>
<w:LsdException Locked="false" Priority="47" Name="List Table 2 Accent 4"/>
<w:LsdException Locked="false" Priority="48" Name="List Table 3 Accent 4"/>
<w:LsdException Locked="false" Priority="49" Name="List Table 4 Accent 4"/>
<w:LsdException Locked="false" Priority="50" Name="List Table 5 Dark Accent 4"/>
<w:LsdException Locked="false" Priority="51"
Name="List Table 6 Colorful Accent 4"/>
<w:LsdException Locked="false" Priority="52"
Name="List Table 7 Colorful Accent 4"/>
<w:LsdException Locked="false" Priority="46"
Name="List Table 1 Light Accent 5"/>
<w:LsdException Locked="false" Priority="47" Name="List Table 2 Accent 5"/>
<w:LsdException Locked="false" Priority="48" Name="List Table 3 Accent 5"/>
<w:LsdException Locked="false" Priority="49" Name="List Table 4 Accent 5"/>
<w:LsdException Locked="false" Priority="50" Name="List Table 5 Dark Accent 5"/>
<w:LsdException Locked="false" Priority="51"
Name="List Table 6 Colorful Accent 5"/>
<w:LsdException Locked="false" Priority="52"
Name="List Table 7 Colorful Accent 5"/>
<w:LsdException Locked="false" Priority="46"
Name="List Table 1 Light Accent 6"/>
<w:LsdException Locked="false" Priority="47" Name="List Table 2 Accent 6"/>
<w:LsdException Locked="false" Priority="48" Name="List Table 3 Accent 6"/>
<w:LsdException Locked="false" Priority="49" Name="List Table 4 Accent 6"/>
<w:LsdException Locked="false" Priority="50" Name="List Table 5 Dark Accent 6"/>
<w:LsdException Locked="false" Priority="51"
Name="List Table 6 Colorful Accent 6"/>
<w:LsdException Locked="false" Priority="52"
Name="List Table 7 Colorful Accent 6"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Mention"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Smart Hyperlink"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Hashtag"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Unresolved Mention"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Smart Link"/>
</w:LatentStyles>
</xml><![endif]--><!-- [if gte mso 10]>
<![endif]--> - In a linear regression model, Cook’s distance is used to estimate the influence of a data point on the regression. The principle of Cook’s distance is to measure the effect of deleting a given observation. Data points with a large distance may represent outliers. For the i<sup>th</sup> point in the sample, Cook’s distance is defined as:</p>
<p><img src="/assets/courseware/v1/44e4a1a89c5d55c7a141fdbe8784b4e1/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_060.png" alt="" width="154" height="62" /></p>
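<p>The deletion principle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a simple one-predictor regression, made-up data, and the common form of Cook's distance in which the squared change in all fitted values after deleting observation i is divided by p times the mean squared error (p being the number of fitted parameters). The function names <code>fit_line</code> and <code>cooks_distances</code> are hypothetical.</p>

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cooks_distances(xs, ys):
    """Cook's distance of every observation, computed by explicit deletion."""
    n, p = len(xs), 2  # p = number of fitted parameters (intercept and slope)
    a, b = fit_line(xs, ys)
    fitted = [a + b * x for x in xs]
    mse = sum((y - f) ** 2 for y, f in zip(ys, fitted)) / (n - p)
    distances = []
    for i in range(n):
        # Refit the model without observation i
        a_i, b_i = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        # Squared change in the fitted values of all n points
        shift = sum((f - (a_i + b_i * x)) ** 2 for f, x in zip(fitted, xs))
        distances.append(shift / (p * mse))
    return distances


# Made-up data: the last point departs from the trend of the others
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 12.0]
d = cooks_distances(xs, ys)
most_influential = max(range(len(d)), key=d.__getitem__)
print(most_influential, [round(v, 3) for v in d])
```

<p>In this toy example the last observation combines a large residual with high leverage, so it receives the largest distance. Which cutoff counts as "large" is a separate choice that this sketch does not make.</p>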
<p><strong>Mahalanobis Distance</strong> - This test is based on Wilks' method, which was designed to detect a single outlier in a normal multivariate sample. It approximates the distribution of the maximum squared Mahalanobis distance (MD) with an F-distribution formulation, which is often more appropriate than a χ<sup>2</sup> distribution. For a p-dimensional multivariate sample x<sub>i</sub> (i = 1,…,n), the Mahalanobis distance of the i<sup>th</sup> case is defined as:</p>
<p><img src="/assets/courseware/v1/f71dbef4e1116e007878ff78f7276320/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_062.png" alt="" width="207" height="41" /></p>
<p>where t is the estimated multivariate location, which is usually the arithmetic mean, and C is the estimated covariance matrix, usually the sample covariance matrix.</p>
<p>Multivariate outliers can be simply defined as observations having a large squared Mahalanobis distance. In this work, the squared Mahalanobis distance is compared with quantiles of the F-distribution with p and p − 1 degrees of freedom. Critical values are calculated using Bonferroni bounds.</p>
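<p>As a rough illustration of the definition above, the following Python sketch computes squared Mahalanobis distances for a two-dimensional sample, taking t as the arithmetic mean and C as the sample covariance matrix. The data are made up, the function name <code>squared_mahalanobis_2d</code> is hypothetical, and the comparison with F-distribution quantiles and Bonferroni bounds is deliberately left out.</p>

```python
def squared_mahalanobis_2d(points):
    """Squared Mahalanobis distance of each 2-D point to the sample mean."""
    n = len(points)
    tx = sum(x for x, _ in points) / n  # location estimate t (arithmetic mean)
    ty = sum(y for _, y in points) / n
    # Entries of the sample covariance matrix C (denominator n - 1)
    sxx = sum((x - tx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - ty) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - tx) * (y - ty) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - tx, y - ty
        # (x - t)^T C^{-1} (x - t), with the 2x2 inverse written out explicitly
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


# Made-up sample: the last point breaks the correlation pattern of the others
points = [(1.0, 1.1), (2.0, 2.0), (3.0, 2.9), (4.0, 4.2), (5.0, 5.0), (3.0, 6.0)]
md2 = squared_mahalanobis_2d(points)
most_distant = max(range(len(md2)), key=md2.__getitem__)
print(most_distant, [round(v, 3) for v in md2])
```

<p>Note that the last point is unremarkable in each coordinate taken separately; it stands out only because the distance accounts for the correlation between the two variables.</p>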
</div>
</div>
<div class="vert vert-1" data-id="block-v1:MITx+HST.953x+3T2020+type@video+block@b8edcd3bf803435e9494cfd7649c109e">
<div class="xblock xblock-public_view xblock-public_view-video xmodule_display xmodule_VideoBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@video+block@b8edcd3bf803435e9494cfd7649c109e" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="video" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "Video"}
</script>
<h3 class="hd hd-2">Statistical Methods</h3>
<div
id="video_b8edcd3bf803435e9494cfd7649c109e"
class="video closed"
data-metadata='{"saveStateUrl": "/courses/course-v1:MITx+HST.953x+3T2020/xblock/block-v1:MITx+HST.953x+3T2020+type@video+block@b8edcd3bf803435e9494cfd7649c109e/handler/xmodule_handler/save_user_state", "lmsRootURL": "https://openlearninglibrary.mit.edu", "publishCompletionUrl": "/courses/course-v1:MITx+HST.953x+3T2020/xblock/block-v1:MITx+HST.953x+3T2020+type@video+block@b8edcd3bf803435e9494cfd7649c109e/handler/publish_completion", "streams": "1.00:dNm9pBKxEgM", "duration": 0.0, "recordedYoutubeIsAvailable": true, "transcriptAvailableTranslationsUrl": "/courses/course-v1:MITx+HST.953x+3T2020/xblock/block-v1:MITx+HST.953x+3T2020+type@video+block@b8edcd3bf803435e9494cfd7649c109e/handler/transcript/available_translations", "captionDataDir": null, "ytApiUrl": "https://www.youtube.com/iframe_api", "speed": null, "end": 0.0, "completionPercentage": 0.95, "autoAdvance": false, "transcriptLanguage": "en", "prioritizeHls": false, "autohideHtml5": false, "ytTestTimeout": 1500, "transcriptLanguages": {"en": "English"}, "savedVideoPosition": 0.0, "sources": [], "completionEnabled": false, "saveStateEnabled": false, "generalSpeed": 1.0, "autoplay": false, "poster": null, "showCaptions": "true", "transcriptTranslationUrl": "/courses/course-v1:MITx+HST.953x+3T2020/xblock/block-v1:MITx+HST.953x+3T2020+type@video+block@b8edcd3bf803435e9494cfd7649c109e/handler/transcript/translation/__lang__", "ytMetadataEndpoint": "", "start": 0.0}'
data-bumper-metadata='null'
data-autoadvance-enabled="False"
data-poster='null'
tabindex="-1"
>
<div class="focus_grabber first"></div>
<div class="tc-wrapper">
<div class="video-wrapper">
<span tabindex="0" class="spinner" aria-hidden="false" aria-label="Loading video player"></span>
<span tabindex="-1" class="btn-play fa fa-youtube-play fa-2x is-hidden" aria-hidden="true" aria-label="Play video"></span>
<div class="video-player-pre"></div>
<div class="video-player">
<div id="b8edcd3bf803435e9494cfd7649c109e"></div>
<h4 class="hd hd-4 video-error is-hidden">No playable video sources found.</h4>
<h4 class="hd hd-4 video-hls-error is-hidden">
Your browser does not support this video format. Try using a different browser.
</h4>
</div>
<div class="video-player-post"></div>
<div class="closed-captions"></div>
<div class="video-controls is-hidden">
<div>
<div class="vcr"><div class="vidtime">0:00 / 0:00</div></div>
<div class="secondary-controls"></div>
</div>
</div>
</div>
</div>
<div class="focus_grabber last"></div>
<h3 class="hd hd-4 downloads-heading sr" id="video-download-transcripts_b8edcd3bf803435e9494cfd7649c109e">Downloads and transcripts</h3>
<div class="wrapper-downloads" role="region" aria-labelledby="video-download-transcripts_b8edcd3bf803435e9494cfd7649c109e">
<div class="wrapper-download-transcripts">
<h4 class="hd hd-5">Transcripts</h4>
<ul class="list-download-transcripts">
<li class="transcript-option">
<a class="btn btn-link" href="/courses/course-v1:MITx+HST.953x+3T2020/xblock/block-v1:MITx+HST.953x+3T2020+type@video+block@b8edcd3bf803435e9494cfd7649c109e/handler/transcript/download" data-value="srt">Download SubRip (.srt) file</a>
</li>
<li class="transcript-option">
<a class="btn btn-link" href="/courses/course-v1:MITx+HST.953x+3T2020/xblock/block-v1:MITx+HST.953x+3T2020+type@video+block@b8edcd3bf803435e9494cfd7649c109e/handler/transcript/download" data-value="txt">Download Text (.txt) file</a>
</li>
</ul>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@vertical+block@6f7c451ce2cd4b63ac84132e0cc8fa6f" data-init="VerticalStudentView" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="vertical" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<h2 class="hd hd-2 unit-title">Proximity-based Models</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@8a3c24000d324478bf046ed1ae81d74c">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@8a3c24000d324478bf046ed1ae81d74c" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<p>Proximity-based techniques are simple to implement and, unlike statistical models, make no prior assumptions about the distribution of the data. They are suitable for both supervised and unsupervised multivariate outlier detection.</p>
<p>Clustering is a type of proximity-based technique that starts by partitioning an N-dimensional dataset into c subgroups of samples (clusters) based on their similarity. Then, some measure of the fit of the data points to the different clusters is used to determine whether the data points are outliers. One challenge associated with this type of technique is that it assumes specific cluster shapes, depending on the distance function used within the clustering algorithm. For example, in a 3-dimensional space, the Euclidean distance would consider spheres as equidistant, whereas the Mahalanobis distance would consider ellipsoids as equidistant (where the length of the ellipsoids along one axis is proportional to the variance of the data in that direction).</p>
<p>Common proximity-based methods for outlier detection include:</p>
<ul>
<li><strong>k-Means</strong> - The k-means algorithm is widely used in data mining due to its simplicity and scalability. The difficulty associated with this algorithm is the need to determine k, the number of clusters, in advance. The algorithm minimizes the within-cluster sum of squares, that is, the sum of squared distances between each point in a cluster and the cluster centroid. In k-means, the center of a group is the mean of the measurements in the group.</li>
<li><strong>k-Medoids</strong> - In contrast to the k-means algorithm, in k-medoids the cluster centers are members of the group. Consequently, if there is a region of outliers outside the area with a higher density of points, the cluster center will not be pulled towards the outlier region, as it would be in k-means. Thus, k-medoids is more robust to outliers than k-means.</li>
</ul>
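<p>The difference between the two center definitions can be seen with a single made-up one-dimensional cluster. The sketch below is not a clustering algorithm; it only compares the centroid (mean) with the medoid (the member with the smallest total distance to all other members). The function names are hypothetical.</p>

```python
def centroid(values):
    """k-means style center: the mean of the cluster members."""
    return sum(values) / len(values)


def medoid(values):
    """k-medoids style center: the member minimizing total distance to the others."""
    return min(values, key=lambda m: sum(abs(v - m) for v in values))


# Made-up cluster with one extreme value
cluster = [1, 2, 3, 4, 100]
print(centroid(cluster))  # 22.0, pulled towards the extreme value
print(medoid(cluster))    # 3, stays inside the dense region
```

<p>The centroid lands far from every actual member of the cluster, whereas the medoid remains one of the typical points, which is the robustness property described in the list above.</p>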
</div>
</div>
<div class="vert vert-1" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@c8c942976e7846b5bd6ae69fc61ab4d4">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@c8c942976e7846b5bd6ae69fc61ab4d4" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<h3>Criteria for Outlier Detection</h3>
<p>After determining the position of the cluster center with either k-means or k-medoids, the criteria to classify an item as an outlier must be specified, and different options exist:</p>
<p><strong>Criterion 1</strong>: The first criterion proposed to detect outliers is based on the Euclidean distance to the cluster centers C, such that points farther from their own cluster center than the minimum inter-cluster distance are considered outliers.</p>
<p><img src="/assets/courseware/v1/334ec13ed4f040cf74f581a01b82f56d/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_063.png" alt="" width="341" height="34" /></p>
<p>where d(x, C<sub>k</sub>) is the Euclidean distance between point x and the center C<sub>k</sub>, <img src="/assets/courseware/v1/3e2db85b6341d8b5f944458a13708d8f/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_064.png" alt="" width="65" height="23" /> is the distance between the centers C<sub>k</sub> and C<sub>j</sub>, and w ∈ {0.5, 0.7, 1, 1.2, 1.5, ...} is a weighting parameter that determines how aggressively the method removes outliers.</p>
<p><strong>Figure 2.06.2</strong> provides a graphical example of the effect of varying w on the boundaries used for outlier detection. Small values of w remove outliers aggressively; as w increases, the boundaries widen and fewer points are identified as outliers.<br /><img src="/assets/courseware/v1/3955177f0c5fa6eb32536c107c3604cc/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_065.png" alt="" style="display: block; margin-left: auto; margin-right: auto;" width="545" height="478" /></p>
<p><strong>Fig. 2.06.2</strong> Effect of different weights w in the detection of cluster-based outliers, using criterion 1</p>
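<p>A minimal sketch of criterion 1 is given below. It assumes that the rule takes the form "flag x when d(x, C<sub>k</sub>) exceeds w times the minimum distance from C<sub>k</sub> to any other center", which matches the description above but is an interpretation rather than a transcription of the formula. The centers, points, cluster labels and the function name <code>criterion1_outliers</code> are made up for illustration, and <code>math.dist</code> requires Python 3.8 or later.</p>

```python
import math


def criterion1_outliers(points, labels, centers, w):
    """Indices of points farther from their own center than w times the
    minimum distance between that center and the other centers."""
    flagged = []
    for idx, (x, k) in enumerate(zip(points, labels)):
        min_gap = min(
            math.dist(centers[k], c) for j, c in enumerate(centers) if j != k
        )
        if math.dist(x, centers[k]) > w * min_gap:
            flagged.append(idx)
    return flagged


# Made-up cluster centers and points; labels give each point's cluster index
centers = [(0.0, 0.0), (10.0, 0.0)]
points = [(1.0, 0.0), (0.0, 4.0), (9.0, 1.0), (10.0, -8.0)]
labels = [0, 0, 1, 1]

for w in (0.3, 0.5, 1.0):
    print(w, criterion1_outliers(points, labels, centers, w))
```

<p>With these values, the smallest weight flags two points, the intermediate weight flags one, and w = 1 flags none, mirroring the behavior described for Figure 2.06.2.</p>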
<p><strong>Criterion 2</strong>: In this criterion, we calculate the distance of each data point to its centroid (in the case of k-means) or medoid (in the case of k-medoids). If the ratio between the distance of the nearest point to the cluster center and the distance of a given point is smaller than a certain threshold, then that point is considered an outlier.</p>
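<p>One possible reading of criterion 2, for a single cluster, is sketched below. The handling of points that coincide with the center (as the medoid itself does in k-medoids) is an assumption made here to avoid a zero ratio: such points are ignored when finding the nearest point and are never flagged. The threshold value, the data and the function name <code>criterion2_outliers</code> are hypothetical.</p>

```python
import math


def criterion2_outliers(points, center, threshold):
    """Indices of points whose ratio (nearest distance / own distance) to the
    cluster center falls below the threshold."""
    dists = [math.dist(p, center) for p in points]
    # Assumption: points lying exactly on the center are excluded
    positive = [d for d in dists if d > 0]
    if not positive:
        return []
    nearest = min(positive)
    return [i for i, d in enumerate(dists) if d > 0 and nearest / d < threshold]


# Made-up cluster: three points near the center and one far away
center = (0.0, 0.0)
points = [(1.0, 0.0), (0.0, 2.0), (-1.5, 0.0), (0.0, -20.0)]
flagged = criterion2_outliers(points, center, threshold=0.1)
print(flagged)
```

<p>Because the rule is relative to the nearest point, it adapts to the scale of each cluster, at the cost of being sensitive to a single point that happens to lie very close to the center.</p>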
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@vertical+block@a4891883425f4bf0bd27cfa819b4c4cb" data-init="VerticalStudentView" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="vertical" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<h2 class="hd hd-2 unit-title">Supervised Outlier Detection &amp; Outlier Analysis Using Expert Knowledge</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@129fc385af6b4bd8a8e38b75df7061fc">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@129fc385af6b4bd8a8e38b75df7061fc" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<h3>Supervised Outlier Detection</h3>
<p>In many scenarios, previous knowledge about outliers may be available and can be used to label the data accordingly and to identify outliers of interest. The methods relying on previous examples of data outliers are referred to as supervised outlier detection methods and involve training classification models which can later be used to identify outliers in the data.</p>
</div>
</div>
<div class="vert vert-1" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@01f8d11eaa2e41c29a7592371e81bb9c">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@01f8d11eaa2e41c29a7592371e81bb9c" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<h3>Outlier Analysis Using Expert Knowledge</h3>
<p>In univariate analyses, expert knowledge can be used to define thresholds of values that are normal, critical (life-threatening) or impossible because they fall outside permissible ranges or have no physical meaning.</p>
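<p>Such threshold rules translate directly into code. The sketch below classifies a single value against three nested ranges. All numeric limits are arbitrary placeholders chosen for illustration, not clinical reference values, and the "abnormal" label for values that are neither normal nor critical, like the function name <code>classify_value</code>, is an assumption of this example.</p>

```python
def classify_value(value, normal, critical, permissible):
    """Classify a measurement using expert-defined (low, high) ranges.

    permissible: outside this range the value has no physical meaning
    critical:    outside this range (but permissible) it is life-threatening
    normal:      inside this range the value is unremarkable
    """
    if not (permissible[0] <= value <= permissible[1]):
        return "impossible"
    if not (critical[0] <= value <= critical[1]):
        return "critical"
    if normal[0] <= value <= normal[1]:
        return "normal"
    return "abnormal"


# Placeholder ranges for a hypothetical variable (not clinical reference values)
ranges = {"normal": (10, 20), "critical": (5, 40), "permissible": (0, 100)}

for v in (15, 30, 50, -1):
    print(v, classify_value(v, **ranges))
```

<p>Values labeled "impossible" are candidates for removal or correction, while "critical" values may be genuine and clinically informative, so the two categories usually call for different handling.</p>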
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@vertical+block@444e4d57bb0c4b5699cc1405903bd303" data-init="VerticalStudentView" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="vertical" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<h2 class="hd hd-2 unit-title">Case Study: Identification of Outliers in the Indwelling Arterial Catheter (IAC) Study &amp; Expert Knowledge Analysis</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@a0eea928d2614185bda73128220cd1b4">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@a0eea928d2614185bda73128220cd1b4" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<p>In this section, various methods will be applied to identify outliers in two “real world” clinical datasets from a study that investigated the effect of inserting an indwelling arterial catheter (IAC) in patients with respiratory failure. One dataset contains patients who received an IAC (IAC group) and the other contains patients who did not (non-IAC group). The code used to generate the analyses and the figures is available in the GitHub repository for this book.</p>
</div>
</div>
<div class="vert vert-1" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@1043d72bdab943e891a4940308cce3f2">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@1043d72bdab943e891a4940308cce3f2" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<h3>Expert Knowledge Analysis</h3>
<p>Table 2.06.1 provides maximum and minimum values for defining normal, critical and permissible ranges in some of the variables analyzed in the study, as well as maximum and minimum values present in the dataset.<br /><strong>Table 2.06.1</strong> Normal, critical and impossible ranges for the selected variables, and maximum and minimum values present in the datasets</p>
<p style="text-align: center;"><img src="/assets/courseware/v1/53555d06e88d4ec222f1835dd5bc1336/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_066.png" alt="Table 2.06.1 Normal, critical and impossible ranges for the selected variables, and maximum and minimum values present in the datasets" width="574" height="362" /></p>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@vertical+block@447e6edb46944afeb5999f1166884763" data-init="VerticalStudentView" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="vertical" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<h2 class="hd hd-2 unit-title">Univariate Analysis (1)</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@5764b2619dfd4fbf80082c4957b5ba22">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@5764b2619dfd4fbf80082c4957b5ba22" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<p>In this section, univariate outliers are identified for each variable within pre-defined classes (survivors and non-survivors), using the statistical methods described above.</p>
<h4>Table 2.06.2 Number and percentage of outliers identified by each method</h4>
<p style="text-align: center;"><img src="/assets/courseware/v1/c6bb2a4ce47d7848b12c2ed3835306bf/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_067.png" alt="Table 2.06.2 Number and percentage of outliers identified by each method" width="761" height="730" /></p>
<p><em>Table 2.06.2</em> summarizes the number and percentage of outliers identified by each method in the Indwelling Arterial Catheter (IAC) and non-IAC groups. Overall, Tukey’s and log-IQ are the most conservative methods, i.e., they identify the smallest number of points as outliers, whereas IQ identifies more outliers than any other method. With a few exceptions, the modified z-score identifies more outliers than the z-score.</p>
<p>“Total patients” represents the number of patients identified when considering all variables together. The results in bold highlight the variable with the most outliers in each method, and also the method that removes the most patients in total, in each class. Class 0 represents survivors and class 1 represents non-survivors.</p>
<p>A preliminary investigation of the results showed that values falling within the reference normal ranges (see <em>Table 2.06.1</em>) are never identified as outliers, whatever the method. Critical values, on the other hand, are often identified as outliers. Two further remarks apply in general: (1) more outliers are identified in the variable BUN than in any other, and (2) the ratio of the number of outliers to the total number of patients is smaller in the class 1 cohorts (non-survivors). As expected, for variables whose distribution is closer to lognormal than to normal, such as potassium, BUN and PCO2, the IQ method applied to the logarithmic transformation of the data (log-IQ method) identifies fewer outliers than the IQ method applied to the untransformed data. Consider, for instance, the variable BUN, which approximately follows a lognormal distribution (see Fig. 2.06.3).</p>
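<p>The effect of the logarithmic transformation can be illustrated with a minimal sketch on synthetic, right-skewed data. It assumes that the IQ rule flags values lying more than 1.5 interquartile ranges beyond the quartiles, a common convention that is not restated in this section. The function name <code>iq_outliers</code>, the multiplier <code>k</code> and the simulated values are illustrative and are not taken from the IAC datasets.</p>

```python
import math
import random
import statistics

def iq_outliers(values, k=1.5):
    """Flag values lying more than k interquartile ranges beyond the quartiles.

    k=1.5 is an assumed, conventional multiplier, not a value from the study.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v > high or low > v for v in values]

# Synthetic, approximately lognormal "lab values" (illustrative only).
rng = random.Random(0)
raw = [rng.lognormvariate(3.0, 0.6) for _ in range(2000)]

n_iq = sum(iq_outliers(raw))                             # IQ rule on the raw values
n_log_iq = sum(iq_outliers([math.log(v) for v in raw]))  # log-IQ rule

print(f"IQ: {n_iq} flagged, log-IQ: {n_log_iq} flagged")
```

<p>On skewed data of this kind, the raw-scale rule flags a sizeable part of the long upper tail, whereas the log-scale rule treats most of that tail as ordinary variation.</p>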
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@vertical+block@3bb4ef0d26f74b878730e73546e0b0fa" data-init="VerticalStudentView" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="vertical" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<h2 class="hd hd-2 unit-title">Univariate Analysis (2)</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@ee9aa3920f06441a90061f25779c6815">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@ee9aa3920f06441a90061f25779c6815" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<h3>Outliers identified by statistical analysis</h3>
<p style="text-align: center;"><img src="/assets/courseware/v1/fa64285e88b2b8cd0811dab32bd9de9c/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_069.png" alt="Fig. 2.06.3 Outliers identified by statistical analysis for the variable BUN, in the IAC cohort. Class 0: survivors; Class 1: non survivors" width="583" height="516" /></p>
<h4><strong>Fig. 2.06.3</strong> Outliers identified by statistical analysis for the variable BUN, in the IAC cohort. Class 0: survivors; Class 1: non-survivors</h4>
<p>On the other hand, when the values approximately follow a normal distribution, as in the case of chloride (see Fig. 2.06.4), the IQ method identifies fewer outliers than log-IQ. Of note, the range of values considered outliers differs between classes, i.e., what is considered an outlier in class 0 is not necessarily an outlier in class 1. One example is the treatment of values smaller than 90 mmol/L by the modified z-score.</p>
<p style="text-align: center;"><img src="/assets/courseware/v1/15a95d9c61a741ea83271604921dd345/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_070.png" alt="Fig. 2.06.4 Outliers identified by statistical analysis for the variable chloride, in the IAC cohort. Class 0: survivors; Class 1: non survivors" width="583" height="520" /></p>
<h4><strong>Fig. 2.06.4 </strong>Outliers identified by statistical analysis for the variable chloride, in the IAC cohort. Class 0: survivors; Class 1: non-survivors</h4>
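<p>The class dependence of the flagged range can be reproduced with a small sketch that computes the z-score and the modified z-score separately within each class. The cut-offs used here (3 for the z-score, 3.5 for the modified z-score, with the constant 0.6745) are commonly used conventions assumed for illustration, and the chloride values are hypothetical rather than study data.</p>

```python
import statistics

def z_scores(values):
    """Classical z-score: distance from the mean in standard deviations."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def modified_z_scores(values):
    """Modified z-score built on the median and the median absolute deviation."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return [0.6745 * (v - med) / mad for v in values]

# Hypothetical chloride values (mmol/L) for two classes; not study data.
chloride = {
    0: [100, 101, 102, 103, 104, 105, 106, 99, 98, 88],
    1: [92, 96, 100, 104, 108, 112, 116, 120, 124, 88],
}

results = {}
for label, values in chloride.items():
    z_flags = [abs(z) > 3.0 for z in z_scores(values)]           # assumed cut-off
    m_flags = [abs(m) > 3.5 for m in modified_z_scores(values)]  # assumed cut-off
    flagged = [v for v, f in zip(values, m_flags) if f]
    results[label] = (sum(z_flags), flagged)
    print(f"class {label}: z-score flags {sum(z_flags)}, modified z-score flags {flagged}")
```

<p>In this toy example the value 88 mmol/L is flagged by the modified z-score in class 0, where the values are tightly grouped, but not in class 1, where the spread is wider. The z-score flags nothing in either class.</p>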
<p>Since this is a univariate analysis, the investigation of extreme values using expert knowledge is of interest. For chloride, normal values are in the range of 95–105 mmol/L, whereas values &lt;70 or &gt;120 mmol/L are considered critical, and concentrations above 160 mmol/L are physiologically impossible. Figure 2.06.4 confirms that normal values are always kept, whatever the method. Importantly, some critical values are not identified by the z-score and modified z-score methods (especially in class 1). Thus, it seems that the methods flag as outliers values that should not be eliminated, since they likely represent actual measurements in extremely sick patients.</p>
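<p>The expert ranges quoted above for chloride can be written down as a simple rule. The sketch below is one possible encoding: the function name and the label “abnormal” for values that are neither normal nor critical are assumptions, and no lower “impossible” bound is applied because the text does not give one.</p>

```python
def classify_chloride(value):
    """Label a chloride value (mmol/L) using the ranges quoted in the text."""
    if value > 160:
        return "impossible"
    if value > 120 or 70 > value:
        return "critical"
    if 105 >= value >= 95:
        return "normal"
    return "abnormal"  # outside the normal range but not critical (label assumed)

for v in [60, 88, 100, 118, 130, 170]:
    print(v, classify_chloride(v))
```

<p>A rule of this kind separates values that can safely be corrected or removed (impossible) from values that are extreme but clinically plausible (critical), which a purely statistical method cannot do.</p>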
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@vertical+block@029c2166bf3a419497ffd8d48f4e0f30" data-init="VerticalStudentView" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="vertical" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<h2 class="hd hd-2 unit-title">Multivariable Analysis</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@c930df4f8ca44862a397529ade5238b9">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@c930df4f8ca44862a397529ade5238b9" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<p>Using model-based approaches, an unusual combination of values for a number of variables can be identified. In this analysis, we are concerned with multivariable outliers for the complete set of variables in the datasets, including those that are binary. In order to investigate multivariable outliers in IAC and non-IAC patients, the Mahalanobis distance and cluster-based approaches are tested within pre-defined classes. Table 2.06.3 shows the average results in terms of the number of clusters c determined by the silhouette index, and the percentage of patients identified as outliers. In order to account for variability, the tests were performed 100 times. The data were normalized for testing the cluster-based approaches only.</p>
<p><strong>Table 2.06.3</strong> Multivariable outliers identified by k-means, k-medoids, and Mahalanobis distance</p>
<p style="text-align: center;"><img src="/assets/courseware/v1/14900b512bee6bda1dc15d114fa18cc2/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_072.png" alt="Table 2.06.3 Multivariable outliers identified by k-means, k-medoids and Mahalanobis distance" width="478" height="645" /></p>
<p style="text-align: left;">Results are presented as mean ± standard deviation</p>
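<p style="text-align: left;">To make the idea of an “unusual combination of values” concrete, the following sketch computes the Mahalanobis distance for two correlated variables, with the 2 × 2 covariance inverse written out by hand. The data are hypothetical, the function name <code>mahalanobis_2d</code> is illustrative, and the restriction to two variables is a simplification made for brevity; it is not the implementation used in the study.</p>

```python
import math
import statistics

def mahalanobis_2d(points):
    """Mahalanobis distance of each (x, y) point from the sample mean."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    n = len(points)
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] * inverse(covariance) * [dx dy]^T, expanded for 2x2.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(d2))
    return distances

# Nine points close to the line y = x, plus one point (3, 7) that breaks the
# pattern although each of its coordinates lies inside the observed range.
points = [(1, 0.8), (2, 2.2), (3, 2.8), (4, 4.2), (5, 4.8),
          (6, 6.2), (7, 6.8), (8, 8.2), (9, 8.8), (3, 7.0)]

distances = mahalanobis_2d(points)
for p, d in zip(points, distances):
    print(p, round(d, 2))
```

<p style="text-align: left;">The last point receives by far the largest distance, even though neither of its coordinates would stand out in a univariate analysis.</p>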
<p style="text-align: left;">Considering the scenario where two clusters are created for the complete IAC dataset separated by classes, we investigate outliers by looking at multivariable observations around cluster centers. Figure 2.06.5 shows an example of the outliers detected using k-means and k-medoids with criterion 1 and weight equal to 1.5. For illustrative purposes, we present only the graphical results of patients that died in the IAC group (class 1). The x-axis represents each of the selected features (see Table 2.06.1) and the y-axis represents the corresponding values normalized between 0 and 1. K-medoids does not identify any outlier, whereas k-means identifies 1 outlier in the first cluster and 2 outliers in the second cluster. This difference can be attributed to the fact that the intercluster distance is smaller in k-medoids than in k-means.</p>
<p style="text-align: center;"><img src="/assets/courseware/v1/d9ce0919153428274bd41882f883f1f1/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_073.png" alt="Fig. 2.06.5 Outliers identified by clustering based approaches for patients that died after IAC. Criterion 1, based on interclusters distance, with c = 2 and w = 1.5 was used. K-medoids does not identify outliers, whereas k-means identifies 1 outlier in cluster 1 and 2 outliers in cluster 2" width="560" height="418" /></p>
<p style="text-align: left;"><strong>Fig. 2.06.5</strong> Outliers identified by clustering-based approaches for patients that died after IAC. Criterion 1, based on intercluster distance, with c = 2 and w = 1.5, was used. K-medoids does not identify outliers, whereas k-means identifies 1 outlier in cluster 1 and 2 outliers in cluster 2</p>
<p style="text-align: left;">The detection of outliers seems to be more influenced by binary features than by continuous features: red lines are, with some exceptions, fairly close to black lines for the continuous variables (1 to 2 and 15 to 25) and distant in the binary variables. A possible explanation is that clustering was essentially designed for multivariable continuous data; binary variables produce a maximum separation, since only two values exist, 0 and 1, with nothing between them.</p>
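<p style="text-align: left;">The “maximum separation” produced by binary variables can be seen in a short numerical sketch. After min-max normalization, two patients who differ in a binary variable are a full unit apart on that axis, whereas neighbouring values of a continuous variable differ by only a small fraction. The two features and their values below are hypothetical.</p>

```python
def min_max(column):
    """Rescale a list of numbers to the interval [0, 1]."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

# Hypothetical patients: one continuous and one binary feature (not study data).
age = [45, 50, 55, 60, 65, 70, 75, 80]
flag = [0, 0, 0, 1, 0, 0, 0, 0]

age_scaled = min_max(age)
flag_scaled = min_max(flag)

# Squared contribution of each feature to the Euclidean distance between
# patient 3 (the only one with flag = 1) and patient 4 (a neighbour in age).
age_term = (age_scaled[3] - age_scaled[4]) ** 2
flag_term = (flag_scaled[3] - flag_scaled[4]) ** 2
print(f"age contributes {age_term:.3f}, binary flag contributes {flag_term:.3f}")
```

<p style="text-align: left;">Here the binary feature contributes 1.000 to the squared distance and the continuous feature about 0.020, so the distance to a cluster center is driven almost entirely by the binary variable.</p>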
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@vertical+block@f878f7a904b340e2bd57743ed829c072" data-init="VerticalStudentView" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="vertical" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<h2 class="hd hd-2 unit-title">Classification of Mortality in IAC and Non-IAC Patients</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@3e4a6ff36d9645538491b8fd1a7629ff">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@3e4a6ff36d9645538491b8fd1a7629ff" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<p>Logistic regression models were created to assess how removing outliers with each of the methods affects the classification of mortality in IAC and non-IAC patients, following the same rationale as in Chap. 13 - Missing Data. A 10-fold cross-validation approach was used to assess the validity and robustness of the models. In each round, every outlier identification method was applied separately to each class of the training set, and the results were averaged over the rounds. Before cross-validation, the values were normalized between 0 and 1 using the min-max procedure. For the log-IQ method, the data were log-transformed before normalization, except for variables containing zero values (binary variables in Table 2.06.1, SOFA, and creatinine). We also investigate the scenario where only the 10% worst examples detected by each statistical method within each class are removed, and the case where no outliers are removed (all data is used).</p>
<p>In the clustering-based approaches, the number of clusters c was chosen between 2 and 10 using the silhouette index method. We also show the case where c is fixed at 2. The weight w of the clustering-based approaches was adjusted according to the particularities of each method. Since a cluster center in k-medoids is a data point belonging to the dataset, the distance to its nearest neighbor is smaller than in the case of k-means, especially because many binary variables are considered. For this reason, we chose higher values of w for k-means criterion 2.</p>
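<p>The evaluation loop described above can be sketched as follows. This is an outline under stated assumptions rather than the study's code: the dataset is synthetic with a single feature, <code>far_from_mean</code> is a placeholder standing in for any of the outlier rules, and the model-fitting step is left as a comment. The point it illustrates is that outliers are removed per class from the training folds only, while the held-out fold is left untouched.</p>

```python
import random
import statistics

def min_max(values):
    """Rescale a list of numbers to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def far_from_mean(values, cutoff=2.0):
    """Placeholder rule: flag values more than `cutoff` SDs from the mean."""
    mean, sd = statistics.fmean(values), statistics.stdev(values)
    return [abs(v - mean) > cutoff * sd for v in values]

def remove_outliers_per_class(x, y, train_idx, is_outlier):
    """Apply an outlier rule separately to each class of the training set."""
    kept = []
    for label in sorted(set(y[i] for i in train_idx)):
        members = [i for i in train_idx if y[i] == label]
        flags = is_outlier([x[i] for i in members])
        kept.extend(i for i, f in zip(members, flags) if not f)
    return kept

# Synthetic single-feature dataset with a binary outcome (illustrative only).
rng = random.Random(1)
y = [0] * 60 + [1] * 40
x = min_max([rng.gauss(10 + 3 * label, 2) for label in y])  # normalize before CV

folds = k_fold_indices(len(x), k=10)
for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(x)) if i not in held_out]
    kept = remove_outliers_per_class(x, y, train_idx, far_from_mean)
    # A classifier would be fitted on `kept` and evaluated on the untouched
    # `test_idx`; the per-fold metrics would then be averaged over the rounds.
    print(f"train {len(train_idx)}, kept {len(kept)}, test {len(test_idx)}")
```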
<p>The performance of the models is evaluated in terms of area under the receiver operating characteristic curve (AUC), accuracy (ACC, correct classification rate), sensitivity (true positive classification rate), and specificity (true negative classification rate). The test proposed by DeLong, DeLong and Clarke-Pearson can then be used to assess whether the AUCs of two models differ significantly.</p>
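<p>As a reminder of how these four measures are obtained, the sketch below computes them from a handful of hypothetical labels and predicted probabilities. The AUC is computed as the proportion of positive-negative pairs in which the positive case receives the higher score (ties count one half). The 0.5 decision threshold and all values are assumptions for illustration, and the DeLong test itself is not implemented here.</p>

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity from binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }

def auc(y_true, scores):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes (1 = non-survivor) and predicted probabilities.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.3, 0.6, 0.4, 0.35, 0.7, 0.8, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
print(metrics, "AUC =", auc(y_true, scores))
```

<p>For these values, accuracy, sensitivity and specificity are all 0.75 and the AUC is 0.875. Unlike the other three measures, the AUC does not depend on the chosen threshold.</p>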
<p>The performance results for the IAC group are shown in Table 2.06.4, and the percentage of patients removed using each method in Table 2.06.5. For conciseness, the results for the non-IAC group are not shown. The best performance for IAC is AUC = 0.83 and ACC = 0.78 (highlighted in bold). The maximum sensitivity is 87% and the maximum specificity is 79%; however, these two do not occur simultaneously. Overall, the best AUC is obtained when all the data is used and when only a few outliers are removed. The worst performances are obtained using the z-score without trimming the results, and using k-means and k-medoids with c = 2, criterion 1, and weight 1.2. As for non-IAC, the best performance corresponds to AUC = 0.88, ACC = 0.84, sensitivity = 0.85 and specificity = 0.85. Again, the best performance is achieved when all the data is used and in the cases where fewer outliers are removed. The worst performance by far is obtained when all outliers identified by the z-score are removed. As with IAC, for k-means and k-medoids criterion 1, increasing values of weight provide better results.</p>
<p><em><strong>Table 2.06.4</strong> </em>IAC logistic regression results using 10-fold cross-validation, after removal of outliers and using the original dataset</p>
<p style="text-align: center;"><img src="/assets/courseware/v1/5303793f0d0933bd33c552eeb24d3393/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_075.png" alt="" width="450" height="259" /><br /><img src="/assets/courseware/v1/b08b52d1b6c91489b6ca7a903699eb87/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_076.png" alt="" width="513" height="471" /></p>
<p style="text-align: left;">Results are presented as mean ± standard deviation</p>
<p style="text-align: left;"><strong>Table 2.06.5</strong> Percentage of IAC patients removed by each method in the training set, during cross-validation</p>
<p style="text-align: center;"><img src="/assets/courseware/v1/209a61fe431acc776f1bd82aeaecb7ee/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_078.png" alt="" width="339" height="257" /></p>
<p style="text-align: center;"><img src="/assets/courseware/v1/350339259ce2e2e64dcaeaae75bf4893/asset-v1:MITx+HST.953x+3T2020+type@asset+block/Selection_080.png" alt="" width="438" height="453" /></p>
<p style="text-align: left;">Results are presented as mean ± standard deviation</p>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@vertical+block@f06d943fe68b43feb4c7feee13120427" data-init="VerticalStudentView" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="vertical" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<h2 class="hd hd-2 unit-title">Conclusions & Key Takeaways</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@c8aac1365a794d86987a4356fd3dcdf5">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@c8aac1365a794d86987a4356fd3dcdf5" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<p>The univariate outlier analysis provided in the case study showed that a large number of outliers were identified for each variable within the predefined classes, meaning that the removal of all the identified outliers would cause a large portion of the data to be excluded. For this reason, ranking the univariate outliers according to score values and discarding only those with the highest scores provided better classification results.</p>
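<p>A minimal sketch of this ranking-and-trimming step is given below. It assumes that “the highest scores” means the largest absolute scores among the points already flagged, and that a fixed fraction of those flagged points (10% here) is discarded. The function name <code>trim_worst</code>, the scores and the cut-off are hypothetical.</p>

```python
import math

def trim_worst(values, scores, flags, fraction=0.10):
    """Keep everything except the worst-scoring fraction of the flagged points."""
    flagged = [i for i, f in enumerate(flags) if f]
    n_drop = math.ceil(fraction * len(flagged))
    worst = sorted(flagged, key=lambda i: abs(scores[i]), reverse=True)[:n_drop]
    drop = set(worst)
    return [v for i, v in enumerate(values) if i not in drop]

# Hypothetical measurements with arbitrary outlier scores (not study data).
values = [10, 11, 12, 13, 14, 15, 30, 35, 40, 90]
scores = [(v - 14.5) / 2 for v in values]
flags = [abs(s) > 3.5 for s in scores]   # four points are flagged

kept = trim_worst(values, scores, flags)
print("flagged:", sum(flags), "kept:", kept)
```

<p>Four points are flagged, but only the single most extreme one (90) is discarded, so the remaining extreme values stay available to the classifier.</p>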
<p>Overall, none of the outlier removal techniques was able to improve the performance of a classification model. Because the dataset had already been cleaned, it did not contain impossible values; these results suggest that the remaining extreme values are probably due to biological variation rather than experimental mistakes. Hence, the “outliers” in this study appear to contain useful information in their extreme values, and automatically excluding them resulted in a loss of this information.</p>
<p>Some modeling methods already accommodate outliers so that they have minimal impact on the model, and can be tuned to be more or less sensitive to them. Thus, rather than excluding outliers from the dataset before the modeling step, an alternative strategy would be to use models that are robust to outliers, such as robust regression.</p>
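<p>As an illustration of what “robust to outliers” means in practice, the sketch below compares an ordinary least-squares slope with the Theil-Sen slope (the median of all pairwise slopes) on a toy dataset containing one extreme observation. Theil-Sen is chosen here only because it is short to write down; the text does not name a specific robust regression method, and the data are hypothetical.</p>

```python
import statistics
from itertools import combinations

def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den

def theil_sen_slope(x, y):
    """Median of the slopes between all pairs of points with distinct x."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(len(x)), 2) if x[i] != x[j]]
    return statistics.median(slopes)

# Points on the line y = 2x, except for one extreme observation.
x = list(range(10))
y = [2.0 * v for v in x]
y[9] = 60.0   # the line would give 18

print("least squares:", round(ols_slope(x, y), 2))
print("Theil-Sen:    ", theil_sen_slope(x, y))
```

<p>The single extreme point pulls the least-squares slope above 4, while the Theil-Sen estimate remains at 2, without any observation having been removed.</p>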
</div>
</div>
<div class="vert vert-1" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@97caceb36d9e4d4aa48ea5c89f13d65b">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@97caceb36d9e4d4aa48ea5c89f13d65b" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<h3>Key Takeaways</h3>
<ul>
<li>Distinguishing outliers as useful or uninformative is not clear cut.</li>
<li>In certain contexts, outliers may represent extremely valuable information that must not be discarded.</li>
<li>Various methods exist and will identify possible or likely outliers, but the expert eye must prevail before deleting or correcting outliers.</li>
</ul>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@vertical+block@f2f67a798f854a2cbefbcc55da626504" data-init="VerticalStudentView" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="vertical" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<h2 class="hd hd-2 unit-title">Code Appendix & References</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@fb34cfde774c42e9bdd3a28369fe35c5">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@fb34cfde774c42e9bdd3a28369fe35c5" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<h3>Code Appendix</h3>
<p>The code used in this chapter is available in <a href="https://github.com/MIT-LCP/critical-data-book" target="_blank">this GitHub repository</a>. Further information on the code is available on this website.</p>
</div>
</div>
<div class="vert vert-1" data-id="block-v1:MITx+HST.953x+3T2020+type@html+block@66e3be50d31f4e0594d6771fe8eb34c5">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@html+block@66e3be50d31f4e0594d6771fe8eb34c5" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="html" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<p><span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">References</span></p>
<p><span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">1. Barnett V, Lewis T (1994) Outliers in statistical data, 3rd edn. Wiley, Chichester</span><br />
<span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">2. Aggarwal CC (2013) Outlier analysis. Springer, New York</span><br />
<span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">3. Osborne JW, Overbay A (2004) The power of outliers (and why researchers should always check for them). Pract Assess Res Eval 9(6):1–12</span><br />
<span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">4. Hodge VJ, Austin J (2004) A survey of outlier detection methodologies. Artif Intell Rev 22(2):85–126</span><br />
<span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">5. Tukey J (1977) Exploratory data analysis. Pearson</span><br />
<span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">6. Shiffler RE (1988) Maximum Z scores and outliers. Am Stat 42(1):79–80</span><br />
<span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">7. Iglewicz B, Hoaglin DC (1993) How to detect and handle outliers. ASQC Quality Press</span><br />
<span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">8. Seo S (2006) A review and comparison of methods for detecting outliers in univariate data sets. 09 Aug 2006 [Online]. Available: <a href="http://d-scholarship.pitt.edu/7948/" target="_blank">http://d-scholarship.pitt.edu/7948/</a>. Accessed 07-Feb-2016</span><br />
<span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">9. Cook RD, Weisberg S (1982) Residuals and influence in regression. Chapman and Hall, New York</span><br />
<span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">10. Penny KI (1996) Appropriate critical values when testing for a single multivariate outlier by using the Mahalanobis distance. Appl Stat 45(1):73–81</span><br />
<span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">11. MacQueen J (1967) Some methods for classification and analysis of multivariate observations. Presented at the proceedings of 5th Berkeley symposium on mathematical statistics and probability, pp 281–297</span><br />
<span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">12. Hu X, Xu L (2003) A comparative study of several cluster number selection criteria. In: Liu J, Cheung Y, Yin H (eds) Intelligent data engineering and automated learning. Springer, Berlin, pp 195–202</span><br />
<span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">13. Jones RH (2011) Bayesian information criterion for longitudinal and clustered data. Stat Med 30(25):3050–3056</span><br />
<span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">14. Cherednichenko S (2005) Outlier detection in clustering</span><br />
<span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">15. Provan D (2010) Oxford handbook of clinical and laboratory investigation. OUP Oxford</span><br />
<span style="font-family: 'Open Sans', Verdana, Arial, Helvetica, sans-serif;">16. DeLong ER, DeLong DM, Clarke-Pearson DL (1988) Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44(3):837–845</span></p>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@vertical+block@c96fb7c4c2eb47b4acf0a6564c1a8318" data-init="VerticalStudentView" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="vertical" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<h2 class="hd hd-2 unit-title">Workshop (Optional)</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+HST.953x+3T2020+type@video+block@db12c86caee341acacc6464c83151cd0">
<div class="xblock xblock-public_view xblock-public_view-video xmodule_display xmodule_VideoBlock" data-usage-id="block-v1:MITx+HST.953x+3T2020+type@video+block@db12c86caee341acacc6464c83151cd0" data-init="XBlockToXModuleShim" data-graded="False" data-request-token="a35c6d9643bd11ef8f110e08775edbcd" data-block-type="video" data-runtime-version="1" data-course-id="course-v1:MITx+HST.953x+3T2020" data-has-score="False" data-runtime-class="LmsRuntime">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "Video"}
</script>
<h3 class="hd hd-2">Workshop (Optional)</h3>
<div
id="video_db12c86caee341acacc6464c83151cd0"
class="video closed"
data-metadata='{"saveStateUrl": "/courses/course-v1:MITx+HST.953x+3T2020/xblock/block-v1:MITx+HST.953x+3T2020+type@video+block@db12c86caee341acacc6464c83151cd0/handler/xmodule_handler/save_user_state", "lmsRootURL": "https://openlearninglibrary.mit.edu", "publishCompletionUrl": "/courses/course-v1:MITx+HST.953x+3T2020/xblock/block-v1:MITx+HST.953x+3T2020+type@video+block@db12c86caee341acacc6464c83151cd0/handler/publish_completion", "streams": "1.00:4Y2D_40dgv4", "duration": 0.0, "recordedYoutubeIsAvailable": true, "transcriptAvailableTranslationsUrl": "/courses/course-v1:MITx+HST.953x+3T2020/xblock/block-v1:MITx+HST.953x+3T2020+type@video+block@db12c86caee341acacc6464c83151cd0/handler/transcript/available_translations", "captionDataDir": null, "ytApiUrl": "https://www.youtube.com/iframe_api", "speed": null, "end": 552.0, "completionPercentage": 0.95, "autoAdvance": false, "transcriptLanguage": "en", "prioritizeHls": false, "autohideHtml5": false, "ytTestTimeout": 1500, "transcriptLanguages": {"en": "English"}, "savedVideoPosition": 0.0, "sources": [], "completionEnabled": false, "saveStateEnabled": false, "generalSpeed": 1.0, "autoplay": false, "poster": null, "showCaptions": "true", "transcriptTranslationUrl": "/courses/course-v1:MITx+HST.953x+3T2020/xblock/block-v1:MITx+HST.953x+3T2020+type@video+block@db12c86caee341acacc6464c83151cd0/handler/transcript/translation/__lang__", "ytMetadataEndpoint": "", "start": 0.0}'
data-bumper-metadata='null'
data-autoadvance-enabled="False"
data-poster='null'
tabindex="-1"
>
<div class="focus_grabber first"></div>
<div class="tc-wrapper">
<div class="video-wrapper">
<span tabindex="0" class="spinner" aria-hidden="false" aria-label="Loading video player"></span>
<span tabindex="-1" class="btn-play fa fa-youtube-play fa-2x is-hidden" aria-hidden="true" aria-label="Play video"></span>
<div class="video-player-pre"></div>
<div class="video-player">
<div id="db12c86caee341acacc6464c83151cd0"></div>
</div>
<div class="video-player-post"></div>
<div class="closed-captions"></div>
</div>
</div>
<div class="focus_grabber last"></div>
<h3 class="hd hd-4 downloads-heading sr" id="video-download-transcripts_db12c86caee341acacc6464c83151cd0">Downloads and transcripts</h3>
<div class="wrapper-downloads" role="region" aria-labelledby="video-download-transcripts_db12c86caee341acacc6464c83151cd0">
<div class="wrapper-download-transcripts">
<h4 class="hd hd-5">Transcripts</h4>
<ul class="list-download-transcripts">
<li class="transcript-option">
<a class="btn btn-link" href="/courses/course-v1:MITx+HST.953x+3T2020/xblock/block-v1:MITx+HST.953x+3T2020+type@video+block@db12c86caee341acacc6464c83151cd0/handler/transcript/download" data-value="srt">Download SubRip (.srt) file</a>
</li>
<li class="transcript-option">
<a class="btn btn-link" href="/courses/course-v1:MITx+HST.953x+3T2020/xblock/block-v1:MITx+HST.953x+3T2020+type@video+block@db12c86caee341acacc6464c83151cd0/handler/transcript/download" data-value="txt">Download Text (.txt) file</a>
</li>
</ul>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
© All Rights Reserved