<div class="xblock xblock-public_view xblock-public_view-vertical" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@vertical+block@perceptron_notes" data-init="VerticalStudentView" data-block-type="vertical" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<h2 class="hd hd-2 unit-title">Notes – Chapter 3: Perceptron</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+6.036+1T2019+type@html+block@perceptron_notes_top">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@html+block@perceptron_notes_top" data-init="XBlockToXModuleShim" data-block-type="html" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<p>
You can sequence through the Perceptron lecture video and note segments (go to Next page). </p><p>
You can also (or alternatively) download the <a href="/assets/courseware/v1/8f4f9aca5581dde50291b0d0e29d0148/asset-v1:MITx+6.036+1T2019+type@asset+block/notes_chapter_The_Perceptron.pdf" target="_blank">Chapter 3: Perceptron</a> notes as a PDF file. </p>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@vertical+block@MIT6036L01k_vert" data-init="VerticalStudentView" data-block-type="vertical" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<h2 class="hd hd-2 unit-title">Lecture: The perceptron algorithm</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01k">
<div class="xblock xblock-public_view xblock-public_view-video xmodule_display xmodule_VideoBlock" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01k" data-init="XBlockToXModuleShim" data-block-type="video" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "Video"}
</script>
<h3 class="hd hd-2">Lecture: The perceptron algorithm</h3>
<div
id="video_MIT6036L01k"
class="video closed"
data-metadata='{"autoAdvance": false, "prioritizeHls": false, "recordedYoutubeIsAvailable": true, "ytTestTimeout": 1500, "poster": null, "streams": "1.00:Rn_x12VHplM", "saveStateEnabled": false, "end": 0.0, "speed": null, "completionPercentage": 0.95, "start": 0.0, "publishCompletionUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01k/handler/publish_completion", "duration": 0.0, "autoplay": false, "savedVideoPosition": 0.0, "generalSpeed": 1.0, "autohideHtml5": false, "ytMetadataEndpoint": "", "transcriptTranslationUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01k/handler/transcript/translation/__lang__", "showCaptions": "true", "completionEnabled": false, "captionDataDir": null, "ytApiUrl": "https://www.youtube.com/iframe_api", "saveStateUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01k/handler/xmodule_handler/save_user_state", "transcriptAvailableTranslationsUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01k/handler/transcript/available_translations", "sources": [], "transcriptLanguages": {"en": "English"}, "transcriptLanguage": "en", "lmsRootURL": "https://openlearninglibrary.mit.edu"}'
data-bumper-metadata='null'
data-autoadvance-enabled="False"
data-poster='null'
tabindex="-1"
>
<div class="focus_grabber first"></div>
<div class="tc-wrapper">
<div class="video-wrapper">
<span tabindex="0" class="spinner" aria-hidden="false" aria-label="Loading video player"></span>
<span tabindex="-1" class="btn-play fa fa-youtube-play fa-2x is-hidden" aria-hidden="true" aria-label="Play video"></span>
<div class="video-player-pre"></div>
<div class="video-player">
<div id="MIT6036L01k"></div>
<h4 class="hd hd-4 video-error is-hidden">No playable video sources found.</h4>
<h4 class="hd hd-4 video-hls-error is-hidden">
Your browser does not support this video format. Try using a different browser.
</h4>
</div>
<div class="video-player-post"></div>
<div class="closed-captions"></div>
<div class="video-controls is-hidden">
<div>
<div class="vcr"><div class="vidtime">0:00 / 0:00</div></div>
<div class="secondary-controls"></div>
</div>
</div>
</div>
</div>
<div class="focus_grabber last"></div>
</div>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@vertical+block@perceptron_top_vert" data-init="VerticalStudentView" data-block-type="vertical" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<h2 class="hd hd-2 unit-title">An algorithm by any other name...</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+6.036+1T2019+type@html+block@perceptron_top">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@html+block@perceptron_top" data-init="XBlockToXModuleShim" data-block-type="html" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<p>
First of all, it has the <span options="" class="marginote"><span class="marginote_desc" style="display:none">Well, maybe “neocognitron,” also the name of a real ML algorithm, is cooler.</span><span>coolest algorithm name! </span></span> It is based on the 1943 neuron model of McCulloch and Pitts, together with Hebb's work. It was developed by Rosenblatt in 1962. At the time, it was not interpreted as attempting to optimize any particular criterion; it was presented directly as an algorithm. Since then, there has been a huge amount of study and analysis of its convergence properties and other aspects of its behavior. </p><p>
<br/></p><p>
<br/></p><p><a href="/assets/courseware/v1/8f4f9aca5581dde50291b0d0e29d0148/asset-v1:MITx+6.036+1T2019+type@asset+block/notes_chapter_The_Perceptron.pdf" target="_blank">Download this chapter as a PDF file</a></p><script src="/assets/courseware/v1/1ab2c06aefab58693cfc9c10394b7503/asset-v1:MITx+6.036+1T2019+type@asset+block/marginotes.js" type="text/javascript"/><span><br/><span style="color:gray;font-size:10pt"><center>This page was last updated on Saturday November 16, 2019; 07:30:57 PM (revision f808f068e)</center></span></span>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@vertical+block@MIT6036L01m_vert" data-init="VerticalStudentView" data-block-type="vertical" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<h2 class="hd hd-2 unit-title">Lecture: The perceptron algorithm in action - an example</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01m">
<div class="xblock xblock-public_view xblock-public_view-video xmodule_display xmodule_VideoBlock" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01m" data-init="XBlockToXModuleShim" data-block-type="video" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "Video"}
</script>
<h3 class="hd hd-2">Lecture: The perceptron algorithm in action - an example</h3>
<div
id="video_MIT6036L01m"
class="video closed"
data-metadata='{"autoAdvance": false, "prioritizeHls": false, "recordedYoutubeIsAvailable": true, "ytTestTimeout": 1500, "poster": null, "streams": "1.00:QLJa1g9n6Ms", "saveStateEnabled": false, "end": 0.0, "speed": null, "completionPercentage": 0.95, "start": 0.0, "publishCompletionUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01m/handler/publish_completion", "duration": 0.0, "autoplay": false, "savedVideoPosition": 0.0, "generalSpeed": 1.0, "autohideHtml5": false, "ytMetadataEndpoint": "", "transcriptTranslationUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01m/handler/transcript/translation/__lang__", "showCaptions": "true", "completionEnabled": false, "captionDataDir": null, "ytApiUrl": "https://www.youtube.com/iframe_api", "saveStateUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01m/handler/xmodule_handler/save_user_state", "transcriptAvailableTranslationsUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01m/handler/transcript/available_translations", "sources": [], "transcriptLanguages": {"en": "English"}, "transcriptLanguage": "en", "lmsRootURL": "https://openlearninglibrary.mit.edu"}'
data-bumper-metadata='null'
data-autoadvance-enabled="False"
data-poster='null'
tabindex="-1"
>
<div class="focus_grabber first"></div>
<div class="tc-wrapper">
<div class="video-wrapper">
<span tabindex="0" class="spinner" aria-hidden="false" aria-label="Loading video player"></span>
<span tabindex="-1" class="btn-play fa fa-youtube-play fa-2x is-hidden" aria-hidden="true" aria-label="Play video"></span>
<div class="video-player-pre"></div>
<div class="video-player">
<div id="MIT6036L01m"></div>
<h4 class="hd hd-4 video-error is-hidden">No playable video sources found.</h4>
<h4 class="hd hd-4 video-hls-error is-hidden">
Your browser does not support this video format. Try using a different browser.
</h4>
</div>
<div class="video-player-post"></div>
<div class="closed-captions"></div>
<div class="video-controls is-hidden">
<div>
<div class="vcr"><div class="vidtime">0:00 / 0:00</div></div>
<div class="secondary-controls"></div>
</div>
</div>
</div>
</div>
<div class="focus_grabber last"></div>
</div>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@vertical+block@perceptron_algorithm_vert" data-init="VerticalStudentView" data-block-type="vertical" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<h2 class="hd hd-2 unit-title">Algorithm</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+6.036+1T2019+type@html+block@perceptron_algorithm">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@html+block@perceptron_algorithm" data-init="XBlockToXModuleShim" data-block-type="html" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<p>
Recall that we have a training dataset [mathjaxinline]{\cal D}_ n[/mathjaxinline] with [mathjaxinline]x \in \mathbb {R}^ d[/mathjaxinline], and [mathjaxinline]y\in \{ -1, +1\}[/mathjaxinline]. The Perceptron algorithm trains a binary classifier [mathjaxinline]h(x; \theta , \theta _0)[/mathjaxinline], finding [mathjaxinline]\theta[/mathjaxinline] and [mathjaxinline]\theta _0[/mathjaxinline] over [mathjaxinline]\tau[/mathjaxinline] <span options="" class="marginote"><span class="marginote_desc" style="display:none">We use the Greek letter [mathjaxinline]\tau[/mathjaxinline] here instead of [mathjaxinline]T[/mathjaxinline] so we don't confuse it with transpose!</span><span>iterative steps:</span></span> </p><p><img src="/assets/courseware/v1/3c41ac8e1fe8267270a74edcce108989/asset-v1:MITx+6.036+1T2019+type@asset+block/images_perceptron_algorithm_codebox_1-crop.png" width="366"/></p><p><span options="" class="marginote"><span class="marginote_desc" style="display:none">Let's check dimensions. Remember that [mathjaxinline]\theta[/mathjaxinline] is [mathjaxinline]d \times 1[/mathjaxinline], [mathjaxinline]x^{(i)}[/mathjaxinline] is [mathjaxinline]d \times 1[/mathjaxinline], and [mathjaxinline]y^{(i)}[/mathjaxinline] is a scalar. Does everything match?</span><span>note</span></span></p><p>
Intuitively, on each step, if the current hypothesis [mathjaxinline]\theta , \theta _0[/mathjaxinline] classifies example [mathjaxinline]x^{(i)}[/mathjaxinline] correctly, then no change is made. If it classifies [mathjaxinline]x^{(i)}[/mathjaxinline] incorrectly, then it moves [mathjaxinline]\theta , \theta _0[/mathjaxinline] so that they are “closer” to classifying [mathjaxinline]x^{(i)}, y^{(i)}[/mathjaxinline] correctly. </p><p>
Note that if the algorithm ever goes through one iteration of the loop on line 4 without making an update, it will never make any further updates (verify that you believe this!) and so it should just terminate at that point. <br/> <br/><span style="color:#FF0000"><b class="bf">Study Question:</b></span> <span style="color:#0000FF"><em>What is true about [mathjaxinline]\mathcal{E}_ n[/mathjaxinline] if that happens?</em></span> <br/></p><p><div style="border-radius:10px;padding:5px;border-style:solid;background-color:rgba(0,255,0,0.03);" class="examplebox"><p><b class="bf">Example:</b> Let [mathjaxinline]h[/mathjaxinline] be the linear classifier defined by [mathjaxinline]\theta ^{(0)} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \theta _0^{(0)} = 1[/mathjaxinline]. The diagram below shows several points classified by [mathjaxinline]h[/mathjaxinline]. However, in this case, [mathjaxinline]h[/mathjaxinline] (represented by the bold line) misclassifies the point [mathjaxinline]x^{(1)} = \begin{bmatrix} 1 \\ 3 \end{bmatrix}[/mathjaxinline] which has label [mathjaxinline]y^{(1)} = 1[/mathjaxinline]. Indeed, </p><table id="a0000000002" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]y^{(1)}\left(\theta ^ T x^{(1)} + \theta _0\right) = \begin{bmatrix} 1 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \end{bmatrix} + 1 = -1 < 0[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table><p>
By running an iteration of the Perceptron algorithm, we update </p><table id="a0000000003" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]\theta ^{(1)} = \theta ^{(0)} + y^{(1)}x^{(1)} = \begin{bmatrix} 2 \\ 2 \end{bmatrix}[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table><table id="a0000000004" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]\theta _0^{(1)} = \theta _0^{(0)} + y^{(1)} = 2[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table><p>
The new classifier (represented by the dashed line) now correctly classifies that point, but now makes a mistake on the negatively labeled point. </p><center><p><img src="/assets/courseware/v1/52f91fb0da334c4b82ee4e43b85cc2b6/asset-v1:MITx+6.036+1T2019+type@asset+block/images_perceptron_algorithm_tikzpicture_1-crop.png" width="598"/></p></center></div></p><p>
A really important fact about the perceptron algorithm is that, if there is a linear classifier with zero training error, then this algorithm will (eventually) find it! We'll look at a proof of this in detail next. </p><p>
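The pseudocode in the box above can be sketched in NumPy. This is a minimal illustrative implementation, not the course's reference code; the function name `perceptron` and the convention of looping for at most `tau` full passes over the data are assumptions made here:

```python
import numpy as np

def perceptron(X, y, tau=100):
    """Train h(x; theta, theta_0) = sign(theta^T x + theta_0).

    X: n x d array of examples; y: length-n array of labels in {-1, +1}.
    Makes at most tau passes over the data; returns (theta, theta_0).
    """
    n, d = X.shape
    theta = np.zeros(d)
    theta_0 = 0.0
    for _ in range(tau):
        mistakes = 0
        for i in range(n):
            # A point on the wrong side (or exactly on the boundary)
            # triggers the update from the codebox.
            if y[i] * (theta @ X[i] + theta_0) <= 0:
                theta = theta + y[i] * X[i]
                theta_0 = theta_0 + y[i]
                mistakes += 1
        if mistakes == 0:
            # A full pass with no updates: no future pass can change
            # anything, so terminate.
            break
    return theta, theta_0
```

As a sanity check against the worked example above: a single update starting from theta = (1, -1), theta_0 = 1 on the misclassified point x = (1, 3) with y = 1 gives theta = (2, 2), theta_0 = 2.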
<br/></p><p>
<br/></p><p><a href="/assets/courseware/v1/8f4f9aca5581dde50291b0d0e29d0148/asset-v1:MITx+6.036+1T2019+type@asset+block/notes_chapter_The_Perceptron.pdf" target="_blank">Download this chapter as a PDF file</a></p><script src="/assets/courseware/v1/1ab2c06aefab58693cfc9c10394b7503/asset-v1:MITx+6.036+1T2019+type@asset+block/marginotes.js" type="text/javascript"/><span><br/><span style="color:gray;font-size:10pt"><center>This page was last updated on Friday May 24, 2019; 02:28:15 PM (revision 4f166135)</center></span></span>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@vertical+block@MIT6036L01n_vert" data-init="VerticalStudentView" data-block-type="vertical" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<h2 class="hd hd-2 unit-title">Lecture: Evaluating learning algorithms - validation</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01n">
<div class="xblock xblock-public_view xblock-public_view-video xmodule_display xmodule_VideoBlock" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01n" data-init="XBlockToXModuleShim" data-block-type="video" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "Video"}
</script>
<h3 class="hd hd-2">Lecture: Evaluating learning algorithms - validation</h3>
<div
id="video_MIT6036L01n"
class="video closed"
data-metadata='{"autoAdvance": false, "prioritizeHls": false, "recordedYoutubeIsAvailable": true, "ytTestTimeout": 1500, "poster": null, "streams": "1.00:X_nOMVxfK08", "saveStateEnabled": false, "end": 0.0, "speed": null, "completionPercentage": 0.95, "start": 0.0, "publishCompletionUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01n/handler/publish_completion", "duration": 0.0, "autoplay": false, "savedVideoPosition": 0.0, "generalSpeed": 1.0, "autohideHtml5": false, "ytMetadataEndpoint": "", "transcriptTranslationUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01n/handler/transcript/translation/__lang__", "showCaptions": "true", "completionEnabled": false, "captionDataDir": null, "ytApiUrl": "https://www.youtube.com/iframe_api", "saveStateUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01n/handler/xmodule_handler/save_user_state", "transcriptAvailableTranslationsUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L01n/handler/transcript/available_translations", "sources": [], "transcriptLanguages": {"en": "English"}, "transcriptLanguage": "en", "lmsRootURL": "https://openlearninglibrary.mit.edu"}'
data-bumper-metadata='null'
data-autoadvance-enabled="False"
data-poster='null'
tabindex="-1"
>
<div class="focus_grabber first"></div>
<div class="tc-wrapper">
<div class="video-wrapper">
<span tabindex="0" class="spinner" aria-hidden="false" aria-label="Loading video player"></span>
<span tabindex="-1" class="btn-play fa fa-youtube-play fa-2x is-hidden" aria-hidden="true" aria-label="Play video"></span>
<div class="video-player-pre"></div>
<div class="video-player">
<div id="MIT6036L01n"></div>
<h4 class="hd hd-4 video-error is-hidden">No playable video sources found.</h4>
<h4 class="hd hd-4 video-hls-error is-hidden">
Your browser does not support this video format. Try using a different browser.
</h4>
</div>
<div class="video-player-post"></div>
<div class="closed-captions"></div>
<div class="video-controls is-hidden">
<div>
<div class="vcr"><div class="vidtime">0:00 / 0:00</div></div>
<div class="secondary-controls"></div>
</div>
</div>
</div>
</div>
<div class="focus_grabber last"></div>
</div>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@vertical+block@perceptron_offset_vert" data-init="VerticalStudentView" data-block-type="vertical" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<h2 class="hd hd-2 unit-title">Offset</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+6.036+1T2019+type@html+block@perceptron_offset">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@html+block@perceptron_offset" data-init="XBlockToXModuleShim" data-block-type="html" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<p>
Sometimes, it can be easier to implement or analyze classifiers of the form </p><table id="a0000000005" cellpadding="7" width="100%" cellspacing="0" class="eqnarray" style="table-layout:auto"><tr id="a0000000006"><td style="width:40%; border:none"> </td><td style="vertical-align:middle; text-align:right; border:none">
[mathjaxinline]\displaystyle h(x; \theta ) = \begin{cases} +1 & \text {if } \theta ^ Tx > 0 \\ -1 & \text {otherwise.} \end{cases}[/mathjaxinline]
</td><td style="width:40%; border:none"> </td><td style="width:20%; border:none" class="eqnnum"> </td></tr></table><p>
Without an explicit offset term ([mathjaxinline]\theta _0[/mathjaxinline]), this separator must pass through the origin, which may appear to be limiting. However, we can convert any problem involving a linear separator <em>with</em> offset into one with <em>no</em> offset (but of higher dimension)! </p><p>
Consider the [mathjaxinline]d[/mathjaxinline]-dimensional linear separator defined by [mathjaxinline]\theta = \begin{bmatrix} \theta _1 & \theta _2 & \cdots & \theta _ d \end{bmatrix}[/mathjaxinline] and offset [mathjaxinline]\theta _0[/mathjaxinline]. </p><ul class="itemize"><li><p>
to each data point [mathjaxinline]x \in {\cal D}[/mathjaxinline], append a coordinate with value +1, yielding </p><table id="a0000000007" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]x_{\rm new} = \begin{bmatrix} x_1 & \cdots & x_ d & +1 \end{bmatrix}^ T[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table></li><li><p>
define </p><table id="a0000000008" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]\theta _{\rm new} = \begin{bmatrix} \theta _1 & \cdots & \theta _ d & \theta _0 \end{bmatrix}^ T[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table></li></ul><p>
Then, </p><table id="a0000000009" cellpadding="7" width="100%" cellspacing="0" class="eqnarray" style="table-layout:auto"><tr id="a0000000010"><td style="width:40%; border:none"> </td><td style="vertical-align:middle; text-align:right; border:none">
[mathjaxinline]\displaystyle \theta _{\rm new}^ T \cdot x_{\rm new}[/mathjaxinline]
</td><td style="vertical-align:middle; text-align:left; border:none">
[mathjaxinline]\displaystyle = \theta _1x_1 + \cdots + \theta _ dx_ d + \theta _0 \cdot 1[/mathjaxinline]
</td><td style="width:40%; border:none"> </td><td style="width:20%; border:none" class="eqnnum"> </td></tr><tr id="a0000000011"><td style="width:40%; border:none"> </td><td style="vertical-align:middle; text-align:right; border:none">
</td><td style="vertical-align:middle; text-align:left; border:none">
[mathjaxinline]\displaystyle = \theta ^ Tx + \theta _0[/mathjaxinline]
</td><td style="width:40%; border:none"> </td><td style="width:20%; border:none" class="eqnnum"> </td></tr></table><p>
Thus, [mathjaxinline]\theta _{\rm new}[/mathjaxinline] is a [mathjaxinline](d+1)[/mathjaxinline]-dimensional separator equivalent to our original, but with no offset. </p><p>
Consider the data set: </p><table id="a0000000012" cellpadding="7" width="100%" cellspacing="0" class="eqnarray" style="table-layout:auto"><tr id="a0000000013"><td style="width:40%; border:none"> </td><td style="vertical-align:middle; text-align:right; border:none">
[mathjaxinline]\displaystyle X[/mathjaxinline]
</td><td style="vertical-align:middle; text-align:left; border:none">
[mathjaxinline]\displaystyle = [[1], [2], [3], [4]][/mathjaxinline]
</td><td style="width:40%; border:none"> </td><td style="width:20%; border:none" class="eqnnum"> </td></tr><tr id="a0000000014"><td style="width:40%; border:none"> </td><td style="vertical-align:middle; text-align:right; border:none">
[mathjaxinline]\displaystyle Y[/mathjaxinline]
</td><td style="vertical-align:middle; text-align:left; border:none">
[mathjaxinline]\displaystyle = [[+1], [+1], [-1], [-1]][/mathjaxinline]
</td><td style="width:40%; border:none"> </td><td style="width:20%; border:none" class="eqnnum"> </td></tr></table><p>
It is linearly separable in [mathjaxinline]d = 1[/mathjaxinline] with [mathjaxinline]\theta = [-1][/mathjaxinline] and [mathjaxinline]\theta _0 = 2.5[/mathjaxinline]. But it is not linearly separable through the origin! Now, let </p><table id="a0000000015" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]X_{\rm new} = \begin{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix}\begin{bmatrix} 2 \\ 1 \end{bmatrix}\begin{bmatrix} 3 \\ 1 \end{bmatrix}\begin{bmatrix} 4 \\ 1 \end{bmatrix}\end{bmatrix}[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table><p>
This new dataset is separable through the origin, with [mathjaxinline]\theta _{\rm new} = [-1, 2.5]^ T[/mathjaxinline]. </p><p>
We can make a simplified version of the perceptron algorithm if we restrict ourselves to separators <span options="" class="marginote"><span class="marginote_desc" style="display:none">We list it here because this is the version of the algorithm we'll study in more detail.</span><span>through the origin: </span></span> </p><p><img src="/assets/courseware/v1/17c0bcee94f089b14417d2ec1113ea99/asset-v1:MITx+6.036+1T2019+type@asset+block/images_perceptron_offset_codebox_1-crop.png" width="412"/></p><p>
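The append-a-coordinate trick and the through-origin algorithm in the box above can be sketched together in NumPy. As before, this is an illustrative sketch; the names `augment` and `perceptron_origin` are assumptions made here:

```python
import numpy as np

def augment(X):
    """Append a constant +1 coordinate to every example, so a separator
    through the origin in d+1 dimensions can encode an offset in d."""
    return np.hstack([X, np.ones((X.shape[0], 1))])

def perceptron_origin(X, y, tau=1000):
    """Perceptron restricted to separators through the origin."""
    theta = np.zeros(X.shape[1])
    for _ in range(tau):
        mistakes = 0
        for i in range(len(y)):
            if y[i] * (theta @ X[i]) <= 0:
                theta = theta + y[i] * X[i]
                mistakes += 1
        if mistakes == 0:  # clean pass: the data is separated
            break
    return theta

# The 1-D dataset from the text: separable with an offset,
# but not through the origin.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([1, 1, -1, -1])
theta_new = perceptron_origin(augment(X), y)
```

Running this on the 1-D dataset above, the last coordinate of the returned `theta_new` plays the role of the offset [mathjaxinline]\theta _0[/mathjaxinline] in the original space.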
<br/></p><p>
<br/></p><p><a href="/assets/courseware/v1/8f4f9aca5581dde50291b0d0e29d0148/asset-v1:MITx+6.036+1T2019+type@asset+block/notes_chapter_The_Perceptron.pdf" target="_blank">Download this chapter as a PDF file</a></p><script src="/assets/courseware/v1/1ab2c06aefab58693cfc9c10394b7503/asset-v1:MITx+6.036+1T2019+type@asset+block/marginotes.js" type="text/javascript"/><span><br/><span style="color:gray;font-size:10pt"><center>This page was last updated on Friday May 24, 2019; 02:28:15 PM (revision 4f166135)</center></span></span>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@vertical+block@MIT6036L02a_vert" data-init="VerticalStudentView" data-block-type="vertical" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<h2 class="hd hd-2 unit-title">Lecture: Perceptron - overview of plan</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02a">
<div class="xblock xblock-public_view xblock-public_view-video xmodule_display xmodule_VideoBlock" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02a" data-init="XBlockToXModuleShim" data-block-type="video" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "Video"}
</script>
<h3 class="hd hd-2">Lecture: Perceptron - overview of plan</h3>
<div
id="video_MIT6036L02a"
class="video closed"
data-metadata='{"autoAdvance": false, "prioritizeHls": false, "recordedYoutubeIsAvailable": true, "ytTestTimeout": 1500, "poster": null, "streams": "1.00:dmmlllblM1Y", "saveStateEnabled": false, "end": 0.0, "speed": null, "completionPercentage": 0.95, "start": 0.0, "publishCompletionUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02a/handler/publish_completion", "duration": 0.0, "autoplay": false, "savedVideoPosition": 0.0, "generalSpeed": 1.0, "autohideHtml5": false, "ytMetadataEndpoint": "", "transcriptTranslationUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02a/handler/transcript/translation/__lang__", "showCaptions": "true", "completionEnabled": false, "captionDataDir": null, "ytApiUrl": "https://www.youtube.com/iframe_api", "saveStateUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02a/handler/xmodule_handler/save_user_state", "transcriptAvailableTranslationsUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02a/handler/transcript/available_translations", "sources": [], "transcriptLanguages": {"en": "English"}, "transcriptLanguage": "en", "lmsRootURL": "https://openlearninglibrary.mit.edu"}'
data-bumper-metadata='null'
data-autoadvance-enabled="False"
data-poster='null'
tabindex="-1"
>
<div class="focus_grabber first"></div>
<div class="tc-wrapper">
<div class="video-wrapper">
<span tabindex="0" class="spinner" aria-hidden="false" aria-label="Loading video player"></span>
<span tabindex="-1" class="btn-play fa fa-youtube-play fa-2x is-hidden" aria-hidden="true" aria-label="Play video"></span>
<div class="video-player-pre"></div>
<div class="video-player">
<div id="MIT6036L02a"></div>
<h4 class="hd hd-4 video-error is-hidden">No playable video sources found.</h4>
<h4 class="hd hd-4 video-hls-error is-hidden">
Your browser does not support this video format. Try using a different browser.
</h4>
</div>
<div class="video-player-post"></div>
<div class="closed-captions"></div>
<div class="video-controls is-hidden">
<div>
<div class="vcr"><div class="vidtime">0:00 / 0:00</div></div>
<div class="secondary-controls"></div>
</div>
</div>
</div>
</div>
<div class="focus_grabber last"></div>
</div>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@vertical+block@MIT6036L02b_vert" data-init="VerticalStudentView" data-block-type="vertical" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<h2 class="hd hd-2 unit-title">Lecture: Perceptron through origin algorithm</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02b">
<div class="xblock xblock-public_view xblock-public_view-video xmodule_display xmodule_VideoBlock" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02b" data-init="XBlockToXModuleShim" data-block-type="video" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "Video"}
</script>
<h3 class="hd hd-2">Lecture: Perceptron through origin algorithm</h3>
<div
id="video_MIT6036L02b"
class="video closed"
data-metadata='{"autoAdvance": false, "prioritizeHls": false, "recordedYoutubeIsAvailable": true, "ytTestTimeout": 1500, "poster": null, "streams": "1.00:tFdk5BZhMrU", "saveStateEnabled": false, "end": 0.0, "speed": null, "completionPercentage": 0.95, "start": 0.0, "publishCompletionUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02b/handler/publish_completion", "duration": 0.0, "autoplay": false, "savedVideoPosition": 0.0, "generalSpeed": 1.0, "autohideHtml5": false, "ytMetadataEndpoint": "", "transcriptTranslationUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02b/handler/transcript/translation/__lang__", "showCaptions": "true", "completionEnabled": false, "captionDataDir": null, "ytApiUrl": "https://www.youtube.com/iframe_api", "saveStateUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02b/handler/xmodule_handler/save_user_state", "transcriptAvailableTranslationsUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02b/handler/transcript/available_translations", "sources": [], "transcriptLanguages": {"en": "English"}, "transcriptLanguage": "en", "lmsRootURL": "https://openlearninglibrary.mit.edu"}'
data-bumper-metadata='null'
data-autoadvance-enabled="False"
data-poster='null'
tabindex="-1"
>
<div class="focus_grabber first"></div>
<div class="tc-wrapper">
<div class="video-wrapper">
<span tabindex="0" class="spinner" aria-hidden="false" aria-label="Loading video player"></span>
<span tabindex="-1" class="btn-play fa fa-youtube-play fa-2x is-hidden" aria-hidden="true" aria-label="Play video"></span>
<div class="video-player-pre"></div>
<div class="video-player">
<div id="MIT6036L02b"></div>
<h4 class="hd hd-4 video-error is-hidden">No playable video sources found.</h4>
<h4 class="hd hd-4 video-hls-error is-hidden">
Your browser does not support this video format. Try using a different browser.
</h4>
</div>
<div class="video-player-post"></div>
<div class="closed-captions"></div>
<div class="video-controls is-hidden">
<div>
<div class="vcr"><div class="vidtime">0:00 / 0:00</div></div>
<div class="secondary-controls"></div>
</div>
</div>
</div>
</div>
<div class="focus_grabber last"></div>
</div>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@vertical+block@MIT6036L02c_vert" data-init="VerticalStudentView" data-block-type="vertical" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<h2 class="hd hd-2 unit-title">Lecture: Theory of perceptron - Linear separability</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02c">
<div class="xblock xblock-public_view xblock-public_view-video xmodule_display xmodule_VideoBlock" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02c" data-init="XBlockToXModuleShim" data-block-type="video" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "Video"}
</script>
<h3 class="hd hd-2">Lecture: Theory of perceptron - Linear separability</h3>
<div
id="video_MIT6036L02c"
class="video closed"
data-metadata='{"autoAdvance": false, "prioritizeHls": false, "recordedYoutubeIsAvailable": true, "ytTestTimeout": 1500, "poster": null, "streams": "1.00:uDClajMTmZw", "saveStateEnabled": false, "end": 0.0, "speed": null, "completionPercentage": 0.95, "start": 0.0, "publishCompletionUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02c/handler/publish_completion", "duration": 0.0, "autoplay": false, "savedVideoPosition": 0.0, "generalSpeed": 1.0, "autohideHtml5": false, "ytMetadataEndpoint": "", "transcriptTranslationUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02c/handler/transcript/translation/__lang__", "showCaptions": "true", "completionEnabled": false, "captionDataDir": null, "ytApiUrl": "https://www.youtube.com/iframe_api", "saveStateUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02c/handler/xmodule_handler/save_user_state", "transcriptAvailableTranslationsUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02c/handler/transcript/available_translations", "sources": [], "transcriptLanguages": {"en": "English"}, "transcriptLanguage": "en", "lmsRootURL": "https://openlearninglibrary.mit.edu"}'
data-bumper-metadata='null'
data-autoadvance-enabled="False"
data-poster='null'
tabindex="-1"
>
<div class="focus_grabber first"></div>
<div class="tc-wrapper">
<div class="video-wrapper">
<span tabindex="0" class="spinner" aria-hidden="false" aria-label="Loading video player"></span>
<span tabindex="-1" class="btn-play fa fa-youtube-play fa-2x is-hidden" aria-hidden="true" aria-label="Play video"></span>
<div class="video-player-pre"></div>
<div class="video-player">
<div id="MIT6036L02c"></div>
<h4 class="hd hd-4 video-error is-hidden">No playable video sources found.</h4>
<h4 class="hd hd-4 video-hls-error is-hidden">
Your browser does not support this video format. Try using a different browser.
</h4>
</div>
<div class="video-player-post"></div>
<div class="closed-captions"></div>
<div class="video-controls is-hidden">
<div>
<div class="vcr"><div class="vidtime">0:00 / 0:00</div></div>
<div class="secondary-controls"></div>
</div>
</div>
</div>
</div>
<div class="focus_grabber last"></div>
</div>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@vertical+block@MIT6036L02d_vert" data-init="VerticalStudentView" data-block-type="vertical" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<h2 class="hd hd-2 unit-title">Lecture: Theory of perceptron - margin of a dataset</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02d">
<div class="xblock xblock-public_view xblock-public_view-video xmodule_display xmodule_VideoBlock" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02d" data-init="XBlockToXModuleShim" data-block-type="video" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "Video"}
</script>
<h3 class="hd hd-2">Lecture: Theory of perceptron - margin of a dataset</h3>
<div
id="video_MIT6036L02d"
class="video closed"
data-metadata='{"autoAdvance": false, "prioritizeHls": false, "recordedYoutubeIsAvailable": true, "ytTestTimeout": 1500, "poster": null, "streams": "1.00:Nsn82ZQt4Yc", "saveStateEnabled": false, "end": 0.0, "speed": null, "completionPercentage": 0.95, "start": 0.0, "publishCompletionUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02d/handler/publish_completion", "duration": 0.0, "autoplay": false, "savedVideoPosition": 0.0, "generalSpeed": 1.0, "autohideHtml5": false, "ytMetadataEndpoint": "", "transcriptTranslationUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02d/handler/transcript/translation/__lang__", "showCaptions": "true", "completionEnabled": false, "captionDataDir": null, "ytApiUrl": "https://www.youtube.com/iframe_api", "saveStateUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02d/handler/xmodule_handler/save_user_state", "transcriptAvailableTranslationsUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02d/handler/transcript/available_translations", "sources": [], "transcriptLanguages": {"en": "English"}, "transcriptLanguage": "en", "lmsRootURL": "https://openlearninglibrary.mit.edu"}'
data-bumper-metadata='null'
data-autoadvance-enabled="False"
data-poster='null'
tabindex="-1"
>
<div class="focus_grabber first"></div>
<div class="tc-wrapper">
<div class="video-wrapper">
<span tabindex="0" class="spinner" aria-hidden="false" aria-label="Loading video player"></span>
<span tabindex="-1" class="btn-play fa fa-youtube-play fa-2x is-hidden" aria-hidden="true" aria-label="Play video"></span>
<div class="video-player-pre"></div>
<div class="video-player">
<div id="MIT6036L02d"></div>
<h4 class="hd hd-4 video-error is-hidden">No playable video sources found.</h4>
<h4 class="hd hd-4 video-hls-error is-hidden">
Your browser does not support this video format. Try using a different browser.
</h4>
</div>
<div class="video-player-post"></div>
<div class="closed-captions"></div>
<div class="video-controls is-hidden">
<div>
<div class="vcr"><div class="vidtime">0:00 / 0:00</div></div>
<div class="secondary-controls"></div>
</div>
</div>
</div>
</div>
<div class="focus_grabber last"></div>
</div>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@vertical+block@MIT6036L02e_vert" data-init="VerticalStudentView" data-block-type="vertical" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<h2 class="hd hd-2 unit-title">Lecture: Perceptron convergence theorem</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02e">
<div class="xblock xblock-public_view xblock-public_view-video xmodule_display xmodule_VideoBlock" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02e" data-init="XBlockToXModuleShim" data-block-type="video" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "Video"}
</script>
<h3 class="hd hd-2">Lecture: Perceptron convergence theorem</h3>
<div
id="video_MIT6036L02e"
class="video closed"
data-metadata='{"autoAdvance": false, "prioritizeHls": false, "recordedYoutubeIsAvailable": true, "ytTestTimeout": 1500, "poster": null, "streams": "1.00:fjByzXxgYgk", "saveStateEnabled": false, "end": 0.0, "speed": null, "completionPercentage": 0.95, "start": 0.0, "publishCompletionUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02e/handler/publish_completion", "duration": 0.0, "autoplay": false, "savedVideoPosition": 0.0, "generalSpeed": 1.0, "autohideHtml5": false, "ytMetadataEndpoint": "", "transcriptTranslationUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02e/handler/transcript/translation/__lang__", "showCaptions": "true", "completionEnabled": false, "captionDataDir": null, "ytApiUrl": "https://www.youtube.com/iframe_api", "saveStateUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02e/handler/xmodule_handler/save_user_state", "transcriptAvailableTranslationsUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02e/handler/transcript/available_translations", "sources": [], "transcriptLanguages": {"en": "English"}, "transcriptLanguage": "en", "lmsRootURL": "https://openlearninglibrary.mit.edu"}'
data-bumper-metadata='null'
data-autoadvance-enabled="False"
data-poster='null'
tabindex="-1"
>
<div class="focus_grabber first"></div>
<div class="tc-wrapper">
<div class="video-wrapper">
<span tabindex="0" class="spinner" aria-hidden="false" aria-label="Loading video player"></span>
<span tabindex="-1" class="btn-play fa fa-youtube-play fa-2x is-hidden" aria-hidden="true" aria-label="Play video"></span>
<div class="video-player-pre"></div>
<div class="video-player">
<div id="MIT6036L02e"></div>
<h4 class="hd hd-4 video-error is-hidden">No playable video sources found.</h4>
<h4 class="hd hd-4 video-hls-error is-hidden">
Your browser does not support this video format. Try using a different browser.
</h4>
</div>
<div class="video-player-post"></div>
<div class="closed-captions"></div>
<div class="video-controls is-hidden">
<div>
<div class="vcr"><div class="vidtime">0:00 / 0:00</div></div>
<div class="secondary-controls"></div>
</div>
</div>
</div>
</div>
<div class="focus_grabber last"></div>
</div>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@vertical+block@MIT6036L02f_vert" data-init="VerticalStudentView" data-block-type="vertical" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<h2 class="hd hd-2 unit-title">Lecture: Proof sketch of the perceptron convergence theorem</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02f">
<div class="xblock xblock-public_view xblock-public_view-video xmodule_display xmodule_VideoBlock" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02f" data-init="XBlockToXModuleShim" data-block-type="video" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "Video"}
</script>
<h3 class="hd hd-2">Lecture: Proof sketch of the perceptron convergence theorem</h3>
<div
id="video_MIT6036L02f"
class="video closed"
data-metadata='{"autoAdvance": false, "prioritizeHls": false, "recordedYoutubeIsAvailable": true, "ytTestTimeout": 1500, "poster": null, "streams": "1.00:DRr3zUmI9WU", "saveStateEnabled": false, "end": 0.0, "speed": null, "completionPercentage": 0.95, "start": 0.0, "publishCompletionUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02f/handler/publish_completion", "duration": 0.0, "autoplay": false, "savedVideoPosition": 0.0, "generalSpeed": 1.0, "autohideHtml5": false, "ytMetadataEndpoint": "", "transcriptTranslationUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02f/handler/transcript/translation/__lang__", "showCaptions": "true", "completionEnabled": false, "captionDataDir": null, "ytApiUrl": "https://www.youtube.com/iframe_api", "saveStateUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02f/handler/xmodule_handler/save_user_state", "transcriptAvailableTranslationsUrl": "/courses/course-v1:MITx+6.036+1T2019/xblock/block-v1:MITx+6.036+1T2019+type@video+block@MIT6036L02f/handler/transcript/available_translations", "sources": [], "transcriptLanguages": {"en": "English"}, "transcriptLanguage": "en", "lmsRootURL": "https://openlearninglibrary.mit.edu"}'
data-bumper-metadata='null'
data-autoadvance-enabled="False"
data-poster='null'
tabindex="-1"
>
<div class="focus_grabber first"></div>
<div class="tc-wrapper">
<div class="video-wrapper">
<span tabindex="0" class="spinner" aria-hidden="false" aria-label="Loading video player"></span>
<span tabindex="-1" class="btn-play fa fa-youtube-play fa-2x is-hidden" aria-hidden="true" aria-label="Play video"></span>
<div class="video-player-pre"></div>
<div class="video-player">
<div id="MIT6036L02f"></div>
<h4 class="hd hd-4 video-error is-hidden">No playable video sources found.</h4>
<h4 class="hd hd-4 video-hls-error is-hidden">
Your browser does not support this video format. Try using a different browser.
</h4>
</div>
<div class="video-player-post"></div>
<div class="closed-captions"></div>
<div class="video-controls is-hidden">
<div>
<div class="vcr"><div class="vidtime">0:00 / 0:00</div></div>
<div class="secondary-controls"></div>
</div>
</div>
</div>
</div>
<div class="focus_grabber last"></div>
</div>
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@vertical+block@perceptron_theory_of_the_perceptron_vert" data-init="VerticalStudentView" data-block-type="vertical" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<h2 class="hd hd-2 unit-title">Theory of the perceptron</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+6.036+1T2019+type@html+block@perceptron_theory_of_the_perceptron">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-request-token="58abd4ee05d711f0a17b0affe2bbc7c1" data-usage-id="block-v1:MITx+6.036+1T2019+type@html+block@perceptron_theory_of_the_perceptron" data-init="XBlockToXModuleShim" data-block-type="html" data-runtime-class="LmsRuntime" data-course-id="course-v1:MITx+6.036+1T2019" data-has-score="False" data-graded="False" data-runtime-version="1">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<p>
Now, we'll say something formal about how well the perceptron algorithm really works. We start by characterizing the set of problems that can be solved perfectly by the perceptron algorithm, and then prove that, in fact, it can solve these problems. In addition, we provide a notion of what makes a problem difficult for perceptron and link that notion of difficulty to the number of iterations the algorithm will take. </p><p><h3>Linear separability</h3> A training set [mathjaxinline]{\cal D}_ n[/mathjaxinline] is <em>linearly separable</em> if there exist [mathjaxinline]\theta , \theta _0[/mathjaxinline] such that, for all [mathjaxinline]i = 1, 2, \dots , n[/mathjaxinline]: </p><table id="a0000000016" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]y^{(i)}\left(\theta ^ Tx^{(i)} + \theta _0\right) > 0 \; \; .[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table><p>
Another way to say this is that all predictions on the training set are correct: </p><table id="a0000000017" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]h(x^{(i)}; \theta , \theta _0) = y^{(i)} \; \; .[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table><p>
And, another way to say this is that the training error is zero: </p><table id="a0000000018" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]\mathcal{E}_ n(h) = 0 \; \; .[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table><p><h3>Convergence theorem</h3> The basic result about the perceptron is that, if the training data [mathjaxinline]{\cal D}_ n[/mathjaxinline] is linearly separable, then the perceptron algorithm is guaranteed to find <span options="" class="marginote"><span class="marginote_desc" style="display:none">If the training data is <em>not</em> linearly separable, the algorithm will not be able to tell you for sure, in finite time, that it is not linearly separable. There are other algorithms that can test for linear separability with run-times [mathjaxinline]O(n^{d/2})[/mathjaxinline] or [mathjaxinline]O(d^{2n})[/mathjaxinline] or [mathjaxinline]O(n^{d-1}\log {n})[/mathjaxinline].</span><span>a linear separator. </span></span> </p><p>
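</p>
<p>
The zero-training-error characterization translates directly into code. The sketch below (with made-up toy data, not taken from the notes) checks whether a candidate [mathjaxinline]\theta , \theta _0[/mathjaxinline] linearly separates a dataset by testing that every [mathjaxinline]y^{(i)}\left(\theta ^ Tx^{(i)} + \theta _0\right)[/mathjaxinline] is strictly positive.
</p>

```python
import numpy as np

def separates(theta, theta_0, X, y):
    """True iff (theta, theta_0) classifies every labeled point correctly,
    i.e. y_i * (theta . x_i + theta_0) > 0 for all i (zero training error)."""
    # X holds one example per column (d x n); y holds +/-1 labels.
    return bool(np.all(y * (theta @ X + theta_0) > 0))

# Hypothetical toy data: positives satisfy x1 > x2, negatives the reverse.
X = np.array([[2.0, 3.0, -1.0, 0.0],
              [1.0, -1.0, 2.0, 3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

separates(np.array([1.0, -1.0]), 0.0, X, y)   # this theta separates the data
```

<p>
Any [mathjaxinline]\theta , \theta _0[/mathjaxinline] for which this returns true witnesses the linear separability of [mathjaxinline]{\cal D}_ n[/mathjaxinline].
</p>
<p>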
We will more specifically characterize the linear separability of the dataset by the <em>margin</em> of the separator. We'll start by defining the margin of a point with respect to a hyperplane. </p><p>
First, recall that the signed distance of a point [mathjaxinline]x[/mathjaxinline] to the hyperplane [mathjaxinline]\theta , \theta _0[/mathjaxinline] is </p><table id="a0000000019" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]\frac{\theta ^ Tx + \theta _0}{\left\lVert \theta \right\rVert }\; \; .[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table><p>
Then, we'll define the <em>margin</em> of a <em>labeled point</em> [mathjaxinline](x, y)[/mathjaxinline] with respect to hyperplane [mathjaxinline]\theta , \theta _0[/mathjaxinline] to be </p><table id="a0000000020" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]y \cdot \frac{\theta ^ Tx + \theta _0}{\left\lVert \theta \right\rVert }\; \; .[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table><p>
This quantity will be positive if and only if the point [mathjaxinline]x[/mathjaxinline] is classified as [mathjaxinline]y[/mathjaxinline] by the linear classifier represented by this hyperplane. <br/> <br/><span style="color:#FF0000"><b class="bf">Study Question:</b></span> <span style="color:#0000FF">What sign does the margin have if the point is incorrectly classified? Be sure you can explain why.</span> <br/></p><p>
Now, the <em>margin</em> of a <em>dataset</em> [mathjaxinline]{\cal D}_ n[/mathjaxinline] with respect to the hyperplane [mathjaxinline]\theta , \theta _0[/mathjaxinline] is the <em>minimum</em> margin of any point with respect to [mathjaxinline]\theta , \theta _0[/mathjaxinline]: </p><table id="a0000000021" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]\min _ i \left(y^{(i)} \cdot \frac{\theta ^ T x^{(i)} + \theta _0}{\left\lVert \theta \right\rVert }\right)\; \; .[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table><p>
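</p>
<p>
Both definitions can be computed in a few lines. The separator and points below are invented for illustration ([mathjaxinline]\left\lVert \theta \right\rVert = 5[/mathjaxinline] keeps the arithmetic clean); this is a sketch, not course-provided code.
</p>

```python
import numpy as np

def point_margin(x, y, theta, theta_0):
    """Margin of the labeled point (x, y) with respect to (theta, theta_0)."""
    return y * (theta @ x + theta_0) / np.linalg.norm(theta)

def dataset_margin(X, y, theta, theta_0):
    """Margin of a dataset: the minimum margin over its points (columns of X)."""
    return min(point_margin(X[:, i], y[i], theta, theta_0)
               for i in range(X.shape[1]))

# Hypothetical separator and labeled points.
theta, theta_0 = np.array([3.0, 4.0]), 5.0
X = np.array([[1.0, 0.0, -1.0],
              [1.0, -2.0, 0.0]])
y = np.array([1.0, -1.0, 1.0])

dataset_margin(X, y, theta, theta_0)   # minimum of margins 2.4, 0.6, 0.4
```

<p>
Here all three point margins are positive, so the dataset margin (0.4) is the distance from the hyperplane to the closest point.
</p>
<p>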
The margin is positive if and only if all of the points in the dataset are classified correctly. In that case (only!) it represents the distance from the hyperplane to the closest point. </p><p><div style="border-radius:10px;padding:5px;border-style:solid;background-color:rgba(0,255,0,0.03);" class="examplebox"><p><b class="bf">Example:</b> Let [mathjaxinline]h[/mathjaxinline] be the linear classifier defined by [mathjaxinline]\theta = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \theta _0 = 1[/mathjaxinline]. </p><p>
The diagram below shows several points classified by [mathjaxinline]h[/mathjaxinline], one of which is misclassified. We compute the margin for each point: </p><center><p><img src="/assets/courseware/v1/506746fa1d7a249d787a83669a295862/asset-v1:MITx+6.036+1T2019+type@asset+block/images_perceptron_theory_of_the_perceptron_tikzpicture_1-crop.png" width="467"/></p><table id="a0000000022" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]y^{(1)} \cdot \frac{\theta ^ T x^{(1)} + \theta _0}{\left\lVert \theta \right\rVert } = 1 \cdot \frac{-2 + 1}{\sqrt {2}} = -\frac{\sqrt {2}}{2}[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table><table id="a0000000023" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]y^{(2)} \cdot \frac{\theta ^ T x^{(2)} + \theta _0}{\left\lVert \theta \right\rVert } = 1 \cdot \frac{1 + 1}{\sqrt {2}} = \sqrt {2}[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table><table id="a0000000024" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]y^{(3)} \cdot \frac{\theta ^ T x^{(3)} + \theta _0}{\left\lVert \theta \right\rVert } = -1 \cdot \frac{-3 + 1}{\sqrt {2}} = \sqrt {2}[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table></center><p>
Note that since point [mathjaxinline]x^{(1)}[/mathjaxinline] is misclassified, its margin is negative. Thus the margin for the whole data set is given by [mathjaxinline]-\frac{\sqrt {2}}{2}[/mathjaxinline]. </p></div></p><p><b class="bf">Theorem:</b><em>[Perceptron Convergence] For simplicity, we consider the case where the linear separator must pass through the origin. If the following conditions hold: <ol class="enumerate"><li value="1"><p>
there exists [mathjaxinline]\theta ^*[/mathjaxinline] such that [mathjaxinline]y^{(i)} \frac{\theta ^{*T}x^{(i)}}{\left\lVert \theta ^*\right\rVert } \geq \gamma[/mathjaxinline] for all [mathjaxinline]i = 1, \ldots , n[/mathjaxinline] and for some [mathjaxinline]\gamma > 0[/mathjaxinline] and </p></li><li value="2"><p>
all the examples have bounded magnitude: [mathjaxinline]\left\lVert x^{(i)}\right\rVert \leq R[/mathjaxinline] for all [mathjaxinline]i = 1, \ldots , n[/mathjaxinline], </p></li></ol> then the perceptron algorithm will make at most [mathjaxinline]\left(\frac{R}{\gamma } \right)^2[/mathjaxinline] mistakes. <span class="rm"/></em></p><p>
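</p>
<p>
The theorem can be exercised numerically. The sketch below uses hypothetical data that is separable through the origin by [mathjaxinline]\theta ^* = (1, 0)[/mathjaxinline]; it runs the through-origin perceptron, counts mistakes, and compares the count against [mathjaxinline]\left(\frac{R}{\gamma }\right)^2[/mathjaxinline]. It is an illustration, not the course's reference implementation.
</p>

```python
import numpy as np

def perceptron_through_origin(X, y, max_passes=100):
    """Perceptron with no offset term: on each mistake, theta += y_i * x_i.
    Returns (theta, total number of mistakes made)."""
    theta = np.zeros(X.shape[0])
    mistakes = 0
    for _ in range(max_passes):
        clean_pass = True
        for i in range(X.shape[1]):
            if y[i] * (theta @ X[:, i]) <= 0:      # a zero counts as a mistake
                theta = theta + y[i] * X[:, i]
                mistakes += 1
                clean_pass = False
        if clean_pass:                             # converged: a full pass with no mistakes
            break
    return theta, mistakes

# Hypothetical data, linearly separable through the origin by theta* = (1, 0).
X = np.array([[2.0, 3.0, -2.0, -1.0],
              [1.0, -1.0, 2.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

theta, mistakes = perceptron_through_origin(X, y)
theta_star = np.array([1.0, 0.0])                  # unit norm, so no normalization needed
gamma = min(y * (theta_star @ X))                  # margin of theta*: here 1.0
R = max(np.linalg.norm(X[:, i]) for i in range(X.shape[1]))   # here sqrt(10)
# The theorem guarantees mistakes <= (R / gamma)**2 = 10 on this data.
```

<p>
The returned [mathjaxinline]\theta[/mathjaxinline] separates the data, and the mistake count stays within the theorem's bound.
</p>
<p>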
<b class="bf">Proof:</b> We initialize [mathjaxinline]\theta ^{(0)} = 0[/mathjaxinline], and let [mathjaxinline]\theta ^{(k)}[/mathjaxinline] define our hyperplane after the perceptron algorithm has made [mathjaxinline]k[/mathjaxinline] mistakes. We are going to think about the angle between the hypothesis we have now, [mathjaxinline]\theta ^{(k)}[/mathjaxinline], and the assumed good separator [mathjaxinline]\theta ^*[/mathjaxinline]. Since they both go through the origin, if we can show that the angle between them decreases usefully with every mistake, then we will get close to that separator. </p><p>
So, let's think about the [mathjaxinline]\cos[/mathjaxinline] of the angle between them, and recall, by the definition of dot product: </p><table id="a0000000025" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]\cos \left(\theta ^{(k)}, \theta ^*\right) = \frac{\theta ^{(k)} \cdot \theta ^*}{ \left\lVert \theta ^*\right\rVert \left\lVert \theta ^{(k)}\right\rVert }[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table><p>
We'll divide this up into two factors, </p><table id="a0000000026" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]\cos \left(\theta ^{(k)}, \theta ^*\right) = \left(\frac{\theta ^{(k)} \cdot \theta ^*}{\left\lVert \theta ^*\right\rVert }\right)\cdot \left(\frac{1}{\left\lVert \theta ^{(k)}\right\rVert }\right)\; \; ,[/mathjax]</td><td class="eqnnum" style="width:20%; border:none;text-align:right">(1.1)</td></tr></table><p>
and start by focusing on the first factor. </p><p>
Without loss of generality, assume that the [mathjaxinline]k^{th}[/mathjaxinline] mistake occurs on the [mathjaxinline]i^{th}[/mathjaxinline] example [mathjaxinline]\left(x^{(i)}, y^{(i)}\right)[/mathjaxinline]. </p><table id="a0000000027" cellpadding="7" width="100%" cellspacing="0" class="eqnarray" style="table-layout:auto"><tr id="a0000000028"><td style="width:40%; border:none"> </td><td style="vertical-align:middle; text-align:right; border:none">
[mathjaxinline]\displaystyle \frac{\theta ^{(k)} \cdot \theta ^*}{\left\lVert \theta ^*\right\rVert }[/mathjaxinline]
</td><td style="vertical-align:middle; text-align:left; border:none">
[mathjaxinline]\displaystyle = \frac{\left(\theta ^{(k-1)} + y^{(i)}x^{(i)}\right)\cdot \theta ^*}{\left\lVert \theta ^*\right\rVert }[/mathjaxinline]
</td><td style="width:40%; border:none"> </td><td style="width:20%; border:none" class="eqnnum"> </td></tr><tr id="a0000000029"><td style="width:40%; border:none"> </td><td style="vertical-align:middle; text-align:right; border:none">
</td><td style="vertical-align:middle; text-align:left; border:none">
[mathjaxinline]\displaystyle = \frac{\theta ^{(k-1)}\cdot \theta ^*}{\left\lVert \theta ^*\right\rVert } + \frac{ y^{(i)}x^{(i)}\cdot \theta ^*}{\left\lVert \theta ^*\right\rVert }[/mathjaxinline]
</td><td style="width:40%; border:none"> </td><td style="width:20%; border:none" class="eqnnum"> </td></tr><tr id="a0000000030"><td style="width:40%; border:none"> </td><td style="vertical-align:middle; text-align:right; border:none">
</td><td style="vertical-align:middle; text-align:left; border:none">
[mathjaxinline]\displaystyle \geq \frac{\theta ^{(k-1)}\cdot \theta ^*}{\left\lVert \theta ^*\right\rVert } + \gamma[/mathjaxinline]
</td><td style="width:40%; border:none"> </td><td style="width:20%; border:none" class="eqnnum"> </td></tr><tr id="a0000000031"><td style="width:40%; border:none"> </td><td style="vertical-align:middle; text-align:right; border:none">
</td><td style="vertical-align:middle; text-align:left; border:none">
[mathjaxinline]\displaystyle \geq k\gamma[/mathjaxinline]
</td><td style="width:40%; border:none"> </td><td style="width:20%; border:none" class="eqnnum"> </td></tr></table><p>
where we have first applied the margin condition from assumption 1 of the theorem (on the mistaken example) and then applied simple induction, using [mathjaxinline]\theta ^{(0)} = 0[/mathjaxinline] as the base case. </p><p>
Now, we'll look at the second factor in equation (1.1). We note that since [mathjaxinline]\left(x^{(i)},y^{(i)}\right)[/mathjaxinline] is classified incorrectly, [mathjaxinline]y^{(i)}\left({\theta ^{(k-1)}}^ Tx^{(i)}\right) \leq 0[/mathjaxinline]. Thus, </p><table id="a0000000032" cellpadding="7" width="100%" cellspacing="0" class="eqnarray" style="table-layout:auto"><tr id="a0000000033"><td style="width:40%; border:none"> </td><td style="vertical-align:middle; text-align:right; border:none">
[mathjaxinline]\displaystyle \left\lVert \theta ^{(k)}\right\rVert ^2[/mathjaxinline]
</td><td style="vertical-align:middle; text-align:left; border:none">
[mathjaxinline]\displaystyle = \left\lVert \theta ^{(k-1)} + y^{(i)}x^{(i)}\right\rVert ^2[/mathjaxinline]
</td><td style="width:40%; border:none"> </td><td style="width:20%; border:none" class="eqnnum"> </td></tr><tr id="a0000000034"><td style="width:40%; border:none"> </td><td style="vertical-align:middle; text-align:right; border:none">
</td><td style="vertical-align:middle; text-align:left; border:none">
[mathjaxinline]\displaystyle = \left\lVert \theta ^{(k-1)}\right\rVert ^2 + 2y^{(i)} {\theta ^{(k-1)}}^ Tx^{(i)} + \left\lVert x^{(i)}\right\rVert ^2[/mathjaxinline]
</td><td style="width:40%; border:none"> </td><td style="width:20%; border:none" class="eqnnum"> </td></tr><tr id="a0000000035"><td style="width:40%; border:none"> </td><td style="vertical-align:middle; text-align:right; border:none">
</td><td style="vertical-align:middle; text-align:left; border:none">
[mathjaxinline]\displaystyle \leq \left\lVert \theta ^{(k-1)}\right\rVert ^2 + R^2[/mathjaxinline]
</td><td style="width:40%; border:none"> </td><td style="width:20%; border:none" class="eqnnum"> </td></tr><tr id="a0000000036"><td style="width:40%; border:none"> </td><td style="vertical-align:middle; text-align:right; border:none">
</td><td style="vertical-align:middle; text-align:left; border:none">
[mathjaxinline]\displaystyle \leq kR^2[/mathjaxinline]
</td><td style="width:40%; border:none"> </td><td style="width:20%; border:none" class="eqnnum"> </td></tr></table><p>
where we have additionally applied the assumption from [mathjaxinline](b)[/mathjaxinline] and then again used simple induction. </p><p>
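The second bound can be checked the same way. In the sketch below (again our own illustration, reusing the assumed toy data from above), the cross term [mathjaxinline]2y^{(i)} {\theta ^{(k-1)}}^ Tx^{(i)}[/mathjaxinline] is nonpositive at every update, so the squared norm grows by at most [mathjaxinline]R^2[/mathjaxinline] per mistake, giving [mathjaxinline]\left\lVert \theta ^{(k)}\right\rVert ^2 \leq kR^2[/mathjaxinline]. </p>

```python
import numpy as np

# Same assumed toy data as before; R bounds the norms of the training vectors.
X = np.array([[2.0, 1.0], [1.0, 3.0], [-2.0, -1.0], [-1.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
R = max(np.linalg.norm(xi) for xi in X)

theta = np.zeros(2)
k = 0                                                   # mistake counter
changed = True
while changed:
    changed = False
    for xi, yi in zip(X, y):
        if yi * (theta @ xi) <= 0:
            # On a mistake the cross term 2 y theta . x is <= 0, so the
            # squared norm grows by at most ||x||^2 <= R^2 per update.
            theta = theta + yi * xi
            k += 1
            assert theta @ theta <= k * R**2 + 1e-9
            changed = True
```

<p>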
Returning to the definition of the dot product, we have </p><table id="a0000000037" class="equation" width="100%" cellspacing="0" cellpadding="7" style="table-layout:auto"><tr><td class="equation" style="width:80%; border:none">[mathjax]\cos \left(\theta ^{(k)}, \theta ^*\right) = \frac{\theta ^{(k)} \cdot \theta ^*}{\left\lVert \theta ^{(k)}\right\rVert \left\lVert \theta ^*\right\rVert } = \left(\frac{\theta ^{(k)} \cdot \theta ^*}{\left\lVert \theta ^*\right\rVert }\right) \frac{1}{\left\lVert \theta ^{(k)}\right\rVert } \geq (k\gamma )\cdot \frac{1}{\sqrt {k}R} = \sqrt {k}\cdot \frac{\gamma }{R}[/mathjax]</td><td class="eqnnum" style="width:20%; border:none"> </td></tr></table><p>
Since the value of the cosine is at most 1, we have </p><table id="a0000000038" cellpadding="7" width="100%" cellspacing="0" class="eqnarray" style="table-layout:auto"><tr id="a0000000039"><td style="width:40%; border:none"> </td><td style="vertical-align:middle; text-align:right; border:none">
[mathjaxinline]\displaystyle 1[/mathjaxinline]
</td><td style="vertical-align:middle; text-align:left; border:none">
[mathjaxinline]\displaystyle \geq \sqrt {k} \cdot \frac{\gamma }{R}[/mathjaxinline]
</td><td style="width:40%; border:none"> </td><td style="width:20%; border:none" class="eqnnum"> </td></tr><tr id="a0000000040"><td style="width:40%; border:none"> </td><td style="vertical-align:middle; text-align:right; border:none">
[mathjaxinline]\displaystyle k[/mathjaxinline]
</td><td style="vertical-align:middle; text-align:left; border:none">
[mathjaxinline]\displaystyle \leq \left(\frac{R}{\gamma }\right)^2.[/mathjaxinline]
</td><td style="width:40%; border:none"> </td><td style="width:20%; border:none" class="eqnnum"> </td></tr></table><p>
This result endows the margin [mathjaxinline]\gamma[/mathjaxinline] of [mathjaxinline]{\cal D}_ n[/mathjaxinline] with an operational meaning: when running the Perceptron algorithm on [mathjaxinline]{\cal D}_ n[/mathjaxinline], at most [mathjaxinline](R/\gamma )^2[/mathjaxinline] mistakes (and hence updates) will be made, where [mathjaxinline]R[/mathjaxinline] is an upper bound on the magnitudes of the training vectors. </p><p>
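The mistake bound itself is easy to observe empirically. The sketch below (our own illustration; the separator [mathjaxinline]\theta ^*[/mathjaxinline], the margin cutoff, and the random data are all assumptions) builds a separable data set with a known margin, runs the perceptron to convergence, and confirms that the number of mistakes never exceeds [mathjaxinline](R/\gamma )^2[/mathjaxinline]. </p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical separable data: label random points with a fixed theta^*,
# discarding points within distance 0.5 of the boundary to enforce a margin.
theta_star = np.array([1.0, -2.0])
pts = rng.normal(size=(200, 2))
m = pts @ theta_star / np.linalg.norm(theta_star)       # signed margins
keep = np.abs(m) > 0.5
X, y = pts[keep], np.sign(m[keep])

gamma = np.min(np.abs(m[keep]))                         # margin of theta^*
R = max(np.linalg.norm(xi) for xi in X)                 # radius of the data

theta = np.zeros(2)
mistakes = 0
changed = True
while changed:              # terminates: the data are separable with margin gamma
    changed = False
    for xi, yi in zip(X, y):
        if yi * (theta @ xi) <= 0:
            theta = theta + yi * xi
            mistakes += 1
            changed = True

assert 0 < mistakes <= (R / gamma) ** 2                 # the bound proved above
```

<p>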
<br/></p><p>
<br/></p><p><a href="/assets/courseware/v1/8f4f9aca5581dde50291b0d0e29d0148/asset-v1:MITx+6.036+1T2019+type@asset+block/notes_chapter_The_Perceptron.pdf" target="_blank">Download this chapter as a PDF file</a></p><script src="/assets/courseware/v1/1ab2c06aefab58693cfc9c10394b7503/asset-v1:MITx+6.036+1T2019+type@asset+block/marginotes.js" type="text/javascript"/><span><br/><span style="color:gray;font-size:10pt"><center>This page was last updated on Friday May 24, 2019; 02:28:15 PM (revision 4f166135)</center></span></span>
</div>
</div>
</div>
</div>
© All Rights Reserved