<div class="xblock xblock-public_view xblock-public_view-vertical" data-runtime-class="LmsRuntime" data-usage-id="block-v1:MITx+6.005.1x+3T2016+type@vertical+block@vertical-ps2-beta" data-init="VerticalStudentView" data-runtime-version="1" data-course-id="course-v1:MITx+6.005.1x+3T2016" data-block-type="vertical" data-has-score="False" data-graded="True" data-request-token="94c6aca0feea11ee9bc416fff75c5923">
<h2 class="hd hd-2 unit-title">Problem Set 2</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+6.005.1x+3T2016+type@html+block@html_3c46015bc990">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-runtime-class="LmsRuntime" data-usage-id="block-v1:MITx+6.005.1x+3T2016+type@html+block@html_3c46015bc990" data-init="XBlockToXModuleShim" data-runtime-version="1" data-course-id="course-v1:MITx+6.005.1x+3T2016" data-block-type="html" data-has-score="False" data-graded="True" data-request-token="94c6aca0feea11ee9bc416fff75c5923">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<link href="/assets/courseware/v1/9fab367942e94f98d32835b3ef4df11d/asset-v1:MITx+6.005.1x+3T2016+type@asset+block/prism-edx-v1.css" rel="stylesheet" type="text/css" />
<h1 class="handout-title col-sm-8 col-sm-offset-2" id="problem_set_1_tweet_tweet">Problem Set 2 Starting Code</h1>
<div>
<p>Download the starting code for Problem Set 2 here:</p>
<blockquote>
<a href="/assets/courseware/v1/f63bdcd25593cff1bffc89acfb62d108/asset-v1:MITx+6.005.1x+3T2016+type@asset+block/ps2.zip">ps2.zip</a>
</blockquote>
<p>The process for doing this problem set is the same <a href="/courses/course-v1:MITx+6.005.1x+3T2016/jump_to_id/vertical-ps1-beta#import">as in Problem Set 1</a>, but here are quick reminders:</p>
<ul>
<li>
<p>To import the starting code into Eclipse, use File → Import... → General → Existing Projects Into Workspace → Select Archive File, then Browse to find where you downloaded ps2.zip. Make sure <code>ps2-tweets</code> is checked and click Finish.</p>
</li>
<li>
<p>To run JUnit tests, right-click on the <code>test</code> folder in Eclipse and choose Run As → JUnit Test.</p>
</li>
<li>
<p>This problem set has no <code>main()</code> method to run, just tests.</p>
</li>
<li>
<p>To run the autograder, right-click on <code>grader.xml</code> in Eclipse and choose Run As → Ant Build.</p>
</li>
<li>
<p>To view the autograder results, make sure your project is Refreshed, then double-click on <code>my-grader-report.xml</code>.</p>
</li>
<li>
<p>To submit your problem set, upload <code>my-submission.zip</code> to the submission page, which is the last section of this handout, at the end of the section bar.</p>
</li>
</ul>
</div>
</div>
</div>
<div class="vert vert-1" data-id="block-v1:MITx+6.005.1x+3T2016+type@html+block@html_89e4067421ce">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-runtime-class="LmsRuntime" data-usage-id="block-v1:MITx+6.005.1x+3T2016+type@html+block@html_89e4067421ce" data-init="XBlockToXModuleShim" data-runtime-version="1" data-course-id="course-v1:MITx+6.005.1x+3T2016" data-block-type="html" data-has-score="False" data-graded="True" data-request-token="94c6aca0feea11ee9bc416fff75c5923">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<link href="/assets/courseware/v1/9fab367942e94f98d32835b3ef4df11d/asset-v1:MITx+6.005.1x+3T2016+type@asset+block/prism-edx-v1.css" rel="stylesheet" type="text/css" />
<h1 class="handout-title col-sm-8 col-sm-offset-2" id="problem_set_1_tweet_tweet">Problem Set 2: Tweet Tweet</h1>
<h2 id="overview">Overview</h2>
<div data-outline="overview"><p>The theme of this problem set is to build a toolbox of methods that can extract information from a set of tweets downloaded from Twitter.</p><p>Since we are doing test-first programming, your workflow for each method should be, <em>in this order</em>:</p><ol>
<li>Study the specification of the method carefully.</li>
<li>Write JUnit tests for the method according to the spec.</li>
<li>Implement the method according to the spec. </li>
<li>Revise your implementation and improve your test cases until your implementation passes all your tests.</li>
</ol><p>Part of the point of this problem set is to learn how to write good tests. In particular:</p><ul>
<li><strong>Your test cases should be chosen using the input/output-space partitioning approach</strong>. This approach is explained in the <a href="/courses/course-v1:MITx+6.005.1x+3T2016/jump_to_id/vertical-test-first-programming-partitioning#choosing_test_cases_by_partitioning">reading about testing</a>.</li>
<li><strong>Include a comment at the top of each test suite class describing your <em>testing strategy</em></strong> — how you partitioned the input/output space of each method, and then how you decided which test cases to choose for each partition. The testing reading has an <a href="/courses/course-v1:MITx+6.005.1x+3T2016/jump_to_id/vertical-documenting-your-tsting-strategy#documenting_your_testing_strategy">example of documenting the testing strategy for a method.</a></li>
<li><strong>Your test cases should be small and well-chosen.</strong> Don't use a large set of tweets from Twitter for each test. Instead, create your own artificial tweets, carefully chosen to test the partition you're trying to test. </li>
<li><strong>Your tests should find bugs.</strong> Your test cases will be run against buggy implementations to see whether they catch the bugs. So consider ways an implementation might inadvertently fail to meet the spec, and choose tests that will expose those bugs.</li>
<li><strong>Your tests must be legal clients of the spec.</strong> Your test cases will also be run against legal, variant implementations that still strictly satisfy the specs, and your test cases should not complain for these good implementations. That means that your test cases can't make extra assumptions that are only true for your own implementation.</li>
<li><strong>Put each test case in its own JUnit method.</strong> This will be far more useful than a single large test method, since it pinpoints where the problem areas lie in the implementation.</li>
<li>Again, keep your tests small. Don't use unreasonable amounts of resources (such as lists of size <code>Integer.MAX_VALUE</code>). We won't expect your test suite to catch bugs related to running out of resources; <em>every</em> program fails when it runs out of resources.</li>
</ul><p>You should also keep in mind these facts from the readings about <a href="/courses/course-v1:MITx+6.005.1x+3T2016/jump_to_id/vertical-specifications-objectives">specifications</a> and <a href="/courses/course-v1:MITx+6.005.1x+3T2016/jump_to_id/vertical-designing-specifications-objectives">designing specifications</a>:</p><ul>
<li><strong>Preconditions.</strong> Some of the specs have preconditions, e.g. "this value must be positive" or "this list must be nonempty". When preconditions are violated, the behavior of the method is <em>completely unspecified</em>. It may return a reasonable value, return an unreasonable value, throw an unchecked exception, display a picture of a cat, crash your computer, etc., etc., etc. In the tests you write, do not use inputs that don't meet the method's preconditions. In the implementations you write, you may do whatever you like if a precondition is violated. Note that if the specification indicates a particular exception should be thrown for some class of invalid inputs, that is a <em>postcondition</em>, not a precondition, and you <em>do</em> need to implement and test that behavior.</li>
<li><strong>Underdetermined postconditions.</strong> Some of the specs have underdetermined postconditions, allowing a range of behavior. When you're implementing such a method, the exact behavior of your method within that range is up to you to decide. When you're writing a test case for the method, you must allow the implementation you're testing to have the full range of variation, because otherwise your test case is not a legal client of the spec as required above.</li>
</ul><p>Finally, in order for your overall program to meet the specification of this problem set, you are required to keep some things unchanged:</p><ul>
<li><strong>Don't change these classes at all:</strong> the classes <code>Tweet</code> and <code>Timespan</code> should not be modified <em>at all</em>.</li>
<li><strong>Don't change these class names:</strong> the classes <code>Extract</code>, <code>Filter</code>, <code>SocialNetwork</code>, <code>ExtractTest</code>, <code>FilterTest</code>, and <code>SocialNetworkTest</code> must use those names and remain in the <code>twitter</code> package.</li>
<li><strong>Don't change the method signatures and specifications:</strong> The public methods provided for you to implement in <code>Extract</code>, <code>Filter</code>, and <code>SocialNetwork</code> must use the method signatures and the specifications that we provided.</li>
<li><strong>Don't include illegal test cases:</strong> The tests you implement in <code>ExtractTest</code>, <code>FilterTest</code>, and <code>SocialNetworkTest</code> must respect the specifications that we provided for the methods you are testing.</li>
</ul><p>Aside from these requirements, however, you are free to add new public and private methods and new public or private classes if you wish. In particular, if you wish to write test cases that test a stronger spec than we provide, you should put those tests in a separate JUnit test class, so that we don't try to run them on staff implementations that only satisfy the weaker spec. We suggest naming those test classes <code>MyExtractTest</code>, <code>MyFilterTest</code>, <code>MySocialNetworkTest</code>, and we suggest putting them in the <code>twitter</code> package in the <code>test</code> folder alongside the other JUnit test classes. </p><hr></div>
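<p>To illustrate a documented testing strategy and a test that stays a legal client of an underdetermined spec, here is a minimal sketch in plain Java. The method <code>find</code>, its spec, and every name in the block are hypothetical and are not part of the starting code. Real tests for this problem set would be JUnit methods rather than the boolean helpers used here.</p>
<pre>
```java
/*
 * Testing strategy for find(arr, val), a hypothetical method with this spec:
 *   requires: val occurs at least once in arr
 *   effects:  returns an index i such that arr[i] == val
 *
 * Partitions:
 *   arr.length:          1, more than 1
 *   occurrences of val:  exactly 1, more than 1
 *   position of val:     first element, middle, last element
 */
final class FindStrategySketch {

    // One legal implementation: returns the first matching index.
    static int findFirst(int[] arr, int val) {
        for (int i = 0; i < arr.length; i++) {
            if (arr[i] == val) return i;
        }
        throw new IllegalArgumentException("precondition violated");
    }

    // Another legal implementation: returns the last matching index.
    static int findLast(int[] arr, int val) {
        for (int i = arr.length - 1; i >= 0; i--) {
            if (arr[i] == val) return i;
        }
        throw new IllegalArgumentException("precondition violated");
    }

    // Legal check: relies only on what the spec promises.
    static boolean satisfiesSpec(int index, int[] arr, int val) {
        return arr[index] == val;
    }

    // Illegal check: assumes the first occurrence, which the spec never promises.
    static boolean assumesFirstIndex(int index) {
        return index == 0;
    }

    public static void main(String[] args) {
        // Covers: arr.length more than 1, val occurs more than once.
        int[] arr = {7, 3, 7};
        System.out.println(satisfiesSpec(findFirst(arr, 7), arr, 7)); // true
        System.out.println(satisfiesSpec(findLast(arr, 7), arr, 7));  // true
        System.out.println(assumesFirstIndex(findLast(arr, 7)));      // false
    }
}
```
</pre>
<p>The last line shows why an over-specific assertion is a problem: it rejects <code>findLast</code>, even though <code>findLast</code> satisfies the spec.</p>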
<h2 id="problem_1_extracting_data_from_tweets">Problem 1: Extracting data from tweets</h2>
<div data-outline="problem_1_extracting_data_from_tweets"><p>In this problem, you will test and implement the methods in <code>Extract.java</code>.</p><p>You'll find <code>Extract.java</code> in the <code>src</code> folder, and a JUnit test class <code>ExtractTest.java</code> in the <code>test</code> folder. Separating implementation code from test code is a common practice in development projects. It makes the implementation code easier to understand, uncluttered by tests, and easier to package up for release.</p><div class="list-style-lower-alpha"><ol>
<li><p>Devise, document, and implement test cases for <code>getTimespan()</code> and <code>getMentionedUsers()</code>, and put them in <code>ExtractTest.java</code>.</p></li>
<li><p>Implement <code>getTimespan()</code> and <code>getMentionedUsers()</code>, and make sure your tests pass.</p></li>
</ol></div><p>Hints:</p><ul>
<li><p>Note that we use the class <a href="http://docs.oracle.com/javase/8/docs/api/?java/time/Instant.html"><code>Instant</code></a> to represent the date and time of tweets. You can check <a href="http://java.dzone.com/articles/deeper-look-java-8-date-and">this article on Java 8 dates and times</a> to learn how to use <code>Instant</code>.</p></li>
<li><p>You may wonder what to do about lowercase and uppercase in the return value of <code>getMentionedUsers()</code>. This spec has an underdetermined postcondition, so read the spec carefully and think about what that means for your implementation and your test cases.</p></li>
<li><p><code>getTimespan()</code> <em>also</em> has an underdetermined postcondition in some circumstances, which gives the implementor (you) more freedom and the client (also you, when you're writing tests) less certainty about what it will return.</p></li>
<li><p>Read the spec for the <code>Timespan</code> class carefully, because it may answer many of the questions you have about <code>getTimespan()</code>.</p></li>
</ul>
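<p>For readers new to <code>Instant</code>, the following small sketch shows the operations that tend to matter here: parsing, comparing, and measuring the gap between two instants. The class name, the helper name, and the timestamps are made up for illustration, and this is not an implementation of <code>getTimespan()</code>.</p>
<pre>
```java
import java.time.Duration;
import java.time.Instant;

final class InstantSketch {

    // Returns whichever of the two instants comes first on the timeline.
    static Instant earlier(Instant a, Instant b) {
        return a.isBefore(b) ? a : b;
    }

    public static void main(String[] args) {
        // Instant.parse expects ISO-8601 text in UTC, such as the strings below.
        Instant d1 = Instant.parse("2016-02-17T10:00:00Z");
        Instant d2 = Instant.parse("2016-02-17T11:00:00Z");

        System.out.println(earlier(d2, d1));                       // 2016-02-17T10:00:00Z
        System.out.println(d2.isAfter(d1));                        // true
        System.out.println(Duration.between(d1, d2).toMinutes());  // 60

        // Instants are value objects: compare them with equals(), not ==.
        System.out.println(d1.equals(Instant.parse("2016-02-17T10:00:00Z"))); // true
    }
}
```
</pre>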
<hr></div>
<h2 id="problem_2_filtering_lists_of_tweets">Problem 2: Filtering lists of tweets</h2>
<div data-outline="problem_2_filtering_lists_of_tweets"><p>In this problem, you will test and implement the methods in <code>Filter.java</code>.</p><div class="list-style-lower-alpha"><ol>
<li><p>Devise, document, and implement test cases for <code>writtenBy()</code>, <code>inTimespan()</code>, and <code>containing()</code>, and put them in <code>FilterTest.java</code>.</p></li>
<li><p>Implement <code>writtenBy()</code>, <code>inTimespan()</code>, and <code>containing()</code>, and make sure your tests pass.</p></li>
</ol></div><p>Hints:</p><ul>
<li><p>For questions about lowercase/uppercase and how to interpret timespans, reread the hints in the previous question.</p></li>
<li><p>For all problems on this problem set, you are free to rewrite or replace the provided example tests and their assertions.</p></li>
</ul>
<hr></div>
<h2 id="problem_3_inferring_a_social_network">Problem 3: Inferring a social network</h2>
<div data-outline="problem_3_inferring_a_social_network"><p>In this problem, you will test and implement the methods in <code>SocialNetwork.java</code>. The <code>guessFollowsGraph()</code> method creates a social network over the people who are mentioned in a list of tweets. The social network is an approximation to who is following whom on Twitter, based only on the evidence found in the tweets. The <code>influencers()</code> method returns a list of people sorted by their influence (total number of followers).</p><div class="list-style-lower-alpha"><ol>
<li><p>Devise, document, and implement test cases for <code>guessFollowsGraph()</code> and <code>influencers()</code>, and put them in <code>SocialNetworkTest.java</code>. Be careful that your test cases for <code>guessFollowsGraph()</code> respect its underdetermined postcondition.</p></li>
<li><p>Implement <code>guessFollowsGraph()</code> and <code>influencers()</code>, and make sure your tests pass. For now, implement only the minimum required behavior for <code>guessFollowsGraph()</code>, which infers that Ernie follows Bert if Ernie @-mentions Bert.</p></li>
</ol></div><hr></div>
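<p>The idea of ranking people by follower count can be sketched as below. This is only an illustration under simplifying assumptions: people are identified by array index and the follows relation is a square boolean matrix, which is not the representation used by the provided <code>SocialNetwork</code> spec. The class and method names are hypothetical.</p>
<pre>
```java
import java.util.Arrays;

final class InfluenceSketch {

    // follows[i][j] is true when person i follows person j (square matrix assumed).
    // Returns, for each person j, the number of people who follow j.
    static int[] followerCounts(boolean[][] follows) {
        int[] counts = new int[follows.length];
        for (boolean[] row : follows) {
            for (int j = 0; j < row.length; j++) {
                if (row[j]) counts[j]++;
            }
        }
        return counts;
    }

    // Returns person indices ordered by descending follower count.
    static Integer[] byInfluence(boolean[][] follows) {
        int[] counts = followerCounts(follows);
        Integer[] order = new Integer[counts.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> Integer.compare(counts[b], counts[a]));
        return order;
    }

    public static void main(String[] args) {
        String[] names = {"ernie", "bert", "elmo"};
        boolean[][] follows = new boolean[3][3];
        follows[0][1] = true; // ernie @-mentions bert, so we guess ernie follows bert
        follows[2][1] = true; // elmo follows bert
        follows[1][0] = true; // bert follows ernie
        for (int i : byInfluence(follows)) {
            System.out.println(names[i]); // bert, then ernie, then elmo
        }
    }
}
```
</pre>
<p>This example has no ties. A test that involves people with equal follower counts should check what the spec actually promises about their relative order before asserting one.</p>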
<h2 id="problem_4_get_smarter">Problem 4: Get smarter</h2>
<div data-outline="problem_4_get_smarter"><p>In this problem, you will implement one additional kind of evidence in <code>guessFollowsGraph()</code>. Note that we are taking a broad view of "influence" here, and even Twitter-following is not a ground truth for influence, only an approximation. It's possible to read Twitter without explicitly following anybody. It's also possible to be influenced by somebody through other media (email, chat, real life) while producing evidence of the influence on Twitter.</p><p>Here are some ideas for evidence of following. Feel free to experiment with your own.</p><ul>
<li><p><strong>Common hashtags.</strong> People who use the same hashtags in their tweets (e.g. <code>#mit</code>) may mutually influence each other. People who share a hashtag that isn't otherwise popular in the dataset, or people who share multiple hashtags, may be even stronger evidence.</p></li>
<li><p><strong><a href="http://en.wikipedia.org/wiki/Triadic_closure">Triadic closure</a>.</strong> In this context, triadic closure means that if a strong tie (mutual following relationship) exists between a pair A,B and a pair B,C, then some kind of tie probably exists between A and C – either A follows C, or C follows A, or both.</p></li>
<li><p><strong>Awareness</strong>. If A follows B and B follows C, and B retweets a tweet made by C, then A sees the retweet and is influenced by C. </p></li>
</ul><p>Keep in mind that whatever additional evidence you implement, your <code>guessFollowsGraph()</code> must still obey the spec. To test your specific implementation, make sure you put test cases in your own <code>MySocialNetworkTest</code> class rather than the <code>SocialNetworkTest</code> class that the grader will run against staff implementations. </p><hr></div>
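<p>As one possible reading of the triadic closure idea, the sketch below adds a tie between A and C whenever A and B follow each other and B and C follow each other. It rests on assumptions that are not in the handout: people are array indices, the graph is a square boolean matrix rather than the structure required by the <code>guessFollowsGraph()</code> spec, and the inferred tie is recorded in both directions. All names are hypothetical.</p>
<pre>
```java
final class TriadicClosureSketch {

    // A strong tie here means a mutual following relationship.
    static boolean mutual(boolean[][] follows, int a, int b) {
        return follows[a][b] && follows[b][a];
    }

    // Returns a copy of follows with edges added by one round of triadic closure.
    // The input matrix is not modified, and inferred edges do not trigger further inferences.
    static boolean[][] withTriadicClosure(boolean[][] follows) {
        int n = follows.length;
        boolean[][] result = new boolean[n][];
        for (int i = 0; i < n; i++) {
            result[i] = follows[i].clone();
        }
        for (int a = 0; a < n; a++) {
            for (int b = 0; b < n; b++) {
                for (int c = 0; c < n; c++) {
                    boolean distinct = a != b && b != c && a != c;
                    if (distinct && mutual(follows, a, b) && mutual(follows, b, c)) {
                        result[a][c] = true; // assumption: record the tie both ways
                        result[c][a] = true;
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        boolean[][] follows = new boolean[3][3];
        follows[0][1] = true; follows[1][0] = true; // persons 0 and 1 follow each other
        follows[1][2] = true; follows[2][1] = true; // persons 1 and 2 follow each other
        boolean[][] closed = withTriadicClosure(follows);
        System.out.println(closed[0][2]);  // true
        System.out.println(closed[2][0]);  // true
        System.out.println(follows[0][2]); // false: the input is unchanged
    }
}
```
</pre>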
</div>
</div>
</div>
</div>
<div class="xblock xblock-public_view xblock-public_view-vertical" data-runtime-class="LmsRuntime" data-usage-id="block-v1:MITx+6.005.1x+3T2016+type@vertical+block@vertical-ps2-grader-output" data-init="VerticalStudentView" data-runtime-version="1" data-course-id="course-v1:MITx+6.005.1x+3T2016" data-block-type="vertical" data-has-score="False" data-graded="True" data-request-token="94c6aca0feea11ee9bc416fff75c5923">
<h2 class="hd hd-2 unit-title">Understanding the Grader Output</h2>
<div class="vert-mod">
<div class="vert vert-0" data-id="block-v1:MITx+6.005.1x+3T2016+type@html+block@html_224f00741807">
<div class="xblock xblock-public_view xblock-public_view-html xmodule_display xmodule_HtmlBlock" data-runtime-class="LmsRuntime" data-usage-id="block-v1:MITx+6.005.1x+3T2016+type@html+block@html_224f00741807" data-init="XBlockToXModuleShim" data-runtime-version="1" data-course-id="course-v1:MITx+6.005.1x+3T2016" data-block-type="html" data-has-score="False" data-graded="True" data-request-token="94c6aca0feea11ee9bc416fff75c5923">
<script type="json/xblock-args" class="xblock-json-init-args">
{"xmodule-type": "HTMLModule"}
</script>
<link href="/assets/courseware/v1/9fab367942e94f98d32835b3ef4df11d/asset-v1:MITx+6.005.1x+3T2016+type@asset+block/prism-edx-v1.css" rel="stylesheet" type="text/css" />
<h1>Understanding the Problem Set 2 Grader Output</h1>
<div>
<p>When you run the grader for Problem Set 2 and view the resulting <code>my-grader-report.xml</code>, you will see several categories of tests.</p>
<p><code>twitter.staff.Original{Extract,Filter,SocialNetwork}Test</code> are tests we originally gave you with the problem set. These tests were in <code>test/twitter/{Extract,Filter,SocialNetwork}Test.java</code> before you started adding your own tests to those files. The tests are run against your own implementations of the methods.</p>
<p><code>twitter.{Extract,Filter,SocialNetwork}Test</code> are your own tests run against your own implementations of the methods. The grader is running your own test code here, so it should produce the same results as you would get from right-clicking on the <code>test</code> folder and running JUnit.</p>
<p><code>twitter.staff.{Extract,Filter,SocialNetwork}TestRunner</code> are running your own tests (from your <code>test</code> folder) against staff implementations of the methods. The staff implementations can be either <em>bad</em> (not following the spec) or <em>good</em> (correctly following the spec). For a bad implementation, the grader test passes if at least one of your tests rejects the bad implementation. For a good implementation, the grader test passes if all of your tests accept the good implementation. Test names in this category have the form <code>yourTests_staff<em>[Bad|Good]</em>Impl_<em>[hint about how the implementation behaves]</em></code>. For example, <code>yourTests_staffBadImpl_Filter_alwaysReturnsEmpty</code> means that your tests were run against a bad implementation in which the <code>Filter</code> methods always return empty lists.</p>
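<p>The pass rule for the <code>TestRunner</code> category can be summarized in a few lines of Java. This is only a sketch of the rule as described above, not the grader's actual code, and the class and method names are hypothetical.</p>
<pre>
```java
final class GraderRuleSketch {

    // testPassed[i] is true when your i-th test passed against the staff implementation.
    static boolean allPassed(boolean[] testPassed) {
        for (boolean passed : testPassed) {
            if (!passed) return false;
        }
        return true;
    }

    // Good implementation: every one of your tests must accept it.
    // Bad implementation: at least one of your tests must reject it.
    static boolean graderTestPasses(boolean staffImplIsGood, boolean[] testPassed) {
        return staffImplIsGood ? allPassed(testPassed) : !allPassed(testPassed);
    }

    public static void main(String[] args) {
        boolean[] oneFailure = {true, false, true};
        boolean[] noFailures = {true, true, true};
        System.out.println(graderTestPasses(false, oneFailure)); // true: the bug was caught
        System.out.println(graderTestPasses(false, noFailures)); // false: the bug slipped through
        System.out.println(graderTestPasses(true, noFailures));  // true
        System.out.println(graderTestPasses(true, oneFailure));  // false: a test is not a legal client
    }
}
```
</pre>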
</div>
</div>
</div>
</div>
</div>