Course Description

This course is structured to teach a specific data science skill set, provide ability to perform hands-on research employing a particular method, and ultimately produce an independent research project. Students shall learn about and discuss relevant topics and research associated with time series analysis. Some of these sessions will be designated as hands-on laboratory sessions in which students write code to replicate and perform analyses using R.

The course will place a focus on the code and implementation choices necessary to perform applied time series analysis. Throughout the semester students shall replicate several time-series studies and provide replication code and analyses as part of their lab assignments.

Students must apply time series skills learned throughout the course to answer their own research questions. Students shall brief progress on their projects throughout the semester as well as the final project and results during the last couple weeks of the semester.

Catalog Number
  • DATA 330
  • Section 1
  • CRN 24696
Prerequisites
  • “Programming for Data Science” (DATA 141 or CSCI 140) or “Computational Problem Solving” (CSCI 141)
Co-requisites
  • You are recommended (but not obliged) to take this course alongside “Probability & Statistics for Scientists” (MATH 351), “Statistical Data Analysis” (MATH 352) or equivalent
Semester
  • Spring 2021 (2021-Jan-27 to 2021-May-18)
Location
  • This course is taught online (remotely); therefore, it can be completed from anywhere.
Class Times
  • Tu/Th 1530–1650
Instructor
Office Hours
  • Schedule is posted on virtual office website
  • Appointments are welcome outside normally scheduled office hours; please message or email to set up a time.
  • There may be certain days when office hours will be either canceled or rescheduled; notifications will be sent ahead of time.
Delivery
  • Course will delivered RSOF, which is fully remote and predominantly synchronous, off campus.
  • Some aspects of the class (e.g., lectures, lessons, or demonstrations) will be made available as a recording for to watch either before or after class
  • Synchronous class sessions will not necessarily be recorded; you are responsible for missed content
Final Exam Period
  • Monday, 10 May 2021 @ 14:00
Minimum Passing Grade
  • D
Communication
  • Instant messaging (e.g., GroupMe or Slack)
    • For instant delivery of content or for questions that need quick responses.
    • Please email me your preferred email address.
  • Discussion posts, Q&A, and schedule management using Blackboard
    • For weekly engagement including sharing ideas, methods and content and asking/responding to questions, comments or concerns.
    • This will serve as a primary content hub; all other content will be linked from here.
    • You need access to Blackboard for this class.
  • Code sharing GitHub
    • For posting coded assignments and exercises.
    • You must create a GitHub account, if you do not already have one.
  • Video conferencing (e.g, Zoom or Nooks)
    • Video conferencing is for office hours, video chats, and synchronous class meetings where “face-to-face” communication or screen sharing is required.
    • Please note that Nooks presently does not support mobile devices or Safari web browser :(
  • Email (twdavis-at-wm-dot-edu)
    • The new snail mail; use this for personal communication or whenever sharing is inappropriate.

See Code of Conduct regarding how these communication platforms are (and should be) used.

Textbook
Course Materials
  • Laptop or desktop computer (required)

    * Your computer should have at least 500 MB of free disk space, have at least 8 GB of memory, and run a modern desktop OS (e.g,. PC, Mac, or Linux). You should have access to headphones and a microphone for virtual class meetings; these will minimize the feedback and allow for an easier time with class discussions.

  • Dedicated USB 3.0 thumb drive 64 GB or larger (recommended)

    * Organizing your course work and maintaining a backup can save you from serious misfortune; I know from experience.

Technology
  • R version 3.6.3 “Holding the Windsock” (required)

    * Note that R version 4.0.3 “Bunny-Wunnies Freak Out” was released in October 2020 and is the default version in Google Colab; however, if/when possible, please use version 3.6.3.

  • tswge version 1.0.0 R package (required)

    • see here for a PDF of the reference manual
  • RStudio Desktop open-source edition (recommended)

  • Git (recommended)


Course Objectives

  1. Learn a specific data science skill
  2. Perform research employing a particular method
  3. Produce an independent research project

How these are interpreted

These course objectives were created by another professor and I will try my best to adapt to them.

In my opinion, LO#1 is the understanding and application of analytic methods associated with time series data. Data science is all about understanding patterns in data and time series is a special class of data. To accomplish this learning object, we will dive right into what makes a time series and what are the methods for analyzing it and applying it. There is basic lexicon associated with this topic, which I am compiling here.

Learning any coding language is invaluable; however, it is my belief that there is value in learning the free/open-source varieties as they are easily ported between operating systems and can follow you after college without the concern of expensive or proprietary licensing. Therefore, I have added the caveat to the course description and chosen to use R rather than STATA. You may disagree with my decision, but trust me, if you can figure it out in the open source community, you should have no trouble picking it up in the proprietary flavors on your own.

LO#2 emphasizes our ability to recognize time series data and the classical methods for analyzing them. To accomplish this objective, we will investigate published methods and (attempt) to reproduce them.

LO#3 was a curve ball when I read it. My original intention was that this would be more of a classical lecture style class, but there you go. Now we have an independent research task tacked on to this course. My goal is to narrow the scope and limit the data to give this more structure all while allowing you to make independent choices on your data and methods. We’ll see how this goes.

What you should already know

  • Basic computer skills (text and document editing; websites and web forms; reading, writing and manipulating files and folders)
  • Basic computer programming principles (e.g., functions, loops, if/then cases, file IO)
  • Basic probability and statistics (e.g., expected value, variance, standard deviation, correlation, magnitude, real vs. imaginary numbers)

What is expected of you

  • Meet twice a week virtually;
    • be on time and courteous to your classmates
    • join the class with your audio and video turned on (i.e., get yourself a pair of headphones with a mic)
  • Provide a course evaluation at the end of the term with a summary of things you liked and disliked; methods that worked for you and those that fell short; topics that interested you and those that were uninteresting.

What to expect from your instructor

  • Meet with you twice a week
  • Answer questions and provide feedback as requested through email, chat, discussion board forums or other electronic means in a timely manner
  • Manage all web resources
  • Post videos to introduce/explain concepts and guide you through demos
  • Make accommodations for video chat requests during and/or outside scheduled office hours

Course Structure

How this class is stylized

This class meets approximately twice a week for one hour and twenty minutes. The course is divided into three topical blocks:

  1. Intro to Time Series and R
  2. Data Analysis for Applied Time Series
  3. Intro to Modeling and Prediction

Each block is between 8 and 10 class periods and consists of a series of lessons, in-class exercises, a discussion paper, one lab, and one exam.

How class time is used

In general, I will follow this path through each of the three blocks:

  1. Lesson (2–3 classes)
  2. Exercises (1–2 classes)
  3. Discussion (2–3 classes)
  4. Lab + Exam (2–3 classes)

How this class is assessed

The grading for this semester uses a simplified check system. Each check you get is worth a certain number of points, based on the following table.

Table 1. Check categories and their associated point values.
Category Number of Checks Points
Engagement 0 0 pt
Exercises 2 2 pt
Labs 2 2 pt
Exams 2 2 pt
Discussions 1 2 pt
Notes 1 1 pt
Research Project 1 1 pt
Engagement
While engagement does not count toward your final score, it may count against your maximum possible grade (see Engagement Policy for more details).
Exercises

Exercises should be completed using either the R command line or RStudio. For each block, submit your in-class exercises to the GitHub repository as two plain-text files:

  1. Rscript: using either the command line or RStudio, produce the lines of code to complete the exercises, including your name, class, date, and text (as a comment) for each question.
  2. Rhistory: save all your commands from your session(s) and upload your history file to accompany

All exercises are due on or before the exam due date for a given block. Use your GitHub username and block number to name your files, for example: block 1 exercises should be titled: dt-woods_b1.R and dt-woods_b1.Rhistory. Please submit only your own work.

Your instructor may test your scripts and provide feedback to you. If you receive feedback that requests you to make changes to your script, please do so within one week of feedback.

Two of the three exercise scripts will be assessed (chosen randomly; you do not get to choose which ones). Exercises that are submitted on time and that you fixed any problems with (based on instructor feedback) such that they produce the expected outputs and run without errors, receive a check. Exercises that are late, incomplete, or throw unexpected errors will not receive a check.

Be sure to test your R scripts!

You are free to collaborate on class exercises, but you are expected to submit your own work (see Coursework Policy).

Labs

Labs should be completed using the R kernel in Google Colab; it is free to use with a Google account, such as the account associated with your W&M email. Labs should include relevant text blocks to introduce topics, explain methods, display images/graphics, and link to resources. Labs should include relevant code blocks to read, process, analyze and visualize data. Labs should be formatted like a report with the following sections:

  1. Introduction - a 150-300 word introduction to the purpose and scope of the lab
  2. Methods - a reproducible account of your work made up of code and text blocks of all steps leading up to, but not including the final graphics, tables or figures
  3. Results - final figures, tables and/or other graphics with explanations of what they represent
  4. Discussion and Conclusions - a 250 to 500 word reflection on your methods and results, what worked and what didn’t work, what were the challenges you faced, were the results as you expected, what inferences (if any) can you make
  5. References - a list of all citations used throughout the report; every reference must be cited and all citations must show up in your references. Please use the latest APA guidelines for your citation style.

To receive a check for a lab, submit your .ipynb or a link to your shared Jupyter Notebook to GitHub; if sharing a link, please make certain share permissions allow for W&M user access. Notebooks are expected at minimum to meet the following:

  1. be formatted correctly,
  2. be written in English with minimal typos (see Standards for Submitted Work),
  3. include working links to datasets,
  4. include code blocks that execute without errors (please note at the top of your notebook if any packages need to be installed),
  5. include figures that adhere to the guidelines given in the instructions,
  6. include references, each with appropriate in-text citations,
  7. include captions for all tables and figures, and
  8. include responses that answer the question(s) asked.

All labs are due on or before the exam due date for a given block. Two of the three labs will be assessed (chosen randomly; you do not get to choose which ones). If they meet the criteria, each will receive a check. Late labs, incomplete labs, or labs that fall short of the expectations outlined above will not receive a check.

You are free to collaborate on lab assignments, but ultimately it should reflect your thoughts, methods, and conclusions (see Coursework Policy).

Exams

Exams will be given near the end of each of the three main topic blocks and consist of one or more questions. Answer the question(s) to the best of your ability and submit your response(s) to Blackboard. Two of the three exams will be assessed (chosen randomly; you do not get to choose which ones). An exam submission that satisfies at least 75% of the expectations will receive a check; an exam that scores less than 75% of the expectations will not receive a check.

You should work independently on the exams.

Discussions

We will examine and review at least one published article for each of the three blocks. Discussions are assessed based on your participation, which may consist of any of the following:

  1. Actually talking and sharing your ideas and opinions during one or more live class sessions
  2. Typing your thoughts and opinions in the discussion board on Blackboard (limit 500 words)
  3. Recording your thoughts and opinions and posting the audio file in the discussion board on Blackboard (limit 5 minutes)

Participation in all three discussions receives a check. Failure to participate in one or more discussions does not receive a check.

Notes

You are encouraged to take notes during class. This practice is an invaluable learning aid. The class will be broken up such that each of you will be assigned to submit notes for one of the three blocks.

These notes should be neat, organized, formatted in markdown and titled using your GitHub name and block number (e.g., dt-woods_notes-b1.md) and uploaded to GitHub at least 48 hours before the exam due date. Notes serve as a study guide for all students and help by providing additional perspectives, methods, and note-taking styles.

An on-time submission of a complete set of notes in a single markdown file for your assigned block results in a check.

Independent Project

This semester, we will examine the application of time series analysis to audio files. You may choose the scope of your analysis and the song. The structure of the independent project follows the same structure and guidelines as a lab; therefore, it should be written up in a Jupyter Notebook, consist of specific sections, include references, and adhere to the Standards for Submitted Work.

Additionally, you will create either a podcast (audio) or vlog (video) not to exceed 10 minutes in length that summarizes your project.

A successful independent research project is measured by:

  1. An informative and well-documented introduction to your study.
  2. A logical and reproducible methodology (however complete).
  3. An insightful reflection on your methods, challenges, and what you were able to accomplish.

Grading Scale

The final letter grade is based on the point total (see check system above) as shown in the following table.

The instructor reserves the right to adjust a student’s final grade by one-half step (e.g., a student that received a score that would give them a C, it could be raised to a C+ or lowered to a C–), so long as the final grade is not lowered below a D or raised higher than an F.

Table 2. Letter grade allocations.
Points Letter Grade
10 A
9 A-
8 B+
7 B
6 C+
5 C
4 C-
3 D+
2 D
<2 F

Course Calendar

The tentative schedule for the main three course topics are as follows.

Table 3. Course schedule and due dates.
Topic Dates Exam Due Date Notes Due Date
Intro to TS & R 1/28 to 3/02 (10 classes) Wednesday, 3 March at 9 AM Monday, 1 March at 9 AM
Data Analysis 3/09 to 4/01 (8 classes) Friday, 2 April at 9 AM Wednesday, 31 March at 9 AM
Modeling & Prediction 4/08 to 5/06 (9 classes) Friday, 7 May at 9 AM Wednesday, 5 May at 9 AM
Research Project (semester long) Monday, 10 May 2–5 PM N/A

Important Dates

First day of Add/Drop
  • 26 January 2021
First day of term
  • 27 January 2021
Add/drop deadline
  • 5 February 2021
Withdraw deadline
  • 29 March 2021
Last day of term
  • 7 May 2021
Last day of exams
  • 18 May 2021

Standards & Policies

The Code of Conduct

By the basis in which this class is designed, these things hold true:

  1. This class utilizes several forms of communication (e.g., instant messaging, discussion boards, emails, audio/video conferencing), of which not all can be monitored by the instructor at all times.
  2. You are responsible for how you interact with each other.

Remember these things as you work together:

  1. “Don’t ascribe maliciousness to that which can be explained by inadvertence.”

    This comes from the fact that it is almost impossible to portray our feelings or intended meaning behind typed text. If something offends you, take a breath, be cordial and ask for clarification before unleashing your wrath (BTW: you shouldn’t unleash your wrath). That being said, also do not be a silent witness. If something offends you, let it be known. We will never learn from our mistakes if our mistakes are never pointed out. If malicious actions continue, I, your instructor, will manage it.

  2. “There is no innovation and creativity without failure. Period.”

    You are a college student registered in a college class. You are not expected to know everything. The entire purpose of this exercise is for you to gain knowledge, so make an effort. If you want to try something, try it and let everyone know what you are up to. Best case scenario, your innovations spark new insight. Worst case scenario, we all learn something from your efforts. Don’t be afraid to make a mistake. It’s better to aim high for something that will make a difference rather than to play it safe with something easy.

  3. Ask lots of questions.

    Questions are cheap, so ask a lot of them. When asking questions, remember to strive for clarity. If you don’t know something or your aren’t sure, just ask. Sometimes, knowing the right question to ask is just as difficult as finding the right answer. When you find yourself here, please send up a flare or simply say “I’m lost.” I will help you get back on track.

  4. Focus on opportunities.

    Remember: this is not a race and you are not a judge, so don’t get caught up with critiquing or competing with each other. Provide your opinions and perspectives and then actually take the time to read the opinions and perspectives of others. Challenge yourself to see things differently and try things differently. Ignore your desire to be correct or to correct someone else and try not to contradict one another; we don’t like it and, biologically, it shuts down our ability to see things logically.

  5. Document and share everything.

    While it may feel natural to keep your work private, projects really thrive when you document your process publicly. By writing things down and sharing them, more people can participate along the way and, occasionally, you yourself might receive help on something you didn’t even know you needed. This leads to more things being documented, which leads to better transparency and feedback, which leads to good decision making and faster/better results.

    We cannot manage what we do not measure: so digitize your process!!!

    This is why I have chosen to use GitHub for exercises and lab assignments. It allows us to share our methods.

  6. Everyone is bound to uphold a policy of respect for their instructor and their peers. Students should be open-minded to new ideas and participate in collegiate debate, the sharing of ideas, and the receiving of feedback without defamatory remarks. Students should help maintain a healthy learning environment by refraining from negative behavior, such as harmful remarks, quibbling over trivial matters, creating unnecessary debates, or bullying.

    There is zero tolerance for negative behavior. Failure to uphold this policy will result in punitive action and/or removing the offending student from access to all or part of the class.

Engagement Policy

  • Weekly virtual engagement is expected.

While it is not required that you attend regular class meetings, if you choose to not attend class (i.e., you have more than two unexcused absences), your final grade cannot be an A.

Excused absences include any and all college related events (e.g., athletic events, conferences, field trips, screening appointments and mandatory quarantine). Please make any and all planned absences of this sort known to your instructor at the earliest possible time.

Additionally, you are excused two personal days without question.

You cannot be counted absent on days when class does not meet or when campus is closed.

If you are unable to make to a regularly scheduled class meeting but you are still able to have positive interaction with the class (e.g., through posted discussions, chats, issues, or other electronic communication), then your absence will not be counted against you. Consideration for this type of engagement needs to be communicated and approved with your instructor.

Coursework Policy

  • You are bound by the honor code.

By accepting admission to the College, you have made a commitment to understand, support, and abide by the Honor Code. Violations, whether attempted or successful, will result in consequences ranging from a verbal reprimand up to a failing grade for the class.

Misconduct may include, but is not limited to, the following:

  • cheating or using unauthorized materials on assignments
  • fabrication, forgery, alteration, or destruction of documents; hacking unauthorized resources; intimidating or bribing peers; improper collaboration or colluding; plagiarizing; or lying in order to obtain academic advantage
  • assisting others in misconduct
  • attempting misconduct

Instant Messaging Policy

  • The instant messaging app should be used for communication that either: (1) needs to reach people quickly or (2) needs a quick response.

You are free (and encouraged) to use the instant messaging platform for better/faster communication. Do not use the instant messenger for spamming, soliciting or otherwise disrupting the peace. Be sure to change your notification settings on your mobile device to provide “Do Not Disturb” periods when sleeping or studying.

Email Policy

  • All personal correspondence should be made to the instructor’s W&M email address (see above) and include the class title in the subject line.

For private inquiries, please email the instructor; the instructor will confirm each email received. If you do not receive a confirmation message from the instructor within 12 hours of sending, feel free to send a follow-up email.

Make-Up Policy

  • There are no make-ups.

Due dates for all assessment materials are posted (see Course Calendar). Unexcused late submissions will be treated as such (see How this class is assessed).

All reasonable requests for extensions on submission deadlines are to be made to your instructor in person (or through virtual conferencing). Appointments must be held before original due dates. The acceptance of late submissions will be evaluated on an individual basis and is at the sole discretion of the instructor.

Digital Recording Policy

  • Sharing photos or videos of this class is not permitted.

Due to privacy laws/concerns, the recording of people and/or their voices during class meetings in any form is prohibited. The sharing of any class recordings is also prohibited.

Exceptions to this policy come only from the Academic Support office that require compliance with the ADA. Please make all requests known to your instructor as soon as possible.

Standards for Submitted Work

  • Written (and/or typed) work should be in English and follow, to the best of your abilities, the rules of the English language (see Strunk & White).
  • Written (and/or typed) work should be neat, thorough, legible, and logically organized.
  • All submitted work should include your name, date, and description of the assignment.
  • Filenames should be in all lowercase and contain only alphanumeric, underscore, hyphen, and/or period characters.
  • Plots and graphs should be done using a computer, where appropriate.
  • Tables, figures and images from online sources should include a citation (including the author/publisher, date created/accessed, and URL).
  • All in-text citations and references should be formatted to APA standards unless otherwise indicated.
  • Plagiarism will be taken seriously; if you write something that is not your own original idea or in your own words, then it must be cited! See here for information on plagiarism and how to avoid it.
  • Unless otherwise stated, electronic submissions may be in one of the following acceptable open file formats:

Statements and Resources

ADA Statement

William & Mary accommodates students with disabilities in accordance with federal laws and university policy. Any student who feels they may need an accommodation based on the impact of a learning, psychiatric, physical, or chronic health diagnosis should contact Student Accessibility Services staff at 757-221-2512 or at to determine if accommodations are warranted and to obtain an official letter of accommodation. For more information, please see https://www.wm.edu/sas.