
Final Article (due December 10, 2012)

Use Microsoft Word or LaTeX to create a concise explanatory paper, 5-10 pages in length excluding the appendixes, containing the following sections:

    • Title and author(s), affiliation
    • Introduction
      This section provides an overview of this entire document. It also states the problem to be solved, in the form of a research question and hypothesis.
    • Background
      This section provides background information on the domain the data comes from (for example, heart disease) and on the main problems that a data analysis project could help address.
    • Project Objectives
      Describe the problem(s) to be solved as a set of questions the data analysis system will provide insight into, and identify the intended users. This list of questions can be numbered by priority. Each objective can be decomposed into sub-goals as necessary. For instance, a goal may be to diagnose patients based on their blood test profiles; you would decompose it into more detailed questions, such as determining a subset of variables more characteristic of the disease(s), determining which classification method provides better accuracy, which produces fewer false negatives, etc.
    • Methods
      Explain the material and methods used in this study. This section can be decomposed into several sub-sections:
      • Task-relevant Data - List the size of the database, number of attributes, number of rows, and format.
      • Tools - Briefly describe the tools or algorithms you have used, the functionality you actually used, and why you ended up choosing these tools.
      • Data preprocessing - Explain how you preprocessed your data - the data files, data attributes, data values, missing data, outliers, etc.
      • Data analysis methods - For each question/objective studied, explain which data mining task or algorithm you used and how you found answers to this question.
    • Results - Present the knowledge discovered, and an evaluation of your study (accuracy, etc.). Precisely match the knowledge discovered with each objective in the study.
    • Discussion (for graduate students only)
      Summarize a few articles pertinent to this study, and how they relate to this work. Compare results if there are comparable studies.
    • Discussion (for undergraduate students only)
      Summarize the results obtained and their significance in the knowledge of heart disease.
    • Conclusion
      Provide an ending to this document with a mention of data mining methods and/or techniques, lessons learnt, and plans for future work.
    • References (for graduate students only)
      Provide a list of at least three articles or books pertinent to this study. Undergraduate students can refer to any kind of resource (Wikipedia, website, ...), in any number.
    • Appendixes (optional)
      Appendixes can be added, for example containing:
      • Data dictionary.
      • Presentation slides.
      • Additional graphics and analysis results.
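
      The Results section asks for an evaluation of the study (accuracy, etc.), and the Project Objectives example mentions comparing classifiers by accuracy and by false negatives. As an illustration only, the following minimal Python sketch shows how these two measures can be computed from a list of actual and predicted labels. The function names, the label encoding (1 = disease present, 0 = absent), and the sample labels are assumptions made for this example; they are not part of the assignment or of any required tool.

```python
# Illustrative sketch only: labels and function names are hypothetical.

def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # disease present, correctly detected
        elif a == positive:
            fn += 1  # disease present, missed (false negative)
        elif p == positive:
            fp += 1  # disease absent, wrongly flagged
        else:
            tn += 1  # disease absent, correctly cleared
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of all cases classified correctly."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def false_negative_rate(tp, fn):
    """Fraction of actual positive cases that the classifier missed."""
    positives = tp + fn
    return fn / positives if positives else 0.0


# Made-up labels for eight patients (assumption: 1 = disease, 0 = healthy).
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

tp, fp, tn, fn = confusion_counts(actual, predicted)
print("accuracy:", accuracy(tp, fp, tn, fn))
print("false negative rate:", false_negative_rate(tp, fn))
```

      Reporting both measures for each classification method makes it possible to match the results to an objective such as "which method produces fewer false negatives", since two methods with similar accuracy can differ in how many positive cases they miss.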
  • Grading criteria
    • Your grade will be based on both your demonstrated writing proficiency and on the contents of the document:
      • Writing proficiency
        • Overall document appearance, spelling and grammar, and clarity and conciseness will be taken into account in the final grade.
      • Contents
        • Introduction      5%
        • Background       10%
        • Objectives       10%
        • Material (data)  10%
        • Methods          20%
        • Results          20%
        • Discussion       10%
        • Conclusion        5%
        • References       10%
    • A bonus of up to 10% of the grade may be awarded if substantial additional parts are provided.

Honor code: The work needs to be your own. You may wish to have someone from outside the team help by proofreading a draft version and identifying problems, but the words and content contained in the documents should be your own.

Submission Guidelines

Turn in an electronic copy of your final article and slides by Wednesday, December 10, 2012.