University | Nanyang Technological University (NTU) |
Subject | CB0494 Introduction to Data Science and AI |
1. Objectives
- To relate a potential real-life problem to data science.
- To apply what has been taught and learnt in CB0494 to a real dataset and perform data analysis. The workflow is very important. This includes why your team uses certain tools and how these tools help in your analysis.
- To conclude with potential findings and predictions as part of the solution to the problem.
This mini project is for you to demonstrate your understanding of the tutorials and the data science content from your lecture slides, and your competence in basic data science analysis. The focus is on the thought process and each team’s creativity in bringing out the best of the dataset, using the tools taught in the course, for the problem statement chosen by the team.
2. Assessment Criteria (Total 100 marks)
(Reminder: Overall weightage is 30%)
(a) Project Content and Analysis:
Relating problem with data science: | 10 Marks |
Relevant use of visualizations: | 10 Marks |
Relevant use of machine learning techniques (taught in course): | 15 Marks |
Data science and programming good practices and clarity in conclusions: | 10 Marks |
Team work: | 15 Marks |
(b) Project Report:
Organization: | 10 Marks |
Clarity: | 10 Marks |
(c) Presentation:
Organization: | 10 Marks |
Clarity: | 10 Marks |
Dataset 2: Regarding data on lung diseases (From Kaggle):
URL: https://www.kaggle.com/datasets/samikshadalvi/lungs-diseases-dataset
Dataset: lung_disease_data.csv
Data Description: Please refer to the URL for information on the data fields.
Upon selecting your dataset, explain what real problem your team is going to solve and relate it to the data science questions. Study the dataset and select the appropriate data for your analysis via the Jupyter notebook. Each team should select the suitable tools taught in class to analyse the data and make sense of the analysis related to the problem statement they have stated, and eventually conclude their findings with some prediction.
The analysis should follow the thought process taught in class. This means each team needs to provide proof and justification for its choices, for example why certain predictors were selected, and each team should select the predictors that give the best possible goodness-of-fit score.
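One way to back up a predictor choice is to show how the outcome varies across that predictor's values. The sketch below is illustrative only: the rows are made up, and the column names `Smoking Status` and `Recovered` are assumptions that should be checked against the Kaggle data description. `recovery_rate_by_group` is a hypothetical helper, written in plain Python so it runs without extra libraries.

```python
from collections import defaultdict

# Made-up rows for illustration only; real rows would come from lung_disease_data.csv.
# The column names "Smoking Status" and "Recovered" are assumptions.
rows = [
    {"Smoking Status": "Yes", "Recovered": "No"},
    {"Smoking Status": "Yes", "Recovered": "No"},
    {"Smoking Status": "Yes", "Recovered": "Yes"},
    {"Smoking Status": "No", "Recovered": "Yes"},
    {"Smoking Status": "No", "Recovered": "Yes"},
    {"Smoking Status": "No", "Recovered": "No"},
    {"Smoking Status": "No", "Recovered": "Yes"},
]


def recovery_rate_by_group(rows, predictor, target="Recovered", positive="Yes"):
    """Return {predictor value: fraction of rows whose target equals `positive`}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        group = row[predictor]
        totals[group] += 1
        if row[target] == positive:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


rates = recovery_rate_by_group(rows, "Smoking Status")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.2f}")
```

A clear gap between groups is some evidence that the predictor carries information about the outcome, while a negligible gap suggests it may add little. With small groups, such differences should be read with caution.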
Each team does not need to be obsessed with pushing the goodness-of-fit score to an exceptionally high value, which may be impossible in some cases because of the quality of the dataset, something beyond the team's control. What still matters is the thought process: how each team tried its best to analyse the data with the tools taught, and thereby obtained a justified goodness-of-fit score.
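A goodness-of-fit score is easier to justify when it is set against a simple reference point. As a sketch, assuming the score is classification accuracy, the hypothetical helper `majority_baseline_accuracy` below computes the accuracy of always predicting the most common class. The labels are made up for illustration.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must not be empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Made-up outcome labels for illustration only.
labels = ["Yes", "No", "Yes", "Yes", "No", "Yes", "No", "Yes"]
baseline = majority_baseline_accuracy(labels)
print(f"Majority-class baseline accuracy: {baseline:.3f}")
```

A model that scores only slightly above this baseline has learnt little from the predictors, whatever the raw number looks like. A modest score that clearly beats the baseline can still be a justified result.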
Each team can identify the limitations of its work and list some recommendations for future work. The recommendations may include suggestions to use tools not taught in class, with explanations.
Please do not blindly follow the steps without justification and reasons. That will certainly mean an F grade, as it does not demonstrate any understanding of the class and hence fails to apply the tools taught for data analysis. Using the wrong tools, or using tools in an unsuitable way, will result in marks being deducted.
Each team must first convince themselves that the thought process and results of their mini project are realistic, reasonable and meaningful before they can convince me of that.
Please organise your code and add suitable comments in your mini project Jupyter notebook file before submission so that it is readable and easily understood. There is no limit on the length of your Jupyter notebook file for your mini project.
Problem Statement:
Lung diseases remain a significant public health concern in Singapore, particularly among elderly and chronically ill populations. While access to healthcare is robust, patient recovery outcomes still vary due to multiple clinical and demographic factors. This project aims to develop a predictive model to classify whether a patient is likely to recover from lung disease based on their medical and demographic information. By identifying patients at risk of non-recovery early, healthcare providers can tailor interventions, reduce complications, and improve overall treatment efficiency.
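The classification task described above could be sketched as follows. This is a minimal illustration, not the expected solution: it assumes a decision tree classifier is among the techniques taught, it uses synthetic data in place of `lung_disease_data.csv`, and the column names `Age`, `Lung Capacity` and `Recovered` are assumptions to be checked against the Kaggle data description.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; in the notebook this would be replaced by
# pd.read_csv("lung_disease_data.csv"). Column names are assumptions.
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "Age": rng.integers(20, 90, size=n),
    "Lung Capacity": rng.uniform(1.0, 6.0, size=n),
})
# Made-up rule: higher lung capacity makes recovery more likely, plus noise.
noise = rng.normal(0, 1, size=n)
data["Recovered"] = (data["Lung Capacity"] + noise > 3.5).astype(int)

X = data[["Age", "Lung Capacity"]]
y = data["Recovered"]

# Hold out a test set so the score reflects unseen patients;
# stratify keeps the class balance similar in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A shallow tree limits overfitting and stays easy to explain.
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred, labels=[0, 1])
print(f"Test accuracy: {accuracy:.3f}")
print(matrix)
```

The confusion matrix matters here because the stated goal is to flag patients at risk of non-recovery: the count of non-recovering patients wrongly predicted to recover is arguably more important than overall accuracy.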