This course is aligned with the Infocomm Technology Skills Framework for Data Analyst and Data Scientist and will use R,
a widely used statistical programming language. The course aims to quickly bring up to speed a programmer or business analyst who
already knows how to programme in another language, or has written advanced Excel macros, to begin using R as a data
The course will define data science and explore the first two things a data scientist must do: cleaning and visualising data. You will learn and use R’s dplyr, ggplot2 and ggvis packages for these tasks. It will then cover the data science workflow of training and testing models, applying machine learning models to various industry-relevant data science problems. The tool used will be the caret package.
At the end of the course, you should have a working knowledge of how to solve data science problems with R and be able to:
- Use R for basic data munging to aggregate, clean and process data from local files, databases and REST APIs
- Create visualisations with ggplot2 and ggvis
- Create basic to intermediate analytics models with the R caret package
- Programme and apply 12 common analytics models in R
- Introduction to R: The R Ecosystem
- Refresher: Basic R Programming
- Data Types
- Conditional Execution & Functions
- The Analytics Process for Data Science
- Data Acquisition (Read/Write Files)
- Datasets & Packages
- Data Cleaning & Transformation Pt 1
- The Analytics Process for Data Science
- Data Cleaning & Transformation Pt 2
- Tidy Data Concepts/Application
- Statistical Analysis with R
- Summary Statistics
- Linear Regression/Multiple Linear Regression
- Hypothesis Testing (One-sided/Two-sided t-test)
- Exploratory Data Analysis in R using Visualization
- rgl (3D Plots)
- leaflet (Maps)
- Supervised vs Unsupervised Machine Learning
- Supervised Machine Learning in caret
- Cross Validation
- Unsupervised Machine Learning using k-means clustering
- How to Build a Shiny App
- User Interface (UI)
R, RStudio
Participants must be familiar with a programming language such as Java, C/C++ or Python, and with introductory statistics at a pre-university level.
Who Should Attend
Software Engineers, Programmers, Data Analysts
Mode of Training
On-campus or Online (Live)
Dr. Edmund Low
Edmund Low is a lecturer with the University Scholars Programme (USP) at the National University of Singapore. He teaches courses on engineering, statistical methods, data science and analytics. He currently heads the quantitative reasoning domain and is also director of the Quantitative Reasoning Centre at USP. He has organised or co-organised programming workshops and data hackathons for students. As an educator, Edmund has received both the USP Teaching Excellence Award and the NUS Annual Teaching Excellence Award. He has more than 13 years of academic and professional experience in the use of computational modelling and data-driven tools, applying them to solve problems in public health, water resource management and air quality in buildings. Edmund holds a PhD in Environmental Engineering from Yale University.
Mr. Koo Ping Shung
Koo Ping Shung is an experienced Data Scientist with more than 13 years of relevant experience. He is currently an Adjunct Senior Faculty member with the Singapore University of Social Sciences (SUSS) and a SAS Trainer. To date, he has conducted over 1,600 man-hours of data science training. He has also mentored the trainees accepted into the IMDA-SAS BIA Programme for 5 intakes.
Ping Shung was a guest lecturer at the NUS Institute of Systems Science and previously held an adjunct position in the School of Information Systems, Singapore Management University.
Prior to this, he was the Analytics Practicum Manager of the Master of IT in Business (Analytics) at the Singapore Management University School of Information Systems. He managed industry relationships through projects and attachments. He often advised companies on the types of data analytics projects they could work on with their data, and co-supervised many Master's students on their data analytics capstone projects. He was an instructor for the DBS Graduate Associate Programme for 3 years, teaching data analytics to over 200 Graduate Associates and receiving positive ratings. He has also trained professionals from various companies on data analytics and the use of SAS software.
Ping Shung was a facilitator for the IDA Data Science MOOC programme for 2 cohorts (over 300 professionals) and participated in Singapore's first Data Literacy Bootcamp, which was co-organised by IDA Singapore and the World Bank.
Over his career, Ping Shung has gained extensive experience in statistical modelling from working in banks, supervising Master's students and doing education research. His data analytics experience spans a wide variety of business functions and industries, gathered from talking to companies or working on data analytics projects.
His strong passion for data analytics and data science can be seen in his involvement in data analytics interest groups: he is a Co-founder of DataScience.Sg, a former Working Committee Chairman of SAS User Group Singapore and a Data Ambassador for one of the DataKind SG projects. He also reads widely on topics related to data science and artificial intelligence, keeping himself up to date with their development.
His research interest lies in how data science can help organisations and businesses to be more efficient and effective.
Ping Shung holds an MBA from the University of Adelaide. He obtained his bachelor's degree in Economics from the National University of Singapore, with a minor in Computational Finance.