New libraries for data manipulation, visualisation and data modeling have made Python an increasingly exciting alternative to R as a data science language.
This course aims to quickly bring programmers and business analysts who already know how to program in Python up to speed in using Python as a data science tool.
This course introduces basic Python programming and community best practices such as using Jupyter/Python. The course then moves on to show how Python can be applied to data mining, analytics, data science and artificial intelligence projects. At the end of this course, participants will gain an overview of the Python ecosystem as well as the skills necessary to self-learn and continue on their Python learning journey.
The course will define data science and explore the first two things a data scientist must do – cleaning and visualising data. It will then cover the Data Science Workflow - training models and testing them through the application of machine learning models to various industry-relevant data science problems. The tools used will be Anaconda, Jupyter Notebook and Scikit-learn.
At the end of the course, participants will be able to:
- Use Python for basic data munging to aggregate, clean and process data from local files, databases, and online
- Create visualisations with Matplotlib, Pandas plotting, and Seaborn
- Create basic to intermediate analytics models with Python/Scikit-learn
- Use the above tools within the context of solving essential data science problems
- Apply Python tools to import data from various sources, explore them, analyse them, learn from them, visualise them, and share them
- Python Basics (I): Python Environments
- Python statements and operations
- Variable Assignment
- Functions and Classes
- Python Basics (II)
- Lists and Dictionaries
- Conditional and looping statements
- File Input/Output
- Managing Python Environments and Packages
- Working with Data Sources
- Reading CSV
- Web Scraping
- Interacting with local and remote databases (ODBC)
- Reading from HTML
- Mini-Project: Making a Data Product with Python and Jupyter
- Data Exploration and Wrangling
- Series/Data frame
- Data cleaning
- Data analytics e.g., Descriptive statistics using Python
- Data Visualization with Matplotlib
- Basic visualization techniques
- Creating visualization tools using matplotlib
- Introduction to key Data Science
- Data analytics process: Supervised and Unsupervised Learning
- Regression and Classification using Scikit-learn
- Mini-Project (and/or) Recap: Creating data visualization and data analytics product
Anaconda, Jupyter Notebook, Scikit-learn
Participants must be familiar with the Python programming language or have attended the Introduction to Python training, and should have a pre-university (Statistics 101) level of statistics.
Who Should Attend
Business Analysts, Data Analysts, Software Engineers, Programmers
Mode of Training
On-campus or Online (Live)
Dr. Danny Poo
Danny Poo, a graduate from the University of Manchester Institute of Science and Technology (UMIST), England, has been in the field of Software Engineering, Artificial Intelligence and Information Systems for more than 40 years. He is currently an Associate Professor at the Department of Information Systems and Analytics, School of Computing, National University of Singapore.
Dr. Poo is deeply involved in Data Analytics and Software Engineering, teaching undergraduates and postgraduates in the areas of Software Engineering, Data Management and Machine Learning.
A well-known seminar speaker, Dr. Poo has conducted numerous in-house training and consultancy engagements for organizations. His notable teaching credentials include Data Strategy, Data Storytelling, Data Visualisation, Big Data Analytics, Machine Learning, Data Management, Data Governance, and Data Architecture.
Dr. Poo is a Steering Committee member of the Asia-Pacific Software Engineering Conference and the founding Director of the Centre for Health Informatics at NUS. The Centre is the lead provider of human capital in Health Informatics.
Dr. Poo is the author of 5 books on Object-Oriented Software Engineering, Java Programming and Enterprise JavaBeans. He is currently publishing three books on Data Analytics including Python Programming, Learning Python for Data Analysis and Machine Learning using Python.
Mr. Koo Ping Shung
Koo Ping Shung is an experienced Data Scientist with more than 13 years of relevant experience. He is currently an Adjunct Senior Faculty with the Singapore University of Social Sciences (SUSS) and a SAS Trainer. To date, he has conducted over 1,600 man-hours of data science training. He is also a mentor to trainees accepted into the IMDA-SAS BIA Programme, across 5 intakes.
Ping Shung was a guest lecturer at the NUS Institute of Systems Science and previously held an adjunct position in the School of Information Systems, Singapore Management University.
Prior to this, he was the Analytics Practicum Manager of the Master of IT in Business (Analytics) at the Singapore Management University School of Information Systems, where he managed industry relationships through projects and attachments. He often advised companies on the types of data analytics projects they could undertake with their data and co-supervised many Masters students on their data analytics capstone projects. He was an instructor for the DBS Graduate Associate Programme for 3 years, teaching data analytics to over 200 Graduate Associates and receiving positive ratings. He has also trained professionals from various companies on data analytics and the use of SAS software.
Ping Shung was a facilitator for the IDA Data Science MOOC programme for 2 cohorts (over 300 professionals) and participated in Singapore's first Data Literacy Bootcamp which was co-organised by the IDA Singapore and The World Bank.
Throughout his career, Ping Shung has gathered extensive experience in statistical modeling from working in banks, supervising Masters students and doing education research. His data analytics experience spans a wide variety of business functions and industries, gathered from talking to companies or working on data analytics projects.
His strong passion for data analytics and data science can be seen in his involvement in data analytics interest groups: he is a Co-founder of DataScience.Sg, a former Working Committee Chairman of SAS User Group Singapore, and a Data Ambassador for one of the DataKind SG projects. He also reads widely on topics related to data science and artificial intelligence, keeping himself up to date with their development.
His research interest lies in how data science can help organisations and businesses to be more efficient and effective.
Ping Shung holds an MBA from the University of Adelaide. He obtained his bachelor's degree in Economics from the National University of Singapore, with a minor in Computational Finance.