Saturday 06 Apr, 2019

From 9:00am to 9:00pm

HubHub - Twin City C

Mlynské nivy 14, Bratislava

Data Cleaning & Feature Engineering for real life ML

Roman’s bio:
Roman holds a PhD in applied statistics. He currently works as a Lead Data Scientist at Operam, a company automating online advertising for Hollywood studios as well as for other verticals. Operam has promoted over 100 movies, from Oscar winners (Moonlight) to the biggest blockbusters (Deadpool 2). Roman previously led the Data Science team at Piano, providing actionable data-driven insights and building data products for over 1200 clients, including sites such as CNBC and, locally, SME. Roman is also the organizer of the Banalytics meetup, focused on the analytical community in Slovakia, and has been actively engaged in organizing the local edition of the MeasureCamp unconference on digital analytics. He also mentors new data scientists, either by teaching practical Data Science at FIIT STU or via the Basecamp Data Science Bootcamp.

Workshop info:
Data scientists spend most of their time preparing data before they can wave their wands and use the magic import command that solves everything. Too many blogs use trivial introductory datasets such as iris, but that’s not how real life works. In this interactive session we will work with a dataset from Dennik N’s open source project REMP2020 (https://remp2020.com/pythia.html) and look together at the raw user logs to see how a project idea turns into reality despite all the complexities connected to the data.

>> What will you learn:
– How to think about data cleaning
– What are some smart feature engineering approaches
– How to avoid leakage from the future in your model
– What to take into account when considering the production pipeline

>> Who is this for:
– Beginners in data science
– People looking to transition into the field

>> What do you need to know:
– A little bit of Python would be nice
– Basic Git to pull the scripts provided

>> Event INFO:
Location: ESET (Aupark Tower)
Date: 5.4.2019 (Friday)
Time: 9:00 – 12:00
Capacity: 15
Price: 75 €

© 2019 MeasureCamp. All rights reserved.