Feature Engineering with Hamilton: Write Once Run Everywhere

Wednesday, October 11, 3:45pm - 4:10pm (EDT)

Time shown in UTC-04:00, America/New_York

Speaker: Elijah ben Izzy

Write Once, Run Everywhere

Most data transformations are written twice. In feature engineering for machine learning, data scientists regularly have to build, manage, and iterate on batch jobs, then translate those jobs to a service setting to load data and make fresh predictions. At best, this process is an engineering headache. At worst, it results in difficult-to-detect deltas between training and inference, complex code, and highly bespoke infrastructure. In this talk we discuss Hamilton, a lightweight open-source Python framework that enables data practitioners to define dataflows cleanly and portably. Hamilton places no restrictions on the nature of transformations, allowing data scientists to use their favorite Python libraries. With Hamilton, you can run the same code in your Airflow DAG for training as you would in your FastAPI service for inference, and get the same result.
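To make the "write once" idea concrete, here is a minimal sketch in plain Python. It is not Hamilton's implementation or API. It only illustrates the general pattern of declaring each feature as a function whose parameter names refer to raw inputs or to other features, then reusing those same definitions for a batch job and for a single online request. All names in it (spend_per_signup, is_high_cost, compute, run_batch, run_online) are hypothetical and chosen for illustration.

```python
import inspect


# Hypothetical feature definitions. Each function name is a feature;
# each parameter name is either a raw input or another feature.
def spend_per_signup(spend: float, signups: int) -> float:
    return spend / signups if signups else 0.0


def is_high_cost(spend_per_signup: float, cost_threshold: float = 5.0) -> bool:
    return spend_per_signup > cost_threshold


FEATURES = {fn.__name__: fn for fn in (spend_per_signup, is_high_cost)}


def compute(name, inputs, cache=None):
    """Resolve one feature by recursively resolving its parameters by name."""
    cache = {} if cache is None else cache
    if name in inputs:          # raw input supplied by the caller
        return inputs[name]
    if name in cache:           # feature already computed for this record
        return cache[name]
    fn = FEATURES[name]         # KeyError here means an unknown feature name
    kwargs = {}
    for param in inspect.signature(fn).parameters.values():
        if param.name in inputs or param.name in FEATURES:
            kwargs[param.name] = compute(param.name, inputs, cache)
        elif param.default is inspect.Parameter.empty:
            raise KeyError(f"missing input: {param.name}")
        # otherwise the function's own default value is used
    cache[name] = fn(**kwargs)
    return cache[name]


def run_batch(rows):
    """Training-style path: apply the feature definitions to many records."""
    return [compute("is_high_cost", row) for row in rows]


def run_online(request):
    """Inference-style path: apply the same definitions to one request."""
    return compute("is_high_cost", request)
```

Because both entry points call the same feature functions, a record scored in the batch path and the same record scored in the online path cannot diverge through duplicated logic. In a real setup the batch path would be called from an orchestrator task and the online path from a request handler; those integrations are left out of this sketch.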


