Data Science Simplified

Keep up with the latest tools and best practices in data science in 2 minutes.

Subscribe to get short daily code snippets delivered straight to your mailbox.

Latest short posts

miceforest is a Python library that imputes missing data in a dataset with an iterative series of predictive models. In each iteration, every variable with missing values is imputed using the other variables. The iterations continue until the imputed values appear to have converged.

In the example above, the correlation between A and B is brought much closer to its value in the original data after imputing A using B and C, and then imputing B using A and C.
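The chained loop described here can be sketched in a few lines of NumPy. This is not miceforest's API: miceforest fits LightGBM models, while this sketch swaps in plain least squares to keep the idea visible. The function name `chained_impute`, the columns A, B, C, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def chained_impute(X, n_iter=10):
    """Fill NaNs column by column, predicting each column from the others.

    Illustrative only: uses ordinary least squares, not the LightGBM
    models that miceforest relies on.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()

    # Start from a simple guess: the mean of each column's observed values.
    col_means = np.nanmean(X, axis=0)
    filled[missing] = np.take(col_means, np.where(missing)[1])

    for _ in range(n_iter):
        for j in range(X.shape[1]):
            miss_j = missing[:, j]
            if not miss_j.any():
                continue
            # Predict column j from every other column plus an intercept.
            others = np.delete(filled, j, axis=1)
            design = np.column_stack([np.ones(len(others)), others])
            coef, *_ = np.linalg.lstsq(
                design[~miss_j], filled[~miss_j, j], rcond=None
            )
            # Overwrite only the entries that were originally missing.
            filled[miss_j, j] = design[miss_j] @ coef
    return filled


# Hypothetical data: three related columns A, B, C with gaps in A and B.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = a + rng.normal(scale=0.5, size=200)
c = a - b + rng.normal(scale=0.5, size=200)
data = np.column_stack([a, b, c])

holey = data.copy()
holey[rng.random(200) < 0.25, 0] = np.nan  # knock out about 25% of A
holey[rng.random(200) < 0.25, 1] = np.nan  # knock out about 25% of B

completed = chained_impute(holey)
print("corr(A, B) original:", np.corrcoef(data[:, 0], data[:, 1])[0, 1])
print("corr(A, B) imputed: ", np.corrcoef(completed[:, 0], completed[:, 1])[0, 1])
```

Observed values are never changed; only the originally missing cells are updated on each pass. A fixed number of passes stands in for a real convergence check here.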

Link to miceforest.

Timely detection and notification of data anomalies are crucial for stakeholders to address potential issues promptly. 

Kestra, an open-source orchestrator, simplifies this process by enabling you to create a workflow using a YAML file.

In the given example, a DuckDB query is used to identify anomalies, and if any are detected, an email with the anomalous rows in a CSV file is sent to relevant parties.
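The logic of that workflow (query, check for anomalies, attach a CSV to an email) can be sketched in plain Python. This is not a Kestra flow and does not use Kestra's YAML syntax. To keep the snippet runnable with only the standard library, `sqlite3` stands in for DuckDB. The `orders` table, the anomaly rule, and the email addresses are hypothetical.

```python
import csv
import io
import sqlite3
from email.message import EmailMessage


def find_anomalies(conn):
    """Return (header, rows) for rows matching a hypothetical anomaly rule."""
    cur = conn.execute(
        "SELECT order_id, amount FROM orders "
        "WHERE amount < 0 OR amount > 1000 "  # assumed rule, for illustration
        "ORDER BY order_id"
    )
    header = [col[0] for col in cur.description]
    return header, cur.fetchall()


def build_alert(header, rows, recipient):
    """Build an email with the anomalous rows as a CSV attachment.

    Returns None when there is nothing to report, so no email is sent.
    """
    if not rows:
        return None

    # Write the flagged rows to an in-memory CSV.
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)

    msg = EmailMessage()
    msg["Subject"] = f"{len(rows)} anomalous rows detected"
    msg["From"] = "alerts@example.com"  # placeholder sender
    msg["To"] = recipient
    msg.set_content("The attached CSV lists the rows flagged as anomalies.")
    msg.add_attachment(
        buf.getvalue().encode("utf-8"),
        maintype="text",
        subtype="csv",
        filename="anomalies.csv",
    )
    return msg


# Hypothetical data: two of the four orders break the rule.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 25.0), (2, -10.0), (3, 480.0), (4, 5000.0)],
)

header, rows = find_anomalies(conn)
alert = build_alert(header, rows, "data-team@example.com")
print(alert["Subject"] if alert else "No anomalies found")
```

Delivery is left out on purpose. In a real setup the message could be handed to `smtplib.SMTP(...).send_message(alert)`, while an orchestrator such as Kestra would take care of scheduling and of running the email step only when anomalies exist.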

Link to Kestra.
