Data Science Simplified

Keep up with the latest tools and best practices in data science in 2 minutes.

Subscribe to get short daily code snippets delivered straight to your inbox.

Latest short posts

Taipy: Build Responsive Interfaces in Python for Large Data

In Streamlit, changing a control such as a slider can cause the entire script to re-run, so updates may not feel seamless or real-time, especially with larger datasets.

In contrast, Taipy uses the State object to efficiently store variables, enabling dynamic data updates in response to user interactions.
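The difference between the two update models can be sketched in plain Python. This is a library-free illustration, not Streamlit's or Taipy's actual API: `load_data`, `rerun_script`, and this `State` class are hypothetical names, and the rerun model is shown without any caching.

```python
# Library-free sketch: "rerun everything" vs "update via a state callback".
# load_data, rerun_script and State are hypothetical, not real library APIs.

load_calls = 0  # counts how often the expensive load step runs


def load_data():
    """Stand-in for an expensive step, e.g. reading a large dataset."""
    global load_calls
    load_calls += 1
    return list(range(1_000))


def rerun_script(slider_value):
    """Rerun model: every slider change executes the whole script again."""
    data = load_data()
    return [x for x in data if x < slider_value]


class State:
    """Callback model: data is loaded once and kept between interactions."""

    def __init__(self):
        self.data = load_data()
        self.filtered = self.data

    def on_change(self, slider_value):
        # Only the variable that depends on the slider is recomputed.
        self.filtered = [x for x in self.data if x < slider_value]


# Three slider moves under the rerun model.
for value in (10, 20, 30):
    rerun_script(value)
rerun_loads = load_calls

# The same three slider moves under the callback model.
load_calls = 0
state = State()
for value in (10, 20, 30):
    state.on_change(value)
state_loads = load_calls

print(rerun_loads, state_loads)  # 3 1
```

In this sketch the expensive load runs on every interaction in the rerun model but only once in the callback model, which is the gap that grows with dataset size.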

Link to Taipy.

Quix Streams: Real-Time Data Processing in Python

Traditional batch processing can be slow when large datasets arrive continuously. In contrast, data streaming processes records as they arrive, which enables real-time processing of such data.

Quix Streams is a Python library that enables data streaming by leveraging Streaming DataFrames, which are similar to pandas DataFrames used for batch processing.

This familiar interface allows pandas users to easily build stream processing pipelines with minimal code.
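To make the batch versus streaming contrast concrete, here is a minimal sketch using plain Python generators. It does not use Quix Streams or its Streaming DataFrame API; `sensor_stream`, `batch_mean`, and `streaming_mean` are hypothetical names, and the temperature values are made up for illustration.

```python
from typing import Dict, Iterable, Iterator


def sensor_stream() -> Iterator[Dict[str, float]]:
    """Hypothetical source that yields records one at a time."""
    for temp in (20.0, 22.0, 31.0, 35.0):
        yield {"temperature": temp}


def batch_mean(records: Iterable[Dict[str, float]]) -> float:
    """Batch style: collect all records first, then compute once."""
    values = [r["temperature"] for r in records]
    return sum(values) / len(values)


def streaming_mean(records: Iterable[Dict[str, float]]) -> Iterator[float]:
    """Streaming style: emit an updated mean after every record."""
    total, count = 0.0, 0
    for r in records:
        total += r["temperature"]
        count += 1
        yield total / count


# Streaming produces a result per record; batch produces one result at the end.
running = list(streaming_mean(sensor_stream()))
final = batch_mean(sensor_stream())

print(len(running))
print(running[-1] == final)  # True
```

The streaming version has a usable answer after the first record, while the batch version has to wait until the whole dataset is available.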

Link to Quix Streams.

KitOps: A Unified Solution to Manage AI/ML Projects

In AI/ML projects, various components are usually stored in separate locations:

Code resides in Git repositories
Datasets and models are stored in DVC or storage services like S3
Parameters are managed using experiment management tools

Because these components are stored separately, deploying and integrating them can become complicated.

KitOps offers a unified solution by packaging these components into ModelKits. This allows for easy versioning and sharing of components with other team members in just a few commands.
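The idea of shipping code, data, and parameters as one versioned unit can be illustrated with the standard library alone. This is only a conceptual sketch: it is not the ModelKit format or the KitOps tooling, and `pack_bundle`, `unpack_bundle`, the file names, and the manifest layout are all assumptions made for the example.

```python
import hashlib
import io
import json
import tarfile


def pack_bundle(components: dict, version: str) -> bytes:
    """Pack named byte blobs plus a manifest into one in-memory tar.gz."""
    manifest = {
        "version": version,
        # A checksum per component lets the receiver verify the contents.
        "sha256": {
            name: hashlib.sha256(blob).hexdigest()
            for name, blob in components.items()
        },
    }
    files = dict(components)
    files["manifest.json"] = json.dumps(manifest, sort_keys=True).encode()

    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, blob in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(blob)
            archive.addfile(info, io.BytesIO(blob))
    return buffer.getvalue()


def unpack_bundle(bundle: bytes) -> dict:
    """Read every file in the archive back into a name -> bytes mapping."""
    with tarfile.open(fileobj=io.BytesIO(bundle), mode="r:gz") as archive:
        return {m.name: archive.extractfile(m).read() for m in archive.getmembers()}


# Hypothetical project components that would normally live in separate places.
bundle = pack_bundle(
    {
        "train.py": b"print('train')",
        "data.csv": b"x,y\n1,2\n",
        "params.json": b'{"lr": 0.01}',
    },
    version="1.0.0",
)
contents = unpack_bundle(bundle)
manifest = json.loads(contents["manifest.json"])

print(sorted(contents))
print(manifest["version"])  # 1.0.0
```

A teammate who receives the single archive gets matching code, data, and parameters together, along with a version label and checksums to confirm nothing is missing or altered.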

Learn more about KitOps.