Hi,

I am working on a machine-learning project. Because of the study material available in the ML area, the team is inclined towards Apache Kafka and Apache Spark for data pipelines and analytics. Our requirement is to store a huge amount of continuously growing data that cannot fit on a single machine. The algorithms consume the data in batches, so it is not necessary to keep the full data set ready for consumption. With Kafka, the data can be distributed and fetched in varying batch sizes as and when required.

I am more comfortable with PostgreSQL, and I wanted to know more about case studies where PostgreSQL is deployed for ML use. Any pointers to study material will be helpful. Please share in this thread.

--
Thanks & Regards,
Pankaj Jangid
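
P.S. For context, this is a minimal sketch of the batch-fetch pattern I have in mind on the PostgreSQL side, using a server-side (named) cursor so the client never has to hold the full data set in memory. The table name, columns, DSN, and the process() step are hypothetical placeholders, not part of our actual schema.

import psycopg2

BATCH_SIZE = 10_000  # rows handed to the training loop per step


def process(rows):
    # placeholder: plug the actual training / feature-extraction step in here
    print(f"got a batch of {len(rows)} rows")


# placeholder DSN; adjust to the real database
conn = psycopg2.connect("dbname=mldata user=ml")

# A named cursor is a server-side cursor: rows are streamed from the
# server in chunks instead of being loaded into client memory at once.
with conn, conn.cursor(name="train_stream") as cur:
    cur.itersize = BATCH_SIZE  # rows fetched per network round trip
    cur.execute("SELECT id, features FROM events ORDER BY id")
    while True:
        batch = cur.fetchmany(BATCH_SIZE)
        if not batch:
            break
        process(batch)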