Comet, an MLOps platform, today announced a strategic partnership with Snowflake that it says will help data scientists build better machine learning (ML) models faster.
According to Comet, the partnership integrates Comet’s products into Snowflake’s unified platform, allowing developers to track and version their Snowflake queries and datasets within their Snowflake environment.
Comet says the integration will make it possible to trace how a model was created and how it performs, providing greater insight and understanding than conventional development approaches. It will also show how changes in the data affect a model’s performance.
Overall, the company says, using Snowflake data in the Comet platform will make the model-development process faster and more transparent.
Faster model training, deployment, and monitoring
According to the companies, customers will be able to build, train, deploy, and monitor models much more quickly by combining Snowflake’s Data Cloud with Comet’s ML platform.
Additionally, Comet CEO Gideon Mendels said that this relationship “fosters a feedback loop between model development in Comet and data management in Snowflake.”
According to Mendels, such a loop can continuously improve models and close the gap between developing new models and using them in production, thereby delivering on the main promise of machine learning: the capacity to learn and adapt over time. He said the clear versioning between datasets and models will let organisations effectively manage data changes and their effects on models in production.
The new service from Comet comes on the heels of the company’s recent release of a set of tools and connectors intended to speed up processes for data scientists using large language models (LLMs).
Improving ML models with continuous feedback
Comet will be able to log, version, and directly link the queries that data scientists or developers run to pull datasets from Snowflake for their ML models.
According to Mendels, this approach improves reproducibility, collaboration, auditability, and iterative improvement.
“The integration between Comet and Snowflake aims to provide a more robust, transparent, and efficient framework for ML development by enabling the tracking and versioning of Snowflake queries and datasets within Snowflake itself,” he said. By versioning the SQL queries and datasets, data scientists can always return to the exact version of the data that was used to train a particular model version. This is essential for model reproducibility.
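As a rough illustration of the idea, and not Comet's or Snowflake's actual API, query and dataset versioning can be sketched in plain Python by fingerprinting the SQL text and the rows it returned, then storing both against a model version. Every name here (`LineageRecord`, `fingerprint_query`, `fingerprint_rows`, `record_lineage`, the sample table and rows) is hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass


def fingerprint_query(sql: str) -> str:
    """Hash the SQL text after collapsing whitespace.

    Simplification: this also collapses whitespace inside string literals.
    """
    normalized = " ".join(sql.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def fingerprint_rows(rows: list[dict]) -> str:
    """Hash the fetched rows in a canonical JSON form (sorted keys)."""
    payload = json.dumps(rows, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class LineageRecord:
    model_version: str
    sql: str
    query_hash: str
    dataset_hash: str


def record_lineage(registry: dict, model_version: str, sql: str, rows: list[dict]) -> LineageRecord:
    """Store the query and dataset fingerprints used to train a model version."""
    record = LineageRecord(model_version, sql, fingerprint_query(sql), fingerprint_rows(rows))
    registry[model_version] = record
    return record


# Made-up data standing in for rows fetched from a warehouse.
registry = {}
sql_v1 = "SELECT user_id, spend FROM training_features"
rows_v1 = [{"user_id": 1, "spend": 10.0}, {"user_id": 2, "spend": 25.5}]
rows_v2 = rows_v1 + [{"user_id": 3, "spend": 7.25}]

rec_v1 = record_lineage(registry, "model-v1", sql_v1, rows_v1)
rec_v2 = record_lineage(registry, "model-v2", sql_v1, rows_v2)

same_query = rec_v1.query_hash == rec_v2.query_hash     # identical SQL text
same_data = rec_v1.dataset_hash == rec_v2.dataset_hash  # a row was added for v2
```

Looking up `registry["model-v1"]` then recovers the exact query text and the fingerprint of the data that model version was trained on, which is the property the article describes.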
Identifying the causes of model performance changes
Training data is essential to machine learning and should not be overlooked. Changes to the data, such as the addition of new features, the removal of missing values, or shifts in data distributions, can have a significant impact on a model’s accuracy.
According to Comet, tracking a model’s history may help pinpoint what changes in the data led to noticeable shifts in performance. This not only helps with performance debugging and understanding, but it also directs feature engineering and data quality efforts.
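One way to picture this kind of tracing, as a minimal sketch with hypothetical names and made-up numbers rather than anything Comet ships: keep a per-version history of the dataset fingerprint and an evaluation metric, then flag version-to-version transitions where the metric moved by more than a threshold and note whether the data changed at the same time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRun:
    version: str
    dataset_hash: str  # fingerprint of the training data (illustrative values)
    accuracy: float


def flag_shifts(history: list[ModelRun], threshold: float = 0.02) -> list[dict]:
    """Return consecutive-version transitions whose accuracy change exceeds the threshold."""
    flagged = []
    for prev, curr in zip(history, history[1:]):
        delta = curr.accuracy - prev.accuracy
        if abs(delta) > threshold:
            flagged.append({
                "from": prev.version,
                "to": curr.version,
                "delta": round(delta, 4),
                # True when the training data differs between the two versions
                "data_changed": prev.dataset_hash != curr.dataset_hash,
            })
    return flagged


# Illustrative history: v3 was trained on different data and its accuracy dropped.
history = [
    ModelRun("v1", "a1f3", 0.91),
    ModelRun("v2", "a1f3", 0.915),
    ModelRun("v3", "9c07", 0.84),
]
shifts = flag_shifts(history)
```

A flagged transition with `data_changed` set points the investigation at the dataset diff first; one without it suggests looking at code or hyperparameters instead.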
Mendels suggested that logging queries and data over time creates a feedback loop that encourages continual improvement in both the data management and model building phases.
With model lineage, “anyone can understand a model’s history and how it was developed without the need for extensive documentation,” Mendels said, adding that “model lineage can facilitate collaboration among a team of data scientists.” “This is especially helpful when members of the team leave or when new members join the team, facilitating smooth knowledge transfer,” he said.
Where will Comet go from here?
Companies such as Uber, Etsy, and Shopify, which already use Comet, are said to see improvements of 70% to 80% in their ML velocity.
“This is due to faster research cycles, the ability to understand model performance and detect issues faster, better collaboration, and more,” Mendels added. Because linking the two systems still involves friction today, that improvement is expected to be even greater with the combined solution. Customers may also save money on ingress and consumption charges, since data is not sent over the wire or stored in other locations.
According to Mendels, Comet wants to become the go-to framework for developing AI.
The real benefits of AI, in his opinion, won’t be seen until companies start using AI models trained on their own data. Comet’s goal is to smooth the transition between research and production, whether the user is training from scratch, fine-tuning an open-source model, or injecting context into ChatGPT.