Argilla, maker of the first platform that empowers enterprises to build production-ready natural language processing (NLP) solutions with a unique data-centric labeling platform, announced today $1.6 million in seed funding. The round was co-led by the first VC firm exclusively focused on investing in AI, Zetta Venture Partners, and Spanish deeptech VC focused on data infra and devtools, Caixa Capital Risc.
NLP is a well-known artificial intelligence component widely used in consumer digital assistants and chatbots, as well as in business applications such as sentiment analysis, text analysis and speech analysis. Enterprise adoption of NLP technologies has continued to grow thanks to more affordable, scalable options, increased processing capacity and data digitalization, and the convergence of NLP with deep learning (DL) and machine learning (ML). As ML models become more powerful, standardized, and accessible through open-source hubs like Hugging Face, high-quality training data has become the most important factor for enterprises to successfully implement NLP technologies.
“A recent study by Meta and University College London (UCL) found that adding a few high-quality training data examples is more beneficial for model quality than increasing the model size by billions of parameters, given that increasing model size has a huge impact on the cost of computing,” said Francisco Aranda, co-founder of Argilla. “As a result, we’re seeing a shift from model-centric AI to data-centric AI or, to put it another way, a shift from data quantity to data quality.”
To tackle this problem, Argilla co-founders Daniel Vila Suero, Ph.D., and Francisco Aranda developed the first data-centric NLP platform for data and ML teams to build and monitor high-quality training data at a fraction of the time and cost of alternative tools. Enterprises can use Argilla to involve domain experts in their NLP lifecycle, which improves data quality, and to apply the most sophisticated programmatic labeling methods from academia, which reduces labeling costs. Argilla represents a powerful alternative to hand-labeling thousands of training examples.
“Teaching an ML model continues to be a tedious process, and data labeling remains a huge challenge and roadblock because it’s slow, costly and labor intensive,” said Roma Jelinskaite, Investment Director at Caixa Capital Risc. “We believe Argilla will play a key role in the NLP ecosystem by finally removing the tradeoff between accuracy and efficiency.”
Argilla’s Data-Centric Labeling Platform
Argilla is the first open-source, data-centric labeling platform to offer the most sophisticated labeling practices while simultaneously being simple enough to integrate with other NLP and MLOps tools. The platform empowers teams to involve business profiles in the process, ensuring that the models meet all business requirements. By making it simple to tailor pre-trained models to specific use cases, Argilla helps businesses get the most out of their NLP solutions.
Other key benefits include:
- Reduces time and increases quality: Businesses can reduce time to production and improve model quality without the need for extensive manual labeling.
- Unlocks the value of the data: Argilla makes it easier to build and deploy high-quality NLP models and unlock the value of the data.
- For data teams: Argilla simplifies the process of building complex human-in-the-loop workflows, integrates them with an existing stack, and improves models over time. It also allows teams to fine-tune any pre-trained language model to meet the specific needs of their business.
- For business and domain experts: Argilla makes it easier to contribute to data and model quality without requiring programming skills and hundreds of hours of manual labeling and review.
“Dani, Francis, and the team at Argilla have helped pioneer the new, data-centric approach to enterprise NLP. Their maniacal focus on bringing the most sophisticated data curation, programmatic labeling, and human-in-the-loop features to market has been rewarded by thousands of open-source and enterprise users who leverage Argilla to generate value from NLP throughout their organization,” said James Alcorn, Partner at Zetta Venture Partners. “Zetta is thrilled to partner with Dani, Francis, and leading Spanish VC, Caixa Capital Risc, to help Argilla execute on its next phase of growth.”
“We’re thrilled to have the backing and guidance of two leading VC firms who invest in breakthrough technologies and are equally excited about the possibilities Argilla is bringing to the NLP industry,” said Vila Suero. “We know global tech leaders are looking for better solutions based on the increase in their NLP budgets. Since 2020, 60% have increased their budgets by 10%, 33% indicated theirs grew by at least 30%, and 15% said theirs has more than doubled. We’re continuing to invest in our platform with the single goal of empowering enterprises to build robust NLP products through faster data labeling and curation and with the easiest-to-use human-in-the-loop and programmatic labeling features.”
Argilla already has thousands of users across the US, India, Europe, South America, Africa, and Asia. Current customers include Reale Seguros (Italy), Airbus (Germany), and Red Eléctrica de España and Idealista (Spain).
Argilla’s cloud offering (currently in alpha) will be available in the US and globally in Q1 2023.
To learn more about Argilla or to request a demo, visit argilla.io.
About Argilla
Argilla is the maker of the first platform that empowers enterprises to build production-ready natural language processing (NLP) solutions with a unique data-centric labeling platform. Argilla empowers enterprises to build robust NLP products through faster data labeling and curation with the easiest-to-use human-in-the-loop and programmatic labeling features. The company was founded in 2017 in Madrid, Spain, by co-founders Daniel Vila Suero, Ph.D. (AI), and Francisco Aranda. To learn more or to request a demo, visit argilla.io.
About Zetta Venture Partners
Zetta Venture Partners is the first VC firm focused exclusively on investing in AI. Founded in 2013, the first year a zettabyte of data went across the internet, the firm manages $365M in assets across three funds. Each member of Zetta’s leadership team has extensive experience as an entrepreneur, operator and investor. The San Francisco-based firm leads investment rounds in pre-traction, AI-first companies with B2B business models, including Kaggle, Domo, Domino Data Lab, and Tractable. Visit www.zettavp.com for more information.
About Caixa Capital Risc
Caixa Capital Risc (CCR) is a multi-stage VC fund based in Spain that invests in early- and growth-stage SaaS and Deep Tech companies, with a particular focus on DevOps, Infrastructure, Data and Cybersecurity. CCR currently manages more than €200M and is part of the investment organization CriteriaCaixa, the largest investment holding in Spain with €30bn in assets. CCR has been investing since 2007, and its portfolio companies have gone public or been acquired by blue-chip companies such as Apple, Meta, Airbnb, Qiagen, Vente-Privee, Pernod Ricard and Repsol.