What is a data company?
Every company is a data company. That is, every company generates and stores data - about its product, performance, financials and workforce. And many, many companies are data-driven companies: they invest in external intelligence (or ‘big data’) to make smarter decisions.
Because of this, data commerce is a two-way street. There are companies buying data and companies selling data. When we think of “data companies”, it’s important we consider both parties integral to the commercial exchange of big data. For this reason, our Ultimate Guide to Data Companies looks at both data sellers and data buyers. Let’s begin with supply. Part One: Companies Which Sell Data.
Part One: Companies Which Sell Data
Companies selling data are most commonly known as data providers or vendors. Companies which have made data their primary revenue stream are often called data-as-a-service (or DaaS) companies. DaaS companies have created a business from selling data “as-a-service” - that is, on-demand and tailored to clients and organizations looking to buy data.
As the name suggests, big data is vast in terms of the categories comprising it. There are over 600 data categories listed on Datarade Marketplace, with more emerging constantly as technology advances and new kinds of data are generated and collected.
All the same, within data commerce, demand for data is mostly centered around seven core categories: geospatial data, commerce data, financial market data, company data, real estate data, web data, and AI & ML training data. It’s companies providing these categories of data which you’re likely to recognise as established industry players. Let’s have a look at some of these leading data companies according to their primary category. First up: geospatial data companies.
1. Geospatial data companies
Geospatial data companies are businesses that provide data and insights based on geographic locations. Their data can be used in a variety of industries, such as urban planning, agriculture, and transportation, to name a few. Geospatial data companies collect and analyze a range of information, from satellite imagery to location-based social media data, to create detailed maps and visualizations of the physical world.
Here are three top geospatial data companies whose maps and visualizations can be used to make informed decisions and gain deeper insights into the world around us:
- Placer.ai - Placer.ai provides retailers with location-based insights into their audience and competition. Placer.ai provides instant access to location insights derived from the foot traffic of millions of consumers, delivering visibility into offline behavior.
- Carto - Carto is a spatial analysis platform that enables users to leverage data to optimize their business for use cases such as risk management, behavioral marketing, and data monetization.
- CleverMaps - CleverMaps is a location intelligence platform supporting the data stack ecosystem. Its map-based analytics platform enables users to solve location-related problems. It also allows data analysts to build and maintain large location intelligence projects to be put to work across retail, insurance, banking, real estate, delivery services, and more.
2. Commerce data companies
Commerce data is a rapidly growing category of data that includes various sub-categories, such as consumer data, transaction panel data, product purchase data, e-commerce data, and retail data. Companies providing commerce data offer valuable insights into consumer behavior, market trends, and purchasing patterns.
Here are three commerce data companies helping businesses make better decisions and optimize their marketing and sales strategies:
- Plaid - Plaid offers transaction data with over 5 years’ historical lookback, including merchant and category information. The Plaid API powers financial services applications and connects with user bank accounts.
- Klue - Klue collects deal data and uses AI to surface competitive intelligence for businesses, gathered in one central location.
- Quantcast - Quantcast provides consumer commerce data for advertisers. Ara®, Quantcast’s patented AI and machine learning engine, operates on a dataset of about 20 petabytes of data every day to provide insights into real-time consumer behavior through the company’s intelligent audience platform.
3. Financial market data companies
In today's fast-paced financial landscape, companies require access to accurate and up-to-date market data to make informed decisions. To meet this demand, various companies offer financial market data APIs that provide real-time and historical data on stocks, commodities, currencies, and other financial instruments. These APIs offer a wide range of data points, including price, volume, market depth, and news sentiment analysis. Companies can leverage this data to develop investment strategies, create trading algorithms, and gain insights into market trends.
Here are three financial market data providers offering APIs and data tools for clients looking to navigate the complex and dynamic world of finance:
- Xignite - Xignite provides cloud-native, real-time, and reference market data APIs to financial services and FinTech companies, allowing them to easily integrate financial data into any application.
- Enigma - Enigma is a New York-based Data-as-a-Service company. Enigma transforms tabular data into representations of real-world relationships, providing a source of intelligence about people, places and companies. From evaluating insurance risk to combating money laundering, Enigma connects clients' internal data assets to transform their strategies and workflows.
- Signal AI - Signal AI offers an AI-powered business intelligence and media monitoring platform that aggregates, analyses, and provides business leaders with insights into digital, print, and broadcast media and regulatory data. Signal’s ML enables businesses to track the competitive landscape, follow changes to regulation, and monitor reputation, empowering them to make smarter decisions.
4. Company data companies
Company and firmographic data providers offer a wealth of information on businesses, including their size, industry, location, contact information for relevant employees, and financial performance. This data is crucial for companies looking to identify potential customers, partners, and competitors. With access to company and firmographic data, businesses can create targeted marketing campaigns, conduct market research, and develop competitive strategies. Companies can also use this data to conduct due diligence on potential partners and suppliers, assess credit risk, and evaluate investment opportunities.
Here are three company and firmographic data providers playing a crucial role in helping businesses make informed decisions and stay ahead of the competition:
- Cognism - Cognism provides B2B phone number and email data for company and contact targeting. Its data-as-a-service acts as an end-to-end sales acceleration solution that provides sales organizations with a more efficient way to prospect. With its company data assets and compliance engine, Cognism helps to enrich CRM records, stream leads into the funnel, and surface opportunities and customer trends using artificial intelligence.
- Databroker - Databroker is the UK’s leading supplier of B2B marketing data. They source data specifically for direct mail, telemarketing and email marketing campaigns. Databroker’s data is regularly updated to keep it safe and reliable, ensuring the best ROI for businesses growing with Databroker’s insights.
- Marketscan - Marketscan offers firmographic data suitable for growing a business, marketing a campaign, or cleansing existing company databases. Marketscan’s business data lists have the volume and quality B2B marketers need to get their direct mail, telephone and email campaigns to the right people.
5. Real estate data companies
Real estate data companies provide valuable information to companies involved in the property industry. These companies collect, analyze, and sell data on a variety of topics, including property valuations, market trends, and building characteristics. The information they provide can be used to make informed decisions about real estate investments, insurance underwriting, and property management.
Here are three top real estate data companies providing businesses with a competitive advantage:
- Zesty.ai - Zesty.ai is an Artificial Intelligence-enabled building analytics platform for the property insurance industry. Insurers use the company's data insights to underwrite risk more accurately, provide consumers with a digital purchasing experience, and manage home inspections more cost-effectively. Using advancements in computer vision and deep learning, Zesty.ai extracts building characteristics from several different data sources for use in wildfire risk modeling, change detection, marketing, and more.
- PriceHubble - PriceHubble is leading the development of data and AI-driven real estate valuations and insights globally. The company gathers property market insights so that investors can simulate the real-time value of their property assets, manage portfolios, and create customer experiences around real estate.
- Local Logic - Local Logic is a location analytics platform for the real estate industry. The platform enables clients to augment their intuition about their assets with data-driven certainty for over 300M properties in the US and Canada.
6. Web data companies
Web data providers and web scraping companies sell valuable data on various aspects of the web, including social media, e-commerce, news, and search engines. This data can be used by businesses to gain insights into customer behavior, track competitors, and monitor brand reputation. Web data providers and web scraping companies can also provide valuable information on pricing trends, product availability, and market demand. With this data, businesses can optimize their online presence, develop effective marketing campaigns, and make data-driven decisions.
Here are some top web data providers and web scraping companies providing businesses with a competitive advantage:
- Bright Data - Bright Data is a web data specialist, offering award-winning proxy networks, powerful web scrapers, and ready-to-use datasets for download. The company’s data offering provides the best network uptime and fastest output - crucial for any project involving web data.
- Webz.io - Webz.io is a big web data company aiming to transform the web into structured data feeds. The company translates the unstructured web into structured, digestible JSON or XML formats machines can actually make sense of, helping clients globally put web-derived insight to immediate work.
- Zyte - Zyte was one of the first web data platforms in operation. Zyte offers both web data on-demand and software tools to unlock websites. The company ingests and provides data on over 13 billion web pages per month with 99.9% accuracy.
- ipinfo.io - ipinfo.io is the trusted source for IP address data. This web data company provides accurate IP address data that keeps pace with enterprise-grade demand and is used for a range of use cases, from e-commerce to cybersecurity. ipinfo.io’s data is aggregated from multiple sources and updated daily.
7. AI & ML training data companies
Artificial Intelligence (AI) and Machine Learning (ML) are rapidly evolving technologies that have transformed many industries. One of the most critical factors that enable the development of more advanced algorithms and AIs is high-quality training data. AI & ML training data is a category of data that is used to train algorithms to perform specific tasks such as image recognition, natural language processing, and speech recognition. This data is annotated, labeled, or classified to enable machines to learn from it and improve their performance.
Companies with annotated image, audio, text, transcript, and video data are meeting the demand for AI & ML training by providing high-quality training data that allows businesses to develop more advanced algorithms and AI systems. AI & ML training data is therefore one of the most exciting data categories today, and its quality and availability will shape the future of AI and ML technology. Here are some of the companies providing the goods:
- Sama - Sama provides accurate data for ambitious AI. Its training data platform supports the development of accurate machine learning models, specializing in image, video, and sensor data annotation and validation for machine learning algorithms in industries including transportation, retail and e-commerce, consumer and media, MedTech, manufacturing and robotics, and agriculture.
- Superb AI - Superb AI uses AI to customize training data for large tech companies. The company's Superb AI Suite is an enterprise SaaS platform built to help ML engineers, product teams, researchers, and data annotators create efficient training data workflows through its filter and search, auto-labeling AI, and ML Ops integration solutions.
- Mostly AI - Mostly AI develops and offers a synthetic data engine that provides organizations the ability to generate realistic and representative synthetic users and data, retaining structure and variation while preserving privacy. The company serves the banking, insurance and telecommunications industries.
That wraps up Part One: Companies Which Sell Data. Let’s now take a look at the companies generating demand for data-as-a-service, and the various applications for which they’re using big data in Part Two: Companies Which Buy Data.
Part Two: Companies Which Buy Data
External data adoption is a cross-industry, cross-vertical phenomenon which has defined the past two decades of business innovation. Companies across the world are investing in big data as well as the tools to transform and interpret it. This shouldn’t come as a surprise, with an EY study finding that 93% of companies have allocated budget for data and analytics. So virtually every company has something of a data sourcing strategy in place.
But which companies have truly cracked it when it comes to putting data to use? Which organizations have the necessary sourcing, processing, storage and application infrastructure to extract maximum value from external data? Several innovative companies have made data a core part of their business models and day-to-day processes.
Let’s look at some of the most common and most powerful applications of big data and, more importantly, at the exemplary data-driven companies putting external intelligence to use.
1. Companies using Data for Artificial Intelligence & Machine Learning (AI & ML)
AI and machine learning are two of the most exciting and rapidly growing fields in technology. Companies across industries are buying AI and ML to automate tasks, improve decision-making, and gain insights from large datasets. Some of the most innovative companies in this space are leveraging external data to train AI and machine learning models.
These companies are not only creating cutting-edge technology, but also setting the stage for the next generation of data-driven businesses:
- Cogram - Cogram lets anybody query databases without writing code. Users describe a question in plain English, and Cogram generates the matching database query.
- Midjourney - Midjourney is an independent research lab. It produces a proprietary artificial intelligence program that creates images from textual descriptions.
- Let’s Enhance - Let’s Enhance is a machine-learning solution for visual content. The company has trained its AI using millions of annotated images. Let’s Enhance then offers neural networks to automatically remove noise from JPEGs, upscale 4x, and add missing details to make images look natural.
2. Companies using Data for Code & Documentation
Companies and developers are increasingly using external data to improve their code, software engineering, and documentation. By leveraging data from various sources, they can identify patterns, trends, and best practices that can inform their development processes. This can lead to more efficient coding, better documentation, and ultimately, more successful software products. Additionally, companies can use external data to monitor their code and identify potential security vulnerabilities or areas for improvement.
Whether the external data is open source or commercial, data in code and documentation is becoming a critical component of successful software development, as it is for these companies:
- Moderne - Moderne is a software modernization company that focuses on automating code migration and remediation for composed software systems. It accelerates software development through continuous software modernization across an organization’s codebase.
- Warp - Warp’s company goal is to re-create the command line as a modern app, making a more usable, humane and, ultimately, more powerful CLI for developers. Using data, Warp is developing a Rust-based terminal which the company wants to work as well as possible out of the box, but that is also completely customizable and tweakable for the advanced user.
- TabNine - TabNine is an all-language autocompleter. It uses data gathered from records of code to build deep learning algorithms which help engineers write their own code faster.
3. Companies using Data for Finance & Compliance
Finance and compliance are two areas where the use of data has become increasingly important. Companies that deal with large amounts of financial data, such as banks and investment firms, are using data to monitor transactions, identify potential fraud, and ensure regulatory compliance. In addition, there has been a growing trend towards regtech, or the use of technology to help companies comply with regulations. Regtech companies are using data to help automate compliance processes, monitor regulatory changes, and provide real-time risk assessments.
In short, using data in finance and compliance is essential for many companies looking to stay competitive and efficient while ensuring compliance, like these three examples:
- ComplyAdvantage - ComplyAdvantage helps firms make intelligent choices when complying with regulations relating to sanctions, money laundering (AML), terrorist financing (CFT), bribery, and corruption. The company has collected a database of individuals, organizations, and associated entities which provides real-time insight into financial crime risks.
- Upstart - Upstart is a lending platform that leverages data, artificial intelligence and machine learning to price credit and automate the borrowing process. In addition to its direct-to-consumer lending platform, Upstart provides technology to banks, credit unions and other partners via SaaS.
- Cowbell - Cowbell harnesses data and technology to provide SMEs with advance warning of cyber risk. The company’s USP is its customized cyber insurance policies, which are adaptable to the threats clients face both imminently and in the long term.
4. Companies using Data for Healthcare & Pharma
Healthcare and pharmaceutical companies have long been interested in the potential of leveraging data to improve health outcomes. With the rise of big data and advanced analytics technologies, these companies are now able to purchase, collect and analyze vast amounts of data from a variety of sources to gain insights into disease patterns, treatment effectiveness, and patient outcomes.
By investing in data to personalize treatments and develop new therapies, companies like these are working to improve patient outcomes and drive innovation in the healthcare industry:
- Clarify - Clarify's software solutions are fueled by a patient-level data set and incorporate clinical, claim, prescription, lab and socio-behavioral determinants of health data. Its analytics platform is powered by a technology stack inspired by those used in banking and logistics and provides doctors and insurers greater visibility into cost, quality, referrals, utilization, and member risk. The company is also helping life sciences organizations analyze and integrate rich data to determine the optimal sites and designs for clinical trials as well as accelerate clinical development.
- AiCure - AiCure is an AI and advanced data analytics company targeting the health care industry. AiCure uses AI to see, hear and understand how people respond to treatment across clinical trials and patient care. Clinically proven to accurately measure and modify patient behavior, AiCure’s technologies keep patients engaged and optimized to treatment, as well as assess treatment effectiveness.
- Cleerly - Cleerly is a healthcare company that uses AI-powered imaging to analyze heart scans and whose mission is to create digital care pathways to prevent heart attacks. The company integrates clinical science with AI in order to offer clinical insights to every stakeholder in the heart care pathway. Through these data-driven solutions, it aims to provide a comprehensive solution for cardiovascular disease evaluation that offers great value to the healthcare system and improves heart health for patients at risk of heart attacks.
5. Companies using Data for Human Capital & Recruitment
Companies in the human capital and recruitment space are increasingly turning to data to optimize talent acquisition and drive company growth. By leveraging data from various sources, these companies can identify patterns, trends, and best practices that can inform their recruitment processes. This includes data on job candidate matching, employee retention, and understanding of recruitment processes.
Successful recruitment and human capital companies use data to ensure more efficient matching of job candidates to open positions, better understanding of employee retention and, ultimately, more successful company growth. Here are some companies doing just that:
- Harver - Harver is a pre-employment assessment platform for hiring at scale. Its platform is designed to remove the challenges surrounding high-volume hiring, including managing large volumes of applicants, mitigating unconscious bias, and capturing the right information upfront to make data-driven hiring decisions.
- Beamery - Beamery uses large-scale data-mining and machine-learning algorithms to automate relationship tracking for recruiters. It builds recruitment CRM software that enables companies to approach recruiting like customer acquisition, from outbound prospecting and pipeline building to targeted nurture and engagement.
- Fetcher - Fetcher operates a staffing and recruiting platform that combines artificial intelligence and human expertise to fill positions quicker. Rather than using a standard database model, Fetcher's unique combination of machine learning and human intelligence creates curated batches of candidates for every open role. This sourcing model allows recruiters to spend less time in front of a computer searching or filtering for candidates, and more time connecting with candidates and hiring managers.
6. Companies using Data for Product & Automation
The masses of big data made available in recent years have provided companies with powerful tools for improving their product development processes and automating various aspects of their operations. By leveraging data, companies can gain valuable insights into customer behavior and preferences, identify trends and opportunities, and optimize their products and services to meet changing market demands. Additionally, automation technologies can help companies streamline workflows, reduce costs, and increase efficiency, allowing them to focus on delivering high-quality products and services to their customers.
As a result, companies such as these are investing heavily in data for product analytics and automation opportunities:
- Hyperscience - Hyperscience automates manual document processing for global financial services, insurance, healthcare, and government organizations. Its proprietary solution classifies documents and extracts data. Structured data files are then sent downstream for processing, decreasing wasted manual effort, and increasing output and productivity.
- Celonis - Celonis offers a process mining tool for analyzing and visualizing business processes. It helps organizations understand and improve operational process flows for business transformation.
- Workato - Workato is an enterprise automation platform that enables both business and IT teams to integrate their apps and automate business workflows without compromising security and governance. It enables companies to drive outcomes from business events. There is no coding required, and the platform utilizes machine learning and patented technology to make the creation and implementation of automation faster than traditional platforms.
7. Companies using Data for Sales & Marketing
Sales and marketing software and solutions are regularly leveraging data to optimize their products and services. They’re buying data from external data companies to better understand their target customer. By analyzing large sets of customer data, sales and marketing teams at almost any company are able to identify patterns and behaviors that can help improve the customer experience, analyze the customer journey, build better customer segments, enrich consumer databases, increase sales, and drive business growth.
These three example companies illustrate the power of data-driven decision-making in the sales and marketing industry and how it’s being used to create more effective and efficient software and solutions:
- Gong - Gong analyzes sales calls to provide insights on best practices and areas for improvement. Gong’s platform uses AI and machine learning to transcribe and analyze sales conversations, providing data-driven insights that can help sales teams improve their performance. Using voice sentiment data alongside a company’s sales database, Gong helps B2B sales teams convert more of their pipeline into closed revenue by scaling the effectiveness of their sales conversations. Authority and advice backed up by volumes of data are responsible for the success of Gong’s product.
- people.ai - People.ai is a predictive sales management platform. People.ai is helping companies improve the performance of sales teams by surfacing insights and providing automated recommendations about coaching, ramping and activity analytics across their sales organization. Underpinning people.ai’s product is sales and transaction data.
- Attentive - Attentive is an advertising services company. It provides marketing automation, growth marketing, retention marketing, audience management, messaging, and business intelligence. The company allows retail and e-commerce brands to connect with consumers through personalized communication experiences via SMS. For this, Attentive relies on consumer contact data which is up-to-date and at-scale.
But wait, this Guide to Data Companies wouldn’t be “ultimate” unless we took a look into another kind of data company which is neither buy-side nor sell-side. We said that data commerce is a two-way street. Well, that’s kind of misleading. There’s a third, crucial kind of data company needed to connect data demand with data supply: the data commerce platform. In our two-way street analogy, the data commerce platform is the very street bringing data buyers and data providers into proximity with each other. Which brings us to the final (we promise) section of the Ultimate Guide to Data Companies. Part Three: Companies which Facilitate Data Commerce.
Part Three: Companies which Facilitate Data Commerce
A data commerce platform (DCP) is a software solution used by data providers to commercialise, distribute, and monetize their data products by getting their data supply in front of data demand - globally, at scale. It’s a new category of software born out of, yet also driving, the big data revolution. Much as e-commerce bred (and is catalysed by) companies like Amazon, Shopify, and Alibaba, data commerce has given rise to various software solutions for buying and selling data. Here are the ones you need to know about:
- Data Commerce Cloud™ - Data Commerce Cloud™ (DCC) enables data providers to set up an omni-channel data selling business with ease. With one DCC account, providers can publish their data products across multiple sales channels to reach data demand in various geographies and industries. Data providers using DCC have tapped into millions in USD data revenue from clients at global brands. One of the most ambitious data companies operating today, DCC’s vision is to become the data platform to rule them all, and a go-to partner for any data company.
- Datarade Marketplace - Datarade Marketplace is the world’s largest and most easy-to-use data marketplace. There are over 4,000 data products and 1,000 samples from a network of 600 data providers covering categories from automotive data to x-ray data. Companies looking for the right data partner use Datarade Marketplace to compare products, samples, and instant-purchase datasets. They’re also able to post their request to receive proposals from providers, effortlessly.
- Nomad - Nomad Data creates a marketplace of ideas and allows the experts in the data, the data providers themselves, to match the ideas to their data. It helps providers discover new use cases for their data and find potential customers who have an identified need.
- DAWEX - Dawex is a secure platform for monetization and data exchange between organizations. It allows public players and businesses to share, acquire and monetize all kinds of data without an intermediary.
- Snowflake Marketplace - Snowflake Marketplace provides a data hub for securely collaborating around data with a selected group of members that providers invite. It lets providers publish data which can then be discovered by the consumers participating in the provider’s pre-selected exchange.
- Harbr - Harbr aims to power the world’s data products. The Harbr data commerce platform powers high-margin data businesses by enabling customized data product design and delivery at scale to drive significant revenue. Collaborating directly with customers lets providers deliver exactly what they need while also unlocking the high-margin use cases for the provider’s data.
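- AWS Data Exchange - Amazon Web Services (AWS) is a business unit within Amazon.com that provides an infrastructure platform for businesses in the form of cloud computing. Its AWS Data Exchange service lets customers find, subscribe to, and use third-party data in the cloud.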
- Eagle Alpha - Eagle Alpha focuses on facilitating the overall functioning of the global alternative data market via its platform and workflow tools. Since its inception, more than 1,000 data buyers have interacted with the platform. Eagle Alpha helps buyers understand a vast and growing marketplace of over 1,800 alternative data products. Leading funds, private equity firms, consultants, corporates, and vendors trust Eagle Alpha as their key alternative data solution provider.
- Narrative.io - Narrative helps organizations execute more efficiently on their data acquisition and data monetization objectives. Narrative's efficient and intuitive platform creates opportunities for both buyers and sellers to acquire or monetize data the way they see fit.
Conclusion: The Number of Data Companies is Only Going to Increase
This guide provided an overview of 52 data companies to know about in 2023, grouped into three categories: companies which sell data, companies which buy data, and companies which facilitate data commerce. It’s by no means exhaustive, and will require updates as data commerce becomes mainstream and more data companies emerge.
However, we hope it’s shown you some of the most innovative and exciting companies in each category, showcasing how data is being used across various industries and applications. From AI and machine learning to healthcare and finance, data is becoming increasingly essential for businesses looking to stay competitive and innovative. These data companies are paving the way for the future of data-driven business - thank you to every company featured for your contribution to big data innovation!