Voxel51's new auto-labeling technology promises to cut annotation costs by 100,000x

A groundbreaking new study from computer vision startup Voxel51 suggests that the traditional data annotation model is about to be upended. In research published today, the company reports that its new auto-labeling system achieves up to 95% of human-level accuracy while being 5,000x faster and up to 100,000x cheaper than manual labeling.

The study benchmarked foundation models such as YOLO-World and Grounding DINO on well-known datasets, including COCO, LVIS, BDD100K and VOC. Remarkably, in many real-world scenarios, models trained exclusively on AI-generated labels performed on par with, or even better than, those trained on human labels. For companies building computer vision systems, the implications are huge: millions of dollars in annotation costs can be saved, and model development cycles can shrink from weeks to hours.

A new era of annotation: from manual labor to model-driven pipelines

For decades, data annotation has been a painful bottleneck in AI development. From ImageNet to autonomous vehicle datasets, teams have relied on vast armies of human workers to draw bounding boxes and segment objects, an effort that is both expensive and slow.

The prevailing logic was simple: more human-labeled data = better AI. But Voxel51's research turns that assumption on its head.

Their approach takes pre-trained foundation models, some with zero-shot capabilities, and integrates them into a pipeline that automates routine labeling while using active learning to flag uncertain or complex cases for human review. This method dramatically reduces both time and cost.
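The routing step described above can be illustrated with a minimal sketch. It is not Voxel51's implementation: the `Detection` class, the `route_detections` helper and the 0.5 threshold are all hypothetical, and a plain confidence cutoff stands in for whatever uncertainty measure a real active-learning pipeline would use.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One model prediction (hypothetical structure for illustration)."""
    image_id: str
    label: str
    confidence: float  # model score in [0, 1]


def route_detections(detections, threshold=0.5):
    """Split predictions into auto-accepted labels and a human review queue.

    The threshold is an assumed value; in practice it would be tuned
    per dataset and per class.
    """
    auto_accepted, needs_review = [], []
    for det in detections:
        if det.confidence >= threshold:
            auto_accepted.append(det)
        else:
            needs_review.append(det)
    # Put the least confident predictions at the front of the review queue.
    needs_review.sort(key=lambda d: d.confidence)
    return auto_accepted, needs_review


preds = [
    Detection("img_001", "car", 0.93),
    Detection("img_002", "bicycle", 0.41),
    Detection("img_003", "person", 0.78),
    Detection("img_004", "traffic light", 0.22),
]
auto, review = route_detections(preds, threshold=0.5)
print([d.image_id for d in auto])    # ['img_001', 'img_003']
print([d.image_id for d in review])  # ['img_004', 'img_002']
```

Only the two low-confidence predictions reach a human, which is the mechanism by which such a pipeline would reduce manual effort.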

In one test, labeling 3.4 million objects on an NVIDIA L40S GPU took just over an hour and cost $1.18. Doing the same manually with AWS SageMaker would have taken nearly 7,000 hours and cost over $124,000. In particularly challenging cases, such as identifying rare categories in the COCO or LVIS datasets, models trained on auto-generated labels occasionally outperformed their human-labeled counterparts. This surprising result may stem from the foundation models' consistent labeling patterns and their training on large-scale internet data.
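As a quick sanity check, the headline ratios can be recomputed from the figures quoted for this test. The inputs below are the article's rounded numbers ("just over an hour" is taken as 1.0 hour, "nearly 7,000" as 7,000, "over $124,000" as 124,000), so the results are approximations rather than the study's exact values.

```python
# Figures as quoted in the article; all are rounded, so ratios are approximate.
objects_labeled = 3_400_000
auto_cost_usd = 1.18
auto_hours = 1.0          # assumption: "just over an hour" rounded down
manual_cost_usd = 124_000  # "over $124,000"
manual_hours = 7_000       # "nearly 7,000 hours"

cost_ratio = manual_cost_usd / auto_cost_usd
speed_ratio = manual_hours / auto_hours
manual_cost_per_object = manual_cost_usd / objects_labeled

print(f"cost ratio:  ~{cost_ratio:,.0f}x")   # roughly 105,000x
print(f"speed ratio: ~{speed_ratio:,.0f}x")
print(f"manual cost per object: ~${manual_cost_per_object:.3f}")
```

The cost ratio lands slightly above 100,000x, consistent with the "up to 100,000x cheaper" claim for this particular test.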

Inside Voxel51: the team transforming visual AI workflows

Founded in 2016 by Professor Jason Corso and Brian Moore at the University of Michigan, Voxel51 originally started as a consultancy focused on video analytics. Corso, a veteran of computer vision and robotics, has published over 150 academic papers and contributed extensive open-source code to the AI community. Moore, a former Ph.D. student of Corso, serves as CEO.

The turning point came when the team recognized that most AI bottlenecks were not in model design but in the data. That insight inspired them to create FiftyOne, a platform designed to help engineers explore, curate and optimize visual datasets more efficiently.

Over the years, the company has raised over $45 million, including a $12.5 million Series A and a $30 million Series B led by Bessemer Venture Partners. Enterprise adoption followed, with major clients such as LG Electronics, Bosch, Berkshire Grey, Precision Planting and RIOS integrating Voxel51's tools into their production AI workflows.

From tool to platform: the expanding role of FiftyOne

FiftyOne has grown from a simple dataset visualization tool into a comprehensive, data-centric AI platform. It supports a wide range of label formats and schemas (COCO, Pascal VOC, LVIS, BDD100K, Open Images) and integrates easily with frameworks such as TensorFlow and PyTorch.
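To show why format support matters, here is a small sketch of the bounding-box conventions involved. COCO stores boxes as `[x_min, y_min, width, height]` in pixels, while Pascal VOC uses `[x_min, y_min, x_max, y_max]`. The helper names below are hypothetical and are not part of FiftyOne's API; the third function produces coordinates relative to the image size, which is the kind of normalization a tool has to apply when unifying formats.

```python
def coco_to_voc(box):
    """COCO [x_min, y_min, width, height] -> VOC [x_min, y_min, x_max, y_max]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def voc_to_coco(box):
    """VOC [x_min, y_min, x_max, y_max] -> COCO [x_min, y_min, width, height]."""
    xmin, ymin, xmax, ymax = box
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def coco_to_relative(box, img_w, img_h):
    """Scale a pixel-space COCO box to [0, 1] relative coordinates."""
    x, y, w, h = box
    return [x / img_w, y / img_h, w / img_w, h / img_h]


coco_box = [100, 50, 200, 100]
voc_box = coco_to_voc(coco_box)
print(voc_box)                               # [100, 50, 300, 150]
print(voc_to_coco(voc_box) == coco_box)      # True
print(coco_to_relative(coco_box, 400, 200))  # [0.25, 0.25, 0.5, 0.5]
```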

More than a visualization tool, FiftyOne enables advanced operations: finding duplicate images, identifying mislabeled samples, surfacing outliers and measuring model failure modes. Its plugin ecosystem supports custom modules for optical character recognition, video question answering and embedding-based analysis.
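One of these operations, finding duplicate images, is commonly done by comparing image embeddings. The sketch below is a generic, library-free illustration and not FiftyOne's own method: the toy three-dimensional vectors, the `find_near_duplicates` helper and the 0.99 threshold are all assumptions, and real embeddings would come from a vision model and have hundreds of dimensions.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def find_near_duplicates(embeddings, threshold=0.99):
    """Return pairs of image ids whose embeddings are nearly identical.

    Compares every pair, so it is O(n^2); fine for a sketch, too slow
    for large datasets, where an approximate nearest-neighbor index
    would be used instead.
    """
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(embeddings.items(), 2):
        if cosine_similarity(vec_a, vec_b) >= threshold:
            pairs.append((id_a, id_b))
    return pairs


# Toy embeddings (made up for illustration).
embeddings = {
    "img_001": [0.90, 0.10, 0.00],
    "img_002": [0.89, 0.11, 0.01],  # almost the same direction as img_001
    "img_003": [0.00, 0.20, 0.95],
}
print(find_near_duplicates(embeddings))  # [('img_001', 'img_002')]
```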

The enterprise version, FiftyOne Teams, introduces collaboration features such as version control, access permissions and integration with cloud storage (e.g. S3), as well as with annotation tools such as Labelbox and CVAT. Notably, Voxel51 has also partnered with V7 Labs to streamline the flow between dataset curation and manual annotation.

Rethinking the annotation industry

Voxel51's auto-labeling research challenges the assumptions underlying an annotation industry worth nearly $1 billion. In traditional workflows, every image must be touched by a human, an expensive and often redundant process. Voxel51 argues that most of this labor can now be eliminated.

In their system, most images are labeled by AI, while only edge cases are escalated to humans. This hybrid strategy not only cuts costs but also yields higher overall data quality, because human effort is reserved for the most difficult or valuable annotations.

This shift mirrors a broader trend in the AI field toward data-centric AI, a methodology that focuses on optimizing training data rather than endlessly tuning model architectures.

Competitive landscape and industry reception

Investors such as Bessemer view Voxel51 as the "data orchestration layer" for AI, akin to what developer tooling did for software development. Its open-source tool has garnered millions of downloads, and its community includes thousands of ML developers and teams around the world.

While other startups, such as Snorkel AI, Roboflow and Activeloop, also focus on data workflows, Voxel51 stands out for its breadth, open-source ethos and enterprise-grade infrastructure. Rather than competing with annotation providers, Voxel51's platform complements them, making existing services more efficient through selective curation.

Future implications

The long-term implications are profound. If widely adopted, Voxel51's methodology could dramatically lower the barrier to entry for computer vision, democratizing the field for startups and researchers who lack large labeling budgets.

Beyond saving costs, this approach also lays the foundation for continuous learning systems, in which models in production automatically flag failures that are then reviewed, relabeled and folded back into the training data, all within the same orchestrated pipeline.

The company's broader vision aligns with how AI is evolving: not just smarter models, but smarter workflows. In that vision, annotation is not dead, but it is no longer the domain of brute-force labor. It is strategic, selective and driven by automation.
