Industrial computer vision on AWS
Production-grade vision models for product counting, access control, quality inspection and industrial safety. Amazon Rekognition for cases covered by pre-trained models, custom SageMaker models for specific ones, integration with your existing IP cameras, and edge computing with NVIDIA Jetson.
Vision AI goes beyond "I detected a puppy in the photo". It means deploying a production computer vision system that runs 24×7 on industrial IP cameras, with auditable accuracy, controlled latency and predictable costs. Caleidos designs, trains and operates vision models on AWS: Amazon Rekognition for cases covered by pre-trained models, SageMaker for custom models (YOLO, Detectron2, EfficientDet), Kinesis Video Streams for ingestion, and IoT Greengrass with NVIDIA Jetson for edge inference where latency matters. Typical cases: automatic SKU counting on production lines, facial-recognition access control, visual quality inspection, industrial safety (PPE, intrusion, risk behaviors).
What you get with Caleidos
Rekognition for fast time-to-value
For cases covered by pre-trained models (people, vehicles, text, common objects, faces, content moderation), Amazon Rekognition delivers results in days, not months. You pay only for actual usage.
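As an illustration of how little code a pre-trained case needs, here is a minimal sketch that counts people in one frame with the Rekognition `DetectLabels` API. The function name `count_people` and the 80% confidence floor are assumptions for the example, not part of any Caleidos deliverable; the client is passed in so the function can be exercised without AWS credentials.

```python
def count_people(rekognition_client, image_bytes, min_confidence=80.0):
    """Count 'Person' instances that Rekognition DetectLabels finds in one image.

    rekognition_client: e.g. boto3.client("rekognition") (assumed to be
    configured with credentials and a region by the caller).
    image_bytes: raw JPEG or PNG bytes of a single camera frame.
    """
    response = rekognition_client.detect_labels(
        Image={"Bytes": image_bytes},
        MinConfidence=min_confidence,
    )
    for label in response.get("Labels", []):
        if label["Name"] == "Person":
            # Each instance is one detected person with its own bounding box.
            return len(label.get("Instances", []))
    return 0
```

In a real deployment the frame would come from the camera pipeline, and the confidence floor would be tuned against labeled samples during the PoC.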
Custom models with SageMaker
When the case is specific to your business (a particular SKU, quality defect or industrial part), we train custom models with SageMaker, using Amazon SageMaker Ground Truth for labeling and state-of-the-art architectures (YOLO, Detectron2, EfficientDet).
Reuse your existing IP cameras
Integration with industrial IP cameras you already have (Hikvision, Axis, Dahua, Bosch). We do not replace hardware: we ingest the RTSP stream into AWS via Kinesis Video Streams or process at the edge.
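Once a camera stream has been ingested into Kinesis Video Streams, a consumer can read it back for inference. The sketch below shows the two-step pattern (look up the data endpoint, then call `GetMedia`). The names `open_media_stream` and `client_factory` are hypothetical; in practice `client_factory` would be `boto3.client`, and the stream name is whatever was configured at ingestion.

```python
def open_media_stream(client_factory, stream_name):
    """Return the live media payload of a Kinesis Video Streams stream.

    client_factory: a callable with the same shape as boto3.client
    (assumption: the caller passes boto3.client itself).
    stream_name: name of an existing stream (hypothetical in this sketch).
    """
    # Step 1: ask the control plane which endpoint serves GetMedia.
    kvs = client_factory("kinesisvideo")
    endpoint = kvs.get_data_endpoint(
        StreamName=stream_name,
        APIName="GET_MEDIA",
    )["DataEndpoint"]

    # Step 2: open the media client against that endpoint and start at "now".
    media = client_factory("kinesis-video-media", endpoint_url=endpoint)
    response = media.get_media(
        StreamName=stream_name,
        StartSelector={"StartSelectorType": "NOW"},
    )
    # The payload is a streaming body of MKV fragments to be parsed downstream.
    return response["Payload"]
```

Parsing the MKV fragments into frames is left out here; that step depends on the decoder chosen for the project.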
Edge computing with NVIDIA Jetson
When latency matters (real-time control, sites with intermittent connectivity, limited bandwidth), we deploy models on NVIDIA Jetson at the edge with AWS IoT Greengrass for fleet management and OTA updates.
How we work
Use case discovery
Together with you, we define the use case, KPIs (precision, recall, latency, throughput), constraints (environment, lighting, angles), number of cameras, coverage and expected SLA.
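To make the precision and recall KPIs concrete, the following sketch computes both from detection counts. The function name and the sample numbers are illustrative only.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from detection counts.

    precision = share of reported detections that were correct.
    recall    = share of real objects that were detected.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Illustrative numbers: 90 correct detections, 10 false alarms, 30 misses.
precision, recall = precision_recall(90, 10, 30)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A safety case usually weighs recall more heavily (a missed intrusion is costly), while a counting case may care about both equally; that trade-off is what the discovery step settles.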
Proof of Concept (PoC)
In 3 to 6 weeks we validate the end-to-end architecture with real samples: camera ingestion, inference (Rekognition or a prototype model), business rules and alerts. We report accuracy metrics against ground truth.
Custom training (when applicable)
For cases requiring a custom model: labeling with Amazon SageMaker Ground Truth or specialized tools, training with SageMaker (YOLO, Detectron2, EfficientDet), and iteration until the model reaches the agreed accuracy SLO.
Production deployment
Deployment in the cloud (SageMaker endpoints, Lambda) or at the edge (NVIDIA Jetson + IoT Greengrass), depending on the required latency. Integration with downstream systems (ERP, alerts, dashboards).
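For the cloud path, calling a deployed SageMaker endpoint can look roughly like the sketch below. The function name, the `application/x-image` content type and the JSON response are assumptions: the actual request and response formats depend on the model container behind the endpoint.

```python
import json


def detect_on_endpoint(runtime_client, endpoint_name, image_bytes):
    """Send one frame to a SageMaker endpoint and return the parsed response.

    runtime_client: e.g. boto3.client("sagemaker-runtime").
    endpoint_name: name of a deployed endpoint (hypothetical in this sketch).
    Assumes the model container accepts raw image bytes and answers in JSON.
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/x-image",
        Body=image_bytes,
    )
    # The response body is a stream; read it fully before decoding.
    return json.loads(response["Body"].read())
```

The same function could run inside a Lambda handler triggered by new frames, with the result forwarded to the ERP, alerting or dashboard integration.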
Continuous operation
Caleidos Lens© 24×7 operates the platform with model drift monitoring, retraining schedule, version management, inference observability and on-call for incidents.
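One simple drift signal, shown here only as an illustration and not as the method Caleidos Lens uses, is a drop in mean detection confidence between a baseline window and a recent window. The function name and the 0.05 threshold are assumptions for the sketch.

```python
from statistics import mean


def confidence_drift(baseline_scores, recent_scores, max_drop=0.05):
    """Flag drift when mean confidence falls by more than max_drop.

    baseline_scores: confidences recorded when the model was validated.
    recent_scores: confidences from the most recent production window.
    Returns (drifted, drop), where drop is baseline mean minus recent mean.
    """
    drop = mean(baseline_scores) - mean(recent_scores)
    return drop > max_drop, drop


drifted, drop = confidence_drift([0.92, 0.90, 0.91], [0.80, 0.78, 0.82])
print(f"drifted={drifted} drop={drop:.3f}")
```

Confidence alone can miss drift (a model can be confidently wrong), so in practice this kind of check is paired with periodic re-labeling of production samples.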
Industrial Vision AI cases
Counting, access control, quality inspection
Vision AI implementations on AWS for automatic product counting on production lines, facial-recognition access control, visual quality inspection and industrial safety. Stack: Rekognition, custom SageMaker models and existing IP cameras.
Read full case →
Tech stack
What we get asked the most
When to use Amazon Rekognition vs a custom model?
Rekognition when the case is covered by its pre-trained models: people, vehicles, text, common objects, faces and face comparison, content moderation. It is the fastest and most economical option. A custom model (SageMaker) when you need to detect something specific to your business: a particular SKU, quality defect or industrial part, or custom behaviors. We make the call during discovery, using real samples.
Does it work with our existing IP cameras?
Yes, in most cases. We work with standard industrial cameras (Hikvision, Axis, Dahua, Bosch and other RTSP/ONVIF-compatible models). The stream is ingested into AWS via Kinesis Video Streams or processed locally at the edge. There is no need to replace your existing hardware.
What level of accuracy is achieved?
It depends on the case. With good lighting, controlled angles and a sufficiently large dataset, typical figures are 95-99% on common-object detection with pre-trained models and 85-95% on custom models for specific cases. We define the acceptable accuracy SLO with you and design the architecture to meet it (more data, better labeling, model ensembles, human-in-the-loop review for borderline cases).
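Human-in-the-loop review for borderline cases usually comes down to confidence thresholds. A minimal sketch follows, with hypothetical threshold values that would be set per use case from PoC data.

```python
def route_detection(confidence, accept_at=0.90, reject_below=0.50):
    """Decide what happens to one detection based on its confidence score.

    Thresholds are illustrative assumptions, not recommended defaults.
    """
    if confidence >= accept_at:
        return "auto_accept"   # confident enough to act on automatically
    if confidence < reject_below:
        return "discard"       # too weak to be worth a reviewer's time
    return "human_review"      # borderline: queue for a person to confirm


for score in (0.97, 0.72, 0.31):
    print(score, route_detection(score))
```

Reviewed borderline cases also make good training data, since they are exactly the samples the current model finds hard.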
How much does implementing Vision AI with Caleidos cost?
Scope and investment are defined with you after understanding your context: number of cameras, use cases, required accuracy, latency and operating model. Let's have a conversation to put together a tailored proposal.
What typical use cases do you address?
Automatic product counting on production lines or in warehouses; access control by facial recognition or license-plate reading; visual quality inspection (defects on parts, packaging); industrial safety (PPE compliance, perimeter intrusion, risk behaviors); retail analytics (heatmaps, time-in-store, conversion). Each case maps to a concrete architecture.
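To give a sense of what product counting involves once objects are detected and tracked, here is a sketch of one common approach: counting tracks that cross a virtual line on the conveyor. This is an assumed technique for illustration; the function name and the input format (one list of x positions per tracked object) are hypothetical.

```python
def count_line_crossings(tracks, line_x):
    """Count tracked objects that cross a vertical line from left to right.

    tracks: dict mapping a track id to the object's x positions, one per
    frame, in time order (hypothetical output of an upstream tracker).
    line_x: x coordinate of the virtual counting line, in pixels.
    """
    count = 0
    for positions in tracks.values():
        for previous, current in zip(positions, positions[1:]):
            if previous < line_x <= current:
                count += 1
                break  # count each object at most once
    return count


tracks = {
    "box-1": [10, 40, 60],  # crosses the line between frames 2 and 3
    "box-2": [10, 20],      # never reaches the line
    "box-3": [55, 70],      # first seen after the line, not counted
}
print(count_line_crossings(tracks, line_x=50))
```

Counting each track once is what keeps a box that jitters around the line from being counted twice.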
How does Vision AI relate to Agentic AI?
Vision AI is the perception component: it converts images and video into structured data (what is there, where, how much, when). Agentic AI is the reasoning layer that acts on that data: it analyzes, decides and executes. Combined, they form fully autonomous systems: for example, an agent that sees a defect, logs it, opens a ticket, alerts the supervisor and adjusts process parameters. Learn more at /en/services/agentic-ai-aws.
Ready to get started?
Tell us about your challenge. No pitch, no commitment. Just understanding.
Free Vision AI Discovery