
By the EVST Editorial Team · Last updated: June 6, 2026
Machine vision lets a robot see where parts are and whether they are correct, instead of relying on fixed positions. 2D vision works from a flat image and suits flat parts, position correction, reading, and surface inspection. 3D vision captures depth and suits stacked, tilted, or randomly placed parts, including bin picking. The right choice depends on whether the task needs depth and on how varied the part presentation is.
What Machine Vision Adds to a Robot
A robot without vision repeats a fixed path and assumes every part is in the same place. Vision removes that assumption: a camera locates each part and tells the robot where to go, so parts no longer need precise fixturing. According to industry observations, vision guidance is what turns a robot from a rigid repeater into a flexible handler, and it is increasingly standard on picking, assembly, and inspection cells.
Vision also inspects. The same camera that guides a pick can check the part is present, correct, and defect-free, folding quality control into the cycle rather than adding a separate station.

2D vs 3D Vision
| Dimension | 2D vision | 3D vision |
|---|---|---|
| Data | Flat image (X, Y, rotation) | Point cloud with depth (X, Y, Z) |
| Suited to | Flat parts, position correction, reading, surface inspection | Stacked, tilted, or random parts; bin picking |
| Lighting sensitivity | High; needs controlled lighting | Lower for shape, but reflective parts are hard |
| Cost and cycle time | Lower cost, faster cycle | Higher cost, more processing time |
According to industry observations, the most common over-specification is buying 3D vision for a task that 2D would solve. If parts arrive flat and singulated, 2D is faster and cheaper. 3D earns its cost only when depth genuinely matters, as with parts at unknown heights or angles. For the hardest case, random parts in a bin, see our guide to robotic bin picking.
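The 2D-versus-3D rule above can be written as a small decision function. This is only a sketch: the `PartPresentation` fields and the `recommend_vision` name are hypothetical, chosen for illustration, and a real specification would also weigh surface, lighting, and cycle time.

```python
from dataclasses import dataclass


@dataclass
class PartPresentation:
    """Hypothetical description of how parts arrive at the cell."""
    flat: bool          # parts lie flat on a known plane
    singulated: bool    # parts do not touch or overlap
    height_known: bool  # pick height is fixed and known in advance


def recommend_vision(p: PartPresentation) -> str:
    """Return "2D" when no depth information is needed, otherwise "3D"."""
    needs_depth = not (p.flat and p.singulated and p.height_known)
    return "3D" if needs_depth else "2D"


tray = PartPresentation(flat=True, singulated=True, height_known=True)
bin_of_parts = PartPresentation(flat=False, singulated=False, height_known=False)
print(recommend_vision(tray))          # 2D
print(recommend_vision(bin_of_parts))  # 3D
```

The function deliberately defaults to 2D unless depth is actually required, which mirrors the warning about over-specifying 3D.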
The Main Vision Tasks
- Guided picking: locate parts on a belt or tray so the robot picks without fixturing.
- Position correction: adjust the robot path to the actual part position before assembly.
- Inspection: check presence, dimensions, and surface defects in the same cycle.
- Bin picking: find and pick parts piled randomly in a container, the domain of 3D.
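Position correction from the list above can be illustrated with a planar rigid transform. The sketch assumes a 2D camera that is already calibrated to the robot base frame and reports each part as X, Y, and rotation. The function name `corrected_pick` and the pose layout are hypothetical, not taken from any particular vision or robot library.

```python
import math


def corrected_pick(taught_pick, ref_part, seen_part):
    """Move a taught pick pose by the part's measured shift and rotation.

    Each pose is (x_mm, y_mm, angle_deg) in the robot base frame.
    taught_pick: pick pose taught on the reference part
    ref_part:    part pose the camera reported at teach time
    seen_part:   part pose the camera reports now
    """
    px, py, pa = taught_pick
    rx, ry, ra = ref_part
    sx, sy, sa = seen_part

    d_angle = sa - ra
    c = math.cos(math.radians(d_angle))
    s = math.sin(math.radians(d_angle))

    # Offset of the pick point from the part origin at teach time.
    ox, oy = px - rx, py - ry

    # Rotate that offset by the part's rotation, then attach it to the
    # part's newly measured position.
    x = sx + c * ox - s * oy
    y = sy + s * ox + c * oy
    return (x, y, pa + d_angle)


# Pick point taught 10 mm along X from the part origin; the part is now
# shifted by (20, 10) mm and rotated by 90 degrees.
pose = corrected_pick((110.0, 50.0, 0.0), (100.0, 50.0, 0.0), (120.0, 60.0, 90.0))
print(tuple(round(v, 3) for v in pose))  # (120.0, 70.0, 90.0)
```

The same calculation underlies guided picking: the robot keeps one taught pose per part type and the camera supplies the per-part offset.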
According to industry observations, folding inspection into a guided-pick cell is one of the highest-return uses of vision, because it removes a separate quality station while improving traceability. EVST integrates 2D and 3D vision into its robot cells, and its AI welding system already uses 3D vision to scan a workpiece and extract the weld seam without programming, an example of vision replacing manual teaching.
Specifying a Vision-Guided Cell
Four inputs drive the specification: whether depth is needed (which decides between 2D and 3D), the part’s surface (matte parts are easy; shiny or transparent parts are hard), the cycle time available for image processing, and the lighting environment. In practice, across many commissioned cells, lighting and part surface cause more vision failures than the algorithm does, so they must be assessed early. EVST scopes the camera, lighting, and integration as one cell; for the hardware side see the EVST vision-guided robot cells guide.
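The four inputs can be captured in a simple record and screened for the risks named above before any hardware is chosen. This is a minimal sketch with hypothetical field names and messages; the surface categories and the millisecond budget are assumptions for illustration, not EVST's scoping method.

```python
from dataclasses import dataclass


@dataclass
class CellSpec:
    """Hypothetical record of the four specification inputs."""
    needs_depth: bool          # decides 2D or 3D
    surface: str               # "matte", "shiny", or "transparent"
    cycle_budget_ms: float     # time available for image processing per part
    processing_ms: float       # estimated or measured processing time
    lighting_controlled: bool  # True if ambient light is shielded or fixed


def review(spec: CellSpec) -> dict:
    """Pick the sensor type and list the risks to assess early."""
    risks = []
    if spec.surface in ("shiny", "transparent"):
        risks.append("difficult surface: test with real parts")
    if not spec.lighting_controlled:
        risks.append("uncontrolled lighting: assess early")
    if spec.processing_ms > spec.cycle_budget_ms:
        risks.append("processing time exceeds cycle budget")
    return {"sensor": "3D" if spec.needs_depth else "2D", "risks": risks}


spec = CellSpec(needs_depth=False, surface="shiny", cycle_budget_ms=200.0,
                processing_ms=80.0, lighting_controlled=False)
result = review(spec)
print(result["sensor"])      # 2D
print(len(result["risks"]))  # 2
```

Listing surface and lighting as explicit risk checks reflects the point that they cause more failures than the algorithm does.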
Frequently Asked Questions
What is the difference between 2D and 3D machine vision?
2D vision works from a flat image and gives a part’s X, Y position and rotation, suiting flat parts, position correction, reading, and surface inspection. 3D vision captures depth as a point cloud (X, Y, Z), so it can handle stacked, tilted, or randomly placed parts, including bin picking. 3D costs more and needs more processing, so use it only when depth genuinely matters.
When does a robot need 3D vision?
A robot needs 3D vision when parts arrive at unknown heights or angles, are stacked, or are piled randomly in a bin. If parts are flat and singulated on a belt or tray, 2D vision is faster and cheaper. Specifying 3D for a task 2D could solve is a common and costly over-specification.
Can machine vision do inspection and guidance at once?
Yes, and it is one of the highest-return uses of vision. The same camera that locates a part for picking can check that the part is present, correct, and defect-free in the same cycle, removing a separate quality station and improving traceability. This requires the cell to be designed for both tasks from the start.
What most often causes vision systems to fail?
Lighting and part surface cause more vision failures than the algorithm does. Shiny, reflective, or transparent parts and uncontrolled ambient light make detection unreliable. These factors must be assessed early, because they often decide whether 2D or 3D is feasible and what lighting the cell needs.
Does EVST integrate vision into its robot cells?
Yes. EVST integrates 2D and 3D vision into its robot cells as a turnkey supplier. Its AI welding system uses 3D vision to scan a workpiece and extract the weld seam without programming, an example of vision replacing manual teaching. EVST scopes the camera, lighting, and integration together for the application.
About the author: This guide was prepared by the EVST Editorial Team. EVST (EVS TECH CO., LTD) is a Chengdu-based robotics manufacturer founded in 2018, integrating vision-guided industrial and collaborative robot cells exported to more than 100 countries, with CE, SGS, and TUV third-party certification.
Guidance is general; confirm camera, lighting, and cycle requirements against the application before specifying.