Annotation Services

Empowering machines with vision: Mastering image annotation

Visual intelligence is essential for artificial intelligence (AI) in imaging: it is the ability of a model to recognise and differentiate objects and people in an image or video. To acquire it, a computer vision model must be trained and tested on annotated data. The higher the annotation quality, the better the visual intelligence of the AI model.

Image annotation is a complex and meticulous task, which, if executed correctly, can yield results tailored to your business objectives. However, low-quality image annotation can drive up operational costs without producing tangible results.

This article lists image annotation types and methods, explains use cases, and highlights the role of visual intelligence.

Image annotation types

During annotation, you add metadata to the dataset. The following image annotation types are used to train high-performing machine-learning models:

Image classification

Image classification, also called ‘tagging’, assigns a label to an image based on the object or scene it shows. A model trained on labelled images learns to recognise similar content in unlabelled ones. This annotation type is broad and identifies only the basic nature of the object.

For example, an annotator could tag an image of an office as a ‘meeting room’ or ‘coffee station’.

Object recognition/detection

Object detection identifies the presence, location, and number of objects in an image. The annotator draws a separate box or polygon around each object to differentiate it from the others. A more advanced and complex case of object recognition is medical imaging, such as CT (computed tomography) or MRI (magnetic resonance imaging) scans.

In medical science, this helps identify tumours and other malignant growths and track how they change over time.
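As a rough illustration of what an object detection annotation records, the sketch below stores each labelled object as a class name plus a box, then derives presence and count from those records. The `BoxAnnotation` type, its field names, and the sample coordinates are hypothetical and not tied to any particular annotation tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class BoxAnnotation:
    """One labelled object: class name plus box corners in pixels (hypothetical schema)."""
    label: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def summarise(annotations):
    """Return how many objects of each class were annotated in one image."""
    return Counter(a.label for a in annotations)


# Made-up annotations for a single street image.
image_annotations = [
    BoxAnnotation("car", 10, 40, 120, 110),
    BoxAnnotation("car", 200, 50, 310, 125),
    BoxAnnotation("pedestrian", 150, 30, 175, 100),
]

counts = summarise(image_annotations)
print(counts["car"])      # count: number of cars annotated
print("tree" in counts)   # presence: were any trees annotated?
```

The box coordinates carry the location, while presence and count fall out of the same records, which is why consistent labelling of every object in an image matters.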

Boundary recognition

Boundary recognition identifies the edges of objects, topographical features, and artificial boundaries in an image. For example, annotators use lines and splines to mark traffic lanes, sidewalks, and land borders, which helps train autonomous vehicles and drones.


Image segmentation

Image segmentation is an advanced technique that determines how similar or different the objects in an image are and how they change over time. The three image segmentation types are:

  • Semantic segmentation: Labels every pixel with a class, which is ideal for grouping similar objects that do not need counting. For example, the crowd and the team in a baseball match.
  • Instance segmentation: Labels each individual object separately, so it can count the number of people in the crowd in the example above.
  • Panoptic segmentation: Blends both techniques above to deliver data labelled at both the semantic and the instance level.
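The difference between the three types can be sketched on a toy image. The grids below are made-up masks, with one cell per pixel, and the class and instance ids are arbitrary values chosen for illustration.

```python
# Toy 4x6 "image": each cell is one pixel. All values are made up for illustration.
# Semantic mask: one class id per pixel (0 = background, 1 = person).
semantic = [
    [0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
]

# Instance mask: a separate id per object (0 = no object).
instance = [
    [0, 1, 1, 0, 2, 0],
    [0, 1, 1, 0, 2, 0],
    [0, 0, 0, 0, 0, 0],
    [3, 3, 0, 0, 0, 0],
]

# Semantic segmentation answers "how much of the image is person?"
person_pixels = sum(row.count(1) for row in semantic)

# Instance segmentation answers "how many people are there?"
person_ids = {i for row in instance for i in row if i != 0}

# Panoptic labelling keeps both: every pixel gets (class id, instance id).
panoptic = [
    [(c, i) for c, i in zip(sem_row, inst_row)]
    for sem_row, inst_row in zip(semantic, instance)
]

print(person_pixels, len(person_ids), panoptic[0][1])
```

The semantic mask alone cannot say how many people there are, only which pixels belong to the "person" class; the instance ids supply the count, and the panoptic pairs carry both pieces of information.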

Image annotation methods

Depending on your requirements and annotation tool, you can use one or more of these methods:

  • Bounding box: Annotators draw a 2D or 3D box around objects, such as vehicles, road signs, and pedestrians. This method suits cases where the exact shape of the object and any occlusion matter less.
  • Landmarking: Annotators plot key points on an object, such as facial features, so that models can detect emotions and expressions. This method is helpful in facial recognition applications.
  • Polygon labelling: Annotators mark a series of points along an object’s edge to trace its outline. This method helps where the exact shape of the object is important, such as a house or vegetation.
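To see why polygon labelling captures shape more precisely than a bounding box, the sketch below compares the two for one outline. The `house` coordinates and the helper names are hypothetical; the area calculation uses the standard shoelace formula for simple polygons.

```python
def bounding_box(polygon):
    """Smallest axis-aligned box (x_min, y_min, x_max, y_max) enclosing the polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def polygon_area(polygon):
    """Area of a simple polygon via the shoelace formula."""
    total = 0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


# Hypothetical outline of a house: a rectangle with a triangular roof.
house = [(0, 0), (4, 0), (4, 3), (2, 5), (0, 3)]

box = bounding_box(house)
box_area = (box[2] - box[0]) * (box[3] - box[1])
outline_area = polygon_area(house)

print(box, box_area, outline_area)
```

Here the box covers 20 square units while the traced outline covers 16, so a fifth of the box is background. That extra area is acceptable when only a rough location is needed, and a problem when the exact shape matters.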

Image annotation use cases

The following practical use cases show how these image annotation types and methods are applied:

Automotive industry

Annotate vehicles, sky, roads, trees, road signs, and other objects to help autonomous cars navigate safely. For example, a vehicle trained on this data can differentiate between a human and a tree.


Security and law enforcement

Facial recognition is critical for security and law enforcement. Annotation could help improve the effectiveness of identifying suspects in a police lineup.


Retail

A large retail shop could use 2D or 3D bounding boxes to annotate its products. Object detection algorithms trained on this data can then perform visual searches to determine the availability and quantity of items.

Real estate

Images of residential and commercial real estate contain various 3D objects, which the annotator can identify using segmentation and the cuboid tool. The cuboid captures the depth and scale of each object in an image.

Image annotation has applications in a wide range of industries, including agriculture, fashion, livestock, advertising and marketing, robotics and automation, sports, medicine, insurance, and waste management.

Visual intelligence for image annotation

High-quality annotated data is not readily available. It requires skilled annotators working on tools that give visual intelligence to machine learning models. This data is unique and is not available in the public domain.

Imparting visual intelligence in machine learning models involves collecting and annotating raw data captured through still and video cameras.

How can Infosys BPM help?

Build high-quality training data for AI at scale using a combined software and humanware model. The model is used to annotate images, text, audio, video, and sensor data in CPG, retail, media, railways, oil and gas, insurance, healthcare, and financial services.

Read more about the image segmentation annotation service at Infosys BPM.