Image data monetization with GCP and Elastic

In today's data-driven world, extracting information and insights from every possible source is vital for businesses. The data analyzed most often comes in the form of metrics and text, but there is information to be found in other forms as well, such as video and voice recordings, images, and photos. These types are admittedly trickier to analyze, but they can offer powerful insights into client preferences, behavior, and sentiment. In this blog post we will look at how image data can be monetized using Google Cloud Platform together with the Elastic Stack to extract and analyze data, with the goal of gathering actionable insights.

Getting the data out of images

Several machine learning models are capable of analyzing images and extracting information from them; for this example we will look at Google Cloud Platform's image analysis tools, specifically Google Vision and AutoML.

The reason for using both these tools will become apparent soon.

Using Google Vision we can extract several types of information from images, such as:

  • Identifying elements (objects, people, landscapes, etc.)
  • Identifying characteristics of those elements (shape, color, size, etc.)
  • Drawing certain conclusions based on the elements and characteristics in the image (for example, the style of a person's clothes)
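As a rough sketch of the first two points, the Vision API's label detection returns a list of label annotations with confidence scores. The helper below (a hypothetical name of our own) operates on a plain-dict rendering of the JSON response and keeps only the labels above a confidence threshold; the example values are illustrative:

```python
# Sketch: filter Vision API label annotations by confidence.
# The dict shape mirrors the JSON returned by images:annotate
# ("labelAnnotations" entries with "description" and "score");
# the helper name and threshold are our own choices.

def extract_labels(annotate_response: dict, min_score: float = 0.7) -> list[str]:
    """Return label descriptions whose confidence meets min_score."""
    labels = annotate_response.get("labelAnnotations", [])
    return [l["description"] for l in labels if l.get("score", 0.0) >= min_score]

# Example response fragment (hypothetical values):
response = {
    "labelAnnotations": [
        {"description": "furniture", "score": 0.96},
        {"description": "chair", "score": 0.91},
        {"description": "indoor", "score": 0.55},
    ]
}

print(extract_labels(response))  # the low-confidence "indoor" label is dropped
```

In a real pipeline the response would come from the API client rather than a literal dict, but the filtering step is the same.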

Google Vision AutoML is a service offered by Google that enables users to create their own machine learning model for image analysis. To create such a model you will need a few things:

  1. An image database: the larger the better and, more importantly, appropriate to your use case (TIP: you can also include images that do NOT contain the objects/characteristics you are interested in, which enables the model to recognize when an image is not useful to you)
  2. A training set: this should be taken from your image database and supplemented with images from other sources. To achieve reliably high accuracy, use as many images as possible (in the tens of thousands). Your training set should be roughly 75%-80% of the total database of images used in the model.
  3. A test set: this should be the remaining 20%-25% of your image database and is used to test the accuracy of the trained model
  4. An ingestion pipeline: its purpose is to take the image analysis results and pass them on to an analysis tool, in our case the Elastic Stack
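Step 4 can be sketched as follows: each image's analysis result is wrapped as a pair of newline-delimited JSON lines for the Elasticsearch _bulk API. The index name and document fields here are illustrative assumptions, not something prescribed by the pipeline itself:

```python
# Sketch of the ingestion step: wrap one image's extracted labels as a
# bulk action for the Elasticsearch _bulk API (action line + source line).
import json

def to_bulk_lines(image_id: str, labels: list[str], index: str = "image-analysis") -> str:
    """Build the two newline-delimited JSON lines _bulk expects per document."""
    action = {"index": {"_index": index, "_id": image_id}}
    doc = {"image_id": image_id, "labels": labels}
    return json.dumps(action) + "\n" + json.dumps(doc) + "\n"

payload = to_bulk_lines("img-001", ["furniture", "chair"])
print(payload)
```

The resulting payload would be POSTed to the cluster's `_bulk` endpoint; batching many images per request keeps ingestion fast.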

An important step not mentioned so far, but critical to model training, is labeling. Labeling is the process of assigning meaningful tags to unlabeled data. The tags used depend on the use case you are trying to cover. If, for example, you are interested in distinguishing well-lit rooms from poorly lit ones, you would define two labels (say, good lighting and poor lighting) and assign one to each image in your database depending on the quality of light in the room.
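In practice, AutoML Vision labels are supplied as an import CSV where each row pairs a Cloud Storage image path with its label. The sketch below builds such rows for the lighting example; the bucket name, paths, and label strings are our own illustrations:

```python
# Sketch: build AutoML Vision import CSV rows ("gs://path,label") for
# the well-lit vs. poorly-lit rooms example. All names are illustrative.

def manifest_rows(images_by_label: dict[str, list[str]]) -> list[str]:
    """Flatten a label -> image-paths mapping into CSV rows."""
    rows = []
    for label, paths in sorted(images_by_label.items()):
        rows.extend(f"{path},{label}" for path in paths)
    return rows

rows = manifest_rows({
    "good_lighting": ["gs://my-bucket/rooms/a.jpg"],
    "poor_lighting": ["gs://my-bucket/rooms/b.jpg"],
})
print("\n".join(rows))
```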

Analyzing the data

Once the images are passed through either Google Vision or Google Vision AutoML, we gain access to a variety of information that can be used for further analysis.

To analyze this information we will use the Elastic Stack, for several reasons:

  • Fast data retrieval, even for very large datasets
  • A powerful data visualization tool in the form of Kibana
  • Native machine learning capabilities for anomaly detection and predictive analytics
  • An API-oriented design, which makes it easy to integrate your analysis results with other apps
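To make these points concrete, a minimal index mapping for the analysis documents might look like the following. The field names are our own illustration, not a required schema:

```python
# Sketch: a minimal Elasticsearch mapping for image-analysis documents.
# Keyword fields support fast filtering and Kibana aggregations; the
# date field gives the ML features a time axis for anomaly detection.

IMAGE_ANALYSIS_MAPPING = {
    "mappings": {
        "properties": {
            "image_id": {"type": "keyword"},
            "labels": {"type": "keyword"},        # e.g. ["furniture", "chair"]
            "dominant_color": {"type": "keyword"},
            "analyzed_at": {"type": "date"},
        }
    }
}
```

This body would be sent once when creating the index, after which the ingestion pipeline can bulk-load documents against it.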

Retail use case example

Siscale implemented an image analysis pipeline using both Google Vision and Google Vision AutoML to analyze images for one of the largest retailers in the US market.

The goal of the use case was to create a mobile app that clients can use to take pictures of household objects, such as furniture or kitchen appliances, and receive suggestions for similar products they could purchase. The analysis system used both Google image analysis tools to extract the object type and its characteristics. Once the analysis was complete, the results were compared against the retailer's product database using the Elastic Stack, and the most relevant matches were presented to the client in the application.
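The matching step described above can be sketched as an Elasticsearch bool query: require the extracted object type to match the product category, and boost results that share the extracted characteristics. The index field names (`category`, `attributes`) and result size are hypothetical, not taken from the actual implementation:

```python
# Sketch of the product-matching step: build an Elasticsearch query from
# the object type and characteristics extracted from a customer photo.
# Field names refer to a hypothetical product-catalog index.

def product_match_query(object_type: str, characteristics: list[str]) -> dict:
    """Require the category to match; rank by shared characteristics."""
    return {
        "query": {
            "bool": {
                "must": [{"term": {"category": object_type}}],
                "should": [{"term": {"attributes": c}} for c in characteristics],
            }
        },
        "size": 5,  # top suggestions to show in the app
    }

q = product_match_query("chair", ["wooden", "brown"])
print(q)
```

The `should` clauses do not filter anything out; they only raise the score of products that share more characteristics with the photographed object, which is what pushes the most relevant results to the top.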