Data Fingerprinting to enable Incremental Improvement in Machine Learning Complexity


Many startups would like to incorporate a machine learning component into their products. Most of these products are unique in terms of the business, the data required to train the machine learning models, and the data that can be collected. One of the main challenges these startups face is the availability of data specific to their business problem. Unfortunately, the quality of machine learning algorithms depends on the quality of the domain-specific data used to train them, and generic data sets are not useful for the unique problems these startups are solving. As a result, they cannot roll out a feature involving machine learning until they have collected enough data. On the other hand, customers ask for the product feature before their usage can generate the required data. In such a situation, one needs to roll out a machine learning solution incrementally. For this to happen, there must be a synergy between the data and the algorithms that are able to process it. To enforce this synergy, we propose a computational model that we refer to as “Data Fingerprinting”.

Performance Testing for Serverless Architecture (AWS)


This document provides a brief description of how JMeter can be used with a serverless architecture on AWS. It can be used to evaluate the read/write capacity required and to benchmark the load an application can withstand.

What are AWS Lambdas?

The code is deployed to AWS Lambda in the cloud as one or more Lambda functions. A compute service then runs the code on our behalf.
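As a rough illustration of what such a function looks like, here is a minimal sketch of a Python handler. The `lambda_handler(event, context)` shape is the usual convention for the Python runtime; the `name` field in the event and the response layout are assumptions made up for this example, not something taken from the post.

```python
import json


def lambda_handler(event, context):
    """Entry point that the Lambda service calls once per invocation."""
    # "name" is a hypothetical field in the incoming event, used only for illustration.
    name = (event or {}).get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local smoke test; in AWS the service supplies event and context itself.
    print(lambda_handler({"name": "JMeter"}, None))
```

A load test would typically call a function like this through whatever trigger fronts it, rather than importing it directly.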

Object detection with Turi Create and augmentation using ARKit


Over the past few years, the use of Machine Learning to solve complex problems has been increasing. Machine learning (ML) is a field of computer science that gives computer systems the ability to “learn” (i.e. progressively improve performance on a specific task) with data, without being explicitly programmed.

Last year was a good year for the freedom of information, as titans of the industry Google, Microsoft, Facebook, Amazon, Apple and even Baidu open-sourced their ML frameworks. In this blog, let’s explore a framework provided by Apple named Turi Create.

Creating time-based index in Elasticsearch using NEST (.NET clients for Elasticsearch)


One of the most common use cases in Elasticsearch is to create time-based indexes for logs. In this blog, we will see how to create a time-based index at run time using NEST (a .NET client for Elasticsearch).

When it comes to logging, we usually create a new log file every day to isolate the logs and retrieve only the ones relevant for analysis when required. If we store the logs in a relational database, we commonly have a single table. Over time, the entries in this table grow, and to keep the number of records in check, we usually delete old records from the table at regular intervals.
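The idea of one index per day can be sketched independently of any client library. The snippet below is a minimal Python illustration, not NEST code: the `logs` prefix, the `yyyy.MM.dd` suffix, and both helper names are assumptions chosen for the example. It derives the index name from a date and picks out indexes older than a retention window, which is the index-level counterpart of deleting old rows from a table.

```python
from datetime import date, datetime, timedelta


def daily_index_name(prefix, day):
    """Build an index name such as 'logs-2018.05.21' for the given date."""
    return f"{prefix}-{day:%Y.%m.%d}"


def expired_indexes(prefix, today, existing, keep_days):
    """Return the names in `existing` whose date suffix is older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in existing:
        if not name.startswith(prefix + "-"):
            continue  # belongs to another index family
        try:
            day = datetime.strptime(name[len(prefix) + 1:], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date in the assumed format
        if day < cutoff:
            expired.append(name)
    return expired


if __name__ == "__main__":
    print(daily_index_name("logs", date.today()))
```

With names built this way, old data can be dropped one whole index at a time instead of record by record.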

OCR implementation in Android

What is OCR?

Optical character recognition (also called optical character reader), or OCR, is the process of reading printed or handwritten text and converting it into machine-encoded text. OCR is mainly used in the fields of artificial intelligence, pattern recognition, and computer vision.

So how does it work? In simple words, for a computer, an image is nothing but a collection of pixels. In OCR processing, the image is scanned for light and dark areas to identify each character.
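To make the light-and-dark idea concrete, here is a toy Python sketch, assuming a grayscale image given as rows of 0 to 255 values and a fixed threshold of 128. The function names are hypothetical, and real OCR engines do far more than this (and this is not the Android implementation the post goes on to describe). It marks dark pixels, counts them per column, and treats blank columns as gaps between characters.

```python
def binarize(pixels, threshold=128):
    """Map each grayscale value to 1 (dark) or 0 (light)."""
    return [[1 if value < threshold else 0 for value in row] for row in pixels]


def column_profile(binary):
    """Count the dark pixels in every column of the binarized image."""
    return [sum(column) for column in zip(*binary)]


def split_columns(profile):
    """Return (start, end) column ranges that contain dark pixels.

    Ranges are separated by fully blank columns; end is exclusive.
    """
    spans = []
    start = None
    for index, count in enumerate(profile):
        if count > 0 and start is None:
            start = index
        elif count == 0 and start is not None:
            spans.append((start, index))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


if __name__ == "__main__":
    image = [
        [255, 0, 255, 255, 0, 0],
        [255, 0, 255, 255, 0, 255],
    ]
    print(split_columns(column_profile(binarize(image))))
```

Each returned range is a candidate character region that a recognizer could then classify.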