Data Automation – Part 1
Data automation is the process of updating data programmatically. Automating the process of data handling is important for the long-term sustainability of any data program.
There are three common elements to data automation: Extract, Transform, and Load.
- Extract: the process of extracting your data from one or more source systems.
- Transform: the process of transforming your data into the necessary structure.
- Load: the process of loading the data into the final system.
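The three steps above can be sketched in a few lines of Python. This is a minimal illustration only, not a production pipeline: the sample CSV text, the `payments` table and the function names are all assumptions made up for the example, and only the standard library is used.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in practice this would come from a file, an API or a database.
RAW_CSV = """name,amount
 alice ,10.50
BOB,3
"""

def extract(text):
    """Extract: read rows from the CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: tidy up names and convert amounts to numbers."""
    return [(row["name"].strip().title(), float(row["amount"])) for row in rows]

def load(records, conn):
    """Load: write the cleaned records into the target table (hypothetical 'payments')."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", records)
    conn.commit()

def run_pipeline(text, conn):
    """Run extract, transform and load in order."""
    load(transform(extract(text)), conn)

conn = sqlite3.connect(":memory:")  # in-memory database standing in for the final system
run_pipeline(RAW_CSV, conn)
rows = conn.execute("SELECT name, amount FROM payments ORDER BY name").fetchall()
print(rows)
```

Keeping each step in its own function means a new source or target can be swapped in without touching the other two steps.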
This write-up comes in three parts. Before diving into the technical nitty-gritty of data automation, we will first look at how to extract data, that is, the methods of collecting it. Being clear about all three elements beforehand will help you pick the right process and engage the right people at the right time within your project management initiative.
In general, data capture is the process of extracting information from a source and converting it into data readable by a computer. There are multiple methods to capture, extract and classify unstructured data. The list of data capture methods in this write-up is not exhaustive, but it is a guide to the key methods used in business process automation projects. The origins of the data should always be considered, as it may be easier to integrate or capture the original data at source than to use a processed or unstructured form of it. This is one of the key principles of process automation and business process management best practice and optimisation.
Modern solutions for automated data capture and process automation often combine one or more of the methods below to achieve maximum accuracy. The most appropriate method depends on the nature of the data to be captured and the application area.
- Manual Keying: Manual keying or data entry is still relevant with certain types of unstructured data where automated capture methods achieve low accuracy levels or volumes are so low and variable that automation is not justified.
- Nearshore keying: Nearshore keying is the same as manual keying, but instead of the task being completed in-house it is delivered by a managed service or delivery centre.
- OCR (Optical Character Recognition): OCR technology revolutionised data capture and underpinned the digitisation and automation of back-office operations. As a technology, it provides the ability to recognise machine-produced characters as part of a data capture and extraction process. Modern OCR incorporates word, zonal and document recognition, as well as AI such as pattern recognition and ML (machine learning), to deliver the most accurate recognition of computer-generated text.
- ICR (Intelligent Character Recognition): ICR is the computer translation of hand-printed and handwritten characters. A scanned image of a handwritten document is analysed and recognised by this software. ICR is like OCR but is a more difficult process, since OCR works from printed text whereas handwritten characters are more variable.
- Barcode/QR recognition: Depending on the type of barcode used, the amount of metadata that can be included or marked up can be high, as can the level of recognition. Barcodes can be applied to almost any object for a range of purposes.
- Template-based intelligent capture: Templates are used to reduce variables and the risk of failed data capture by optimising the capture process for certain document templates. This is combined with OCR and ICR and is useful where the number of different document types being received is relatively low (typically up to 30) but consistent.
- IDR (Intelligent Document Recognition): IDR also interprets and indexes different documents based on the document type, its metadata and the elements identified in the document.
- Artificial Intelligence and Data Capture: AI is ultimately an umbrella term for a range of techniques, and it is best viewed in the context of the use case and application. Examples include computer vision and image or pattern recognition, to improve the recognition of any type of image; neural networks and machine learning, to assist with accurate recognition through training on large data sets and assisted learning; and natural language processing, for interpreting sentences and their meaning.
- Hybrid Intelligent Automated Data Capture & verification services: Despite advances in data capture and AI, exceptions can happen when an automated approach is unable to confidently automate a task. A hybrid service combines AI with human intelligence to offer the highest level of automated data capture of unstructured documents as a service.
- Digital forms: When collecting information from users that doesn’t already exist, it often makes sense to capture the data through a digital form, either on the web, via an intranet page or in a smartphone app.
- Digital Signatures: A valid digital signature associated with an email or document allows a user’s identity or the authenticity of digital messages or documents to be captured. Digital signatures are often used for digital approval workflows involving parties from different companies or entities.
- Web scraping or monitoring: These tools, called web bots or crawlers (e.g. Google's spiders), crawl through web pages and code to collect, analyse and index specific data. Web scraping is used to capture and monitor anything accessible via the web.
- Screen Scraping: Screen scraping is used by RPA and other tools to navigate, interact with, and capture raw data that appears on a digital display, application, or website. Once the data is captured, it is analysed to extract elements such as text and images, and a workflow is then executed to process the data according to the configured workflow rules.
- Legacy System Integration or Data Import & Migration: If data can’t be accessed in a legacy system due to missing features or proprietary APIs, products such as Alchemy Data grabber Module allow organisations with legacy systems (mainframe systems) to ingest data for improved search and archival applications.
- OMR (Optical Mark Reading): This approach is used to capture human marked data on scanned forms, surveys, and exams.
- MICR (Magnetic Ink Character Recognition): This is a data capture technology capable of recognising characters machine-printed in magnetic ink.
- Swipe or Proximity cards: Magnetic swipe or proximity cards are used to store data. Card readers capture this data to confirm identity and control access to a building or shared device.
- Intelligent Voice Capture: The boom in smart devices has also seen the rise of voice controlled virtual assistants from the likes of Apple (Siri), Google (Google Assistant), Amazon (Alexa) and Microsoft (Cortana). These are the best examples of voice capture being used mainstream in our everyday lives for enhanced customer experiences and business processes.
- Intelligent image & video capture: Intelligent image and video data capture involves real-time analysis of images and moving image data for objects or “triggers” before executing a certain process, for example crowd and footfall analysis, sentiment analysis, facial recognition or ANPR (Automatic Number Plate Recognition).
- Augmented Reality: Augmented reality is closely linked to video analysis and involves the real-time processing of camera footage, looking for programmed “trigger” objects. If a trigger object is identified, a process is executed, for example to display an overlay graphic, video or other web data.
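To make one of the methods above concrete, here is a small sketch of the web scraping idea using only Python's standard library. It is an illustration under stated assumptions rather than a real crawler: the `LinkCollector` class and the `SAMPLE_PAGE` HTML are made up for the example, and a real scraper would first download the page and should respect the site's terms of use.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Hypothetical helper that collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

# Made-up page content standing in for a downloaded web page.
SAMPLE_PAGE = (
    '<html><body>'
    '<a href="/pricing">Pricing</a>'
    '<p>Some text</p>'
    '<a href="https://example.com/news">News</a>'
    '<a>no link here</a>'
    '</body></html>'
)

collector = LinkCollector()
collector.feed(SAMPLE_PAGE)
print(collector.links)
```

The same pattern, a parser that reacts to the tags it cares about and ignores the rest, extends to capturing prices, headlines or any other element that can be identified in the page markup.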