Thursday, August 30, 2018

Interesting Careers in Big Data


Big Data & data analytics have opened a wide range of new & interesting career opportunities. There is an urgent need for Big Data professionals in organizations.

Not all these careers are new; many of them are remappings or enhancements of older job functions. For example, statisticians were formerly deployed mostly in government organizations or in sales/manufacturing for sales forecasting and financial analysis; today, statisticians have moved to the center of business operations. Similarly, business analysts have become key to data analytics, as they play the critical role of understanding business processes and identifying solutions.


Here are 12 interesting & fast growing careers in Big Data.

1. Big Data Engineer
Big Data engineers architect, build & maintain the IT systems used for storing & analyzing big data; for example, they are responsible for designing a Hadoop cluster used for data analytics. These engineers need a good understanding of computer architecture and develop the complex IT systems needed to run analytics.

2. Data Engineer
Data engineers understand the source, volume and destination of data, and have to build solutions to handle this volume of data. This could include setting up databases for handling structured data, setting up data lakes for unstructured data, securing all the data, and managing data throughout its lifecycle.

3. Data Scientist
Data Scientist is a relatively new role. Data scientists are primarily mathematicians who can build complex models from which meaningful insights can be extracted.

4. Statistician
Statisticians are masters in crunching structured numerical data & developing models that can test business assumptions, enhance business decisions and make predictions.

5. Business Analyst
Business analysts are the conduits between big data team and businesses. They understand business processes, understand business requirements, and identify solutions to help businesses. Business analysts work with data scientists, analytics solution architects and businesses to create a common understanding of the problem and the proposed solution.

6. AI/ML Scientist
This is a relatively new role in data analytics. Historically, this work was part of large government R&D programs; today, AI/ML scientists are becoming the rock stars of data analytics.

7. Analytics Solution Architects
Solution architects are the programmers who develop software solutions – leading to automation and to reports that enable faster, better decisions.

8. BI Specialist
BI Specialists understand data warehouses, structured data and create reporting solutions. They also work with business to evangelize BI solutions within organizations.

9. Data Visualization Specialist
This is a relatively new career. Big data presents a big challenge in terms of how to make sense of this vast data. Data visualization specialists have the skills to convert large amounts of data into simple charts & diagrams – to visualize various aspects of business. This helps business leaders to understand what’s happening in real time and take better/faster decisions.

10. AI/ML Engineer
These are elite programmers who can build AI/ML software based on algorithms developed by AI/ML scientists. In addition, AI/ML engineers also need to monitor AI solutions for the outputs & decisions made by AI systems and take corrective actions when needed.

11. BI Engineer
BI Engineers build, deploy, & maintain data warehouse solutions, manage structured data through its lifecycle and develop BI reporting solutions as needed.

12. Analytics Manager
This is a relatively new role, created to help business leaders understand and use data analytics and AI/ML solutions. Analytics Managers work with business leaders to smooth solution deployment and act as liaisons between the business and the analytics team throughout the solution lifecycle.

Wednesday, August 29, 2018

Customer Journey Towards Digital Banking



The bank branch as we know it, with tellers behind windows and bankers huddled in cubicles with desktop computers, is in need of a massive transformation.

Today, most customers carry a bank in their pockets in the form of a smartphone app, and a visit to an actual branch is rarely needed. Yet banks all over the world are still holding on to traditional brick-and-mortar branches.

That said, many banks are closing these branches. In 2017 alone, SBI, India's largest bank, closed 716 branches!

Today, despite all the modern mobile technologies, physical branches remain an essential part of banks' operations and customer advisory functions. Brick-and-mortar locations are still one of the leading sales channels, and even in digitally advanced European nations, between 30 and 60 percent  of customers prefer doing at least some of their banking at branches.

While banks would like to move customers to mobile banking platforms, changing customer behavior has become a major challenge. The diagram shows the 5 distinct stages of customer behavior, and banks must nudge customers along this journey.

Friday, August 24, 2018

Four Key Aspects of API Management

Today, APIs are transforming businesses. APIs are the core of creating new apps, customer-centric development and development of new business models.

APIs are at the core of the drive towards digitization, IoT, mobile-first, Fintech and hybrid cloud. This focus on APIs implies having a solid API management system in place.

API Management is based on four rock solid aspects:

1. API Portal
Online portal to promote APIs.
This is essentially the first place users will come to get registered, get all API documentation, enroll in an online community & support groups.
In addition, it is good practice to provide an online API testing platform to help customers build/test their API ecosystems.

2. API Gateway
API Gateway – securely open access to your APIs.
Use policy-driven security to secure & monitor API access, protecting your APIs from unregistered usage and malicious attacks. Enable DMZ-strength security between the consumer apps using your APIs & your internal servers (a minimal policy-check sketch appears after this list).

3. API Catalog
API Lifecycle Management: manage the entire process of designing, developing, deploying, versioning & retiring APIs.
Build & maintain the right APIs for your business.  Track complex interdependencies of APIs on various services and applications.
Design and configure policies to be applied to your APIs at runtime.

4. API Monitoring
API Consumption Management
Track the consumption of APIs for governance, performance & compliance.
Monitor customer experience and develop a comprehensive API monetization plan.
Define, publish and track usage of API subscriptions and charge-back services.
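As referenced under the API Gateway aspect above, here is a minimal sketch (in Python) of the kind of policy check a gateway performs before forwarding a request: validate the caller's API key and enforce a simple rate limit. The key store, limit and function names are illustrative assumptions, not any specific gateway product's API.

    import time

    # Illustrative policy data - in a real gateway these come from the API portal/registry.
    API_KEYS = {"key-abc123": "partner-app"}   # registered consumers (assumed)
    RATE_LIMIT = 100                            # max requests per minute (assumed)
    _request_log = {}                           # api_key -> recent request timestamps

    def authorize_request(api_key: str) -> bool:
        """Return True only if the key is registered and under its rate limit."""
        if api_key not in API_KEYS:
            return False                        # unregistered usage is rejected
        now = time.time()
        window = [t for t in _request_log.get(api_key, []) if now - t < 60]
        if len(window) >= RATE_LIMIT:
            return False                        # throttle to protect back-end servers
        window.append(now)
        _request_log[api_key] = window
        return True

    # Example: the gateway would call this before proxying to internal services.
    print(authorize_request("key-abc123"))   # True
    print(authorize_request("unknown-key"))  # False

In a real deployment these checks run in the gateway tier, with keys, quotas and policies managed through the API portal rather than hard-coded.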

Thursday, August 23, 2018

Common Options for Disaster Recovery


Disaster recovery (DR) planning is typically based on three standard types of DR sites.

In this article, let's take a look at the differences between hot, warm and cold sites in disaster recovery.

Hot site 

In a hot site approach, the organization duplicates its entire environment as the basis of its DR strategy, an approach which, as you'd expect, costs a lot in terms of investment and upkeep. Even with data duplication, keeping hot site servers and other components in sync is time-consuming. A typical hot site consists of servers, storage systems, and network infrastructure that together comprise a logical duplication of the main processing site. Servers and other components are maintained and kept at the same release and patch level as their primary counterparts. Data at the primary site is usually replicated over a WAN link to the hot site. Failover may be automatic or manual, depending on business requirements and available resources. Organizations can run their sites in "active-active" or "active-passive" mode. In active-active mode, applications at primary and recovery sites are live all the time, and data is replicated bi-directionally so that all databases are in sync. In active-passive mode, one site acts as primary, and data is replicated to the passive standby sites.

Warm site 

With a warm site approach, the organization essentially takes the middle road between the expensive hot site and the empty cold site. Perhaps there are servers in the warm site, but they might not be current. It takes a lot longer (typically a few days or more) to recover an application to a warm site than a hot site, but it’s also a lot less expensive.

Cold site 

Effectively a non‐plan, the cold site approach proposes that, after a disaster occurs, the organization sends backup media to an empty facility, in hopes that the new computers they purchase arrive in time and can support their applications and data. This is a desperate effort guaranteed to take days if not weeks. I don’t want to give you the impression that cold sites are bad for this reason. Based on an organization’s recoverability needs, some applications may appropriately be recovered to cold sites. Another reason that organizations opt for cold sites is that they are effectively betting that a disaster is not going to occur, and thus investment is unnecessary. 


Tuesday, August 21, 2018

Fundamentals of Data Management in the Age of Big Data

In the age of GDPR, and as new data regulations are put in place, companies now have to be prudent and cautious in their data management policies.

Data management, data privacy & security risks pose a great management challenge. In order to address these challenges, companies need to put proper data management policies in place. Here are eight fundamental data management policies that need to be adhered to by all companies.


Friday, August 17, 2018

4 Types of Data Analytics


Data analytics can be classified into 4 types based on complexity & value. In general, the most valuable analytics are also the most complex.

1. Descriptive analytics

Descriptive analytics answers the question:  What is happening now?

For example, in IT management, it tells how many applications are running at that instant of time and how well those applications are working. Tools such as Cisco AppDynamics, SolarWinds NPM, etc., collect huge volumes of data, analyze it, and present it in an easy-to-read and easy-to-understand format.

Descriptive analytics compiles raw data from multiple data sources to give valuable insights into what is happening now & what happened in the past. However, descriptive analytics does not say what is going wrong or explain why; it simply helps trained managers and engineers understand the current situation.

2. Diagnostic analytics

Diagnostic analytics uses real-time data and historical data to automatically deduce what has gone wrong and why. Typically, diagnostic analytics is used for root cause analysis, to understand why things have gone wrong.

Large amounts of data are used to find dependencies and relationships and to identify patterns, giving deep insight into a particular problem. For example, the Dell EMC Service Assurance Suite can provide fully automated root cause analysis of IT infrastructure. This helps IT organizations rapidly troubleshoot issues & minimize downtime.

3. Predictive analytics

Predictive analytics tells what is likely to happen next.

It uses all the historical data to identify definite patterns of events in order to predict what will happen next. Descriptive and diagnostic analytics are used to detect tendencies, clusters and exceptions, and predictive analytics is built on top of them to predict future trends.

Advanced algorithms such as forecasting models are used to make these predictions. It is essential to understand that a forecast is just an estimate whose accuracy depends heavily on data quality and on the stability of the situation, so it requires careful treatment and continuous optimization.

For example, HPE Infosight can predict what can happen to IT systems, based on current & historical data. This helps IT companies to manage their IT infrastructure to prevent any future disruptions.
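As a toy illustration of the forecasting idea (not how HPE InfoSight actually works), the sketch below fits a simple linear trend to historical disk-usage readings and extrapolates it. The readings are made-up numbers.

    import numpy as np

    # Hypothetical weekly disk-usage readings in GB (invented historical data).
    weeks = np.arange(1, 9)
    usage_gb = np.array([120, 132, 141, 155, 160, 174, 181, 196])

    # Fit a linear trend: usage ~ slope * week + intercept.
    slope, intercept = np.polyfit(weeks, usage_gb, 1)

    # Forecast the next four weeks by extrapolating the trend.
    future_weeks = np.arange(9, 13)
    forecast = slope * future_weeks + intercept
    for w, f in zip(future_weeks, forecast):
        print(f"week {w}: ~{f:.0f} GB expected")

    # Real predictive analytics would also report confidence intervals and
    # retrain continuously as new data arrives - a forecast is only an estimate.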



4. Prescriptive analytics

Prescriptive analytics is used to literally prescribe what action to take when a problem occurs.

It uses vast data sets and intelligence to analyze the outcomes of possible actions and then selects the best option. This state-of-the-art type of data analytics requires not only historical data but also external information from human experts (so-called expert systems) in its algorithms to choose the best possible decision.

Prescriptive analytics uses sophisticated tools and technologies, like machine learning, business rules and algorithms, which makes it complex to implement and manage.

For example, IBM Runbook Automation tools help IT Operations teams simplify and automate repetitive tasks. Runbooks are typically created by technical writers working for top-tier managed service providers. They include procedures for every anticipated scenario, and generally use step-by-step decision trees to determine the effective response to a particular scenario.
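To illustrate the step-by-step decision-tree idea behind prescriptive runbooks (a simplified sketch, not IBM Runbook Automation's actual format), the snippet below maps an observed condition to a recommended action. The thresholds and actions are invented for illustration.

    # A tiny, hypothetical prescriptive rule set: observed condition -> recommended action.
    def prescribe_action(cpu_pct: float, error_rate: float, disk_free_pct: float) -> str:
        if disk_free_pct < 10:
            return "Expand volume / purge logs, then re-check in 15 minutes"
        if error_rate > 5 and cpu_pct > 90:
            return "Scale out: add one application instance behind the load balancer"
        if error_rate > 5:
            return "Restart the failing service and open an incident ticket"
        if cpu_pct > 90:
            return "Throttle batch jobs and alert the capacity team"
        return "No action needed - continue monitoring"

    # Example run with made-up telemetry values.
    print(prescribe_action(cpu_pct=93, error_rate=7.2, disk_free_pct=40))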

Thursday, August 16, 2018

Successful IoT deployment Requires Continuous Monitoring


Growth of the IoT has created new challenges for business. The massive volume of IoT devices and the deluge of data they create become a challenge, particularly when IoT is a key part of business operations. These challenges can be mitigated with real-time monitoring tools that are tied into ITIL workflows for rapid diagnostics and remediation.

Failure to monitor IoT devices leads to a failed IoT deployment.

Steps in Cloud Adoption at Large Enterprises

Large enterprises have bigger challenges when it comes to migrating applications to the cloud. Migration to the cloud is an evolutionary process in most large enterprises and typically unfolds in 4 steps - not necessarily sequential; the steps can happen in sequence or in parallel.

Moving to cloud requires a complete buy-in from all business & IT teams: developers, compliance experts, procurement, and security.

The first step is all about becoming aware of cloud technologies and their implications. The IT team will need to understand:

1. What are the benefits - agility, cost savings, scalability, etc.?
2. What is the roadmap for moving to the cloud?
3. What skills will each team member need?
4. How will legacy applications work in the future?
5. Who are the partners in this journey?

The second step is all about experimentation and learning from small experiments. These are typically PoC projects which demonstrate the capability & benefits. The PoC projects are needed to get key stakeholder buy-in.

The third step is essentially migration of existing apps to the cloud - for example, moving email to the cloud or moving office apps to Office 365. These projects are becoming the norm for large enterprises, which have a rich legacy.

The fourth step demonstrates final cloud maturity. In this stage, companies deploy all new apps on the cloud, and these are cloud-only apps.

Thursday, July 26, 2018

4 Stages of Developing a Data Lake

Companies generally go through the following four stages of development when building a data lake:


Wednesday, July 25, 2018

Why Edge Computing is critical for IoT success?



Edge computing is the practice of processing data near the edge of your network, where the data is being generated, instead of in a centralised data-processing warehouse.

Edge computing is a distributed, open IT architecture that features decentralised processing power, enabling mobile computing and Internet of Things (IoT) technologies. In edge computing, data is processed by the device itself or by a local computer or server, rather than being transmitted to a data centre.

Edge computing enables data-stream acceleration, including real-time data processing without latency. It allows smart applications and devices to respond to data almost instantaneously, as it's being created, eliminating lag time. This is critical for technologies such as self-driving cars, and has equally important benefits for business.

Edge computing allows for efficient data processing: large amounts of data can be processed near the source, reducing Internet bandwidth usage. This both reduces costs and ensures that applications can be used effectively in remote locations. In addition, the ability to process data without ever putting it into a public cloud adds a useful layer of security for sensitive data.
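A minimal sketch of the bandwidth-saving idea: instead of streaming every raw sensor reading to the cloud, an edge node can aggregate locally and forward only a compact summary plus any anomalies. The reading format and threshold below are assumptions for illustration.

    from statistics import mean

    def summarize_at_edge(readings, alert_threshold=85.0):
        """Aggregate raw sensor readings locally; send only a summary + anomalies upstream."""
        return {
            "count": len(readings),
            "avg": round(mean(readings), 2),
            "max": max(readings),
            "anomalies": [r for r in readings if r > alert_threshold],
        }  # this small payload is what gets sent to the central data centre

    # Example: 1,000 raw temperature readings collapse into one small message.
    raw = [20.0 + (i % 70) for i in range(1000)]   # made-up sensor data
    print(summarize_at_edge(raw))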

Monday, July 23, 2018

8 Key Points in a Product Plan



Developing a great product is not an accident. It takes careful planning upfront in developing a Product Requirement Document (PRD).

A good PRD addresses the 8 main points listed here. This document defines what the product will be, what problem it solves, when it will be ready and how much it will cost. There is no limit on the number of pages the document can contain, but it should be comprehensive & concise.

The key to building a great product is to keep this PRD document true to its core intentions. This implies a lot of upfront work, but it is absolutely essential for success. Only if the product is well planned can one build a great product. Oftentimes, it makes sense to develop a user guide as part of the proposed solution, as this helps in developing the product. The amount of work that needs to be done upfront is huge, but it aids in every step of product development. Some companies even go into great depths of defining each small step in the project plan with weekly timelines.

"One can achieve greatness with 10000 small steps!"

Thursday, July 19, 2018

5 Pillars of Data Management for Data Analytics

Basic Data Management Principles for Data Analytics

Data is the lifeblood for Big data analytics and all the AI/ML solutions built on top.

Here are 5 basic data management principles that must never be broken.



1. Secure Data at Rest

  • Most of the data is stored in storage systems, which must be secured.
  • All data in storage must be encrypted (see the encryption sketch after this list).


2. Fast & Secure Data Access 

  • Fast access to data from databases, storage systems. This implies using fast storage servers and FC SAN networks.  
  • Strong access control & authentication is essential


3. Manage Networks for Data in Transit

  • This involves building fast networks - a 40Gb Ethernet for compute clusters and 100Gb FC SAN networks
  • Fast SD-WAN technologies ensure that globally distributed data can be used for data analytics.


4. Secure IoT Data Stream

  • IoT endpoints are often in remote locations and have to be secured.
  • Corrupt data from IoT will break Analytics.
  • Having Intelligent Edge helps in preprocessing IoT data - for data quality & security


5. Rock Solid Data backup and recovery

  • Accidents & disasters do happen. Protect against data loss & data unavailability with rock-solid data backup solutions.
  • Robust disaster recovery solutions can give zero RTO/RPO.
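Returning to pillar 1 (Secure Data at Rest): here is a minimal sketch of application-level encryption using the Python cryptography library's Fernet recipe. Real deployments typically rely on storage-array, filesystem or database encryption plus a proper key-management system; this only illustrates the principle that data should never be written in the clear.

    from cryptography.fernet import Fernet

    # In production the key would come from a key-management system (KMS/HSM),
    # never be hard-coded or stored next to the data.
    key = Fernet.generate_key()
    cipher = Fernet(key)

    record = b"customer_id=1042, balance=55000, phone=9999999999"
    encrypted = cipher.encrypt(record)      # this ciphertext is what lands on disk
    decrypted = cipher.decrypt(encrypted)   # only key holders can read it back

    assert decrypted == record
    print(encrypted[:40], b"...")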


Wednesday, July 18, 2018

Business Success with Data Analytics


Data and advanced analytics have arrived. Data is becoming ubiquitous, but several organizations are struggling to use data analytics in everyday business processes. Companies that adopt data analytics at the truest and deepest levels will have a significant competitive advantage; those who fall behind risk becoming irrelevant. Analytics has the potential to upend the prevailing business models in many industries, and CEOs are struggling to understand how analytics can help.

Here are 10 key points that must be followed to succeed.


  1. Understand how Analytics can disrupt your industry
  2. Define ways in which Analytics can create value & new opportunities
  3. Top managers should learn to love metrics and measurements
  4. Change Organizational structures to enable analytics based decision making
  5. Experiment with data driven, test-n-learn decision making processes
  6. Data Ownership must be well defined & Data Access must be made easier
  7. Invest in data management, data Security & analytics tools
  8. Invest in training & hiring people to drive analytics 
  9. Establish Organizational Benchmarks for data analytics
  10. Layout a long term road map for business success with Analytics

Friday, July 06, 2018

5 AI uses in Banks Today





1. Fraud Detection
Artificial intelligence tools improve defenses against fraudsters, allowing banks to increase efficiency, reduce compliance headcount and provide a better customer experience.
For example, if a huge transaction is initiated from an account with a history of minimal transactions, AI can stop the transaction until it is verified by a human (a simplified flagging sketch appears after this list).

2. Chatbots
Intelligent chatbots can engage users and improve customer service. AI chatbots bring a human touch, have human voice nuances and even understand the context of the conversation.
Recently, Google demonstrated an AI chatbot that could make a table reservation at a restaurant.

3. Marketing & Support
AI tools have the ability to analyze past behavior to optimize future campaigns. By learning from a prospect's past behavior, AI tools automatically select & place ads or collateral for digital marketing. This helps craft directed marketing campaigns.
Also see: https://www.techaspect.com/the-ai-revolution-marketing-automation-ebook-techaspect/

4. Risk Management
Real-time transaction data analysis, when used with AI tools, can identify potential risks in offering credit. Today, banks have access to lots of transactional data via open banking, and this data needs to be analyzed to understand micro-activities and assess the behavior of parties to correctly identify risks - for example, whether the customer has borrowed money from a large number of other banks in recent times.

5. Algorithmic Trading
AI takes data analytics to the next level. AI tools can take real-time market data/news from live feeds such as the Thomson Reuters Enterprise Platform or Bloomberg Terminal and use this information to understand investor sentiment and make real-time trading decisions. This eliminates the time gap between insight & action.
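Returning to fraud detection (use case 1 above): here is a very simplified sketch of flagging an unusually large transaction relative to an account's history. Real banking fraud systems use far richer features and machine-learning models; the threshold logic and sample amounts are assumptions.

    from statistics import mean, pstdev

    def is_suspicious(history, new_amount, z_threshold=3.0):
        """Flag a transaction that is far outside the account's historical pattern."""
        if len(history) < 5:
            return True                      # too little history: hold for human review
        mu, sigma = mean(history), pstdev(history)
        if sigma == 0:
            return new_amount > mu * 10      # flat history, huge jump
        z = (new_amount - mu) / sigma
        return z > z_threshold               # e.g. a huge debit on a quiet account

    # Made-up example: small regular payments, then a very large transfer.
    past = [450, 500, 520, 480, 510, 495, 530]
    print(is_suspicious(past, 250000))   # True  -> hold until verified by a human
    print(is_suspicious(past, 515))      # False -> allow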


Thursday, July 05, 2018

Importance of Fintech to India

When, on 8 November 2016, the Government of India announced the demonetization of all ₹500 and ₹1000 banknotes, it set off a wave of Fintech growth in India. Fintech is now mainstream and a critical segment for the future of India's economic growth.

Here are 10 reasons why Fintech is very important to India.



1. Economic Growth
The payment segment has been a major enabler of economic growth. Electronic payment systems added $300B to GDP across 70 countries between 2011 and 2015, resulting in ~2.6 million jobs per year.
Each 1% increase in electronic payments produces ~$104B in consumption of goods & services.

2. Financial Inclusion
Fintech opens up opportunities for the previously unbanked population to access modern financial instruments. For people living in poverty or at the fringes of the economy, Fintech lowers costs of Financial transactions: Lower cost of credit and other banking services.

3. Speed & Quality of Innovation
Fintech drives improvements in traditional financial services – which will replace legacy systems. Eg: Peer-to-peer lending, Robo advisors, Hi-frequency trading

4. Business Sustainability & Scalability
Fintech has made businesses sustainable & scalable. The entire e-commerce economy was built on e-payment systems, and new business models such as ridesharing (OLA, Uber, Metro Bikes, etc.) were developed on Fintech e-payment systems, which allows these businesses to scale and grow rapidly.

5. Transparency & Audits
All digital transactions are inherently auditable, hence bringing greater transparency into the system. Data sharing in real time across banks & financial institutions reduces fraud risks and reduces the cost of regulatory processes.

6. New Value Streams
New fintech technologies are creating new business opportunities. Bitcoin & other cryptocurrencies have spawned whole new businesses.

7. Market Curation & Structural Transformation 
Fintech technologies are transforming other industries. For example, healthcare record management, Real estate, land registration with Blockchain, etc. This is bringing structural reforms to businesses that were on the fringes of the regulated economy into the mainstream economy.

8. Collaborative Culture
New Fintech businesses are built in collaboration with other businesses. For example, Blockchain is based on open collaboration between members who host the shared ledger.

9. The Scale of the Industry
Fintech has grown from being a niche to mainstream. Today, Fintech companies are collectively worth more than $500 billion and directly employ millions of people.

10. Borderless Innovation
Technological innovations in Fintech can be quickly adapted across borders, creating new competition and new opportunities for existing players. This rapid innovation is bringing whole new financial hubs and opening new markets.

Wednesday, July 04, 2018

Skills Needed To Be A Successful Data Scientist

Data Scientist, the most in-demand job of the 21st century, requires multidisciplinary skills – a mix of math, statistics, computer science, communication & business acumen.


Top Challenges Facing AI Projects in Legacy Companies

Legacy companies that have been around for more than 20 years have always been slow to embrace new technologies, and this is also very true of AI technologies.

Companies reluctantly start a few AI projects, only to abandon them.

Here are the top 7 challenges AI projects face in legacy companies:



1. Management Reluctance

  • Fear of exacerbating the asymmetrical power of AI
  • Need to protect their domains
  • Pressure to maintain the status quo

2. Ensuring Corporate Accountability 

  • Internal fissures
  • Legacy processes hinder accountability on AI systems

3. Copyrights  & Legal Compliance 

  • Inability to agree on data copyrights
  • Legacy Processes hinder compliance when new AI systems are implemented


4. Lack of Strategic Vision

  • Top management lacks strategic vision on AI
  • Leaders are unaware of AI's potential
  • AI projects are not fully funded 


5. Data Authenticity

  • Lack of tools to verify data Authenticity
  • Multiple data sources
  • Duplicate Data 
  • Incomplete Data


6. Understanding Unstructured Data

  • Lack of tools to analyze Unstructured data
  • Middle management does not understand value of information in unstructured data
  • Incomplete data for AI tools


7. Data Availability

  • Lack of tools to consolidate data 
  • Lack of knowledge on sources of data
  • Legacy systems that prevent data sharing 


Monday, July 02, 2018

Benefits of Aadhaar Virtual ID




Use Aadhaar Virtual ID to Secure your Aadhaar Details

Considering the privacy of personal data, including the demographic and biometric information on the Aadhaar card, UIDAI has recently come up with a unique feature, termed the Aadhaar Virtual ID.

The Aadhaar Virtual ID offers limited KYC access, providing only as much information as is required for verification rather than the complete details of an individual's Aadhaar card.

What is an Aadhaar Virtual ID?

The Aadhaar Virtual ID is a 16-digit random number that is mapped to an individual's Aadhaar number at the back end. An Aadhaar card holder using the Virtual ID need not submit the Aadhaar number every time for verification; instead, he or she can generate a Virtual ID and use it for various verification purposes, such as for mobile numbers, bank accounts and other financial documents.

The Aadhaar Virtual ID gives access to the biometric information of an Aadhaar card holder along with basic details like name, address and photograph, which are sufficient for e-KYC. Unlike in the past, the verifying agency will not learn the 12-digit Aadhaar number or other personal details.
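Purely as an illustration of the mapping concept (this is not UIDAI's actual implementation), the sketch below generates a random 16-digit token and stores a time-limited mapping from the token to the underlying ID, so a verifier only ever sees the temporary token.

    import secrets
    import time

    _vid_map = {}   # virtual_id -> (real_id, expiry_epoch); illustrative in-memory store

    def generate_virtual_id(real_id: str, ttl_seconds: int = 24 * 3600) -> str:
        """Issue a 16-digit random token mapped to the real ID, valid for a limited time."""
        vid = "".join(secrets.choice("0123456789") for _ in range(16))
        _vid_map[vid] = (real_id, time.time() + ttl_seconds)
        return vid

    def verify(vid: str) -> bool:
        """A verifying agency only learns whether the token is currently valid."""
        entry = _vid_map.get(vid)
        return bool(entry) and time.time() < entry[1]

    vid = generate_virtual_id("123412341234")   # made-up 12-digit number
    print(vid, verify(vid))                     # e.g. '4830...'  True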

Benefits of Aadhaar Virtual ID


  1. Complete Privacy of personal Data
    eKYC can now be done without sharing Aadhaar number
    All private information: biometric, DOB, address are private
     
  2. User has complete control on sharing Aadhaar ID details
    Only the Aadhaar card holder can generate virtual ID
    Only the Aadhaar card holder can share virtual ID
    Aadhaar Virtual ID expires after a pre-set time, preventing misuse
     
  3. Automates the eKYC verification process in the backend
    Simplifies agencies' task of individually verifying KYC data
    Web-based verification is fast and reliable for real-time business applications

Big Data Analytics for Digital Banking


Big Data has a huge impact on banking, especially in the era of digital banking.

Here are six main benefits of data analytics for banks.



1. Customer Insights

Banks can follow customers' social media & gain valuable insights into customer behavior patterns.
Social media analysis gives more accurate insights than traditional customer surveys.
Social media analysis can be near real time, thus helping banks understand customer needs better.

2. Customer Service

Big data analysis of customers' historical data and current web data can be used to identify customer issues proactively and resolve them even before the customer complains.
E.g., analyzing customers' geographical data can help banks optimize ATM locations.

3. Customer Experience  

Banks can use big data analytics to customize websites in real time, to enhance customer experience.
Banks can use analytics to send real-time messages/communications regarding account status, etc.
With big data analytics, banks can be proactive in enhancing customer service.

4. Boosting Sales

Social media analysis gives more accurate insights into customers' needs and helps promote the right banking products to customers. For example, customers looking at housing advertisements and discussing housing finance on social media are most likely in need of a housing loan.
Data analytics can accurately assess customers' needs, and banks can promote the right types of solutions.

5. Fraud Detection

Big Data analysis can detect fraud in real time and prevent it
Data from third parties and banking networks holds valuable information about customer interactions.

6. New Product Introduction

Big Data analysis can identify new needs and develop products that meet those needs
Eg: Mobile Payment services, Open Bank APIs, ERP Integration gateways, International currency exchange services etc are all based on data analytics

Friday, June 22, 2018

Blockchain for Secure Healthcare Records



Individual users' health records - both user-generated data from activity monitors (fitness bands, mobile apps), home medical devices, etc., and co-generated data from hospitals, testing laboratories, insurance companies, etc. - are sensitive, and it is becoming important to protect this data and regulate access to it.

Patient health records are sensitive data, and there are regulations that must be followed. Globally, there has been a heightened need to increase privacy and limit access to patients' health records to avoid misuse and abuse.

Here is one use case for a blockchain-based record vault, which enables users to secure their own health records & information.

At this point in time, there are no major solutions which offer this; hence this is merely an idea today, one which can become a potential product/solution in the near future.

There are several benefits of such a system:


  1. Secure HIPAA-compliant data access.
  2. The user – i.e., the patient – owns all of his or her data & can choose with whom to share it.
  3. Doctors & healthcare professionals get authenticated data on all patient records – when needed.
  4. Simplifies hospital data record management – the hospital has data only when the patient has approved it & for an agreed duration of time.
  5. Drug research organizations get authentic data, which speeds up research.
  6. Patients' insurance claims are faster due to secure & authentic data, resulting in faster disbursal.
  7. Retail & pharma companies can buy user data directly from the patient/user, thus helping patients monetize their own data. 

Blockchain's smart contract system allows users to regulate who has access and for how long. Users can thus monetize their own health records and also prevent misuse of their health records.
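As a rough illustration of the smart-contract idea (sketched here in plain Python rather than on an actual blockchain), an access grant is just a ledger entry that names who may read a record and until when; expired or missing grants mean no access. The names and durations are invented.

    import time

    ledger = []   # stand-in for an append-only blockchain ledger

    def grant_access(patient: str, grantee: str, record_id: str, days: int):
        """Patient appends a time-limited access grant for one of their records."""
        ledger.append({
            "patient": patient,
            "grantee": grantee,
            "record_id": record_id,
            "expires": time.time() + days * 86400,
        })

    def can_access(grantee: str, record_id: str) -> bool:
        """Access is allowed only while an unexpired grant exists on the ledger."""
        now = time.time()
        return any(e["grantee"] == grantee and e["record_id"] == record_id
                   and e["expires"] > now for e in ledger)

    grant_access("patient-001", "dr-rao", "mri-2018-06", days=30)
    print(can_access("dr-rao", "mri-2018-06"))     # True, for 30 days
    print(can_access("pharma-co", "mri-2018-06"))  # False, no grant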

Wednesday, June 20, 2018

Data Life Cycle Management in the Age of Big Data




Organizations are eager to harness the power of big data. Big data creates tremendous opportunities and challenges. 

The data lifecycle stretches through multiple phases as data is created, used, shared, updated, stored and eventually archived or defensibly disposed of. Data lifecycle management plays an especially key role in three of these phases of data's existence:

1. Disclose Data
2. Manipulate Data
3. Consume Data

Organizations can benefit from data only if they can manage the entire data lifecycle, focus on good governance, use, share and monetize data.

Tuesday, June 19, 2018

How Machine Learning Aids New Software Product Development





Developing new software products has always been a challenge. The traditional product management processes for developing new products take a lot more time/resources and cannot meet the needs of all users. With new Machine Learning tools and technologies, one can augment traditional product management with data analysis and automated learning systems and tests.

Traditional New Product Development process can be broken into 5 main steps:

1. Understand
2. Define
3. Ideate
4. Prototype
5. Test

In each of the five steps, one can use data analysis & ML techniques to accelerate the process and improve the outcomes. With Machine Learning, the new 5 step program becomes:


  1. Understand – Analyze: Understand user requirements and analyze user needs from user data. In the case of web apps, one can collect huge amounts of user data from social networks, digital surveys, email campaigns, etc.
  2. Define – Synthesize: Defining user needs & user personas can be enhanced by synthesizing users' behavioral models based on data analysis.
  3. Ideate – Prioritize: Developing product ideas and prioritizing them becomes a lot faster and more accurate with data analysis on customer preferences.
  4. Prototype – Tuning: Prototypes demonstrate basic functionality, and these prototypes can be rapidly and automatically tuned to meet each customer's needs. This aids in meeting the needs of multiple customer segments. Machine-learning-based auto-tuning of software allows for rapid experimentation, and data collected in this phase can help the next stage.

  5. Test – Validate: Prototypes are tested for user feedback. ML systems can receive feedback and analyze results for product validation and model validation. In addition, ML systems can auto-tune, auto configure products to better fit customer needs and re-test the prototypes.


Closing Thoughts


For a long time, product managers had to rely on their understanding of user needs. Real user data was difficult to collect and product managers had to rely on surveys and market analysis and other secondary sources for data. But in the digital world, one can collect vast volumes of data, and use data analysis tools and Machine learning to accelerate new software product development process and also improve success rates.

Thursday, June 14, 2018

Securing Containers and Microservices with HPE ProLiant Servers



Cloud-native software built on container technologies and microservices architectures is rapidly modernizing applications and infrastructure, and containers are the preferred means of deploying microservices. Cloud-native applications and infrastructure require a radically different approach to security. In cloud-native applications, service-oriented architectures based on microservices are commonly employed. These microservices run in containers, and each container has to be individually secured.

This calls for new ways to secure applications; one needs to start with a comprehensively secure infrastructure, a container management platform, and tools to secure cloud-native software to address the new security paradigm.

This article proposes one such solution: running VMware Instantiated Containers on HPE ProLiant Gen10 DL 325 & DL 385 servers with AMD EPYC processors can address these security challenges.

HPE ProLiant Gen10 DL 325 & DL 385 servers with AMD EPYC processors provide a solid security foundation. HPE's silicon root of trust, a FIPS 140-2 Level 1 certified platform and AMD's Secure Memory Encryption provide the foundation layer for a secure IT infrastructure.

About AMD EPYC Processor

The AMD EPYC processor is an x86-architecture server processor from AMD, designed to meet the needs of today's software-defined data centers. The AMD EPYC SoC bridges the gaps with innovations designed from the ground up to efficiently support existing and future data center requirements.

AMD EPYC SoC brings a new balance to your data center. The highest core count in an x86-architecture server processor, largest memory capacity, most memory bandwidth, and greatest I/O density are all brought together with the right ratios to get the best performance.

AMD Secure Memory Encryption

The AMD EPYC processor incorporates a hardware AES encryption engine for inline encryption & decryption of DRAM. The AMD EPYC SoC uses a 32-bit micro-controller (ARM Cortex-A5), which provides cryptographic functionality for secure key generation and key management.
Encrypting main memory keeps data private from malicious intruders with access to the hardware. Secure Memory Encryption protects against physical memory attacks. A single key is used for encryption of system memory and can be used on systems with VMs or containers. The hypervisor chooses pages to encrypt via page tables, giving users control over which applications use memory encryption.

Secure Memory encryption allows running secure OS/Kernel so that encryption is transparent to applications with minimal performance impact. Other hardware devices such as Storage, Network, graphics cards etc., can access encrypted pages seamlessly through Direct Memory Access (DMA)

VMware virtualization solutions

VMware virtualization solutions, including NSX-T, NSX-V & vSAN, along with VMware Instantiated Containers, provide network virtualization, which includes security in the form of micro-segmentation and virtual firewalls for each container to provide runtime security.

Other VMware components include vRealize Suite for continuous monitoring and container visibility. This enhanced visibility helps in automated detection, prevention & response to security threats.

Securing container builds and deployment

Security starts at the build and deploy phase. Only tested & approved builds are held in the container registry – from which all container images are used for production deployment. Each container image has to be digitally verified prior to deployment. Signing images with private keys provides cryptographic assurances that each image used to launch containers was created by a trusted party.
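A minimal sketch of the sign-and-verify step using the Python cryptography library (Ed25519 keys). Real pipelines would use a dedicated image-signing tool and a registry that enforces verification; the image digest below is a made-up value.

    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
    from cryptography.exceptions import InvalidSignature

    # Build server holds the private key; the deployment environment only needs the public key.
    private_key = Ed25519PrivateKey.generate()
    public_key = private_key.public_key()

    image_digest = b"sha256:9f86d081884c7d659a2feaa0c55ad015..."  # made-up digest
    signature = private_key.sign(image_digest)                    # done once at build time

    def verify_before_deploy(digest: bytes, sig: bytes) -> bool:
        """Only launch containers whose image digest carries a valid signature."""
        try:
            public_key.verify(sig, digest)
            return True
        except InvalidSignature:
            return False

    print(verify_before_deploy(image_digest, signature))        # True -> safe to deploy
    print(verify_before_deploy(b"sha256:tampered", signature))  # False -> reject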

Harden & restrict access to the host OS. Since containers running on a host share the same OS, it is important to ensure that they start with an appropriately restricted set of capabilities. This can be achieved using kernel security features such as secure boot and secure memory encryption.

Secure data generated by containers. Data encryption starts at the memory level – even before data is written to the disk. Secure memory encryption on HPE DL 325 & 385 servers allow a seamless integration with vSAN – so that all data is encrypted according to global standards such as FIPS 140-2. In addition kernel security features and modules such as Seccomp, AppArmor, and SELinux can also be used.

Specify application-level segmentation policies.  Network traffic between microservices can be segmented to limit how they connect to each other. However, this needs to be configured based on application-level attributes such as labels and selectors, abstracting away the complexity of dealing with traditional network details such as IP addresses. The challenge with segmentation is having to define policies upfront that restrict communications without impacting the ability of containers to communicate within and across environments as part of their normal activity.

Securing containers at runtime

Runtime phase security encompasses all the functions—visibility, detection, response, and prevention—required to discover and stop attacks and policy violations that occur once containers are running. Security teams need to triage, investigate, and identify the root causes of security incidents in order to fully remediate them. Here are the key aspects of successful runtime phase security:

Instrument the entire environment for continuous visibility.  Being able to detect attacks and policy violations starts with being able to capture all activity from running containers in real time to provide an actionable "source of truth." Various instrumentation frameworks exist to capture different types of container-relevant data. Selecting one that can handle the volume and speed of containers is critical.

Correlate distributed threat indicators.  Containers are designed to be distributed across compute infrastructure based on resource availability. Given that an application may be comprised of hundreds or thousands of containers, indicators of compromise may be spread out across large numbers of hosts, making it harder to pinpoint those that are related as part of an active threat. Large-scale, fast correlation is needed to determine which indicators form the basis for particular attacks.

Analyze container and microservices behavior. Microservices and containers enable applications to be broken down into minimal components that perform specific functions and are designed to be immutable. This makes it easier to understand normal patterns of expected behavior than in traditional application environments. Deviations from these behavioral baselines may reflect malicious activity and can be used to detect threats with greater accuracy.

Augment threat detection with machine learning. The volume and speed of data generated in container environments overwhelms conventional detection techniques. Automation and machine learning can enable far more effective behavioral modeling, pattern recognition, and classification to detect threats with increased fidelity and fewer false positives. Beware solutions that use machine learning simply to generate static whitelists used to alert on anomalies, which can result in substantial alert noise and fatigue.

Intercept and block unauthorized container engine commands. Commands issued to the container engine, e.g., Docker, are used to create, launch, and kill containers as well as run commands inside of running containers. These commands can reflect attempts to compromise containers, meaning it is essential to disallow any unauthorized ones.

Automate actions for response and forensics. The ephemeral life spans of containers mean that they often leave very little information available for incident response and forensics. Further, cloud-native architectures typically treat infrastructure as immutable, automatically replacing impacted systems with new ones, meaning containers may be gone by the time of investigation. Automation can ensure information is captured, analyzed, and escalated quickly enough to mitigate the impact of attacks and violations.

Closing Thoughts

Faced with these new challenges, security professionals will need to build a new secure IT infrastructure that supports the required levels of security for their cloud-native technologies. Secure IT infrastructure must address the entire lifecycle of cloud-native applications: build/deploy & runtime. Each of these phases has a different set of security considerations, all of which must be addressed to form a comprehensive security program.

Tuesday, June 12, 2018

Aadhaar - A Secure Digital Identity Platform



A secure identity platform helps businesses such as fintech, banks, healthcare, rental services, etc. verify customers' real identities. With an Aadhaar number & a fingerprint scan, Aadhaar lets businesses accurately identify a customer for trusted transactions.

Digitization has created new business opportunities like peer-to-peer lending, robo-investing, online insurance, online gaming, digital wallets, etc. Digitization speeds up the pace of business and needs an equally fast, secure identification system.

Currently, the Aadhaar platform has over one billion identities and can be used to create new business opportunities and also to optimize existing processes. For example, companies can use the Aadhaar ID to:

1. Optimize Conversions
Fast & accurate customer verification helps mobile or fintech companies speed up converting inquiries into paying customers.

2. Deter & Reduce Fraud
Secure identification allows Fintech companies to prevent account takeover and online frauds and also detect & prevent new frauds.

3. Meet Compliance Mandates
Aadhaar provides the data necessary to comply with regulations and directives.

4. Enable new business opportunities
The Aadhaar ID system enables the new 'sharing' economy, allowing owners to share/rent their assets & earn money.

Thursday, May 24, 2018

Most Common Security Threats for Cloud Services


Cloud computing continues to transform the way organizations use, store, and share data, applications, and workloads. It has also introduced a host of new security threats and challenges. As more data and applications are moving to the cloud, the security threat also increases.

With so much data residing in the cloud, especially the public cloud, these services have become natural targets for cyber security attacks.

The main responsibility for protecting corporate data in public cloud lies not with the service provider but with the cloud customer.  Enterprise customers are now learning about the risks and spending money to secure their data and applications.


Wednesday, May 23, 2018

Build Highly Resilient Web Services


Digitization has led to new business models that rely on web services. Digital banks, payment gateways & other Fintech services are now available only on web. These web services need to be highly resilient with uptime of greater than 99.9999%

Building such highly resilient web services essentially boils down to seven key components:

High Resilient IT Infrastructure: 
All underlying IT infrastructure (compute, network & storage) must run in HA mode. High availability implies node-level resilience and site-level resilience. This ensures that a node failure or even a site failure does not bring down the web services.

Data Resilience:
All app-related data is backed up with timely snapshots and also replicated in real time to multiple sites, so that data is never lost and RPO/RTO are maintained at zero.
This ensures that the disaster recovery site is always maintained in an active state.

Application Resilience:
Web applications have to be designed for high resilience. SOA-based web apps and containerized apps are preferred over large monolithic applications.

Multiple instances of the application should be run behind a load balancer - so that workload gets evenly distributed. Load balancing can also be done across multiple sites or even multiple cloud deployments to ensure web apps are always up and running.

Application performance monitoring plays an important role in ensuring apps are available and performing as per the required SLA. Active application performance management is needed to ensure customers have a good web experience.
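A bare-bones sketch of a synthetic availability probe, using only the Python standard library: it hits each endpoint and reports status and latency so that failures surface before customers notice. The URLs and thresholds are placeholders, not real services.

    import time
    import urllib.request
    import urllib.error

    ENDPOINTS = ["https://example.com/health", "https://example.com/api/ping"]  # placeholders

    def probe(url: str, timeout: float = 5.0):
        """Return (is_up, latency_seconds) for one endpoint."""
        start = time.time()
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.status == 200, time.time() - start
        except (urllib.error.URLError, OSError):
            return False, time.time() - start

    for url in ENDPOINTS:
        up, latency = probe(url)
        print(f"{url}: {'UP' if up else 'DOWN'} ({latency:.2f}s)")
        # A real monitor would run this on a schedule, alert on DOWN or slow responses,
        # and feed results into the APM / incident-management workflow.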

Security Plan: 
Security planning implies building security features into the underlying infrastructure, applications & data. A security plan is mandatory and must be detailed enough to pass security audits and all regulatory compliance requirements.
Software-defined security is developed based on this security plan, and this helps avoid several security issues found in operations.
The security plan includes security policies such as encryption standards, access control, DMZ, etc.

Security operations: 
Once the application is in production, the entire IT infrastructure stack must be monitored for security. There are several security tools for: Autonomous Watchdogs, Web Policing, web intelligence, continuous authentication, traffic monitoring, endpoint security & user training against phishing.
IT security is always an ongoing operation and one must be fully vigilant of any security attacks, threats or weaknesses.

IT Operations Management:
All web services need constant monitoring for availability & performance. All IT systems that are used to provide a service must be monitored, and corrective and proactive actions need to be taken in order to keep the web applications running.

DevOps & Automation:
DevOps & automation are the lifeline of web apps. DevOps is used for all system updates to provide seamless, non-disruptive upgrades to web apps. DevOps also allows new features of web apps to be tested in controlled ways - like exposing new versions/capabilities to a select group of customers and then using that data to harden the apps.

Closing Thoughts

Highly resilient apps are not created by accident. It takes a whole lot of work and effort to keep web applications up and running at all times. In this article, I have mentioned just the 7 main steps needed to build highly resilient web applications; there are more, depending on the nature of the application and the business use cases, but these seven are common to all types of applications.

Tuesday, May 22, 2018

5 Aspects of Cloud Management


If you have to migrate an application to a public cloud, there are five aspects that you need to consider before migrating.



1. Cost Management
The cost of a public cloud service must be clearly understood, and the charge-back to each application must be accurate. Look out for hidden costs and demand-based costs, as these can burn a serious hole in your budget.

2. Governance & Compliance
Compliance with regulatory standards is mandatory, and you may have additional compliance requirements of your own. Service providers must proactively adhere to these standards.

3. Performance & Availability
Application performance is key. The availability/uptime of the underlying infrastructure and the performance of the IT infrastructure must be monitored continuously. In addition, application performance monitoring, both via direct methods and via synthetic transactions, is critical to knowing what customers are experiencing.

4. Data & Application Security
Data security is a must. Data must be protected against data theft, Data loss, data unavailability. Applications must also be secured from unauthorized access and DDoS attacks. Having an active security system is a must for apps running on cloud.

5. Automation & Orchestration
Automation for rapid application deployment via DevOps, rapid configuration changes and new application deployment is a must. Offering IT Infrastructure as code enables flexibility for automation and DevOps. Orchestration of various third party cloud services and ability to use multiple cloud services together is mandatory. 

Monday, May 21, 2018

AI for IT Infrastructure Management



AI is being used today for IT infrastructure management. IT infrastructure generates lots of telemetry data from sensors & software that can be used to observe and automate. As IT infrastructure grows in size and complexity, standard monitoring tools do not work well. That's when we need AI tools to manage IT infrastructure.

Like any classical AI system, an AI-based IT infrastructure management system has 5 standard steps:

1. Observe: 
Typical IT systems collect billions of data sets from thousands of sensors, collecting data every 4-5 minutes. I/O pattern data is also collected in parallel and parsed for analysis. 

2. Learn:
Telemetry data from each device is modeled along with its global connections, and the system learns each device's & application's stable and active states, as well as their unstable states. Abnormal behavior is identified by learning from the I/O patterns & configurations of each device and application.

3. Predict: 
AI engines learn to predict an issue based on pattern-matching algorithms. Even application performance can be modeled and predicted based on historical workload patterns and configurations

4. Recommend: 
Based on predictive analytics, recommendations are developed using expert systems. Recommendations are based on what constitutes an ideal environment, or on what is needed to improve the current condition.

5. Automate: 
IT automation is done via runbook automation tools, which run on behalf of IT administrators, and all details of the event & the automation results are entered into an IT ticketing system.
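To make the Observe/Learn/Predict steps a little more concrete, here is a toy sketch that learns a baseline from historical telemetry and flags readings that deviate strongly from it. Production AIOps engines model many signals and their relationships; the latency numbers below are invented.

    from statistics import mean, pstdev

    def learn_baseline(history):
        """Learn step: summarise normal behaviour from historical telemetry."""
        return mean(history), pstdev(history)

    def is_abnormal(value, baseline, z_threshold=3.0):
        """Predict step: flag readings that deviate strongly from the learned baseline."""
        mu, sigma = baseline
        return sigma > 0 and abs(value - mu) / sigma > z_threshold

    # Observe step: made-up storage-latency samples (milliseconds) collected every few minutes.
    latency_ms = [2.1, 2.3, 2.0, 2.4, 2.2, 2.1, 2.3, 2.2, 2.0, 2.4]
    baseline = learn_baseline(latency_ms)

    print(is_abnormal(2.3, baseline))    # False - within normal range
    print(is_abnormal(9.8, baseline))    # True  - would trigger a recommendation / runbook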

Sunday, May 20, 2018

5 Reasons for All Flash vSAN Storage



1. High Performance  
All-flash vSAN with a mix of NVMe and SAS SSDs allows for the superior input/output operations per second (IOPS) performance needed for all enterprise workloads. vSAN 6.7 can give more than 500K IOPS with sub-millisecond read/write latency. When compared to spinning-disk storage systems, an all-flash vSAN system wins in all performance benchmarks.

2. Enterprise-class capability & Capacity
vSAN now provides enterprise-class storage performance and security, such as encryption, dedupe, and compliance with all major standards: PCI-DSS, HIPAA, DISA STIG, FedRAMP, etc. Along with vRealize, vSAN can be used for data center automation functions to quickly provision storage, gain storage insights, manage storage resources, etc.

3. Guaranteed availability and resiliency
vSAN storage can deliver 99.9999 percent availability. All flash vSAN delivers high availability and high resilience with vSphere HA, Stretched clusters, Smart Rebuild/Rebalancing  to ensure highest data integrity.

4. Run multiple workloads 
High cluster level storage performance allows users to run multiple enterprise apps within the same cluster. Moreover, vSAN is now certified to run SAP HANA, Oracle, MS-SQL etc. This gives IT admins the confidence to run all mission critical IT apps in vSAN.  In addition to standard block storage services, vSAN with NexentaConnect can provide high-performance NFS file services - to provide unified storage solution.

5. DR & Data protection optimizes all-flash storage
Backup and recovery is always one of the highest IT priorities. VMware tools such as vSphere and other 3rd-party tools provide enterprises with the highest data protection. vSphere Replication, rapid array replication across stretched clusters, Storage vMotion, etc., leverage all-flash storage for best-in-class data protection. 

Friday, May 18, 2018

Popular AI Programming Tools

AI & Robotics based automation market is expected to cross $153 Billion by 2020. 

The majority of this value is coming from robotics and Robotic Process Automation (RPA), which is essentially based on AI technologies.

Here I have compiled a list of popular AI programming tools. Most AI tools are sets of libraries built in Python; in fact, Python is the number-one programming language for AI. In addition to Python, you can use the other tools listed below:


Software Defined Security for Secure DevOps



The core idea of DevOps is to build & deploy more applications, and to do that a whole lot faster. However, there are several security-related challenges that need to be addressed before a new application is deployed.

Software Defined Security addresses this challenge of making applications more secure - while keeping pace with business requirements for a DevOps deployment.

The fundamental concept of software-defined security is to codify all security parameters/requirements into modules which can be snapped onto any application. For example, micro-segmentation, data security, encryption policies, activity monitoring, DMZ security posture, etc. are all coded into distinct modules and offered over a service catalog.
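A skeletal sketch of the "security as snap-on modules" idea: each policy is codified once, reviewed once, and then composed into an application's deployment definition from a catalog. The module names and settings are illustrative assumptions, not any particular product's catalog.

    # Illustrative security-module catalog, each entry reviewed & approved by the security team.
    SECURITY_CATALOG = {
        "encryption_at_rest": {"algorithm": "AES-256", "key_source": "corporate-kms"},
        "micro_segmentation": {"default": "deny-all", "allow": ["app-tier->db-tier:5432"]},
        "dmz_posture":        {"ingress": ["443/tcp"], "waf": True},
        "activity_monitoring": {"log_sink": "central-siem", "retention_days": 365},
    }

    def build_deployment(app_name: str, selected_modules):
        """Snap the chosen, pre-approved modules onto an application's deployment spec."""
        unknown = [m for m in selected_modules if m not in SECURITY_CATALOG]
        if unknown:
            raise ValueError(f"Unapproved security modules requested: {unknown}")
        return {
            "app": app_name,
            "security": {m: SECURITY_CATALOG[m] for m in selected_modules},
        }

    # A developer picks modules from the catalog at deployment time.
    spec = build_deployment("payments-api", ["encryption_at_rest", "micro_segmentation"])
    print(spec["security"].keys())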

A small team of security experts can develop this code, review & validate it and make these security modules generally available for all application developers.

Application developers can select the required security modules at the time of deployment. This gives a tremendous time-to-deployment advantage, as it automates several security checks and audits that are done before deployment.

Security code review & security testing are done once, at the security module level, and thus the individual security code review of each application can be automated. This saves a tremendous amount of time during application testing, leading to faster deployment.

Software security is ever changing, so when a new standard or a security posture has to be modified, only the security modules are changed and applications can pick up the new security modules - thus automating security updates on a whole lot of individual applications. This leads to tremendous effort saving in operations management of deployed apps.


Thursday, May 17, 2018

How to select uses cases for AI automation


AI is rapidly growing and companies are actively looking at how to use AI in their organization and automate things to improve profitability.

Approaching the problem from a business management perspective, the ideal areas to automate are around the periphery of business operations, where jobs are usually routine and repetitive but need little human intelligence - like warehouse operators, metro train drivers, etc. These jobs follow a set pattern, and even if there is a mistake, either by a human operator or by a robot, the costs are very low.

Business operations tend to employ large numbers of people with minimal skills and use lots of safety systems to minimize the cost of errors. It is these areas that are usually the low-hanging fruit for automation with AI & robotics.

Developing an AI application is a lot more complex, but all apps have 4 basic steps:
1. Identify the area for automation: areas where automation solves a business problem & saves money.

2. Identify data sources: automation needs tons of data, so one needs to identify all possible sources of data and start collecting & organizing it.

3. Develop the AI application: once data is collected, AI applications can be developed. Today, there are several AI libraries and AI tools to develop new applications. My next blog talks about the popular AI application development tools.

4. Deploy & improve: once an AI tool to automate a business process is developed, it has to be deployed, monitored and checked for additional improvements - which should be part of the regular business improvement program.

Wednesday, May 16, 2018

Fintech & Rise of Digital Banks


All around the world, we are seeing a new class of banks: The digital banks. These fintech pioneers are redefining the banking industry by connecting with a new generation of mobile-first consumers.

Digital banks are an online-only version of a normal bank, offering savings and checking accounts with payment, deposit and withdrawal services - but only through the web: PC & mobile devices.

Providing low-cost banking services to a new class of customers: people who are highly mobile, tech-savvy and unbanked!

Digital banks offer three main services:

1. Payment Gateways
  • A seller service, often provided by e-commerce store or e-commerce enabler
  • Authorizes a credit card or online transfer to merchants & businesses
  • A virtual Point-of-Sale terminal for online businesses
2. E-Wallets
  • Mobile App used to make payments to other mobile wallets
  • Digital wallet can be set up to transfer funds to/from a bank account
  • Popular banking tool for unbanked.

3. Remittances
  • International money transfers between individuals
  • Nearly instant money transfers and low fees to lure customers away from traditional banks
  • Use Bitcoin or other cryptocurrencies to avoid regulatory authorities

Tuesday, May 15, 2018

Digitalization of Banks and How Blockchain Helps


The core challenges faced by the banking industry today are: the time taken to complete a transaction, securing customer data and the bank's internal data, compliance with regulations, and fraud detection & prevention.

All these challenges are essentially data & compute related. Once we understand the core data issues, solving them is relatively easy. Blockchain technology is a great solution for many of the current banking challenges.


Monday, May 14, 2018

Popular Programming Languages for Data Analytics


Data analysis is becoming a very important and exciting field to work in. To become a data scientist, one needs advanced mathematical skills, advanced statistical skills and real-world programming ability. In addition to C/C++ & Java, there are several programming languages that are designed for data analysis.

I have listed the most popular programming languages for data analysis.





Thursday, May 10, 2018

How AI Tools Help Banks


In the modern era of the digital economy, technological advancements in Machine Learning (ML) and Artificial Intelligence (AI) can help the banking and financial services industry immensely.

AI & ML tools will become an integral part of how customers interact with banks and financial institutions. I have listed 8 areas where AI tools will have the greatest impact.


Tuesday, May 08, 2018

Build Modern Data Center for Digital Banking



Building a digital bank needs a modern data center. The dynamic nature of fintech and digital banking calls for a new data center which is highly dynamic, scalable, agile, and highly available, and which offers all compute, network, storage, and security services as programmable objects with unified management.

A modern data center enables banks to respond quickly to the dynamic needs of the business. Rapid IT responsiveness is architected into the design of a modern infrastructure that abstracts traditional infrastructure silos into a cohesive virtualized, software-defined environment that supports both legacy and cloud-native applications and seamlessly extends across private and public clouds.

A modern data center can deliver infrastructure as code to application developers for even faster provisioning of both test & production deployments via rapid DevOps.

Modern IT infrastructure is built to deliver automation - to rapidly configure, provision, deploy, test, update, and decommission infrastructure and applications (both legacy, cloud-native and microservices).

Modern IT infrastructure is built with security as a solid foundation to help protect data, applications, and infrastructure in ways that meet all compliance requirements, and also offers the flexibility to rapidly respond to new security threats.