Introduction
This solution provides real-time processing and analytics to sense, analyze and connect with browsing shoppers and offer them contextually relevant shopping experiences.
Customer & Background
Our customer, a retailer, required a solution that analyzes high-volume in-store shopper events and correlates them with data on past preferences, learning from that data in order to offer promotions and useful product information to the shopper. The system required the following in Real-Time Mode:
- Collect in-store shopper activity data from sensors as JSON
- Detect the shopper’s exact location
- Map merchandise in relation to shopper location
- Retrieve promotions (coupons, discounts and offers) applicable to the merchandise at the shopper’s location and determine their relevance to the shopper
- Depending on relevance, send promotions and other product information to the Gateway and then on to the shopper for consumer engagement
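The real-time steps above can be sketched in plain Python as follows. This is a minimal illustration, not the delivered implementation: the zone layout, the merchandise map, the promotion list, the event fields (`shopper_id`, `x`, `y`) and the preference scores are all hypothetical, and a production version would run inside the streaming layer rather than as a single function.

```python
import json
from dataclasses import dataclass

# Hypothetical store layout: zone -> (x_min, y_min, x_max, y_max) in metres.
ZONES = {
    "dairy": (0.0, 0.0, 10.0, 5.0),
    "bakery": (10.0, 0.0, 20.0, 5.0),
}

# Hypothetical merchandise map: zone -> product categories shelved there.
MERCHANDISE = {"dairy": ["milk", "cheese"], "bakery": ["bread"]}


@dataclass
class Promotion:
    code: str
    category: str
    discount_pct: int


# Hypothetical promotions, assumed to be loaded by the batch side.
PROMOTIONS = [
    Promotion("MILK10", "milk", 10),
    Promotion("BREAD5", "bread", 5),
    Promotion("CHEESE20", "cheese", 20),
]


def locate_zone(x, y):
    """Return the name of the zone containing (x, y), or None."""
    for name, (x0, y0, x1, y1) in ZONES.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return name
    return None


def relevant_promotions(raw_event, preferences, min_score=0.5):
    """Parse one JSON sensor event and return promotion codes by relevance.

    `preferences` maps category -> affinity score in [0, 1]; here it is an
    assumed stand-in for the output of a predictive model.
    """
    event = json.loads(raw_event)                 # collect event as JSON
    zone = locate_zone(event["x"], event["y"])    # detect shopper location
    if zone is None:
        return []
    nearby = set(MERCHANDISE.get(zone, []))       # map merchandise to location
    scored = [
        (preferences.get(p.category, 0.0), p)     # score each nearby promotion
        for p in PROMOTIONS
        if p.category in nearby
    ]
    scored = [(s, p) for s, p in scored if s >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p.code for _, p in scored]            # what would go to the Gateway


event = '{"shopper_id": "s1", "x": 3.2, "y": 1.5}'
prefs = {"cheese": 0.9, "milk": 0.6, "bread": 0.8}
print(relevant_promotions(event, prefs))
```

With these sample values the shopper is in the dairy zone, so only the milk and cheese promotions are considered, and the cheese promotion ranks first because of its higher assumed affinity score.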
The system required the following in Batch Mode in order to deliver an optimal shopper experience:
- Collect store-specific promotions (coupons, discounts and offers)
- Collect up-to-date store catalog data
- Index the collected data for quick access
- Collect competitive information
- Evaluate product and price information in relation to the market/competition
- Examine user profiles and preferences for promotional relevancy
- Learn user purchase patterns and responses over time
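Two of the batch steps above, indexing catalog data and evaluating prices against the competition, can be illustrated with a short Python sketch. The record fields (`upc`, `name`, `price`), the sample values, and the choice of the median competitor price as the benchmark are assumptions made for illustration only.

```python
from statistics import median


def build_index(catalog):
    """Index catalog rows by UPC so each lookup is a single dict access."""
    return {row["upc"]: row for row in catalog}


def price_position(index, competitor_prices):
    """Return upc -> ratio of our price to the median competitor price.

    A ratio above 1.0 means we are priced above the market median.
    Products missing from the catalog or without competitor data are skipped.
    """
    position = {}
    for upc, prices in competitor_prices.items():
        item = index.get(upc)
        if item is None or not prices:
            continue
        position[upc] = round(item["price"] / median(prices), 2)
    return position


# Hypothetical sample data.
catalog = [
    {"upc": "012345678905", "name": "Milk 1L", "price": 2.20},
    {"upc": "036000291452", "name": "Bread", "price": 3.00},
]
competitors = {
    "012345678905": [2.00, 2.10, 1.90],
    "036000291452": [3.00, 3.50],
    "999999999999": [1.00],  # not in our catalog, so it is ignored
}

index = build_index(catalog)
print(price_position(index, competitors))
```

In this sample the milk is priced about 10% above the competitor median and the bread about 8% below it, which is the kind of signal that could feed promotion selection.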
VOLUMES
- Process thousands to millions of events per second. Compute relevancy using predictive models over terabytes of data.
SOLUTION OVERVIEW
Here is a list of capabilities delivered throughout the project:
- Ingestion: Achieve high-volume write speeds
- Persistence: Persist data to structured (real-time and non-real-time) and semi-structured data stores
- Collect competitive external data: Crawling the internet for data on prices, product availability and special offers
- Enrichment: Adding UPC codes to products
- Cleansing: Cleaning data by de-duping, outlier detection, inconsistency removal, missing value treatment etc.
- Transformations: Performing complex joins, data type conversions, merging or cutting, mathematical operations
- Design: User Experiences for coupons and discount offers to shoppers via alerts, notifications and promotions in real-time
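As an illustration of the cleansing capability listed above, the following Python sketch applies de-duplication, outlier detection and missing value treatment to a small set of price records. It is a sketch under stated assumptions: the record fields (`upc`, `store`, `price`), the use of a modified z-score based on the median absolute deviation, the 3.5 threshold and median imputation are choices made here for illustration, not the methods confirmed for the project.

```python
from statistics import median


def cleanse(records, threshold=3.5):
    """Return (clean_records, outlier_records) for a list of price records."""
    # 1. De-duplicate on (upc, store), keeping the first occurrence.
    seen, unique = set(), []
    for r in records:
        key = (r["upc"], r["store"])
        if key not in seen:
            seen.add(key)
            unique.append(dict(r))  # copy so the input is not mutated

    known = [r["price"] for r in unique if r["price"] is not None]
    if not known:
        return unique, []

    # 2. Outlier detection with a modified z-score (assumed method):
    #    0.6745 * |price - median| / MAD, flagged when above the threshold.
    med = median(known)
    mad = median([abs(p - med) for p in known])

    def is_outlier(price):
        if price is None or mad == 0:
            return False
        return 0.6745 * abs(price - med) / mad > threshold

    outliers = [r for r in unique if is_outlier(r["price"])]
    clean = [r for r in unique if not is_outlier(r["price"])]

    # 3. Missing value treatment: impute the median of the known prices.
    for r in clean:
        if r["price"] is None:
            r["price"] = med
    return clean, outliers


# Hypothetical sample data.
records = [
    {"upc": "A", "store": "s1", "price": 2.0},
    {"upc": "A", "store": "s1", "price": 2.0},    # duplicate
    {"upc": "A", "store": "s2", "price": 2.1},
    {"upc": "A", "store": "s3", "price": None},   # missing value
    {"upc": "A", "store": "s4", "price": 1.9},
    {"upc": "A", "store": "s5", "price": 200.0},  # likely data-entry error
    {"upc": "A", "store": "s6", "price": 2.2},
]
clean, outliers = cleanse(records)
print([r["store"] for r in clean], [r["store"] for r in outliers])
```

The median-based statistics are used here because a single extreme value such as the 200.0 entry would distort a mean and standard deviation computed over so few records.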
Technical
Scalability: Ability to process thousands to millions of events per second
Real Time: Processing and response in less than a second on the cloud
Private cloud and public cloud with PaaS and SaaS capabilities
Reliability: Predictability and consistency in operations
Security through Authentication, Encryption and Trust Zone
Technologies
JSON/WebSocket/TCP, Flume, Spark Streaming, Spark, Scrapy, RHadoop, HBase, PostgreSQL