As we look to close out 2016, it is clear that the need to improve customer experience is still on the rise. Creating a custom, personal experience across many different screens is just scratching the surface of the obstacles a company will face. 2017 will be the year of customer experience analytics. In this post, I will take you through how we, here at Alooma, have started to embark on this journey to provide our customers with the best customer experience by using data.
As at most companies, our customers have widely differing use cases, each tied to different problems and business goals. For example, one customer may be interested in replicating sensitive medical data, and thus be most concerned with accuracy, reliability, and security. Another customer may be looking for a real-time advanced analytics platform for a mobile app, and would be more concerned with latency, volume, and flexibility in the face of schema changes. As a result, our account managers and solutions engineers can have a hard time keeping up with the goals of each customer and finding the optimal time to reach out and help.
We, as Alooma’s data team, decided to try to help them overcome this difficulty by using data.
We analyzed user journeys from the past few months, detailing all the interactions customers had with our product. We identified the critical moments in which we believed reaching out to the customer would be beneficial, either to help overcome a difficulty or to celebrate a win.
We hypothesized that if an account manager or solutions engineer had full visibility into each account's critical moments, with the right context, they would (1) have a better understanding of user intent, and (2) be able to contact the user when needed, to improve his or her experience with our product.
To achieve goal (1), a traditional batch ETL flow would suffice. However, goal (2) requires identifying and responding to the critical moments in near-real-time. We estimated that a latency of up to 5 minutes would be acceptable, and that we should eventually strive to respond to these events in under a minute.
We decided to create a data flow that would collect all the required data, analyze it to detect critical moments, and report them to the relevant account's account manager and solutions engineer in real-time. We chose Slack as the platform to deliver these updates, because our teams already use this platform, both internally and to communicate with our customers.
Collecting the data
The first challenge we identified was collecting the data. Relevant pieces of information required to detect critical moments reside in many different data sources:
- Salesforce.com: Account information, stage, and ownership, e.g., who is the assigned account manager?
- Website tracking: Web event data, e.g., the pages in Alooma’s website the user visited
- Web server: Customer activity in the system, e.g., adding inputs, deploying new code in the code engine
- Backend logs: Internal platform events, e.g., data being loaded to the output, new table created
- Monitoring system (Zabbix): System issues, e.g., an input connection was interrupted, latency was detected
Processing the data
Now comes the fun part - processing the data. In an ideal world, where all events contain all the required information, reporting logic would be an easy task:
    if event.type == 'add input':
        slack.message(
            event.account_owner,
            event.user + ' added input ' + event.input_name +
            ' in account ' + event.account)
    if event.type == 'events loaded to output':
        slack.message(
            event.account_owner,
            'events loaded to output in account ' + event.account)
    ...
Unfortunately, as tends to be the case in data projects, not all data is available from a single source. For example, when we started working on this project, we realized that events coming from the backend don't contain the account, but only the technical name of the system (think EC2 instance name like "i-8902DS22"). Additionally, none of our events contained information about the relevant account manager or solutions engineer; that data resides only in Salesforce.com.
We thought about enriching the data in the originating sources, for example by adding the account information to the backend logs. However, such an approach has two big disadvantages:
- It would have required a lot of engineering resources that we would rather allocate to developing the product itself.
- It violates the separation of concerns design principle, by requiring components to hold data that is unrelated to their operation.
We decided to handle this in a script that processes the data after it has been loaded to the data warehouse. This script uses SQL JOINs to bridge the data gaps. For instance, we join the backend logs table, which is missing the account information, with a systems_to_accounts translation table. Additionally, we JOIN with user activity tables to provide the event with relevant context. For example, when a user fails to add an input, we also provide the list of documentation pages that this user has viewed in the past day.
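To make the enrichment step concrete, here is a minimal, self-contained sketch of the same idea using an in-memory SQLite database instead of our actual data warehouse. The table layouts, column names, and sample rows are assumptions made up for illustration; only the systems_to_accounts translation table and the example system name come from the description above.

```python
import sqlite3

# In-memory database standing in for the data warehouse.
# All schemas and rows below are hypothetical, for illustration only.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access to columns by name

conn.executescript("""
    CREATE TABLE backend_logs (system_name TEXT, event_type TEXT);
    CREATE TABLE systems_to_accounts (system_name TEXT, account_id INTEGER);
    CREATE TABLE accounts (
        account_id INTEGER, account_name TEXT, account_owner TEXT);

    INSERT INTO backend_logs VALUES ('i-8902DS22', 'new table created');
    INSERT INTO systems_to_accounts VALUES ('i-8902DS22', 1);
    INSERT INTO accounts VALUES (1, 'Example Corp', 'dana');
""")

# The backend log row only knows its system name; the two JOINs attach
# the account name and the account owner to it.
ENRICH_BACKEND_LOGS = """
    SELECT backend_logs.event_type AS event_type,
           accounts.account_name   AS account_name,
           accounts.account_owner  AS account_owner
    FROM backend_logs
    JOIN systems_to_accounts USING (system_name)
    JOIN accounts USING (account_id)
"""

enriched = [dict(row) for row in conn.execute(ENRICH_BACKEND_LOGS)]
for event in enriched:
    print(event)
```

Each enriched row now carries everything the reporting logic needs, so the component that produced the log never has to know about accounts or owners.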
Now the script that processes the events gets them as if they had been generated, fully featured, in an "ideal world". The event-processing logic stays very simple.
After a few iterations, we ended up with code that looks something like:
    import psycopg2
    import psycopg2.extras

    # DB_CONNECTION_STRING and the slack helper are defined elsewhere.

    # SQL query that gets web server events from the last 5 minutes,
    # enriched with account information and user activity.
    # is_response_ok is selected because the processing logic below uses it.
    SQL_GET_WEBSERVER_EVENTS = """
        SELECT method, url, timestamp, user, input_name, input_type,
               is_response_ok, system_name,
               accounts.account_name, accounts.account_owner,
               user_aggreated_activity.pages_visited
        FROM webserver_events
        JOIN systems_to_accounts USING (system_name)
        JOIN accounts USING (account_id)
        JOIN user_aggreated_activity USING (user)
        WHERE timestamp > sysdate - INTERVAL '5m'
    """

    # Connect to the database and execute the above query.
    # NamedTupleCursor exposes columns as attributes (e.g., event.url).
    connection = psycopg2.connect(DB_CONNECTION_STRING)
    cursor = connection.cursor(
        cursor_factory=psycopg2.extras.NamedTupleCursor)
    cursor.execute(SQL_GET_WEBSERVER_EVENTS)
    events = cursor.fetchall()

    # Process the events
    for event in events:
        if event.url == '/plumbing/inputs' and not event.is_response_ok:
            slack.message(
                event.account_owner,
                event.user + ' failed to add input ' + event.input_name +
                ' in account ' + event.account_name +
                '. Pages visited: ' + event.pages_visited)
        if event.method == 'POST' and event.url == '/transform/functions/run':
            slack.message(
                event.account_owner,
                event.user + ' tested code in the Code Engine in account ' +
                event.account_name)
        ...
We wrapped the actual sending of events to Slack with a mechanism that "snoozes" notifications by type, so the same account manager or solutions engineer won't get similar notifications on the same account in short intervals of time. The snooze interval is configured for each notification type: important and rare events like users failing to add inputs are not snoozed, while more common events like users using a new feature for the first time on their system are snoozed for several hours.
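A snooze wrapper along these lines can be sketched in a few lines of Python. This is not our production code: the class name, the notification type names, and the choice to key on (recipient, account, notification type) are assumptions made for illustration, and the state is kept in memory only.

```python
import time


class NotificationSnoozer:
    """Decides whether a notification should be sent or snoozed.

    Hypothetical sketch: snooze_seconds_by_type maps a notification type
    to its snooze interval in seconds. Types that are missing from the
    mapping, or mapped to 0, are never snoozed.
    """

    def __init__(self, snooze_seconds_by_type, clock=time.time):
        self._snooze = snooze_seconds_by_type
        self._clock = clock      # injectable, so the logic can be tested
        self._last_sent = {}     # (recipient, account, type) -> timestamp

    def should_send(self, recipient, account, notification_type):
        key = (recipient, account, notification_type)
        now = self._clock()
        interval = self._snooze.get(notification_type, 0)
        last = self._last_sent.get(key)
        if last is not None and now - last < interval:
            return False         # a similar notification went out recently
        self._last_sent[key] = now
        return True


# Example configuration: rare, important events are never snoozed,
# while common ones are snoozed for four hours.
SNOOZE_SECONDS = {
    "failed to add input": 0,
    "first use of feature": 4 * 60 * 60,
}
snoozer = NotificationSnoozer(SNOOZE_SECONDS)
```

The sending code would then call should_send(...) before each Slack message and skip the message when it returns False. Keeping the timestamps in a dictionary means they are lost on restart; a persistent store would be needed if the script runs as a short-lived periodic job.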
This results in a Slack channel full of golden updates like this:
With these updates, our account managers and solutions engineers are better equipped to provide the best experience to our customers. For example, they can now reach out to a user to inform them about an outstanding operational issue, or to offer assistance when a user is trying out a new feature. Across the board, from account managers to customers, the feedback has been very positive. But we aren’t stopping here: we are working to improve the customer updates Slack channel, and we are looking for other interesting ways to use data to improve our customers’ experience.