What are the best web-based ETL tools?

by Itamar Weiss
Updated Apr 19, 2017

Before we dive into the specific details of Alooma, let's briefly look at a trend in the ETL world: the move from batch-based to stream-oriented solutions.

These stream-based tools are often called “data pipelines”, but they still serve the same purpose as classic ETL. Therefore, I’d like to suggest a new, modern categorization of ETL tools:

Alooma is ETL as a service. While it is built on cutting-edge open-source tools such as Kafka, Storm, Elasticsearch and more, it wraps them into an elegant, well-rounded product that covers your ETL needs and applies industry best practices.

Alooma provides out-of-the-box connectors to the most popular SaaS services, such as Salesforce, Zendesk, Intercom, Localytics, QuickBooks, Shopify, Segment, Stripe, Zuora, Twilio, Jira and more. It also connects to the most popular marketing tools, such as Google AdWords, Facebook Ads, Marketo, MailChimp and HubSpot; analytics tools, such as Mixpanel, Google Analytics, Localytics and Segment; storage products, such as S3, Box, Dropbox, FTP and Google Cloud Storage; and APIs and SDKs, such as REST, WebSockets, JavaScript, Java, Python, Ruby on Rails, iOS and Android.

One of the highlights of Alooma is database replication. It lets you replicate all the popular databases (MySQL, Postgres, MongoDB, SQL Server, Oracle, Cassandra) to your data warehouse at high scale, even if your database is heavily sharded.

Well, I hope the picture is clear: Alooma can integrate with almost any data source, and it is adding new data sources weekly. And it’s not only that it integrates with these services. Each connector is optimized for its specific data source, so that it imports data at the highest possible throughput and avoids data loss and duplicates even when a third party fails.

Alooma also provides live visualizations and querying of your data streams, so that you can see the data flowing through your pipeline in real time.

The code engine lets you shape your data exactly how you need it by writing Python code. This enables sophisticated uses such as data enrichment, real-time alerts, anomaly detection, and more.
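To make this concrete, here is a minimal sketch of the kind of per-event Python function such a code engine could run. The function name `transform`, the dict-shaped event, and every field name (`is_test`, `email`, `amount`, `alert`) are assumptions for illustration, not Alooma's documented interface.

```python
from typing import Optional


def transform(event: dict) -> Optional[dict]:
    """Enrich one event, or return None to drop it (hypothetical contract)."""
    # Filter: drop internal test traffic (assumed field name).
    if event.get("is_test"):
        return None

    # Enrichment: derive the email domain from an assumed "email" field.
    email = event.get("email", "")
    if "@" in email:
        event["email_domain"] = email.rsplit("@", 1)[1].lower()

    # Alerting: flag unusually large orders (the threshold is arbitrary).
    amount = event.get("amount")
    if isinstance(amount, (int, float)) and amount > 10_000:
        event["alert"] = "large_order"

    return event
```

Keeping the logic in a plain function of one event makes it easy to test locally before it touches live data.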

The mapper allows you to map any data source to any data output. Schema changes can be handled either automatically or manually, according to your preference, without breaking your pipeline.
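As a rough illustration of what mapping with automatic or manual schema handling can look like, the sketch below flattens a nested event into column names and reacts to fields it has not seen before. The helper names `flatten` and `map_event`, the `auto` flag, and the underscore naming rule are all hypothetical; the real mapper may work quite differently.

```python
def flatten(event: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into column-style keys, e.g. user.name -> user_name."""
    columns = {}
    for key, value in event.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            columns.update(flatten(value, prefix=f"{name}_"))
        else:
            columns[name] = value
    return columns


def map_event(event: dict, schema: set, auto: bool = True) -> dict:
    """Map an event onto known columns; treat unseen fields according to `auto`."""
    row = flatten(event)
    new_columns = set(row) - schema
    if new_columns and auto:
        # Automatic mode: extend the caller's schema in place.
        schema |= new_columns
    elif new_columns:
        # Manual mode: load only known columns and leave the rest for review.
        row = {k: v for k, v in row.items() if k in schema}
    return row
```

In both modes the event still produces a row, which is the sense in which a schema change does not break the pipeline.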

The restream queue is Alooma’s safety net. It catches events that fail for any reason, lets you fix your pipeline, and then “restreams” those events through the pipeline for exactly-once processing.
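The idea can be sketched as a dead-letter queue with replay. This is a toy model under stated assumptions, not Alooma's implementation: the `Pipeline` class, its method names, and the in-memory `sink` are hypothetical, and exactly-once delivery is only approximated here by writing to the sink under the event's `id`, so replaying an event overwrites rather than duplicates.

```python
from collections import deque


class Pipeline:
    """Toy pipeline: failed events are parked, then replayed after a fix."""

    def __init__(self, transform, sink: dict):
        self.transform = transform
        self.sink = sink  # keyed by event id, so repeated writes overwrite
        self.restream_queue = deque()

    def process(self, event: dict) -> None:
        try:
            self.sink[event["id"]] = self.transform(event)
        except Exception as exc:
            # Park the original event together with the reason it failed.
            self.restream_queue.append((event, repr(exc)))

    def restream(self) -> None:
        # Swap in a fresh queue so events that fail again are re-parked.
        pending, self.restream_queue = self.restream_queue, deque()
        for event, _error in pending:
            self.process(event)
```

A typical flow would be: inspect the parked errors, replace `transform` with a fixed version, then call `restream()`.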

And you don’t need to take Alooma’s word for it: it’s used by dozens of customers, such as The New York Times, OkCupid, FreshBooks, GoFundMe and many more.

Published at Quora.