What is the best way to load data into Amazon Redshift from MySQL?
I will add that another drawback of the DIY approach is that it usually creates a delay between when the data is written to MySQL and when it becomes available in Redshift.
This is actually one of the problems we are trying to address at Alooma: building a robust data pipeline that can take your inputs and reliably move the data into Redshift (in real time, without any data loss, and performing advanced transformations along the way).
Alooma takes a different approach from current commercial tools in a few ways:
Alooma connects to the MySQL server as a replication slave and then copies the binlog directly to Redshift in near real-time. This lets you create a near real-time view of the MySQL tables in Redshift. It also means you avoid the table locks and performance issues in MySQL that usually come with frequent dumps.
Alooma allows custom Python transformations on the data stream before it is even copied to Redshift.
Full disclosure, I work there, so I may be biased :)
(but seriously, it's awesome)
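To make the two ideas above a bit more concrete, here is a minimal sketch of what "transform each change event in Python, then replay the events to keep a current view of the table" could look like. This is not Alooma's actual API or event format: the event shape (`op`, `row`), the column names, and the `transform` and `apply_events` functions are all hypothetical, and the "view" here is just an in-memory dict standing in for a Redshift table.

```python
from typing import Dict, Iterable, Optional

# Hypothetical event shape for illustration only (not a real binlog or Alooma format):
# {"op": "insert" | "update" | "delete", "row": {"id": ..., ...}}
Event = Dict[str, object]


def transform(event: Event) -> Optional[Event]:
    """Hypothetical per-event transformation applied before loading.

    Returns the cleaned event, or None to drop the event entirely.
    """
    row = dict(event["row"])
    if row.get("is_test"):
        return None  # drop rows flagged as test data
    row.pop("password_hash", None)  # never ship this column downstream
    if isinstance(row.get("email"), str):
        row["email"] = row["email"].strip().lower()
    return {**event, "row": row}


def apply_events(events: Iterable[Event]) -> Dict[int, dict]:
    """Replay change events in order to build the current view of a table, keyed by id."""
    view: Dict[int, dict] = {}
    for raw in events:
        event = transform(raw)
        if event is None:
            continue
        key = event["row"]["id"]
        if event["op"] == "delete":
            view.pop(key, None)
        else:  # insert and update both overwrite the row for this key
            view[key] = event["row"]
    return view


events = [
    {"op": "insert", "row": {"id": 1, "email": " Ann@Example.com ", "password_hash": "x"}},
    {"op": "insert", "row": {"id": 2, "email": "bob@example.com", "password_hash": "y"}},
    {"op": "insert", "row": {"id": 3, "email": "qa@example.com", "is_test": True}},
    {"op": "update", "row": {"id": 1, "email": "ann@new.example.com", "password_hash": "z"}},
    {"op": "delete", "row": {"id": 2}},
]
view = apply_events(events)
print(view)
```

Because every change is applied as it arrives instead of waiting for the next dump, the view trails the source only by however long an event takes to travel through the pipeline, which is the point of reading the binlog rather than dumping tables on a schedule.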
Published at Quora.