First things first

At Valiu we are obsessed with providing users with the best possible experience. With that in mind, it is critical for us to deeply understand our users’ behavior. In this test you’re going to create an ETL pipeline and help us answer questions about our users. You can download the data here.


The Challenge

Each of the following CSV files contains data about our users. 'central_financial.csv' covers our cashin (COP to USDv), cashout (USDv to VES) and peer-to-peer payment (USDv to USDv) operations; 'remit.csv' covers our remittance operations (COP to VES); and 'users_short.csv' contains information about the users themselves.

Your task is to load the data from each of these CSV files into three separate database instances of your choice, of which at least one must be SQL and at least one NoSQL (e.g. PostgreSQL, MySQL, MongoDB). You should then write an ETL process that aggregates the data from these three database instances into a data warehouse of your choice. Finally, establish a database connection to a BI tool of your choice (e.g. Tableau, Metabase, PowerBI) and answer the following question via a SQL or NoSQL script: calculate retention rates for our cashin, cashout and remittance operations for the last 4 weeks.
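To make the retention question concrete, here is a minimal sketch of one possible reading of it, run against an in-memory SQLite database. Everything in it is an assumption rather than part of the challenge data: the `operations` table, its columns (`user_id`, `op_type`, `created_at`), the sample rows, and the definition of retention itself (the share of users who performed an operation type in one 7-day window and performed it again in the following window). Your own schema and definition may reasonably differ.

```python
import sqlite3

# Hypothetical warehouse table: one row per operation.
# Table and column names are assumptions, not the real CSV schema.
SCHEMA = """
CREATE TABLE operations (
    user_id    TEXT NOT NULL,
    op_type    TEXT NOT NULL,   -- e.g. 'cashin', 'cashout', 'remittance'
    created_at TEXT NOT NULL    -- ISO date such as '2020-03-25'
);
"""

# Week 0 is the 7 days ending on :as_of, week 1 the 7 days before that, and so on.
# Five weeks are read so that each of the last four has a previous week to compare to.
RETENTION_SQL = """
WITH weekly AS (
    SELECT DISTINCT
        user_id,
        op_type,
        CAST(julianday(:as_of) - julianday(created_at) AS INTEGER) / 7 AS weeks_ago
    FROM operations
    WHERE julianday(created_at) <= julianday(:as_of)
      AND julianday(created_at) >  julianday(:as_of) - 35
)
SELECT
    prev.op_type,
    prev.weeks_ago - 1        AS measured_week,
    COUNT(prev.user_id)       AS active_previous_week,
    COUNT(cur.user_id)        AS retained,
    ROUND(1.0 * COUNT(cur.user_id) / COUNT(prev.user_id), 4) AS retention_rate
FROM weekly AS prev
LEFT JOIN weekly AS cur
       ON cur.user_id   = prev.user_id
      AND cur.op_type   = prev.op_type
      AND cur.weeks_ago = prev.weeks_ago - 1
WHERE prev.weeks_ago BETWEEN 1 AND 4
GROUP BY prev.op_type, prev.weeks_ago
ORDER BY prev.op_type, prev.weeks_ago DESC;
"""


def weekly_retention(conn, as_of):
    """Return {(op_type, measured_week): (active_previous_week, retained, rate)}."""
    rows = conn.execute(RETENTION_SQL, {"as_of": as_of}).fetchall()
    return {(op, week): (prev, kept, rate) for op, week, prev, kept, rate in rows}


# Made-up sample rows, only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO operations VALUES (?, ?, ?)",
    [
        ("u1", "cashin", "2020-03-16"),
        ("u1", "cashin", "2020-03-25"),
        ("u2", "cashin", "2020-03-17"),
        ("u1", "cashout", "2020-03-25"),
    ],
)
result = weekly_retention(conn, "2020-03-29")
```

In this sample, two users made a cashin in the previous week and one of them did so again in the most recent week, so the cashin entry for week 0 has a rate of 0.5. Operation types with no activity in the previous week produce no row, since there is nobody to retain.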


Things you need to know

This test is purposefully open-ended: you'll have to figure out which variables are important and how best to organize our data so that querying it is efficient.
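As one illustration of what "organizing the data for efficient querying" could mean, the sketch below loads operations into a single table and adds a composite index on the columns a time-windowed query would filter on. It is only a sketch under stated assumptions: the table name, the index name, the CSV headers (`user_id`, `op_type`, `created_at`) and the sample rows are all hypothetical, and SQLite stands in for whichever warehouse you choose.

```python
import csv
import io
import sqlite3

# Hypothetical unified table; the real CSV columns may differ.
DDL = """
CREATE TABLE IF NOT EXISTS operations (
    user_id    TEXT NOT NULL,
    op_type    TEXT NOT NULL,
    created_at TEXT NOT NULL
);
-- Composite index so filters on operation type and date range avoid a full scan.
CREATE INDEX IF NOT EXISTS idx_ops_type_date
    ON operations (op_type, created_at, user_id);
"""


def load_operations(conn, csv_file):
    """Insert rows from a CSV file object and return how many were loaded.

    Assumes the header row contains user_id, op_type and created_at.
    """
    conn.executescript(DDL)
    reader = csv.DictReader(csv_file)
    rows = [(r["user_id"], r["op_type"], r["created_at"]) for r in reader]
    conn.executemany("INSERT INTO operations VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Usage with an in-memory CSV standing in for a real export.
demo_conn = sqlite3.connect(":memory:")
demo_csv = io.StringIO(
    "user_id,op_type,created_at\n"
    "u1,cashin,2020-03-25\n"
    "u2,cashout,2020-03-26\n"
)
loaded = load_operations(demo_conn, demo_csv)
```

Whether an index like this helps depends on the queries you end up writing, so it is worth checking the query plan in your chosen warehouse rather than taking this layout as given.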


When you finish you must

Fill in the following Google Form to submit your challenge 🚀 https://forms.gle/GRPbP5bvGUWYSUZp8

<aside> 💡 Make sure you include the SQL or NoSQL aggregation code for your queries.

</aside>