Roland Pogonyi

Programming, SaaS

Data Pipelines

The Challenge

There are many data analytics tools out there, but each has its limitations when it comes to building data pipelines.

Some limitations of other products:

- Not suitable for large amounts of data: A laptop or even the most powerful desktop computer cannot process terabytes of data. The maximum file size in spreadsheet applications is typically limited to a few hundred megabytes, largely determined by the device's available memory.

- Not able to mix various data sources: Imagine a scenario where you have some data in CSV format and some other related data in a database. Joining and running queries on these two very different sources could be a challenge without the right tool.

- No scheduling: Many analytics tools will let you inspect your data and export a report but will not provide an option to run it automatically at regular intervals.

- Presumed programming knowledge: Most analytics tools will require at least some programming experience.
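To make the "mixing data sources" point concrete, here is a minimal sketch of what joining a CSV file with a database table looks like when done by hand in Python. The sample data, table names and function names are all hypothetical, and an in-memory SQLite database stands in for a real relational source. It is only an illustration of the manual effort involved, not a description of how any particular tool works.

```python
import csv
import io
import sqlite3

# Hypothetical sample data: orders arrive as CSV text, customers live in a database.
ORDERS_CSV = """order_id,customer_id,amount
1,10,25.00
2,11,40.50
3,10,12.25
"""


def load_customers_db():
    """Create an in-memory SQLite database standing in for the relational source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(10, "Alice"), (11, "Bob")],
    )
    return conn


def total_per_customer(csv_text, conn):
    """Copy the CSV rows into a temporary table, then join them with customers in SQL."""
    rows = [
        (int(r["order_id"]), int(r["customer_id"]), float(r["amount"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.execute(
        "CREATE TEMP TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    query = """
        SELECT c.name, SUM(o.amount)
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id
        GROUP BY c.name
        ORDER BY c.name
    """
    return conn.execute(query).fetchall()


result = total_per_customer(ORDERS_CSV, load_customers_db())
print(result)
```

Even in this small case the CSV has to be parsed, typed and staged before the two sources can be queried together, which is the kind of glue work the list above refers to.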

The Results

Data Pipelines is a platform that addresses all of the above issues.

It uses an open source big data analytics engine, which allows it to work reliably on data of any size.

It lets you mix and match various types of datasources in the same pipeline, including files from Amazon S3.

You can schedule your pipeline runs using the sophisticated built-in scheduler. It does not assume any programming experience. If you have worked with spreadsheets you should be able to build your own pipeline using the built-in visual query builder.

Some use cases:

- Interactive analysis: Connect your datasource and start inspecting your data via the pipeline builder. See the output in real time while building the pipeline step by step.

- Reporting: Once a pipeline is built, use the scheduler to run it at regular intervals to deliver reports to a destination of your choice.

- Data migration: Thanks to its ability to mix different types of datasources (e.g. CSV files with a relational database) in a single pipeline, it can be used for migrations and regular backups.
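The "step by step" idea behind these use cases can be sketched in a few lines of Python. This is a conceptual illustration only: the step functions, the sample rows and `run_pipeline` are hypothetical names made up for this example and do not reflect the platform's actual implementation, which is driven by a visual builder rather than code.

```python
# Hypothetical sample rows standing in for a connected datasource.
SAMPLE_ROWS = [
    {"status": "paid", "qty": 2, "price": 5},
    {"status": "open", "qty": 1, "price": 9},
]


def keep_paid(rows):
    """Filter step: keep only rows whose status is 'paid'."""
    return [r for r in rows if r["status"] == "paid"]


def add_total(rows):
    """Column step: add a 'total' field computed from qty and price."""
    return [{**r, "total": r["qty"] * r["price"]} for r in rows]


def run_pipeline(rows, steps):
    """Apply each step in order and record the output after every step.

    The recorded previews mimic inspecting intermediate results
    while a pipeline is being built.
    """
    previews = []
    for step in steps:
        rows = step(rows)
        previews.append((step.__name__, rows))
    return rows, previews


final_rows, previews = run_pipeline(SAMPLE_ROWS, [keep_paid, add_total])
for name, rows in previews:
    print(name, rows)
```

Once the list of steps is fixed, the same `run_pipeline` call could be triggered on a schedule and its final rows written to any destination, which corresponds to the reporting and migration cases above.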
