CATEGORY: Stripe ▸ Analytics GUI

Getting your Stripe data into AWS Glue

AWS Glue is a new managed cloud service from Amazon that moves data around, optionally transforming it in the process.

It works primarily as a managed ETL (extract, transform, load) system with a friendly GUI for managing your "data pipelines".

Glue reads your data from any AWS-based database (including SQL databases on RDS, which tdog can write to).

Once schema metadata about the sources is stored in the AWS Glue Data Catalog, you can begin to use those sources with AWS Glue.
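One common way to get schema metadata into the Data Catalog is a Glue crawler pointed at a JDBC connection. The sketch below only builds the request for the Glue `CreateCrawler` API and leaves the actual call in a helper that is not invoked here. The crawler name, role ARN, catalog database, connection name and include path are all placeholder assumptions, and it presumes a Glue connection to the database already exists.

```python
def build_crawler_request(name, role_arn, catalog_database, connection_name, include_path):
    """Build keyword arguments for the Glue CreateCrawler API call."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": catalog_database,  # Data Catalog database that receives the tables
        "Targets": {
            "JdbcTargets": [
                {"ConnectionName": connection_name, "Path": include_path}
            ]
        },
    }


# All values below are placeholders, not names that tdog or AWS create for you.
request = build_crawler_request(
    name="stripe-crawler",
    role_arn="arn:aws:iam::123456789012:role/GlueCrawlerRole",
    catalog_database="stripe_catalog",
    connection_name="tdog-stripe-db",
    include_path="stripe/%",  # assumed source database name, all tables
)


def create_and_start_crawler(crawler_request):
    """Send the request to AWS. Needs boto3 and AWS credentials; not called here."""
    import boto3  # imported lazily so the sketch loads without AWS set up

    glue = boto3.client("glue")
    glue.create_crawler(**crawler_request)
    glue.start_crawler(Name=crawler_request["Name"])
```

Calling `create_and_start_crawler(request)` would submit the crawler. Once a run finishes, the tables it discovered should appear in the chosen catalog database.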

Interesting AWS Glue powered features

  • You can create a single central datastore by using Glue to continuously poll and copy data from multiple external sources.

    • A central datastore can be used to query and join tables to get a "single view of the world".
  • AWS QuickSight can be used for charts and analytics.

  • AWS Athena

    • SQL query engine.
    • This is especially useful if you are considering Stripe Sigma.
    • Both Stripe Sigma and AWS Athena are built on the Presto SQL engine.
    • Sigma charges an infrastructure fee along with a per-charge fee, but Athena only charges you for the data each query scans.
    • Athena allows importing data from other sources too, which would allow cross-service joins (not possible with Sigma).
  • AWS Glue Elastic Views

    • Create composite tables ("materialized views") that combine data from multiple sources using SQL.
    • These views are automatically kept up to date, and can be pushed to other AWS services:
      • Redshift, OpenSearch, S3, DynamoDB, Aurora, RDS.
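The cross-service join mentioned under Athena can be tried out locally before any AWS setup. In this minimal sketch an in-memory SQLite database stands in for Athena, and the table and column names are hypothetical (they are not tdog's actual schema). It joins Stripe charges to a second, non-Stripe source to total the charged amount per plan.

```python
import sqlite3

# SQLite is only a local stand-in for Athena; the schema is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stripe_charges (id TEXT, customer_id TEXT, amount INTEGER);
    CREATE TABLE crm_accounts (stripe_customer_id TEXT, plan TEXT);
    INSERT INTO stripe_charges VALUES
        ('ch_1', 'cus_a', 1000), ('ch_2', 'cus_a', 2500), ('ch_3', 'cus_b', 400);
    INSERT INTO crm_accounts VALUES ('cus_a', 'pro'), ('cus_b', 'free');
""")

# Join data that came from Stripe with data that came from another service.
CROSS_SOURCE_SQL = """
    SELECT a.plan, SUM(c.amount) AS total_amount
    FROM stripe_charges AS c
    JOIN crm_accounts AS a ON a.stripe_customer_id = c.customer_id
    GROUP BY a.plan
    ORDER BY a.plan
"""


def revenue_by_plan(connection):
    """Return (plan, total amount in the smallest currency unit) rows."""
    return connection.execute(CROSS_SOURCE_SQL).fetchall()


print(revenue_by_plan(conn))  # [('free', 400), ('pro', 3500)]
```

The query uses only a plain join and aggregate, so the same shape should carry over to Athena once both tables are registered in the Data Catalog.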

The tdog CLI downloads your Stripe account data to a MySQL or Postgres database instance, which can then be used as an AWS Glue source.
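To use that database as a Glue source, Glue needs a JDBC connection pointing at it. The following is a rough sketch of building the `ConnectionInput` for the Glue `CreateConnection` API. The host, database name, connection name and credentials are placeholders, and the helper functions are hypothetical names introduced here rather than part of tdog or boto3.

```python
DEFAULT_PORTS = {"mysql": 3306, "postgresql": 5432}


def jdbc_url(engine, host, database, port=None):
    """Build a JDBC URL for a MySQL or Postgres instance."""
    if engine not in DEFAULT_PORTS:
        raise ValueError(f"unsupported engine: {engine}")
    port = port or DEFAULT_PORTS[engine]
    return f"jdbc:{engine}://{host}:{port}/{database}"


def build_connection_input(name, url, username, password):
    """Build the ConnectionInput structure for a Glue JDBC connection."""
    return {
        "Name": name,
        "ConnectionType": "JDBC",
        "ConnectionProperties": {
            "JDBC_CONNECTION_URL": url,
            "USERNAME": username,
            "PASSWORD": password,
        },
    }


# Placeholder values; substitute the instance that tdog writes to.
url = jdbc_url("postgresql", "db.example.internal", "stripe")
connection_input = build_connection_input("tdog-stripe-db", url, "glue_reader", "change-me")


def create_connection(conn_input):
    """Register the connection with Glue. Needs boto3 and AWS credentials; not called here."""
    import boto3  # imported lazily so the sketch loads without AWS set up

    boto3.client("glue").create_connection(ConnectionInput=conn_input)
```

A database inside a VPC would likely also need network settings (subnet and security groups) on the connection, and in practice the password is better kept in a secrets store than in code.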

Try tdog for free against a Stripe test account