Configuring Catalog Integration with Amazon S3

5 Lessons

About this Course

Approximate Time to Completion: 1 hour
Instructor: Rebecca Golden, Senior Technical Trainer

Course Objectives:

  • Explain the AWS services Collibra uses for its DGC integration with S3
  • Configure the appropriate IAM roles and policies needed for integration
  • Register and synchronize an S3 File System within Catalog

In this course, we will discuss the requirements for ingesting metadata about data held in Amazon Simple Storage Service, more commonly known as S3, into your DGC environment. By leveraging AWS services such as Glue, Identity and Access Management (IAM), and Athena, you can automate the provisioning of data access once a data analyst's request for a data set has been approved.

Collibra’s DGC leverages AWS Glue, an ETL service, to create and expose metadata about the data stored in your S3 buckets, including the file system structure, and makes that metadata visible to your DGC users. Through IAM, DGC users are given access to view that metadata and shop for data sets within the report catalog. If a user requests access to a data set, you can provision temporary credentials through IAM; with those permissions, the user can query the data through a service such as Amazon Athena and build reports in Tableau.
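
To make the IAM side of this more concrete, here is a minimal sketch of the kind of read-only policy document such an integration might rely on. It is an illustration only, not the policy Collibra requires: the function name build_read_only_policy, the bucket name example-data-lake, and the specific set of actions are assumptions chosen for the example. The lessons in this course define the actual roles and permissions.

```python
import json


def build_read_only_policy(bucket_name: str) -> dict:
    """Build an illustrative read-only IAM policy document.

    Hypothetical helper: the statements below are an assumed minimal set,
    not the permissions documented for the Collibra integration.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Reading applies to the objects inside the bucket.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
            {
                # Read access to Glue Data Catalog metadata.
                # "*" is used here for brevity; scope it down in practice.
                "Sid": "ReadGlueCatalog",
                "Effect": "Allow",
                "Action": [
                    "glue:GetDatabase",
                    "glue:GetTable",
                    "glue:GetPartitions",
                ],
                "Resource": ["*"],
            },
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON, ready to review before attaching it to a role.
    print(json.dumps(build_read_only_policy("example-data-lake"), indent=2))
```

Note that the bucket-level and object-level statements use different resource ARNs: s3:ListBucket targets the bucket, while s3:GetObject targets the objects under it.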
