Let us begin by understanding what AWS Glue is before moving on to our AWS Glue Plug-in and how it benefits Workload Automation users. The AWS Glue Plug-in can be downloaded from Automation Hub to enhance your Workload Automation setup.

AWS Glue is a serverless data integration service that makes it easy for analytics users to discover, prepare, move, and integrate data from multiple sources. You can use it for analytics, machine learning, and application development. It also includes additional productivity and data ops tooling for authoring, running jobs, and implementing business workflows.

With AWS Glue, you can discover and connect to more than 70 diverse data sources and manage your data in a centralized data catalog. You can visually create, run, and monitor extract, transform, and load (ETL) pipelines to load data into your data lakes. You can also immediately search and query cataloged data using Amazon Athena, Amazon EMR, and Amazon Redshift Spectrum.

AWS Glue consolidates major data integration capabilities into a single service: data discovery, modern ETL, cleansing, transforming, and centralized cataloging. It is serverless, so there is no infrastructure to manage. Because it handles ETL, ELT, and streaming workloads in one service, it serves many types of users and workloads.

AWS Glue also makes it easy to integrate data across your architecture. It integrates with AWS analytics services and Amazon S3 data lakes. Its integration interfaces and job-authoring tools are easy to use for everyone from developers to business users, with tailored solutions for varied technical skill sets.

Let's start with the job definition parameters section of our plug-in.

Connect to AWS Glue with Workload Automation

Log in to the Dynamic Workload Console and open the Workload Designer. Create a new job and select the “AWS Glue” job type in the Cloud section.
General Tab:
Name: Enter any name for the job in the Name field.
Workstation: Choose the workstation on which the job runs.

Connection: Establish the connection to the AWS Cloud server.
Access Key ID: The access key ID used to access AWS services. Each access key ID is unique to a user.
Secret Access Key: This is like a password. It is used for programmatic (API) access to AWS services.
AWS Region: A Region is a physical location around the world where AWS clusters data centers. Specify the Region code, for example ap-south-1.
AWS Role ARN: IAM roles are entities within AWS that define a set of permissions and policies to control access to AWS resources. The IAM role is referenced and identified across AWS services and resources using the Role ARN.
Test Connection: Click to verify that the connection to AWS Glue works correctly.

Action: In the Action tab, specify the AWS Glue workflow that you want to run and any run properties to pass to it.
Workflow: The workflow to run. A workflow consists of triggers, crawlers, jobs, and so on.
Run Properties: The runtime properties to be passed when the workflow starts.

Submitting your job: It is time to submit your job to the current plan. You can also add your job to the job stream that automates your business process flow. Select the action menu in the top-left corner of the job definition panel and click Submit Job into Current Plan. A confirmation message is displayed, and you can switch to the Monitoring view to see what is going on.

Monitor Page: You can track your jobs on the Monitor page. If the job completes successfully in AWS Glue on the backend, its status changes to successful.

[Screenshots: Job Log Details; Workflow Details Page]
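
To make the Connection and Action parameters more concrete, here is a minimal Python sketch of roughly equivalent AWS API calls using boto3. It is not the plug-in's implementation, which this post does not show. The function names `make_glue_client` and `run_workflow`, the session name `wa-glue-sketch`, and the polling interval are assumptions made for illustration. The sketch builds a Glue client from an access key ID, secret access key, Region, and optional Role ARN, starts a workflow with run properties, and polls until the run reaches a terminal status.

```python
import time

# Workflow run statuses after which polling can stop.
TERMINAL_STATES = {"COMPLETED", "STOPPED", "ERROR"}


def make_glue_client(access_key_id, secret_access_key, region, role_arn=None):
    """Build a Glue client from the same inputs as the Connection section.

    Hypothetical helper, not part of the plug-in. boto3 is a third-party
    package and is imported here so the rest of the sketch works without it.
    """
    import boto3

    session = boto3.Session(
        aws_access_key_id=access_key_id,
        aws_secret_access_key=secret_access_key,
        region_name=region,
    )
    if role_arn:
        # Exchange the long-lived keys for temporary credentials of the role.
        creds = session.client("sts").assume_role(
            RoleArn=role_arn,
            RoleSessionName="wa-glue-sketch",  # arbitrary name, an assumption
        )["Credentials"]
        session = boto3.Session(
            aws_access_key_id=creds["AccessKeyId"],
            aws_secret_access_key=creds["SecretAccessKey"],
            aws_session_token=creds["SessionToken"],
            region_name=region,
        )
    return session.client("glue")


def run_workflow(glue, workflow_name, run_properties=None, poll_seconds=30):
    """Start a Glue workflow and wait until the run reaches a terminal status.

    Returns a (run_id, status) tuple. `glue` is any object that exposes
    start_workflow_run and get_workflow_run, such as a boto3 Glue client.
    """
    kwargs = {"Name": workflow_name}
    if run_properties:
        # Run properties are string key/value pairs passed to the workflow.
        kwargs["RunProperties"] = run_properties
    run_id = glue.start_workflow_run(**kwargs)["RunId"]

    while True:
        run = glue.get_workflow_run(Name=workflow_name, RunId=run_id)["Run"]
        if run["Status"] in TERMINAL_STATES:
            return run_id, run["Status"]
        time.sleep(poll_seconds)
```

A call might look like `run_workflow(make_glue_client(key_id, secret, "ap-south-1", role_arn), "my-workflow", {"env": "test"})`, where the workflow name and run property are placeholders. Note that a workflow run can report COMPLETED even when some of its actions failed, so a stricter check might also inspect the run's statistics before treating it as successful.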