aws_glue_crawler
Manages a Glue Crawler. More information can be found in the AWS Glue Develeper Guide
Example Usage
JDBC Target
resource "aws_glue_crawler" "example" { database_name = "${aws_glue_catalog_database.example.name}" name = "example" role = "${aws_iam_role.example.name}" jdbc_target { connection_name = "${aws_glue_connection.example.name}" path = "database-name/%" } }
S3 Target
resource "aws_glue_crawler" "example" { database_name = "${aws_glue_catalog_database.example.name}" name = "example" role = "${aws_iam_role.example.name}" s3_target { path = "s3://${aws_s3_bucket.example.bucket} } }
Argument Reference
NOTE: At least one
jdbc_target
ors3_target
must be specified.
The following arguments are supported:
-
database_name
(Required) Glue database where results are written. -
name
(Required) Name of the crawler. -
role
(Required) The IAM role (or ARN of an IAM role) used by the crawler to access other resources. -
classifiers
(Optional) List of custom classifiers. By default, all AWS classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification. -
configuration
(Optional) JSON string of configuration information. -
description
(Optional) Description of the crawler. -
jdbc_target
(Optional) List of nested JBDC target arguments. See below. -
s3_target
(Optional) List nested Amazon S3 target arguments. See below. -
schedule
(Optional) A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers. For example, to run something every day at 12:15 UTC, you would specify:cron(15 12 * * ? *)
. -
schema_change_policy
(Optional) Policy for the crawler's update and deletion behavior. -
table_prefix
(Optional) The table prefix used for catalog tables that are created.
jdbc_target Argument Reference
-
connection_name
- (Required) The name of the connection to use to connect to the JDBC target. -
path
- (Required) The path of the JDBC target. -
exclusions
- (Optional) A list of glob patterns used to exclude from the crawl.
s3_target Argument Reference
-
path
- (Required) The path to the Amazon S3 target. -
exclusions
- (Optional) A list of glob patterns used to exclude from the crawl.
schema_change_policy Argument Reference
-
delete_behavior
- (Optional) The deletion behavior when the crawler finds a deleted object. Valid values:LOG
,DELETE_FROM_DATABASE
, orDEPRECATE_IN_DATABASE
. Defaults toDEPRECATE_IN_DATABASE
. -
update_behavior
- (Optional) The update behavior when the crawler finds a changed schema. Valid values:LOG
orUPDATE_IN_DATABASE
. Defaults toUPDATE_IN_DATABASE
.
Attributes Reference
In addition to all arguments above, the following attributes are exported:
-
id
- Crawler name
Import
Glue Crawlers can be imported using name
, e.g.
$ terraform import aws_glue_crawler.MyJob MyJob
© 2018 HashiCorpLicensed under the MPL 2.0 License.
https://www.terraform.io/docs/providers/aws/r/glue_crawler.html