aws_glue_crawler
Manages a Glue Crawler. More information can be found in the AWS Glue Develeper Guide
Example Usage
JDBC Target
resource "aws_glue_crawler" "example" {
database_name = "${aws_glue_catalog_database.example.name}"
name = "example"
role = "${aws_iam_role.example.name}"
jdbc_target {
connection_name = "${aws_glue_connection.example.name}"
path = "database-name/%"
}
}
S3 Target
resource "aws_glue_crawler" "example" {
database_name = "${aws_glue_catalog_database.example.name}"
name = "example"
role = "${aws_iam_role.example.name}"
s3_target {
path = "s3://${aws_s3_bucket.example.bucket}
}
}
Argument Reference
NOTE: At least one
jdbc_targetors3_targetmust be specified.
The following arguments are supported:
-
database_name(Required) Glue database where results are written. -
name(Required) Name of the crawler. -
role(Required) The IAM role (or ARN of an IAM role) used by the crawler to access other resources. -
classifiers(Optional) List of custom classifiers. By default, all AWS classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification. -
configuration(Optional) JSON string of configuration information. -
description(Optional) Description of the crawler. -
jdbc_target(Optional) List of nested JBDC target arguments. See below. -
s3_target(Optional) List nested Amazon S3 target arguments. See below. -
schedule(Optional) A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers. For example, to run something every day at 12:15 UTC, you would specify:cron(15 12 * * ? *). -
schema_change_policy(Optional) Policy for the crawler's update and deletion behavior. -
table_prefix(Optional) The table prefix used for catalog tables that are created.
jdbc_target Argument Reference
-
connection_name- (Required) The name of the connection to use to connect to the JDBC target. -
path- (Required) The path of the JDBC target. -
exclusions- (Optional) A list of glob patterns used to exclude from the crawl.
s3_target Argument Reference
-
path- (Required) The path to the Amazon S3 target. -
exclusions- (Optional) A list of glob patterns used to exclude from the crawl.
schema_change_policy Argument Reference
-
delete_behavior- (Optional) The deletion behavior when the crawler finds a deleted object. Valid values:LOG,DELETE_FROM_DATABASE, orDEPRECATE_IN_DATABASE. Defaults toDEPRECATE_IN_DATABASE. -
update_behavior- (Optional) The update behavior when the crawler finds a changed schema. Valid values:LOGorUPDATE_IN_DATABASE. Defaults toUPDATE_IN_DATABASE.
Attributes Reference
In addition to all arguments above, the following attributes are exported:
-
id- Crawler name
Import
Glue Crawlers can be imported using name, e.g.
$ terraform import aws_glue_crawler.MyJob MyJob
© 2018 HashiCorpLicensed under the MPL 2.0 License.
https://www.terraform.io/docs/providers/aws/r/glue_crawler.html