amazon-web-services - ECS service auto scaling

Tags: amazon-web-services terraform amazon-ecs google-tag-manager autoscaling

I have set up server-side GTM following this guide: https://aws-solutions-library-samples.github.io/advertising-marketing/using-google-tag-manager-for-server-side-website-analytics-on-aws.html

I am using an AWS ECS task definition and service. On top of that, I use Snowbridge to send data from AWS Kinesis to GTM (the Snowplow client) via HTTP POST requests.

When the data volume is high, I occasionally get 502 errors from GTM. If I filter the data and reduce the amount forwarded to GTM, the errors stop. What can I change on the GTM side to make sure it can handle large volumes of data? Can I use auto scaling with ECS?

I have already tried parameters like these:

deployment_maximum_percent = 200

deployment_minimum_healthy_percent = 50

but the problem persists.

This is roughly what my GTM configuration looks like:

resource "aws_ecs_cluster" "gtm" {
  name = "gtm"
  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

resource "aws_ecs_task_definition" "PrimaryServerSideContainer" {
  family                   = "PrimaryServerSideContainer"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = 2048
  memory                   = 4096
  execution_role_arn       = aws_iam_role.gtm_container_exec_role.arn
  task_role_arn            = aws_iam_role.gtm_container_role.arn
  runtime_platform {
    operating_system_family = "LINUX"
    cpu_architecture        = "X86_64"
  }
  container_definitions = <<TASK_DEFINITION
  [
  {
    "name": "primary",
    "image": "gcr.io/cloud-tagging-10302018/gtm-cloud-image",
    "environment": [
      {
        "name": "PORT",
        "value": "80"
      },
      {
        "name": "PREVIEW_SERVER_URL",
        "value": "${var.PREVIEW_SERVER_URL}"
      },
      {
        "name": "CONTAINER_CONFIG",
        "value": "${var.CONTAINER_CONFIG}"
      }
    ],
    "cpu": 1024,
    "memory": 2048,
    "essential": true,
    "logConfiguration": {
          "logDriver": "awslogs",
          "options": {
            "awslogs-group": "gtm-primary",
            "awslogs-create-group": "true",
            "awslogs-region": "eu-central-1",
            "awslogs-stream-prefix": "ecs"
          }
        },
    "portMappings" : [
        {
          "containerPort" : 80,
          "hostPort"      : 80
        }
      ]
  }
]
TASK_DEFINITION
}


resource "aws_ecs_service" "PrimaryServerSideService" {
  name             = var.primary_service_name
  cluster          = aws_ecs_cluster.gtm.id
  task_definition  = aws_ecs_task_definition.PrimaryServerSideContainer.id
  desired_count    = var.primary_service_desired_count
  launch_type      = "FARGATE"
  platform_version = "LATEST"

  scheduling_strategy = "REPLICA"

  deployment_maximum_percent         = 200
  deployment_minimum_healthy_percent = 50

  network_configuration {
    assign_public_ip = true
    security_groups  = [aws_security_group.gtm-security-group.id]
    subnets          = data.aws_subnets.private.ids
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.PrimaryServerSideTarget.arn
    container_name   = "primary"
    container_port   = 80
  }

  lifecycle {
    ignore_changes = [task_definition]
  }
}

resource "aws_lb" "PrimaryServerSideLoadBalancer" {
  name               = "PrimaryServerSideLoadBalancer"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.gtm-security-group.id]
  subnets            = data.aws_subnets.public.ids

  enable_deletion_protection = false
}
....


I also tried adding these:

resource "aws_appautoscaling_target" "ecs_target" {
  max_capacity       = 4
  min_capacity       = 1
  resource_id        = "service/${aws_ecs_cluster.gtm.name}/${aws_ecs_service.PrimaryServerSideService.name}"
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"
}

resource "aws_appautoscaling_policy" "ecs_policy" {
  name               = "scale-down"
  policy_type        = "StepScaling"
  resource_id        = aws_appautoscaling_target.ecs_target.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs_target.service_namespace

  step_scaling_policy_configuration {
    adjustment_type         = "ChangeInCapacity"
    cooldown                = 60
    metric_aggregation_type = "Maximum"

    step_adjustment {
      metric_interval_upper_bound = 0
      scaling_adjustment          = -1
    }
  }
}

But the 502 errors still persist.

Best Answer

You are heading in the right direction; only two things are left to do:

  1. Decide which metric should tell you when to scale (most likely CPU utilization)
  2. Update your resource "aws_appautoscaling_policy" "ecs_policy" so it actually scales on the metric from point 1

Currently, your ecs_policy has no metric attached that it could scale on.
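A note on the StepScaling attempt above: a step policy is only executed when a CloudWatch alarm invokes it, and the single step defined there only removes capacity (scaling_adjustment = -1), so on its own it will never add tasks under load. If you wanted to stay with step scaling, the missing pieces are a scale-out policy and an alarm that triggers it. This is a minimal sketch, where the "scale-up" policy, the alarm name and the 80% CPU threshold are assumed placeholders to tune:

resource "aws_appautoscaling_policy" "ecs_scale_out" {
  # Hypothetical counterpart to the "scale-down" policy from the question:
  # adds one task each time the alarm below fires.
  name               = "scale-up"
  policy_type        = "StepScaling"
  resource_id        = aws_appautoscaling_target.ecs_target.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs_target.service_namespace

  step_scaling_policy_configuration {
    adjustment_type         = "ChangeInCapacity"
    cooldown                = 60
    metric_aggregation_type = "Maximum"

    step_adjustment {
      metric_interval_lower_bound = 0
      scaling_adjustment          = 1
    }
  }
}

resource "aws_cloudwatch_metric_alarm" "gtm_cpu_high" {
  # Hypothetical alarm: without an alarm_action, a step scaling policy never runs.
  alarm_name          = "gtm-primary-cpu-high"
  namespace           = "AWS/ECS"
  metric_name         = "CPUUtilization"
  statistic           = "Average"
  period              = 60
  evaluation_periods  = 2
  threshold           = 80
  comparison_operator = "GreaterThanThreshold"
  dimensions = {
    ClusterName = aws_ecs_cluster.gtm.name
    ServiceName = aws_ecs_service.PrimaryServerSideService.name
  }
  alarm_actions = [aws_appautoscaling_policy.ecs_scale_out.arn]
}

Target tracking is usually the simpler option, because Application Auto Scaling creates and manages these alarms for you.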

Here is an example of a target tracking policy on CPU:

resource "aws_appautoscaling_policy" "ecs_target_cpu" {
  name               = "application-scaling-policy-cpu"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.ecs_service_target.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs_service_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs_service_target.service_namespace
  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
    target_value = 80
  }
  depends_on = [aws_appautoscaling_target.ecs_service_target]
}
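Since the 502s appear under request volume rather than sustained CPU pressure, a request-based target can also be worth trying alongside (or instead of) CPU. The sketch below assumes the service is attached to the ALB and target group from the question; the target_value of 1000 is an assumed starting point for how many requests per minute each GTM task should absorb and needs tuning:

resource "aws_appautoscaling_policy" "ecs_target_requests" {
  name               = "application-scaling-policy-requests"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.ecs_target.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ALBRequestCountPerTarget"
      # Ties the metric to the load balancer / target group defined above.
      resource_label = "${aws_lb.PrimaryServerSideLoadBalancer.arn_suffix}/${aws_lb_target_group.PrimaryServerSideTarget.arn_suffix}"
    }
    # Assumed value: requests per minute a single 1 vCPU / 2 GB GTM task should handle.
    target_value       = 1000
    scale_out_cooldown = 60
    scale_in_cooldown  = 120
  }
}

Whichever metric you pick, also check that max_capacity = 4 on the aws_appautoscaling_target is really enough for your peak traffic, since target tracking stops adding tasks once that ceiling is reached.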

Regarding amazon-web-services - ECS service auto scaling, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/77501012/
