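We have been migrating our applications from CircleCI to GitHub Actions at our company, and we ran into a strange situation.
The project code has not changed, but our Kafka integration tests started failing on the GH Actions machines. Everything works fine on CircleCI and locally (macOS and Fedora Linux machines).
Both the CircleCI and GH Actions machines run Ubuntu (tested versions 18.04 and 20.04). macOS was not tested on GH Actions because it does not have Docker there.
These are the docker-compose and workflow files used for the build and the integration tests: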

    # docker-compose file
    version: '2.1'
    services:
      postgres:
        container_name: listings-postgres
        image: postgres:10-alpine
        mem_limit: 500m
        networks:
          - listings-stack
        ports:
          - "5432:5432"
        environment:
          POSTGRES_DB: listings
          POSTGRES_PASSWORD: listings
          POSTGRES_USER: listings
          PGUSER: listings
        healthcheck:
          test: ["CMD", "pg_isready"]
          interval: 1s
          timeout: 3s
          retries: 30
      listings-zookeeper:
        container_name: listings-zookeeper
        image: confluentinc/cp-zookeeper:6.2.0
        environment:
          ZOOKEEPER_CLIENT_PORT: 2181
          ZOOKEEPER_TICK_TIME: 2000
        networks:
          - listings-stack
        ports:
          - "2181:2181"
        healthcheck:
          test: nc -z localhost 2181 || exit -1
          interval: 10s
          timeout: 5s
          retries: 10
      listings-kafka:
        container_name: listings-kafka
        image: confluentinc/cp-kafka:6.2.0
        depends_on:
          listings-zookeeper:
            condition: service_healthy
        environment:
          KAFKA_BROKER_ID: 1
          KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://listings-kafka:9092,PLAINTEXT_HOST://localhost:29092
          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
          KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
          KAFKA_ZOOKEEPER_CONNECT: listings-zookeeper:2181
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock
        networks:
          - listings-stack
        ports:
          - "29092:29092"
        healthcheck:
          test: kafka-topics --bootstrap-server 127.0.0.1:9092 --list
          interval: 10s
          timeout: 10s
          retries: 50
    networks: {listings-stack: {}}

    # GitHub Actions workflow
    name: Build
    on: [ pull_request ]
    env:
      AWS_ACCESS_KEY_ID: ${{ secrets.TUNNEL_AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.TUNNEL_AWS_SECRET_ACCESS_KEY }}
      AWS_DEFAULT_REGION: 'us-east-1'
      CIRCLECI_KEY_TUNNEL: ${{ secrets.ID_RSA_CIRCLECI_TUNNEL }}
    jobs:
      build:
        name: Listings-API Build
        runs-on: [ self-hosted, zap ]
        steps:
          - uses: actions/checkout@v2
            with:
              token: ${{ secrets.GH_OLXBR_PAT }}
              submodules: recursive
              path: ./repo
              fetch-depth: 0
          - name: Set up JDK 11
            uses: actions/setup-java@v2
            with:
              distribution: 'adopt'
              java-version: '11'
              architecture: x64
              cache: 'gradle'
          - name: Docker up
            working-directory: ./repo
            run: docker-compose up -d
          - name: Build with Gradle
            working-directory: ./repo
            run: ./gradlew build -Dhttps.protocols=TLSv1,TLSv1.1,TLSv1.2 -x integrationTest
          - name: Integration tests with Gradle
            working-directory: ./repo
            run: ./gradlew integrationTest -Dhttps.protocols=TLSv1,TLSv1.1,TLSv1.2
          - name: Sonarqube
            working-directory: ./repo
            env:
              GITHUB_TOKEN: ${{ secrets.GH_OLXBR_PAT }}
              SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
            run: ./gradlew sonarqube --info -Dhttps.protocols=TLSv1,TLSv1.1,TLSv1.2
          - name: Docker down
            if: always()
            working-directory: ./repo
            run: docker-compose down --remove-orphans
          - name: Cleanup Gradle Cache
            # Remove some files from the Gradle cache, so they aren't cached by GitHub Actions.
            # Restoring these files from a GitHub Actions cache might cause problems for future builds.
            run: |
              rm -f ${{ env.HOME }}/.gradle/caches/modules-2/modules-2.lock
              rm -f ${{ env.HOME }}/.gradle/caches/modules-2/gc.properties

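The integration tests are written with the Spock framework, and the part where the error occurs is the following:

    boolean compareRecordSend(String topicName, int expected) {
        def condition = new PollingConditions()
        // Retry the assertion until it passes or 5x the listener poll timeout elapses.
        condition.within(kafkaProperties.listener.pollTimeout.getSeconds() * 5) {
            assert expected == getRecordSendTotal(topicName)
        }
        return true
    }

    int getRecordSendTotal(String topicName) {
        // Flush pending records, then read the producer metric for this topic (0 if absent).
        kafkaTemplate.flush()
        return kafkaTemplate.metrics().find {
            it.key.name() == "record-send-total" && it.key.tags().get("topic") == topicName
        }?.value?.metricValue() ?: 0
    }
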
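
As a rough mental model of what the polling assertion in `compareRecordSend` does, here is a minimal sketch in plain Java. It is not Spock's implementation: the class name `PollingSketch`, the method `eventuallyEquals`, and the fixed delay between attempts are all assumptions made for illustration. It re-reads a value until it matches the expected one or a timeout elapses.

```java
import java.time.Duration;
import java.util.function.IntSupplier;

// Hypothetical helper, for illustration only (not part of Spock or of our code base).
final class PollingSketch {

    /**
     * Re-evaluates `actual` until it equals `expected` or `timeout` elapses.
     * Returns true on a match, false on timeout or if the thread is interrupted.
     */
    static boolean eventuallyEquals(int expected, IntSupplier actual,
                                    Duration timeout, Duration delay) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (actual.getAsInt() == expected) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // the value never reached the expected count in time
            }
            try {
                Thread.sleep(delay.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        int[] sent = {0};
        // Stand-in for getRecordSendTotal: the count grows by one on every poll.
        boolean ok = eventuallyEquals(3, () -> ++sent[0],
                Duration.ofSeconds(1), Duration.ofMillis(10));
        System.out.println(ok); // prints true
    }
}
```

A supplier that always returns 0 would make this loop run until the deadline and then give up, which is the shape of the failure shown in the error: many attempts, with the metric still at 0.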
The error we get is:

    Condition not satisfied after 50.00 seconds and 496 attempts
        at spock.util.concurrent.PollingConditions.within(PollingConditions.java:185)
        at com.company.listings.KafkaAwareBaseSpec.compareRecordSend(KafkaAwareBaseSpec.groovy:31)
        at com.company.listings.application.worker.listener.notifier.ListingNotifierITSpec.should notify listings(ListingNotifierITSpec.groovy:44)
    Caused by:
    Condition not satisfied:
    expected == getRecordSendTotal(topicName)
    |        |  |                  |
    10       |  0                  v4
             false

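We have debugged the GH Actions machine (connecting via SSH) and run the tests manually. The error still occurs, but if the integration tests are run a second time (and on every subsequent run), everything works fine. We also tried initializing all the necessary topics and preemptively sending some messages to them, but the behavior was the same.
Our question is:
Edit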

    spring:
      kafka:
        bootstrap-servers: localhost:29092
        producer:
          batch-size: 262144
          buffer-memory: 536870912
          retries: 1
          key-serializer: org.apache.kafka.common.serialization.StringSerializer
          value-serializer: org.apache.kafka.common.serialization.ByteArraySerializer
          acks: all
          properties:
            linger.ms: 0

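Best answer
We identified some test-order dependencies between the Kafka tests.
We updated Gradle to version 7.3-rc-3, which has a more deterministic approach to test scanning. This update "solved" our problem while we work on fixing the dependencies between the tests.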
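
The answer does not say what the dependency between the tests actually was, so the following is only a generic, hypothetical illustration of how a test can come to depend on execution order. It is not a reconstruction of the bug described above. It assumes some shared cumulative counter (the class `OrderDependenceSketch` and its methods are invented for this sketch) and contrasts an absolute check, which depends on what ran before, with a check against a baseline captured inside the test.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example; names and scenario are assumptions, not taken from the project.
final class OrderDependenceSketch {

    // Shared state that survives across "tests", like a cumulative counter on a shared client.
    static final AtomicInteger sendTotal = new AtomicInteger();

    static void send(int records) {
        sendTotal.addAndGet(records);
    }

    // Order-dependent: only true if nothing else has sent records beforehand.
    static boolean absoluteCheck(int expected) {
        return sendTotal.get() == expected;
    }

    // Order-independent: compares against a baseline taken before the action under test.
    static boolean deltaCheck(int baseline, int expected) {
        return sendTotal.get() - baseline == expected;
    }

    public static void main(String[] args) {
        send(5);                          // an earlier "test" happens to run first
        int baseline = sendTotal.get();   // baseline captured by the later "test"
        send(10);                         // the action under test
        System.out.println(absoluteCheck(10));        // prints false
        System.out.println(deltaCheck(baseline, 10)); // prints true
    }
}
```

With checks of the first kind, any change in the order in which test classes are discovered and run can flip the result, which is why a more deterministic scan order can hide the problem without removing it.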
Regarding "docker - Kafka integration tests in Gradle running on GitHub Actions", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/69284830/