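I'm trying to deploy a Linux container on a Service Fabric cluster running Linux (preview) hosted on Azure.
The problem is that I can't run my Service Fabric application, which has a single Nginx service. It fails with the following error: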
Error event: SourceId='System.Hosting', Property='Download:1.0:1.0'. There was an error during download.Failed to download container image
I looked through the log files and found some messages that appear to be related to the error:
2017-07-05 08:20:23.833,Info,29803,30481,Hosting.ProcessActivationManager,Processing Ipc message with action DownloadContainerImages
2017-07-05 08:20:23.834,Info,29803,30481,Hosting.DockerProcessManager,Starting dockerprocessmanager processName /usr/bin/docker, args daemon -H localhost:2375 -H unix:///var/run/docker.sock
2017-07-05 08:20:23.834,Info,30547,30481,Common.ProcessWait,completed 0 waiters
2017-07-05 08:20:23.837,Info,29803,30481,Hosting.DockerProcessManager,Docker process has started. 29806
2017-07-05 08:20:23.850,Warning,30492,30481,Hosting.ContainerImageDownloader,Failed to get history for Image, error Failed to connect to any resolved endpoint
2017-07-05 08:20:23.850,Info,30492,30481,Hosting.ContainerImageDownloader,CheckDecrement count 0
2017-07-05 08:20:23.850,Warning,30492,30481,Hosting.ContainerActivator,Failed to import docker image error FABRIC_E_INVALID_OPERATION.
2017-07-05 08:20:23.850,Info,30492,30481,Transport.Enqueue@7f4cbda4ef20,9aff2afa-8f9e-a34e-9d67-4bf57c605eb8:120476 true 125B @ qsize 0/0B
2017-07-05 08:20:23.850,Warning,30492,30481,Hosting.ProcessActivationManager,DownloadContainerImages returned FABRIC_E_INVALID_OPERATION
2017-07-05 08:20:23.855,Info,29705,30566,Transport.Msg_Dispatch@7f42be16bc20,9aff2afa-8f9e-a34e-9d67-4bf57c605eb8:120476 true 1 125B
2017-07-05 08:20:23.855,Warning,29583,30566,Hosting.DownloadManager@9a8431474352dcc2e88fa9ad6af912b1:131437078006900280,Failed to import container images error FABRIC_E_INVALID_OPERATION.
2017-07-05 08:20:23.855,Info,29583,30566,Hosting.DownloadManager@9a8431474352dcc2e88fa9ad6af912b1:131437078006900280,Download container images count 1 for activationcontext error FABRIC_E_INVALID_OPERATION.
2017-07-05 08:20:23.855,Warning,29583,30566,Hosting.DownloadManager@9a8431474352dcc2e88fa9ad6af912b1:131437078006900280,Download: Download:LinuxContainerServiceFabricApplicationType_App1:NginxGuestContainerPkg:1.0:1.0, ErrorCode=FABRIC_E_INVALID_OPERATION, RetryCount=7
But I don't understand what causes "Failed to get history for Image, error Failed to connect to any resolved endpoint".
Here is my ServiceManifest:
<?xml version="1.0" encoding="utf-8"?>
<ServiceManifest Name="NginxGuestContainerPkg"
Version="1.0.0"
xmlns="http://schemas.microsoft.com/2011/01/fabric"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ServiceTypes>
<!-- This is the name of your ServiceType.
The UseImplicitHost attribute indicates this is a guest service. -->
<StatelessServiceType ServiceTypeName="NginxGuestContainerType" UseImplicitHost="true" />
</ServiceTypes>
<!-- Code package is your service executable. -->
<CodePackage Name="Code" Version="1.0.0">
<EntryPoint>
<!-- Follow this link for more information about deploying Windows containers to Service Fabric: https://aka.ms/sfguestcontainers -->
<ContainerHost>
<ImageName>library/nginx:1.13.0-alpine-perl</ImageName>
</ContainerHost>
</EntryPoint>
<!-- Pass environment variables to your container: -->
<!--
<EnvironmentVariables>
<EnvironmentVariable Name="VariableName" Value="VariableValue"/>
</EnvironmentVariables>
-->
</CodePackage>
<!-- Config package is the contents of the Config directory under PackageRoot that contains an
independently-updateable and versioned set of custom configuration settings for your service. -->
<ConfigPackage Name="Config" Version="1.0.0" />
<Resources>
<Endpoints>
<!-- This endpoint is used by the communication listener to obtain the port on which to
listen. Please note that if your service is partitioned, this port is shared with
replicas of different partitions that are placed in your code. -->
<Endpoint Name="NginxGuestContainerTypeEndpoint" Protocol="http" UriScheme="http" Port="80" />
</Endpoints>
</Resources>
</ServiceManifest>
And my ApplicationManifest:
<?xml version="1.0" encoding="utf-8"?>
<ApplicationManifest ApplicationTypeName="LinuxContainerServiceFabricApplicationType"
ApplicationTypeVersion="1.0.0"
xmlns="http://schemas.microsoft.com/2011/01/fabric"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Parameters>
<Parameter Name="NginxGuestContainer_InstanceCount" DefaultValue="-1" />
</Parameters>
<!-- Import the ServiceManifest from the ServicePackage. The ServiceManifestName and ServiceManifestVersion
should match the Name and Version attributes of the ServiceManifest element defined in the
ServiceManifest.xml file. -->
<ServiceManifestImport>
<ServiceManifestRef ServiceManifestName="NginxGuestContainerPkg" ServiceManifestVersion="1.0.0" />
<ConfigOverrides />
<Policies>
<ResourceGovernancePolicy CodePackageRef="Code" CpuShares="500" MemoryInMB="1024" MemorySwapInMB="4084" MemoryReservationInMB="1024" />
<ContainerHostPolicies CodePackageRef="Code">
<RepositoryCredentials AccountName="someusername" Password="" PasswordEncrypted="false"/>
<PortBinding ContainerPort="80" EndpointRef="NginxGuestContainerTypeEndpoint"/>
</ContainerHostPolicies>
</Policies>
</ServiceManifestImport>
<DefaultServices>
<!-- The section below creates instances of service types, when an instance of this
application type is created. You can also create one or more instances of service type using the
ServiceFabric PowerShell module.
The attribute ServiceTypeName below must match the name defined in the imported ServiceManifest.xml file. -->
<Service Name="NginxGuestContainer">
<StatelessService ServiceTypeName="NginxGuestContainerType" InstanceCount="[NginxGuestContainer_InstanceCount]">
<SingletonPartition />
</StatelessService>
</Service>
</DefaultServices>
</ApplicationManifest>
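Can you help me point out what I'm doing wrong? Thanks.
Update
I'm not sure whether this is the cause, but when I SSH into one of the SF nodes I find that the Docker service is stopped. When I start it and run a pull manually, it stops on its own after about a minute. Here is the log from systemctl: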
Jul 05 09:25:51 default000000 dockerd[41096]: time="2017-07-05T09:25:51.835329455Z" level=info msg="Loading containers: done."
Jul 05 09:25:51 default000000 dockerd[41096]: time="2017-07-05T09:25:51.946744849Z" level=info msg="Daemon has completed initialization"
Jul 05 09:25:51 default000000 dockerd[41096]: time="2017-07-05T09:25:51.946809649Z" level=info msg="Docker daemon" commit=02c1d87 graphdriver=aufs version=17.06.0-ce
Jul 05 09:25:51 default000000 dockerd[41096]: time="2017-07-05T09:25:51.961652188Z" level=info msg="API listen on /var/run/docker.sock"
Jul 05 09:25:51 default000000 systemd[1]: Started Docker Application Container Engine.
Jul 05 09:26:53 default000000 systemd[1]: Stopping Docker Application Container Engine...
Jul 05 09:26:53 default000000 dockerd[41096]: time="2017-07-05T09:26:53.919115662Z" level=info msg="Processing signal 'terminated'"
Jul 05 09:26:53 default000000 dockerd[41096]: time="2017-07-05T09:26:53.954315756Z" level=info msg="stopping containerd after receiving terminated"
Jul 05 09:26:54 default000000 systemd[1]: Stopped Docker Application Container Engine.
Jul 05 09:26:55 default000000 systemd[1]: Stopped Docker Application Container Engine.
Best answer
<ContainerHost>
<ImageName>library/nginx:1.13.0-alpine-perl</ImageName>
</ContainerHost>
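This looks suspicious. Can that image name be resolved? Your log states:
2017-07-05 08:20:23.850,Warning,30492,30481,Hosting.ContainerImageDownloader,Failed to get history for Image, error Failed to connect to any resolved endpoint
I'm doing something similar with AzureCR: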
<ContainerHost>
<ImageName>xxxxxx.azurecr.io/hedge-app:298</ImageName>
</ContainerHost>
Can you pull the image from the CLI?
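One way to run that check is sketched below. It assumes an SSH session on one of the cluster nodes with the docker CLI on the PATH. The function name check_pull is hypothetical and only for illustration; the image name is the one from the ServiceManifest in the question. It first confirms that the Docker daemon answers at all (the update in the question suggests it may not), then attempts the pull.

```shell
#!/bin/sh
# Image name copied from the ServiceManifest in the question.
IMAGE="library/nginx:1.13.0-alpine-perl"

# check_pull is a hypothetical helper: it verifies that the docker CLI
# exists and that the daemon responds before trying to pull the image.
check_pull() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker CLI not found"
        return 1
    fi
    if ! docker info >/dev/null 2>&1; then
        # The daemon is not answering; inspect it with:
        #   sudo systemctl status docker
        echo "docker daemon not reachable"
        return 2
    fi
    docker pull "$1"
}

check_pull "$IMAGE" || echo "pull failed with status $?"
```

If the pull works from the shell but still fails through Service Fabric, the image name itself is probably fine and the problem is more likely in how the Docker daemon is started or kept running on the node.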
Source: a similar question on Stack Overflow, "linux - Failed to import container images error FABRIC_E_INVALID_OPERATION": https://stackoverflow.com/questions/44921543/