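Failed to import container images error FABRIC_E_INVALID_OPERATION

Tags: linux azure nginx docker azure-service-fabric

I am trying to deploy a Linux container on a Service Fabric cluster running on Linux (preview), hosted on Azure.

I have run into a problem: my Service Fabric application, which has a single Nginx service, fails to run with the following error: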

Error event: SourceId='System.Hosting', Property='Download:1.0:1.0'. There was an error during download.Failed to download container image

I looked into the log files and found some messages that appear to be related to the error:

2017-07-05 08:20:23.833,Info,29803,30481,Hosting.ProcessActivationManager,Processing Ipc message with action DownloadContainerImages
2017-07-05 08:20:23.834,Info,29803,30481,Hosting.DockerProcessManager,Starting dockerprocessmanager processName /usr/bin/docker, args daemon -H localhost:2375 -H unix:///var/run/docker.sock
2017-07-05 08:20:23.834,Info,30547,30481,Common.ProcessWait,completed 0 waiters
2017-07-05 08:20:23.837,Info,29803,30481,Hosting.DockerProcessManager,Docker process has started. 29806
2017-07-05 08:20:23.850,Warning,30492,30481,Hosting.ContainerImageDownloader,Failed to get history for Image, error Failed to connect to any resolved endpoint
2017-07-05 08:20:23.850,Info,30492,30481,Hosting.ContainerImageDownloader,CheckDecrement count 0
2017-07-05 08:20:23.850,Warning,30492,30481,Hosting.ContainerActivator,Failed to import docker image error FABRIC_E_INVALID_OPERATION.
2017-07-05 08:20:23.850,Info,30492,30481,Transport.Enqueue@7f4cbda4ef20,9aff2afa-8f9e-a34e-9d67-4bf57c605eb8:120476 true  125B @ qsize 0/0B
2017-07-05 08:20:23.850,Warning,30492,30481,Hosting.ProcessActivationManager,DownloadContainerImages returned FABRIC_E_INVALID_OPERATION
2017-07-05 08:20:23.855,Info,29705,30566,Transport.Msg_Dispatch@7f42be16bc20,9aff2afa-8f9e-a34e-9d67-4bf57c605eb8:120476 true 1 125B
2017-07-05 08:20:23.855,Warning,29583,30566,Hosting.DownloadManager@9a8431474352dcc2e88fa9ad6af912b1:131437078006900280,Failed to import container images error FABRIC_E_INVALID_OPERATION.
2017-07-05 08:20:23.855,Info,29583,30566,Hosting.DownloadManager@9a8431474352dcc2e88fa9ad6af912b1:131437078006900280,Download container images count 1 for activationcontext  error FABRIC_E_INVALID_OPERATION.
2017-07-05 08:20:23.855,Warning,29583,30566,Hosting.DownloadManager@9a8431474352dcc2e88fa9ad6af912b1:131437078006900280,Download: Download:LinuxContainerServiceFabricApplicationType_App1:NginxGuestContainerPkg:1.0:1.0, ErrorCode=FABRIC_E_INVALID_OPERATION, RetryCount=7

But I don't understand why it fails to get the history for the image with the error "Failed to connect to any resolved endpoint". Here is my ServiceManifest:

<?xml version="1.0" encoding="utf-8"?>
<ServiceManifest Name="NginxGuestContainerPkg"
                 Version="1.0.0"
                 xmlns="http://schemas.microsoft.com/2011/01/fabric"
                 xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <ServiceTypes>
    <!-- This is the name of your ServiceType.
         The UseImplicitHost attribute indicates this is a guest service. -->
    <StatelessServiceType ServiceTypeName="NginxGuestContainerType" UseImplicitHost="true" />
  </ServiceTypes>

  <!-- Code package is your service executable. -->
  <CodePackage Name="Code" Version="1.0.0">
    <EntryPoint>
      <!-- Follow this link for more information about deploying Windows containers to Service Fabric: https://aka.ms/sfguestcontainers -->
      <ContainerHost>
        <ImageName>library/nginx:1.13.0-alpine-perl</ImageName>
      </ContainerHost>
    </EntryPoint>
    <!-- Pass environment variables to your container: -->
    <!--
    <EnvironmentVariables>
      <EnvironmentVariable Name="VariableName" Value="VariableValue"/>
    </EnvironmentVariables>
    -->
  </CodePackage>

  <!-- Config package is the contents of the Config directory under PackageRoot that contains an
       independently-updateable and versioned set of custom configuration settings for your service. -->
  <ConfigPackage Name="Config" Version="1.0.0" />

  <Resources>
    <Endpoints>
      <!-- This endpoint is used by the communication listener to obtain the port on which to 
           listen. Please note that if your service is partitioned, this port is shared with 
           replicas of different partitions that are placed in your code. -->
      <Endpoint Name="NginxGuestContainerTypeEndpoint" Protocol="http" UriScheme="http" Port="80" />
    </Endpoints>
  </Resources>
</ServiceManifest>

And my ApplicationManifest:

<?xml version="1.0" encoding="utf-8"?>
<ApplicationManifest ApplicationTypeName="LinuxContainerServiceFabricApplicationType"
                     ApplicationTypeVersion="1.0.0"
                     xmlns="http://schemas.microsoft.com/2011/01/fabric"
                     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Parameters>
    <Parameter Name="NginxGuestContainer_InstanceCount" DefaultValue="-1" />
  </Parameters>
  <!-- Import the ServiceManifest from the ServicePackage. The ServiceManifestName and ServiceManifestVersion 
       should match the Name and Version attributes of the ServiceManifest element defined in the 
       ServiceManifest.xml file. -->
  <ServiceManifestImport>
    <ServiceManifestRef ServiceManifestName="NginxGuestContainerPkg" ServiceManifestVersion="1.0.0" />
    <ConfigOverrides />
    <Policies>
      <ResourceGovernancePolicy CodePackageRef="Code" CpuShares="500" MemoryInMB="1024" MemorySwapInMB="4084" MemoryReservationInMB="1024" />
      <ContainerHostPolicies CodePackageRef="Code">
        <RepositoryCredentials AccountName="someusername" Password="" PasswordEncrypted="false"/>
        <PortBinding ContainerPort="80" EndpointRef="NginxGuestContainerTypeEndpoint"/>
      </ContainerHostPolicies>
    </Policies>
  </ServiceManifestImport>
  <DefaultServices>
    <!-- The section below creates instances of service types, when an instance of this 
         application type is created. You can also create one or more instances of service type using the 
         ServiceFabric PowerShell module.

         The attribute ServiceTypeName below must match the name defined in the imported ServiceManifest.xml file. -->
    <Service Name="NginxGuestContainer">
      <StatelessService ServiceTypeName="NginxGuestContainerType" InstanceCount="[NginxGuestContainer_InstanceCount]">
        <SingletonPartition />
      </StatelessService>
    </Service>
  </DefaultServices>
</ApplicationManifest>

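Can you help me figure out what I am doing wrong? Thanks.

Update

I am not sure whether this is what causes the problem, but when I SSH into one of the SF nodes, I find that the Docker service is stopped. When I start it and run a pull manually, it is stopped automatically after a minute. Here is the log from systemctl: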

Jul 05 09:25:51 default000000 dockerd[41096]: time="2017-07-05T09:25:51.835329455Z" level=info msg="Loading containers: done."
Jul 05 09:25:51 default000000 dockerd[41096]: time="2017-07-05T09:25:51.946744849Z" level=info msg="Daemon has completed initialization"
Jul 05 09:25:51 default000000 dockerd[41096]: time="2017-07-05T09:25:51.946809649Z" level=info msg="Docker daemon" commit=02c1d87 graphdriver=aufs version=17.06.0-ce
Jul 05 09:25:51 default000000 dockerd[41096]: time="2017-07-05T09:25:51.961652188Z" level=info msg="API listen on /var/run/docker.sock"
Jul 05 09:25:51 default000000 systemd[1]: Started Docker Application Container Engine.
Jul 05 09:26:53 default000000 systemd[1]: Stopping Docker Application Container Engine...
Jul 05 09:26:53 default000000 dockerd[41096]: time="2017-07-05T09:26:53.919115662Z" level=info msg="Processing signal 'terminated'"
Jul 05 09:26:53 default000000 dockerd[41096]: time="2017-07-05T09:26:53.954315756Z" level=info msg="stopping containerd after receiving terminated"
Jul 05 09:26:54 default000000 systemd[1]: Stopped Docker Application Container Engine.
Jul 05 09:26:55 default000000 systemd[1]: Stopped Docker Application Container Engine.
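
As a rough check on the "after a minute" observation, the interval between the "Started" and "Stopping" entries in the journal excerpt above can be computed. This is only a sketch: `seconds_between` is a hypothetical helper, and the year 2017 is an assumption taken from the `time=` fields in the dockerd lines, since the syslog-style prefix carries no year.

```python
from datetime import datetime


def seconds_between(start_stamp: str, stop_stamp: str, year: int = 2017) -> float:
    """Seconds elapsed between two syslog-style stamps such as 'Jul 05 09:25:51'."""
    # The year is prepended because the journal prefix does not include one.
    fmt = "%Y %b %d %H:%M:%S"
    start = datetime.strptime(f"{year} {start_stamp}", fmt)
    stop = datetime.strptime(f"{year} {stop_stamp}", fmt)
    return (stop - start).total_seconds()


# Stamps copied from the "Started" and "Stopping" journal lines above.
print(seconds_between("Jul 05 09:25:51", "Jul 05 09:26:53"))  # 62.0
```

So the daemon receives the stop request about 62 seconds after it finishes starting, which matches the behaviour described above.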

Best answer

  <ContainerHost>
    <ImageName>library/nginx:1.13.0-alpine-perl</ImageName>
  </ContainerHost>

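This looks suspicious. Can this image name be resolved? Your logs state:

2017-07-05 08:20:23.850,Warning,30492,30481,Hosting.ContainerImageDownloader,Failed to get history for Image, error Failed to connect to any resolved endpoint

I am doing something similar with AzureCR: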

  <ContainerHost>
    <ImageName>xxxxxx.azurecr.io/hedge-app:298</ImageName>
  </ContainerHost>

Can you pull the image from the CLI?
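
Before retrying the pull, it may also be worth confirming that the Docker daemon is reachable at the TCP endpoint Service Fabric starts it with. The following is a minimal sketch, not a Service Fabric or Docker API: `can_connect` is a hypothetical helper, and `localhost:2375` is an assumption taken from the `-H localhost:2375` argument in the question's log.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Endpoint assumed from "-H localhost:2375" in the DockerProcessManager log line.
    print("Docker TCP endpoint reachable:", can_connect("localhost", 2375))
```

If this prints False on the node, the "Failed to connect to any resolved endpoint" warning is probably about the daemon itself rather than the registry, which would fit the stopped Docker service described in the update.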

Source: the original question on Stack Overflow: https://stackoverflow.com/questions/44921543/
