python - Docker permission error: [Errno 13] Permission denied: 'tmp'

Tags: python docker scrapy dockerfile

I am new to Docker and have built a custom container to run my spider on my cloud server. My scraper is built with Python 3.6, Scrapy 1.6, and Selenium, and everything runs in a single Docker container. When the spider starts, my Scrapy open_spider method runs another Python script in the directory that generates the URLs for Scrapy to crawl. That script saves the links in a text file, but I get PermissionError: [Errno 13] Permission denied: 'tmp'.

I have tried running chmod 777 and chmod a+rw on the tmp folder so that I can create the text file, but I still get the permission denied error. I have been researching this for several days and cannot figure out how to solve it.
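Note that the error names a bare 'tmp' with no leading slash, which means the script is opening a path relative to whatever the container's working directory happens to be. A minimal sketch of a more robust approach (the names here are assumptions for illustration, not taken from the original spider) resolves the output directory against an absolute base and checks writability before writing:

```python
import os

# Hypothetical helper: anchor the tmp directory to this file's location
# rather than a bare relative "tmp", so the result does not depend on
# the container's current working directory.
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
TMP_DIR = os.path.join(BASE_DIR, "tmp")

def save_links(links):
    # create the directory if it is missing; exist_ok avoids an error on re-runs
    os.makedirs(TMP_DIR, exist_ok=True)
    # verify the directory is actually writable by the current user
    if not os.access(TMP_DIR, os.W_OK):
        raise PermissionError("cannot write to {}".format(TMP_DIR))
    out_path = os.path.join(TMP_DIR, "links.txt")
    with open(out_path, "w") as f:
        f.write("\n".join(links))
    return out_path
```

If the open call still fails with an absolute path, the problem is the ownership of the directory inside the image rather than the path itself, which is what the accepted answer addresses.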

The OS on my laptop is Ubuntu 18.04.

Here is my Dockerfile:

Dockerfile

FROM scrapinghub/scrapinghub-stack-scrapy:1.6-py3
RUN apt-get -y --no-install-recommends install zip unzip jq libxml2 libxml2-dev
RUN printf "deb http://archive.debian.org/debian/ jessie main\ndeb-src http://archive.debian.org/debian/ jessie main\ndeb http://security.debian.org jessie/updates main\ndeb-src http://security.debian.org jessie/updates main" > /etc/apt/sources.list


#============================================
# Google Chrome
#============================================
# can specify versions by CHROME_VERSION;
#  e.g. google-chrome-stable=53.0.2785.101-1
#       google-chrome-beta=53.0.2785.92-1
#       google-chrome-unstable=54.0.2840.14-1
#       latest (equivalent to google-chrome-stable)
#       google-chrome-beta  (pull latest beta)
#============================================
ARG CHROME_VERSION="google-chrome-stable"
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
  && echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list \
  && apt-get update -qqy \
  && apt-get -qqy install \
    ${CHROME_VERSION:-google-chrome-stable} \
  && rm /etc/apt/sources.list.d/google-chrome.list \
  && rm -rf /var/lib/apt/lists/* /var/cache/apt/*

#============================================
# Chrome Webdriver
#============================================
# can specify versions by CHROME_DRIVER_VERSION
# Latest released version will be used by default
#============================================
ARG CHROME_DRIVER_VERSION
RUN CHROME_STRING=$(google-chrome --version) \
  && CHROME_VERSION_STRING=$(echo "${CHROME_STRING}" | grep -oP "\d+\.\d+\.\d+\.\d+") \
  && CHROME_MAYOR_VERSION=$(echo "${CHROME_VERSION_STRING%%.*}") \
  && wget --no-verbose -O /tmp/LATEST_RELEASE "https://chromedriver.storage.googleapis.com/LATEST_RELEASE_${CHROME_MAYOR_VERSION}" \
  && CD_VERSION=$(cat "/tmp/LATEST_RELEASE") \
  && rm /tmp/LATEST_RELEASE \
  && if [ -z "$CHROME_DRIVER_VERSION" ]; \
     then CHROME_DRIVER_VERSION="${CD_VERSION}"; \
     fi \
  && CD_VERSION=$(echo $CHROME_DRIVER_VERSION) \
  && echo "Using chromedriver version: "$CD_VERSION \
  && wget --no-verbose -O /tmp/chromedriver_linux64.zip https://chromedriver.storage.googleapis.com/$CD_VERSION/chromedriver_linux64.zip \
  && rm -rf /opt/selenium/chromedriver \
  && unzip /tmp/chromedriver_linux64.zip -d /opt/selenium \
  && rm /tmp/chromedriver_linux64.zip \
  && mv /opt/selenium/chromedriver /opt/selenium/chromedriver-$CD_VERSION \
  && chmod 755 /opt/selenium/chromedriver-$CD_VERSION \
  && sudo ln -fs /opt/selenium/chromedriver-$CD_VERSION /usr/bin/chromedriver


#============================================
# crawlera-headless-proxy
#============================================

RUN curl -L https://github.com/scrapinghub/crawlera-headless-proxy/releases/download/1.1.1/crawlera-headless-proxy-linux-amd64 -o /usr/local/bin/crawlera-headless-proxy \
 && chmod +x /usr/local/bin/crawlera-headless-proxy
RUN chmod a+rw app/cars/spiders
RUN chmod a+rw app/cars/tmp
COPY ./start-crawl /usr/local/bin/start-crawl
ENV TERM xterm
ENV SCRAPY_SETTINGS_MODULE cars.settings
RUN pip install --upgrade pip
RUN mkdir -p /app
WORKDIR /app
COPY ./requirements.txt /app/requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
COPY . /app
RUN python setup.py install
RUN chmod a+rw app/cars/tmp

And here is my setup.py file:

# Automatically created by: shub deploy

from setuptools import setup, find_packages

setup(
    name='cars',
    version='1.0',
    packages=find_packages(),
    entry_points={'scrapy': ['settings = cars.settings']},
)

Best answer

Add the following to your Dockerfile:

RUN adduser --disabled-login dockeruser
RUN chown -R dockeruser:dockeruser /tmp/
USER dockeruser

Note: if --disabled-login does not work, use --disabled-password instead.
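The steps above create a non-root user, hand it ownership of /tmp, and switch to it for the rest of the build and at runtime (the chown must run before the USER instruction, since only root can change ownership). To confirm inside the running container that the spider's process can actually write where it needs to, a small diagnostic like the following can help (this helper is an illustration, not part of the answer):

```python
import os
import stat

def can_write(path="/tmp"):
    """Report who the process runs as and whether it may write to path."""
    st = os.stat(path)
    return {
        "uid": os.getuid(),                    # effective user id of the process
        "owner": st.st_uid,                    # numeric owner of the directory
        "mode": oct(stat.S_IMODE(st.st_mode)), # permission bits, e.g. '0o777'
        "writable": os.access(path, os.W_OK),  # can this process write here?
    }
```

If "writable" is False even after the chown, check whether the script is using a relative 'tmp' path that resolves somewhere other than /tmp.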

Regarding python - Docker permission error: [Errno 13] Permission denied: 'tmp', we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59271844/
