I am trying to run Spark on Kubernetes, using Kubernetes as the scheduler.
It works fine when run from outside the Kubernetes cluster through kubectl proxy:
spark-shell --master k8s://http://localhost:8001 --conf spark.kubernetes.container.image=abdoumediaoptimise/spark
However, whenever we try to run spark-shell or spark-submit directly from inside a Pod, it never works, even after following the RBAC setup from the Spark docs and passing:
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark
We get an authorization exception:
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://kubernetes/api/v1/namespaces/default/pods?labelSelector=spark-app-selector%3Dspark-application-1574714537374,spark-role%3Dexecutor. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods is forbidden: User "system:serviceaccount:default:default" cannot list resource "pods" in API group "" in the namespace "default"
Any idea how to launch Spark from within a Pod? As it stands, this makes it impossible to use spark k8s:// from notebooks.
Spark RBAC YAML file:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spark
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: spark
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit
subjects:
- kind: ServiceAccount
  name: spark
  namespace: default
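One way to check what this binding actually grants is to ask the API server directly. The sketch below wraps kubectl auth can-i with impersonation. The helper name can_list_pods is hypothetical, and it assumes kubectl is configured with an identity that is allowed to impersonate service accounts.

```shell
#!/bin/sh
# Hypothetical helper: ask the API server whether a ServiceAccount may list
# Pods in a namespace. It passes through whatever `kubectl auth can-i` prints.
# Assumes the caller is allowed to impersonate service accounts.
can_list_pods() {
  namespace=$1
  account=$2
  kubectl auth can-i list pods \
    --namespace "$namespace" \
    --as "system:serviceaccount:${namespace}:${account}"
}

# Usage, run against the cluster:
#   can_list_pods default spark     # the account bound to the edit ClusterRole
#   can_list_pods default default   # the account named in the exception
```

Comparing the two answers shows whether the problem is the binding itself or the identity the Pod is running as. The exception above names system:serviceaccount:default:default, not the spark account.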
Best answer
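spark.kubernetes.authenticate.driver.serviceAccountName is the name of the ServiceAccount that the Spark driver's Kubernetes client uses to authenticate to the Kubernetes API when it requests executors.
What you are looking for is spark.kubernetes.authenticate.submission.*, the options that configure how the SparkSubmit application's Kubernetes client authenticates to the Kubernetes API when it requests the Service, the ConfigMap, and the driver Pod.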
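To make this work, configure your Pod with the ServiceAccount you want it to use: spec.serviceAccountName: <your-SA>
Then use the files in the /var/run/secrets/kubernetes.io/serviceaccount directory, which Kubernetes mounts into the Pod, to set the spark.kubernetes.authenticate.submission.* options.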
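The two steps above can be sketched roughly as follows. This is a minimal illustration, not a tested deployment: the helper names, the Pod name spark-client, and the sleep command are hypothetical placeholders. It also assumes the mounted directory contains the usual ca.crt and token files, and that the API server is reachable in-cluster at kubernetes.default.svc.

```shell
#!/bin/sh
# Directory where Kubernetes mounts the Pod's ServiceAccount credentials.
SA_DIR=/var/run/secrets/kubernetes.io/serviceaccount

# Hypothetical helper: print a Pod manifest that runs under the given
# ServiceAccount. Pod name, container name, and command are placeholders.
client_pod_manifest() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: spark-client
  namespace: default
spec:
  serviceAccountName: $1
  containers:
  - name: spark-client
    image: abdoumediaoptimise/spark
    command: ["sleep", "infinity"]
EOF
}

# Hypothetical helper: print the --conf flags that point the submission
# client at the mounted CA certificate and token, one argument per line.
submission_auth_conf() {
  printf '%s\n' \
    "--conf" "spark.kubernetes.authenticate.submission.caCertFile=$SA_DIR/ca.crt" \
    "--conf" "spark.kubernetes.authenticate.submission.oauthTokenFile=$SA_DIR/token"
}

# Usage:
#   client_pod_manifest spark | kubectl apply -f -
# then, in a shell inside that Pod where submission_auth_conf is defined:
#   spark-submit --master k8s://https://kubernetes.default.svc \
#     --deploy-mode cluster \
#     $(submission_auth_conf) \
#     --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
#     --conf spark.kubernetes.container.image=abdoumediaoptimise/spark \
#     <application-jar>
```

The Spark documentation notes that in client mode, which is what spark-shell uses, the equivalent options drop the submission prefix (spark.kubernetes.authenticate.caCertFile and spark.kubernetes.authenticate.oauthTokenFile), so the flag names may need adjusting for that case.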
For more on apache-spark - Spark Kubernetes, see the similar question on Stack Overflow: https://stackoverflow.com/questions/59039871/