hadoop - ACLs for the Fair Scheduler in YARN are not working

Reposted by: 可可西里 · Updated: 2023-11-01 16:26:12


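I configured my queue with ACLs in fair-scheduler.xml, but other users can still run jobs in the same queue. Do I need to define ACLs for my queue somewhere else as well? Any links or help would be appreciated. Thanks.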

<queue name="queue1">
    <minResources>10000mb,10vcores</minResources>
    <maxResources>30000mb,30vcores</maxResources>
    <maxRunningApps>10</maxRunningApps>
    <weight>2.0</weight>
    <schedulingMode>fair</schedulingMode>
    <aclAdministerApps>User1</aclAdministerApps>
    <aclSubmitApps>User1</aclSubmitApps>
</queue>

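Best answer

Note: this is about the Capacity Scheduler. I am not sure whether the Fair Scheduler's ACL inheritance behaves differently.

ACLs are configured through yarn.scheduler.capacity.<queue-path>.acl_submit_applications; see the Capacity Scheduler documentation: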

yarn.scheduler.capacity.root.<queue-path>.acl_submit_applications The ACL which controls who can submit applications to the given queue. If the given user/group has necessary ACLs on the given queue or one of the parent queues in the hierarchy they can submit applications. ACLs for this property are inherited from the parent queue if not specified.

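Note the part about queues inheriting the ACL of their parent queue. Normally all queues inherit from the root queue, and the default capacity-scheduler.xml leaves the ACL set to *: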

<property>
 <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
 <value>*</value>
 <description>
  The ACL of who can submit jobs to the default queue.
 </description>
</property>

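As a result, all queues normally end up with an ACL that lets every user (*) submit. When you configure a queue, make sure you restrict the parent queues as well as the queue you care about.

Update

After looking at the Fair Scheduler (FS) queue code, I have to conclude that the behavior is the same. The access check is done in AllocationConfiguration.hasAccess():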

public boolean hasAccess(String queueName, QueueACL acl,
    UserGroupInformation user) {
  int lastPeriodIndex = queueName.length();
  while (lastPeriodIndex != -1) {
    String queue = queueName.substring(0, lastPeriodIndex);
    if (getQueueAcl(queue, acl).isUserAllowed(user)) {
      return true;
    }

    lastPeriodIndex = queueName.lastIndexOf('.', lastPeriodIndex - 1);
  }

  return false;
}

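Notice that the code walks up the queue hierarchy (by splitting the name at each period) until one of the parent queues grants access, exactly like the Capacity Scheduler. The walk continues until it reaches the root queue, at which point this piece of code kicks in: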

/**
 * Get the ACLs associated with this queue. If a given ACL is not explicitly
 * configured, include the default value for that ACL.  The default for the
 * root queue is everybody ("*") and the default for all other queues is
 * nobody ("")
 */
public AccessControlList getQueueAcl(String queue, QueueACL operation) {
  Map<QueueACL, AccessControlList> queueAcls = this.queueAcls.get(queue);
  if (queueAcls != null) {
    AccessControlList operationAcl = queueAcls.get(operation);
    if (operationAcl != null) {
      return operationAcl;
    }
  }
  return (queue.equals("root")) ? EVERYBODY_ACL : NOBODY_ACL;
}
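To make the effect of these two methods concrete, here is a minimal, self-contained sketch that imitates them with plain Java collections. It is not Hadoop code: the class name QueueAclSketch, the representation of an ACL as a Set<String> of user names, and the "*" wildcard check are stand-ins assumed for illustration. Only the walk over dotted name prefixes and the root/non-root defaults follow the quoted source, and the queue is assumed to be registered under the fully qualified name root.queue1.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Illustrative stand-in for the Fair Scheduler ACL lookup; not Hadoop's API.
final class QueueAclSketch {
    // Hypothetical simplification: an ACL is a set of user names, "*" means everybody.
    private static final Set<String> EVERYBODY = Set.of("*");
    private static final Set<String> NOBODY = Set.of();

    private final Map<String, Set<String>> submitAcls = new HashMap<>();

    void setSubmitAcl(String queue, Set<String> users) {
        submitAcls.put(queue, users);
    }

    // Like getQueueAcl: explicit ACL if configured, else "*" for root, nobody otherwise.
    Set<String> getQueueAcl(String queue) {
        Set<String> acl = submitAcls.get(queue);
        if (acl != null) {
            return acl;
        }
        return queue.equals("root") ? EVERYBODY : NOBODY;
    }

    // Like hasAccess: check the queue, then each parent, until one grants access.
    boolean hasAccess(String queueName, String user) {
        int lastPeriodIndex = queueName.length();
        while (lastPeriodIndex != -1) {
            String queue = queueName.substring(0, lastPeriodIndex);
            Set<String> acl = getQueueAcl(queue);
            if (acl.contains("*") || acl.contains(user)) {
                return true;
            }
            lastPeriodIndex = queueName.lastIndexOf('.', lastPeriodIndex - 1);
        }
        return false;
    }

    public static void main(String[] args) {
        QueueAclSketch acls = new QueueAclSketch();
        acls.setSubmitAcl("root.queue1", Set.of("User1"));
        // root has no explicit ACL here, so its default lets User2 through.
        System.out.println(acls.hasAccess("root.queue1", "User2")); // true

        // Restrict root explicitly; now only the ACL on queue1 decides.
        acls.setSubmitAcl("root", Set.of());
        System.out.println(acls.hasAccess("root.queue1", "User2")); // false
        System.out.println(acls.hasAccess("root.queue1", "User1")); // true
    }
}
```

In this sketch the ACL on queue1 alone never denies anyone, because the lookup falls through to the root default. Access is only limited to User1 once root is given an explicit, restrictive ACL.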

Also note how queues are loaded, in AllocationFileLoaderService.reloadAllocations():

// Load queue elements.  A root queue can either be included or omitted.  If
// it's included, all other queues must be inside it.
for (Element element : queueElements) {
  String parent = "root";
  ...
  loadQueue(parent, element, minQueueResources, maxQueueResources,
      queueMaxApps, userMaxApps, queueMaxAMShares, queueWeights,
      queuePolicies, minSharePreemptionTimeouts, queueAcls,
      configuredQueues);
}

/**
 * Loads a queue from a queue element in the configuration file
 */
private void loadQueue(String parentName, Element element, ...)
    throws AllocationConfigurationException {
  String queueName = element.getAttribute("name");
  if (parentName != null) {
    queueName = parentName + "." + queueName;
  }

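Notice how the queue name is concatenated with the name of its parent queue, and that "root" is the implicit parent of all queues. So your queue's name is really root.queue1.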
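The naming rule can be sketched on its own to show which names an access check ends up visiting. The class QueueNameSketch and its methods qualify and lookupOrder are hypothetical helpers written for this illustration, not part of Hadoop; they only reproduce the string concatenation from loadQueue and the prefix walk from hasAccess.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helper; the class and method names are not part of Hadoop.
final class QueueNameSketch {
    // Same concatenation as in loadQueue: prefix the parent name when there is one.
    static String qualify(String parentName, String name) {
        return parentName != null ? parentName + "." + name : name;
    }

    // Lists the names an access check would visit: the queue itself, then each ancestor.
    static List<String> lookupOrder(String queueName) {
        List<String> names = new ArrayList<>();
        int end = queueName.length();
        while (end != -1) {
            names.add(queueName.substring(0, end));
            end = queueName.lastIndexOf('.', end - 1);
        }
        return names;
    }

    public static void main(String[] args) {
        String full = qualify("root", "queue1");
        System.out.println(full);              // root.queue1
        System.out.println(lookupOrder(full)); // [root.queue1, root]
    }
}
```

Every top-level queue in the allocation file therefore has root as the last stop of its lookup, whether or not a root element is written out explicitly.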

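So this means that in the Fair Scheduler all queues give access to everybody by default, because they all inherit the root queue's default access. You need to explicitly override the root queue ACL in your allocation file. This is no different from the CapacityScheduler, but I think the CS behavior of taking the default from configuration is preferable to the FS behavior of taking the default from code.

I have not actually tested the FS behavior, but the code most likely executes as it reads.

Regarding "hadoop - ACLs for the Fair Scheduler in YARN are not working", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25846479/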
