Container killed by YARN for exceeding memory limits
Error message
WARN TaskSetManager: Lost task 49.2 in stage 6.0 (TID xxx,
xxx.xxx.xxx.compute.internal): ExecutorLostFailure (executor 16 exited caused by one
of the running tasks) Reason: Container killed by YARN for exceeding memory limits.
18 GB of 18 GB physical memory used. Consider boosting
spark.yarn.executor.memoryOverhead or disabling yarn.nodemanager.vmem-check-enabled...
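Root cause
A container on one of the YARN NodeManagers ran out of memory. In other words, the data held in that container grew too large and exceeded the container's memory limit, so YARN killed it.
So where is the container's memory limit controlled? To answer that, we need to look at YARN's resource-related parameters.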
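On the Spark side, the container size requested from YARN is roughly the executor memory plus the spark.yarn.executor.memoryOverhead named in the log. The sketch below works through that arithmetic, assuming Spark's documented default overhead (10% of executor memory, with a 384 MB floor). The function names and the 16 GB executor are hypothetical and only for illustration; they are not taken from the job that produced the log above.

```python
# Rough sketch of how a Spark executor container is sized on YARN.
# Assumption: default spark.yarn.executor.memoryOverhead, i.e.
# max(384 MB, 10% of spark.executor.memory). Names are illustrative.

MIN_OVERHEAD_MB = 384    # assumed floor for the default overhead
OVERHEAD_FACTOR = 0.10   # assumed default fraction of executor memory


def default_overhead_mb(executor_memory_mb):
    """Default off-heap overhead (MB) for a given executor memory (MB)."""
    return max(MIN_OVERHEAD_MB, int(executor_memory_mb * OVERHEAD_FACTOR))


def container_request_mb(executor_memory_mb, overhead_mb=None):
    """Memory (MB) requested from YARN: executor memory plus overhead.

    Pass overhead_mb to model an explicitly boosted
    spark.yarn.executor.memoryOverhead; otherwise the default is used.
    """
    if overhead_mb is None:
        overhead_mb = default_overhead_mb(executor_memory_mb)
    return executor_memory_mb + overhead_mb


if __name__ == "__main__":
    # Hypothetical 16 GB executor, default overhead vs. a 4 GB overhead.
    print(container_request_mb(16 * 1024))
    print(container_request_mb(16 * 1024, overhead_mb=4 * 1024))
```

If the physical memory used by the executor process exceeds this total, the NodeManager kills the container, which is why the log suggests raising the overhead rather than only the executor memory.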
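YARN resource parameters
- yarn.nodemanager.resource.memory-mb
The amount of physical memory, in MB, that each NodeManager makes available for YARN to schedule (that is, to allocate to containers).
- yarn.nodemanager.resource.cpu-vcores
The number of vcores that each NodeManager makes available for YARN to schedule (that is, to allocate to containers).
- yarn.scheduler.maximum-allocation-mb
The maximum amount of memory that a single container can request.
- yarn.scheduler.minimum-all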