Error when too many MPI processes are assigned to a job

Shell scripting (sh/bash/dash/ksh/zsh, etc.)
lizijiang
Posts: 5
Joined: 2021-03-30 21:03
System: Linux

Error when too many MPI processes are assigned to a job

#1

Post by lizijiang » 2021-04-02 18:04

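I am on Linux and connect to the supercomputer through Xshell. When I query the available nodes with yhi, 6 nodes are idle, although they belong to different partitions, and each node has 12 cores. I submit a PBS file that requests 1 node and 6 cores and launches FDS through mpirun.
The run fails with this error:

Code: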

ERROR:too many MPI processes have been assigned to this job
My shell file contains:

Code:

#!/bin/bash
#PBS -N detailed_12
#PBS -e /THL6/home/openfoam/FDS/FDSmodel/single_building/detailed_12/detailed_12.err
#PBS -o /THL6/home/openfoam/FDS/FDSmodel/single_building/detailed_12/detailed_12.log
#PBS -l nodes=1:ppn=6
#PBS -l walltime=24:00:00
export OPM_NUM_THREADS=4
export I_MPI_PIN_DOMAIN=omp
cd /THL6/home/openfoam/FDS/FDSmodel/single_building/detailed_12
mpiexec -n 6 fds detailed_12.fds
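Is this caused by not having enough nodes allocated to me, by a problem in my FDS setup, or by something else?
Apologies if this is the wrong board. I looked for a long time and could not tell where the question belongs.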
onlylove
Forum moderator
Posts: 5234
Joined: 2007-01-14 16:23

Re: Error when too many MPI processes are assigned to a job

#2

Post by onlylove » 2021-04-02 20:50

I looked through the boards; let's leave it here for now.
oneleaf
Forum administrator
Posts: 10441
Joined: 2005-03-27 0:06
System: Ubuntu 12.04

Re: Error when too many MPI processes are assigned to a job

#3

Post by oneleaf » 2021-04-03 0:06

Code:

IF (PROCESS(NMESHES) < N_MPI_PROCESSES-1) THEN
   WRITE(MESSAGE,'(A)') 'ERROR: Too many MPI processes have been assigned to this job'
   CALL SHUTDOWN(MESSAGE) ; RETURN
ENDIF
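
For readers unfamiliar with the snippet, the comparison can be restated as a small shell sketch. It only illustrates the condition and is not FDS code. `PROCESS(NMESHES)` is read here as the MPI process number assigned to the last mesh in the input file, and `N_MPI_PROCESSES` as the number of processes that were started. The function name and the sample values are hypothetical.

```shell
# Sketch only: mirrors the quoted Fortran condition, it is not FDS code.
# $1 = MPI_PROCESS of the last mesh, i.e. PROCESS(NMESHES)
# $2 = number of MPI processes that were started, i.e. N_MPI_PROCESSES
too_many_mpi_processes() {
    [ "$1" -lt $(( $2 - 1 )) ]
}

# Hypothetical values: the last mesh is assigned to process 3
# (process numbering starts at 0, so 4 processes are in use).
if too_many_mpi_processes 3 8; then
    echo "8 processes: the check fires"
fi
if ! too_many_mpi_processes 3 4; then
    echo "4 processes: the check passes"
fi
```

Read this way, the check fires whenever more processes are started than the last mesh's process number plus one.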
lizijiang
Posts: 5
Joined: 2021-03-30 21:03
System: Linux

Re: Error when too many MPI processes are assigned to a job

#4

Post by lizijiang » 2021-04-06 15:36

oneleaf wrote: 2021-04-03 0:06

Code:

IF (PROCESS(NMESHES) < N_MPI_PROCESSES-1) THEN
   WRITE(MESSAGE,'(A)') 'ERROR: Too many MPI processes have been assigned to this job'
   CALL SHUTDOWN(MESSAGE) ; RETURN
ENDIF
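Hello, and thank you very much for your help. I do not know which manual this code comes from and would very much like to know its source. My reading of it was that the number of MPI processes has to be larger than my number of meshes, so I changed my shell file accordingly, but the job still does not run. I would appreciate a more detailed explanation.
The MPI_PROCESS assignments in my MESH file are as follows:

Code: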

&MESH ID='domain1', IJK=110,225,225, XB=60.0,82.0,15.0,60.0,0.0,45.0, MPI_PROCESS=0/
&MESH ID='domain2', IJK=110,225,225, XB=60.0,82.0,15.0,60.0,45.0,90.0, MPI_PROCESS=0/
&MESH ID='environment01', IJK=60,75,120, XB=0.0,60.0,0.0,75.0,0.0,120.0, MPI_PROCESS=0/
&MESH ID='domain3', IJK=110,225,225, XB=82.0,104.0,15.0,60.0,0.0,45.0, MPI_PROCESS=1/
&MESH ID='domain4', IJK=110,225,225, XB=82.0,104.0,15.0,60.0,45.0,90.0, MPI_PROCESS=1/
&MESH ID='environment02', IJK=135,15,120, XB=60.0,195.0,0.0,15.0,0.0,120.0, MPI_PROCESS=1/
&MESH ID='domain5', IJK=110,225,225, XB=104.0,126.0,15.0,60.0,0.0,45.0, MPI_PROCESS=2/
&MESH ID='domain6', IJK=110,225,225, XB=104.0,126.0,15.0,60.0,45.0,90.0, MPI_PROCESS=2/
&MESH ID='environment03', IJK=135,15,120, XB=60.0,195.0,60.0,75.0,0.0,120.0, MPI_PROCESS=2/
&MESH ID='domain7', IJK=110,225,225, XB=126.0,148.0,15.0,60.0,0.0,45.0, MPI_PROCESS=3/
&MESH ID='domain8', IJK=110,225,225, XB=126.0,148.0,15.0,60.0,45.0,90.0, MPI_PROCESS=3/
&MESH ID='environment04', IJK=135,45,30, XB=60.0,195.0,15.0,60.0,90.0,120.0, MPI_PROCESS=3/
&MESH ID='domain9', IJK=110,225,225, XB=148.0,170.0,15.0,60.0,0.0,45.0, MPI_PROCESS=4/
&MESH ID='domain10', IJK=110,225,225, XB=148.0,170.0,15.0,60.0,45.0,90.0, MPI_PROCESS=4/
&MESH ID='domain11', IJK=125,225,225, XB=170.0,195.0,15.0,60.0,0.0,45.0, MPI_PROCESS=5/
&MESH ID='domain12', IJK=125,225,225, XB=170.0,195.0,15.0,60.0,45.0,90.0, MPI_PROCESS=5/
There are 16 meshes in total, assigned to 6 MPI processes (MPI_PROCESS 0 to 5).
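
These two counts can be read off the input file instead of being tallied by hand. A minimal sketch, assuming every `&MESH` line is written in upper case on a single line with an explicit `MPI_PROCESS`, as in the listing above. `count_meshes` and `required_mpi_processes` are hypothetical helper names, not part of FDS.

```shell
# Sketch only, under the assumptions stated above.
# $1 = path to an FDS input file

# Number of &MESH lines in the file.
count_meshes() {
    grep -c '^[[:space:]]*&MESH' "$1"
}

# Highest MPI_PROCESS value plus one, because numbering starts at 0.
required_mpi_processes() {
    grep '^[[:space:]]*&MESH' "$1" |
        sed -n 's/.*MPI_PROCESS[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' |
        sort -n | tail -n 1 | awk '{ print $1 + 1 }'
}

# Possible use inside the job script (file name taken from the first post):
#   NP=$(required_mpi_processes detailed_12.fds)
#   mpiexec -n "$NP" fds detailed_12.fds
```

For the 16 lines posted above, this would report 16 meshes and 6 processes.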
Then, following your reply, I first changed my shell file to

Code:

#PBS -l nodes=1:ppn=16 
but still got the same error as before:

Code:

ERROR: Too many MPI processes have been assigned to this job 
I then changed the shell file again to

Code:

#PBS -l nodes=1:ppn=18 
and got the following error instead:

Code:

ERROR: The number of MPI processes, 18, exceeds the number of meshes, 16 
I hope you and others can help further. Thank you very much!
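
The two messages can be lined up against the numbers in this post with a short shell sketch. It is an illustration under assumptions, not FDS code. The second condition is the one quoted from the FDS source earlier in the thread. The first condition, and its position ahead of the other, are inferred only from the wording of the "exceeds the number of meshes" message and from the fact that it appeared for 18 processes. `diagnose_mpi_count` is a hypothetical name.

```shell
# Sketch only: not FDS code.
# $1 = number of meshes (NMESHES)
# $2 = MPI_PROCESS of the last mesh, i.e. PROCESS(NMESHES)
# $3 = number of MPI processes that were started (N_MPI_PROCESSES)
diagnose_mpi_count() {
    if [ "$3" -gt "$1" ]; then
        # Assumed condition, inferred from the wording of the message.
        echo "exceeds the number of meshes"
    elif [ "$2" -lt $(( $3 - 1 )) ]; then
        # Condition quoted from the FDS source earlier in the thread.
        echo "too many MPI processes"
    else
        echo "passes both checks"
    fi
}

# Numbers from the listing above: 16 meshes, last mesh on MPI_PROCESS=5.
for n in 6 16 18; do
    echo "$n processes: $(diagnose_mpi_count 16 5 "$n")"
done
```

Under these assumptions, 16 and 18 processes reproduce the two messages reported here, while 6 processes satisfy both conditions. The sketch does not explain why the first run reported the error, so it may be worth confirming how many processes that run actually started.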
oneleaf
Forum administrator
Posts: 10441
Joined: 2005-03-27 0:06
System: Ubuntu 12.04

Re: Error when too many MPI processes are assigned to a job

#5

Post by oneleaf » 2021-04-07 11:04

The code comes from the FDS project. I suggest contacting the staff at your supercomputing center directly to resolve this.