1 year ago

#183813

zuko7

Batch and Bash codes while submitting jobs

I used to submit my R jobs one after another under PBS/Torque in the following way. This is my R script, named simsim.R:

#########
set<-1
#########
# Read i
#########
# the following lines read the argument passed in from the bash script
arg <- commandArgs()
arg
itration<- as.numeric(arg)[3]
itration

setwd("/home/habijabi")
save(arg,itration,
     file = paste0('simsim_RESULT_',set,itration,'.RData'))

I then submit the jobs with the following script:

#!/bin/bash
Chains=10
for cha in `seq 1 $Chains`
do
    echo "Chains:  " $cha
    sleep 1
    qsub -q long -l nodes=1:ppn=12,walltime=24:00:00 -v c=$cha ./diffv1.sh

done

In 'diffv1.sh' I load the R module and pass the variable 'c' on to R:

#!/bin/bash 

## input values
c=$c

#configure software
module load R/4.1.2
#changed
cd /home/habijabi
R --no-save  < simsim.R $c
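
For illustration, here is a minimal sketch of why the chain number ends up as element 3 of commandArgs() in simsim.R. It assumes R reports its own executable as the first element; show_args is a hypothetical stand-in for R, not part of R itself.

```shell
#!/bin/bash
# Hypothetical stand-in for the R executable: prints the argument vector
# numbered the way commandArgs() would report it, with the program
# itself as element 1 (an assumption about R's behaviour).
show_args() {
    local i=1 a
    printf '[%d] %s\n' "$i" "R"
    for a in "$@"; do
        i=$((i + 1))
        printf '[%d] %s\n' "$i" "$a"
    done
}

c=7
# The shell consumes "< /dev/null" itself (standing in for "< simsim.R"),
# so only --no-save and the value of c reach the program.
show_args --no-save < /dev/null "$c"
```

With c=7 this prints [1] R, [2] --no-save and [3] 7, which matches the index 3 used in simsim.R.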

This is how I passed the value of '$c' to my R code, and it produced 10 .RData files with the corresponding names. But then I had to switch to SLURM. The following is the batch script that I am using now.

#!/bin/bash 
#SBATCH --job-name=R-test

#IO files
#SBATCH --error=R-test.%J.err
#SBATCH --output=R-test.%J.out

#!/bin/bash
module load R/4.1.2
set -e -x

mkdir -p jobs

cd /home/habijabi


for cha in {1..10}
do
    sbatch --time=24:00:00 \
           --ntasks-per-node=12  \
           --nodes=1 \
           -p compute \
           -o jobs/${cha}_srun.txt \
           --wrap="R --no-save < /home/habijabi/simsim.R ${cha}"
done

But with this script, only one or two of the jobs actually run, and I do not understand why, after submitting 150 jobs, it does not run all of them. The trace output of the script shows the following:

+ mkdir -p jobs
+ cd /home/habijabi
+ for cha in '{1..10}'
+ sbatch --time=24:00:00 --ntasks-per-node=12 --nodes=1 -p compute -o jobs/1_srun.txt '--wrap=R --no-save < /home/habijabi/simsim.R 1'
+ for cha in '{1..10}'
+ sbatch --time=24:00:00 --ntasks-per-node=12 --nodes=1 -p compute -o jobs/2_srun.txt '--wrap=R --no-save < /home/habijabi/simsim.R 2'
+ for cha in '{1..10}'
+ sbatch --time=24:00:00 --ntasks-per-node=12 --nodes=1 -p compute -o jobs/3_srun.txt '--wrap=R --no-save < /home/habijabi/simsim.R 3'
...so on...

The .out file shows the following:

Submitted batch job 146299
Submitted batch job 146300
Submitted batch job 146301
Submitted batch job 146302
Submitted batch job 146303
......
......

Both of these look fine. However, only a few of the jobs run, and the majority of them fail with the following error:

/opt/ohpc/pub/libs/gnu8/R/4.1.2/lib64/R/bin/exec/R: error while loading shared libraries: libpcre2-8.so.0: cannot open shared object file: No such file or directory
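
One way to narrow this error down, assuming it depends on which compute node a job lands on (this is not confirmed here), would be to list the shared libraries that the R binary cannot resolve on a given node. The sketch below wraps ldd; filter_missing and check_binary are hypothetical helper names, and the default path is copied from the error message.

```shell
#!/bin/bash
# Print the names of libraries that ldd-style output on stdin
# marks as "not found" (first field of each matching line).
filter_missing() {
    awk '/not found/ { print $1 }'
}

# Report unresolved shared libraries for one binary on the current host.
check_binary() {
    local bin="$1"
    ldd "$bin" 2>/dev/null | filter_missing
}

# Path taken from the error message; override with R_BIN if it differs.
R_BIN="${R_BIN:-/opt/ohpc/pub/libs/gnu8/R/4.1.2/lib64/R/bin/exec/R}"

# Only run the check where the binary actually exists.
if [ -x "$R_BIN" ]; then
    echo "host: $(uname -n)"
    check_binary "$R_BIN"
fi
```

Running this at the start of each job would show whether the failures cluster on particular nodes, since the host name is printed next to any missing library.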

I do not understand what I have done wrong, and the failing jobs do not produce any output. I am new to this type of coding, so any help is appreciated.
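
One difference between the two setups above is where 'module load R/4.1.2' runs: under PBS/Torque it ran inside diffv1.sh, on the compute node, while the SLURM script loads it once in the submitting script. The sketch below mirrors the PBS structure under SLURM. It is an assumption that this matters for the error, not a confirmed fix. The file name simsim_job.sh and the JOB_SCRIPT variable are hypothetical.

```shell
#!/bin/bash
# Write a per-chain job script (hypothetical file name) that mirrors
# diffv1.sh: the module is loaded inside the job, on the compute node.
write_job_script() {
    cat > "$1" <<'EOF'
#!/bin/bash
#SBATCH --job-name=R-test
#SBATCH --time=24:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=12
#SBATCH --partition=compute

module load R/4.1.2
cd /home/habijabi
# "$1" is the chain number given after the script name on the sbatch line.
R --no-save < simsim.R "$1"
EOF
}

job_script="${JOB_SCRIPT:-simsim_job.sh}"
write_job_script "$job_script"

# Print the submission commands; drop the leading "echo" to actually submit.
for cha in {1..10}; do
    echo sbatch -o "jobs/${cha}_srun.txt" "$job_script" "$cha"
done
```

As in the original script, the jobs directory has to exist in the directory where the -o path is resolved, because the output file is written there.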

r

bash

batch-processing

slurm

torque

0 Answers
