1 year ago

#303350


danlooo

R targets: Create two targets for each row in a tibble to track meta data and the resulting file

I want to use the R package targets to call shell commands and read their results, such as the exit code, into a new target. The commands are organized as a tibble with metadata attached to them. Currently, I have a workflow that runs the command in each row.

However, if I delete one of the output files, the corresponding sub-target is not recreated but simply skipped. Recreating it is required, because the output file content is intentionally not stored in the target's R object. To solve this, one needs to insert a new target with tar_target(format = "file") for each row of the commands table. I cannot simply let the function call_shell return the filename as a character vector, because I need to join metadata downstream, e.g. with calls %>% left_join(commands).

I've read the targetopia contributing page. Unfortunately, I was unable to extend the target factory example there to work with dynamic branching.

This is my _targets.R file:

library(tidyverse)
library(targets)

#' Execute a shell command and save the output to a file
call_shell <- function(command = "echo hi", file = "out.txt") {
  exit_code <-
    command %>%
    paste0(" > ", file) %>%
    system()

  list(
    exitcode = exit_code,
    size = file.info(file)$size
  )
}

list(
  tar_target(
    commands,
    {
      tribble(
        ~id, ~command, ~long_command,
        1, "echo foo", FALSE,
        2, "echo foo bar baz", TRUE
      )
    }
  ),
  tar_target(
    calls,
    command = {
      commands %>%
        mutate(call = command %>% map(~ call_shell(
          command = .x, file = paste0(tar_name(), ".txt")
        )))
    },
    pattern = map(commands)
  )
)

How can I write a function tar_target_shell that creates two targets: one for the exit codes and one to track potentially missing output files?
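To make the goal concrete, here is a minimal sketch of what such a factory could look like. It is not a tested solution. It assumes a POSIX shell and that a format = "file" target may return several paths per branch. The helpers call_shell_files and read_shell_result, the "_files" suffix, and the ".exitcode" sidecar file are hypothetical names introduced only for illustration; tar_target_raw() and tar_name() are regular targets functions.

```r
library(targets)

# Hypothetical helper: run the command, write stdout to `file`, and store the
# exit code in a sidecar file so that a format = "file" target can track both.
call_shell_files <- function(command, file) {
  exit_code <- system(paste(command, ">", shQuote(file)))
  exit_file <- paste0(file, ".exitcode")
  writeLines(as.character(exit_code), exit_file)
  c(file, exit_file) # a file target must return paths as a character vector
}

# Hypothetical helper: attach file path, exit code and size to the metadata row.
read_shell_result <- function(commands, files) {
  commands$file <- files[1]
  commands$exitcode <- as.integer(readLines(files[2]))
  commands$size <- file.size(files[1])
  commands
}

# Sketch of a target factory returning two dynamically branched targets:
# <name>_files tracks the output files, <name> holds metadata and exit codes.
tar_target_shell <- function(name, commands) {
  name <- deparse(substitute(name))
  name_files <- paste0(name, "_files") # assumed naming convention
  syms <- list(
    cmds = as.symbol(deparse(substitute(commands))),
    files = as.symbol(name_files)
  )
  list(
    tar_target_raw(
      name_files,
      command = substitute(
        call_shell_files(cmds$command, paste0(tar_name(), ".txt")),
        env = syms
      ),
      pattern = substitute(map(cmds), env = syms),
      format = "file"
    ),
    tar_target_raw(
      name,
      command = substitute(read_shell_result(cmds, files), env = syms),
      pattern = substitute(map(cmds, files), env = syms)
    )
  )
}
```

In _targets.R, the second tar_target() above would then become tar_target_shell(calls, commands). The idea is that deleting an output file invalidates only the matching branch of calls_files, while calls still carries the metadata columns needed for a downstream left_join.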

r

targets-r-package

0 Answers
