1 year ago

#146158


David Alexander

Azure ML Studio saved plots have boxes instead of characters

Context & Update: I've rewritten this post to reflect the latest findings (the problem is as yet unresolved).

I am using Azure Machine Learning Studio to run an "Execute R Script" task on a compute cluster, which shuts down after the pipeline runs (so there is no console access, but I can save files as logs to view after shutdown). The environment is Ubuntu and the R version it loads is 3.5.1 (see the session info below). I create plots with png, insert them into an Excel workbook with openxlsx, and finally save and retrieve the workbook in Azure file storage. The file management is done with the azuremlsdk library.

The issue I am having is that the characters on the plots are rendered as "squares" or "boxes", which I have also seen referred to as "glyphs".

[Image: clip of the png imported into Excel]

I've provided a code sample (although not completely reproducible, as it requires Azure) to show the simple test I am running:

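library(ggplot2)   # only needed here for the diamonds dataset
library(openxlsx)

imageName <- "testplot.png"
xlName <- "testplot.xlsx"   # placeholder: the real value is set elsewhere in the pipeline

# Draw a base-graphics scatter plot to a PNG file
png(imageName)
plot(price ~ carat, data = diamonds, main = "Price vs Carat")
dev.off()

# Insert the PNG into a new workbook and save it
wb <- createWorkbook()
addWorksheet(wb, "testplotsheet", gridLines = TRUE)
insertImage(wb, "testplotsheet", imageName)
saveWorkbook(wb, file = xlName, overwrite = TRUE)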

sessionInfo() and capabilities() output:

R version 3.5.1 (2018-07-02)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Ubuntu 18.04.6 LTS

Matrix products: default
BLAS: /azureml-envs/azureml_6ff64eff0a652bbe0bb1d84fc0884554/lib/R/lib/libRblas.so
LAPACK: /azureml-envs/azureml_6ff64eff0a652bbe0bb1d84fc0884554/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] tibble_3.0.1      ggplot2_3.3.0     azuremlsdk_1.10.0 openxlsx_4.2.5   
[5] dplyr_0.8.5       jsonlite_1.6.1    reticulate_1.12  

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.8       magrittr_1.5     tidyselect_1.0.0 munsell_0.5.0   
 [5] colorspace_1.4-1 lattice_0.20-41  R6_2.4.1         rlang_1.0.1     
 [9] tools_3.5.1      grid_3.5.1       gtable_0.3.0     withr_2.2.0     
[13] ellipsis_0.3.0   assertthat_0.2.1 lifecycle_0.2.0  crayon_1.3.4    
[17] Matrix_1.2-18    zip_2.2.0        purrr_0.3.4      vctrs_0.2.4     
[21] glue_1.4.0       stringi_1.4.3    compiler_3.5.1   pillar_1.4.3    
[25] scales_1.1.0     pkgconfig_2.0.3 

          name value
1         jpeg  TRUE
2          png  TRUE
3         tiff  TRUE
4        tcltk  TRUE
5          X11 FALSE
6         aqua FALSE
7     http/ftp  TRUE
8      sockets  TRUE
9       libxml  TRUE
10        fifo  TRUE
11      cledit FALSE
12       iconv  TRUE
13         NLS  TRUE
14     profmem  TRUE
15       cairo  TRUE
16         ICU  TRUE
17 long.double  TRUE
18     libcurl  TRUE

From another post on Stack Overflow: "Boxes show up when there is a mismatch between Unicode characters in the document and those supported by the font. Specifically, the boxes represent characters not supported by the selected font."

Results from the various things I have tried:

Cairo: can't install the package because "cairo.h" is missing.

par(family = "Ubuntu Mono") (used after running png, as suggested) didn't change anything.

Sys.setlocale("LC_ALL", 'en_US.UTF-8') (suggested on an MS forum) didn't change the LC settings. The locale also does not seem to be supported, based on checking the available locales on the system with system("locale -a", intern = TRUE):

[1] "C"       "C.UTF-8" "POSIX"  

I've seen a suggestion to use iconv, but I don't know how to incorporate it into the code above. How could I use it, or is it irrelevant?
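To frame the iconv question, here is a minimal sketch of what it would do with the text in the test plot. iconv() converts strings between encodings; the helper name has_non_ascii is made up for illustration. Since the title and axis labels are plain ASCII, the conversion leaves them unchanged, which suggests (but does not prove) that encoding is not the cause here.

```r
# Hypothetical helper: TRUE if a single string contains any non-ASCII character.
has_non_ascii <- function(x) {
  any(utf8ToInt(enc2utf8(x)) > 127L)
}

# Text drawn by the test plot: the title plus the default axis labels.
labels <- c("Price vs Carat", "carat", "price")
non_ascii <- vapply(labels, has_non_ascii, logical(1))

# If a label did contain non-ASCII text, iconv() could substitute it before
# plotting; for pure ASCII input the strings come back unchanged.
ascii_labels <- iconv(labels, from = "UTF-8", to = "ASCII", sub = "?")
```

If non_ascii is FALSE for every label, passing the labels through iconv before plotting would not change what the png device is asked to draw.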

This seemed significant:

sysfonts is not installed, and I can't install it because zlib is missing. This also prevents me from installing showtext.

  • Do png and other graphics devices rely on sysfonts to create characters in images?
  • Is this why the family and type global settings or parameters in plot, png, etc. do nothing?

I wondered whether any fonts are available at all: font.families, font.files, font_families and font_files all return blank results or errors.

I did manage to locate some *.ttf files on the system (not in any recognised fonts folder location), so I thought perhaps I could load them or force R to use them for the png device.
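Because there is no console access, a font inventory has to be written to a file. The following is a rough sketch of how that could be done, assuming the directories listed are worth searching (they are guesses, not the locations where the *.ttf files were actually found) and that fc-list from fontconfig may or may not be present. The log path is a placeholder.

```r
# Candidate font directories: assumptions, adjust to the real locations.
font_dirs <- c("/usr/share/fonts", "/usr/local/share/fonts",
               path.expand("~/.fonts"))
font_dirs <- font_dirs[dir.exists(font_dirs)]

# Collect any TrueType/OpenType files under those directories.
ttf_files <- list.files(font_dirs, pattern = "\\.(ttf|otf)$",
                        recursive = TRUE, full.names = TRUE,
                        ignore.case = TRUE)

# fc-list is part of fontconfig; it may not be installed on the image.
fc_fonts <- if (nzchar(Sys.which("fc-list"))) {
  system2("fc-list", stdout = TRUE)
} else {
  "fc-list not found on PATH"
}

# Write everything to a log file that can be retrieved after shutdown.
log_file <- file.path(tempdir(), "font_check.log")  # placeholder location
writeLines(c("== font files ==", ttf_files,
             "== fontconfig ==", fc_fonts), log_file)
```

An empty fontconfig section alongside a non-empty file list would indicate that font files exist but are not registered where font lookup expects them.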

extrafont: I tried to use this to add fonts. However, because the font_import function requires responding to a y/n prompt, I am not sure whether it loads anything. I saved the output of fonts(), but it was blank, so probably not.
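On the y/n prompt: font_import() has a prompt argument, so the import can be run without interaction. Below is a minimal sketch, assuming extrafont is installed; the wrapper name import_fonts_noninteractive and the directory path are placeholders. Note that extrafont registers fonts for the pdf and postscript devices (and bitmap output on Windows), so even a successful import may not change what png draws on Linux.

```r
# Hypothetical wrapper: import fonts from one directory without the y/n prompt.
# Returns the font families extrafont knows about afterwards, or character(0)
# if the directory does not exist or extrafont is not installed.
import_fonts_noninteractive <- function(ttf_dir) {
  if (!dir.exists(ttf_dir) || !requireNamespace("extrafont", quietly = TRUE)) {
    return(character(0))
  }
  extrafont::font_import(paths = ttf_dir, prompt = FALSE)
  extrafont::fonts()
}

# Placeholder path: replace with the directory holding the *.ttf files.
imported <- import_fonts_noninteractive("/path/to/ttf/files")
```

Writing imported to a log file would show whether the import registered anything.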

  • Is there any other way to force png to use one of these ttf font files?
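One check that needs no extra packages: capabilities() reports cairo as TRUE, so the built-in png device can be asked explicitly for each cairo-based backend through its type argument, and the current default can be read from getOption("bitmapType"). This is a sketch only; the helper name render_test and the output location are made up, and whether either backend renders text on this image is untested.

```r
# Hypothetical helper: try to render a small test plot with a given png backend.
# Returns TRUE if the plot was drawn without error, FALSE otherwise.
render_test <- function(type, file) {
  if (!isTRUE(capabilities("cairo")[[1]])) {
    return(FALSE)
  }
  opened <- FALSE
  tryCatch({
    png(file, type = type)
    opened <- TRUE
    plot(1:10, main = "Price vs Carat")
    TRUE
  },
  error = function(e) FALSE,
  finally = if (opened) dev.off())
}

out_dir <- tempdir()                      # placeholder output location
default_backend <- getOption("bitmapType")  # backend png() uses when type is not given

types <- c("cairo", "cairo-png")
results <- vapply(types, function(tp) {
  render_test(tp, file.path(out_dir, paste0("testplot_", tp, ".png")))
}, logical(1))
```

Comparing the resulting files, together with the logged value of default_backend, would show whether the boxes are specific to one backend or common to all of them.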

pdf: this actually works! I can get characters to display when saving the same output as a pdf. Does the pdf device use its own fonts? This seems to imply that it should be possible to get characters to display in a png.

At this point I am seeing whether our engineers can access a different compute, because all these problems seem to stem from an outdated or inadequate R and Ubuntu environment. However, if there are any other ideas for getting these characters to display, I welcome suggestions!

r

plot

utf-8

azure-machine-learning-service

azuremlsdk

0 Answers
