1 year ago
#296068
Derek1st
Carriage returns in file when passed as argument to program
I've got a Python program that acts as a thin wrapper around an existing command-line tool. If you've ever heard of Globus, it's a file service. To massively oversimplify things, files can be located on various endpoints. To "download" files to a local computer, you create your own endpoint and transfer files to it. Globus has a pretty intuitive command-line tool, and it already allows batch downloading. You either pass the name of a file as an argument (each line in the file contains two arguments: the path to the resource on the source endpoint and the location where you want it placed on the local endpoint), or, using an additional optional argument, you pass those lines to the CLI one at a time until EOF is reached.
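To make the batch file format concrete, here is a minimal sketch of how such lines could be assembled, assuming only what is described above: two quoted arguments per line, source path first. The names `format_batch_line` and `build_batch_text` are made up for illustration and are not part of the Globus CLI.

```python
def format_batch_line(source_path: str, destination_path: str) -> str:
    """Return one batch line: quoted source path, a space, quoted destination path."""
    return f'"{source_path}" "{destination_path}"'


def build_batch_text(pairs) -> str:
    """Join (source, destination) pairs into batch text, one line per pair."""
    # Each line ends with a bare "\n"; no carriage return is ever added here.
    return "".join(format_batch_line(src, dst) + "\n" for src, dst in pairs)


batch_text = build_batch_text([
    ("/data/a.txt", "~/downloads/a.txt"),
    ("/data/b.txt", "~/downloads/b.txt"),
])
print(batch_text, end="")
```

Keeping the line terminator in one place makes it easier to see where any stray carriage return would have to come from.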
For my work, we accept files similar to the ones used by Globus. I perform a number of modifications, create my own temporary file, and then give THAT temporary file to globus using subprocess.Popen.
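The general pattern (write a temp file, then hand its path to a child process) might look like the sketch below. It is not the real wrapper: a small Python child process stands in for globus and simply reports the raw bytes it reads, and the name `run_with_temp_file` is hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def run_with_temp_file(lines):
    """Write lines to a temp file, close it, then pass its path to a child process."""
    # Binary mode: bytes are written exactly as given, with no newline translation.
    with tempfile.NamedTemporaryFile(mode="wb", delete=False) as temp:
        for line in lines:
            temp.write(line.encode("utf-8") + b"\n")
        path = temp.name
    try:
        # Stand-in for the real tool: print the repr of the raw bytes in the file.
        child_code = "import sys; print(repr(open(sys.argv[1], 'rb').read()))"
        result = subprocess.run(
            [sys.executable, "-c", child_code, path],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
    finally:
        os.unlink(path)


seen_by_child = run_with_temp_file(['"/src/a.txt" "~/dest/a.txt"'])
print(seen_by_child)
```

Because the file is closed before the child starts, the child reads exactly what was flushed to disk.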
The problem is that while everything works fine on Linux (and, actually, 90% of the time everything works perfectly on Windows too), in some specific cases on Windows I get an error.
My transfer will start, but when I check on its status (which is done through the Globus website), I see that there was an error. The error Globus reports looks to be about a carriage return. This was strange, given that I thought I had filtered out any carriage returns when modifying our users' data. But after a thorough double check, I saw that when I wrote each line to the file, Windows-style line endings were being added.
No problem: I added about five lines of code that reopen the temp file in binary mode, read its contents, and swap any \r's for \n's. I checked again and, awesome, there are no carriage returns.
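For reference, the line-ending translation comes from Python's text mode: with the default `newline=None`, every "\n" written is converted to `os.linesep`, which is "\r\n" on Windows. The sketch below reproduces both behaviours on any platform by setting `newline` explicitly. The helper names `write_text` and `read_bytes` are made up for the example.

```python
import os
import tempfile


def write_text(path, text, newline):
    # In text mode, `newline` controls how "\n" is translated when writing.
    with open(path, "w", newline=newline) as handle:
        handle.write(text)


def read_bytes(path):
    with open(path, "rb") as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "batch.txt")
    # newline="\r\n" mimics what default text mode does on Windows.
    write_text(path, "a b\n", newline="\r\n")
    crlf_bytes = read_bytes(path)
    # newline="\n" writes "\n" untranslated on every platform.
    write_text(path, "a b\n", newline="\n")
    lf_bytes = read_bytes(path)

print(crlf_bytes)
print(lf_bytes)
```

Opening the file with `newline="\n"` (or in binary mode) avoids the translation at write time, so no clean-up pass is needed afterwards.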
However, when I call globus again, lo and behold, I'm still getting the carriage return error. I am at my wits' end.
I've tried switching from the Python tempfile module to creating and deleting files manually, in case there was anything hinky in that module. I've printed out the contents of the temp file for the case that is giving me issues, taken that exact file, and run the globus command manually, and it works! So the file that's being generated is correct. Somehow, when the file is passed as an argument to globus, something is going on with the line endings.
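Printing a file's contents as text can hide a stray carriage return, so a byte-level check is more reliable. A minimal sketch, assuming the file is read in binary mode first; `lines_with_carriage_return` is a hypothetical helper and the sample data is invented:

```python
def lines_with_carriage_return(data: bytes) -> list[int]:
    """Return the 1-based numbers of lines that contain a carriage return byte."""
    return [
        number
        for number, line in enumerate(data.split(b"\n"), start=1)
        if b"\r" in line
    ]


# Invented sample: the second line hides a carriage return inside a quoted path.
sample = b'"/src/a.txt" "~/dest/a.txt"\n"/src/b.txt\r" "~/dest/b.txt"\n'

print(lines_with_carriage_return(sample))
for raw_line in sample.split(b"\n"):
    # repr() shows escape sequences such as \r that a plain print would not.
    print(repr(raw_line))
```

The same check can be run on the individual fields before they are formatted into a line, which would show whether a carriage return arrives with the input data rather than from the file write.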
I cannot figure out how it can work when my script generates a file and I then call globus with it manually, yet fail with carriage return errors when my script calls globus with the exact same file. I looked at the globus documentation to see if there are any arguments for this. Nada.
I'm going to share with you the globus command in full, followed by a deidentified version of the relevant portion of my code.
I really hope you can help. I've been struggling with this for like 72 hours now.
Globus command:
globus transfer $sourceEndpointID $destinationEndpointID --batch filename.txt
and then the code is:
    import os
    import subprocess
    import tempfile

    def my_function(mylist, my_id, local_id, args):
        # Binary mode so Python does not translate "\n" into "\r\n" on Windows.
        temp = tempfile.NamedTemporaryFile(mode='w+b', delete=False)
        for each in mylist:
            is_directory = False
            destination = args.destination.replace("\\", "/")
            if each["smallpath"] == "/":
                full_path = each["smallpath"] + "/"
            else:
                full_path = each["rel_path"].replace("\\", "/") + "/" + each["specific_path"].lstrip("/")
            # A path with an empty basename ends in "/", so treat it as a directory.
            if os.path.basename(full_path) == "":
                is_directory = True
            if is_directory is False:
                line = f'"{full_path}" "~/{destination}/{each["orgID"]}-{each["userID"]}/{os.path.basename(full_path)}" \n'
            else:
                if each["smallpath"] != "/":
                    slash_index = full_path.rstrip('/').rfind("/")
                    local_dir = full_path[slash_index:].rstrip().rstrip('/')
                else:
                    local_dir = "/"
                line = f'"{full_path}" "~/{destination}/{each["orgID"]}-{each["userID"]}/{local_dir.lstrip("/")}" --recursive \n'
            line = line.replace("\\", "/")
            # The file is opened in binary mode, so encode the line before writing.
            temp.write(line.encode('utf-8'))
        temp.seek(0)
        windows_ending = b'\r'
        unix_ending = b'\n'
        double_newline = b'\n\n'
        # Reopen the file, replace carriage returns, and collapse doubled newlines.
        with open(temp.name, 'rb') as file:
            file_contents = file.read()
        temp.seek(0)
        file_contents = file_contents.replace(windows_ending, unix_ending)
        file_contents = file_contents.replace(double_newline, unix_ending)
        with open(temp.name, 'wb') as file:
            file.write(file_contents)
        myprocess = subprocess.Popen(["programname", "commandname", my_id, local_id, "--batch",
                                      temp.name], stdout=subprocess.PIPE)
        mycommand = myprocess.communicate()[0].decode('utf-8')
        temp.close()
        os.unlink(temp.name)
python
arguments
temporary-files
carriage-return