1 year ago
#367633
HulaLula
error: 'ascii' codec can't decode byte 0xd8 (...): ordinal not in range(...) What am I doing wrong?
What is wrong with this Python code?
with open('poetry.txt', 'w', encoding="utf-8") as outfile:
    for fname in poem_txt_list:
        with open(fname) as infile:
            outfile.write(infile.read())
It gives me this error:
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-51-dc1d05741086> in <module>
2 for fname in poem_txt_list:
3 with open(fname) as infile:
----> 4 outfile.write(infile.read())
/opt/anaconda3/envs/MLinPractice/lib/python3.6/encodings/ascii.py in decode(self, input, final)
24 class IncrementalDecoder(codecs.IncrementalDecoder):
25 def decode(self, input, final=False):
---> 26 return codecs.ascii_decode(input, self.errors)[0]
27
28 class StreamWriter(Codec,codecs.StreamWriter):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd8 in position 0: ordinal not in range(128)
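A note on reading this traceback: the failure is raised by infile.read(), so it is the read side that decodes with ASCII, not the write side. The inner open(fname) has no encoding argument, so Python falls back to the locale's preferred encoding, which appears to be ASCII in this environment. The sketch below reproduces that situation and passes the encoding on both sides. It assumes the input files are UTF-8; the function name concat_text_files and the sample file names are made up for illustration.

```python
import os
import tempfile


def concat_text_files(paths, out_path, encoding="utf-8"):
    """Concatenate text files, using an explicit codec for reading and writing."""
    with open(out_path, "w", encoding=encoding) as outfile:
        for fname in paths:
            # Pass encoding on the read side too; without it, open() uses the
            # locale's preferred encoding, which may be ASCII.
            with open(fname, encoding=encoding) as infile:
                outfile.write(infile.read())


# Small demonstration with a throwaway file (names are hypothetical).
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "poem.txt")
dst = os.path.join(tmp_dir, "raw_corpus.txt")

# An Arabic word; its first UTF-8 byte is 0xd8, as in the traceback.
sample = "\u0634\u0639\u0631\n"
with open(src, "w", encoding="utf-8") as f:
    f.write(sample)

# Decoding the same bytes as ASCII fails in the way the traceback shows.
try:
    with open(src, encoding="ascii") as f:
        f.read()
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

concat_text_files([src], dst)
with open(dst, encoding="utf-8") as f:
    result = f.read()
```

If the input files are not actually UTF-8, the read would need whatever encoding they were saved in; the byte 0xd8 alone does not prove which one that is.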
I think I am doing something fundamentally wrong. The original code from the GitHub repository, which I am trying to adapt, is the following:
# get list of text files in data
poem_txt_list = glob.glob('data/*.txt')
with open('raw_corpus.txt', 'w') as outfile:
    for fname in poem_txt_list:
        with open(fname) as infile:
            outfile.write(infile.read())
data_dir = 'raw_corpus.txt'
text = helper.load_data(data_dir)
I have a single file ("poetry.txt", UTF-8 encoded) that I want to use in place of "raw_corpus.txt". I think that is the right spot for my file, but I am not entirely sure. I also don't have a list of data files; I only have one .txt file containing all the poetry as one long text. Do I even need the first line?
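One thing worth checking in the adapted version: opening 'poetry.txt' with mode 'w' truncates it, so the existing poetry would be erased before anything is read. With a single, already combined file, the glob and concatenation steps may not be needed at all, and the path could be handed straight to the loader. The sketch below shows that idea. Since the source of helper.load_data is not shown, load_text is a hypothetical stand-in that reads the file with an explicit encoding; if the real helper opens the file without an encoding argument, the same UnicodeDecodeError could reappear there.

```python
import os
import tempfile


def load_text(path, encoding="utf-8"):
    """Hypothetical stand-in for helper.load_data: read the whole file as text."""
    with open(path, encoding=encoding) as f:
        return f.read()


# Create a throwaway UTF-8 file so the sketch runs on its own;
# in practice data_dir would simply be 'poetry.txt'.
tmp_dir = tempfile.mkdtemp()
data_dir = os.path.join(tmp_dir, "poetry.txt")
expected = "\u0634\u0639\u0631\n"  # sample non-ASCII text
with open(data_dir, "w", encoding="utf-8") as f:
    f.write(expected)

# No glob and no concatenation loop: the single file is read directly,
# in read mode, so its contents are left untouched.
text = load_text(data_dir)
```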
python
unicode
encoding
ascii
decoding
0 Answers