2 years ago
#375063
Alison810
Boolean Expression Error - If Statement - The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
I have a DataFrame with 'City', 'State' and 'Country' columns. My goal is to clean up the state and country columns so that every state is shown by its proper abbreviation (e.g. 'New York' becomes 'NY'). Right now both full names and abbreviations appear, which makes the data inconsistent.
I ultimately want to create a class that not only handles US state abbreviations but also wraps this in a series of if/else statements to clean the city, state and country columns for states/countries beyond the United States.
Error Returned: "ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."
The current DataFrame looks like:
| State | Country | 
|---|---|
| New York | USA | 
| NY | United States | 
and I want it to look like:
| State | Country | 
|---|---|
| NY | USA | 
| NY | USA | 
I created a dictionary of all state/abbreviation combos and stored it in a variable us_state_abbrev.
us_state_abbrev = {
'Alabama': 'AL', 'Alaska': 'AK', 'Arizona': 'AZ', 'Arkansas': 'AR', 'California': 'CA', 'Colorado': 'CO',
'Connecticut': 'CT', 'Delaware': 'DE', 'Florida': 'FL', 'Georgia': 'GA', 'Hawaii': 'HI', 'Idaho': 'ID',
'Illinois': 'IL', 'Indiana': 'IN', 'Iowa': 'IA', 'Kansas': 'KS', 'Kentucky': 'KY', 'Louisiana': 'LA',
'Maine': 'ME', 'Maryland': 'MD', 'Massachusetts': 'MA', 'Michigan': 'MI', 'Minnesota': 'MN', 'Mississippi': 'MS',
'Missouri': 'MO', 'Montana': 'MT', 'Nebraska': 'NE', 'Nevada': 'NV', 'New Hampshire': 'NH', 'New Jersey': 'NJ',
'New Mexico': 'NM', 'New York': 'NY', 'North Carolina': 'NC', 'North Dakota': 'ND', 'Ohio': 'OH', 'Oklahoma': 'OK',
'Oregon': 'OR', 'Pennsylvania': 'PA', 'Rhode Island': 'RI', 'South Carolina': 'SC', 'South Dakota': 'SD',
'Tennessee': 'TN', 'Texas': 'TX', 'Utah': 'UT', 'Vermont': 'VT', 'Virginia': 'VA', 'Washington': 'WA',
'West Virginia': 'WV', 'Wisconsin': 'WI', 'Wyoming': 'WY'}
From there, I used .map() to replace the state name with the state abbreviation:
df1['State'] = df1['State'].map(us_state_abbrev).fillna(df1['State'])
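For anyone reproducing this, here is a minimal, self-contained sketch of the same map-then-fillna pattern on the two sample rows from the tables above. The dictionary is shortened to a couple of entries, and `country_aliases` is a hypothetical lookup I made up to illustrate the Country column; it is not something I have in my real code yet.

```python
import pandas as pd

# The two sample rows from the tables above.
df1 = pd.DataFrame({
    "State": ["New York", "NY"],
    "Country": ["USA", "United States"],
})

# Shortened stand-in for the full us_state_abbrev dictionary.
us_state_abbrev = {"New York": "NY", "California": "CA"}

# Hypothetical alias table for countries (an assumption, for illustration only).
country_aliases = {"United States": "USA", "United States of America": "USA"}

# map() yields NaN for values that are not keys in the dictionary;
# fillna() then puts the original value back for those rows.
df1["State"] = df1["State"].map(us_state_abbrev).fillna(df1["State"])
df1["Country"] = df1["Country"].map(country_aliases).fillna(df1["Country"])

print(df1)
```

Values that are already abbreviated (like 'NY') are not keys in the dictionary, so they pass through unchanged via the `fillna` step.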
All of that works fine. I started with this block of code which works and returns the subsetted DataFrame:
df1[df1.State.str.len() > 2]
But if I write the code below to start building out the if statement, I receive a ValueError:
if df1[df1.State.str.len() > 2]:
   print("True")
Error Returned: "ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."
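As far as I can tell, the error comes up because `if` needs a single True/False, while `df1[...]` is a whole DataFrame with many values. Going by the suggestions in the message itself, a possible sketch is to reduce the condition to one boolean first, either with `.any()` on the boolean mask or with `.empty` on the filtered frame. The variable names below are just for illustration, and this assumes the intent is "does at least one row still have a long state name".

```python
import pandas as pd

# The two sample rows from the tables above, before any cleanup.
df1 = pd.DataFrame({
    "State": ["New York", "NY"],
    "Country": ["USA", "United States"],
})

# Boolean Series: True where the state value is longer than two characters.
too_long = df1["State"].str.len() > 2

# Option 1: reduce the mask to a single boolean with .any().
has_long_names = bool(too_long.any())

# Option 2: filter first, then check whether any rows are left.
subset = df1[too_long]
subset_not_empty = not subset.empty

if has_long_names:
    print("True")
```

If the check should instead hold for every row, `.all()` on the mask would be the counterpart to `.any()`.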
Any help would be appreciated.
dataframe
etl
valueerror
boolean-expression
0 Answers