ValueError: Must pass DataFrame with boolean values only

1.0K    Asked by DianeCarr in Data Science , Asked on Jul 12, 2021
In this data file, the United States is broken up into four regions using the "REGION" column.

Create a query that finds the counties that belong to regions 1 or 2, whose name starts with 'Washington', and whose POPESTIMATE2015 was greater than their POPESTIMATE 2014.

This function should return a 5x2 DataFrame with the columns = ['STNAME', 'CITY NAME'] and the same index ID as the census_df (sorted ascending by index).

You'll find a description of my data in the following picture:-

enter image description here

def answer_eight():

    counties=census_df[census_df['SUMLEV']==50]

    regions = counties[(counties[counties['REGION']==1]) | (counties[counties['REGION']==2])]

    washingtons = regions[regions[regions['COUNTY']].str.startswith("Washington")]

    grew = washingtons[washingtons[washingtons['POPESTIMATE2015']]>washingtons[washingtons['POPESTIMATES2014']]]

    return grew[grew['STNAME'],grew['COUNTY']]

outcome = answer_eight()

assert outcome.shape == (5,2)

assert list (outcome.columns)== ['STNAME','CTYNAME']

print(tabulate(outcome, headers=["index"]+list(outcome.columns),tablefmt="orgtbl"))

ERROR

ValueError                                Traceback (most recent call last)
in ()
      6     return grew[grew['STNAME'],grew['COUNTY']]
      7 
----> 8 outcome = answer_eight()
      9 assert outcome.shape == (5,2)
     10 assert list (outcome.columns)== ['STNAME','CTYNAME']
in answer_eight()
      1 def answer_eight():
      2     counties=census_df[census_df['SUMLEV']==50]
----> 3     regions = counties[(counties[counties['REGION']==1]) | (counties[counties['REGION']==2])]
      4     washingtons = regions[regions[regions['COUNTY']].str.startswith("Washington")]
      5     grew = washingtons[washingtons[washingtons['POPESTIMATE2015']]>washingtons[washingtons['POPESTIMATES2014']]]
/opt/conda/lib/python3.5/site-packages/pandas/core/frame.py in __getitem__(self, key)
   1991             return self._getitem_array(key)
   1992         elif isinstance(key, DataFrame):
-> 1993             return self._getitem_frame(key)
   1994         elif is_mi_columns:
   1995             return self._getitem_multilevel(key)
/opt/conda/lib/python3.5/site-packages/pandas/core/frame.py in _getitem_frame(self, key)
   2066     def _getitem_frame(self, key):
   2067         if key.values.size and not com.is_bool_dtype(key.values):
-> 2068             raise ValueError('Must pass DataFrame with boolean values only')
   2069         return self.where(key)
   2070 
ValueError: Must pass DataFrame with boolean values only

I am clueless. Where am I going wrong?

Thanks

Answered by Ella Clarkson

You should pass a valueerror: must pass a dataframe with boolean values only when you compare a series or column in df with a scalar to produce a boolean mask. Do not pass the df multiple times.

def answer_eight(): counties=census_df[census_df['SUMLEV']==50]
regions = counties[(counties['REGION']==1]) | (counties['REGION']==2])]
washingtons = regions[regions['COUNTY'].str.startswith("Washington")]
grew = washingtons[washingtons['POPESTIMATE2015']>washingtons['POPESTIMATES2014']] return grew[['STNAME','COUNTY']]


Your Answer

Interviews

Parent Categories