Introduction
Python has a rich ecosystem of libraries that make it a great language for data analysis. One of those libraries is pandas, which simplifies reading and writing data between in-memory data structures and different file formats.
However, while working with Excel files using pandas.read_excel, you might run into an error that looks like this:
xlrd.biffh.XLRDError: Excel xlsx file; not supported
In this Byte, we'll dissect this error message, understand why it occurs, and learn how to fix it.
What Is the Error "xlrd.biffh.XLRDError"
The xlrd.biffh.XLRDError is a specific error that you might encounter while working with the pandas library in Python. It is raised when you try to read an Excel file in the .xlsx format with pandas.read_excel while xlrd is the engine doing the reading.
Here's an example of the error:
import pandas as pd
df = pd.read_excel('file.xlsx')
Output:
xlrd.biffh.XLRDError: Excel xlsx file; not supported
Cause of the Error
The xlrd.biffh.XLRDError is caused by a change in the xlrd library, which pandas has traditionally used to read Excel files. As of version 2.0.0, xlrd supports only the older .xls file format and no longer reads the newer .xlsx format.
This change can come as a bit of a surprise if you've been using pandas.read_excel with xlrd. Older versions of pandas (before 1.2) use xlrd by default to read Excel files, so pairing one of them with xlrd 2.0.0 or later means .xlsx files can no longer be read unless you choose a different engine.
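To check whether this applies to your environment, it can help to look at which versions are installed. The following is a minimal sketch that relies only on the standard library's importlib.metadata; the helper names installed_version and major_version are made up for this illustration and are not part of pandas or xlrd.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def major_version(version_string):
    """Return the leading integer of a version string such as '2.0.1'."""
    return int(version_string.split(".")[0])


# Report the packages involved in reading Excel files.
for name in ("pandas", "xlrd", "openpyxl"):
    print(name, installed_version(name))

xlrd_version = installed_version("xlrd")
if xlrd_version is not None and major_version(xlrd_version) >= 2:
    # xlrd 2.0.0 and later read only the legacy .xls format.
    print("xlrd", xlrd_version, "is installed: it reads .xls files only")
```

If xlrd is reported at 2.0.0 or later and openpyxl shows None, reading an .xlsx file is likely to fail until openpyxl is installed.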
How to Fix the Error
The solution to this error is simple. You just need to install openpyxl and set the engine argument of pandas.read_excel so that it uses the openpyxl library instead of xlrd. Note that openpyxl reads the newer .xlsx format but not the older .xls format, so legacy .xls files should still be read with xlrd.
Here's how to do it:
First, you need to install the openpyxl library. You can do this using pip:
$ pip install openpyxl
Then, you can specify the engine argument in pandas.read_excel like this:
import pandas as pd
df = pd.read_excel('file.xlsx', engine='openpyxl')
This code reads the Excel file using the openpyxl library, and you will no longer encounter the xlrd.biffh.XLRDError.
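If a project has to handle both legacy .xls files and newer .xlsx files, one option is to choose the engine from the file extension. This is only a sketch under that assumption: pick_engine and ENGINE_BY_SUFFIX are hypothetical names introduced here, while "xlrd" and "openpyxl" are engine values that pandas.read_excel accepts.

```python
from pathlib import Path

# Hypothetical mapping for illustration: file extension -> pandas engine name.
ENGINE_BY_SUFFIX = {
    ".xls": "xlrd",        # legacy binary format, still read by xlrd
    ".xlsx": "openpyxl",   # newer XML-based format
    ".xlsm": "openpyxl",   # macro-enabled variant of .xlsx
}


def pick_engine(path):
    """Return an engine name for pandas.read_excel based on the file extension."""
    suffix = Path(path).suffix.lower()
    try:
        return ENGINE_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"Unsupported Excel extension: {suffix!r}") from None


print(pick_engine("file.xlsx"))
print(pick_engine("legacy_report.xls"))
```

The result can then be passed straight through, for example pd.read_excel(path, engine=pick_engine(path)). Keep in mind that this looks only at the file name, not the file contents, so a mislabeled file would still fail.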
Conclusion
In this Byte, we've learned about the xlrd.biffh.XLRDError that occurs when using pandas.read_excel to read .xlsx files. We've seen why this error occurs and how to fix it by using the openpyxl library.


