PDF is one of the most commonly used formats for sharing data. PDF files are not easily editable, and sometimes there is no direct way to import them into Excel. I will show you multiple ways of importing data from PDF into Excel files and explain the pros and cons of each one.
Convert using Online Tools
For example, you can use the PDFtoExcel feature from https://www.ilovepdf.com/.
To complete the conversion, you upload the PDF file and then download the resulting Excel document.
How to Securely Convert Data (on Your Desktop)
When using online tools, there is always a concern about the security of your data. If you use any tool outside of your desktop for conversion, somebody else will get a copy of your data, which may be undesirable or even prohibited by government rules or regulations.
The most secure way of converting your data is to use tools on your desktop. This way your data files never leave your desktop and you are always in control.
1. Convert by using Microsoft Word
- Microsoft Word allows opening PDF files for editing.
- Select the PDF file from the Open dialog box, and after a few seconds Word brings the entire PDF file into an editable document.
- Depending on how the PDF is structured, the entire data set could come through as a table in Microsoft Word.
- The table can be copied and pasted into Microsoft Excel and formatted to meet your requirements.
2. Ask your source to provide data in a different format
One of the most overlooked methods of importing data is to ask your source to provide it in a different format.
Getting data in a format other than PDF would allow you to establish an ongoing connection between Excel and the source of the data. This might be especially helpful if you are planning to refresh data from this source more than once, since you will be able to skip all the manual steps and refresh by just clicking a button.
There are different formats in which data could be shared with you. One of the most popular is Comma Separated Values (CSV). When a CSV file is opened in Notepad, you can see that all the values are separated by commas. Microsoft Excel also supports many other formats, such as XML and regular text files.
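To illustrate why CSV is easy to work with, here is a minimal Python sketch that parses comma-separated text into rows and named columns using the standard `csv` module. The sample data, the column names (`Purchaser`, `Amount`), and the function name `parse_csv_text` are assumptions made up for this illustration and do not come from any real export.

```python
import csv
import io

# Hypothetical sample of what a CSV file looks like when opened in Notepad.
# A value that itself contains a comma is wrapped in double quotes.
SAMPLE_CSV = 'Purchaser,Amount\n"Smith, John",120.50\nJane Doe,75.00\n'


def parse_csv_text(text):
    """Return the CSV text as a list of dictionaries, one per data row."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


rows = parse_csv_text(SAMPLE_CSV)
for row in rows:
    print(row["Purchaser"], row["Amount"])
```

Because the header row names each column, every value lands in the right field without any manual splitting, which is what makes CSV much easier to refresh than a PDF.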
To better understand the available options, look at the Data tab on the Excel ribbon. For example, you can ask whether the data can come straight from the database, so that you connect to the database directly, or whether it can be provided as a file in one of the typical file formats.
Importing from online sources such as Azure or other online services is also an option.
So, don't hesitate to ask your source to provide data in a different format, as it could be the easiest option for you when dealing with large volumes of data.
3. Use Tabula Open Source Tool to Convert PDF to Excel
You can also use the open source tool Tabula to convert from PDF to Excel: https://tabula.technology/
An open source project is one where anybody can readily see the code that went into it. Open source projects are also usually community-based. Typically, open source projects are free, but the teams of programmers behind them often accept donations from supporters around the world.
Tabula is an open source project with multiple developers around the world contributing to the tool. In the case of Tabula, the team focuses on solving the problem of importing data from PDF to Excel securely, using desktop tools.
Tabula needs a copy of Java installed in order to work on Windows and Linux.
The best thing about using Tabula is that the converted data never leaves your desktop, so it stays secure because nothing is uploaded to a cloud platform. One of the cons of Tabula is that it works only with text-based PDF files, not with scanned documents.
4. Copy and Paste from PDF to Excel
Simple copy and paste happens to be one of the most often overlooked ways of converting PDF files to Excel. All that needs to be done is to highlight all the data, copy it (Ctrl+C), and paste it into Microsoft Excel.
With this method, the data is often not separated into Excel columns correctly: the values typically arrive separated by spaces, and some of the columns may contain three words or more.
To use this technique effectively, you need to understand the structure of the data very well. You can use the Text to Columns feature of Microsoft Excel and split the data into columns using logic that fits that structure. This method is not the easiest, but sometimes it might be the only option for getting the data into Excel.
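As a rough illustration of the kind of logic involved, the Python sketch below splits pasted lines where every value is separated by a space and one column contains several words. The layout (first word is an order ID, last word is an amount, everything in between is a name), the sample lines, and the function name `split_pasted_line` are all assumptions invented for this example; real data will need its own rules.

```python
def split_pasted_line(line):
    """Split one pasted line into (order_id, name, amount).

    Assumed layout: first word = order ID, last word = amount,
    everything in between = a name that may contain several words.
    """
    # Take the first word off the front, then the last word off the back.
    order_id, rest = line.strip().split(None, 1)
    name, amount = rest.rsplit(None, 1)
    return order_id, name, amount


# Hypothetical text as it might look after pasting from a PDF.
PASTED = """1001 Mary Ann Van Buren 250.00
1002 John Smith 99.95"""

table = [split_pasted_line(line) for line in PASTED.splitlines()]
for record in table:
    print(record)
```

Splitting from both ends keeps the multi-word name together, which a plain split on every space would not do.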
5. Convert from Adobe Acrobat
If none of the methods of getting data into Excel worked for you, you can try Adobe's paid service. To do that, open the PDF file in Adobe Acrobat and select the option to convert to Word, Excel, or PowerPoint. Click Export to Excel and you will be presented with a sign-in screen.
This is a paid service, and to take advantage of it you need to select one of the subscription options offered. The cheapest one at the time of this recording is $2 per month. You can also purchase Adobe Acrobat Pro, where conversion to Word, Excel, and other formats is included in the subscription.
Cleansing of Data after Import
Sometimes, after you get the data into Excel, it might require additional cleansing. Let's look at some additional steps you might need to take once you have imported the data.
Use the Excel Text to Columns feature
Microsoft Excel provides numerous tools for cleansing data. One of these tools is the Text to Columns feature. For example, you might realize that the purchaser name contains both the first name and the last name in one cell. With Text to Columns, you can separate the first name from the last name and put them into separate columns.
Let's look at how to use the Text to Columns feature step by step:
- Consider renaming the column. An appropriate name might be "First and Last".
- Select the column you want to split, in our case column D.
- Go to the Data tab and select the Text to Columns feature.
- The text here is space delimited, so we select Delimited on the first screen, select Space on the second screen, and click Next.
- We can assign the data type for the converted values. We select Text for both columns and click Finish.
- Excel will ask whether we want to replace the existing data, because the split creates an extra column E, which is empty.
- We select OK, and now column D contains the first name and column E contains the last name.
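For readers who think in code, the same space-delimited split can be sketched in Python. This is only an illustration of the idea, not what Excel runs internally; the function name `text_to_columns` and the sample names are hypothetical. Note that a name with a middle name produces a third column, just as a split on every space would.

```python
def text_to_columns(values, delimiter=" "):
    """Split each cell on the delimiter and pad rows to the same width."""
    split_rows = [value.split(delimiter) for value in values]
    # Widest row decides how many columns are needed (0 for an empty list).
    width = max((len(row) for row in split_rows), default=0)
    return [row + [""] * (width - len(row)) for row in split_rows]


# Hypothetical contents of column D before the split.
column_d = ["John Smith", "Jane Doe", "Mary Ann Lee"]

result = text_to_columns(column_d)
for row in result:
    print(row)
```

The padding step mirrors the way shorter names simply leave the extra column empty.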
Use Power Query
Power Query in Excel provides a quick way to cleanse the imported data.
For example, you might notice that the purchaser's address contains the street address as well as the zip code, separated by a slash (/). To convert this into two separate columns, you first need to convert your data into a table.
- To do that, select the first cell and press Ctrl+Shift+End to extend the selection to the last cell of the data range. Then go to Insert, choose Table, and click OK. The data is now converted into a table.
- Now you can use the Data tab to transform data from an existing table or range.
- Click the From Table/Range button to launch Power Query.
- Select the purchaser's address column, then select the Split Column option in Power Query.
- Since we know that the slash is the delimiter in the purchaser's address information, choose By Delimiter and enter the slash with a space before it and a space after it (two spaces in total).
- You will see that Excel has created two columns: one contains the purchaser's street address and the other contains the zip code.
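The split-by-delimiter step above can be sketched in Python as follows. This is a hedged illustration rather than what Power Query does internally: the column names (`Purchaser Address`, `Street Address`, `Zip Code`), the sample records, and the function name `split_address` are assumptions, and the delimiter is assumed to be a slash with one space on each side.

```python
DELIMITER = " / "  # assumption: a slash with one space on each side


def split_address(record, column="Purchaser Address"):
    """Return a copy of the record with the address column split in two."""
    street, found, zip_code = record[column].partition(DELIMITER)
    # Keep every other column and replace the combined one with two new ones.
    updated = {key: value for key, value in record.items() if key != column}
    updated["Street Address"] = street.strip()
    updated["Zip Code"] = zip_code.strip() if found else ""
    return updated


# Hypothetical table rows, one dictionary per row.
table = [
    {"Purchaser": "John Smith", "Purchaser Address": "12 Main St / 10001"},
    {"Purchaser": "Jane Doe", "Purchaser Address": "7 Oak Ave / 94105"},
]

cleaned = [split_address(record) for record in table]
for record in cleaned:
    print(record)
```

Including the surrounding spaces in the delimiter means the resulting values do not carry stray leading or trailing spaces.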
About the author:
My goal is to help people with their education, certification tests and assessment test needs and this is why I have created this web site. Make sure to stay in touch and Subscribe to my YouTube Channel and join my email list - this way you are not missing out on any new episodes.
Thanks again for your support, good luck learning and wish you all the best!!!