Easy! Easy! How to copy a table from a PDF into Word… for beginners

So the source document for your translation is a PDF – and it contains some complex tables! You just want to copy and paste them directly into your Word document so you can overtype the text with your translation… Right?

Well, if you have the professional version of Adobe Acrobat, then you’re in luck [1]. But if you only have the free reader, like most people, then you are going to have to use your wits if you want to avoid retyping all the data…

The skills needed to pull the data out of a complex table in a PDF and make it spring back to life in a Word document are actually very basic. What can look like a complex task can be done with a few simple tricks.

Here we break down the problem into a series of really simple steps. While each step seems to throw up yet another problem to be solved, each fix only ever requires really simple skills like Copy & Paste, Find & Replace… If you have mastered these simple skills, you don’t have to remember any correct “sequence” to do the job – just solve each little problem step by step until you have achieved your goal!

1 Copy the table in the PDF, and paste the data into Word

Select all the text of the table, copy it and paste it directly into Word. The result may not be a pretty sight!

Most of the formatting in the table will be lost – you’ll just have plain data.

It will look a terrible mess as the columns will have disappeared! In my example, the words in each of the column headings appear to be muddled up. Rather than wrapping within each cell, the words on each line run into the words of the next column.

But don’t worry about it! Untangling this apparent mess is easy.

2 Click the Show/Hide button

Make sure you have the formatting marks visible so you can see what is going on, and how the data in the table is structured.

The columns of data are clearly separated by spaces. We can use these spaces to reconstruct the columns. But the words in the column headings are also separated by spaces. Sometimes these spaces show where the columns are supposed to be and sometimes they are just ordinary “spaces between words”. Figuring out which is which is the only part of this job which requires a little human intelligence. This is your job!

Leave the spaces which are supposed to be spaces as spaces, and change the spaces which are meant to show where the columns are into something else.

Easy!

Let’s use tabs to mark where the columns are supposed to be.

3 Spaces to tabs

  • Although there are 7 columns of data in my example table, there are only two column headings in the top row. So in this row we only need to change one space to a tab to separate the two pieces of text. Select the space which separates the two headings, and press the Tab key:

  • We now do the same for the 7 column headings. Remember you just have to decide whether it’s a “real space” or not. You don’t need to line everything up and it doesn’t matter too much if you make a few mistakes – you can fix these up later once the table has been made (when you’ll be able to see what you’re doing!). In the figure below, the blue circle shows a “real” space, the red circle shows a space replaced with a tab.
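If it helps to see the idea written down, here is a small Python sketch of what you are doing by hand. It is only an illustration: the function name `tab_at` and the sample heading are made up, and the choice of which spaces are column separators still has to come from you.

```python
def tab_at(line: str, separator_spaces: set[int]) -> str:
    """Turn the chosen spaces (counted from 0, left to right) into tabs."""
    out = []
    space_index = 0
    for ch in line:
        if ch == " ":
            # A "real" space stays a space; a column separator becomes a tab.
            out.append("\t" if space_index in separator_spaces else " ")
            space_index += 1
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical heading row: the second space (index 1) separates two columns.
heading = "Daily dose Body weight"
fixed = tab_at(heading, {1})
```

Here the spaces inside “Daily dose” and “Body weight” are left alone, and only the space between the two headings becomes a tab.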

Now we have to do the same with the data in the body of the table. In my example table, there are only letters and numbers in the data – there are no “real spaces” separating words. All the spaces mark where the data is to be separated into columns. Instead of changing them to tabs one by one, we can simplify the task by changing all of them in one hit using Find & Replace:

  • Select all the data;
  • Open the Find & Replace dialog box;
  • Type a space into the Find what field;
  • Type Word’s code for a tab (^t) into the Replace with field;
  • Hit Replace All.

All the text and the data should now have tabs to mark the columns (and spaces to mark the real spaces).
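For anyone who would rather script this replacement outside Word, the same idea can be sketched in a few lines of Python. Treat it as a rough sketch: the name `spaces_to_tabs` and the sample rows are invented for illustration, and it assumes, as in my example table, that the body data contains no “real” spaces. Unlike Word's Replace All, it also collapses a run of several spaces into a single tab.

```python
import re


def spaces_to_tabs(line: str) -> str:
    """Replace each run of spaces in a data row with a single tab."""
    # strip() drops stray spaces at the ends so they don't become empty columns.
    return re.sub(r" +", "\t", line.strip())


# Hypothetical data rows as they might look after pasting from the PDF.
rows = ["A1 12 34 56 78 90 11", "B2 21 43 65 87 09 22"]
tabbed = [spaces_to_tabs(row) for row in rows]
```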

4 Now make the table

As we now have tabs marking where the columns are supposed to be, we can use Word’s Convert text to Table function to reconstruct a simple, regular table. (We can sort out the irregularities later.)

  • Select all the text & data that are to go into the table;
  • Go to Insert|Table|Convert text to Table;

  • Word has correctly guessed that this is a 7-column table from the highest number of tabs you’ve put into any line, and that you are going to use these tabs to set up the columns.
  • Click OK.

Magic!

We are almost there. There’s only a bit of tidying up to do!
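To see roughly why the tabs are enough to rebuild the table, here is a small Python sketch of the same logic. It is not how Word does it internally, just an illustration under simple assumptions: the function name `text_to_table` and the sample text are made up, the column count is taken from the line with the most tabs, and shorter rows are padded with empty cells.

```python
def text_to_table(text: str) -> list[list[str]]:
    """Split tab-separated lines into rows of equal width."""
    rows = [line.split("\t") for line in text.splitlines() if line]
    # The widest row decides how many columns the table has.
    width = max(len(row) for row in rows)
    # Pad shorter rows with empty cells so the table is regular.
    return [row + [""] * (width - len(row)) for row in rows]


# Hypothetical text: a two-heading top row above seven columns of data.
sample = "Group A\tGroup B\na\tb\tc\td\te\tf\tg\n1\t2\t3\t4\t5\t6\t7"
table = text_to_table(sample)
```

Notice that the two top-row headings land in the first two cells of a 7-column row, which is why some headings can end up in the wrong place once the table is made.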

5 Fix the top row.

We’ve now got a nice regular 7-column table, but there are some minor irregularities to deal with. The column headings in the top row are not only in the wrong place, but they are also supposed to span several columns.

Easily fixed!

  • Just select the text and drag it to the right place;
  • Then select the cells the text is supposed to span;
  • Right click the selected cells and click Merge Cells.

6 Fix the column headings

Look at the words in the 7 column headings. The PDF inconveniently split them up over three rows rather than wrapping them within a single cell.

We want all the words to be in a single cell at the top of each column. Easily fixed! We just need to merge these cells vertically.

  • Select the cells containing the text for each column heading;
  • Right click and select Merge Cells;
  • Do the same for the other columns (or, to do it more quickly, select the cells and press Ctrl + Y, which repeats the last thing you did).

Each of the column headings is now in its own separate cell. But as you can see in the red circle above, we have another small problem to deal with – the words are separated by unnecessary paragraph markers. We need to get rid of these and replace them with ordinary spaces. You could just delete them one by one, but here’s a quicker way using Find & Replace:

  • Select all the column headings right across the table;
  • Do a Find & Replace. Type the code for a paragraph marker (^p) in the Find what field;
  • Type a space in the Replace with field;
  • Click Replace All.

With the extra paragraph marks gone, each column heading will wrap normally within its own cell.
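The merge-then-replace routine in this step can also be sketched in Python, for readers who like to see the logic spelled out. This is a simplified illustration only: the function name `merge_heading_rows` and the sample headings are invented, a paragraph marker is represented by a newline, and empty fragments are simply skipped, which is an assumption about the tidy result rather than a description of exactly what Word does.

```python
def merge_heading_rows(rows: list[list[str]]) -> list[str]:
    """Merge each column's heading fragments into one cell of text."""
    merged = []
    for column in zip(*rows):
        # Merging the cells stacks the fragments as separate paragraphs...
        stacked = "\n".join(part for part in column if part)
        # ...and swapping the paragraph breaks for spaces lets the text wrap.
        merged.append(stacked.replace("\n", " "))
    return merged


# Hypothetical headings that the PDF split over three rows.
heading_rows = [
    ["Mean", "Maximum"],
    ["daily", "daily"],
    ["dose", "dose"],
]
headings = merge_heading_rows(heading_rows)
```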

(Now is a good time to do a quick proofread of the column headings. Now that the data is in a table and we can see what we’re doing, it’s easy to move any words which have ended up in the wrong column. Just select any misplaced words and drag them to the right spot.)

7 Final tidy up

You should now have a table with everything in the right place. It just needs a cosmetic make-over to make it look like the original:

  • Select all the text in the table and click the “Centre button” to centre the text in the columns [2];
  • Adjust the font, point size, paragraph and line spacing;
  • Select the whole table [2] and get rid of all the borders; then
  • Reinstate just those borders you need to match the original.

8 One last problem

My table is pretty much the same as the original in the PDF.

But wait! My column headings don’t line up horizontally …

We need to adjust the table property which controls how the text sits in each cell. Select the offending row, right-click and select Cell Alignment|Align top Centre.

PDF to Word… The job is done!


[1] How to copy and paste data and tables without the loss of formatting with the professional version of Adobe Acrobat: http://www.wikihow.com/Copy-and-Paste-PDF-Content-Into-a-New-File

[2] Note the difference between “select the table” and “select all the text in the table”. If you “select the table” and hit the “Centre” button, the whole table will move into the centre of the page. If you just “select all the text”, the text in the table will be centred within each cell. To tell the difference, look at the very right-hand side of the table.

[3] Some PDF to Word converters are worth trying. I tried this one (using the example tables in this post) with some success: http://www.pdfonline.com/pdf2word/index.asp.

[4] The examples in this post were illustrated using Microsoft Word 2007 and Adobe Reader X.


6 Responses to Easy! Easy! How to copy a table from a PDF into Word… for beginners

  1. Martin says:

    More great stuff – even with a neat tool like ABBYY FineReader, which enables you to mark text so the program will recognise it as a table and format it accordingly, there are occasions where the recognition is less than perfect, and some or all of the steps you outline above will still need to be followed to patch it up, so these routines should be in everyone’s store of basic WP skills as a matter of course.

  2. John Moody says:

    Well, I use the PDF Copy Paste software, which allows me to crop my desired portion of any PDF document, and send it to Microsoft Word with a click.

  3. John Moody says:

    PDF Copy Paste software can be found at http://www.PDFCopyPaste.com

  4. Yevgeny says:

    Very helpful

