Pages, some stolen, some original

Friday, August 13, 2010

Importing CSV Dates Into A Spreadsheet

I use spreadsheets to organize data. Sometimes I will be looking into a subject, and I will find bits and pieces of information in several places, which makes it hard to compare, so I will consolidate all this data into one spreadsheet. It can be tedious copying data from each source into a cell in the spreadsheet. One can type it in by hand or use the computer's copy and paste functions. Either way is time consuming and tedious. I usually use copy and paste, it (mostly) eliminates typos, which are the bane of my home made typing technique.

Sometimes I will find a whole page of data, in text, where each line contains the essentially the same information about a different item (time, date, location, item number, etc.). If each bit of information is separated from the others by commas, the entire file can be imported into a spreadsheet in one fell swoop. Files like this are called CSV (Comma Separated Values) files and are given the .csv filename extension.

I tried importing such a file yesterday. Everything went smoothly, well mostly, except for the date. The dates in the file were in this form:

December 23, 2007

You will notice it contains a comma, which Google interpreted as a field separator, so December 23 went in one column and 2007 went in the next. That wouldn't be so bad, but then Google decided December 23 meant this year, tacked on 2010, and changed the format to 12/23/2010, which was bad.

So how do you fix this? One way is to edit the original file so each date is surrounded by double quotes, like this:

"December 23, 2007"

Another way is manually edit the dates once they are in the spreadsheet. Both methods are tedious, time consuming and error prone.

I decided I was going to make the spreadsheet do the conversion work for me. At first I tried just doing some simple math, but that doesn't allow for leap years, so if there is a leap day in the middle of the change, your date will come out wrong. Instead, we take the incorrect date apart using the month and day functions, and then make a new date by joining these values together along with the slash delimiters.

Say the mangled date ended up in column E, and the years ended up in column F. I inserted a column (G) after the years (column F). In this column I put a formula for correcting the date:

=concatenate(month(En), "/", day(En), "/", Fn)

where n is the row number. Write the formula in the first row as it should be, then copy it to all the other cells in the column. The copy function will automatically adjust the row numbers. Apply your preferred date format to column G, copy the entire column and paste values over the incorrect dates in column E. Delete columns F & G, and you're good to go.

Update: more than you or I should want to know from the Google Help Forum.

1 comment:

  1. "=concatenate(month(En), "/", day(En), "/", Fn)"
    Instead use:
    G1: =ArrayFormula(month(E:E)& "/"&day(E:E)&"/"& F:F)

    ReplyDelete