Python – moving fields around in CSV

We get a lot of CSV formatted data that needs to be shuffled around and sent out as slightly different CSV data. Often the end-user wants columns 1, 6, and 10 of something, but ordered like so: 10, 6, 1. I’ve attached a very primitive version of a Python script that satisfies that need. I’ll be refining it to add some other useful items later. It demonstrates the use of the csv library and use of “enumerate” to return a both an item’s position in a list, as well as its value.

I look forward to your comments!

####################################################
#
# Ben Chapman - Thu Dec  3 10:09:53 EST 2009
# Released into the public domain
#
# Take input CSV file, write certain columns in arbitrary order to output file.
# Output field specification requires a list of every field in the input; put
# desired output field position or -1 if field should not be included.  See the
# get_info helper function for some help with defining the output_fields mask.
#
# VERSION .2
#
# Version .2 of this would autoscan the data and create an INI file
# based on the input file using ConfigParser and OptParse, making it easier for
# the end user to modify.  Default behavior would be to create column_mover.ini
# if only one file provided on command line. If two files provided on command
# line, then first is input, second is output and script would error out if no
# column_mover.ini found in current directory.
#
# Filenames, etc. would also not be hard-coded. :)
#####################################################

import sys
import csv

############################################
# CHANGE TO MATCH INPUT FILE
# The output_fields list is used as follows:
#     Input Col 0 - output col 2
#     Input Col 1 - output col 3
#     Input Col 2 - discarded
#     ...
#     Input Col 12 - output col 0
#     Input Col 13 - discarded
############################################

output_fields = [2, 3, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, 0, -1]

def get_info(fn):
    '''
    Utility function to give you information for filling in the output_fields
    values. In the interpreter, do `import column_mover` and then run
    column_mover.get_info('FILENAME').
    '''
    cr = csv.reader(open(fn, 'rU'))
    row = cr.next()
    print "Length of input: %s" % len(row)
    print "Sample output_field mask with -1 for each output position:"
    print [-1 for x in range(len(row))]
    print "nnn"
    print "Field #tttValue"
    print "=======ttt====="
    for x,y in enumerate(row):
        print "%sttt%s" % (x,y)

def check_template(row):
    '''Check for most basic error and abort if found '''
    if len(output_fields) != len(row):
        print "Error:"
        print "You must map each input field to an output."
        print "Aborting ..."
        sys.exit(1)
    else:
        return

def load_data(fn):
    '''
    Read a CSV file: CSV dialect must be manually adjusted inside of function for now.
    '''
    input_file = open(fn, 'rU')
    cr = csv.reader(input_file)
    rows = []
    for row in cr:
        rows.append(row)
    input_file.close()
    return rows

def process_row(row):
    '''process an input row and return an output row'''
    # Create a empty row of items based on number of requested non-empty
    # outputs
    tot_output_fields = len(output_fields) - output_fields.count(-1)
    out_row = ['' for x in range(tot_output_fields)]
    for counter, field in enumerate(output_fields):
        if field != -1:
            out_row[field] = row[counter]
    return out_row

def save_data(fn, data):
    ''' Open a file and write CSV; adjust dialect of CSV inside the function '''
    fout = open(fn,'wb')
    cw = csv.writer(fout)
    for row in data:
        cw.writerow(row)
    fout.close()

if __name__ == '__main__':
    data = load_data('PCOUNTER.LOG')
    check_template(data[0])
    out_data = []
    for row in data:
        out_data.append(process_row(row))
    save_data('OUTPUT.CSV',out_data)
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s