Pages

Monday, March 21, 2016

Alphabetize CSV Column Headers With Python

Today I need to dump a 200-column CSV out of a "central university" Salesforce org, give it to a business user, and ask her which columns she needs me to set up a "landing spot" for in our "subschool" org.

The least I can do is have it in alphabetical order for her.

Here's a Python script that can do just that (plus move ID to the front), after dataloader.io exported the columns in a seemingly random order.

If you're doing inter-data-store integration and you're not using Python (especially the "CSV" & "Pandas" modules) to help you with the CSV files, try it. You'll like it.

import pandas
import csv

inpath = 'C:\\WHEREVER_YOU_PUT_YOUR_FILE\\'
infilename = 'WHATEVER_YOU_NAMED_YOUR_FILE.csv'
infilepath = inpath + infilename
outpath = 'C:\\WHEREVER_YOU_WANT_YOUR_NEW_FILE\\'
outfilename = 'NAME_FOR_YOUR_NEW_FILE.csv'
outfilepath = outpath + outfilename

# CSV to Pandas:
df = pandas.read_csv(infilepath, encoding='latin-1',  header=0)

# Make a list of alphabetized column headers (with ID at the front):
sortedColumnList = sorted(df.columns)
sortedColumnList.insert(0, sortedColumnList.pop(sortedColumnList.index('Id')))

# Make a new Pandas object arranged alphabetically:
df2 = df.reindex_axis(sortedColumnList, axis=1)

# Write the new Pandas object to disk:
df2.to_csv(outfilepath, index=False, quoteing=csv.QUOTE_ALL)
You can replace the last line of code with this code if you want to remove columns where all of the rows are empty, anyway:
# Drops blank columns.  (Nothing to do with alphabetization.)
df3 = df2.dropna(axis=1, how='all')
# Write the new Pandas object to disk
df3.to_csv(outfilepath, index=False, quoteing=csv.QUOTE_ALL)

Actually, I think you can just add ".dropna(axis=1, how='all')" to the end of the "reindex_axis()" call and still print df2 - and actually, you can also just set df to equal itself with something done to it and print df (e.g. "df = df.reindex_axis().dropna()"), but I decided to favor legibility over style.

To execute this, you can just paste it into a plain-text file you call alphabetizecolumnheaders.py, install WinPython on your computer, bring up your ".py" file in the "Spyder" python IDE, and click "Run."

Credit to StackExchange here & here & here.

No comments:

Post a Comment