Pages

Wednesday, May 24, 2017

Python CSV Example - Back-Filling Nearby Zip Code Distances

Today I used the following small Python script to add a bunch of distances to a new field called "Miles_From_Us__c" on 40,000 pre-existing Contact records.

The process was as follows:

  1. Export "SELECT ID, MAILINGPOSTALCODE, MILES_FROM_US__C WHERE MAILINGPOSTALCODE != NULL AND MAILINGPOSTALCODE != '' AND MILES_FROM_US__C = NULL" from our Salesforce org to a CSV file on my hard drive at "c:\tempexample\contactzipextract.csv" using the "Apex Data Loader" software.
  2. Run the Python script in this blog post instead of bashing my head against the wall with Excel.
  3. Import the CSV file on my hard drive at "c:\tempexample\contactzipupdate.csv" back into our Salesforce org as an "UPDATE" operation using the "Apex Data Loader" software.

 

import pandas

zips = ['11111','22222','33333','44444']

zipstodist = {'11111':0, '22222':1, '33333':9, '44444':21}

df = pandas.read_csv('c:\\tempexample\\contactzipextract.csv', dtype='object')

df = df[df['MAILINGPOSTALCODE'].str[:5].isin(zips)]

df['MILES_FROM_US__C'] = df['MAILINGPOSTALCODE'].str[:5].apply(lambda x : zipstodist[x])

df[['ID','MILES_FROM_US__C']].to_csv('c:\\tempexample\\contactzipupdate.csv', index=0)

Here's an explanation of what my script is doing, line by line, for people new to Python:

  1. Says that I'll be using the "Pandas" plug-in to Python.
  2. Says which U.S. postal codes ("zip codes") I care about as a Python "list" that I give a nickname of "zips."
    • I got this from the first column of a CSV file of "zip codes of interest" that a consultant sent one of my users.
  3. Defines the "distance from us," in miles, of each zip code of interest, as a Python "dict" that I give a nickname of "zipstodist."
    • I got this from the first and second columns of a CSV file of "zip codes of interest" that a consultant sent one of my users.
    • I could have cross-checked the two CSV files in Python, but I had already transformed the CSV data into this list style when writing some Salesforce Apex code for future maintenance, so copying/pasting it into Python was easier in this case.
  4. Reads my CSV file into Python as a Python "Pandas dataframe" that I give a nickname of "df."
  5. Filters out any rows whose first 5 characters of the value in the "MAILINGPOSTALCODE" column aren't among my zip codes of interest.
  6. Sets each row's"MILES_FROM_US__C" to the distance corresponding to the first 5 characters of that row's "MAILINGPOSTALCODE" value.
  7. Exports the "ID" & "MILES_FROM_US__C" columns of my filtered, transformed dataset out to a new CSV file on my hard drive.

No comments:

Post a Comment