Pages

Wednesday, July 20, 2016

Big Job: Apex vs. Manual CSV File Manipulation vs. Python-Is-Cool

In our Salesforce org, we have an object called "Extended Contact Information" that's in a "detail" relationship hanging off of "Contact."

It lets us use record-types to cluster information about Contacts that is only of interest to certain internal departments (and lets us segment, department-by-department, their answers to questions about a Contact that they might feel differently about ... such as whether they're the best way to get ahold of a company).

We also have a checkbox field for each internal department that lives on Contact, and that needs to be checked "true" if a department is working with someone. (Whenever a department attaches an "Extended Contact Information" object to someone, we have a trigger in place that checks that department's corresponding box on "Contact," but it can also be checked by hand.)

Our computers mostly care about this checkbox, but humans care about the "Extended Contact Information" record, so today I did an audit and noticed 4,000 people from our "Continuing Ed" department were flagged in the the checkbox as working with that department but didn't have "Extended Contact Information" records of that type attached.

I wrote the following Apex to execute anonymously, but due to a bazillion triggers downstream of "Extended Contact Information" record inserts, it hits CPU time execution limits somewhere between 100 at a time & 300 at a time.

List<Contact> cons = [SELECT Id, Name FROM Contact WHERE Continuing_Ed__c = TRUE AND Id NOT IN (SELECT Contact_Name__c FROM Extended_Contact_Information__c WHERE RecordTypeId='082I8294817IWfiIWX') LIMIT 100];

List<Extended_Contact_Information__c> ecisToInsert = new List<Extended_Contact_Information__c>();

for (Contact c : cons) {
    ecisToInsert.add(new Extended_Contact_Information__c(
        Contact_Name__c = c.Id,
        RecordTypeId='082I8294817IWfiIWX'
    ));
}

insert ecisToInsert;

Probably the fastest things to do are one of the following:

  1. run this code manually 40 times at "LIMIT 100"
  2. export the 4,000 Contact IDs as a CSV, change the name of the "Id" column to "Contact_Name__c," add a "RecordTypeId" field with "082I8294817IWfiIWX" as the value in every row, and re-import it to the "Extended Contact Information" table through a data-loading tool

But, of course, the XKCD Automation Theory part of my brain wants to write a Python script to imitate option #2 and "save me the trouble" of exporting, copying/pasting, & re-importing data. Especially since, in theory, I may need to do this again.

TBD what I'll actually go with. Python code will be added to this post if I let XKCD-programmer-brain take over.

As a reminder to myself, here's a basic Python "hello-world":

from simple_salesforce import Salesforce
sf = Salesforce(username='un', password='pw', security_token='tk')
print(sf.query_all("SELECT Id, Name FROM Contact WHERE IsDeleted=false LIMIT 2"))

Output:

OrderedDict([('totalSize', 2), ('done', True), ('records', [OrderedDict([('attributes', OrderedDict([('type', 'Contact'), ('url', '/services/data/v29.0/sobjects/Contact/xyzzyID1xyzzy')])), ('Id', 'xyzzyID1xyzzy'), ('Name', 'Person One')]), OrderedDict([('attributes', OrderedDict([('type', 'Contact'), ('url', '/services/data/v29.0/sobjects/Contact/abccbID2abccb')])), ('Id', 'abccbID2abccb'), ('Name', 'Person Two')])])])

And this:

from simple_salesforce import Salesforce
sf = Salesforce(username='un', password='pw', security_token='tk')
cons = sf.query_all("SELECT Id, Name FROM Contact WHERE IsDeleted=false LIMIT 2")
for con in cons['records']:
    print(con['Id'])

Does this:

xyzzyID1xyzzy
abccbID2abccb

Whoops. Looks like there aren't any batch/bulk insert options in "simple-salesforce," and I don't feel like learning a new package. Splitting the difference and grabbing my CSV file with Python, but inserting it into Salesforce with a traditional data-loading tool.

from simple_salesforce import Salesforce
sf = Salesforce(username='un', password='pw', security_token='tk')
cons = sf.query_all("SELECT Id, Name FROM Contact WHERE Continuing_Ed__c = TRUE AND Id NOT IN (SELECT Contact_Name__c FROM Extended_Contact_Information__c WHERE RecordTypeId='082I8294817IWfiIWX') LIMIT 2")

import csv
with open('c:\tempsfoutput.csv', 'w', newline='') as csvfile:
    fieldnames = ['contact_name__c', 'recordtypeid']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for con in cons['records']:    
        writer.writerow({'contact_name__c': con['Id'], 'recordtypeid': '082I8294817IWfiIWX'})

print('done')

And then the CSV file looks like this:

contact_name__c,recordtypeid
xyzzyID1xyzzy,082I8294817IWfiIWX
abccbID2abccb,082I8294817IWfiIWX

No comments:

Post a Comment