Pages

Friday, July 14, 2017

Python & Oracle SQL Hash + Regexp Replace Equivalents -- random note to self

Oracle SQL, for pulling from ERP:

standard_hash(regexp_replace(COLUMNNAMEGOESHERE, '\s', ''), 'SHA1')

Python, for cleaning up old values already loaded from ERP into Salesforce:

[hashlib.sha1(str(val).encode('utf-8')).hexdigest().upper() for val in (PANDASDATAFRAMEVARIABLENAMEGOESHERE['COLUMNNAMEGOESHERE'].str.replace(r'\s', ''))]

Note to self, lest I forget. This strips the whitespace (including line breaks) out of text, then condenses it into a 40-character hash of itself. Used for turning a big chunk of text that I expect to be unique in a table into a shorter Salesforce "unique external ID" text field when the table containing the text doesn't have a standard "primary key."

Thursday, July 6, 2017

REST APIs and Salesforce - A Primer

A non-programmer colleague was getting advice to integrate an external database with Salesforce by building a "REST API" inside of Salesforce, and I chimed in because for a long time, 3 different ways of referring to "REST APIs" and "Salesforce" confused me.
 

Here's a breakdown of them that I hope will facilitate discussions with developers and help you learn the right kind of programming for your needs, if you're teaching yourself:
 

  1. When people mention "building REST APIs inside of Salesforce," they mean using the "Apex" language to program a specific Salesforce database ("org") to ACCEPT requests FROM THE OUTSIDE WORLD to engage in a computer-to-computer data-transferring conversation over the web.

    The data can potentially flow both ways (more in a second), but only via the outside world knocking on your Salesforce org's door and saying, "Hey, I've got another communications request!" in the exact manner that you've said you want your Salesforce org to be communicated with.

    That's what "writing an API" means -- it means using code to define how a computer is willing to have its door knocked on and be talked to.

    When you "build a REST API" inside of Salesforce, you're building a door in your fortress and posting a guard by it who's cautious about how she interacts with strangers on the other side.
    • You might program your Salesforce org to know what to do with data from the outside world, but only when delivered correctly (e.g. "If you have something for the fortress, bring it in a series of pink 12x12x12 boxes, only paint the first one green and the last one purple, and put your name and return address on all of them -- if we like them, we'll figure out what to do with the stuff inside the boxes from there, and after we're done, we'll post a receipt on the door telling you whether we handled them neatly or burned them").
    • Or you might program your Salesforce org to hand out data from inside the fortress to the outside world, if asked just right ("Tap out 'shave-and-a-haircut-two-bits,' then the morse code for your name and return address, and a description of the data you want, and we'll FedEx a list to you").
    Or you might set up communications protocols for each, to really facilitate 2-way conversation--yet note that every piece of that conversation is initiated by some "outsider" piece of software knocking on your fortress's door in a specific way, no matter which way the "data you care about" is flowing.
     
     
  2. There's ALSO a general-purpose API, built into every Salesforce database ("org") in existence, that uses the "REST" communications protocol. Salesforce defines and maintains how it works. It's a door that Salesforce the corporation cut into your fortress and posted a guard at.

    This "API" is what the authors of "ETL" tools like Apex Data Loader & dataloader.io & Jitterbit & Zapier study carefully before writing such tools.

    Technically, you can write your own tool taking advantage of this protocol, but most people don't bother, since other companies have done such great work.

    Also, if you do option #1, you can sometimes INSERT more data in fewer "API calls" than by using tools that leverage the general-purpose Salesforce "REST" data API.

    Both option #1 & option #2 give you the chance to write code inside your Salesforce database that post-processes incoming data according to your business's needs -- although in the case of approach #1, you might bundle that logic into the back end of the API for efficiency, whereas in the case of approach #2, you have to use "triggers" and such.
     
     
  3. Finally, you can write Apex code within your Salesforce org that goes knocking on the doors of OTHER fortresses.

    Doing so might involve writing Apex that uses the "REST" protocol and someone else's "API," but it's not at all the same thing as the 1st concept we discussed.

    When learning how to do this, you'll often hear terms like "writing asynchronous Apex" and "making HTTP requests with Apex."

Thoughts:

Most admins' approaches to connecting an outside database land in the realm of approach #2, using someone else's "ETL" tool -- maybe with some "developer" help to write "triggers" to post-process incoming data according to your business needs. It's the easiest approach by far.

Some "apps" and "packages" and "plugins" might have done #1 as part of the software package they deliver to you (that is, they might cut a hole in your fortress and post a guard).

Personally, I've barely dabbled in "approach #2, roll-your-own-ETL-tool," and I've only been to classes and read books on approaches #1 & #3.

Watching from the outside, it seems to me that "approach #1" & "approach #2, roll-your-own-ETL-tool" achieve pretty similar business-need ends and require similar levels of developer skill.

  • It's just a question of whether you prefer to be constrained to programming in Salesforce's language, yet have more control over Salesforce database performance (approach #1),
  • Or whether you prefer to be a little less efficient, but have more control over the language in which you program ("approach #2, roll-your-own-ETL-tool").

(Although even in the case of approach #1, you'll probably be writing some sort of "ETL" software in your favorite language that does the actual knocking on your Salesforce database's ("org's") door -- it'll just probably be a lot simpler to write than whatever you would've written that relies solely upon Salesforce's standard, built-in API.)


Finally, a brief analogy for any Oracle developers following along:

REST-based data transmission : data flow in & out of Salesforce databases :: ODBC/JDBC-based data transmission : data flow in & out of traditional relational databases

Specifically, the built-in "approach #2" to using REST might be most comparable -- not sure how many database management systems let you write your own APIs into them (#1) or make callouts to internet services from them (#3).