API Optimization Using Bulk API and Bulk API 2.0

    Overview

    This article describes Bulk API and Bulk API 2.0, and how they can be used when backing up a Salesforce service with Recover. It also outlines how to configure both APIs to best optimize your backups.

    The enhancement increases customer backup flexibility, optimizes backups, and can decrease run time.

    What are Bulk APIs?

    In the application, Bulk APIs are used to optimize loading or deleting large sets of data. When compared to using only REST APIs, using Bulk APIs reduces the backup run time, especially when dealing with large objects (such as large tables with a large number of records).

    The application includes a feature that enables the use of Bulk APIs (as opposed to REST APIs). By default, this feature is enabled. The recently released Bulk API 2.0 enhancement lets you define, for each backup service, the number of Bulk API batches per day and/or the total size of query results for Bulk API 2.0. These settings help reduce backup run times and optimize Bulk API consumption in Salesforce.

    NOTE: By default, Bulk API is consumed first, up to the limit set, and then Bulk API 2.0. To change this, contact support.
    NOTE: The minimum threshold for automatically applying (triggering) Bulk API consumption to a Salesforce object is 500,000 records.
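
    The two notes above can be read as a simple selection rule. The following is a minimal sketch of that rule, not the application's actual implementation. The function name, the return labels, the 10,000-records-per-batch figure, and the assumption that an object moves to Bulk API 2.0 as soon as its batches no longer fit in the remaining Bulk API budget are all hypothetical.

```python
from math import ceil

BULK_THRESHOLD_RECORDS = 500_000  # minimum object size that triggers Bulk API (from the note above)
RECORDS_PER_BATCH = 10_000        # assumption: records per Bulk API batch


def choose_api(record_count, batches_used, daily_batch_limit):
    """Return a label for the API a backup might use for one object (hypothetical helper)."""
    # Objects below the threshold are not switched to Bulk API automatically.
    if record_count < BULK_THRESHOLD_RECORDS:
        return "REST"
    # Default order: consume Bulk API first, up to the configured batch limit.
    batches_needed = ceil(record_count / RECORDS_PER_BATCH)
    if batches_used + batches_needed <= daily_batch_limit:
        return "BULK"
    # Assumption: once the Bulk API budget cannot cover the object, use Bulk API 2.0.
    return "BULK_2_0"
```

    For example, under these assumptions choose_api(15_000_000, 14_000, 15_000) selects Bulk API 2.0, because the 1,500 batches needed no longer fit in the remaining 1,000.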

    What is Bulk API 2.0?

    Bulk API 2.0 is optimized for loading or deleting large sets of data. It can be used to query, insert, update, or delete many records by submitting batches.

    NOTE: We use Bulk API 2.0 only for querying.

    Any data operation that involves more than 2,000 records is a good candidate for Bulk API 2.0, which executes and manages the operation as a Salesforce workflow.
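
    For orientation, a Bulk API 2.0 query job is created in Salesforce with a POST request to the jobs/query resource. The sketch below only builds such a request and does not send it. It is not taken from the application: the function name, the instance URL, the access token, and the default API version are placeholders chosen for illustration.

```python
import json
from urllib.request import Request


def build_query_job_request(instance_url, access_token, soql, api_version="v58.0"):
    """Build (but do not send) a request that would create a Bulk API 2.0 query job."""
    # api_version is an assumed placeholder; use the version your org supports.
    url = f"{instance_url}/services/data/{api_version}/jobs/query"
    body = json.dumps({"operation": "query", "query": soql}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical org URL and token, for illustration only.
request = build_query_job_request(
    "https://example.my.salesforce.com", "<access token>", "SELECT Id, Name FROM Account"
)
```

    A single job of this kind covers one query, regardless of how many records the query returns.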

    The Differences between Bulk API and Bulk API 2.0

    The main difference between Bulk API and Bulk API 2.0 is that Bulk API 2.0 uses a different limit mechanism than the older Bulk API. The older Bulk API limits the number of records per batch and the number of batches per day (for example, 15,000 batches per day, where each batch can have up to 10,000 records, gives 15,000 x 10,000 = 150 million records per day). Bulk API 2.0 limits, by contrast, are based on the maximum size of query results (in GB) per backup. This limit mechanism makes Bulk API 2.0 particularly useful for optimizing API consumption for large objects, such as large tables.
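
    The daily record ceiling of the older Bulk API follows directly from its two limits. As a small sketch of that arithmetic (the function name is hypothetical; the figures are the example values from the paragraph above):

```python
def bulk_api_daily_record_capacity(batches_per_day, records_per_batch):
    """Maximum records per day under batch-based limits, assuming every batch is full."""
    return batches_per_day * records_per_batch


# Example values from the text: 15,000 batches per day, 10,000 records per batch.
capacity = bulk_api_daily_record_capacity(15_000, 10_000)
print(f"{capacity:,} records per day")
```

    In practice batches are rarely all full, so the real number of records backed up per day under this limit is usually lower.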

    Backup for Salesforce using Bulk API 2.0

    Bulk API 2.0, available from API version 41.0, offers a new way of querying data, similar to Bulk API. The main difference is the daily limits: Bulk API 2.0 limits the maximum number of records (100 million) per 24-hour period, instead of limiting the number of Bulk jobs and batches. Both Bulk API and Bulk API 2.0 use the same REST API framework as other Salesforce REST APIs.
    We recommend using Bulk API 2.0 for customers who might exceed the Bulk API consumption limit, as well as for speeding up backups of large data sets with large tables.
    For more information on Backup for Salesforce using Bulk API 2.0, see here.

    Limits and Other Considerations

    When defining API Consumption limits in the Options page, several considerations should be taken into account, based on the data characteristics of the specific org that’s being backed up.

    Some Salesforce orgs have data sets consisting of many records that are not necessarily large, while other orgs have data sets with a moderate number of records that are each data-heavy. Data sets with a large number of records can therefore consume a large number of Bulk API batches, even if they are not data-heavy. API consumption can be optimized by using Bulk API 2.0 together with the older Bulk API method, at the same time.

    For example, if each batch is defined to contain up to 10,000 records, then a data set with 15 million records will consume 1,500 batches (that is, 1,500 API queries per day, using Bulk API). If the batch limit is set to 15,000 batches per day, then the table in this example would use up 10% (1,500 batches) of the entire day’s limit.
    Therefore, we have added Bulk API 2.0 as an option for customers who might run up against the 15,000-batch limit. Bulk API 2.0 can now be used along with the older Bulk API method, at the same time.
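
    The batch arithmetic in this example can be checked with a short calculation. This is only a sketch of the numbers above; the function name is hypothetical, and rounding up a partially filled final batch is an assumption.

```python
from math import ceil


def bulk_api_batch_usage(record_count, records_per_batch, daily_batch_limit):
    """Return (batches consumed, percentage of the daily batch limit) for one data set."""
    # Assumption: a partially filled final batch still counts as one batch.
    batches = ceil(record_count / records_per_batch)
    percent_of_limit = 100 * batches / daily_batch_limit
    return batches, percent_of_limit


# Figures from the example: 15 million records, 10,000 per batch, 15,000 batches per day.
batches, percent = bulk_api_batch_usage(15_000_000, 10_000, 15_000)
print(f"{batches:,} batches, {percent:.0f}% of the daily limit")
```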

    With the Bulk API 2.0 method, query jobs do not need to consume 1,500 batches. Because each job corresponds to one table, the 15 million Account records consume only one query job (out of the daily limit of 10,000 jobs). Using Bulk API 2.0 for jobs that process a large number of records in a single table therefore minimizes the waste of Bulk API calls and saves them for other jobs.
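
    To make the comparison concrete, the sketch below counts what the same table costs under each method, using only the figures quoted in this section. The helper names and the dictionary of tables are hypothetical and serve only as an illustration.

```python
from math import ceil

RECORDS_PER_BATCH = 10_000  # Bulk API batch size from the example in this section
DAILY_BATCH_LIMIT = 15_000  # Bulk API batches per day from the example in this section
DAILY_JOB_LIMIT = 10_000    # Bulk API 2.0 query jobs per day, as stated above


def bulk_api_batches(record_counts):
    """Bulk API cost: one batch per 10,000 records (rounded up), summed over tables."""
    return sum(ceil(count / RECORDS_PER_BATCH) for count in record_counts.values())


def bulk_v2_query_jobs(record_counts):
    """Bulk API 2.0 cost: one query job per table, regardless of record count."""
    return len(record_counts)


tables = {"Account": 15_000_000}  # the table from the example above
batches_needed = bulk_api_batches(tables)
jobs_needed = bulk_v2_query_jobs(tables)
# Batches left for other objects if this table is moved to Bulk API 2.0.
batches_left_for_others = DAILY_BATCH_LIMIT
batches_left_if_kept_on_bulk = DAILY_BATCH_LIMIT - batches_needed
```

    Under these figures, moving the table to Bulk API 2.0 leaves the full 15,000 batches available to other objects, at the cost of one of the 10,000 daily query jobs.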

    NOTE: By default, the maximum volume of data per daily backup is set to 300 GB. This limit can be configured by the user.

    For more information on managing API limits, see Recover for Salesforce Managing API Limits.

     
