Before archiving, it is important to understand the difference between ContentDocument and ContentVersion, which are two separate objects in Salesforce.
Whenever a file is uploaded to the Salesforce CRM library, or uploaded to a record from its record page, a ContentDocument is created, reflecting the record details and relations to the parent object. Each ContentDocument has one or more ContentVersion records, one for every uploaded version of the file. For more information on how the ContentVersion object is structured, please refer to the following Salesforce knowledge article.
The following query returns the ContentDocumentLink records, and with them the ContentDocument IDs, of all files that have an Opportunity parent record:
SELECT Id, ContentDocumentId, LinkedEntityId FROM ContentDocumentLink WHERE LinkedEntityId IN (SELECT Id FROM Opportunity)
A ContentDocument’s fields and relations are described under the ContentVersion object in Salesforce Setup.
Run the following query to extract record IDs:
SELECT ContentDocumentId, Id FROM ContentVersion
The query returns the ID of each ContentVersion record together with the ID of its related ContentDocument.
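Because a file can have several versions, the query above may return more than one ContentVersion row per ContentDocument. As a minimal sketch, assuming only the current version of each file is of interest, the standard IsLatest field can narrow the result. Title and VersionNumber are standard ContentVersion fields added here only for readability:

```soql
SELECT Id, ContentDocumentId, Title, VersionNumber
FROM ContentVersion
WHERE IsLatest = true
```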
This article outlines how to archive ContentDocument records using filter criteria from the parent object, for example archiving files that are related to Opportunities whose stage is ‘Closed Lost’. This is a business use case where the organization wants to archive only the files, while the Opportunity record itself needs to be kept in the environment so that duplicate records can be identified in the future.
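To apply this filter, create a formula checkbox field on ContentVersion that evaluates the stage of the related Opportunity. In the formula below, ContentDocLookupOppty is a placeholder for the name of the custom lookup to Opportunity:
ISPICKVAL(ContentDocLookupOppty__r.StageName, 'Closed Lost')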
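The following query returns the ContentDocument records to archive, where Formula_checkbox_API_Name is a placeholder for the API name of the formula checkbox field on ContentVersion:
SELECT Id FROM ContentDocument WHERE Id IN (SELECT ContentDocumentId FROM ContentVersion WHERE Formula_checkbox_API_Name = TRUE)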
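As a cross-check that needs no custom field, a similar selection can be approximated with standard objects only. This is a sketch rather than the method described in this article: it assumes the files are linked to their Opportunity through ContentDocumentLink instead of the custom lookup, so its results may differ from the formula-based query.

```soql
SELECT ContentDocumentId, ContentDocument.Title, LinkedEntityId
FROM ContentDocumentLink
WHERE LinkedEntityId IN (SELECT Id FROM Opportunity WHERE StageName = 'Closed Lost')
```

Comparing the ContentDocumentId values from both queries before archiving can help confirm that the formula field selects the intended files.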