Scaling AEM using AWS-S3 with TAR-MK
Prerequisites
- AEM 6.5
- Latest Service pack (I am using 6.5.8 here).
- S3 connector (Feature pack version 1.10.10)
- Amazon S3 bucket.
Required feature pack
com.adobe.granite.oak.s3connector-1.10.10.zip
Note: The S3 connector feature pack depends on the service pack level, because it is built against a specific Oak version. AEM must run an Oak version equal to or higher than the one the S3 connector uses. Check the service pack's POM to see which Oak version is in use, for example:
<oak.version>1.22.2</oak.version>
Run modes
AEM must be started with the crx3tar-nofds run mode if the S3 data store is to be configured with TarMK.
java -jar <aem-jar-file>.jar -r crx3tar-nofds
AEM has two stores:
- Node Store
Content nodes are stored in a node store.
The Segment Node Store is used for TarMK and the Document Node Store for MongoMK.
The segment node store is the basis of Adobe’s TarMK implementation in AEM6.
PID : org.apache.jackrabbit.oak.segment.SegmentNodeStoreService
Segment Node Configuration Options
Options | Description | Default value |
repository.home | Path to repository home under which repository-related data is stored. | crx-quickstart/segmentstore |
tarmk.size | Maximum size of a segment in MB. | 256 MB |
customBlobStore | Boolean value indicating that a custom data store is used. | true for AEM 6.3 and later versions; false prior to AEM 6.3 |
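As a minimal sketch of how one of these options can be set, the snippet below writes a SegmentNodeStoreService configuration file that enables customBlobStore. It assumes it is run from the AEM install folder and that the file is picked up from crx-quickstart/install; the AEM_HOME and INSTALL_DIR variables are hypothetical helper names, not part of AEM.

```shell
#!/bin/sh
# Assumption: AEM_HOME points at the AEM install folder (defaults to the current directory).
AEM_HOME="${AEM_HOME:-$(pwd)}"
INSTALL_DIR="$AEM_HOME/crx-quickstart/install"
mkdir -p "$INSTALL_DIR"

# The file name is the PID of the segment node store plus the .config extension.
# B"true" is the OSGi .config notation for a Boolean value.
cat > "$INSTALL_DIR/org.apache.jackrabbit.oak.segment.SegmentNodeStoreService.config" <<'EOF'
customBlobStore=B"true"
EOF
```

Since the table lists true as the default from AEM 6.3 onward, setting the option explicitly is mainly useful as documentation of intent.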
- Data Store
The binary data is stored in a data store.
Data Store Configuration : Amazon S3 Data Store
PID : org.apache.jackrabbit.oak.plugins.blob.datastore.S3DataStore (configuration file: org.apache.jackrabbit.oak.plugins.blob.datastore.S3DataStore.config)
Steps to configure Amazon S3 as the data store:
- Extract the contents of the feature pack zip file to a temporary folder.
- Go to the temporary folder and navigate to the following location: jcr_root/libs/system/install
- Copy all the contents from the above location to <aem-install>/crx-quickstart/install
- Copy the org.apache.jackrabbit.oak.plugins.blob.datastore.S3DataStore.config file from <feature pack>/jcr_root/libs/system/config to <aem-install>/crx-quickstart/install
- Edit the file and add the configuration options required by your setup.
- Start AEM.
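The file-copy steps above can be sketched as a small shell script. This is only an illustration under stated assumptions: unzip is installed, AEM is stopped, and the script is run from the AEM install folder with the feature pack zip in the same directory. The names FEATURE_PACK, AEM_HOME and install_s3_connector are hypothetical and not part of AEM or the feature pack.

```shell
#!/bin/sh
set -eu

# Assumptions: zip file name from the "Required feature pack" section,
# AEM install folder defaults to the current directory.
FEATURE_PACK="${FEATURE_PACK:-com.adobe.granite.oak.s3connector-1.10.10.zip}"
AEM_HOME="${AEM_HOME:-$(pwd)}"

# Copies the connector files and the sample config from an extracted
# feature pack ($1) into <aem-install>/crx-quickstart/install ($2).
install_s3_connector() {
  extracted="$1"
  install_dir="$2/crx-quickstart/install"
  mkdir -p "$install_dir"
  # Everything under jcr_root/libs/system/install, including subfolders
  cp -R "$extracted/jcr_root/libs/system/install/." "$install_dir/"
  # The S3 data store configuration file
  cp "$extracted/jcr_root/libs/system/config/org.apache.jackrabbit.oak.plugins.blob.datastore.S3DataStore.config" "$install_dir/"
}

if [ -f "$FEATURE_PACK" ]; then
  tmp_dir="$(mktemp -d)"
  unzip -q "$FEATURE_PACK" -d "$tmp_dir"   # extract to a temporary folder
  install_s3_connector "$tmp_dir" "$AEM_HOME"
  rm -rf "$tmp_dir"
else
  echo "Feature pack not found: $FEATURE_PACK" >&2
fi
```

After the script finishes, the configuration file still has to be edited with the options for your setup before AEM is started.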
AWS S3 connector configuration options
Options | Description | Default value |
accessKey | The AWS access key | |
secretKey | The AWS secret access key. Note: Alternatively, IAM roles can be used for authentication. If you are using IAM roles you no longer need to specify the accessKey and secretKey. | |
s3Bucket | The bucket name. | |
s3Region | The bucket region. | |
path | The path of the data store. | <AEM install folder>/repository/datastore |
minRecordLength | The minimum size of an object that should be stored in the data store. | 16KB |
maxCachedBinarySize | Binaries with size less than or equal to this size will be stored in the memory cache. The size is in bytes. | 17408 |
cacheSize | The size of the cache. The value is specified in bytes. | 64GB |
secret | Only to be used if using binaryless replication for shared datastore setup. | |
stagingSplitPercentage | The percentage of cache size configured to be used for staging asynchronous uploads. | 10 |
uploadThreads | The number of upload threads that are used for asynchronous uploads. | 10 |
stagingPurgeInterval | The interval in seconds for purging finished uploads from the staging cache. | 300 seconds |
stagingRetryInterval | The retry interval in seconds for failed uploads. | 600 seconds |
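To show how these options fit together, here is a hedged sketch that writes a minimal S3 data store configuration file. The angle-bracket values are placeholders to replace with your own, the numeric values simply repeat the defaults from the table (assuming 16KB means 16384 bytes), and AEM_HOME, INSTALL_DIR and CONFIG are hypothetical helper variables. Note that the script overwrites any existing file of the same name, so editing the file copied from the feature pack by hand is an equally valid route.

```shell
#!/bin/sh
# Assumption: AEM_HOME points at the AEM install folder (defaults to the current directory).
AEM_HOME="${AEM_HOME:-$(pwd)}"
INSTALL_DIR="$AEM_HOME/crx-quickstart/install"
CONFIG="$INSTALL_DIR/org.apache.jackrabbit.oak.plugins.blob.datastore.S3DataStore.config"
mkdir -p "$INSTALL_DIR"

# Warning: this replaces an existing S3DataStore.config in the install folder.
# accessKey and secretKey can be left out when IAM roles are used.
cat > "$CONFIG" <<'EOF'
accessKey="<your-access-key>"
secretKey="<your-secret-key>"
s3Bucket="<your-bucket-name>"
s3Region="<your-bucket-region>"
minRecordLength="16384"
maxCachedBinarySize="17408"
uploadThreads="10"
EOF
```

Options that are not listed in the file keep their default values.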
Observation
All binaries are stored in S3, except those smaller than minRecordLength, which are kept inline in the node store.
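The threshold rule can be illustrated with a small shell helper that compares a local file's size against minRecordLength. It is only a sketch of the rule as stated above, not how Oak implements it: binary_location and MIN_RECORD_LENGTH are hypothetical names, and the 16384-byte value assumes the 16KB default counts 1024 bytes per KB.

```shell
#!/bin/sh
# Threshold in bytes; 16384 is assumed to match the 16KB default of minRecordLength.
MIN_RECORD_LENGTH="${MIN_RECORD_LENGTH:-16384}"

# Prints "s3" if a file of this size would go to the S3 data store,
# or "node-store" if it is smaller than the threshold and stays inline.
binary_location() {
  size=$(($(wc -c < "$1")))
  if [ "$size" -lt "$MIN_RECORD_LENGTH" ]; then
    echo "node-store"
  else
    echo "s3"
  fi
}
```

For example, `binary_location logo.png` reports where an asset of that size would be expected to land, which helps explain why small files do not show up in the bucket.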
Reference
- https://experienceleague.adobe.com/docs/experience-manager-65/deploying/deploying/data-store-config.html?lang=en