improved rbac-performance-analysis.md documentation and some cleanup

Michael Hoennig 2024-08-01 09:48:13 +02:00
parent 9a05cad38e
commit c1a49d198f
3 changed files with 53 additions and 8 deletions

View File: rbac-performance-analysis.md

@@ -1,13 +1,16 @@
# RBAC Performance Analysis

This document describes the analysis of the legacy data import, which took way too long; the cause turned out to be a problem in the RBAC access-rights check as well as `EntityManager.persist` creating far too many SQL queries.

## Our Performance Problem

During the legacy data import for hosting assets we noticed massive performance problems. The import of about 2200 hosting-assets (IP-numbers, managed webspaces, managed and cloud servers), including the creation of the related booking-items and booking-projects as well as the necessary office-data entities (persons, contacts, partners, debitors, relations), **took 25 minutes**.

Importing hosting assets up to UnixUsers and EmailAddresses even **took about 100 minutes**.

(The office-data import sometimes, but rarely, took only 10 minutes. We could not find a pattern for why that was the case. The impression that it had to do with too many other processes running in parallel, e.g. a browser with BBB or IntelliJ IDEA, was proven wrong by stopping all unnecessary processes and running the import again.)

## Preparation
@@ -308,10 +311,48 @@ We changed these mappings from `EAGER` (default) to `LAZY` to `@ManyToOne(fetch

Now, finally, the total runtime of the import was down to 12 minutes. This is repeatable, whereas originally the import took about 25 minutes in most cases and only rarely, for unknown reasons, 10 minutes.
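For context, the kind of mapping change described above could look roughly like the following sketch; the entity, field and column names are illustrative assumptions (assuming Jakarta Persistence imports), not the actual hsadminNG code.

```java
import jakarta.persistence.Entity;
import jakarta.persistence.FetchType;
import jakarta.persistence.Id;
import jakarta.persistence.JoinColumn;
import jakarta.persistence.ManyToOne;
import jakarta.persistence.Table;
import java.util.UUID;

// Illustrative sketch only: a simplified asset entity whose parent reference
// is switched from the implicit EAGER default to LAZY fetching.
@Entity
@Table(name = "example_asset")          // made-up table name
public class ExampleAssetEntity {

    @Id
    private UUID uuid;

    // @ManyToOne defaults to FetchType.EAGER, so every SELECT of this entity
    // also loaded its parent (and, recursively, the parent's own references).
    // With FetchType.LAZY the parent row is only fetched when actually accessed.
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "parent_uuid")    // made-up column name
    private ExampleAssetEntity parentAsset;
}
```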
### Importing UnixUser and EmailAlias Assets
But once UnixUser and EmailAlias assets were added to the import, the total time went up to about 110 minutes.

This was not acceptable, especially considering that domains, email-addresses and database-assets amount to almost 10 times that number of rows, which would push the import to over 1100 minutes, almost 20 hours.

In a first step, an `HsHostingAssetRawEntity` was created, mapped to the raw table (`hs_hosting_asset`) instead of to the RBAC view (`hs_hosting_asset_rv`). Unfortunately, we did not keep measurements, but that was only part of the problem anyway.
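Such a raw entity could look roughly like the sketch below; the fields, annotations and class name shown here are simplified assumptions for illustration, not the real implementation.

```java
import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.Id;
import jakarta.persistence.Table;
import jakarta.persistence.Version;
import java.util.UUID;

// Sketch only: an entity bound directly to the raw table instead of the
// RBAC view, so reads and writes bypass the expensive RBAC view layer.
// Field and column names are illustrative, not the real mapping.
@Entity
@Table(name = "hs_hosting_asset")   // raw table, not "hs_hosting_asset_rv"
public class HostingAssetRawSketchEntity {

    @Id
    private UUID uuid;

    @Version
    private int version;            // optimistic locking, as mentioned below

    @Column(name = "identifier")
    private String identifier;

    @Column(name = "caption")
    private String caption;
}
```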
The main problem was that something strange happens when persisting (`EntityManager.persist`) EmailAlias assets. While importing UnixUsers was mostly slow due to RBAC SELECT-permission checks, persisting EmailAliases suddenly created about a million (1,000,000) SQL UPDATE statements after the INSERT, all with the same data and just an increased version number (used for optimistic locking). We were not able to figure out why this happened.

Keep in mind that it is the same table with the same RBAC triggers, just with a different value in the type column.

Once `EntityManager.persist` was replaced by an explicit SQL INSERT, just for `HsHostingAssetRawEntity`, the total time was down to 17 minutes. Thus importing the UnixUsers and EmailAliases took just 5 minutes, which is an acceptable result. The total import of all hosting assets is now estimated at about 1 hour (on my developer laptop).
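The approach is sketched below as a minimal native-query version; the column list is shortened and the helper method is made up for illustration, while the real code lives in `CsvDataImport.persistViaSql` (see the diff further down).

```java
import jakarta.persistence.EntityManager;
import java.util.UUID;

// Sketch: replacing EntityManager.persist with one explicit native INSERT,
// so neither the RBAC view nor Hibernate's flush/dirty-checking is involved.
// The column list is shortened and the parameters are illustrative.
public class DirectInsertSketch {

    static void insertHostingAsset(final EntityManager em, final UUID uuid, final String type,
                                   final String identifier, final String caption, final int version) {
        em.createNativeQuery("""
                INSERT INTO hs_hosting_asset (uuid, type, identifier, caption, version)
                    VALUES (:uuid, :type, :identifier, :caption, :version)
                """)
                .setParameter("uuid", uuid)
                .setParameter("type", type)
                .setParameter("identifier", identifier)
                .setParameter("caption", caption)
                .setParameter("version", version)
                .executeUpdate();
    }
}
```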
## Further Options To Explore

1. Instead of separate SQL INSERT statements, we could try a bulk INSERT (see the sketch after this list).
2. We could use the SQL INSERT method for all entity classes, or at least for those with high row counts.
3. For the production code, we could use raw entities for referenced entities, because for those the RBAC SELECT permission is usually given anyway.
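A bulk INSERT via JDBC batching could look roughly like this; the connection handling, the row record and the column list are assumptions made for the sake of a self-contained example.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;
import java.util.UUID;

// Sketch of option 1: batching many INSERTs into one JDBC batch instead of
// issuing a separate statement per row. Table and column names are illustrative.
public class BulkInsertSketch {

    record AssetRow(UUID uuid, String type, String identifier, String caption, int version) {}

    static void insertBatch(final Connection connection, final List<AssetRow> rows) throws Exception {
        final var sql = "INSERT INTO hs_hosting_asset (uuid, type, identifier, caption, version)"
                + " VALUES (?, ?, ?, ?, ?)";
        try (PreparedStatement stmt = connection.prepareStatement(sql)) {
            for (final var row : rows) {
                stmt.setObject(1, row.uuid());
                stmt.setString(2, row.type());
                stmt.setString(3, row.identifier());
                stmt.setString(4, row.caption());
                stmt.setInt(5, row.version());
                stmt.addBatch();        // collect the rows ...
            }
            stmt.executeBatch();        // ... and send them in one round-trip
        }
    }
}
```

With the PostgreSQL JDBC driver, setting `reWriteBatchedInserts=true` additionally rewrites such batches into multi-row INSERT statements, which usually reduces the overhead further.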
## Summary

### What Did We Achieve?
In a first step, the total import runtime for office entities was reduced from about 25min to about 10min.
In a second step, we reduced the import of booking- and hosting-assets from about 100min (not counting the required office entities) to 5min.
### What Helped?
Merging the recursive CTE query that determines the RBAC SELECT permission made it clearer which business queries take up the time.

Avoiding EAGER loading where it is not necessary reduced the total runtime of the import to about half.
The major improvement came from using direct INSERT statements, which then also bypassed the RBAC SELECT permission checks.
### What Still Has To Be Done?
While this performance analysis mostly helped the performance of the legacy data import, we still need to measure and improve the production code.

Using more LAZY loading will certainly also help in the production code. For some more ideas, see the section _Further Options To Explore_.

View File: HsHostingAsset.java

@@ -11,6 +11,7 @@ import net.hostsharing.hsadminng.stringify.Stringifyable;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import static java.util.Collections.emptyMap;
import static net.hostsharing.hsadminng.stringify.Stringify.stringify;
@@ -27,6 +28,8 @@ public interface HsHostingAsset extends Stringifyable, RbacObject<HsHostingAsset
.withProp(HsHostingAsset::getConfig)
.quotedValues(false);
void setUuid(UUID uuid);
HsHostingAssetType getType();
HsHostingAsset getParentAsset();
void setIdentifier(String s);

View File: CsvDataImport.java

@@ -142,7 +142,7 @@ public class CsvDataImport extends ContextBasedTest {
public <T extends RbacObject> T persist(final Integer id, final T entity) {
try {
if (entity instanceof HsHostingAsset ha) {
//noinspection unchecked
return (T) persistViaSql(id, ha);
}
@@ -164,7 +164,7 @@
}
@SneakyThrows
public RbacObject<HsHostingAsset> persistViaSql(final Integer id, final HsHostingAsset entity) {
if (entity.getUuid() == null) {
entity.setUuid(UUID.randomUUID());
}
@@ -203,8 +203,9 @@
.setParameter("caption", entity.getCaption())
.setParameter("config", entity.getConfig().toString())
.setParameter("version", entity.getVersion());
final var count = query.executeUpdate();
logError(() -> {
assertThat(count).isEqualTo(1);
});
return entity;