From aede5bc8d6b907f27eef1200db62773ebc7be8fe Mon Sep 17 00:00:00 2001 From: Michael Hoennig Date: Thu, 22 Aug 2024 12:40:41 +0200 Subject: [PATCH 1/6] baseline document potential rbac optimizations --- doc/rbac-performance-analysis.md | 52 ++++- ...e-cte-experiments-for-accessible-uuids.sql | 182 ++++++++++++------ 2 files changed, 170 insertions(+), 64 deletions(-) diff --git a/doc/rbac-performance-analysis.md b/doc/rbac-performance-analysis.md index 504d1639..dcc96621 100644 --- a/doc/rbac-performance-analysis.md +++ b/doc/rbac-performance-analysis.md @@ -15,7 +15,7 @@ We could not find a pattern, why that was the case. The impression that it had t ## Preparation -### Configuring PostgreSQL +### Configuring PostgreSQL The pg_stat_statements PostgreSQL-Extension can be used to measure how long queries take and how often they are called. @@ -355,6 +355,56 @@ In production, the `SELECT ... FROM hs_office_relation_rv` (No. 2) with about 0. 3. For the production code, we could use raw-entities for referenced entities, here usually RBAC SELECT permission is given anyway. +## The Problematically Huge Join + +The origin problem was the expensive RBAC check for many SELECT queries. +This consists of two parts: + +1. The recursive CTE query to determine which object's UUIDs are visible for the current subject. + This query itself takes currently about 250ms thus is no problem by itself as long as we only need it once per request. +2. Joining the result from 1. with the result if a business query. + The performance of the business query itself is no problem, for the join see the following explanations. + +Superusers can see all objects (currently already over 90.000) +and even high level roles of customers with many hosting assets can see several thousand objects. +This is the one side of that problematic join. + +The other side of that problematic is the result of the business query. +For example if a user wants to select all of their e-mail-addresses, that might easily half of the visible objects. + +Thus, we would have a join of for example 5.000 x 2.500 rows, which is going to be slow. +As there are currently about 84.000 objects are hosting assets and 33.000 e-mail-addresses in our system, +for a superuser we would even run into an 84.0000 x 33.0000 join. + +We found some solution approaches: + +1. Getting rid of the `rbacrole` and `rbacpermission` table and only having implicit roles with implicit grants (OWNER->ADMIN->AGENT->TENENT->REFERRER) by comparison of ordered enum values and fixed permission assignments (e.g. OWENER->DELETE, ADMIN->UPDATE etc.). We could also get rid of the table `rbacreferece` if we enter users as business objects. + + This should reduce + + +### Adding The Object Type To The Table `rbacObject` + +This optimization idea came from Michael Hierweck. +Its idea is to reduce the size of the result of the recursive CTE query and maybe even speed up that query itself. + +To evaluate this, I added a type column to the `rbacObject` table, initially as an enum hsHostingAssetType. Then I entered the type there for all rows from hs_hosting_asset. This means that 83,886 of 92,545 rows in `rbacobject` have a type set, leaving 8,659 without. + +If we do this for other types (we currently have 1,271 relations and 927 booking items), it gets more complicated because they are different enum types. As varchar(16), we could lose performance again due to the higher storage space requirements. + +But the performance gained is not particularly high anyway. +See the average seconds per recursive CTE select as role 'hs_hosting_asset:defaultproject:ADMIN', +joined with business query for all `'EMAIL_ADDRESSES'`: + +| | D-1000000-hsh | D-1000300-mih | +|-----------------------------------------------------|------------------|---------------| +| currently (without type comparision in rbacobject): | ~3.30 - ~3.49 | ~0.23 | +| optimized (with type comparision in rbacobject): | ~2.99 - ~3.08 | ~0.21 | + +As you can see, the query is no problem at all for normal customers (in the example, yours truly). With Hostsharing (D-1000000-hsh) it is quite slow. + +My optimization, which is not easy to try out, would only improve the part of the recursive CTE query. This is not necessary at the moment, but perhaps if we grow significantly. + ## Summary ### What we did Achieve? diff --git a/sql/recursive-cte-experiments-for-accessible-uuids.sql b/sql/recursive-cte-experiments-for-accessible-uuids.sql index f8795961..62f88bc2 100644 --- a/sql/recursive-cte-experiments-for-accessible-uuids.sql +++ b/sql/recursive-cte-experiments-for-accessible-uuids.sql @@ -1,5 +1,7 @@ -- just a permanent playground to explore optimization of the central recursive CTE query for RBAC +select * from hs_statistics_view; + rollback transaction; begin transaction; SET TRANSACTION READ ONLY; @@ -11,71 +13,99 @@ select count(type) as counter, type from hs_hosting_asset_rv order by counter desc; commit transaction; +-- ======================================================== + +DO language plpgsql $$ +DECLARE + start_time timestamp; + end_time timestamp; + total_time interval; + letter char(1); +BEGIN + start_time := clock_timestamp(); + + call defineContext('performance testing', null, 'superuser-alex@hostsharing.net', + 'hs_booking_project#D-1000000-hshdefaultproject:ADMIN'); +-- 'hs_booking_project#D-1000300-mihdefaultproject:ADMIN'); + SET TRANSACTION READ ONLY; + + FOR i IN 0..25 LOOP + letter := chr(i+ascii('a')); + + perform count(*) from ( + with accessible_hs_hosting_asset_uuids as ( + + with recursive + recursive_grants as + (select distinct rbacgrants.descendantuuid, + rbacgrants.ascendantuuid, + 1 as level, + true + from rbacgrants + where rbacgrants.assumed + and (rbacgrants.ascendantuuid = any (currentsubjectsuuids())) + union all + select distinct g.descendantuuid, + g.ascendantuuid, + grants.level + 1 as level, + assertTrue(grants.level < 22, 'too many grant-levels: ' || grants.level) + from rbacgrants g + join recursive_grants grants on grants.descendantuuid = g.ascendantuuid + where g.assumed), + grant_count as (select count(*) as grant_count + from recursive_grants), + count_check as (select assertTrue((select grant_count from grant_count) < 600000, + 'too many grants for current subjects: ' || + (select grant_count from grant_count)) + as valid) + select distinct perm.objectuuid + from recursive_grants + join rbacpermission perm on recursive_grants.descendantuuid = perm.uuid + join rbacobject obj on obj.uuid = perm.objectuuid + join count_check cc on cc.valid + where obj.objecttable::text = 'hs_hosting_asset'::text + and obj.type = 'EMAIL_ADDRESS'::hshostingassettype + ) + select type, + -- count(*) as counter + target.uuid, + -- target.version, + -- target.bookingitemuuid, + -- target.type, + -- target.parentassetuuid, + -- target.assignedtoassetuuid, + target.identifier, + target.caption + -- target.config, + -- target.alarmcontactuuid + from hs_hosting_asset target + where (target.uuid in (select accessible_hs_hosting_asset_uuids.objectuuid + from accessible_hs_hosting_asset_uuids)) + and target.type = 'EMAIL_ADDRESS' + and identifier like letter || '%' + -- and target.type in ('EMAIL_ADDRESS', 'CLOUD_SERVER', 'MANAGED_SERVER', 'MANAGED_WEBSPACE') + -- order by target.identifier; + -- group by type + -- order by counter desc +) timed; +END LOOP; + +end_time := clock_timestamp(); +total_time := end_time - start_time; + +RAISE NOTICE 'average execution time: %', total_time/26; + +END; +$$; + +-- average seconds per recursive CTE select as role 'hs_hosting_asset:defaultproject:ADMIN' +-- joined with business query for all 'EMAIL_ADDRESSES': +-- D-1000000-hsh D-1000300-mih +-- - without type comparision in rbacobject: ~3.30 - ~3.49 ~0.23 +-- - with type comparision in rbacobject: ~2.99 - ~3.08 ~0.21 - -rollback transaction; -begin transaction; -SET TRANSACTION READ ONLY; -call defineContext('performance testing', null, 'superuser-alex@hostsharing.net', - 'hs_booking_project#D-1000000-hshdefaultproject:ADMIN'); --- 'hs_booking_project#D-1000300-mihdefaultproject:ADMIN'); - -with accessible_hs_hosting_asset_uuids as - (with recursive - recursive_grants as - (select distinct rbacgrants.descendantuuid, - rbacgrants.ascendantuuid, - 1 as level, - true - from rbacgrants - where rbacgrants.assumed - and (rbacgrants.ascendantuuid = any (currentsubjectsuuids())) - union all - select distinct g.descendantuuid, - g.ascendantuuid, - grants.level + 1 as level, - assertTrue(grants.level < 22, 'too many grant-levels: ' || grants.level) - from rbacgrants g - join recursive_grants grants on grants.descendantuuid = g.ascendantuuid - where g.assumed), - grant_count AS ( - SELECT COUNT(*) AS grant_count FROM recursive_grants - ), - count_check as (select assertTrue((select count(*) as grant_count from recursive_grants) < 300000, - 'too many grants for current subjects: ' || (select count(*) as grant_count from recursive_grants)) - as valid) - select distinct perm.objectuuid - from recursive_grants - join rbacpermission perm on recursive_grants.descendantuuid = perm.uuid - join rbacobject obj on obj.uuid = perm.objectuuid - join count_check cc on cc.valid - where obj.objecttable::text = 'hs_hosting_asset'::text) -select type, --- count(*) as counter - target.uuid, --- target.version, --- target.bookingitemuuid, --- target.type, --- target.parentassetuuid, --- target.assignedtoassetuuid, - target.identifier, - target.caption --- target.config, --- target.alarmcontactuuid - from hs_hosting_asset target - where (target.uuid in (select accessible_hs_hosting_asset_uuids.objectuuid - from accessible_hs_hosting_asset_uuids)) - and target.type in ('EMAIL_ADDRESS', 'CLOUD_SERVER', 'MANAGED_SERVER', 'MANAGED_WEBSPACE') --- and target.type = 'EMAIL_ADDRESS' --- order by target.identifier; --- group by type --- order by counter desc -; -commit transaction; - - - +-- ============================================================================= rollback transaction; begin transaction; @@ -140,3 +170,29 @@ with grants as ( ) select * from rbacgrants_ev gev where exists ( select uuid from grants where gev.uuid = grants.uuid ); + + +alter table rbacobject + -- just for performance testing, we would need a joined enum or a varchar(16) which would make it slow + add column type hshostingassettype; + +rollback transaction; +begin transaction; +call defineContext('setting rbacobject.type from hs_hosting_asset.type', null, 'superuser-alex@hostsharing.net'); + + UPDATE rbacobject + SET type = hs.type + FROM hs_hosting_asset hs + WHERE rbacobject.uuid = hs.uuid; + +end transaction; + +select + (select count(*) from hs_office_relation) as "relation", + (select count(*) from hs_booking_item) as "booking item"; + +select + (select count(*) as "total" from rbacobject), + (select count(*) as "not null" from rbacobject where type is not null), + (select count(*) as "null" from rbacobject where type is null); + -- 2.39.5 From 6dd20db238b471083491fe3c080e69811d6a8026 Mon Sep 17 00:00:00 2001 From: Michael Hoennig Date: Thu, 22 Aug 2024 14:31:15 +0200 Subject: [PATCH 2/6] improved formatting --- ...e-cte-experiments-for-accessible-uuids.sql | 50 +++++++++---------- 1 file changed, 23 insertions(+), 27 deletions(-) diff --git a/sql/recursive-cte-experiments-for-accessible-uuids.sql b/sql/recursive-cte-experiments-for-accessible-uuids.sql index 62f88bc2..296fa71a 100644 --- a/sql/recursive-cte-experiments-for-accessible-uuids.sql +++ b/sql/recursive-cte-experiments-for-accessible-uuids.sql @@ -29,21 +29,23 @@ BEGIN -- 'hs_booking_project#D-1000300-mihdefaultproject:ADMIN'); SET TRANSACTION READ ONLY; - FOR i IN 0..25 LOOP + FOR i IN 0..0 LOOP letter := chr(i+ascii('a')); perform count(*) from ( + + -- start of VIEW hs_hosting_asset_rv: with accessible_hs_hosting_asset_uuids as ( with recursive - recursive_grants as - (select distinct rbacgrants.descendantuuid, + recursive_grants as ( + select distinct rbacgrants.descendantuuid, rbacgrants.ascendantuuid, 1 as level, true from rbacgrants - where rbacgrants.assumed - and (rbacgrants.ascendantuuid = any (currentsubjectsuuids())) + where (rbacgrants.ascendantuuid = any (currentsubjectsuuids())) + --and rbacgrants.assumed union all select distinct g.descendantuuid, g.ascendantuuid, @@ -51,44 +53,38 @@ BEGIN assertTrue(grants.level < 22, 'too many grant-levels: ' || grants.level) from rbacgrants g join recursive_grants grants on grants.descendantuuid = g.ascendantuuid - where g.assumed), - grant_count as (select count(*) as grant_count - from recursive_grants), - count_check as (select assertTrue((select grant_count from grant_count) < 600000, - 'too many grants for current subjects: ' || - (select grant_count from grant_count)) - as valid) + where g.assumed + ), + grant_count as ( + select count(*) as grant_count from recursive_grants + ), + count_check as ( + select assertTrue((select grant_count from grant_count) < 600000, + 'too many grants for current subjects: ' || (select grant_count from grant_count)) as valid + ) select distinct perm.objectuuid from recursive_grants join rbacpermission perm on recursive_grants.descendantuuid = perm.uuid join rbacobject obj on obj.uuid = perm.objectuuid join count_check cc on cc.valid where obj.objecttable::text = 'hs_hosting_asset'::text - and obj.type = 'EMAIL_ADDRESS'::hshostingassettype + and obj.type = 'EMAIL_ADDRESS'::hshostingassettype -- with/without this type condition ) + -- end of VIEW hs_hosting_asset_rv. + + -- start of business query, usually based on a view according to the above CTE query: select type, - -- count(*) as counter target.uuid, - -- target.version, - -- target.bookingitemuuid, - -- target.type, - -- target.parentassetuuid, - -- target.assignedtoassetuuid, target.identifier, target.caption - -- target.config, - -- target.alarmcontactuuid from hs_hosting_asset target where (target.uuid in (select accessible_hs_hosting_asset_uuids.objectuuid from accessible_hs_hosting_asset_uuids)) and target.type = 'EMAIL_ADDRESS' and identifier like letter || '%' - -- and target.type in ('EMAIL_ADDRESS', 'CLOUD_SERVER', 'MANAGED_SERVER', 'MANAGED_WEBSPACE') - -- order by target.identifier; - -- group by type - -- order by counter desc -) timed; -END LOOP; + -- end of business query. + ) timed; + END LOOP; end_time := clock_timestamp(); total_time := end_time - start_time; -- 2.39.5 From 897ab191c70596c2d6a27315f48472d3f5a22b74 Mon Sep 17 00:00:00 2001 From: Michael Hoennig Date: Thu, 22 Aug 2024 17:39:54 +0200 Subject: [PATCH 3/6] better formatted SQL and documentation of performance analysis for SELECT --- doc/rbac-performance-analysis.md | 20 +- ...e-cte-experiments-for-accessible-uuids.sql | 198 ++++++------------ 2 files changed, 80 insertions(+), 138 deletions(-) diff --git a/doc/rbac-performance-analysis.md b/doc/rbac-performance-analysis.md index dcc96621..fcd9781a 100644 --- a/doc/rbac-performance-analysis.md +++ b/doc/rbac-performance-analysis.md @@ -380,13 +380,21 @@ We found some solution approaches: 1. Getting rid of the `rbacrole` and `rbacpermission` table and only having implicit roles with implicit grants (OWNER->ADMIN->AGENT->TENENT->REFERRER) by comparison of ordered enum values and fixed permission assignments (e.g. OWENER->DELETE, ADMIN->UPDATE etc.). We could also get rid of the table `rbacreferece` if we enter users as business objects. - This should reduce + This should dramatically reduce the size of the table `rbackgrant` as well as the recusion levels. + + But since we only apply this query once for each business query, that would only improve performance once we have way more objects in our system, but does not help our current problem. + + It's quite some effort to implement even just a prototype, so we did not further explore this idea. + +2. Adding the object type to the table `rbacObject` to reduce the size of the result of the recursive CTE query. + + See chapter below. ### Adding The Object Type To The Table `rbacObject` -This optimization idea came from Michael Hierweck. -Its idea is to reduce the size of the result of the recursive CTE query and maybe even speed up that query itself. +This optimization idea came from Michael Hierweck and was promising. +The idea is to reduce the size of the result of the recursive CTE query and maybe even speed up that query itself. To evaluate this, I added a type column to the `rbacObject` table, initially as an enum hsHostingAssetType. Then I entered the type there for all rows from hs_hosting_asset. This means that 83,886 of 92,545 rows in `rbacobject` have a type set, leaving 8,659 without. @@ -403,7 +411,11 @@ joined with business query for all `'EMAIL_ADDRESSES'`: As you can see, the query is no problem at all for normal customers (in the example, yours truly). With Hostsharing (D-1000000-hsh) it is quite slow. -My optimization, which is not easy to try out, would only improve the part of the recursive CTE query. This is not necessary at the moment, but perhaps if we grow significantly. +Luckily this experiment also shows that it's not a big problem, having all hosting assets in the same database table. + +Implementing this approach would be a bit difficult anyway, because we would need to transfer the type query parameter into the definition of the restricted view. We have not even the slightest idea how this could be done. + +See the related queries in [recursive-cte-experiments-for-accessible-uuids.sql](../sql/recursive-cte-experiments-for-accessible-uuids.sql). They might have changed independently since this document was written, but you can still check out the old version from git. ## Summary diff --git a/sql/recursive-cte-experiments-for-accessible-uuids.sql b/sql/recursive-cte-experiments-for-accessible-uuids.sql index 296fa71a..7fa5ab32 100644 --- a/sql/recursive-cte-experiments-for-accessible-uuids.sql +++ b/sql/recursive-cte-experiments-for-accessible-uuids.sql @@ -2,19 +2,55 @@ select * from hs_statistics_view; -rollback transaction; -begin transaction; -SET TRANSACTION READ ONLY; -call defineContext('performance testing', null, 'superuser-alex@hostsharing.net', - 'hs_booking_project#D-1000000-hshdefaultproject:ADMIN'); --- 'hs_booking_project#D-1000300-mihdefaultproject:ADMIN'); -select count(type) as counter, type from hs_hosting_asset_rv - group by type - order by counter desc; -commit transaction; - -- ======================================================== +-- An example for a restricted view (_rv) as generated by our RBAC system: +drop view if exists hs_hosting_asset_example_rv; +create view hs_hosting_asset_example_rv as + with accessible_hs_hosting_asset_uuids as ( + + with recursive + recursive_grants as ( + select distinct rbacgrants.descendantuuid, + rbacgrants.ascendantuuid, + 1 as level, + true + from rbacgrants + where (rbacgrants.ascendantuuid = any (currentsubjectsuuids())) + --and rbacgrants.assumed + union all + select distinct g.descendantuuid, + g.ascendantuuid, + grants.level + 1 as level, + assertTrue(grants.level < 22, 'too many grant-levels: ' || grants.level) + from rbacgrants g + join recursive_grants grants on grants.descendantuuid = g.ascendantuuid + where g.assumed + ), + grant_count as ( + select count(*) as grant_count from recursive_grants + ), + count_check as ( + select assertTrue((select grant_count from grant_count) < 600000, + 'too many grants for current subjects: ' || (select grant_count from grant_count)) as valid + ) + select distinct perm.objectuuid + from recursive_grants + join rbacpermission perm on recursive_grants.descendantuuid = perm.uuid + join rbacobject obj on obj.uuid = perm.objectuuid + join count_check cc on cc.valid + where obj.objecttable::text = 'hs_hosting_asset'::text + and obj.type = 'EMAIL_ADDRESS'::hshostingassettype -- with/without this type condition + ) + select target.* + from hs_hosting_asset target + where (target.uuid in (select accessible_hs_hosting_asset_uuids.objectuuid + from accessible_hs_hosting_asset_uuids)); +-- end of the example view. + +-- ------------------------------------------------------------------------------- + +rollback transaction; DO language plpgsql $$ DECLARE start_time timestamp; @@ -24,73 +60,30 @@ DECLARE BEGIN start_time := clock_timestamp(); - call defineContext('performance testing', null, 'superuser-alex@hostsharing.net', + CALL defineContext('performance testing', null, 'superuser-alex@hostsharing.net', 'hs_booking_project#D-1000000-hshdefaultproject:ADMIN'); -- 'hs_booking_project#D-1000300-mihdefaultproject:ADMIN'); SET TRANSACTION READ ONLY; - FOR i IN 0..0 LOOP + FOR i IN 0..25 LOOP letter := chr(i+ascii('a')); + PERFORM count(*) from ( - perform count(*) from ( + -- An example for a business query based on the view: + select type, uuid, identifier, caption + from hs_hosting_asset_example_rv + where type = 'EMAIL_ADDRESS' + and identifier like letter || '%' + -- end of the business query example. - -- start of VIEW hs_hosting_asset_rv: - with accessible_hs_hosting_asset_uuids as ( + ) AS timed; - with recursive - recursive_grants as ( - select distinct rbacgrants.descendantuuid, - rbacgrants.ascendantuuid, - 1 as level, - true - from rbacgrants - where (rbacgrants.ascendantuuid = any (currentsubjectsuuids())) - --and rbacgrants.assumed - union all - select distinct g.descendantuuid, - g.ascendantuuid, - grants.level + 1 as level, - assertTrue(grants.level < 22, 'too many grant-levels: ' || grants.level) - from rbacgrants g - join recursive_grants grants on grants.descendantuuid = g.ascendantuuid - where g.assumed - ), - grant_count as ( - select count(*) as grant_count from recursive_grants - ), - count_check as ( - select assertTrue((select grant_count from grant_count) < 600000, - 'too many grants for current subjects: ' || (select grant_count from grant_count)) as valid - ) - select distinct perm.objectuuid - from recursive_grants - join rbacpermission perm on recursive_grants.descendantuuid = perm.uuid - join rbacobject obj on obj.uuid = perm.objectuuid - join count_check cc on cc.valid - where obj.objecttable::text = 'hs_hosting_asset'::text - and obj.type = 'EMAIL_ADDRESS'::hshostingassettype -- with/without this type condition - ) - -- end of VIEW hs_hosting_asset_rv. - - -- start of business query, usually based on a view according to the above CTE query: - select type, - target.uuid, - target.identifier, - target.caption - from hs_hosting_asset target - where (target.uuid in (select accessible_hs_hosting_asset_uuids.objectuuid - from accessible_hs_hosting_asset_uuids)) - and target.type = 'EMAIL_ADDRESS' - and identifier like letter || '%' - -- end of business query. - ) timed; END LOOP; -end_time := clock_timestamp(); -total_time := end_time - start_time; - -RAISE NOTICE 'average execution time: %', total_time/26; + end_time := clock_timestamp(); + total_time := end_time - start_time; + RAISE NOTICE 'average execution time: %', total_time/26; END; $$; @@ -103,75 +96,14 @@ $$; -- ============================================================================= -rollback transaction; -begin transaction; -SET TRANSACTION READ ONLY; -call defineContext('performance testing', null, 'superuser-alex@hostsharing.net', - 'hs_booking_project#D-1000000-hshdefaultproject:ADMIN'); --- 'hs_booking_project#D-1000300-mihdefaultproject:ADMIN'); - -with one_path as (with recursive path as ( - -- Base case: Start with the row where ascending equals the starting UUID - select ascendantuuid, - descendantuuid, - array [ascendantuuid] as path_so_far - from rbacgrants - where ascendantuuid = any (currentsubjectsuuids()) - - union all - - -- Recursive case: Find the next step in the path - select c.ascendantuuid, - c.descendantuuid, - p.path_so_far || c.ascendantuuid - from rbacgrants c - inner join - path p on c.ascendantuuid = p.descendantuuid - where c.ascendantuuid != all (p.path_so_far) -- Prevent cycles - ) - -- Final selection: Output all paths that reach the target UUID - select distinct array_length(path_so_far, 1), - path_so_far || descendantuuid as full_path - from path - join rbacpermission perm on perm.uuid = path.descendantuuid - join hs_hosting_asset ha on ha.uuid = perm.objectuuid - -- JOIN rbacrole_ev re on re.uuid = any(path_so_far) - where ha.identifier = 'vm1068' - order by array_length(path_so_far, 1) - limit 1 - ) -select - ( - SELECT ARRAY_AGG(re.roleidname ORDER BY ord.idx) - FROM UNNEST(one_path.full_path) WITH ORDINALITY AS ord(uuid, idx) - JOIN rbacrole_ev re ON ord.uuid = re.uuid - ) AS name_array - from one_path; -commit transaction; - -with grants as ( - select uuid - from rbacgrants - where descendantuuid in ( - select uuid - from rbacrole - where objectuuid in ( - select uuid - from hs_hosting_asset - -- where type = 'DOMAIN_MBOX_SETUP' - -- and identifier = 'example.org|MBOX' - where type = 'EMAIL_ADDRESS' - and identifier='test@example.org' - )) -) -select * from rbacgrants_ev gev where exists ( select uuid from grants where gev.uuid = grants.uuid ); - - +-- extending the rbacobject table: alter table rbacobject -- just for performance testing, we would need a joined enum or a varchar(16) which would make it slow add column type hshostingassettype; +-- and fill the type column with hs_hosting_asset types: + rollback transaction; begin transaction; call defineContext('setting rbacobject.type from hs_hosting_asset.type', null, 'superuser-alex@hostsharing.net'); @@ -183,9 +115,7 @@ call defineContext('setting rbacobject.type from hs_hosting_asset.type', null, ' end transaction; -select - (select count(*) from hs_office_relation) as "relation", - (select count(*) from hs_booking_item) as "booking item"; +-- check the result: select (select count(*) as "total" from rbacobject), -- 2.39.5 From 77676c921755a6815bbe734317463d7db31298e3 Mon Sep 17 00:00:00 2001 From: Michael Hoennig Date: Fri, 23 Aug 2024 05:43:12 +0200 Subject: [PATCH 4/6] document what did not help and cleanup --- doc/rbac-performance-analysis.md | 18 ++++++++++++++++-- ...ve-cte-experiments-for-accessible-uuids.sql | 2 +- 2 files changed, 17 insertions(+), 3 deletions(-) diff --git a/doc/rbac-performance-analysis.md b/doc/rbac-performance-analysis.md index fcd9781a..621d1248 100644 --- a/doc/rbac-performance-analysis.md +++ b/doc/rbac-performance-analysis.md @@ -417,6 +417,14 @@ Implementing this approach would be a bit difficult anyway, because we would nee See the related queries in [recursive-cte-experiments-for-accessible-uuids.sql](../sql/recursive-cte-experiments-for-accessible-uuids.sql). They might have changed independently since this document was written, but you can still check out the old version from git. +### Rearranging the Parts of the CTE-Query + +I also moved the function call which determines into its own WITH-section, with no improvement. + +Experimentally I moved the business condition into the CTE SELECT, also with no improvement. + +Such rearrangements seem to be sucessfully done by the PostgreSQL query optimizer. + ## Summary ### What we did Achieve? @@ -425,13 +433,19 @@ In a first step, the total import runtime for office entities was reduced from a In a second step, we reduced the import of booking- and hosting-assets from about 100min (not counting the required office entities) to 5min. -### What Helped? +### What did not Help? + +Rearranging the CTE query by extracting parts into WITH-clauses did not improve the performance. + +Surprisingly little performance gain (<10% improvement) came from reducing the result of the CTE query by moving the hosting asset type into RBAC-system and using it in the inner SELECT query instead of in the outer SELECT query of the application side. + +### What did Help? Merging the recursive CTE query to determine the RBAC SELECT-permission, made it more clear which business-queries take the time. Avoiding EAGER-loading where not necessary, reduced the total runtime of the import to about the half. -The major improvement came from using direct INSERT statements, which then also bypassed the RBAC SELECT permission checks. +The major improvement came from using direct INSERT statements, which avoided some SELECT statements unnecessarily generated by the EntityManager and also completely bypassed the RBAC SELECT permission checks. ### What Still Has To Be Done? diff --git a/sql/recursive-cte-experiments-for-accessible-uuids.sql b/sql/recursive-cte-experiments-for-accessible-uuids.sql index 7fa5ab32..081aab5d 100644 --- a/sql/recursive-cte-experiments-for-accessible-uuids.sql +++ b/sql/recursive-cte-experiments-for-accessible-uuids.sql @@ -17,7 +17,7 @@ create view hs_hosting_asset_example_rv as true from rbacgrants where (rbacgrants.ascendantuuid = any (currentsubjectsuuids())) - --and rbacgrants.assumed + and rbacgrants.assumed union all select distinct g.descendantuuid, g.ascendantuuid, -- 2.39.5 From cad8b475f857709571124c0158cd52a9a548c9cd Mon Sep 17 00:00:00 2001 From: Michael Hoennig Date: Fri, 23 Aug 2024 06:52:15 +0200 Subject: [PATCH 5/6] add the idea Inverting the recursion of the CTE-query, combined with the type condition. --- doc/rbac-performance-analysis.md | 10 +- ...e-cte-experiments-for-accessible-uuids.sql | 125 ++++++++++++------ 2 files changed, 97 insertions(+), 38 deletions(-) diff --git a/doc/rbac-performance-analysis.md b/doc/rbac-performance-analysis.md index 621d1248..fd2d9a1c 100644 --- a/doc/rbac-performance-analysis.md +++ b/doc/rbac-performance-analysis.md @@ -390,6 +390,14 @@ We found some solution approaches: See chapter below. +3. Inverting the recursion of the CTE-query, combined with the type condition. + + Instead of starting the recursion with `currentsubjectsuuids()`, + we could start it with the target table name and row-type, + then recurse down to the `currentsubjectsuuids()`. + + This idea was not yet explored. + ### Adding The Object Type To The Table `rbacObject` @@ -423,7 +431,7 @@ I also moved the function call which determines into its own WITH-section, with Experimentally I moved the business condition into the CTE SELECT, also with no improvement. -Such rearrangements seem to be sucessfully done by the PostgreSQL query optimizer. +Such rearrangements seem to be successfully done by the PostgreSQL query optimizer. ## Summary diff --git a/sql/recursive-cte-experiments-for-accessible-uuids.sql b/sql/recursive-cte-experiments-for-accessible-uuids.sql index 081aab5d..5e9a7be5 100644 --- a/sql/recursive-cte-experiments-for-accessible-uuids.sql +++ b/sql/recursive-cte-experiments-for-accessible-uuids.sql @@ -4,52 +4,80 @@ select * from hs_statistics_view; -- ======================================================== --- An example for a restricted view (_rv) as generated by our RBAC system: +-- This is the extracted recursive CTE query to determine the visible object UUIDs of a single table +-- (and optionally the hosting-asset-type) as a separate VIEW. +-- In the generated code this is part of the hs_hosting_asset_rv VIEW. + +drop view if exists hs_hosting_asset_example_gv; +create view hs_hosting_asset_example_gv as +with recursive + recursive_grants as ( + select distinct rbacgrants.descendantuuid, + rbacgrants.ascendantuuid, + 1 as level, + true + from rbacgrants + where (rbacgrants.ascendantuuid = any (currentsubjectsuuids())) + and rbacgrants.assumed + union all + select distinct g.descendantuuid, + g.ascendantuuid, + grants.level + 1 as level, + assertTrue(grants.level < 22, 'too many grant-levels: ' || grants.level) + from rbacgrants g + join recursive_grants grants on grants.descendantuuid = g.ascendantuuid + where g.assumed + ), + grant_count as ( + select count(*) as grant_count from recursive_grants + ), + count_check as ( + select assertTrue((select grant_count from grant_count) < 600000, + 'too many grants for current subjects: ' || (select grant_count from grant_count)) as valid + ) +select distinct perm.objectuuid + from recursive_grants + join rbacpermission perm on recursive_grants.descendantuuid = perm.uuid + join rbacobject obj on obj.uuid = perm.objectuuid + join count_check cc on cc.valid + where obj.objecttable::text = 'hs_hosting_asset'::text + -- with/without this type condition +-- and obj.type = 'EMAIL_ADDRESS'::hshostingassettype + and obj.type = 'EMAIL_ADDRESS'::hshostingassettype +; + +-- ----------------------------------------------------------------------------------------------- + +-- A query just on the above view, only determining visible objects, no JOIN with business data: + +rollback transaction; +begin transaction; +CALL defineContext('performance testing', null, 'superuser-alex@hostsharing.net', + 'hs_booking_project#D-1000000-hshdefaultproject:ADMIN'); +-- 'hs_booking_project#D-1000300-mihdefaultproject:ADMIN'); +SET TRANSACTION READ ONLY; +EXPLAIN ANALYZE select * from hs_hosting_asset_example_gv; +end transaction ; + +-- ======================================================== + +-- An example for a restricted view (_rv) similar to the one generated by our RBAC system, +-- but using the above separate VIEW to determine the visible objects. + drop view if exists hs_hosting_asset_example_rv; create view hs_hosting_asset_example_rv as with accessible_hs_hosting_asset_uuids as ( - - with recursive - recursive_grants as ( - select distinct rbacgrants.descendantuuid, - rbacgrants.ascendantuuid, - 1 as level, - true - from rbacgrants - where (rbacgrants.ascendantuuid = any (currentsubjectsuuids())) - and rbacgrants.assumed - union all - select distinct g.descendantuuid, - g.ascendantuuid, - grants.level + 1 as level, - assertTrue(grants.level < 22, 'too many grant-levels: ' || grants.level) - from rbacgrants g - join recursive_grants grants on grants.descendantuuid = g.ascendantuuid - where g.assumed - ), - grant_count as ( - select count(*) as grant_count from recursive_grants - ), - count_check as ( - select assertTrue((select grant_count from grant_count) < 600000, - 'too many grants for current subjects: ' || (select grant_count from grant_count)) as valid - ) - select distinct perm.objectuuid - from recursive_grants - join rbacpermission perm on recursive_grants.descendantuuid = perm.uuid - join rbacobject obj on obj.uuid = perm.objectuuid - join count_check cc on cc.valid - where obj.objecttable::text = 'hs_hosting_asset'::text - and obj.type = 'EMAIL_ADDRESS'::hshostingassettype -- with/without this type condition + select * from hs_hosting_asset_example_gv ) select target.* from hs_hosting_asset target where (target.uuid in (select accessible_hs_hosting_asset_uuids.objectuuid from accessible_hs_hosting_asset_uuids)); --- end of the example view. -- ------------------------------------------------------------------------------- +-- performing several queries on the above view to determine average performance: + rollback transaction; DO language plpgsql $$ DECLARE @@ -90,9 +118,32 @@ $$; -- average seconds per recursive CTE select as role 'hs_hosting_asset:defaultproject:ADMIN' -- joined with business query for all 'EMAIL_ADDRESSES': -- D-1000000-hsh D-1000300-mih --- - without type comparision in rbacobject: ~3.30 - ~3.49 ~0.23 --- - with type comparision in rbacobject: ~2.99 - ~3.08 ~0.21 +-- - without type comparison in rbacobject: ~3.30 - ~3.49 ~0.23 +-- - with type comparison in rbacobject: ~2.99 - ~3.08 ~0.21 +-- ------------------------------------------------------------------------------- + +-- and a single query, so EXPLAIN can be used + +rollback transaction; +begin transaction; +CALL defineContext('performance testing', null, 'superuser-alex@hostsharing.net', + 'hs_booking_project#D-1000000-hshdefaultproject:ADMIN'); +-- 'hs_booking_project#D-1000300-mihdefaultproject:ADMIN'); +SET TRANSACTION READ ONLY; + +EXPLAIN SELECT * from ( + + -- An example for a business query based on the view: + select type, uuid, identifier, caption + from hs_hosting_asset_example_rv + where type = 'EMAIL_ADDRESS' +-- and identifier like 'b%' + -- end of the business query example. + + ) ha; + +end transaction; -- ============================================================================= -- 2.39.5 From 1b41f567defc9e8928b70011de30de79b9faf41c Mon Sep 17 00:00:00 2001 From: Michael Hoennig Date: Fri, 23 Aug 2024 08:46:12 +0200 Subject: [PATCH 6/6] more on the idea Inverting the recursion of the CTE-query, combined with the type condition. --- doc/rbac-performance-analysis.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/doc/rbac-performance-analysis.md b/doc/rbac-performance-analysis.md index fd2d9a1c..fa80dde4 100644 --- a/doc/rbac-performance-analysis.md +++ b/doc/rbac-performance-analysis.md @@ -395,8 +395,12 @@ We found some solution approaches: Instead of starting the recursion with `currentsubjectsuuids()`, we could start it with the target table name and row-type, then recurse down to the `currentsubjectsuuids()`. - - This idea was not yet explored. + + In the end, we need the object UUIDs, though. + But if we start with the join of `rbacObject` with `rbacPermission`, + we need to forward the object UUIDs through the whole recursion. + + This idea was not yet further explored. ### Adding The Object Type To The Table `rbacObject` -- 2.39.5