The requirements of *hsadmin-ng* include table-m row- and column-level-security for read and write access to business-objects.
More precisely, any access has to be controlled according to given rules depending on the accessing users, their roles and the accessed business-object.
Further, roles and business-objects are hierarchical.
To avoid misunderstandings, we are using the term "business-object" what's usually called a "domain-object".
But as we are in the context of a webhosting infrastructure provider, "domain" would have a double meaning.
Our implementation is based on Role-Based-Access-Management (RBAC) in conjunction with views and triggers on the business-objects.
As far as possible, we are using the same terms as defined in the RBAC standard, for our function names though, we chose more expressive names.
In RBAC, subjects can be assigned to roles, roles can be hierarchical and eventually have assigned permissions.
A permission allows a specific operation (e.g. view or edit) on a specific (business-) object.
You can find the entity structure as a UML class diagram as follows:
```plantuml
@startuml
' left to right direction
top to bottom direction
' hide the ugly E in a circle left to the entity name
For such a singleton business-object-type, e.g. *Organization" or "Hostsharing" has to be defined, and its single entity is referred in the permission.
The tenant-role is granted to everybody who needs to be able to view the business-object.
Usually all owners, admins and tenants of sub-objects get this role granted.
Some business-objects only have very limited data directly in the main business-object and store more sensitive data in special sub-objects (e.g. 'customer-details') to which tenants of sub-objects of the main-object (e.g. package admins) do not get view permission.
## Example Users, Roles, Permissions and Business-Objects
The following diagram shows how users, roles and permissions could be granted access to operations on business objects.
To support the RBAC system, for each business-object-table, some more artifacts are created in the database:
- a `BEFORE INSERT TRIGGER` which creates the related *RbacObject* instance,
- an `AFTER INSERT TRIGGER` which creates the related *RbacRole*s, *RbacPermission*s together with their related *RbacReference*s as well as *RbacGrant*s,
- a restricted view (e.g. *customer_rv*) through which restricted users can access the underlying data.
Not yet implemented, but planned are these actions:
- an `ON DELETE ... DO INSTEAD` rule to allow `SQL DELETE` if applicable for the business-object-table and the user has 'delete' permission,
- an `ON UPDATE ... DO INSTEAD` rule to allow `SQL UPDATE` if the user has 'edit' right,
- an `ON INSERT ... DO INSTEAD` rule to allow `SQL INSERT` if the user has 'add-..' right to the parent-business-object.
The restricted view takes the current user from a session property and applies the hierarchy of its roles all the way down to the permissions related to the respective business-object-table.
This way, each user can only view the data they have 'view'-permission for, only create those they have 'add-...'-permission, only update those they have 'edit'- and only delete those they have 'delete'-permission to.
### Current User
The current use is taken from the session variable `hsadminng.currentUser` which contains the name of the user as stored in the
*RbacUser*s table. Example:
SET LOCAL hsadminng.currentUser = 'mike@hostsharing.net';
That user is also used for historicization and audit log, but which is a different topic.
### Assuming Roles
If the session variable `hsadminng.assumedRoles` is set to a non-empty value, its content is interpreted as a list of semicolon-separated role names.
Example:
SET LOCAL hsadminng.assumedRoles = 'customer#aab.admin;customer#aac.admin';
In this case, not the current user but the assumed roles are used as a starting point for any further queries.
Roles which are not granted to the current user, directly or indirectly, cannot be assumed.
### Example
A full example is shown here:
BEGIN TRANSACTION;
SET SESSION SESSION AUTHORIZATION restricted;
SET LOCAL hsadminng.currentUser = 'mike@hostsharing.net';
SET LOCAL hsadminng.assumedRoles = 'customer#aab.admin;customer#aac.admin';
SELECT c.prefix, p.name as "package", ema.localPart || '@' || dom.name as "email-address"
FROM emailaddress_rv ema
JOIN domain_rv dom ON dom.uuid = ema.domainuuid
JOIN unixuser_rv uu ON uu.uuid = dom.unixuseruuid
JOIN package_rv p ON p.uuid = uu.packageuuid
JOIN customer_rv c ON c.uuid = p.customeruuid;
END TRANSACTION;
## Roles and Their Assignments for Certain Business Objects
To give you an overview of the business-object-types for the following role-examples,
From the 'Role customer#xyz.owner' to the 'Role customer#xyz.admin' there is a dashed line, whereas all other lines are solid lines.
Solid lines means, that one role is granted to another and followed in all queries to the restricted views.
The dashed line means that one role is granted to another but not automatically followed in queries to the restricted views.
The reason here is that otherwise simply too many objects would be accessible to those with the 'administrators' role and all queries would be slowed down vastly.
Grants which are not followed are still valid grants for `hsadminng.assumedRoles`.
Thus, if you want to access anything below a customer, assume its role first.
There is actually another speciality in the customer roles:
For all others, a user defined by the customer gets the owner role assigned, just for the customer, the owner's role is assigned to the 'administrators' role.
We did not define maximum response time in our requirements,
but set a target of 7.000 customers, 15.000 packages, 150.000 Unix users, 100.000 domains and 500.000 email-addresses.
For such a dataset the response time for typical queries from a UI should be acceptable.
Also, when adding data beyond these quantities, increase in response time should be roughly linear or below.
For this, we increased the dataset by 14% and then by another 25%, ending up with 10.000 customers, almost 25.000 packages, over 174.000 unix users, over 120.000 domains and almost 750.000 email-addresses.
The performance test suite comprised 8 SELECT queries issued by an administrator, mostly with two assumed customer owner roles.
The tests started with finding a specific customer and ended with listing all accessible email-addresses joined with their domains, unix-users, packages and customers.
Find the SQL script here: `28-hs-tests.sql`.
### Two View Query Variants
We have tested two variants of the query for the restricted view,
both utilizing a PostgreSQL function like this:
FUNCTION queryAccessibleObjectUuidsOfSubjectIds(
requiredOp RbacOp,
forObjectTable varchar,
subjectIds uuid[],
maxObjects integer = 16000)
RETURNS SETOF uuid
The function returns all object uuids for which the given subjectIds (user o assumed roles) have a permission or required operation.
Let's have a look at the two view queries:
#### Using WHERE ... IN
CREATE OR REPLACE VIEW customer_rv AS
SELECT DISTINCT target.*
FROM customer AS target
WHERE target.uuid IN (
SELECT uuid
FROM queryAccessibleObjectUuidsOfSubjectIds(
'view', 'customer', currentSubjectIds()));
This view should be automatically updatable.
Where, for updates, we actually have to check for 'edit' instead of 'view' operation, which makes it a bit more complicated.
With the larger dataset, the test suite initially needed over 7 seconds with this view query.
At this point the second variant was tried.
But after the initial query, the execution time was drastically reduced,
even with different query values.
Looks like the query optimizer needed some statistics to find the best path.
#### Using A JOIN
CREATE OR REPLACE VIEW customer_rv AS
SELECT DISTINCT target.*
FROM customer AS target
JOIN queryAccessibleObjectUuidsOfSubjectIds(
'view', 'customer', currentSubjectIds()) AS allowedObjId
ON target.uuid = allowedObjId;
This view cannot is not updatable automatically,
but it was quite fast from the beginning.
### Performance Results
The following table shows the average between the second and the third repeat of the test-suite:
| Dataset | using JOIN | using WHERE IN |
|----------------:|-----------:|---------------:|
| 7000 customers | 670ms | 1040ms |
| 10000 customers | 1050ms | 1125ms |
| +43% | +57% | +8% |
The JOIN-variant is still faster, but the growth in execution time exceeded the growth of the dataset.
The WHERE-IN-variant is about 50% slower on the smaller dataset, but almost keeps its performance on the larger dataset.
Both variants a viable option, depending on other needs, e.g. updatable views.