Per-Object Permissions for Elasticsearch Lists in Django Websites


The Challenge

Elasticsearch can greatly improve the performance of filterable and searchable list views, reducing load times from several seconds to about half a second. It stores the details a list view needs, gathered from various relations, denormalized in a JSON-like structure.

In the Django-based system I’ve been working on, we use django-guardian to set per-object permissions for users or roles managing various items. This means you not only need to check whether a user has general permission to view, change, or delete a type of object, but also whether they have permission to access specific objects.

A major challenge arises when using Elasticsearch for authorized views based on user permissions – how can we check permissions without slowing down the listing too much?

Things We Considered

Here are a few options I compared:

  • Fetch all object UUIDs the user can access via django-guardian, then pass those UUIDs to the Elasticsearch search query. This might work with fewer than 100 items, but it doesn’t scale.
  • Filter the Elasticsearch list first, and then check each item’s UUID against user permissions. With thousands of search results, permission checks become too slow. If I check permissions only for the first page, pagination data becomes inaccurate.
  • Create a user-permission Elasticsearch index with all item UUIDs accessible to the user, and filter the list by looking up those UUIDs. This makes updating the index tricky, especially for admins and superusers.
  • For each item, store the list of user IDs and group IDs that can view it, then check the current user against those IDs in the list view. This is the approach I chose, since typically only a handful of users and groups need access to any given item.

The code below implements the last approach.

Our Chosen Approach

We use django-elasticsearch-dsl for indexing Django models in Elasticsearch. The Elasticsearch index document for an item with user IDs and group IDs can look like this:

# items/documents.py
from django_elasticsearch_dsl.registries import registry
from django_elasticsearch_dsl import Document, fields
from guardian.shortcuts import get_users_with_perms, get_groups_with_perms

from .models import Item


@registry.register_document
class ItemDocument(Document):
    users_can_view = fields.KeywordField(multi=True)
    users_can_change = fields.KeywordField(multi=True)
    users_can_delete = fields.KeywordField(multi=True)
    groups_can_view = fields.KeywordField(multi=True)
    groups_can_change = fields.KeywordField(multi=True)
    groups_can_delete = fields.KeywordField(multi=True)

    class Index:
        name = "items"
        settings = {
            "number_of_shards": 1,
            "number_of_replicas": 0,
        }

    class Django:
        model = Item

        fields = [
            "uuid",
            "title",
            "intro",
            "created_at",
            "updated_at",
        ]

        queryset_pagination = 5000

    def prepare(self, instance):
        data = super().prepare(instance)

        data["users_can_view"] = []
        data["users_can_change"] = []
        data["users_can_delete"] = []
        for user, permissions in get_users_with_perms(
            instance,
            attach_perms=True,
            with_superusers=True,
            with_group_users=False,
            only_with_perms_in=["view_item", "change_item", "delete_item"],
        ).items():
            if "view_item" in permissions:
                data["users_can_view"].append(user.pk)
            if "change_item" in permissions:
                data["users_can_change"].append(user.pk)
            if "delete_item" in permissions:
                data["users_can_delete"].append(user.pk)

        data["groups_can_view"] = []
        data["groups_can_change"] = []
        data["groups_can_delete"] = []
        for group, permissions in get_groups_with_perms(
            instance, attach_perms=True
        ).items():
            for perm in permissions:
                if perm == "view_item":
                    data["groups_can_view"].append(group.pk)
                elif perm == "change_item":
                    data["groups_can_change"].append(group.pk)
                elif perm == "delete_item":
                    data["groups_can_delete"].append(group.pk)

        return data
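
To make the shape of the indexed document concrete, here is a minimal standalone sketch of the same bucketing logic. It assumes plain dicts of primary keys in place of the results from django-guardian, and the names `PERMISSION_FIELDS` and `bucket_permissions` are hypothetical helpers, not part of any library:

```python
# Standalone sketch: plain dicts stand in for django-guardian's results.
# PERMISSION_FIELDS and bucket_permissions are hypothetical names.
PERMISSION_FIELDS = {
    "view_item": "can_view",
    "change_item": "can_change",
    "delete_item": "can_delete",
}


def bucket_permissions(prefix, perms_by_pk):
    """Group primary keys by permission, e.g. {"users_can_view": [7, 9], ...}."""
    buckets = {f"{prefix}_{suffix}": [] for suffix in PERMISSION_FIELDS.values()}
    for pk, permissions in perms_by_pk.items():
        for codename in permissions:
            suffix = PERMISSION_FIELDS.get(codename)
            if suffix is not None:  # ignore permissions that are not indexed
                buckets[f"{prefix}_{suffix}"].append(pk)
    return buckets


# Permission part of one item's document: two users and one group have access.
document = {
    **bucket_permissions("users", {7: ["view_item", "change_item"], 9: ["view_item"]}),
    **bucket_permissions("groups", {3: ["view_item", "delete_item"]}),
}
```

Each item carries six short lists of IDs, so the list view can filter on them without calling django-guardian at query time.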

Next, we need a utility class for paginating Elasticsearch indexes in a way that’s compatible with Django’s default queryset pagination:

# items/utils.py

class ElasticsearchPage:
    """
    Django Paginator-compatible interface for Elasticsearch search results.
    """

    def __init__(self, results, total_count, page_number, items_per_page):
        self.object_list = results
        self.total_count = total_count
        self.number = page_number
        self.paginator = type(
            "Paginator",
            (),
            {
                "count": total_count,
                "num_pages": (total_count + items_per_page - 1) // items_per_page,
                "per_page": items_per_page,
            },
        )()

    def has_previous(self):
        return self.number > 1

    def has_next(self):
        return self.number < self.paginator.num_pages

    def has_other_pages(self):
        return self.paginator.num_pages > 1

    def previous_page_number(self):
        return self.number - 1 if self.has_previous() else None

    def next_page_number(self):
        return self.number + 1 if self.has_next() else None
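
One thing to watch with offset-based pagination: by default Elasticsearch rejects searches where offset plus page size exceeds `index.max_result_window` (10,000). Below is a rough sketch of clamping the requested page to a reachable range before slicing. The helper name `page_window` is hypothetical, and the sketch assumes the default window has not been changed on the index:

```python
MAX_RESULT_WINDOW = 10_000  # Elasticsearch default for index.max_result_window


def page_window(page_number, items_per_page, total_count, max_window=MAX_RESULT_WINDOW):
    """Return (page_number, offset, num_pages) clamped to a reachable range."""
    items_per_page = min(max(1, items_per_page), max_window)
    # Pages needed to show every hit (ceiling division).
    pages_by_count = (total_count + items_per_page - 1) // items_per_page
    # Pages that fit entirely inside the result window (floor division).
    pages_by_window = max_window // items_per_page
    num_pages = max(1, min(pages_by_count, pages_by_window))
    page_number = min(max(1, page_number), num_pages)
    offset = (page_number - 1) * items_per_page
    return page_number, offset, num_pages
```

With this, a request for page 99 of a 50-item result set falls back to the last real page instead of returning an empty list, and very deep pages stop at the window limit instead of raising an error.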

Finally, the list view checks user IDs and group IDs in the index against the current user’s ID and group memberships:

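# items/views.py
from django.contrib.auth.decorators import login_required
from django.shortcuts import render
from elasticsearch_dsl import Q

from .documents import ItemDocument
from .utils import ElasticsearchPage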

@login_required
def item_list(request):
    user_group_pks = list(request.user.groups.values_list("pk", flat=True))

    search_obj = ItemDocument.search()

    perm_filter = Q(
        "bool",
        should=[
            Q("term", users_can_view=request.user.pk),
            Q("terms", groups_can_view=user_group_pks),
        ],
        minimum_should_match=1,
    )

    search_obj = search_obj.query("bool", must=[perm_filter])

    # more search and filtering go here...

    items_per_page = int(request.GET.get("items_per_page", 24))
    page_number = int(request.GET.get("page", 1))
    offset = (page_number - 1) * items_per_page

    total_count = search_obj.count()

    search_obj = search_obj[offset : offset + items_per_page]
    search_results = search_obj.execute()

    page = ElasticsearchPage(
        results=search_results,
        total_count=total_count,
        page_number=page_number,
        items_per_page=items_per_page,
    )
    context = {
        "page": page,
        "items_per_page": items_per_page,
    }
    return render(request, "items/item_list.html", context)
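
For reference, the permission check boils down to a small boolean query. The sketch below builds roughly the same query as a plain dictionary, which is handy for debugging in a console. The function name `build_view_permission_query` is hypothetical. Unlike the view above, this variant puts the permission clause in `filter` context rather than `must`, since permissions don't need to affect relevance scoring:

```python
def build_view_permission_query(user_pk, group_pks):
    """Return a raw query body matching items the given user may view."""
    should = [{"term": {"users_can_view": user_pk}}]
    if group_pks:  # skip the clause for users without any groups
        should.append({"terms": {"groups_can_view": list(group_pks)}})
    permission_clause = {"bool": {"should": should, "minimum_should_match": 1}}
    # Filter context: the clause only includes or excludes, it adds no score.
    return {"bool": {"filter": [permission_clause]}}


query = build_view_permission_query(user_pk=7, group_pks=[3, 5])
```

An item matches if the user's ID is in `users_can_view` or at least one of the user's group IDs is in `groups_can_view`.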

At this point, it’s important to update the index not only when item details change, but also when permissions change.

This can be done by calling the following in the relevant views or form save methods:

from django_elasticsearch_dsl.registries import registry

registry.update(item)
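
It's easy to forget the reindex call in one of the places where permissions change. One way to reduce that risk is to route every permission change through a single helper. The sketch below is a hypothetical wrapper, not part of django-guardian or django-elasticsearch-dsl. It takes the permission function and the reindex function as arguments, so it runs without a Django project:

```python
def change_permission_and_reindex(change_perm, reindex, perm, user_or_group, obj):
    """Apply a permission change, then refresh the object's search document."""
    change_perm(perm, user_or_group, obj)
    reindex(obj)


# In the project this might be wired up as follows (assumed, not executed here):
#   from guardian.shortcuts import assign_perm, remove_perm
#   from django_elasticsearch_dsl.registries import registry
#   change_permission_and_reindex(assign_perm, registry.update, "view_item", user, item)
#   change_permission_and_reindex(remove_perm, registry.update, "view_item", user, item)
```

The reindex runs only after the permission change succeeds, so an exception in the permission call leaves the index untouched.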

Final Thoughts

Using django-guardian to pre-process or post-process Elasticsearch-filtered lists is inefficient. Instead, permissions should exist directly in the Elasticsearch index. Storing user IDs and group IDs in the items themselves is more practical. Just make sure Elasticsearch is properly secured with SSL/TLS and authentication (username and password) to protect the data from tampering.
