Question: best way to extend LogEntry objects #625

Open
m-weintraub opened this issue Apr 4, 2024 · 3 comments
@m-weintraub

Quick question: it turns out my project needs additional data in or from a LogEntry instance, and I'm asking how to extend auditlog without too much work. I'm still a bit new to Python and Django, and I'm looking to minimize the impact of changes. I appreciate any thoughts and pointers. Also, thanks for this package. It's been really useful.

What is driving this is a need to collect some of the relationship references for building a report that includes deleted objects. The report works well for create and update actions. For these, the following code works fine (record is an auditlog.LogEntry):

changed_object_type = record.content_type
changed_object = changed_object_type.get_object_for_this_type(id=record.object_id)

From changed_object, I can find all the connected objects. For example, a student might have an assessment, and a change to the assessment creates the LogEntry. The report calls for an attribute of the related student. If the assessment was deleted, getting the related object isn't so easy: pulling up the (deleted) instance throws a DoesNotExist exception. I have a workaround at present, but it's a bit of hackery.
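
One way to make the lookup above tolerate deleted rows is to catch the exception and return None, so the report can fall back to whatever the LogEntry itself stored. This is only a minimal sketch: `resolve_changed_object` is a hypothetical helper name, `record` is assumed to expose `content_type` and `object_id` the way a LogEntry does, and the import fallback exists only so the snippet runs outside a Django project.

```python
try:
    from django.core.exceptions import ObjectDoesNotExist
except ImportError:  # stand-in so the sketch runs without Django installed
    class ObjectDoesNotExist(Exception):
        pass


def resolve_changed_object(record):
    """Return the live object a log record points at, or None if it is gone.

    Hypothetical helper; `record` is assumed to expose `content_type` and
    `object_id` the way an auditlog LogEntry does.
    """
    try:
        return record.content_type.get_object_for_this_type(id=record.object_id)
    except ObjectDoesNotExist:
        # The row was deleted after the entry was written.
        return None
```

A caller would then branch on None and use only the data stored in the entry for deleted objects.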

For example, one idea that might solve this is to change what is in record.changes. Right now, changes has 'student': 'Mike', and what might be easiest would be to change that entry to
'student': [('printname', 'Mike'), ('content_type', 'django_content_type_value'), ('object_id', 'related_obj_id')] (or a dict or whatever).

I intend to experiment to figure this out. But I have learned the hard way that there's often a gotcha or three, especially when I don't fully understand something. Any thoughts?

Thanks,

Mike

@ganiserb

ganiserb commented Apr 5, 2024

Wait, if you do this:

changed_object_type = record.content_type
changed_object = changed_object_type.get_object_for_this_type(id=record.object_id)

You are basically fetching the present version of the model from the database. It is basically the same as changed_object = YourModel.objects.get(pk=...).

You have a model and all its related objects as they exist today in the DB, not as they existed back when the LogEntry was created. And if your relationships change over time (for example, students no longer have assessments because a migration moved all of that info over to a new full_details model), the LogEntry will still hold only the changes as they were when it was created.

All this to say: think about LogEntry objects as historical information, like lines in a logfile. An entry states that something changed at some point. That's all. It's static information. You should have all the information you need in that LogEntry as it was at that point in time. You should not rely on the changed_object to get more data.


I can't tell whether you have registered the Student model, the Assessment model, or both with auditlog.

Can you share a simplified version of the model definitions, and how they are registered with auditlog?

@m-weintraub
Author

Thanks for the reply. I appreciate the help. And I'm very appreciative of this package.

I suspect some context is needed. Suppose the model is a STUDENT and a TUTOR, each with a one-to-many relationship with ASSESSMENT. STUDENT and TUTOR are subclasses of PERSON, so assessment has the property person = ForeignKey(PERSON,...). In case this isn't clear, here are parts of the models (the mixins are my own stuff). Assume Student, Tutor, and Assessment are registered with auditlog. PERSON is the root of most user workflows: given a Person, do something with an assessment (CRUD).

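from django.db import models
from polymorphic.models import PolymorphicModel  # assumed: django-polymorphic

# ChangeMixin, CompletenessMixin, and CategoryMixin are my own mixins (not shown).

class CommonPerson(PolymorphicModel, ChangeMixin):
    first_name = models.CharField(max_length=150, blank=False, null=False)
    last_name = models.CharField(max_length=150, blank=False, null=False)
    nickname = models.CharField(max_length=150, blank=True, null=True)

class Student(CommonPerson, CompletenessMixin):
    pass  # fields omitted

class Tutor(CommonPerson, CompletenessMixin):
    pass  # fields omitted

class Assessment(PolymorphicModel, ChangeMixin, CategoryMixin):
    # SET_NULL keeps the assessment row when its person is deleted
    person = models.ForeignKey(CommonPerson, on_delete=models.SET_NULL, null=True, related_name="assessment")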

Now the client wants to be able to ask the following question: for all the STUDENTs or all the TUTORs, tell me all the changes that occurred during some time frame. The auditlog.LogEntry snapshot is insufficient to answer this question if the item is deleted. By default, the print value of the relationship is stored in the changes field as "person": "person name". Unfortunately, the field name is ambiguous relative to this question. Worse, there's no guarantee the field's value is unambiguous either (people's names aren't unique). So I can't reliably tell whether the person name belongs to a Student or a Tutor, nor can I answer the question "tell me everything that changed about a particular student or tutor". This is why I resorted to grabbing the instance related to the LogEntry. From there, it's easy to traverse to the related instance's values (effectively a class value in this case).

Now the way I solved this to get things working was to stick the related class data into the additional_data field. But this is bad in many ways. So I thought, or hoped, there was a way to answer this query built into auditlog as is. What struck me is that a LogEntry contains all the information needed to identify the instance it relates to, so I thought to apply the same idea to relationships captured in the changes field.

In this example, the changes field for the assessment LogEntry contains "person": "Gabrielle B." The M2M LogEntry solution got me thinking that augmenting the value stored in "person" to {"value": "Gabrielle B.", "related_content_type": "31", "pk": "732"} would tell the reader that the deleted assessment's related person is "Gabrielle B.", that it's an instance of whatever type 31 is, and that it can be found by id 732. You are absolutely correct that the related object may have changed. But the class values likely haven't, and at least in my case, that's what I'm trying to access. I could also answer questions like "tell me all the changes connected to a Student or a Tutor".
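
To illustrate how a report might consume such an augmented value, here is a rough sketch in plain Python. It is not part of auditlog: `related_ref` and `entries_for_type` are hypothetical helper names, and the dict keys simply follow the proposal above. The reader accepts both the current plain-string form and the proposed dict form, so older entries would still be usable.

```python
def related_ref(value):
    """Normalise a change value to (display, content_type_id, pk).

    Accepts the plain-string form and the proposed dict form; the dict keys
    are the ones suggested in this thread, not an auditlog format.
    """
    if isinstance(value, dict):
        return value.get("value"), value.get("related_content_type"), value.get("pk")
    # Plain string: only the display name is known.
    return value, None, None


def entries_for_type(entries, field, content_type_id):
    """Yield the change dicts whose `field` points at the given content type."""
    for changes in entries:
        if field in changes and related_ref(changes[field])[1] == content_type_id:
            yield changes
```

With this shape, "all changes connected to a Student" becomes a filter on the content type id, and it keeps working after the assessment itself has been deleted.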

@ganiserb

ganiserb commented Apr 6, 2024

Ohhh, I just realized something (I am just a random dev by the way):

It seems like you are using django-polymorphic. I've never used it. I assumed you had something like a Person abstract model, and then Student and Tutor were concrete models.

In my case, I am using django-auditlog's v3 beta, and I assume it records changes the same way as the stable version you are presumably using. In my project, ForeignKey relationship changes are stored in the changes field as IDs: 'some_foreign_key_field': ['123', '999']. (This allows me to query the current version of the models directly by PK. And because we also use django-safedelete, the objects are almost guaranteed to exist in the DB.) On the other hand, implicit ManyToManyField relationships are stored as string representations by auditlog. Could it be that django-polymorphic is creating intermediate M2M relationships that cause django-auditlog to store string representations instead of IDs? Here's an example I took from a REST API where I expose LogEntries:

[image: screenshot of two LogEntry results from the REST API]

Top result 0 is a change in a ForeignKey field.
Bottom result 1 is a change in a ManyToManyField.
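
For the foreign-key case, where the changes field holds a pair of IDs, reading the pair could look roughly like this. It is a sketch under assumptions: `fk_change` is a hypothetical helper, the two-element list is assumed to be an [old, new] pair, and `changes` may arrive either as a JSON string or as an already-decoded dict.

```python
import json


def fk_change(changes, field):
    """Return (old_pk, new_pk) for `field`, or None if the field did not change.

    Assumes each changed field maps to a two-element [old, new] list.
    """
    if isinstance(changes, str):
        changes = json.loads(changes)
    pair = changes.get(field)
    if pair is None:
        return None
    old, new = pair
    return old, new
```

The returned primary keys can then be looked up directly, which only helps when the referenced rows still exist (for example, with soft deletes).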

ManyToManyField relationships that do not define explicit through models do have an implicit through table, but I am not sure how (or why) django-auditlog uses this to store the change. It probably does not use IDs as with FKs because an M2M table just has 3 columns (a PK and two FKs), and you either create or delete rows in that table.

So, I guess it probably has something to do with the way your tables are set up... Sorry I can't give you a good answer. Maybe someone more knowledgeable will show up 🤞🏻

3 participants